random thoughts about current sed and development

This is the title of one of the longest threads that populated the sed-users@yahoogroups.com mailing list.
It started at the list's message number 2696, and ran from 31/Jan/2003 to 14/Mar/2003.

The subject was the new SED commands.
Several new commands were introduced in GNU SED 4.x.

So, we had a thread :)

Many issues were discussed under the same thread, so they are split into sections.
I've tried to join all the e-mails into one single document, to give a general overview.
I hope you find it readable.

The original thread e-mails can be seen by following this link
(You must be a sed-users subscriber to access the list history)


Note:

Paolo Bonzini (identified as "paolo" on the following messages)
is the current GNU SED maintainer.


First Message of the Thread


From: "aurelio" <aurelio@...>
Date: Fri, 31 Jan 2003 11:16:08 -0300 (ART)
Subject: random thoughts about current sed and development

hi all,

as the subject is "give your opinion", i'll do my part. :)
i'll extend the subject and talk about new GNU sed features
and KISS.

i'm sorry my words may seem stronger than the subject desires,
but it's really frustrating that my english vocabulary is so
limited. i can't write exactly the way i think, but i hope
i'll not offend anybody, this is not what i want.


i really admire paolo's work and disposal to take the
GNU-package-that-NOBODY-wanted-to-touch, improving the
regex machine speed, correcting bugs and adding the
brother commands as 'Q', 'R' and 'W'

but i'm also worried that sed will follow the ssed way,
which in my opinion "has crossed the line" and its
more-than-sed. i appreciate the ideas and code effort,
but i don't use it because i like sed simple.
no perl mimic, no multiline mode, no system
command execution.




i dislike zap command idea

   there's no need to add a command that can be substituted
   by just 2 commands, and x;s/.*//;x is not that long!

   i think sed must remain as UNIX, little commands, that
   *joined* do big things.  x;s/.*//;x is just about it.
   "know your tools"

  
i dislike \# to do count idea
i dislike the macro idea

   these two introduce new concepts on sed, which in my point
   of view, are out of the sed scope and principles.

   sed is a text editor, not a programming language.
   anything more complicated can be easily made in
   perl/python/bash/whatever.

   there is no need to bloat sed.


i also dislike sed doing Perl weird Regexes (as ssed does)

   it was not proposed, but i want to make it clear that
   i think paolo's option to NOT include it on the GNU sed
   was right.

   ssed is a personal project and can go wherever it wants.
   but GNU sed is a worldwide program that ALL Linux
   distributions use for BOOT process (init.d scripts) and
   other critical stuff, so it must not follow the trendy way,
   but stay diet.

   if you want sed to mimic perl, install miniperl instead
   and be happy! sed is older than perl and has nothing to
   do with it.
   

i dislike the 'e' command and modifier inserted into GNU sed

   sed interacting with system commands is way far from
   what sed proposes to be. sed is about text, not commands.
   
   using this command brings to sed scripts a new world of
   problems it should never have! problems that the shell
   and shell script language should be used for.
    
   now sed scripts could not work if the system command used
   by the 'e' command:

     - was removed
     - was moved
     - was a symlink and the target has moved
     - is not in PATH anymore
     - is not executable (chmod -x) anymore
     - was updated and the syntax changed
     - ... (long list)

   and of course, platform compatibility is completely
   lost on scripts that use that.
   

i dislike the 'M', 'm' modifiers inserted into GNU sed

   sed is not multiline. sed was never multiline.
   sed knows about line.

   using 'G' is not multiline, it is still a single
   line on pattern space and it is treated as that.


i dislike the 'L' command inserted into GNU sed

   insert fmt into sed?
   not much to talk about this one. just plain wrong.


i dislike the 'T' command inserted into GNU sed

   maybe i didn't get the point, but is it really needed?
   the if/then/else structures were already fully supported
   by plain sed, as eric shows, so why T?
   http://www.student.northpark.edu/pemente/sed/ifelse.txt
   


well, now i said what i should have said before but the
lack of time didn't allow me to do it.

paolo, it is nothing personal. i admire you.
but as nobody dropped a single line against *any* GNU sed
new feature, and they were HUGE, i wanted to speak.

maybe i'm just a KISS freak, maybe i'm a dinosaur inside
a young body, but i don't want sed to follow the way to
BLOATware as many GNU tools did. (hint: sort -u)


final thoughts:
  after all, dc.sed was written in vanilla sed.
  why us plain mortals will need more commands to "edit text"?

&;)





SED must stay diet



[aurelio]
hi all,

as the subject is "give your opinion", i'll do my part. :) i'll extend
the subject and talk about new GNU sed features and KISS.

i'm sorry my words may seem stronger than the subject desires, but
it's really frustrating that my english vocabulary is so limited. i
can't write exactly the way i think, but i hope i'll not offend
anybody, this is not what i want.

i really admire paolo's work and disposal to take the
GNU-package-that-NOBODY-wanted-to-touch, improving the regex machine
speed, correcting bugs and adding the brother commands as 'Q', 'R' and
'W'

  [paolo]
  I'd add 'T' to the list.  It is actually much more useful than W,
  which is actually there more for symmetry and HHsed-compatibility
  than for anything else. (more on this later).

  [peter tillier]
  I agree totally.

[aurelio]
but i'm also worried that sed will follow the ssed way, which in my
opinion "has crossed the line" and its more-than-sed. i appreciate the
ideas and code effort, but i don't use it because i like sed simple.
no perl mimic, no multiline mode, no system command execution.

  [peter tillier]
  Interestingly I feel the same about the changes to GNU sed and some
  recently proposed changes to GNU awk.  I'm a big fan of both tools,
  I write more awk, but I'm more fond of sed.  Why do I write more
  awk? Well, because some of the scripts that I write will be
  maintained by others at work and I think that awk's syntax is easier
  to learn than sed's.

  I don't like perl much as my sig. shows.  Why?  Because of the
  things that perl is and which worry Aurelio about the latest version
  of sed. IMO perl is a bloated, everything-but-the-kitchen-sink,
  language and it's too darn big!  And I don't want sed to follow the
  same route.

    [björn]
    I agree with both Peter and aurelio. Sed is sed. It has a long
    Unix history and it is available in different versions on all Unix
    platforms. Too many new features and extensions to GNU sed will
    make it not sed anymore.

  [peter tillier]
  On one OS that I use the maximum memory is 4Mb or 2Mb depending on
  the machine and I can run an ANSI-89 C compiler for that OS, plus
  awk and GNU sed up to 3.02.80 from a 1.44Mb floppy disk.  Perl 4.036
  can (just) be compiled on the 2Mb machine, but won't fit on a floppy
  alongside the compiler, awk and sed.  As to perl 5.0 it won't even
  compile on the 4Mb machine and would take a lot of floppies to
  accommodate the modules, etc..  I haven't yet tried to compile GNU
  sed 4.0 for this OS.

  One thing recently asked for once again in GNU awk (it raises its
  ugly head about once a year or so) is an include facility similar to
  the C pre-processor directive.  This is currently made available
  through the use of an external shell program called igawk on systems
  that support it, but people seem to think that it would be better if
  it was a built-in function.  No one has yet, IMO, provided a cogent
  reason for its adoption.  There's no need for something that is
  currently available in another tool.

[aurelio]
* i dislike zap command idea

there's no need to add a command that can be substituted
by just 2 commands, and x;s/.*//;x is not that long!

  [brian hiles]
  I have come rather late to this thread; indeed, I have purposely
  avoided the discussion thread of sed extensions because in my past
  capacity as a compiler writer, "language lawyer," and language
  developer, I am -- what can be the word without offending anybody?
  -- "concerned" by what I see (and have seen and seen and seen...) as
  "suboptimal" language design. Lest this degenerate into a rant, let
  me instead be constructive and give Brian's Three Rules of Language
  Design:

  (1) What _should_ work, _will_ work! (The language is consistent).
  (2) Provide _tools_, not _features_, at well-defined levels of
  abstraction. (The language is complete).
  (3) Never, EVER, tell the programmer what he or she must or should
  do. It is the ONLY duty of the language designer to satisfactorily
  fulfill rules (1) and (2) and the rest will take care of itself.

  If they sound rather putative and didactic, I admit this. It's just
  that it's so frustrating when so few programmers understand the
  mathematical concepts of language design and parsing theory that it
  cannot be explained, ironically, that the essence of proper design
  is just good common sense.

  Concerning the above, do you realize what x;s/.*//;x has to _do_
  merely to reset the "t" flag -- if I understand the context
  correctly? Except for not having a "T" command which does this, the
  following is MUCH more efficient, and doesn't eat into the command
  number limit of some legacy versions of sed(1), nor zap the hold
  buffer. RTFM!

  t label
  : label

[aurelio]
i think sed must remain as UNIX, little commands, that
*joined* do big things.  x;s/.*//;x is just about it.
"know your tools"

  [ed rosten]
  True in one way, except that z will presumably run rather faster.
  
  [peter tillier]
  I agree (sorry Eric) I much prefer the early UNIX philosophy of many
  tools that each perform a set of well-defined functions linked by
  pipes, etc.

    [björn]
    I agree as well. Maybe in the smaller picture a zap command may
    look convenient. But in the big picture, each new
    replace-two-commands-by-one command addition has a much larger
    negative impact.

[aurelio]
* i dislike \# to do count idea

  [ed rosten]
  It seems like pointless bloat to me. Awk is much more suitable for
  that kind of thing and has far greater (well easier to use)
  abilities in that regard. I'm not sure about \=. = is a completely
  useless command. I've sometimes wanted something similar, but = is
  so completely utterly hopeless, that I've gone on without it.
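To make the complaint about `=` concrete, here is a minimal sketch with made-up input. The command writes the line number straight to standard output on a line of its own, so the number never reaches pattern space. The usual workaround, which I assume is what most readers already use, is a second sed that joins each number with its line.

```shell
# "=" prints the line number on its own line, outside pattern space.
plain=$(printf 'a\nb\n' | sed '=')

# Workaround: a second sed invocation glues number and line together.
joined=$(printf 'a\nb\n' | sed '=' | sed 'N;s/\n/ /')

printf '%s\n---\n%s\n' "$plain" "$joined"
```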

[aurelio]
* i dislike the macro idea

  [paolo]
  So do I.  I am mildly in favor of \= only because = is completely
  broken.

  [ed rosten]
  Agreed: One can always run sed through M4 if you want macros. The
  tools exist.

[aurelio]
these two introduce new concepts on sed, which in my point of view,
are out of the sed scope and principles.

sed is a text editor, not a programming language. anything more
complicated can be easily made in perl/python/bash/whatever.

  [peter tillier]
  I know of some interpreters for esoteric languages that have been
  written in sed.  If you need to do counting in sed Greg Ubben has
  demonstrated how this can be achieved in dc.sed and other of his sed
  scripts.  Clearly if Aurelio can implement sokoban in sed then it is
  already a pretty powerful programming language.

[aurelio]
there is no need to bloat sed.

  [peter tillier]
  I agree.  It can already do most of these things if you want it to
  (one way or another).  I think it was Paolo who demonstrated how to
  use sed to write sed scripts (apologies here if it was someone
  else).

[aurelio]
* i also dislike sed doing Perl weird Regexes (as ssed does)

it was not proposed, but i want to make it clear that i think paolo's
option to NOT include it on the GNU sed was right.

  [peter tillier]
  I agree.  Not because I dislike the PCRE regexes, but because I
  don't want sed to become bloated.  I think the addition of EREs is
  fine, though.

    [björn]
    I agree. EREs are an established Unix(Posix) standard by now, but
    Perl REs aren't.

[aurelio]
ssed is a personal project and can go wherever it wants. but GNU sed
is a worldwide program that ALL Linux distributions use for BOOT
process (init.d scripts) and other critical stuff, so it must not
follow the trendy way, but stay diet.

  [paolo]
  Right.

[aurelio]
if you want sed to mimic perl, install miniperl instead and be happy!
sed is older than perl and has nothing to do with it.

  [peter tillier]
  As Arnold Robbins (the gawk maintainer) has sometimes written, "If
  you want perl then you know where to get it."  I sometimes
  abbreviate this as IYWPTYKWTGI - along the lines of TMTOWTDI.

[aurelio]
* i dislike the 'e' command and modifier inserted into GNU sed

sed interacting with system commands is way far from what sed proposes
to be. sed is about text, not commands.

  [paolo]
  You're right, but it makes sed usable for very simple things, like
  inserting the current date in a log-processing command, which were
  not possible otherwise.  I also use 'sed s/.../.../ | sh' pipes
  often enough that s///e is a nice addition for me.

  [ed rosten]
  I'll agree with that. Piping the result to sh or xargs is easy
  enough and has far fewer problems associated with it: ie none for
  sed, since it never knows.
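The two approaches discussed here can be put side by side. This is only a sketch: the file names and the `echo processing` command are invented for the example, and the second form needs a GNU sed that supports the `e` flag.

```shell
# Portable: sed only builds the command text, sh runs it.
piped=$(printf 'a.txt\nb.txt\n' | sed 's/.*/echo processing &/' | sh)

# GNU sed only: the "e" flag runs the substituted pattern space as a
# command and replaces pattern space with that command's output.
inline=$(printf 'a.txt\nb.txt\n' | sed 's/.*/echo processing &/e')

printf '%s\n---\n%s\n' "$piped" "$inline"
```

Both forms should give the same two lines. In the first, sed never knows a command was run, which is the point ed rosten makes.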

[aurelio]
using this command brings to sed scripts a new world of problems it
should never have! problems that the shell and shell script language
should be used for.
 
now sed scripts could not work if the system command used by the 'e'
command:

- was removed
- was moved
- was a symlink and the target has moved
- is not in PATH anymore
- is not executable (chmod -x) anymore
- was updated and the syntax changed
- ... (long list)

and of course, platform compatibility is completely lost on scripts
that use that.

  [paolo]
  Right, and the manual warns about this.

  [peter tillier]
  This may be handy, but is it sed?  Not really. it's a cut-down
  version of some of perl's functionality.  IYWPTYKWTGI

[aurelio]
* i dislike the 'M', 'm' modifiers inserted into GNU sed

sed is not multiline. sed was never multiline.
sed knows about line.

  [paolo]
  Again, why?

  [peter tillier]
  Perl is multi-line - IYWPTYKWTGI!

[aurelio]
using 'G' is not multiline, it is still a single
line on pattern space and it is treated as that.
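A short sketch of what the `M` modifier changes, assuming GNU sed and a made-up two-line input. After `N`, pattern space holds both lines joined by a newline. Without `M`, `^` matches only at the very start of pattern space. With `M`, it also matches after the embedded newline.

```shell
# Without M: "^" matches once, at the start of pattern space.
without_m=$(printf 'a\nb\n' | sed 'N;s/^/> /g')

# With M (GNU extension): "^" also matches after each embedded newline.
with_m=$(printf 'a\nb\n' | sed 'N;s/^/> /Mg')

printf '%s\n---\n%s\n' "$without_m" "$with_m"
```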

* i dislike the 'L' command inserted into GNU sed

insert fmt into sed?
not much to talk about this one. just plain wrong.

  [paolo]
  This is the only thing where I have the doubt of having "crossed the
  line" :-)  -i cost a lot of code (also to implement the associated
  option -s) and \[lLuUE] did as well, but they are so darn useful.
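For readers who have not used the case-conversion escapes paolo mentions, a minimal sketch (GNU sed assumed, sample text invented): `\U` upper-cases what follows in the replacement, `\E` ends the conversion, and `\u` upper-cases only the next character.

```shell
# \U ... : upper-case the rest of the replacement.
upper=$(echo 'hello world' | sed 's/.*/\U&/')

# \U ... \E : upper-case only the first group.
partial=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\U\1\E \2/')

# \u : upper-case the next character only.
capital=$(echo 'hello world' | sed 's/^./\u&/')

printf '%s\n%s\n%s\n' "$upper" "$partial" "$capital"
```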

[aurelio]
* i dislike the 'T' command inserted into GNU sed

maybe i didn't get the point, but is it really needed?
the if/then/else structures were already fully supported
by plain sed, as eric shows, so why T?
http://www.student.northpark.edu/pemente/sed/ifelse.txt

  [paolo]
  The  ty bx :y sequence can be replaced by Ty.

  [ed rosten]
  I'm not sure about this one. I find the structure suggested by
  ty bx :y quite common. I have less issue with inserting small
  commands that add very little extra, and which can improve
  readability of scripts. But it isn't necessary.

  [peter tillier]
  And others have shown how to implement while and for loops in sed if
  you really need them.
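The trade-off in this exchange can be sketched as follows. The labels `yes` and `no`, the input, and the appended text are illustrative only. The first form uses the portable `t`/`b` pair, the second the GNU-only `T`, which branches when no substitution was made.

```shell
# Portable if/else: "t" jumps when s succeeded, "b" skips the "then" part.
portable=$(printf 'foo\nbar\n' | sed -e 's/foo/FOO/' -e 't yes' -e 'b no' \
    -e ':yes' -e 's/$/ (changed)/' -e ':no')

# GNU sed: "T" jumps when s did NOT succeed, so one branch replaces two.
gnu=$(printf 'foo\nbar\n' | sed -e 's/foo/FOO/' -e 'T no' \
    -e 's/$/ (changed)/' -e ':no')

printf '%s\n---\n%s\n' "$portable" "$gnu"
```

Both versions should produce the same output, so `T` saves a branch and a label at the cost of portability.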

[aurelio]
well, now i said what i should have said before but the
lack of time didn't allow me to do it.

paolo, it is nothing personal. i admire you.
but as nobody dropped a single line against *any* GNU sed
new feature, and they were HUGE, i wanted to speak.

  [paolo]
  You did the right thing, of course it is nothing personal.  You
  might like to hear that I'm not going to implement $n in the s///
  command. :-)

  [peter tillier]
  I agree, too many changes make the tool too far removed from sed on
  other platforms and make it more like perl - IYWPTYKWTGI!

[aurelio]
maybe i'm just a KISS freak, maybe i'm a dinosaur inside
a young body, but i don't want sed to follow the way to
BLOATware as many GNU tools did. (hint: sort -u)

  [paolo]
  Again, I think the particular example you made is a matter of a
  feature being very economic to implement.  But it strikes me that
  with all these bloats cut -v is not there.

  [peter tillier]
  Putting too much into any tool (or language) makes it unwieldy.
  Look at C versus C++, I know which I'd prefer to program in.
  Certainly not C++, which has been through many variations until the
  standard was published. I think that the recent C standard may have
  gone too far with C.  I'm quite happy with the C89/ISO C90 standard
  thanks.

    [björn]
    This is a good example, but from another angle too; I don't think
    there is anything wrong with C++. However, C++ and C are
    completely different languages, each with their own language
    specification. C++ is still mostly backwards compatible, being a
    super set of C. Much like I imagine super sed to be. It is a super
    set of sed, but it is also its own program. Anyone can install
    them side by side, or replace their system sed with super sed. But
    for sed itself, just as with C, I think there is a lot of legacy
    responsibility to not extend it into something else.

[aurelio]
final thoughts:
after all, dc.sed was written in vanilla sed.

  [peter tillier]
  And on a Sun box where there are limits to the number of sed
  commands and it's still pretty efficient in operation.

[aurelio]
why us plain mortals will need more commands to "edit text"?

  [peter tillier]
  Amen to that.



Proposing of new SED commands



[brian hiles]
... Thus my "sd" debugger on Eric Pement's site. It was the program I
wrote before I commenced (and subsequently finished) writing a k/sh
lexical scanner and parser in (old) sed plus (old) awk (itself 2000
lines of code!), and implements conditional spypoint tracing on line
and/or pattern range(s), for the pattern and/or hold space, and for
all or a subset of embedded spypoints, just like any good debugger
does. sed(1) cannot step and break: hint, hint! A sed "hook" (perhaps
to be implemented with a new printf directive -- see following text)
for use with an external debugger, is the _only_ debugging command
necessary for sed.

Again, I admit that I have come late to this thread, but tolerate me
for just a bit longer, to say that I am of Peter's opinion concerning
complexity, and let me further add that ideas I have heard about the
"s" command are just plain wrong. Solve the _real_ problem and
implement extensions by allowing multiple and additional flags:

s///e # use ANSI escape sequences, including \n

  [paolo]
  GNU sed does this by default (i.e. unless you have POSIXLY_CORRECT
  set).

    [brian hiles]
    GNU sed has many, many wonderful options! But I was talking
    about (1) canonical enhancements that can be applied without
    compromising backward compatibility to (2) a sed in the same
    lineage (and design philosophy) as distribution sed(1).

      [paolo]
      I don't think POSIX allows things such as \t or \xAA, yet I
      don't see how this seriously hinders backward compatibility
      more than \+ or \|.

        [brian hiles]
        That's why a terminating option, to assert extended ANSI
        usage, is a good idea -- it provides necessary (and often
        asked for) functionality without compromising backward
        compatibility.
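As a concrete case of the escapes under discussion, here is a sketch assuming GNU sed in its default mode and an invented input. `\t` in the replacement is turned into a tab character. Other sed implementations may treat it differently, which is the compatibility concern raised above.

```shell
# GNU sed: "\t" in the replacement becomes a literal tab.
tabbed=$(printf 'a b\n' | sed 's/ /\t/')

# Reference value built with printf, which is specified to expand \t.
expected=$(printf 'a\tb')

[ "$tabbed" = "$expected" ] && echo 'space replaced by a tab'
```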

[brian hiles]
s///v # allow var substs of (pushed) \1, \2, ...

  [paolo]
  Can you expand on this?

    [brian hiles]
    I'm sure that I wasn't being very clear when I wrote this; to
    wit: inasmuch as I was discussing pushing and popping regex
    environments, the ability to push/pop regex's AND pattern/hold
    AND even numbered buffers separately was key. The "v" flag would
    substitute \1, \2, ... from the current environment instead of
    the conventionally understood behavior. They would be like
    variables, then....

      [björn]
      No, no, no! This is simply not sed. Use awk instead. It is
      ridiculous to make sed into a fully fledged programming language
      (it already is, but I hope you get my meaning). There is already
      a multitude of other tools to choose from if you need printf,
      file manipulation, variables, debugger, etc. Why bloat sed? Why
      make GNU sed into something which is not sed?

      I would recommend anyone thinking of all these dramatic
      extensions to sed to read the original sed manual at Laura's
      site (http://lf.8k.com/UNIX/SED.HTM). Then ask yourself if your
      extension is really in the spirit of sed.

      All the extensions I have read about so far were technically
      possible to implement in the original sed (except maybe perl REs
      :-), yet they weren't.
      
        [paolo]
        Q is not (except at a very high cost in performance, or by
        forcing one to use -n).  R is not.

          [björn]
          I am not sure I understand you? I am saying that most of the
          features suggested (or implemented) now were possible to
          implement in the original sed way back when, and so I am
          suggesting that leaving them out was a design choice made by
          the original authors of sed. That is why I am talking about
          the design intentions of the original authors of sed.

            [paolo]
            Ah, I meant "it was possible to obtain their effect with
            the original sed".
      
              [björn]
              Oh, ok. All the below is said IMO. I'd like to first say
              that I don't think the fact that a feature is useful is
              a strong enough motivation for including it. Any feature
              can be "useful" depending on how you look at it. It is
              possible to come up with an example where even the
              horrible L command could be "useful". IMO, one always
              has to look at the bigger picture also.

        [brian hiles]
        I am amused -- or frustrated -- that we are really talking about
        exactly the same thing. I encourage you to read the entire
        thread to understand the reason I made one of my infrequent
        rants/contributions was to defer what I perceived was (1)
        another ill-conceived attempt at featuritis; (2) inelegant and
        unacceptable proposals for the bugs and/or omissions in sed(1)
        that I feel do need to be addressed.

          [björn]
          I did read the whole thread, albeit cursory, if you are
          referring to the thread "random thoughts about current sed
          and development" that is. I think I understand now what you
          mean though, even if I disagree that there are any larger
          omissions in sed that need to be corrected.

        [brian hiles]
        We're on the same side!

          [björn]
          I see that now.

        [brian hiles]
        What I was doing was examining a protocol for a controlled
        development upon sed(1) -- not that I was necessarily
        encouraging enhancements.

          [björn]
          I see now that you were arguing that IF changes were to be
          made, they should be more in line with an overall design
          principle rather than small patches here and there, ie
          features.

        [brian hiles]
        And BTW, there is a VERY fundamental difference between
        language completeness and "featuritis." As a talented language
        designer, I cannot entertain discussion on this until at least
        the [mathematical] paradigm is accepted.

          [björn]
          I agree. That is why I am talking about general design
          guidelines and ditto intentions. Changing those is an
          organised, structured way to make changes rather than adding
          features here and there without considering the language
          design as a whole. Am I interpreting you correctly?

            [brian hiles]
            Yes.

      [björn]
      The design goals of sed obviously are different from those of
      other Unix tools. I think any extension made to sed should be
      made trying to keep the original design goals in mind.

      I hope I don't come off too strong here, it is just that I feel
      strongly about this issue, and about the Unix spirit. I don't
      mean this as a rant. If GNU sed becomes too feature filled and
      bloated (and I don't mean bloated as in binary size or memory
      foot print, but rather in the featuritis sense), I personally
      will switch to BSD sed or earlier GNU versions for use on my
      GNU/Linux systems.

      (I don't like most of the new commands I've seen from the new
      GNU sed 4. I wasn't aware of them until I saw them described by
      aurelio earlier in this thread.) If GNU sed is becoming super
      sed,

        [paolo]
        Since GNU sed 4, super sed does not have anything new except
        Perl REs. That is, super sed is simply GNU sed  with a
        different regular expression matcher and with Perl REs.
      
          [björn]
          Ok. I never did look too closely on super sed, all I know
          about it is from what I've read on this group.

      [björn]
      perhaps bug fixes could still be back ported to eg GNU sed 3.x?
      Then there would still be a GNU sed for people who don't want
      the bloat.

[brian hiles]
s///g,w filename # [multiple flags!]

  [paolo]
  Already there (s///gp or s///gw filename both work).

    [brian hiles]
    "s///gw filename" works? Not on my (admittedly ancient) sed(1)!
    Nice to know, though.
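A quick sketch of the combined flags paolo refers to, assuming GNU sed. The temporary directory and the file name `changed.txt` are made up for the illustration.

```shell
tmp=$(mktemp -d)

# g + p: replace every match and print only the lines that changed.
printed=$(printf 'aaa\nbbb\n' | sed -n 's/a/x/gp')

# g + w: replace every match and write the changed lines to a file.
printf 'aaa\nbbb\n' | sed -n "s/a/x/gw $tmp/changed.txt"
written=$(cat "$tmp/changed.txt")

rm -r "$tmp"
printf '%s\n%s\n' "$printed" "$written"
```

Note that the `w` flag must come last, because everything after it up to the end of the line is taken as the file name.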

[brian hiles]
s///1-4,34-,w filename

  [paolo]
  Hmmm..., this would not exactly be a breeze to implement!  But I
  agree it is very powerful.

    [brian hiles]
    Why not? Enumerated substitution sequence substitution is already
    supported, and this is merely a range extension to that idea
    (1-4,34- instead of just one number). I sincerely hope that you
    may find it at least straightforward to implement.

      [paolo]
      Yes, but the parsing stage of sed (at least GNU sed) is already
      quite convoluted.  Well I could steal some code from cut.

        [brian hiles]
        I was afraid it was so. I have not seen the source code of
        distribution sed(1) nor GNU sed, but knowing Thompson's
        algorithm for the generation of IFAs, I would not be surprised
        if it was goto-hell spaghetti code.
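For reference, this is the existing numbered substitution that the range proposal would extend. The range syntax itself (`1-4,34-`) is only a proposal in this thread, so the sketch shows just what sed already does. The sample input is invented, and combining a number with `g` is a GNU extension.

```shell
# A number N replaces only the Nth match on the line.
third=$(echo 'aaaaa' | sed 's/a/x/3')

# GNU sed: a number combined with g replaces the Nth match and all later ones.
from_third=$(echo 'aaaaa' | sed 's/a/x/3g')

printf '%s\n%s\n' "$third" "$from_third"
```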

[brian hiles]
s///g,v,t,e,1-4,w filename
# any others you can think of?

The above, as well as new printf/readf commands, a command to redirect
I/O to given file unit numbers (file descriptors), are the only
commands needed to be added. Rule (1).

Any extensions to the language syntax itself must be implemented as
allowing for a push-down stack: for pattern space AND hold space,

  [paolo]
  Yep.  I thought of having > and < commands that push and pop the
  contents of pattern space (not both, because you can always do
  >;x;>;x and x;<;x;<).

    [brian hiles]
    Certainly that's an idea.

[brian hiles]
saved buffers (vars),

  [paolo]
  This is very powerful, but maybe this is overkill.

    [brian hiles]
    I agree. But I was making a true effort (I made 15-20 drafts
    before I -- incorrect ;) -- sent the post) as to Rule (2) --
    logical completeness of the specification.

[brian hiles]
printf/readf directives,

  [paolo]
  I don't agree you need these.

    [brian hiles]
    It's true that readf is not in the vernacular of sed(1), but
    since I have wished over and over for numerical evaluation and
    formatting, strings in specific field widths, etcetera -- and
    especially because field extraction and handling is such a pain
    in sed(1)! -- a printf would really be nice. Sed(1) _is_ a
    filter, so I thought....

      [paolo]
      I'd use awk for numeric stuff...

        [brian hiles]
        I had sent the email (after 15 drafts! :) before I realized
        the reason that I had thought readf was so important was the
        very reason I was going on and on about providing just such a
        hook for an external debugger.

        My aforementioned debugger (before it gets written _back_ into
        ksh(1)) cannot break at a spypoint -- that is, it cannot have
        specified the place to stop execution temporarily to allow the
        debugger to browse the current environment -- without some
        kind of read statement. Very, very important.

    [brian hiles]
    I was hoping not to impress upon a list of enhancements, per se,
    but to apply a little common sense to the "featuritis" that I see
    creeping into the proposed extended sed. My only intention -- and
    I truly made an effort to succeed at expressing this -- was to
    proffer a comprehensive _minimum_ set of language elements
    providing _maximum_ usability. Of course, the final decision is
    yours -- but even this does not necessarily mitigate against all
    that I have said if you keep to a _same_ given level of design
    sophistication. Distribution sed(1) shows by the existence of this
    very mailing list how much can be done with so little, and is a
    credit to the "do one thing, and do it well" overall design of
    Unix, which has worked so well.

[brian hiles]
and regexes -- which would otherwise be implemented by functions,
macros, multiple I/O streams a la m4(1), etcetera. Rule (2).

Anything more and you might as well program in awk, as has been
previously observed. Rule (3).

Allow a command line option to source a given sed file, like bc(1)
does with its "-l" option. Make sure it is allowed to specify more
than one -l option argument.

  [paolo]
  Why not -f?

    [brian hiles]
    Because the -l option applies to defining functions, setting
    macros, etcetera -- all those things I said were the "_tools_,
    not _features_, at well-defined levels of abstraction." I hope
    you understand that it _cannot_ be provided as an -f option.

      [paolo]
      Well, that implies that you have functions, macros, and features
      of *that* level.  It's quite a long way from the current sed --
      the creeping features in sed did anyway keep the same core
      concepts for all the commands except perhaps the fmt-like L, and
      e (which I think is maybe not orthodox for scripts, but is very
      nice for one-liners and pipelines).

      The commands I added don't add constructs to sed, only
      functionality (Q to quit without printing, T to jump on not
      substituted, R to read one line of a given file into pattern
      space, W to write the first line to a file).

        [brian hiles]
        I totally agree. Again, I was attempting to be "complete."

        How is "Q" different from "d;q"?

          [stew ravenhall]
          In the version of HP-UX sed I use "d" deletes the pattern
          space, and execution resumes at the first line of the sed
          script, so the "q" would never be executed.
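The difference can be checked with a three-line input. This is a sketch assuming GNU sed for the `Q` case. The `-n` variant is the portable equivalent that paolo alludes to elsewhere in the thread.

```shell
# q prints the current line, then quits.
with_q=$(printf '1\n2\n3\n' | sed '2q')

# d restarts the cycle immediately, so the q after it is never reached.
with_dq=$(printf '1\n2\n3\n' | sed -e '2{' -e 'd' -e 'q' -e '}')

# Portable way to quit without printing: suppress autoprint with -n.
with_n=$(printf '1\n2\n3\n' | sed -n -e '2q' -e 'p')

# GNU sed: Q quits without printing the current line.
with_Q=$(printf '1\n2\n3\n' | sed '2Q')

printf 'q: %s\n' "$(echo $with_q)"
printf 'd;q: %s\n' "$(echo $with_dq)"
printf -- '-n: %s\n' "$(echo $with_n)"
printf 'Q: %s\n' "$(echo $with_Q)"
```

With `d;q`, line 2 is dropped but line 3 is still printed, which confirms stew ravenhall's explanation.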



GNU SED 4 new commands



[paolo]
Don't worry, I am not going to add commands any more to GNU sed :-)

  [björn]
  Didn't you already add several commands? I count seven GNU
  specific commands in the 4.05 manual. Since the original sed commands
  are 24, I consider that a lot.

    [paolo]
    Yes, but I'm not going to add any more. Now:
    - W is present for compatibility with other seds that implement
      it; it can be useful anyway

      [björn]
      Which other seds implement it?

        [paolo]
        (Don't take this as a flame war, but rather as a sorely needed
        explanation of some of my choices).

        HHsed and sed 1.6

          [björn]
          Not at all. If anything, I am afraid that I am too hard in
          my critique. Being a maintainer isn't always the most
          grateful job to have.

          Is this GNU sed 1.6 you are referring to? (I have never
          heard of HHsed before.)

            [paolo]
            HHsed and sed 1.6 are both improved versions of the
            original Eric Raymond sed.

              [björn]
              I see. Are they actually in use, ie are they the default
              sed on any platform? (Asking only out of curiosity.)

      [björn]
      I don't see why it would ever be particularly useful, especially
      considering it is very similar to the existing w command.

        [paolo]
        Well, if so, P would also be useless :-)

          [björn]
		  I guess you could argue like that, but at least half of my
		  argument is that GNU sed shouldn't be turned into something
		  which is no longer sed. Maybe if someone were to design a
		  streaming editor today from scratch, it wouldn't look much
		  like sed. Nevertheless, sed has an important legacy to
		  consider.
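
The parallel between p/P and w/W can be sketched as follows, assuming
GNU sed 4.x for the W command (the file names "first" and "whole" are
hypothetical). Just as P prints only up to the first newline of the
pattern space, W writes only that first line, while w writes all of it.

```shell
dir=$(mktemp -d)

# N joins two input lines, so the pattern space holds "a\nb".
# W writes only up to the first newline; w writes the whole pattern space.
printf 'a\nb\n' | sed -n -e 'N' -e "W $dir/first" -e "w $dir/whole"

first=$(cat "$dir/first")
whole=$(cat "$dir/whole")
rm -rf "$dir"

printf 'W wrote:\n%s\nw wrote:\n%s\n' "$first" "$whole"
```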

    [paolo]
	- T is a shortcut which can make sed scripts less spaghetti-like

      [björn]
	  Its functionality is easily replaced by three other lines. I
	  disagree that the impact of adding a completely new command
	  outweighs saving 2 lines at rare places.
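
A minimal sketch of the trade-off, assuming GNU sed 4.x for T (the
label name "hit" and the " (changed)" marker are made up for
illustration). T branches when the previous s command did not
substitute; the portable spelling needs t, b and a label.

```shell
input='foo
baz'

# GNU sed: T with no label jumps to the end of the script
# when the preceding s command did not substitute.
with_T=$(printf '%s\n' "$input" |
  sed -e 's/foo/bar/' -e 'T' -e 's/$/ (changed)/')

# Portable spelling with t and b: three lines instead of one.
with_t=$(printf '%s\n' "$input" |
  sed -e 's/foo/bar/' -e 't hit' -e 'b' -e ':hit' -e 's/$/ (changed)/')

printf '%s\n' "$with_T"
```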

    [paolo]
    - R is very useful

      [björn]
      What is so useful with R that cannot be done with r?

        [paolo]
		Everything :-)  R reads *a line* of a file *into pattern
		space*.  r prints the whole contents of a file without
		allowing any kind of editing.  It is a very common question
		"how do I mix files with sed" and my solution is usually to
		take one file, pipe it through sed to generate a sed script,
		and run the script on the other file.  R adds a much simpler
		alternative.

		On second thought, it would have probably been better to add
		optional file name arguments to the n and N commands.  But
		then w is also a mistake, it would have been better to add
		file names to p and P which would have removed the need for
		W... the original sed is damn good, but not perfect (and I
		have not -yet- taken = into account...)

          [björn]
		  I agree it is not perfect -- but it is sed. Still, it is
		  possible to achieve similar effects by using more than one
		  sed invocation. Sed is not designed to be used for
		  everything, and should not be used for everything. Many
		  times when you have several simultaneous input files, awk is
		  a better choice of tool for example.
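
A minimal sketch of the "mixing files" case, assuming a released GNU
sed 4.x, whose manual describes R as queuing one line of the file for
output at the end of the current cycle. The sketch relies only on that
documented behaviour; the file names a.txt and b.txt are hypothetical.

```shell
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a.txt"
printf 'b1\nb2\n' > "$dir/b.txt"

# For every line of a.txt, R queues the next unread line of b.txt,
# so the two files come out interleaved line by line.
mixed=$(sed "R $dir/b.txt" "$dir/a.txt")
rm -rf "$dir"

printf '%s\n' "$mixed"
```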

    [paolo]
    - Q can often avoid using -n and obscuring scripts

      [björn]
      What is wrong with using -n?

        [paolo]
        On a one-liner I prefer /bar/Q to -n /bar/q;p

          [björn]
		  Is that minuscule difference really worth introducing a new
		  command into sed?

            [paolo]
            IMHO yes.  Of course others' mileage may vary...
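
The two one-liners being compared can be sketched side by side,
assuming GNU sed 4.x for Q (the input is made up for illustration).
Both print everything before the first line matching /bar/.

```shell
input='foo
bar
baz'

# GNU sed: quit at the first match without printing it.
with_Q=$(printf '%s\n' "$input" | sed '/bar/Q')

# Any sed: suppress autoprint, quit at the match, print the rest explicitly.
with_n=$(printf '%s\n' "$input" | sed -n -e '/bar/q' -e 'p')

printf '%s\n' "$with_Q"
```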

        [paolo]
		There are additions, like \[lLuUE], which could be
		misinterpreted by other seds.  By adding a v command you can
		ensure correct results.  Or there might be known bugs that are
		fixed in later versions.

          [björn]
		  I still don't understand. Are you saying that I could
		  include the v command in my GNU sed scripts to make sure
		  that they break rather than give unexpected results on
		  another sed?

            [paolo]
            Yes.
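
A minimal sketch of that use of v, assuming GNU sed 4.x. Under GNU sed
the v command is a no-op and \U upper-cases the match; a sed that does
not know v is expected to reject the script rather than silently
produce a different result.

```shell
# v is a no-op in GNU sed; a sed without the command should refuse
# the script instead of misreading the \U escape that follows.
upper=$(printf 'hello\n' | sed -e 'v' -e 's/.*/\U&/')

printf '%s\n' "$upper"
```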

    [paolo]
    - L is definitely a mistake :-)

      [björn]
      I agree. :-)

    [paolo]
    - e is useful though very controversial

      [björn]
	  My opinion is that it is a huge mistake in the same vein as L
	  is. It adds a whole new semantic to sed, while not being of any
	  use. If one would like to process output from another command,
	  the proper way is to pipe it to sed, the Streaming EDitor.

    [paolo]
	- v does nothing; it can prevent subtle bugs, since in 4.1 and
	  4.0.6 it can accept a version number

      [björn]
	  So what is it used for? I'm afraid I don't see the point of a
	  command which does nothing. :-)

    [paolo]
	So I count 5 useful commands, 1 controversial command and 1
	mistake :-)

      [björn]
	  I count 4 questionable commands, 2 horrible semantic-destroying
	  commands, and 1 command that does nothing.

        [paolo]
        Well, that's a point of view :-)

[paolo]
What will be added in GNU sed 4.1 (I already did so in my local copy,
but of course the release is far from mature) is:

- better treatment of multibyte characters.  A slash inside a
  multibyte character will not terminate a regex.

- fixing the bug with \n not being parsed correctly in the `y' command

- enabling // in POSIXLY_CORRECT mode

  [björn]
  All these seem good to me. :-)

[paolo]
- possibility to use Emacs-style backup file names when you use
  in-place editing.  I need to do this for consistency with patch and
  other GNU utilities; I agree it is not strictly necessary and bloats
  sed a bit.

  [björn]
  Are you saying that this is required by the GNU project? I'm not
  sure I understand, how would you do in place editing with sed?

    [paolo]
	It is just expected by some users who do use the VERSION_CONTROL
	variable with patch(1).  You do in-place editing with the -i
	option in GNU sed 4.  It works like

      [björn]
	  Ok, I didn't see that this was also added. In-place editing is
	  contradictory to sed being a /streaming/ editor, IMO. The usual
	  way of doing in-place editing would be with ed.

        [paolo]                
		Which is much more complex and not always really up to the
		job, for example for complex tasks like removing C comments.
		sed scripts are quite widespread (at least on this list's
		subscribers' PCs...) and it takes little to add -i to a
		command line.
      
          [björn]
		  If I were to change C source files, I sure as hell wouldn't
		  run an automated script without keeping backups until I can
		  verify that the script result worked out ok. In short: it is
		  almost always preferable to keep the old file until the new
		  transformed file can be verified.

            [paolo]
			Usually what I do is tarring the whole directory
			structure, running sed on a couple of files to check the
			results, then doing

               find . -type f -print0 | xargs -0 sed -i -f script.sed

              [björn]
			  So you are keeping the tar archive as a backup? I don't
			  see why that would be more convenient than to just
			  rename all files with a backup extension, and then run
			  sed producing the new files. To each his own I guess. I
			  often move the file(s) I want to edit to $filename.orig
			  or something like that before I sed it back into the
			  original name.

          [björn]
		  I should mention that there are a few other ways of doing
		  in-place editing with sed:
          
          1) The traditional method:
          
             sed -f script file > file.tmp
             mv file.tmp file
          
             Easy and reliable.
          
		  2) There are also ways to avoid having to create a temporary
			 file. As I understand it, even gsed -i creates a
			 temporary, so this method has an advantage over gsed too:
          
             (rm -f file; sed -f script > file) < file

            [paolo]
            Cool!

          [björn]
		  The method relies on the fact that a file is not unlinked as
		  long as it is being accessed.
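
Both methods can be tried on scratch files; this is a minimal sketch
using a hypothetical one-command script (s/^/>/) and made-up file
names f1 and f2. It works with any sed, since no GNU extension is used.

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/f1"
cp "$dir/f1" "$dir/f2"

# 1) Traditional: write to a temporary file, then rename it over the original.
sed 's/^/>/' "$dir/f1" > "$dir/f1.tmp" && mv "$dir/f1.tmp" "$dir/f1"

# 2) No temporary: the subshell's stdin keeps the old file open, so it
#    can still be read after rm, while > creates a fresh file of that name.
( rm -f "$dir/f2"; sed 's/^/>/' > "$dir/f2" ) < "$dir/f2"

r1=$(cat "$dir/f1")
r2=$(cat "$dir/f2")
rm -rf "$dir"

printf '%s\n' "$r1"
```

Note that in the second method the original name is gone as soon as rm
runs, so a failing script leaves no copy behind.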

      [björn]
	  I also question the -s option. It is very easy to implement the
	  -s option, eg with Bourne shell syntax:
      
         for f in file1 file2 file3
         do
            sed -f script.sed "$f"
         done

        [paolo]                
		The -s option is a freebie that is needed to implement -i
		correctly because -i implies it.  It might be featuritis to
		allow it even when -i is not there.

          [björn]
          I see. I didn't make the connection between the two.
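
What -s changes is easiest to see with the $ address; a minimal
sketch, assuming GNU sed 4.x for -s (file names a and b are
hypothetical). Without -s the files form one long stream; with -s, or
with the shell loop, each file is treated on its own.

```shell
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a"
printf 'b1\nb2\n' > "$dir/b"

# Without -s the files form one stream: $ matches only the very last line.
joined=$(sed -n '$p' "$dir/a" "$dir/b")

# With -s (GNU) each file is its own stream: $ matches once per file.
separate=$(sed -s -n '$p' "$dir/a" "$dir/b")

# The shell loop gives the same result with any sed.
looped=$(for f in "$dir/a" "$dir/b"; do sed -n '$p' "$f"; done)
rm -rf "$dir"

printf 'joined:\n%s\nseparate:\n%s\n' "$joined" "$separate"
```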

      [björn]
	  It seems to me that several of the extensions are for making it
	  easier to write self-contained sed scripts.

        [paolo]
		No, my intention was to make it easier to replace complex
		pipelines with a single sed invocation.

      [björn]
	  Scripts that don't have to be wrapped in shell scripts, or
	  having to make use of any external utility. IMO, this is very
	  wrong. sed was designed from the start to be used in conjunction
	  with the other Unix tools, not replacing them. All of e, L &
	  -s are features of this type, and I suspect that R, W & Q
	  are in a sense too.

        [paolo]
		Don't consider L.  It is a mistake indeed.  But e is designed
		to run other Unix tools, and hence to make sed work in
		conjunction with them!

          [björn]
		  IMO, e is an absolute abomination. It doesn't fit in at all
		  with the rest of the sed commands or the sed philosophy,
		  IMO.

            [paolo]
            I might be too picky in counting keypresses, but I prefer
            
            ls | sed 's/.*/mv & \L&/e'
            
            to
            
            ls | sed 's/.*/mv & \L&/' | sh
            
            :-)

              [björn]
			  I never use sed for such things. I would consider it a
			  shell duty. The common way to do it in the shell is by a
			  loop,

                 for f in *; do <sthg with $f>; done

              I have written a shell function that permits me to write

                 each "*" mv %1 %1.orig          # (bad example)
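
The two spellings can be compared safely with a harmless command in
place of mv; a minimal sketch, assuming GNU sed 4.x for the e flag
(the "echo got:" command is made up for illustration).

```shell
# Plain sed builds the command text and a separate shell runs it.
via_sh=$(printf 'a b\n' | sed 's/.*/echo got: &/' | sh)

# GNU sed's e flag runs the substituted pattern space as a command
# and replaces the pattern space with that command's output.
via_e=$(printf 'a b\n' | sed 's/.*/echo got: &/e')

printf '%s\n%s\n' "$via_sh" "$via_e"
```

Either way the generated text is interpreted by a shell, so file names
containing spaces or shell metacharacters need quoting.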

          [björn]
		  (The same goes for L, but since you say it was a mistake, I
		  won't harp on it. BTW, if you consider L to be a mistake,
		  couldn't you describe it as deprecated in the manual, and
		  say that it might be removed in a future version of GNU sed?
		  I cannot imagine it is of any wide use anyway.)

            [paolo]
            Yes, I was thinking of this too.

        [paolo]
		Why Q?  And if R and W are designed to replace Unix tools I
		don't see why r and w aren't.
        
          [björn]
		  Are you suggesting that I cannot be in favour of not having
		  R & W without also wanting to get rid of r & w?

            [paolo]
            No, that I did not understand your parallel between [eL]
            and [QRW].

              [björn]
			  Well maybe there is no parallel. The reason I am
			  opposed to e & L is because they are not 'sed', and
			  the only things they make easier are things that should
			  be done with other tools, or in conjunction with other
			  tools. The reason I am a bit doubtful about Q, R & W
			  is that I don't think the gain from them outweighs the
			  negative aspects of introducing new commands, and
			  breaking legacy with original sed.

          [björn]
		  I am not arguing for making changes to the original sed,
		  quite the opposite. I am arguing that intrusive changes to
		  the sed language should not be made to a sed which is the
		  default sed on many platforms.

            [paolo]
			Note that all the changes in GNU sed, except escapes in
			regular expressions, are 100% backwards compatible.  I am
			not sure this is true of bash.

              [björn]
			  I only mentioned bash because it is not sh, and does not
			  try to be sh, but sh-compatible. Let me show you what I
			  mean:

              1497 d95-bli@hasse:~> ll /bin/*awk*
              lrwxrwxrwx  1 root  root       4 sep 23 15:30 /bin/awk -> gawk
              -rwxr-xr-x  2 root  root  248748 mar 18  2002 /bin/gawk

              1498 d95-bli@hasse:~> ll /bin/*sh*
              -rwxr-xr-x  1 root  root  541096 apr 12  2002 /bin/bash
              lrwxrwxrwx  1 root  root       4 sep 23 15:29 /bin/sh -> bash

              1499 d95-bli@hasse:~> ll /bin/*sed*
              -rwxr-xr-x  1 root  root   54949 apr  5  2002 /bin/sed

			  I am not comparing sed to bash or gawk, only saying that
			  if GNU sed is going to aim to be a superset of
			  original/POSIX sed, then I would rather see the last
			  example look like

              -rwxr-xr-x  1 root  root   54949 apr  5  2002 /bin/gsed
              lrwxrwxrwx  1 root  root       4 sep 23 15:29 /bin/sed -> gsed

              (This is all on a Redhat 7.3 system.)

			  You are right that bash is not completely compatible
			  with either old sh or POSIX.
            

            [paolo]
			I also happen to agree with you about obtrusive changes,
			and that's why I am not ever going to add Perl REs to sed!

              [björn]
              Thank you!

            [paolo]
			I think \[lLuUE] escapes are *much* more intrusive, both
			in terms of source code and in terms of  backward
			incompatibility (in that the script behaves wrong silently
			instead of breaking), than for example Q or W, yet you
			don't seem to have problems with them, only with new
			*commands*.  In other words, I don't understand exactly
			what kind of extension you would favor.

              [björn]
			  That is because I was not aware of those extensions
			  before you told me just now. :-) Well, for the record I
			  think those escape sequences are just as bad as the e
			  & L commands. Actually, even worse since they break
			  backwards compatibility as you say. I also don't think
			  they are in the spirit of regular expressions. They even
			  more strongly motivate having GNU sed behave like an
			  ordinary sed when called as 'sed', but allowing all GNU
			  extensions when called as eg 'gsed'.

                [paolo]
                You can name the program gsed and then use a script

                #! /bin/sh
                POSIXLY_CORRECT=1 sed gsed ${1+"$@"}

                  [björn]
                  I presume you mean 
                  
                  POSIXLY_CORRECT=1 /bin/gsed "$@"
                  
				  ? If not, what is your script supposed to accomplish
				  exactly?

                    [paolo]
					Of course.  Also, `v' disables POSIXLY_CORRECT
					behavior so you can use extensions freely.
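
The corrected wrapper can be sketched as a shell function; the name
posix_sed is hypothetical, and the sketch assumes the sed found on
PATH is GNU sed. Which extensions POSIXLY_CORRECT actually disables
depends on the GNU sed version, so only a plain substitution is shown.

```shell
# Hypothetical wrapper: run sed with POSIXLY_CORRECT set for this call only.
# ${1+"$@"} is the older spelling of "$@"; it guards against very old
# shells that expand "$@" to one empty argument when there are none.
posix_sed() {
  POSIXLY_CORRECT=1 sed ${1+"$@"}
}

out=$(printf 'foo\n' | posix_sed 's/foo/bar/')
printf '%s\n' "$out"
```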
                    
                [paolo]
                Escapes are all disabled in POSIXLY_CORRECT mode.

                  [björn]
				  That is good. Are the extra GNU commands disabled
				  too?

                    [paolo]
					Not so far, but I can change my mind for 4.1
					except for `v'; as they don't break compatibility
					(besides, who would use \l in an expression) I
					don't think it's necessary.

                [paolo]
				(Don't do that in 3.x and 4.0.x, it will break the
				empty RE)

                  [björn]
				  Um, so which version can I actually do it in? :-) I
				  have GNU sed 3.02 on my home box.

                    [paolo]
					The yet-to-be-released 4.0a which is a pre-release
					for what will be 4.1 :-)

              [björn]
			  I admit that case conversion can be a bit inconvenient
			  in Unix though, and it wouldn't hurt having some tool
			  that would make it easier. It is just that I think the
			  sed extensions described above are a very ugly way of
			  accomplishing such a task.

			  What I'm really meaning to say, and hinting at in my
			  case conversion digression, is:

			  1) For short scripts and one-liners, if you want to do
				 case conversion, sed is probably the wrong tool.
				 There are already several other alternatives (like
				 tr).

			  2) For longer scripts I think using the y command for
				 case conversion is a fully acceptable method.
				 Especially if the alternative means an intrusion on
				 the design of sed while even breaking backwards
				 compatibility.
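
The alternatives mentioned in this exchange can be sketched side by
side (the input word is made up for illustration). The tr and y
versions work with any sed; the \U version assumes GNU sed 4.x.

```shell
word=hello

# 1) tr, the usual tool for one-liners.
with_tr=$(printf '%s\n' "$word" | tr '[:lower:]' '[:upper:]')

# 2) The y command, available in every sed.
with_y=$(printf '%s\n' "$word" |
  sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

# 3) The GNU \U escape in the replacement text.
with_U=$(printf '%s\n' "$word" | sed 's/.*/\U&/')

printf '%s %s %s\n' "$with_tr" "$with_y" "$with_U"
```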

          [björn]
		  If someone wants all these little features I am much more
		  comfortable with them being made to something which is not
		  called or used as 'sed'. I thought that super sed was
		  something like this. Here is an idea I just thought of:
		  maybe GNU sed could be made so that when it is called as
		  'sed', all extensions are disabled, but when it is called as
		  'gsed' they are enabled? That would work a little bit like
		  bash, which when called as 'sh', tries to emulate a POSIX
		  shell more closely. It would make it easier to write
		  portable scripts.

    [paolo]
    sed -i 's/^/>/' FILE
    adding > signs in front of every line of FILE.

      [björn]
      echo -e ',s/^/>/\nwq' | ed file
      
	  I know I'm coming off a bit harsh here, and in a way it is not
	  very useful to complain about the features you have already
	  included. I'm just out for the discussion really, of what sed is
	  and of what it should be. GNU sed is the default (and only) sed
	  on many platforms now, so it is a great responsibility.
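
A minimal sketch of -i on a scratch file, assuming GNU sed 4.x; the
.bak suffix is an arbitrary choice that makes sed keep the original
alongside the edited file.

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/FILE"

# Quote the script: unquoted, the shell would treat > as a redirection.
# The suffix attached to -i asks sed to save the original as FILE.bak.
sed -i.bak 's/^/>/' "$dir/FILE"

edited=$(cat "$dir/FILE")
backup=$(cat "$dir/FILE.bak")
rm -rf "$dir"

printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"
```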




SED is P/NP or P=NP or none?



[brian hiles]
All the above has the virtue of being both a proper superset of sed(1)
and P/NP complete. It is possible to prove this mathematically.

  [paolo]
  sed is P/NP complete.  There is a Turing machine script on the grab
  bag.

    [brian hiles]
	I don't think you realize either the complexity of the P/NP
	problem (it's a mathematical problem that has been worked on for a
	hundred years), or the applicability of a Turing machine to its
	proof (the fact that turing.sed exists has nothing whatsoever to
	do with whether sed(1) is P/NP complete or not.)

      [paolo]
	  I probably misunderstood.  Isn't it that whatever is equivalent
	  to a Turing machine can compute any decidable problem?

        [brian hiles]
		Well, yes and no. The P/NP problem (AKA or analogous to the
		"P=NP? problem," "p-time reducibility," "NP-completeness,") is
		indeed a mathematical assertion that asks if any set of finite
		objects of a certain mathematical space (numbers, polynomials,
		...) encoded in a finite language that is accepted by a
		non-deterministic Turing machine in polynomial time is also
		"decidable in polynomial time by a deterministic machine,"
		according to an old textbook from college I had to check;
		however, P/NP (not under that name) predates Turing, although
		P/NP has come to be described using the vernacular of that
		theory. Even in this, _I don't believe_ the Turing Machine has
		itself "answered" the P/NP Problem. It is generally
		conjectured that the answer is negative, but a proof of the
		conjecture seems to be far away.

		Additional examples of NP-complete problems are "SAT, 3-SAT,
		clique, 3-colorability, graph embedding, travelling salesman
		problem, Nullstellensatz over finite fields, bounded Hilbert's
		10th problem, integer linear programming, subsetsum problem,
		hitting set and covering set problem," from the same text, of
		which I have personally studied colorability, graph embedding,
		the travelling salesman problem -- and Frege, having taken a
		formal logic class at Cal Berkeley by a protege of Kleene, a
		noted mathematician in that field.

      [björn]
	  I understand the P = NP problem, but I fail to see what it has
	  to do with sed, and -- more specifically -- what it has to do
	  with extensions to sed?

        [brian hiles]
		It has everything to do with sed, extensions to sed, language
		design, language theory, and indeed computers in general. I
		really cannot decide whether to be frank or sarcastic, but
		I'll defer and hope the issue will have been made clear with a
		reading of the previous threads.

          [björn]
		  Why don't you try being frank? I am a last year Masters
		  student in Computer Science, having studied both complexity
		  and some language theory. The statement of yours that I have
		  trouble understanding is the following:

              [brian hiles]
			  All the above has the virtue of being both a proper
			  superset of sed(1) and P/NP complete. It is possible to
			  prove this mathematically.
          
          [björn]
		  With "the above" referring to your suggestions of changes to
		  sed. In what way do you mean to say that those changes are
		  "P/NP complete"? Or are you referring to sed being "P/NP
		  complete"?

            [paolo]
			Actually I didn't understand this at all either.  I cannot
			see why the proposed additions (some of which might even
			be worse than L :-) make sed *computationally* more
			powerful.  sed is already Turing-complete (I too am a last
			year Masters student in Computer Science by the way).

              [brian hiles]
              Everybody here is a Masters student of CS? :)

			  To say "computationally more powerful" is problematic
			  usage. _Theoretically_, sed(1) could have been used to send a
			  man to the moon in the 60s. I've seen DOS Batch
			  libraries that do amazing things.

              ... But I wouldn't advise it.

			  If there is a keyword that for all my hot air would
			  encapsulate my intention and philosophy, it is "elegance
			  of design," which usually, BUT NOT NECESSARILY, is
			  minimalistic. The Unix philosophy is always a good
			  paradigm: make a program do _one_ thing, and do it
			  _well_.

            [brian hiles]
			I sought counsel with a colleague who is more familiar
			with the distinction between the P=NP Problem and Turing
			Completeness. As I had indicated, the former predates the
			latter, and so the statement that _any_ language (a
			"language" satisfying the three criteria of variables,
			flow-of-control, and I/O) is Turing Complete. In this
			much, sed is Turing Complete, _although_ a turing.sed
			(which I had known of before) does not indicate TC in
			itself but that it is possible to have been written, which
			does satisfy the conditions.

			The P=NP (P/NP) Problem has always interested me; Turing
			Machines have not. It is enough that the latter's
			existence provides an algorithmic context to the former,
			but just like the fact that Cellular Automata is now
			mainly only of academic interest, it is not practical in
			the implementation.

			I find myself in the awkward position of reinforcing my
			original thesis that, although complexity and "featuritis"
			are NOT the same thing -- insofar as completeness and
			consistency of the language are concerned -- the fact that
			I have discussed push-down stacks of RE, pattern, hold,
			and numbered buffers, enhanced I/O, etcetera, I did so
			because if sed (or any other language) is to be extended
			and/or enhanced, there is definitely a right way and a
			wrong way to do it.

			Ultimately, although most of my ideas are for the
			"obvious" commands omitted in sed that through my
			programming of "non-trivial" projects, I _really_ wish
			had been there from the inception, I am in favor of
			minimalism.

			I'm really quite proud of my aforementioned Three Rules of
			Language Design. I've never seen an instance where this
			wasn't apropos.




Reply Message Not Connected to Any


From: "Luciano ES" <luc-groups@...>
Date: Sat, 01 Feb 2003 16:09:48 -0200

I can't comment much on Aurelio's rants, for two reasons:

1 - I'm a sed neophyte. What do I know about it?
2 - I haven't followed the latest changes introduced in GNU sed. Even
	if I had, I wouldn't be able to tell new features from primeval
	ones.

But I do know that PCRE are a super-sed thing and, albeit I love them
(super-sed and PCRE) and the extremely favorable view I have of them
can easily be taken with a lot of reserve, I still think it is worth
considering this particular point: the PCRE capabilities in ssed do
not interfere with anything else in (s)sed. Even if you do know plain
POSIX RE but have no idea of what PCRE are, you can write all RE that
you want without the risk of incurring some PCRE syntax mistake. You
use PCRE if you want, and ssed will only recognize it if you turn on
the -R switch. Perhaps that could be said of other new features
recently introduced in GNU sed?

OTOH, I do agree that someone (Paolo?) might be just trying to force
sed to do what is otherwise another tool's job. I also think that such
improvements should be made to super-sed, which is great and is a lot
more interested in the future than in the past, and is a lot less
likely to break Jur... I mean, legacy setups.



Final Message, Paolo Conclusion


From: "Paolo Bonzini" 
Date: Thu, 20 Mar 2003 10:13:34 +0100

Let me clear this up.  The comments are regarding the future 4.1
version, of which I hope to release a beta soon (will be named 4.0a).
I hope you don't think I am abusing my role as gsed maintainer --
indeed I did change my mind about some things as a result of the
thread.

I am not going to turn this into another giant thread, but of course
feel free to reply and give me your opinion.  I'm asking however to
avoid restating what people said in other posts, and to avoid making
the thread too deep.  This will make this message and the replies more
useful to me and to my users (that is, you).

- 'L' will not be gone before 4.2
  Also because I don't intend to make another 4.0 release, so I can
  deprecate L in 4.1 but not remove it.

  I'm going to do the same in ssed as well.  ssed and gsed are not
  going to have any difference but the RE matcher.

- 'e' will not go away.  Sorry.  :-)

- $1, $2, ... will never be included.  They have serious backwards
  incompatibility problems and implementing them is not the easiest
  thing to do.

- I doubt 1,3-4,7- options in the 's' command will be implemented
  soon, but mostly for laziness.  I do think it is an extension which
  is worth being considered, and will put it into the TODO list.

- I doubt \= will be ever implemented, but I am not absolutely
  negative.

- I doubt more commands will be ever implemented, but I am not
  absolutely negative.

- POSIXLY_CORRECT behavior will disable \l \L \u \U \E.  Not because
  of popular request :-) but because it is the right thing to do (it
  is no different from disabling \t and the like).

- 'v' will override POSIXLY_CORRECT behavior.  This makes it more
  useful.

- I am going to think much more about generating backups with
  GNU-style filenames.  The implementation is clumsy because of
  backwards compatibility (if I really wanted to do this, I should
  have taken a look at the command line options for patch; now it is
  too late and besides things are simpler as they are now).

  IOW, this feature will 99% be removed.

- -i will stay.  I think that sed is different enough from ed that -i
  does fit in the picture.  -s will stay because it does not cost
  anything to implement it.

- I *might* consider, if there is enough demand, disabling extended
  commands in POSIXLY_CORRECT mode.  v will be left there to enable
  other extended commands.

  Anyway, Eric is right saying that new commands and options are not
  backwards incompatible, and they make older seds abort.  OTOH
  escapes make older seds spit out incorrect results.

- The command will still be named sed.  If you want to have a bare
  bones sed, you can use the scriptlet that I posted (I advise against
  enabling POSIXLY_CORRECT behavior globally).

The End.