On Tue, Feb 12, 2019 at 11:51:37AM +0100, Joerg Schilling wrote:
"Dr. Werner Fink" <werner@suse.de> wrote:
I can second this. Using bash builtins and string features can speed up scripts a lot. That means avoiding frequent forks for external commands within loops, using a <() fifo together with an external command to handle large lists of lines at once, and reading the resulting lines in loops that use only bash builtins.
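As a minimal sketch of the "string features" point (the path and the variable names are made up for illustration), parameter expansion can stand in for basename and dirname calls that would otherwise fork on every loop iteration:

```shell
# Hypothetical input path, used only for illustration.
path=/usr/local/share/doc/readme.txt

# Forking variant, two external processes per call:
#   base=$(basename "$path"); dir=$(dirname "$path")

# Builtin variant: plain parameter expansion, no fork at all.
base=${path##*/}   # strip the longest prefix ending in "/"
dir=${path%/*}     # strip the shortest suffix starting with "/"
ext=${base##*.}    # strip everything up to the last "."

echo "$dir | $base | $ext"
```

Note that ${path%/*} is not a full replacement for dirname: for a name without any slash it returns the name unchanged instead of ".".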
In ksh93, <() works using /dev/fd/*
The main difference from $(cmd) is that the output of the command is available as a normal pipe, instead of the shell reading from that pipe and creating an argument list from it.
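A small bash sketch of that difference, with variable names chosen only for illustration: $(cmd) yields the collected text, while <(cmd) yields a name that some other command opens and reads.

```shell
# $(cmd): the shell itself reads the pipe and turns the text into a word.
as_string=$(printf 'one\ntwo\n')

# <(cmd): the shell only passes a name for the read end of the pipe.
# Where /dev/fd exists this is typically /dev/fd/NN.
pipe_name=$(echo <(printf 'one\ntwo\n'))

# A command that opens that name reads from the pipe directly.
from_name=$(cat <(printf 'one\ntwo\n'))

echo "string: $as_string"
echo "name:   $pipe_name"
echo "read:   $from_name"
```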
As a former maintainer of ksh93, I'm aware of that; bash does it similarly.
For bash, which has not yet rewritten its parser to implement more efficient pipelines, this may give an advantage, but for shells that run the rightmost command in the main shell when it is a builtin, there is no difference. So this does not give performance benefits in ksh93, and it would not give performance benefits if bosh started to implement <().
Also, ksh93 loops can read from a pipe (which is actually a combination of two socketpair()s) while avoiding a subprocess for the loop itself. But this is not portable to many Bourne shell scripts, as with ksh93 variables set within the loop are visible outside the loop. This can be an advantage but may break scripts that depend on the original Bourne shell behaviour. Bash provides the <() fifo as a replacement for this: one can use the redirection operator to read from the <() fifo as stdin.
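The redirection idiom described here could look roughly like the following in bash. This is only a sketch: the printf call stands in for whatever external command produces the lines, and the variable names are invented.

```shell
count=0
longest=""

# The loop body uses builtins only; the producer runs once, outside the loop.
while IFS= read -r line; do
    count=$((count + 1))
    # ${#line} is a builtin length expansion, so no fork is needed here.
    if [ "${#line}" -gt "${#longest}" ]; then
        longest=$line
    fi
done < <(printf '%s\n' alpha be gamma)

# The loop ran in the main shell, so its variables are still set here.
echo "lines=$count longest=$longest"
```

Because the loop is fed by a redirection and not by a pipeline, bash does not place it in a subshell, which is why count and longest survive.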
See above. The original Bourne Shell implementation from 1976 had a main problem: program size could not exceed 64k on a PDP11 unless you used slow overlays (as used in vi). So the Bourne Shell has been written in a way that created the smallest program regardless of the speed of that program.
ksh93 uses a new method to create pipelines. This is faster, and it allows the use of vfork() to speed things up. vfork() on a real UNIX system is typically 3-4x faster than fork(), as it does not need to copy the address space description. This still applies even to modern UNIX versions like SunOS-4.x, which introduced a copy-on-write fork() in 1988. This is where Linux could speed up, BTW, as its vfork() uses a copy-on-write based emulation on top of fork().
To be able to use vfork(), you need to use a different chain of processes when creating the processes for a pipeline. This requires a rewrite of the parser and the interpreter in the shell. Once you have done this, it is no longer a big deal to run the rightmost program of a pipeline in the main shell process when it is a builtin command. This is what finally allows you to set up shell macros that are seen by the main shell, as in:
echo bla | read VAR
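For comparison, a hedged sketch of how the same line behaves in bash when run as a script. It assumes bash 4.2 or later, where the lastpipe option exists, and inactive job control, which is the normal case for scripts; the two result variables are illustrative names.

```shell
# Default bash: every pipeline element runs in a subshell, so the
# assignment made by read is lost when the pipeline ends.
unset VAR
echo bla | read VAR
default_result=${VAR-unset}

# With lastpipe, the rightmost element runs in the current shell,
# so VAR is seen by the main shell.
shopt -s lastpipe
echo bla | read VAR
lastpipe_result=${VAR-unset}

echo "default:  $default_result"
echo "lastpipe: $lastpipe_result"
```

Shells that always run a rightmost builtin in the main shell need no such switch.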
Another strange thing from the historic Bourne Shell is that a while loop with I/O redirection has always been run in a subshell. Ksh93 changed this.
Since bosh also implements both optimizations mentioned above, typical shell scripts like "configure" now run 30% faster. ksh93 uses virtual sub-shells to avoid fork() in many cases and gets another speed-up, but the version of ksh93 that is created by Red Hat people on GitHub is no longer faster than "bosh", as important code has been removed or destroyed.
BTW: if you check the speed of "configure" in particular, you will see that "echo" and "test" being builtins makes the biggest difference. With GNU configure past 2.13, the biggest time consumer is the fact that "printf" is used instead of "echo"; if that is not built into the shell, a configure run on such a shell is extremely slow.
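One rough way to check a given shell for this is the POSIX "command -v" utility, sketched below. The classification is an approximation: a bare name can also mean a shell function, and some shells report path-bound builtins with a path.

```shell
# Report whether the commands configure leans on are built into this shell.
# `command -v` prints a bare name for a builtin (or function) and an
# absolute path for an external program.
for cmd in echo test printf; do
    where=$(command -v "$cmd")
    case $where in
        /*) kind="external ($where)" ;;
        "") kind="not found" ;;
        *)  kind="builtin" ;;
    esac
    echo "$cmd: $kind"
done
```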
Hmmm ... AFAICR ksh93 was a huge collection of bugs causing a lot of bugzilla entries (56 bugs tagged). I can remember the problems that $(cmd) caused, whereas the old `cmd` code had worked flawlessly. I spent a lot of time hunting bugs in the multibyte code as well. One bug was memory-allocating code in the signal handling. While maintaining systemd I no longer had time to hunt down further bugs, hence the new maintainer for ksh93/libast. From my point of view, I prefer a stable shell with maintainable code over an extremely fast shell with bugs and interwoven, badly documented code.
-- 
"Having a smoking section in a restaurant is like having a peeing section in a swimming pool." -- Edward Burr