[opensuse-packaging] Proposal for a shell scripting policy
Hello,

following up the discussion on upstart and init scripts I'd like to propose a new policy on shell scripting.

Shell scripting policy
----------------------

/bin/sh can only be expected to support the SUSv3 Shell Command Language; this allows dash, ksh, and bash to become /bin/sh.

Init scripts should use /bin/sh, but may use /bin/bash with proper justification, i.e. not merely for ==, [[ ]], or echo -e, which are easily replaced by SUSv3-compliant constructs.

New packages containing scripts using /bin/sh must be tested with dash as /bin/sh (dash -n and checkbashisms.pl may aid debugging but are not sufficient by themselves). Any errors should be fixed and reported upstream.

The rpm specfiles and scriptlets use bash and a GNU userland by default (I think everything else would be a pain to fix because of the many bashisms and GNUisms in specfiles).

Rationale
---------

* Choice and diversity are good: the ultimate goal would be to allow any SUSv3-compatible shell in openSUSE (currently ksh93, bash, and dash) to be used as /bin/sh
* Quality and correctness: /bin/sh is not /bin/bash; using other shells may expose bugs, corner cases etc.
* Speed: bash is slow; allowing scripts to run on other shells such as dash or ksh93 may lead to speedups in certain situations
* Portability: allow the re-use of scripts written for openSUSE on other OSes

What other distros are doing
----------------------------

Fedora does not seem to have a policy on shell scripts. According to the Debian policy handbook (which also applies to Ubuntu), /bin/sh can only be assumed to support the SUSv3 Command Language with three exceptions: support of echo -n, test -a/-o, and local. Note that this is not entirely correct: echo -n and test -a/-o are, as XSI extensions, part of SUSv3, just not of POSIX.

Implementation
--------------

The implementation of this policy could happen gradually over time, assisted by a new rpmlint check which uses dash -n and checkbashisms.pl.
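As a rough illustration of what the dash -n part of such a check might do, here is a minimal sketch. The function name check_sh_scripts and the CHECK_SHELL variable are made up for this example and are not part of rpmlint; the sketch only parses scripts whose first line is a /bin/sh shebang and, as noted above, a clean parse does not prove that a script actually runs correctly.

```shell
#!/bin/sh
# Hypothetical helper (not an rpmlint API): syntax-check every "#!/bin/sh"
# script directly under a directory with a candidate shell, dash by default.
# CHECK_SHELL is an assumed override for systems without dash.
check_sh_scripts() {
    dir=$1
    shell=${CHECK_SHELL:-dash}
    status=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Read only the first line; tolerate a missing trailing newline.
        IFS= read -r first < "$f" || [ -n "$first" ] || continue
        case $first in
            '#!/bin/sh'|'#!/bin/sh '*) ;;   # claims to be a /bin/sh script
            *) continue ;;                  # anything else is out of scope
        esac
        # -n makes the shell parse the script without executing it.
        if ! "$shell" -n "$f" 2>/dev/null; then
            echo "syntax error: $f"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_sh_scripts /etc/init.d would print one line per /bin/sh script that the candidate shell cannot parse and return non-zero if there was at least one.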
In my opinion %_buildshell should point to /bin/bash, simply because fixing all the bashisms and GNUisms (pushd, popd, == etc.) in specfiles would be too much work.

Building packages with dash as /bin/sh may expose bugs in build systems which will need to be fixed upstream. However, since Debian and Ubuntu have been doing this for some time, many issues are likely to be resolved already and there is pressure on upstream maintainers to fix their scripts.

In the long term /bin/sh could default to dash; choosing ksh93 or bash should remain possible through the use of update-alternatives.

Finally, here is a rough estimate of the scale of the issue:

* on February 22nd the Factory OSS Repository (i586/i686/noarch) contained 8002 binary packages
* there are 4065 /bin/sh scripts (267 of which are init scripts) in 1092 packages; furthermore there are 3481 /bin/sh rpm scriptlets in 1488 packages
* checkbashisms.pl shows warnings for 794 scripts (100 of which are init scripts) in 372 packages (note that checkbashisms.pl is not very reliable)
* dash -n shows syntax errors in 356 scripts (16 of which are init scripts) in 111 packages (however, this does not detect all errors)

Comments, Suggestions?

-- 
Guido Berhoerster

-- 
To unsubscribe, e-mail: opensuse-packaging+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-packaging+help@opensuse.org
Hi, On Tue, 23 Mar 2010, Guido Berhoerster wrote:
Rationale ---------
* Speed: bash is slow, allowing scripts to run on other shells such as dash or ksh93 may lead to speedups in certain situations
I repeatedly hear this claim, seldom to be followed up by credible numbers. And of course: when does it actually matter, even if some other shell was faster? (I already can feel the answer being "but booting will be so much faster then", which is wrong).
Comments, Suggestions?
Yes, I think we shouldn't have such a policy. I have an alternative proposal: make all scripts hardcode /bin/bash. It's easier to implement and check for violations, we reduce choice (and therefore bug sources), bash is actively maintained, and the constructs it gives in addition to POSIX are visually more pleasing and more expressive, hence will lead to fewer bugs in scripts written by inexperienced shell programmers.

I also think you underestimate the work that would be required to implement your policy. First, it's a huge amount of work upfront (and not all violations can be found via automatic means), with the potential of introducing as many bugs as there are changes. But what's worse is the constant amount of work that needs to be invested afterwards, just to keep bashisms from cropping up again. At least for autoconf it's not trivial, and that is only one package.

And local patches will make us deviate from upstream if the scripts originate there. For that reason alone my suggestion of hardcoding bash wasn't meant seriously.

I mean, if the original authors of the scripts decide that they want to invest that work, fine, more power to them (because in the abstract I agree with the wish of using only POSIX sh constructs in scripts claiming to use /bin/sh), but creating a policy that would force us to invest that work for the dubious value of purity, no, thank you.

Ciao,
Michael.
* Michael Matz <matz@suse.de> [2010-03-23 14:49]:
I repeatedly hear this claim, seldom to be followed up by credible numbers. And of course: when does it actually matter, even if some other shell was faster? (I already can feel the answer being "but booting will be so much faster then", which is wrong).
Well, the "claim" actually comes from the bash maintainers; read the BUGS section of bash(1). One example is startup time: try libmicro's system benchmark

/usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000

with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster. I have anecdotal evidence that pattern matching while processing text files is significantly slower in bash compared to ksh93/dash. Furthermore, ksh93 offers a lot of builtins for standard commands which are significantly faster than forking and execing and which could be taken advantage of.
Comments, Suggestions?
Yes, I think we shouldn't have such a policy. I have an alternative proposal: make all scripts hardcode /bin/bash. It's easier to implement and check for violations, we reduce choice (and therefore bug sources), bash is actively maintained, and the constructs it gives in addition to POSIX are visually more pleasing and more expressive, hence will lead to fewer bugs in scripts written by inexperienced shell programmers.
I also think you underestimate the work that would be required to implement your policy. First, it's a huge amount of work upfront (and not all violations can be found via automatic means), with the potential of introducing as many bugs as there are changes. But what's worse is the constant amount of work that needs to be invested afterwards, just to keep bashisms from cropping up again. At least for autoconf it's not trivial, and that is only one package.
And local patches will make us deviate from upstream if the scripts originate there. For that reason alone my suggestion of hardcoding bash wasn't meant seriously.
I mean, if the original authors of the scripts decide that they want to invest that work, fine, more power to them (because in the abstract I agree with the wish of using only POSIX sh constructs in scripts claiming to use /bin/sh), but creating a policy that would force us to invest that work for the dubious value of purity, no, thank you.
It will be some work to implement this, but it can take place gradually and does not need to happen overnight. This policy is not intended for third-party scripts; that is, upstream scripts would only need to be changed from /bin/sh to /bin/bash, and that should not be too hard to patch or to get into upstream projects. Ubuntu has been using dash as /bin/sh since 2006 and Debian is preparing to do the same for Squeeze, so a lot of the required changes have already been incorporated into upstream projects. And don't forget that on Free/Net/OpenBSD /bin/sh has always been ash, and on OpenSolaris /bin/sh is linked to ksh93.

For our own scripts it would only mean that we enforce clean coding practices, which is IMHO a good thing, and it is also not too hard: you just need to test them with dash instead of bash.

-- 
Guido Berhoerster
On Tuesday, 23 March 2010, Guido Berhoerster wrote:
* Michael Matz <matz@suse.de> [2010-03-23 14:49]:
I repeatedly hear this claim, seldom to be followed up by credible numbers. And of course: when does it actually matter, even if some other shell was faster? (I already can feel the answer being "but booting will be so much faster then", which is wrong).
Well, the "claim" actually comes from the bash maintainers; read the BUGS section of bash(1). One example is startup time: try libmicro's system benchmark /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000 with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster. I have anecdotal evidence that pattern matching while processing text files is significantly slower in bash compared to ksh93/dash.
Well, being 2.5 times slower doesn't really answer "when does it matter?". I mean, how often do you need the performance of 'system' to be as high as possible when usually the thing that sh executes takes a million times longer? (An honest question, and one you need to be prepared for.)

I'm not so afraid of /bin/sh pointing to dash in the running system; what I'm mostly afraid of is switching to dash in our build system - most Debian bugs are about packages breaking or changing behaviour with dash.

Greetings, Stephan
* Stephan Kulow <coolo@suse.de> [2010-03-23 18:03]:
On Tuesday, 23 March 2010, Guido Berhoerster wrote:
* Michael Matz <matz@suse.de> [2010-03-23 14:49]:
I repeatedly hear this claim, seldom to be followed up by credible numbers. And of course: when does it actually matter, even if some other shell was faster? (I already can feel the answer being "but booting will be so much faster then", which is wrong).
Well, the "claim" actually comes from the bash maintainers; read the BUGS section of bash(1). One example is startup time: try libmicro's system benchmark /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000 with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster. I have anecdotal evidence that pattern matching while processing text files is significantly slower in bash compared to ksh93/dash.
Well, being 2.5 times slower doesn't really answer "when does it matter?". I mean, how often do you need the performance of 'system' to be as high as possible when usually the thing that sh executes takes a million times longer? (An honest question, and one you need to be prepared for.)
To give you an example, I use a set of scripts for monitoring on my Debian server which are, similar to Munin, executed at intervals. Explicitly using /bin/dash rather than /bin/sh, which is linked to bash, cuts down the time for executing all scripts and saves a significant amount of memory. It matters where you execute scripts sequentially and I/O is not the limiting factor (e.g. the Debian boot process ;).

Another area in which bash does badly, and which matters even more, is subshells. For a rough estimate try

time $shell -c 'i=0; while [ $i -lt 10000 ]; do $(a=$$); i=$((i+1)); done'

with bash, dash, and ksh93. Here dash is 3 times faster and ksh93 37 times faster than bash (ksh93 shines here because it does not actually use a subprocess).

And, as I mentioned before, being able to switch to ksh93 can yield significant further performance gains due to a number of builtins. But performance was not my only rationale.
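For anyone who wants to repeat the subshell comparison on several shells in one go, a minimal sketch could look like this. The function name bench_subshells is made up for illustration, and date +%s is used instead of time because time is not available as a builtin everywhere; the resolution is therefore whole seconds only, so use a large iteration count. No particular results are implied.

```shell
# Hypothetical helper: run the subshell loop from above under each shell
# named on the command line and report the elapsed wall-clock seconds.
bench_subshells() {
    count=$1
    shift
    for shell in "$@"; do
        if ! command -v "$shell" >/dev/null 2>&1; then
            echo "$shell: not installed, skipped"
            continue
        fi
        start=$(date +%s)
        # Same loop body as in the message; the iteration count is passed
        # to the child shell as $1 instead of being hard-coded.
        "$shell" -c 'i=0; while [ "$i" -lt "$1" ]; do $(a=$$); i=$((i+1)); done' bench "$count"
        end=$(date +%s)
        echo "$shell: $((end - start))s for $count iterations"
    done
}
```

For example, bench_subshells 10000 bash dash ksh93 would print one line per shell and skip those that are not installed.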
I'm not so afraid of /bin/sh pointing to dash in the running system, what I'm mostly afraid of is switching to dash in our build system - most debian bugs are about packages breaking / changing behaviour with dash.
Yes, that is actually the only critical part, as we would have to make a switch at some point in time. However, from a quick glance at the Debian bugtracker it seems that the vast majority of problems are in debian/rules files and upstream scripts and not so much in the build systems (for others: http://bugs.debian.org/cgi-bin/pkgreport.cgi?users=debian-release@lists.debian.org&tag=goal-dash gives you an overview of the issues and how Debian deals with them).

Would it be possible to define an additional build target (that is, an exact copy of Factory but with dash as /bin/sh) in the OBS for projects making the switch, in order to ease the transition?

-- 
Guido Berhoerster
* Guido Berhoerster <guido+opensuse.org@berhoerster.name> [2010-03-23 20:00]:
* Stephan Kulow <coolo@suse.de> [2010-03-23 18:03]:
I'm not so afraid of /bin/sh pointing to dash in the running system, what I'm mostly afraid of is switching to dash in our build system - most debian bugs are about packages breaking / changing behaviour with dash.
Yes, that is actually the only critical part, as we would have to make a switch at some point in time. However, from a quick glance at the Debian bugtracker it seems that the vast majority of problems are in debian/rules files and upstream scripts and not so much in the build systems (for others:
To be precise, of 1005 bugs only 83 seem to be/have been causing build failures (FTBFS).

-- 
Guido Berhoerster
On Tuesday, 23 March 2010, Guido Berhoerster wrote:
Would it be possible to define an additional build target (that is an exact copy of Factory but with dash as /bin/sh) in the OBS for projects making the switch in order to ease the transition?
Yes, we can use the staging repo for that. Assuming that we put dash in front of bash when installed, all it would take is adding dash in that repo as default package.

Greetings, Stephan
On Tuesday 23 of March 2010, Guido Berhoerster wrote:
* Stephan Kulow <coolo@suse.de> [2010-03-23 18:03]:
On Tuesday, 23 March 2010, Guido Berhoerster wrote:
* Michael Matz <matz@suse.de> [2010-03-23 14:49]:
I repeatedly hear this claim, seldom to be followed up by credible numbers. And of course: when does it actually matter, even if some other shell was faster? (I already can feel the answer being "but booting will be so much faster then", which is wrong).
Well, the "claim" actually comes from the bash maintainers; read the BUGS section of bash(1). One example is startup time: try libmicro's system benchmark /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000 with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster. I have anecdotal evidence that pattern matching while processing text files is significantly slower in bash compared to ksh93/dash.
Speaking of anecdotes, there is one I remember from the times of the communist regime about plans set up by the party: "Our farm has completed the 5-year plan at 200%. We have four chickens instead of two."
Another area in which bash does badly, and which matters even more, is subshells. For a rough estimate try time $shell -c 'i=0; while [ $i -lt 10000 ]; do $(a=$$); i=$((i+1)); done' with bash, dash, and ksh93. Here dash is 3 times faster and ksh93 37 times faster than bash (ksh93 shines here because it does not actually use a subprocess).
Let me give you a piece of advice: if you want to argue with technical arguments, then do so. "37 times faster", without anything else, is like "hair 87% more shiny" ads. If better performance should be a reason to avoid bashisms, then say how big an improvement you actually expect, and provide some real numbers to support that. "37 times faster" is completely unimpressive if in practice it may mean that this change will save a quarter of a second of boot time.

-- 
Lubos Lunak
openSUSE Boosters team, KDE developer
l.lunak@suse.cz , l.lunak@kde.org
* Lubos Lunak <l.lunak@suse.cz> [2010-03-23 22:04]:
Well, the "claim" actually comes from the bash maintainers; read the BUGS section of bash(1). One example is startup time: try libmicro's system benchmark /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000 with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster. I have anecdotal evidence that pattern matching while processing text files is significantly slower in bash compared to ksh93/dash.
Speaking of anecdotes, there is one I remember from the times of the communist regime about plans set up by the party: "Our farm has completed the 5-year plan at 200%. We have four chickens instead of two."
Another area in which bash does badly, and which matters even more, is subshells. For a rough estimate try time $shell -c 'i=0; while [ $i -lt 10000 ]; do $(a=$$); i=$((i+1)); done' with bash, dash, and ksh93. Here dash is 3 times faster and ksh93 37 times faster than bash (ksh93 shines here because it does not actually use a subprocess).
Let me give you a piece of advice: if you want to argue with technical arguments, then do so. "37 times faster", without anything else, is like "hair 87% more shiny" ads. If better performance should be a reason to avoid bashisms, then say how big an improvement you actually expect, and provide some real numbers to support that. "37 times faster" is completely unimpressive if in practice it may mean that this change will save a quarter of a second of boot time.
And your point is? Firstly, the original question was whether there is evidence that bash is slow and where that matters; I think I have answered that. Secondly, this is not about boot speed: shell scripting has more use cases than simple boot scripts. And thirdly, speed advantages were only one rationale for my proposal, not the main one.

-- 
Guido Berhoerster
Hi, On Tue, 23 Mar 2010, Guido Berhoerster wrote:
I repeatedly hear this claim, seldom to be followed up by credible numbers. And of course: when does it actually matter, even if some other shell was faster? (I already can feel the answer being "but booting will be so much faster then", which is wrong).
Well, the "claim" actually comes from the bash maintainers; read the BUGS section of bash(1).
That's a witty remark, not evidence.
One example is startup time: try libmicro's system benchmark /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000 with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster.
Please? With -I 20000, so it takes some measurable time:

pdksh: 13.6 seconds; 13501 usecs/call
ash:   12.7 seconds; 12715 usecs/call
zsh:   11.6 seconds; 11583 usecs/call
ksh93:  3.2 seconds;  3139 usecs/call
bash:   3.1 seconds;  3176 usecs/call

That's on an 11.1 (and I don't have dash, only ash). I'm not impressed. But the above numbers don't mean that I would be opposed to making, for instance, zsh the default shell, even though it is four times slower than bash. Simply because system(3) isn't all that important: if you're spending more time in system(3) than in actual data processing you're doing something wrong anyway. That doesn't mean that system(3) should be arbitrarily slow, but it means that it's not a very large factor to consider.
I have anecdotal evidence that pattern matching while processing textfiles is significantly slower in bash compared to ksh93/dash.
Err, well, I have anecdotal evidence that anecdotal evidence shouldn't be trusted too much :)
Furthermore, ksh93 offers a lot of builtins for standard commands which are significantly faster than forking and execing and which could be taken advantage of.
That is an advantage, indeed.
[... too much work ...]
It will be some work to implement this but this can take place gradually and does not need to happen overnight.
Well, if you want to work on that, be our guest. If you can convince the package maintainers to take patches, or upstream to integrate them (even better) all is fine. I'm only against making it a policy, which suddenly would force other people to invest work. That might be okay if the advantages are large and obvious, but in this case they aren't.
Ubuntu has been using dash as /bin/sh since 2006
I trust that that one is more capable than our ash?

% /bin/ash -c 'i=0; j=$((i+1))'
arith: syntax error: "i+1"

and Debian is preparing to do the same for Squeeze, so a lot of the required changes have already been incorporated into upstream projects. And don't forget that on Free/Net/OpenBSD /bin/sh has always been ash, and on OpenSolaris /bin/sh is linked to ksh93.
Aha. And that's relevant exactly how? After all you said you were talking about internal scripts, not upstream ones?
For our own scripts it would only mean that we enforce clean coding practices which is IMHO a good thing and it is also not too hard, you just need test them with dash instead of bash.
See above, if you want to work on that, jolly good.

Ciao,
Michael.
* Michael Matz <matz@suse.de> [2010-03-24 14:51]:
One example is startup time: try libmicro's system benchmark /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N "system" -I 1000000 with /bin/sh as a link to ksh or dash and an empty .profile/ENV. dash is 2.5 times and ksh93 is still 1.5 times faster.
Please? With -I 20000, so it takes some measurable time: pdksh: 13.6 seconds; 13501 usecs/call ash: 12.7 seconds; 12715 usecs/call zsh: 11.6 seconds; 11583 usecs/call ksh93: 3.2 seconds; 3139 usecs/call bash: 3.1 seconds; 3176 usecs/call
That's on a 11.1 (and I don't have dash, only ash). I'm not impressed. But the above numbers don't mean that I would be opposed to making for instance zsh the default shell, even though it is four times slower than bash. Simply because system(3) isn't all that important, if you're spending more time in system(3) than in actual data processing you're doing something wrong anyway. That doesn't mean that system(3) should be arbitrarily slow, but it means that it's not a very large factor to consider.
# /usr/lib/libMicro/bin/system -E -C 200 -L -S -W -N system -I 20000

dash:                       1.21781s; 1200.16680 usecs/call
ash:                        2.17763s; 2145.16600 usecs/call
bash (LC_ALL=C):            3.40030s; 3354.16580 usecs/call
bash (LC_ALL=en_US.UTF-8):  5.57932s; 5506.36800 usecs/call
ksh93 (LC_ALL=C):           6.00663s; 5924.16780 usecs/call
ksh93 (LC_ALL=en_US.UTF-8): 6.17549s; 6102.36820 usecs/call

Startup time does matter, e.g. for my monitoring scripts and CGI scripts. And it is just one example; another, more relevant one which I have mentioned in this thread is subshell performance. And finally -- performance was not my _main_ and _only_ rationale for this.
Furthermore, ksh93 offers a lot of builtins for standard commands which are significantly faster than forking and execing and which could be taken advantage of.
That is an advantage, indeed.
It also has shcomp :)
Well, if you want to work on that, be our guest. If you can convince the package maintainers to take patches, or upstream to integrate them (even better) all is fine. I'm only against making it a policy, which suddenly would force other people to invest work. That might be okay if the advantages are large and obvious, but in this case they aren't.
I am interested in this or I wouldn't have brought it up here. It could be mandatory for newly added scripts only; otherwise it wouldn't make sense to me to fix things. Furthermore, it would probably not be that much work for maintainers to check whether upstream scripts work with /bin/sh when they upgrade packages and to patch /bin/sh to /bin/bash if necessary.
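To show how small such a patch can be, here is a minimal sketch of rewriting the shebang. The function name use_bash_shebang is made up for this example, mktemp is assumed to be available, and only a bare "#!/bin/sh" first line is handled (a shebang with options such as "#!/bin/sh -e" is deliberately left alone).

```shell
# Hypothetical helper: turn a bare "#!/bin/sh" first line into "#!/bin/bash".
use_bash_shebang() {
    file=$1
    tmp=$(mktemp) || return 1
    # Only line 1 is touched; later mentions of /bin/sh stay as they are.
    # Writing back with cat keeps the file's permissions and ownership.
    sed '1s|^#! */bin/sh$|#!/bin/bash|' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

In a package this would more likely end up as a one-line patch or a sed call in the specfile; the function form is only used here to keep the example self-contained.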
I trust that that one is more capable than our ash? : % /bin/ash -c 'i=0; j=$((i+1))' arith: syntax error: "i+1"
openSUSE ash is ancient. This and other issues have been fixed for some time. If you're interested in the differences, see http://www.in-ulm.de/~mascheck/various/ash/
and Debian is preparing to do the same for Squeeze, so a lot of the required changes have already been incorporated into upstream projects. And don't forget that on Free/Net/OpenBSD /bin/sh has always been ash, and on OpenSolaris /bin/sh is linked to ksh93.
Aha. And that's relevant exactly how? After all you said you were talking about internal scripts, not upstream ones?
It is relevant for build systems, which have been brought up here. Debian's efforts have also resulted in changes upstream.
For our own scripts it would only mean that we enforce clean coding practices which is IMHO a good thing and it is also not too hard, you just need test them with dash instead of bash.
See above, if you want to work on that, jolly good.
Working on it only makes sense if maintainers accept patches for such changes.

-- 
Guido Berhoerster
On Wednesday, 24 March 2010, Michael Matz wrote:
I trust that that one is more capable than our ash? : % /bin/ash -c 'i=0; j=$((i+1))' arith: syntax error: "i+1"
Make it $i - then both ash and dash can do it.

Greetings, Stephan
* Stephan Kulow <coolo@suse.de> [2010-03-24 16:53]:
On Wednesday, 24 March 2010, Michael Matz wrote:
I trust that that one is more capable than our ash? : % /bin/ash -c 'i=0; j=$((i+1))' arith: syntax error: "i+1"
Make it $i - then both ash and dash can do it.
Note that $i and i lead to different behavior with arithmetic expansion. $(( $i+1 )) happily yields 1 if i is set to a non-integer, while $(( i+1 )) results in a syntax error. The latter is almost always what you want since it catches potential errors. Assignments also do not work with $i: $(( $i=1 )) is an error while $(( i=1 )) is valid and assigns i.

Except for legacy scripts it is probably a better idea to use dash, since it has many other issues fixed.

-- 
Guido Berhoerster
On Wednesday, 24 March 2010, Michael Matz wrote:
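The difference between the two spellings can be tried out with a small sketch like the following. It sticks to cases that a current POSIX-style shell should agree on (an empty variable and an assignment); how a non-numeric string in a bare name is treated differs between shells, so that case is left out here. The variable names are arbitrary.

```shell
# i is deliberately left empty to mimic a value that was never set properly.
i=
# "$i" is substituted as text first, so the expression becomes "+ 1".
a=$(( $i + 1 ))
echo "with \$i: $a"

# Assignment inside arithmetic expansion needs the bare variable name.
: $(( j = 1 ))
echo "j is now $j"

# With the $ form the expression expands to "5 = 1", which is rejected.
# It runs in a subshell because some shells abort on arithmetic errors.
k=5
if ( : $(( $k = 1 )) ) 2>/dev/null; then
    echo "\$k = 1 was accepted"
else
    echo "\$k = 1 was rejected"
fi
```

The silent "+ 1" in the first case is exactly the kind of hidden error the bare form is meant to surface.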
Well, if you want to work on that, be our guest. If you can convince the package maintainers to take patches, or upstream to integrate them (even better) all is fine. I'm only against making it a policy, which suddenly would force other people to invest work. That might be okay if the advantages are large and obvious, but in this case they aren't.
I think we're green here. If we agree that changing complex scripts to /bin/bash is a valid option, we should extend the rpmlint check with a _HUGE_ white list that lowers the score and then make it a fatal error to introduce new /bin/sh scripts that do not pass dash -n. Of course the white-listed scripts still create a warning in the logs.

And then $Guido can work on fixing up the current scripts; I already identified quite a few init scripts that are easy to change or need bash.

Still, I managed to boot with /bin/sh -> dash and you won't believe it, but:

bash: 42s
dash: 39.5s

Of course it's very possible that some scripts just do not run because of syntax errors ;)

Greetings, Stephan
* Stephan Kulow <coolo@suse.de> [2010-03-24 17:01]:
bash: 42s dash: 39.5s
Of course it's very possible that some scripts just do not run because of syntax errors ;)
That sounds almost too good to be true ;) I need to try that on a VM tonight.

-- 
Guido Berhoerster
On Wednesday, 24 March 2010, Guido Berhoerster wrote:
* Stephan Kulow <coolo@suse.de> [2010-03-24 17:01]:
bash: 42s dash: 39.5s
Of course it's very possible that some scripts just do not run because of syntax errors ;)
That sounds almost too good to be true ;) I need to try that on a VM tonight.
Michael and I found the script that saved the seconds by only throwing a syntax error: /etc/X11/xdm/Xsetup

Greetings, Stephan
On Thursday, 25 March 2010, Stephan Kulow wrote:
On Wednesday, 24 March 2010, Guido Berhoerster wrote:
* Stephan Kulow <coolo@suse.de> [2010-03-24 17:01]:
bash: 42s dash: 39.5s
Of course it's very possible that some scripts just do not run because of syntax errors ;)
That sounds almost too good to be true ;) I need to try that on a VM tonight.
Michael and I found the script that saved the seconds by only throwing a syntax error: /etc/X11/xdm/Xsetup
So to summarize: with the hacks I applied, the /bin/sh symlink doesn't matter to boot time at all - that's what I always believed, no matter what Debian/Ubuntu claimed. Of course that's hardly surprising: after all, there are 35 /bin/sh processes and 66 /bin/bash processes according to the preload trace. So even if we replaced the other 66 (a lot of those are user processes that are likely run with the login shell), still no gain.

So if performance is any kind of argument, I suggest Guido changes his monitoring scripts to use /bin/dash directly ;)

I still believe that we should treat bash scripts labeled as /bin/sh as bugs, but Michael is right that it doesn't justify a fatal error. Then again, we urgently need some kind of rpmlint statistics available for our packages.

Greetings, Stephan
* Stephan Kulow <coolo@suse.de> [2010-03-25 10:19]:
So to summarize: with the hacks I applied, the /bin/sh symlink doesn't matter to boot time at all - that's what I always believed, no matter what Debian/Ubuntu claimed.
Debian/Ubuntu have a different startup process; in Lenny parallel execution of boot scripts is still broken, so it's possible that it will lead to minor speedups as they claim.
Of course that's hardly surprising: after all, there are 35 /bin/sh processes and 66 /bin/bash processes according to the preload trace. So even if we replaced the other 66 (a lot of those are user processes that are likely run with the login shell), still no gain.
So if performance is any kind of argument, I suggest Guido changes his monitoring scripts to use /bin/dash directly ;)
I already do that on my server since it's running Debian Lenny. However, I think we had agreed that performance was not the primary rationale. I don't know why everyone is so obsessed with this, bash is a nice interactive shell, but its performance and memory usage suck for not so simple use cases (and I think I demonstrated that for startup and subshell performance).
I still believe that we should treat bash scripts labeled as /bin/sh as bugs, but Michael is right that it doesn't justify a fatal error. Then again, we urgently need some kind of rpmlint statistic for our packages available.
It is a bug and it should be treated as such, particularly because changing #!/bin/sh to #!/bin/bash is not that difficult. I mean, you also don't specify K&R mode if you want to compile C99? dash -n and checkbashisms alone won't cut it; scripts need to be actually run and tested, just as I test compiled binaries when packaging them. Without a packaging policy in place and the possibility to switch /bin/sh to dash, how will this happen and how will you avoid regressions, especially in maintainer scripts? I still think it is possible, certainly not before 11.3 but as a long-term goal. However, at some point we need to say that all _newly written_ maintainer scripts must be tested with dash if they specify /bin/sh and that packages are tested on a system with /bin/sh pointing to dash. I don't think this would lead to that much work for package maintainers, but otherwise it would not make sense to gradually fix scripts, as progress would be hampered by a continuous inflow of regressions. After making some serious progress, /bin/sh could be managed by update-alternatives, allowing those who oppose it being dash to keep bash.
-- Guido Berhoerster
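To illustrate the point about dash -n not being sufficient, here is a minimal sketch. It uses plain sh so it runs anywhere, and the temporary file name and the missing command are made up for the example. A syntax-only check accepts anything that parses, so a command that does not exist is only noticed when the script actually runs. As far as I know, dash treats [[ as an ordinary command name in the same way, which is why such scripts can pass dash -n and still break.

```sh
#!/bin/sh
# Sketch: a syntax-only check (-n) cannot catch failures that happen at run time.
# The file name and the missing command are hypothetical, for illustration only.
tmp=${TMPDIR:-/tmp}/syntax-vs-run.$$

cat > "$tmp" <<'EOF'
#!/bin/sh
# Parses fine, but the command below does not exist.
hypothetical_missing_command --flag
EOF

# -n reads the script and checks its syntax without executing anything.
syntax_status=0
sh -n "$tmp" || syntax_status=$?

# Actually running it exposes the problem (command not found).
run_status=0
sh "$tmp" 2>/dev/null || run_status=$?

rm -f "$tmp"
printf 'syntax check: %s, real run: %s\n' "$syntax_status" "$run_status"
```

The syntax check reports success while the real run fails, which is the gap that testing on a system with /bin/sh pointing to dash is meant to close.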
Hi, On Thu, 25 Mar 2010, Guido Berhoerster wrote:
So if performance is any kind of argument, I suggest Guido changes his monitoring scripts to use /bin/dash directly ;)
I already do that on my server since it's running Debian Lenny. However, I think we had agreed that performance was not the primary rationale. I don't know why everyone is so obsessed with this, bash is a nice interactive shell, but its performance and memory usage suck for not so simple use cases
Ahem, so you're arguing performance now, right after saying it's not a primary goal :-)
(and I think I demonstrated that for startup and subshell performance).
Partly. The only shell faster in startup than bash (with LANG=C and on a to-be 11.3 system) is dash (your measurement of ash must be mistaken, I can't at all reproduce it, on 11.1 and 11.3 it's consistently 5 times slower than bash). Subshell performance you're right, ksh93 of course shines, and bash is the slowest.
I still believe that we should treat bash scripts labeled as /bin/sh as bugs, but Michael is right that it doesn't justify a fatal error. Then again, we urgently need some kind of rpmlint statistic for our packages available.
It is a bug
Yes.
and it should be treated as such,
Not necessarily. It's always the question of how common that bug is; if it's so common that it became sort of its own de-facto standard, the issues aren't black-and-white anymore. Don't get me wrong, I'm all for strictly adhering to standards, everybody who ever came to me with "your compiler miscompiles my code" knows that ;-) But sometimes we do implement extensions if a certain type of error became too common. You're right that the work needed to fix such bugs needs to be taken into account, which in this particular case is very low (1s/sh/bash/) per instance. It's the sum of all those small changes I'm worried about, hence we should gradually go there, and in the meantime not throw errors but warnings.
particularly because changing #!/bin/sh to #!/bin/bash is not that difficult. I mean you also don't specify K&R mode if you want to compile C99?
People don't think. If they write "#!/bin/sh" they don't mean 'give me a SUSv3 shell', they mean 'gimme some shell-thingy-stuff'. Most people writing shell scripts didn't ever hear about SUSv3. So it's a bit unforgiving to yell back at them saying 'Hah! Didn't you know what you were requesting?', true as it might be.
I still think it is possible, certainly not before 11.3 but as a long term goal. However, at one point in time we need to say that all _newly written_ maintainer scripts must be tested with dash if they specify /bin/sh and that packages are tested on a system with /bin/sh pointing to dash.
That's okay to _require_ after most of the work is already done. Until then it can only be a friendly request.
Ciao, Michael.
* Michael Matz <matz@suse.de> [2010-03-25 14:47]:
I still believe that we should treat bash scripts labeled as /bin/sh as bugs, but Michael is right that it doesn't justify a fatal error. Then again, we urgently need some kind of rpmlint statistic for our packages available.
It is a bug
Yes.
and it should be treated as such,
Not necessarily. It's always the question of how common that bug is; if it's so common that it became sort of its own de-facto standard, the issues aren't black-and-white anymore. Don't get me wrong, I'm all for strictly adhering to standards, everybody who ever came to me with "your compiler miscompiles my code" knows that ;-) But sometimes we do implement extensions if a certain type of error became too common.
You're right that the work needed to fix such bugs needs to be taken into account, which in this particular case is very low (1s/sh/bash/) per instance. It's the sum of all those small changes I'm worried about, hence we should gradually go there, and in the meantime not throw errors but warnings.
I agree and I can understand the concerns of maintainers about additional workload, my only concern is/was that we might chase a moving target as packages get added or updated.
particularly because changing #!/bin/sh to #!/bin/bash is not that difficult. I mean you also don't specify K&R mode if you want to compile C99?
People don't think. If they write "#!/bin/sh" they don't mean 'give me a SUSv3 shell', they mean 'gimme some shell-thingy-stuff'. Most people writing shell scripts didn't ever hear about SUSv3. So it's a bit unforgiving to yell back at them saying 'Hah! Didn't you know what you were requesting?', true as it might be.
That's a sad fact but no reason not to do something about it :) BTW, it's also nowhere mandated that /bin/sh has to be the SUSv3 shell, e.g. on Solaris 10 and the now defunct SXCE it is the Bourne shell. However, there seems to be a convergence towards offering a shell that at least supports SUSv3 (on all Linux distros, OpenSolaris, newer OS X, and HP-UX).
I still think it is possible, certainly not before 11.3 but as a long term goal. However, at one point in time we need to say that all _newly written_ maintainer scripts must be tested with dash if they specify /bin/sh and that packages are tested on a system with /bin/sh pointing to dash.
That's okay to _require_ after most of the work is already done. Until then it can only be a friendly request.
See above. There is now a checkbashisms package in home:gberh, should I submit that to devel:tools or openSUSE:Tools? I was a bit uncomfortable with introducing a perl dependency in the dash package. Ideally we'd have a devscripts package to collect such scripts like Fedora and Debian have.
-- Guido Berhoerster
On Friday, 26 March 2010, Guido Berhoerster wrote:
There is now a checkbashisms package in home:gberh, should I submit that to devel:tools or openSUSE:Tools? I was a bit uncomfortable with introducing a perl dependency in the dash package. Ideally we'd have a devscripts package to collect such scripts like Fedora and Debian have.
We call that package "build" :)
Greetings, Stephan
Hi, On Wed, 24 Mar 2010, Stephan Kulow wrote:
I think we're green here. If we agree that changing complex scripts to /bin/bash is a valid option, we should extend the rpmlint check with a _HUGE_ white list that lowers the score and then make it a fatal error to introduce new /bin/sh scripts that do not pass dash -n.
Even when the script comes from upstream? I'm not thrilled about that idea. Warning, okay, fatal error is too harsh IMO.
Of course the white list scripts still create a warning in the logs.
Sure.
And then $Guido can work on fixing up the current scripts, I already identified quite some init scripts that are easy to change or need bash. Still I managed to boot with /bin/sh -> dash and you won't believe it, but:
bash: 42s dash: 39.5s
You're right, I don't believe you :) Repeat it ten times and I bet you get numbers between 38 and 43 seconds, you know that ;-)
Ciao, Michael.
On Wed, 24 Mar 2010 17:00:51 +0100 Stephan Kulow <coolo@suse.de> wrote:
bash: 42s dash: 39.5s
It's nice if dash boots faster. But changing /bin/sh to dash is not a tolerable solution. The init scripts then should simply say "#!/bin/dash" and be done instead of imposing a vastly inferior shell onto everyone.
-- Stefan Seyfried "Any ideas, John?" "Well, surrounding them's out."
* Stefan Seyfried <stefan.seyfried@googlemail.com> [2010-03-24 18:05]:
It's nice if dash boots faster. But changing /bin/sh to dash is not a tolerable solution. The init scripts then should simply say "#!/bin/dash" and be done instead of imposing a vastly inferior shell onto everyone.
/bin/sh is not the same as /bin/bash (and not necessarily a SUSv3-compliant shell, although there seems to be a convergence towards that). If you need features of your superior shell, you can simply request it by putting #!/bin/$superior_shell at the very top of the script. That even gives you the added value of portability across many other (though not all) OSs. It is not about imposing a shell on you but, on the contrary, about giving you choice, as dash, bash, and ksh93 all support SUSv3.
-- Guido Berhoerster
Here is an rpmlint check which helps find bashisms in /bin/sh scripts. It produces errors if dash -n fails and informational messages if checkbashisms finds anything (since it is only an indicator and produces false positives). It requires dash and checkbashisms to be installed (the former is in Factory, the latter can be obtained at http://git.debian.org/?p=devscripts/devscripts.git)
-- Guido Berhoerster
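For readers without the attachment, the logic of such a check can be sketched roughly as below. This is not the rpmlint check itself, only an illustrative stand-in written as a shell function. The function name check_sh_script and the SH_CHECKER override are hypothetical, and the sketch assumes that a non-zero exit status from checkbashisms means it flagged something.

```sh
#!/bin/sh
# check_sh_script FILE
# Illustrative stand-in for the check described above, not the real rpmlint code.
# SH_CHECKER is a hypothetical override; the proposal uses dash for the parse test.
# Note: POSIX sh has no 'local', so the variables below are global.
check_sh_script() {
    file=$1
    checker=${SH_CHECKER:-dash}

    # Only scripts that claim /bin/sh are in scope.
    first_line=
    IFS= read -r first_line < "$file" || [ -n "$first_line" ] || return 0
    case $first_line in
        '#!/bin/sh'|'#!/bin/sh '*) ;;
        *) return 0 ;;
    esac

    # A parse failure is reported as an error ...
    if ! "$checker" -n "$file" 2>/dev/null; then
        printf 'E: %s fails %s -n\n' "$file" "$checker"
        return 1
    fi

    # ... while checkbashisms findings are informational only,
    # because the tool is known to produce false positives.
    if command -v checkbashisms >/dev/null 2>&1; then
        checkbashisms "$file" >/dev/null 2>&1 ||
            printf 'I: %s: possible bashisms\n' "$file"
    fi
    return 0
}
```

Called as check_sh_script /etc/init.d/foo (the path being just an example), it returns non-zero only for the hard parse failure, which matches the error/informational split described in the mail.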
On Wednesday, 24 March 2010, Guido Berhoerster wrote:
Here is an rpmlint check which helps find bashisms in /bin/sh scripts. It produces errors if dash -n fails and informational messages if checkbashisms finds anything (since it is only an indicator and produces false positives). It requires dash and checkbashisms to be installed (the former is in Factory, the latter can be obtained at http://git.debian.org/?p=devscripts/devscripts.git)
How about putting it into the dash package for now? To give the list an idea of what we're talking about in practice, I ran it over my 11.2 init scripts (those we need to have fixed urgently if we want to make dash an option). Only examples below:

possible bashism in /etc/init.d/ypbind line 86 (should be >word 2>&1):
ypwhich &>/dev/null && { notfound=0 ; break; };
possible bashism in /etc/init.d/xinetd line 88 ('$(< foo)' should be '$(cat foo)'):
pid=$(<$XINETD_PIDFILE)
possible bashism in /etc/init.d/smartd line 80 ($"foo" should be eval_gettext "foo"):
echo -n $"Reloading $prog daemon configuration: "
possible bashism in /etc/init.d/smbfs line 80 ('$[' should be '$(('):
timer=$[${timer}-1]
possible bashism in /etc/init.d/syslog line 90 (should be 'b = a'):
test "$1" == "stop" && exit 0
possible bashism in /etc/init.d/reboot line 175 (echo -e):
echo -e "$rc_done_up"
possible bashism in /etc/init.d/openvpn line 71 (shopt):
shopt -s nullglob
possible bashism in /etc/init.d/ntp line 88 ('function' is useless):
function ntpd_is_running() {

None of these cases are what Michael calls "visually more pleasing". For many of those I would even argue that the "pure" version is easier for human eyes to parse. There are surely some cases that are debatable, e.g. the function keyword is nice to have, but wouldn't justify a /bin/bash bang IMO. The shopt one (if really needed) surely does. And the echo one I have no idea about: I wonder how debian handles this problem.
Greetings, Stephan
* Stephan Kulow <coolo@suse.de> [2010-03-24 10:58]:
How about putting it into the dash package for now?
I'd have preferred putting it in a separate package in the Tools repo, but if you think it's better to bundle it with dash that'll be fine, too; there are not many upstream changes.
To give the list an idea about what we're talking in practise I ran it over my 11.2 init scripts (those we need to have fixed urgently if we
I have a list of potentially problematic scripts and I'd take a look if that's actually welcome. Now that upstart is in Factory, is there actually any intention to use its native format in the foreseeable future?
want to make dash an option) - only examples below:
possible bashism in /etc/init.d/ypbind line 86 (should be >word 2>&1):
ypwhich &>/dev/null && { notfound=0 ; break; };
possible bashism in /etc/init.d/xinetd line 88 ('$(< foo)' should be '$(cat foo)'):
pid=$(<$XINETD_PIDFILE)
possible bashism in /etc/init.d/smartd line 80 ($"foo" should be eval_gettext "foo"):
echo -n $"Reloading $prog daemon configuration: "
possible bashism in /etc/init.d/smbfs line 80 ('$[' should be '$(('):
timer=$[${timer}-1]
possible bashism in /etc/init.d/syslog line 90 (should be 'b = a'):
test "$1" == "stop" && exit 0
possible bashism in /etc/init.d/reboot line 175 (echo -e):
echo -e "$rc_done_up"
possible bashism in /etc/init.d/openvpn line 71 (shopt):
shopt -s nullglob
possible bashism in /etc/init.d/ntp line 88 ('function' is useless):
function ntpd_is_running() {
None of these cases are what Michael calls "visually more pleasing". For many of those I would even argue that the "pure" version is easier for human eyes to parse.
There are surely some cases that are debatable, e.g. the function keyword is nice to have, but wouldn't justify a /bin/bash bang IMO. The shopt one (if really needed) surely is.
Most of the above is fairly trivial to convert to SUSv3-compliant constructs without impeding readability. You'll encounter == quite often because people are used to it from other languages and it works in bash, although it doesn't offer any advantages. [[ is another popular candidate, which actually has an advantage over [ since it doesn't do word splitting and pathname expansion, but you can achieve the same by just putting test arguments inside quotes. For more complicated scripts with lots of functions it makes sense to use bash or ksh93 since they allow local variable scope. It's really a matter of using common sense and good judgement.
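To make the conversions concrete, here is a small sketch of POSIX spellings for several of the constructs flagged in the list above. The variable names, sample values and the temporary pid file are made up for illustration and are not taken from the actual init scripts.

```sh
#!/bin/sh
# POSIX replacements for some of the flagged constructs.
# All names and values here are hypothetical sample data.
pidfile=${TMPDIR:-/tmp}/bashism-demo.$$
printf '1234\n' > "$pidfile"

# bash: cmd &>/dev/null           -> redirect stdout, then point stderr at it
found=yes
ls /nonexistent-path-for-demo >/dev/null 2>&1 || found=no

# bash: pid=$(<"$pidfile")        -> use cat
pid=$(cat "$pidfile")

# bash: timer=$[${timer}-1]       -> POSIX arithmetic expansion
timer=10
timer=$(( $timer - 1 ))

# bash: test "$1" == "stop"       -> a single '='
action=stop
stopping=no
if [ "$action" = "stop" ]; then stopping=yes; fi

# bash: function is_running() {   -> drop the 'function' keyword
is_running() { [ -n "$pid" ]; }

# bash: [[ $msg == "two words" ]] -> quote the expansion inside [ ]
msg='two words'
quoted=no
if [ "$msg" = "two words" ]; then quoted=ok; fi

rm -f "$pidfile"
printf '%s %s %s %s %s\n' "$found" "$pid" "$timer" "$stopping" "$quoted"
```

The quoting in the last test is what stands in for [[ here: with "$msg" quoted, the value is passed to [ as one argument, with no word splitting or pathname expansion.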
And the echo one I have no idea about: I wonder how debian handles this problem.
Debian only allows echo -n as an extension to the SUSv3. But don't use echo, use printf. echo has many problems, see the rationale at http://www.opengroup.org/onlinepubs/9699919799/utilities/echo.html and http://www.in-ulm.de/~mascheck/various/echo+printf/ for the ugly details.
-- Guido Berhoerster
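As a minimal sketch of what "use printf" looks like for the two echo cases from the list (echo -n and echo -e), with sample values that are not taken from the real init scripts:

```sh
#!/bin/sh
# Portable replacements for echo -n and echo -e using printf.
prog=smartd   # sample value, for illustration only

# echo -n "Reloading ..."  -> leave the trailing \n out of the format string
line=$(printf 'Reloading %s daemon configuration: ' "$prog"; printf 'done\n')

# echo -e "a\tb"           -> escapes in the FORMAT string are interpreted
tabbed=$(printf 'a\tb\n')

# echo -e "$var"           -> %b interprets backslash escapes in the ARGUMENT
escaped=$(printf '%b\n' 'x\ty')

# Plain data goes through %s, so backslashes or a leading '-' stay literal.
data='-n \t literal'
literal=$(printf '%s\n' "$data")

printf '%s\n%s\n%s\n%s\n' "$line" "$tabbed" "$escaped" "$literal"
```

The split between %s and %b is the main design point: the script decides explicitly whether escapes in a value are interpreted, instead of depending on which echo implementation happens to be behind /bin/sh.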
participants (5):
- Guido Berhoerster
- Lubos Lunak
- Michael Matz
- Stefan Seyfried
- Stephan Kulow