Hello,

On Sat, 28 Apr 2012, Cristian Rodríguez wrote:
On 28/04/12 00:24, Jeff Janes wrote:
perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'| /usr/bin/time cut -d, -f2 |wc -c
71.93user 1.36system 1:13.58elapsed 99%CPU (0avgtext+0avgdata 3328maxresident)k

perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'| /usr/bin/time ~/coreutils_suse_source/bin/cut -d, -f2 |wc -c
3.81user 0.85system 0:05.08elapsed 91%CPU (0avgtext+0avgdata 3104maxresident)k

perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'| LANG=C /usr/bin/time ~/coreutils_suse_source/bin/cut -d, -f2 |wc -c
3.79user 0.89system 0:05.08elapsed 92%CPU (0avgtext+0avgdata 2368maxresident)k

[..]

Please file a bug report ;)

Seconded. ISTR something much like that some time ago that was also locale (UTF-8?) specific. So it is probably a regression, or basically the same bug cropping up in a different utility.

JFTR (first run, second was actually a tad slower):

$ perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'|\
    /usr/bin/time cut -d, -f2 |wc -c
3.82user 1.05system 0:05.38elapsed 90%CPU (0avgtext+0avgdata 2944maxresident)k
96inputs+0outputs (1major+233minor)pagefaults 0swaps
$ echo $LANG
en_US.iso885915
$ perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'| \
    LANG=en_US.UTF-8 /usr/bin/time cut -d, -f2 |wc -c
52.52user 1.34system 0:54.25elapsed 99%CPU (0avgtext+0avgdata 3152maxresident)k
680inputs+0outputs (11major+235minor)pagefaults 0swaps

So, it's quite definitely UTF-8 related. Probably it is how cut parses lines to find the separator in UTF-8 vs. 1-byte charsets.

For comparison:

$ perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'| \
    LANG=en_US.UTF-8 /usr/bin/time awk -F, '{print $2;}' | wc -c
35.68user 1.34system 0:37.51elapsed 98%CPU (0avgtext+0avgdata 4368maxresident)k
720inputs+0outputs (1major+336minor)pagefaults 0swaps
$ perl -le 'my $x="x"x1024; print "123456,$x" foreach 1..1e6'| \
    /usr/bin/time awk -F, '{print $2;}' | wc -c
2.00user 0.98system 0:03.38elapsed 88%CPU (0avgtext+0avgdata 4432maxresident)k
0inputs+0outputs (0major+342minor)pagefaults 0swaps

So awk seems to have the same problem (whoah: it's faster than 'cut' ;)

Feel free to add the above to the bug, and/or mail me the bug number / add me to the CC list of the bug and I'll add the above myself.

HTH,
-dnh

-- 
NT is the only OS that has caused me to beat a piece of hardware to
death with my bare hands. -- Derry Hamilton