https://bugzilla.novell.com/show_bug.cgi?id=225618

           Summary: UTF-8 performance issue
           Product: openSUSE 10.2
           Version: RC 1
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: koenig@linux.de
         QAContact: qa@suse.de

While digging into my updatedb/cron problem (see bug #225614), I ran
"updatedb" on the command line and noticed that sort took 5+ minutes of CPU
time to sort ~43 MB of output (800k lines), which IMHO is quite a lot.

I took the first 200k file names of the find output in updatedb and ran them
through sort, or just "wc", and got a horrible picture of the performance
with LC_CTYPE=de_DE.UTF-8 (user CPU time in seconds):

                  sort       wc
  de_DE.UTF-8    77.86     2.42
  de_DE           1.88     0.29

I understand that updatedb from cron will run with LC_ALL=POSIX, but users
will use tools like sort or wc, and their performance sucks for the German
locale!!

I have no idea which other applications and operations might be affected by
this, but since it looks like a generic libc/locale issue, it might be
a BIG problem...

btw, this reminds me a bit of the still-open emacs performance issue in
bug #182294.

Here are the full sort/wc commands with output for reference:

t3:~ # ( LC_CTYPE=de_DE.UTF-8 ; head -200000 /tmp/sort-in | time sort -f > /dev/null )
77.86user 0.47system 1:19.22elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+3897minor)pagefaults 0swaps

t3:~ # ( LC_CTYPE=de_DE ; head -200000 /tmp/sort-in | time sort -f > /dev/null )
1.88user 0.05system 0:02.04elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+3877minor)pagefaults 0swaps

t3:~ # ( LC_CTYPE=de_DE.UTF-8 ; head -200000 /tmp/sort-in | time wc )
 200000  200014 10063202
2.42user 0.04system 0:02.58elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+223minor)pagefaults 0swaps

t3:~ # ( LC_CTYPE=de_DE ; head -200000 /tmp/sort-in | time wc )
 200000  200014 10063202
0.29user 0.01system 0:00.38elapsed 79%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+204minor)pagefaults 0swaps
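
For anyone who wants to repeat the comparison on another machine, here is a rough sketch of how the figures above could be put into relation and re-measured. The helper names `slowdown` and `bench` are made up for illustration and are not part of the original report. `bench` assumes that the locale in question is installed (check with `locale -a`), that LC_ALL is unset (it would override LC_CTYPE), and that `date +%s` is available. It measures elapsed wall-clock seconds, which is coarser than the user CPU time reported by `time` above.

```shell
#!/bin/sh
# Sketch only: `slowdown` and `bench` are hypothetical helper names,
# not part of the original report.

# slowdown SLOW FAST: print SLOW/FAST to one decimal place.
# LC_ALL=C keeps awk's decimal point independent of the user's locale.
slowdown() {
    LC_ALL=C awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# bench LOCALE FILE CMD [ARGS...]: run CMD on FILE with LC_CTYPE=LOCALE and
# print the elapsed wall-clock time in whole seconds.
# Assumptions: LOCALE is installed, LC_ALL is unset (it would override
# LC_CTYPE), and `date +%s` is supported by the local date command.
bench() {
    bench_locale=$1
    bench_file=$2
    shift 2
    bench_start=$(date +%s)
    LC_CTYPE=$bench_locale "$@" < "$bench_file" > /dev/null
    bench_end=$(date +%s)
    echo $((bench_end - bench_start))
}

# Slowdown factors computed from the user-time figures in the report.
echo "sort -f: $(slowdown 77.86 1.88)x"
echo "wc:      $(slowdown 2.42 0.29)x"
```

With the input file from the transcript, a run might look like `bench de_DE.UTF-8 /tmp/sort-in sort -f` followed by `bench de_DE /tmp/sort-in sort -f`. Whole-second resolution is crude, but it should be enough to show a difference of the size reported here.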