Mailinglist Archive: opensuse-bugs (4766 mails)

[Bug 599000] New: sort from coreutils is broken when used in some locales (very slow and crashes)
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Thu, 22 Apr 2010 19:06:53 +0000
  • Message-id: <bug-599000-21960@xxxxxxxxxxxxxxxxxxxxxxxx/>
http://bugzilla.novell.com/show_bug.cgi?id=599000

http://bugzilla.novell.com/show_bug.cgi?id=599000#c0


Summary: sort from coreutils is broken when used in some locales (very slow and crashes)
Classification: openSUSE
Product: openSUSE 11.3
Version: Milestone 5
Platform: x86-64
OS/Version: openSUSE 11.3
Status: NEW
Severity: Major
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@xxxxxxxxxxxxxxxxxxxxxx
ReportedBy: xsov@xxxxxxx
QAContact: qa@xxxxxxx
Found By: ---
Blocker: ---


User-Agent: Mozilla/5.0 (compatible; Konqueror/4.4; Linux; ru) KHTML/4.4.2 (like Gecko) SUSE

I am the developer of the Free-SA tool (free-sa.sf.net), whose performance
depends significantly on the speed of the sort tool from the coreutils package.
Many Free-SA users have complained that Free-SA is very slow on openSUSE. Since
I migrated from Slackware to openSUSE and noticed the same problem, I decided to
report it.

Below are samples using a large Squid access.log file and two sort tools: 'sort'
is the original openSUSE tool and 'sort.slackware' is from Slackware-current
(ftp://ftp.slackware.com/pub/slackware/slackware64-current/slackware64/a/coreutils-8.4-x86_64-2.txz).
Information about the standard Squid log file used in the samples below:
# ls -l access.log
-rw-rw-rw- 1 root root 892411388 Jan 31 2010 access.log

To conclude: sort from the openSUSE coreutils package is slow, consumes too much
memory, and therefore sometimes fails to finish its work when used in some
locales (at least in ru_RU.UTF-8).

Reproducible: Always

Steps to Reproduce:
# locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=
# cat access.log|awk '{print $3}'|time sort -k1,1n|uniq
sort: memory exhausted
Command exited with non-zero status 2
28.18user 1.52system 0:51.63elapsed 57%CPU (0avgtext+0avgdata 7667264maxresident)k
7304inputs+57176outputs (39major+829979minor)pagefaults 0swaps
Actual Results:
# cat access.log|awk '{print $3}'|time sort -k1,1n|uniq
sort: memory exhausted
Command exited with non-zero status 2
28.18user 1.52system 0:51.63elapsed 57%CPU (0avgtext+0avgdata 7667264maxresident)k
7304inputs+57176outputs (39major+829979minor)pagefaults 0swaps

Expected Results:
Everything works fine with sort from the Slackware coreutils package:
# cat access.log|awk '{print $3}'|time sort.slackware -k1,1n|uniq
192.168.0.24
192.168.0.25
192.168.0.26
192.168.0.30
192.168.0.31
192.168.0.33
192.168.0.34
192.168.0.35
192.168.0.36
23.89user 0.18system 0:32.53elapsed 73%CPU (0avgtext+0avgdata 204192maxresident)k
0inputs+150160outputs (0major+13354minor)pagefaults 0swaps

Everything works fine with openSUSE sort when I set the locale to 'C':
# export LC_ALL=C;export LANG=C;cat access.log|awk '{print $3}'|time sort -k1,1n|uniq
192.168.0.24
192.168.0.25
192.168.0.26
192.168.0.30
192.168.0.31
192.168.0.33
192.168.0.34
192.168.0.35
192.168.0.36
8.25user 0.19system 0:19.44elapsed 43%CPU (0avgtext+0avgdata 203696maxresident)k
24inputs+150160outputs (0major+13322minor)pagefaults 0swaps
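
The workaround above exports LC_ALL and LANG for the whole shell session. A
narrower variant, sketched here assuming a POSIX shell, applies the 'C' locale
to the sort invocation only, so awk, uniq and the rest of the session keep the
Russian locale. The function name unique_clients is hypothetical; the field
number and sort key are taken from the commands above.

```shell
# Hypothetical helper: print the unique values of the third field of a log
# file, running only sort under the C locale.
unique_clients() {
    # $1: path to the log file (e.g. access.log)
    awk '{print $3}' "$1" | LC_ALL=C sort -k1,1n | uniq
}
```

Usage would be "unique_clients access.log". This only sidesteps the problem for
input that does not need locale-aware ordering; it does not fix sort itself.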

I checked and found that the Slackware sort tool works properly, i.e. it sorts
as expected for the Russian locale. I did this check to rule out that Slackware
sort somehow falls back to the 'C' locale internally and is faster and uses less
memory than the openSUSE version only for that reason.
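
One way such a check could be scripted is sketched below, assuming a POSIX
shell. It is not the exact check performed for this report. The function name
collation_differs is hypothetical: it runs a given sort binary on the same file
under a named locale and under 'C', and succeeds if the two orderings differ.

```shell
# Hypothetical check: does the sort binary "$1" order the lines of file "$2"
# differently under the locale named in "$3" than under the C locale?
collation_differs() {
    in_locale=$(LC_ALL=$3 "$1" "$2")   # ordering under the requested locale
    in_c=$(LC_ALL=C "$1" "$2")         # ordering under plain byte comparison
    [ "$in_locale" != "$in_c" ]
}
```

For example, "collation_differs sort.slackware sample.txt ru_RU.UTF-8 && echo
locale-aware", where sample.txt is an assumed test file with mixed-case or
Cyrillic lines. Two caveats: the input must contain lines that the two locales
order differently, and if the named locale is not installed, sort runs in the
'C' locale, so the check reports no difference.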
