I have a very large syslog file, over 1 million lines, and I am trying to collapse duplicate entries, for example this filesystem message that spams every few seconds:

Feb 1 00:17:00 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
Feb 1 00:17:04 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
Feb 1 00:17:20 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
Feb 1 00:17:49 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full

So I am doing this:

cat syslog | sort | uniq -f3 > tempfile

This reduces the roughly 103,894 messages like those above to 3,916, but it should be reducing them to 1. The only difference between these entries is the timestamp, which -f3 tells uniq to ignore.

So I next started looking at the resulting 3,916 messages and can't find any differences between them other than within the first 16 characters of each line, which is the timestamp. I compared the positions of the whitespace, checked for non-printable characters using cat -A, etc.

I also tried adding -c to uniq to have it prefix each output line with a count of the number of duplicates removed, and that seems to be working (however, I am still getting 3,916 "identical" messages in the output file). Ugh.

My next step is to compile Solaris's uniq on SLES 9 and see if that resolves the problem. Has anyone else done any heavy-duty work with uniq and noticed anything similar to what I am seeing?

Thx

=====
Chuck Carson - Sr. Systems Engineer
Syrrx, Inc. - www.syrrx.com
10410 Science Center Drive
San Diego, CA 92121
Work: 858.622.8528
Fax: 858.550.0526
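A possible workaround that sidesteps uniq's field handling entirely: the classic BSD syslog timestamp is a fixed 15 characters ("Mmm dd hh:mm:ss", with single-digit days space-padded), so the comparison can be keyed on everything from character 17 onward instead. A sketch, where the file name sample.log is made up for illustration:

```shell
# Build a tiny sample. The day field is space-padded ("Feb  1"), so the
# timestamp occupies a fixed 15 characters, followed by one space.
cat > sample.log <<'EOF'
Feb  1 00:17:00 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
Feb  1 00:17:04 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
Feb  1 00:17:20 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
Feb  1 00:17:49 db-0202 ufs: [ID 845546 kern.notice] NOTICE: alloc: /u70: file system full
EOF

# Workaround 1: strip the timestamp before comparing, then sort -u.
cut -c17- sample.log | sort -u > dedup1.out

# Workaround 2: keep the first full line (timestamp included) for each
# distinct message body, preserving the original order of the log.
awk '!seen[substr($0, 17)]++' sample.log > dedup2.out

wc -l dedup1.out dedup2.out   # each should contain a single line
```

The awk variant has the nice property of keeping the timestamp of the first occurrence, so you can still see when each flood of messages started.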
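On the diagnosis side, cat -A (the GNU coreutils flag; there is no lowercase -a) marks tabs as ^I and line ends as $, but a byte-level dump is more conclusive when hunting for invisible differences such as trailing blanks or carriage returns. A sketch, using made-up file names a.txt and b.txt standing in for two of the "identical" survivors:

```shell
# Two lines that print identically but differ by a trailing blank.
printf 'NOTICE: alloc: /u70: file system full \n' > a.txt
printf 'NOTICE: alloc: /u70: file system full\n'  > b.txt

# cat -A marks line ends with $, so the trailing space becomes visible
# as a blank just before the $:
cat -A a.txt b.txt

# od -c dumps every byte, which also exposes characters like \r:
od -c a.txt

# cmp reports the first byte position at which the files differ:
cmp a.txt b.txt || echo "files differ"
```

If cmp reports the survivors as byte-identical after the timestamp, that would point the finger at uniq itself rather than the data.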