Greg Freemyer wrote:
I need a trivial program that will read in a file and tell me how frequently each of the 255 byte values shows up. *Whoops, I need to know about all 256 values.*
I know I could do something in awk, etc., but I have 5 TB of data to process (only 15 or so files; one is over 2 TB, and none are small).
No programs that use 32-bit ints to hold the count.
If nothing exists, I guess I can write a C program relatively quickly. I'll give the brains here until morning (New York time) to suggest an efficient solution.
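Greg, surely you could have written the code faster than asking the question?

#define _FILE_OFFSET_BITS 64   /* large-file support on 32-bit glibc builds */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* unsigned long long is at least 64 bits, even on a 32-bit build */
unsigned long long count[256] = {0};
/* unsigned char, so bytes >= 0x80 never become negative indexes */
unsigned char buffer[1048576];

int main( int argc, char** argv )
{
    char name[4096];
    int f, i;
    ssize_t len;

    while( fgets(name,sizeof(name),stdin) ) {
        name[strcspn(name,"\n")] = '\0';   /* fgets keeps the newline */
        f=open(name,O_RDONLY);
        if( f<0 ) { perror(name); continue; }
        /* read() returns 0 at end of file and -1 on error; stop on both */
        while( (len=read(f,buffer,sizeof(buffer))) > 0 ) {
            for ( i=0; i<len; i++ ) count[buffer[i]]++;
        }
        close(f);
    }
    for( i=0; i<256; i++ ) printf("%d = %llu\n", i, count[i]);
    return 0;
}

Feed your filenames to stdin, one per line.

-- 
Per Jessen, Zürich (6.8°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.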
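For anyone who wants to check the two points that matter here (the counter width and the signedness of the buffer) without reading 5 TB, here is a minimal sketch of just the counting step. The names count_bytes and exceeds_32bit are hypothetical helpers made up for illustration; they are not part of the program above. It assumes a C99 compiler with <stdint.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add the byte frequencies of buf[0..len)
 * to counts[0..255]. The caller zeroes counts before the first call. */
void count_bytes(const unsigned char *buf, size_t len, uint64_t counts[256])
{
    assert(buf != NULL || len == 0);
    assert(counts != NULL);
    for (size_t i = 0; i < len; i++)
        counts[buf[i]]++;   /* unsigned char: index is always 0..255 */
}

/* Hypothetical helper: returns 1 if a tally of n occurrences
 * would not fit in a 32-bit unsigned counter. */
int exceeds_32bit(uint64_t n)
{
    return n > UINT32_MAX;
}
```

A 32-bit counter tops out at 4,294,967,295, so a single 2 TB file consisting mostly of one byte value would wrap it several hundred times over; uint64_t leaves ample headroom. With a plain char buffer on a platform where char is signed, any byte from 0x80 upwards would turn into a negative array index, which is why the buffer is declared unsigned.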