Greg Freemyer wrote:
All,
I have a huge text file (1.7 million lines) full of unicode and ascii text (half and half).
note: disk space for copies is not a problem, if I need to manipulate this file
Also, I have a 30 line file full of ascii text.
I need to search the large file for any occurrences of the keywords in the 30 line file.
Ignoring the unicode issue, I could use grep (or fgrep, egrep) with appropriate args.
I have no idea how to handle the unicode issue.
ASCII is a subset of UTF-8, so if your Unicode encoding is UTF-8 there should be no problem: the ASCII keywords appear in the file as exactly the same bytes. Otherwise, I wonder how you managed to create a file that intersperses a single-byte encoding with a fixed-size multibyte one... I guess you would have to split the file somehow.

Best regards
Petr
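
For the non-UTF-8 case, one alternative to splitting the file is to search the raw bytes for each keyword in every encoding it might appear in. The following is only a sketch under assumptions: it guesses that the "unicode" half is UTF-16 (little- or big-endian), which the original post does not state, and the names search_bytes and search_file are made up for illustration. It also reads the whole file into memory, which may or may not be acceptable for a file of this size.

```python
from pathlib import Path

# Encodings to try for each keyword. The UTF-16 variants are an assumption
# about what "unicode" means in the large file; adjust as needed.
ENCODINGS = ("utf-8", "utf-16-le", "utf-16-be")


def keyword_patterns(keywords):
    """Map each encoded byte pattern back to its (keyword, encoding)."""
    patterns = {}
    for word in keywords:
        for enc in ENCODINGS:
            patterns.setdefault(word.encode(enc), (word, enc))
    return patterns


def search_bytes(data, keywords):
    """Return sorted (offset, keyword, encoding) for every match in data."""
    hits = []
    for pattern, (word, enc) in keyword_patterns(keywords).items():
        start = data.find(pattern)
        while start != -1:
            hits.append((start, word, enc))
            start = data.find(pattern, start + 1)
    return sorted(hits)


def search_file(big_path, keyword_path):
    """Search the large file for the keywords listed one per line."""
    lines = Path(keyword_path).read_text(encoding="ascii").splitlines()
    keywords = [line.strip() for line in lines if line.strip()]
    return search_bytes(Path(big_path).read_bytes(), keywords)
```

Calling search_file("big.txt", "keywords.txt") (both file names hypothetical) would return byte offsets rather than line numbers, because line boundaries are not well defined when encodings are mixed.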