editor for extremely large text file
Hello,

Does anyone know of an editor that can open a 39 GB text file? I need to locate and copy out a large block of text from this file: about 8 million rows, roughly 200 MB of data.

Thank you in advance,
James
On Tue, 2005-05-10 at 15:06 -0700, James D. Parra wrote:
> Hello,
> Anyone know of an editor that can open a 39 GB text file? I need to locate and copy out a large block of text from this file, about 8 million rows, at roughly 200 MB of data.
> Thank you in advance,
> James
Is there any pattern to the text you need removed? Or are they just random lines in the file? Others on the list may be able to help using sed. I sometimes remove lines that match a pattern with

    grep -v <pattern> <filename> > <newfile>

If all went well, rename the new file to the old one.

I think vi would handle the size, but it will take a l o n g time to open, as vi makes a copy of the file being edited in case of a crash.

--
Ken Schneider
UNIX since 1989, linux since 1994, SuSE since 1998
"The day Microsoft makes something that doesn't suck is probably the day they start making vacuum cleaners." -Ernst Jan Plugge
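A minimal sketch of the filter-and-rename step described above, run on a tiny sample file. The file names and the DEBUG pattern are made up for illustration. Note that this writes a second copy of everything that is kept, so on a 39 GB input it needs comparable free disk space.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real file.
workdir=$(mktemp -d)
infile="$workdir/sample.txt"
newfile="$workdir/sample.filtered"
printf '%s\n' 'keep 1' 'DEBUG noise' 'keep 2' 'DEBUG more noise' > "$infile"

# grep reads the file itself, so no separate cat is needed.
# -v selects the lines that do NOT match the pattern.
# mv runs only when grep exits 0, i.e. it printed at least one line
# and hit no error, so a failed filter never replaces the original.
grep -v -e 'DEBUG' "$infile" > "$newfile" && mv "$newfile" "$infile"
```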
On Wednesday 11 May 2005 00:06, James D. Parra wrote:
> Hello,
> Anyone know of an editor that can open a 39 GB text file? I need to locate and copy out a large block of text from this file, about 8 million rows, at roughly 200 MB of data.
sed

Once you know the starting and ending line numbers of the block you want to copy, you can do something like

    sed -ne '<starting line>,<finishing line>p' infile > outfile

For example, if the block begins at line 5000 and ends at line 100000, it would be

    sed -ne '5000,100000p' infile > outfile

If you know the contents of the starting and ending lines instead, you can match on them. For example, if the starting line is ===start=== and the ending line is ===end===, it would be

    sed -ne '/===start===/,/===end===/p' infile > outfile
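A minimal sketch of the line-range approach above, run against a small generated file rather than the real 39 GB one. The file names, the seq-generated sample data, and the use of grep -m (a GNU/BSD extension, not POSIX) to look up the line numbers are assumptions for illustration. The extra q command makes sed stop at the last wanted line instead of reading the remainder of the file.

```shell
#!/bin/sh
# Hypothetical stand-in for the big file: line N contains the number N.
workdir=$(mktemp -d)
infile="$workdir/infile"
outfile="$workdir/outfile"
seq 1 1000 > "$infile"

# Look up the line numbers of the first and last lines of the block.
# grep -n prints "LINENO:text"; -m 1 stops after the first match.
start=$(grep -n -m 1 '^500$' "$infile" | cut -d: -f1)
end=$(grep -n -m 1 '^510$' "$infile" | cut -d: -f1)

# Print lines start..end, then quit at line end so the rest is not read.
sed -n "${start},${end}p;${end}q" "$infile" > "$outfile"
```

On the real file only the two grep patterns and the file names would change; the lookups still have to scan up to the matching lines, but sed no longer reads past the end of the block.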
On Wed, 11 May 2005 08:06, James D. Parra wrote:
> Hello,
> Anyone know of an editor that can open a 39 GB text file? I need to locate and copy out a large block of text from this file, about 8 million rows, at roughly 200 MB of data.
> Thank you in advance,
> James
Well, I don't know about handling a 39G text file, but you could look at Crisp. I know it handles 1G files quite well. You can download a time-limited demo version from http://www.crisp.com/

--
Regards,
Graham Smith
On Wednesday 11 May 2005 13:09, Graham Smith wrote:
> On Wed, 11 May 2005 08:06, James D. Parra wrote:
>> Hello,
>> Anyone know of an editor that can open a 39 GB text file? I need to locate and copy out a large block of text from this file, about 8 million rows, at roughly 200 MB of data.
>> Thank you in advance,
>> James
> Well I don't know about handling a 39G text file but you could look at Crisp. I know it handles 1G files quite well. You can download a demo version which is time limited from http://www.crisp.com/
> -- Regards,
> Graham Smith
You must have 'some' idea of the text for which you are searching. Write a C program which reads and searches, and when your search string is found, dumps, say, a MB to a file and notes the position (character count) in the file. Continue the search, creating a new file for each 'find'.

Examine each file for the text you want, then seek(position) and read/copy to a new file and edit it. Your initial search program might take an evening to run, but ...

Good luck,
Colin
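The search-then-seek idea above can be approximated with standard tools instead of a custom C program. This is only a sketch under stated assumptions: it relies on grep -b and -m and on head -c, which GNU and BSD tools provide but POSIX does not require, and the file names, the MARKER search string, and the 16-byte chunk size are made up for illustration (the post suggests about a MB per chunk).

```shell
#!/bin/sh
# Hypothetical sample data: the MARKER line starts at byte offset 10.
workdir=$(mktemp -d)
infile="$workdir/infile"
chunk="$workdir/chunk"
printf '%s\n' 'aaaa' 'bbbb' 'MARKER here' 'cccc' 'dddd' > "$infile"

# grep -b prefixes the matching line with its 0-based byte offset;
# -m 1 stops at the first match.
offset=$(grep -b -m 1 'MARKER' "$infile" | cut -d: -f1)

# tail -c +N starts output at byte N counting from 1, hence offset + 1.
# head -c keeps only the next 16 bytes as the chunk to inspect.
tail -c +"$((offset + 1))" "$infile" | head -c 16 > "$chunk"
```

Once the right offset has been confirmed by inspecting the chunk, the same tail and head pair with a larger byte count copies out the whole block.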
On Wednesday 11 May 2005 13:23, Colin Carter wrote:
> You must have 'some' idea of the text for which you are searching. Write a C program which reads and searches, and when your search string is found dump, say a MB, to a file and note the position (character count) in the file.
There already is such a beast. It's called grep.

For example

    grep -n -C 10000 "pattern" infile > outfile

will find the pattern in infile and print it, along with the 10000 preceding and 10000 following lines, into the file outfile, with each line prefixed by its line number in the original file.
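A small sketch of what that grep invocation writes, using a context of 2 lines on a generated 100-line file instead of 10000 lines on the real one. The file names and the seq-generated data are assumptions for illustration. In the output, the matching line has a colon after its line number, while context lines have a hyphen.

```shell
#!/bin/sh
# Hypothetical stand-in for the big file: line N contains the number N.
workdir=$(mktemp -d)
infile="$workdir/infile"
outfile="$workdir/outfile"
seq 1 100 > "$infile"

# -n prefixes each output line with its line number in infile.
# -C 2 adds 2 lines of context before and after each match.
grep -n -C 2 '^50$' "$infile" > "$outfile"

# Show the result: five lines, numbered 48 through 52.
cat "$outfile"
```

The line-number prefixes have to be stripped before the extracted block is used as data, for example by taking the reported numbers and feeding them to a sed line-range command.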
On Thursday 12 May 2005 03:09, Anders Johansson wrote:
> On Wednesday 11 May 2005 13:23, Colin Carter wrote:
>> You must have 'some' idea of the text for which you are searching. Write a C program which reads and searches, and when your search string is found dump, say a MB, to a file and note the position (character count) in the file.
> There already is such a beast. It's called grep.
> For example
>     grep -n -C 10000 "pattern" infile > outfile
> will find the pattern in the infile and print it, along with the 10000 preceding and 10000 following lines into the file outfile, with each line given its line number in the original file
Thanks, Anders. I am a beginner with grep (and Linux), so I only use grep in 'gentle' ways as yet.

Regards,
Colin
On Tue, 2005-05-10 at 18:06, James D. Parra wrote:
> Hello,
> Anyone know of an editor that can open a 39 GB text file? I need to locate and copy out a large block of text from this file, about 8 million rows, at roughly 200 MB of data.
Vedit from Greenview Data comes to mind: http://www.vedit.com. The DOS-based client was capable of opening files of up to 1 GB in size. It also ran under Unix, IIRC. You can find a more current version at the above link.
participants (6)
- Anders Johansson
- Colin Carter
- Graham Smith
- James D. Parra
- Ken Schneider
- Mike McMullin