[opensuse] bashscripting (?) help in working with text files
Hello List!
Through many pains I got the text file with some data and for plotting purposes I need to separate it into files... with minimal knowledge on bash scripting I was thinking maybe you could help me to figure that out... (I tried to sort that out by hand, but then realised it will probably take a month...)
I have special lines in my text file so, I'd love to be able to script something like this
    read through lines
    if come to the string "bla bla bla" {
        take the number after ": " on that string and put in file named "something"...
    }
Maybe any references where I can read that, without going through "bash scripting for dummies"? :)
Thanks in advance and sorry for offtop.
Sergey
-- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
-- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org For additional commands, e-mail: opensuse+help@opensuse.org
Mon, 21 Jan 2008, by physwalker@gmail.com:
Hello List!
Through many pains I got the text file with some data and for plotting purposes I need to separate it into files... with minimal knowledge on bash scripting I was thinking maybe you could help me to figure that out... (I tried to sort that out by hand, but then realised it will probably take a month...)
I have special lines in my text file so, I'd love to be able to script something like this
read through lines if come to the string "bla bla bla"{ take the number after ": " on that string and put in file named "something"...
Can be done in different ways, of course.
With grep and cut:
    grep "bla bla bla" file | cut -d: -f2 >> something
With awk:
    awk -F: '/bla bla bla/ {print $2}' file >> something
With Bash:
    OLDIFS=$IFS
    IFS=:
    while read Line; do
        set -- $Line
        case $1 in
            "bla bla bla") echo $2 >> something ;;
        esac
    done < file
    IFS=$OLDIFS
Theo
-- Theo v. Werkhoven Registered Linux user# 99872 http://counter.li.org ICBM 52 13 26N , 4 29 47E. + ICQ: 277217131 SUSE 10.3 + Jabber: muadib@jabber.xs4all.nl Kernel 2.6.22 + See headers for PGP/GPG info. Claimer: any email I receive will become my property. Disclaimers do not apply.
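One detail about the grep/cut and awk variants: splitting on ":" alone keeps the blank that follows the colon, so each extracted number starts with a space. A minimal sketch that avoids this, assuming the special lines look like "bla bla bla: 42" (the sample data and the file name data.txt are made up for illustration; the thread never shows the real file):

```shell
#!/bin/sh
# Hypothetical sample input; the real file layout is not shown in the thread.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > data.txt <<'EOF'
header line
bla bla bla: 42
other: 7
bla bla bla: 3.5
EOF

# Use ": " (colon plus space) as the field separator so that $2
# holds the number without a leading blank.
awk -F': ' '/bla bla bla/ {print $2}' data.txt > something

cat something
```

Writing with ">" instead of ">>" starts the output file fresh on every run, which may or may not be what is wanted when several input files feed the same output.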
On January 21, 2008 06:26:18 pm Theo v. Werkhoven wrote:
Can be done in different ways, of course.
With grep and cut:
    grep "bla bla bla" file | cut -d: -f2 >> something
With awk:
    awk -F: '/bla bla bla/ {print $2}' file >> something
With Bash:
    OLDIFS=$IFS
    IFS=:
    while read Line; do
        set -- $Line
        case $1 in
            "bla bla bla") echo $2 >> something ;;
        esac
    done < file
    IFS=$OLDIFS
Theo
Theo, Randall thanks really a lot... grep worked for me, but I also enjoyed reading man pages of awk and csplit and trying to figure them out too ;) That definitely helps me a lot! In fact, I customized the code in a way to get the output in a more "plottable" way, but even then I'll still need to use these tools to work with that! Thank you very much again! Sergey -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
On Tuesday 22 January 2008 07:27, Sergey Mkrtchyan wrote:
...
Thank you very much again!
I'm always happy to help an astronomer!
Sergey
-- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy
Randall Schulz
On January 22, 2008 09:04:39 pm Randall R Schulz wrote:
I'm always happy to help an astronomer!
:D thanks a lot, it's really great to have you guys here. I hope you don't mind biophysicists too :-) (I gotta change to astro otherwise ;-) ) Cheers, -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
On Monday 21 January 2008 13:56, Sergey Mkrtchyan wrote:
Hello List!
Through many pains I got the text file with some data and for plotting purposes I need to separate it into files... with minimal knowledge on bash scripting I was thinking maybe you could help me to figure that out... (I tried to sort that out by hand, but then realised it will probably take a month...)
Before you try to script a solution, check out whether the "csplit" (context split) program can do what you need.
Sergey Mkrtchyan
Randall Schulz
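As a rough illustration of the csplit suggestion, the sketch below cuts a file into one piece per marker line. It assumes GNU csplit and assumes each data block begins with a "bla bla bla" line; the sample data, the file name data.txt and the part_ prefix are invented for the example.

```shell
#!/bin/sh
# Hypothetical sample input; block markers are assumed to start a line.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > data.txt <<'EOF'
# simulation output
bla bla bla: 1
0.1 0.2
0.3 0.4
bla bla bla: 2
0.5 0.6
EOF

# GNU csplit: start a new piece at every line matching the pattern.
#   -s     do not print the size of each piece
#   -z     do not keep empty pieces
#   -f     prefix for the output file names
#   '{*}'  repeat the pattern as many times as possible
csplit -s -z -f part_ data.txt '/^bla bla bla/' '{*}'

ls part_*
```

Everything before the first marker lands in the first piece, and each later piece begins with its own marker line, so the pieces can be fed to a plotting tool one by one.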
Sergey Mkrtchyan wrote:
Hello List!
Through many pains I got the text file with some data and for plotting purposes I need to separate it into files... with minimal knowledge on bash scripting I was thinking maybe you could help me to figure that out... (I tried to sort that out by hand, but then realised it will probably take a month...)
I have special lines in my text file so, I'd love to be able to script something like this
read through lines if come to the string "bla bla bla"{ take the number after ": " on that string and put in file named "something"... }
Maybe any references where I can read that, without going through "bash scripting for dummies"? :)
Thanks in advance and sorry for offtop.
Can you give us A) some sample input and B) A description of what you are trying to accomplish. That would help immensely.
Shell version:
    fgrep 'blah blah blah' file | head -1 | sed 's/^.*: \([0-9][0-9]*\).*$/\1/' > numberfile
Perl version:
    perl -n -e 'if (/blah blah blah/) { s/^.*: ([0-9]+).*$/$1/; print $_; exit(0); }' < file_with_blah > numberfile
Shell version will fork three processes; the Perl version will fork one and the Perl version will quit right after it finds the matching line.
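The early exit is not exclusive to Perl. A sketch of a single-process alternative to the fgrep | head | sed pipeline, using sed alone (the sample file contents are invented for illustration):

```shell
#!/bin/sh
# Hypothetical sample input with two matching lines.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > file <<'EOF'
preamble
blah blah blah: 17 units
blah blah blah: 99 units
EOF

# -n: print nothing by default. On the first matching line, reduce the
# line to the number after ": ", print it (p), then quit (q) so the
# rest of the file is never read.
sed -n '/blah blah blah/{
s/^.*: \([0-9][0-9]*\).*$/\1/p
q
}' file > numberfile

cat numberfile
```

Only the first matching line is processed; dropping the q command would extract the number from every matching line instead.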
On January 22, 2008 09:40:32 pm Scott Simpson wrote:
Shell version: fgrep 'blah blah blah' file | head -1 | sed 's/^.*: \([0-9][0-9]*\).*$/\1/' > numberfile
Perl version: perl -n -e 'if (/blah blah blah/) { s/^.*: ([0-9]+).*$/$1/; print $_; exit(0); }' < file_with_blah > numberfile
Shell version will fork three processes; the Perl version will fork one and the Perl version will quit right after it finds the matching line.
Thanks a lot Scott, Recently I gotta play around with text files too much, hopefully that will make me to start reading some shell scripting soon! Cheers, -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
Sergey Mkrtchyan wrote:
Recently I gotta play around with text files too much, hopefully that will make me to start reading some shell scripting soon!
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash. Cheers, Dave
On Wednesday 23 January 2008 09:00, Dave Howorth wrote:
Sergey Mkrtchyan wrote:
Recently I gotta play around with text files too much, hopefully that will make me to start reading some shell scripting soon!
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash.
Yeah, I understand there's lots of genomics going on in astronomy departments these days. What an exciting time to be a grad student. Protons, neutrons, electrons, quarks, gluons, neutrinos and, of course, restriction enzymes. What fun!
Cheers, Dave
RRS
On January 23, 2008 12:10:16 pm Randall R Schulz wrote:
Yeah, I understand there's lots of genomics going on in astronomy departments these days. What an exciting time to be a grad student. Protons, neutrons, electrons, quarks, gluons, neutrinos and, of course, restriction enzymes.
It's pretty tough too - courses, a lot of assignments, research, meetings, deadlines, exams, tough times living alone in other country and "no, there is no way I can save for a laptop" stuff ;-) But on average, it's pretty exciting! -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
Dave Howorth wrote:
Sergey Mkrtchyan wrote:
Recently I gotta play around with text files too much, hopefully that will make me to start reading some shell scripting soon!
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash.
The big problem with perl, however, is that it rapidly turns into "write-only" code. Although perl has excellent pattern-matching abilities, most pattern-matching strings are almost completely indecipherable, even to the programmer once he hasn't looked at them for a couple of weeks. And 90% of programming is maintaining code written by you or someone else. If you can't understand your own code well enough to maintain it, what's it going to be like for someone else who is NOT you?
If it weren't for that factor, perl would be great. But... with 15 different ways to do things, and the language now so large that most people use a subset of it, you run into situations where you can't understand what someone is doing with their code, not because you're stupid, but because your method of implementing the algorithm and the other guy's method of implementing the same algorithm look completely different.
Perl is rapidly demonstrating the same sort of "2nd System Effect" (let's try to do EVERYTHING) that Multics did -- with the same result -- too much complexity, and the creation of a byzantine system which forever has mysterious corners, even to people who use it all day, every day.
Cheers, Dave
On Wed, 2008-01-23 at 12:16 -0500, Aaron Kulkis wrote:
Dave Howorth wrote:
Sergey Mkrtchyan wrote:
Recently I gotta play around with text files too much, hopefully that will make me to start reading some shell scripting soon!
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash.
Often overlooked is Tcl (www.tcl.tk), which does excellent text processing. It properly supports regular expressions as well. And Unicode text. Most Linux distributions come with it. ActiveState makes packages for many platforms, which makes installation a breeze. -- Roger Oberholtzer OPQ Systems / Ramböll RST Ramböll Sverige AB Kapellgränd 7 P.O. Box 4205 SE-102 65 Stockholm, Sweden Office: Int +46 8-615 60 20 Mobile: Int +46 70-815 1696
On Thu, 2008-01-24 at 11:24 +0100, Roger Oberholtzer wrote:
Dave Howorth wrote:
Sergey Mkrtchyan wrote:
Recently I gotta play around with text files too much, hopefully that will make me to start reading some shell scripting soon!
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash.
Often overlooked is Tcl (www.tcl.tk), which does excellent text processing. It properly supports regular expressions as well. And Unicode text. Most Linux distributions come with it. ActiveState makes packages for many platforms, which makes installation a breeze.
I believe the important thing is not particularly the design or the power of the language itself but the available libraries and other components relevant to the job. I know tcl is used to some extent but AFAIK there are a lot more Perl and Python components that might be relevant to Sergey's work. Cheers, Dave
On January 24, 2008 03:36:05 pm Dave Howorth wrote:
I believe the important thing is not particularly the design or the power of the language itself but the available libraries and other components relevant to the job. I know tcl is used to some extent but AFAIK there are a lot more Perl and Python components that might be relevant to Sergey's work.
Now that you make me wonder, Perl is similar to C/C++, right? (I've just never seen it). Because in C you have a lot of tools too, like GSL (GNU Scientific Library) which is really something awesome, and Numerical Recipes and stuff... Thanks. -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
On January 23, 2008 12:00:51 pm Dave Howorth wrote:
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash.
I'm in computational biophysics (Monte Carlo simulations), so yes, I use that a lot. I use C for coding, so I actually have to write it in a way to get the output already in a specific format I need, but I also have some files which are already generated (and it takes a while to generate them), so I was looking for ways to get them sorted out... Thanks for the advice, I'll definitely be looking into that! Cheers, -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
On Wednesday 23 January 2008 09:25, Sergey Mkrtchyan wrote:
On January 23, 2008 12:00:51 pm Dave Howorth wrote:
If you're a biophysicist who expects to make much use of computers, I'd concentrate on learning Perl or Python rather than [bash] shell. IMHO, you'll find them more useful when you interact with other tools or do more complicated tasks and either can do much the same as bash.
I'm in computational biophysics (Monte Carlo simulations), so yes, I use that a lot. I use C for coding, so I actually have to write it in a way to get the output already in a specific format I need, but I also have some files which are already generated (and it takes a while to generate them), so I was looking for ways to get them sorted out...
Make a mockery of my mockery, will you! How do you come to be doing this work in the astronomy department??
Thanks for the advice, I'll definitely be looking into that!
Dave is right, of course (though I'm not sure I think the choice was all that appropriate), genomics folks use BLAST. But the field is moving fast, so you need to keep on top of the developments. E.g., BLAT does what BLAST does, but claims to be 50 times faster (not too surprising, given the simple-minded representation it uses and the fact that it's written in Perl).
Cheers, -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
Randall Schulz
Randall R Schulz wrote:
On Wednesday 23 January 2008 09:25, Sergey Mkrtchyan wrote:
I'm in computational biophysics
                        ^^^^^^^
How do you come to be doing this work in the astronomy department??
-- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy,
                                                 ^^^^^^^
!!
Faculty of Science, University of Waterloo
On January 23, 2008 12:50:38 pm Randall R Schulz wrote:
How do you come to be doing this work in the astronomy department??
hmm... I don't even know! :-) Biophysics is pretty good in here too, both theoretical and experimental.
Dave is right, of course (though I'm not sure I think the choice was all that appropriate), genomics folks use BLAST. But the field is moving fast, so you need to keep on top of the developments. E.g., BLAT does what BLAST does, but claims to be 50 times faster (not too surprising, given the simple-minded representation it uses and the fact that it's written in Perl).
I'm in theory of computational physics in soft condensed matter. The thing I'm working on now is MC simulations of a polymer chain adsorbed on a structured surface (ultimately we want to look at the adsorption on flexible membranes of two polymer chains). It's not much "bio" there, we just take a biological object and look at it from the "physical point of view". So for example a polymer is just a self-avoiding random walk (depends on the model, tho). BLAST is mostly for genomics stuff right? -- Sergey Mkrtchyan, PhD Student @ Department of Physics & Astronomy, Faculty of Science, University of Waterloo
On Wednesday 23 January 2008 10:12, Sergey Mkrtchyan wrote:
On January 23, 2008 12:50:38 pm Randall R Schulz wrote:
How do you come to be doing this work in the astronomy department??
hmm... I don't even know! :-)
"How did I get here?" -- David Byrne
...
I'm in theory of computational physics in soft condensed matter. The thing I'm working on now is MC simulations of a polymer chain adsorbed on a structured surface (ultimately we want to look at the adsorption on flexible membranes of two polymer chains). It's not much "bio" there, we just take a biological object and look at it from the "physical point of view". So for example a polymer is just a self-avoiding random walk (depends on the model, tho).
Very cool! There's no part of science I don't find fascinating. Not all equally, but still, fascinating.
BLAST is mostly for genomics stuff right?
Yup. You're spared.
-- Sergey Mkrtchyan
RRS
participants (8)
- Aaron Kulkis
- Dave Howorth
- Dave Howorth
- Randall R Schulz
- Roger Oberholtzer
- Scott Simpson
- Sergey Mkrtchyan
- Theo v. Werkhoven