----- Original Message -----
From: "David C. Rankin"
Listmates,
I stumbled on a problem trying to read a file (email mailbox) line-by-line in bash. Using the built-in read, it strips the leading whitespace from each line making the subsequent write impossible. I was using a while loop as follows:
{
while read XTAG VALUE LINE; do
    if [[ ${XTAG} == "X-Mozilla-Status:" ]]; then
        case ${VALUE} in
            1019 ) NEWVAL=1011;;
            1009 ) NEWVAL=1001;;
            001b ) NEWVAL=0013;;
            0019 ) NEWVAL=0011;;
            000b ) NEWVAL=0003;;
            0009 ) NEWVAL=0001;;
            * ) NEWVAL=${VALUE};;
        esac
        echo -e "${XTAG} ${NEWVAL} ${LINE}" >> ${NEWFILE}
    else
        echo -e "${XTAG} ${VALUE} ${LINE}" >> ${NEWFILE}
    fi
    XTAG=''; VALUE=''; LINE=''; NEWVAL=''
done
} < ~/.thunderbird/2k12pnl0.default/Mail/pop.suddenlinkmail.com/openSuSE.sav
All of the lines in the mailbox with leading whitespace were written without the leading whitespace like:
original file:
Received: from edge03.suddenlink.net ([195.135.221.135])
        by imta03.suddenlink.net
newfile:
Received: from edge03.suddenlink.net ([195.135.221.135])
by imta03.suddenlink.net
Is there a bash trick that will preserve the whitespace?
Try this:

---
# $HOME instead of a quoted ~, which bash does not expand inside quotes
INFILE="$HOME/.thunderbird/2k12pnl0.default/Mail/pop.suddenlinkmail.com/openSuSE.sav"
OUTFILE="some_file"

DONE=false
until $DONE ; do
    IFS="" read || DONE=true
    # not a status line: print it verbatim and move on to the next line
    [[ "${REPLY%%:*}" == "X-Mozilla-Status" ]] || { echo "$REPLY" ; continue ; }
    VAL=${REPLY#*:}
    VAL=${VAL// /}
    case "$VAL" in
        1019) VAL=1011 ;;
        1009) VAL=1001 ;;
        001b) VAL=0013 ;;
        0019) VAL=0011 ;;
        000b) VAL=0003 ;;
        0009) VAL=0001 ;;
    esac
    echo "${REPLY%%:*}: $VAL"
done <"$INFILE" >"$OUTFILE"
---

Explanation:

    IFS=""
Eliminate any word separator so read sees the whole line as one big word, including the leading, trailing, and all other spaces. Lines still break on linefeed, and the IFS change only affects the read command, nothing else in the script.

    read (with no variable)
Just my minimalist nature. We happen to need only one variable, and read supplies a variable, REPLY, if no other is specified.

    ${REPLY%%:*}
Expands to the part of $REPLY from the beginning up to the first ":", non-inclusive.

For sanity's sake, you should always try to compare things in the same context. So either quote, or don't quote, the values on both sides of the test comparator. If you have to quote one side for any reason, then quote the other side too. Most times when either side is a variable you should quote, to account for the possibility of the variable being empty.

If that doesn't match, then echo the entire line verbatim and skip the rest of the loop. I re-arranged the loop that way because 99% of lines will not match, so this way 99% of the time we do almost no work. Also, this way we are almost impervious to the content of the line. We don't care what's in it and don't have to parse all the possible types of lines and reconstruct them; we just spit the whole line back out without even looking at it.

The rest only ever happens on those rare lines when we didn't skip out above.

    VAL=${REPLY#*:}
VAL = everything after the first colon, to the end of the line.

    VAL=${VAL// /}
VAL with all spaces stripped out.
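The pieces explained so far can be tried in isolation. What follows is a minimal sketch, not part of the script above: the sample strings and the names trimmed, kept, header, tag and val are made up for illustration, and -r is added to read so that backslashes are left alone.

```bash
#!/bin/bash
# Sample strings below are hypothetical; only the expansions come from the script.

# 1. Default IFS with a named variable: leading whitespace is trimmed.
read -r trimmed <<< "    by imta03.suddenlink.net"

# 2. Empty IFS for this one command: the line lands in REPLY untouched.
IFS="" read -r <<< "    by imta03.suddenlink.net"
kept=$REPLY

# 3. Taking a header line apart with parameter expansion.
header="X-Mozilla-Status: 0019"
tag=${header%%:*}   # drop the longest suffix matching ":*"  -> text before the first colon
val=${header#*:}    # drop the shortest prefix matching "*:" -> text after the first colon
val=${val// /}      # delete every space

printf '[%s]\n' "$trimmed" "$kept" "$tag" "$val"
```

The brackets in the output make the whitespace visible: the first line prints without its four leading spaces, the second keeps them, and the last two print the tag and the bare value.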
I'm assuming that stripping the spaces like this is reasonable on THESE particular lines, i.e. that these lines have the format "tag: value" with no other junk after the colon, so that taking everything after the colon and then stripping all spaces anywhere leaves you with a clean "####" for case-matching. If there is any other stuff, well, no problem, since these are mail headers and they have an intentionally regular, defined, parseable structure. So if we need to do another split on a comma or a semicolon or something, it's just another VAL=${VAL%%,*} or some such.

There's really no need for VAL and NEWVAL; just start with VAL, and sometimes overwrite it.

    echo "${REPLY%%:*}: $VAL"
Whether we changed VAL or not, either way we simply (re)create the line out of the parts the same way every time. We split on the ":" so we have to put one back in ourselves.

Finally, if the input file happens to end at the end of the last line (i.e. no final linefeed), then with a plain "while read" loop that last line will not be processed and will NOT appear in the output. On hitting end of file, read returns a non-zero exit status, so the while loop exits and the body does not get a chance to run that one last time. To allow for that possibility you need to put the read inside a loop that exits on a variable instead of directly on the exit status of read itself. Then, in the loop, merely remember the exit status of read but proceed with that iteration of the loop.

So, this:

    while IFS="" read ; do
        [...]
    done <infile >outfile

Becomes this:

    DONE=false
    until $DONE ; do
        IFS="" read || DONE=true
        [...]
    done <infile >outfile

Brian K. White    brian@aljex.com    http://www.myspace.com/KEYofR
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro  BBx    Linux  SCO  FreeBSD    #callahans  Satriani  Filk!
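
The end-of-file behaviour described in the message can be checked with a small test file. This is a sketch under stated assumptions: the temporary file, its two lines, and the counter names are hypothetical, -r is added to read, and the third loop is a commonly used alternative that does not come from the message.

```bash
#!/bin/bash
# Hypothetical input: two lines, the second one WITHOUT a trailing linefeed.
tmp=$(mktemp)
printf 'first\n  second' > "$tmp"

# Plain "while read": read fails on the unterminated line, so the body never sees it.
plain=0
while IFS="" read -r ; do
    plain=$((plain + 1))
done < "$tmp"

# The until/DONE form: remember that read failed, but still run the body once more.
# Caveat: when the file DOES end in a linefeed, that extra pass sees an empty REPLY.
flagged=0
DONE=false
until $DONE ; do
    IFS="" read -r || DONE=true
    flagged=$((flagged + 1))
done < "$tmp"

# Alternative (not from the message): keep going while read failed but still
# delivered text, which avoids the extra empty pass.
alt=0
last=
while IFS="" read -r line || [[ -n $line ]] ; do
    alt=$((alt + 1))
    last=$line
done < "$tmp"

rm -f "$tmp"
echo "plain=$plain flagged=$flagged alt=$alt last=[$last]"
```

With this input the plain loop counts one line, while the other two count both and still carry the leading spaces of the last line.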