On 11/11/2010 11:43 PM, David C. Rankin wrote:
Dave, all,
A quick bash quandary. I have a simple while loop that reads (stat -c output) from a file, but somehow I end up with an array index of 11 when it should be 10. Here is the code snippet and data file:
    declare -a DARRAY TARRAY
    DATAFILE=test.dat
    let COUNT=0

    [[ -r $DATAFILE ]] && {
        while IFS=$' \t\n' read DARRAY[$COUNT] TARRAY[$COUNT]; do
            echo -e "DARRAY[$COUNT]=${DARRAY[$COUNT]}\nTARRAY[$COUNT]=${TARRAY[$COUNT]}"
            ((COUNT+=1))
        done < $DATAFILE
    }

    ## test the index
    echo -e "\n count: $COUNT tarray num: ${#TARRAY[@]}\n"

    ## unset the empty one??
    for ((i=0;i<${#TARRAY[@]};i++)); do
        if [[ ${TARRAY[$i]} == "" ]]; then
            unset TARRAY[$i]
        fi
    done

    ## test again
    echo -e "\n count: $COUNT tarray num: ${#TARRAY[@]}\n"
And the data file (first with nl, then plain for easy copy):

    22:28 nirvana:/home/backup/rpms/data> nl 11.0/test.dat
         1  factory_11.0-i586 1278476230.000000000
         2  factory_11.0-noarch 1277970584.000000000
         3  factory_11.0-src 1266840936.000000000
         4  factory_11.0-x86_64 1278799222.000000000
         5  openSUSE_11.0-i386 1247502195.000000000
         6  openSUSE_11.0-i586 1286988092.000000000
         7  openSUSE_11.0-i686 1272482130.000000000
         8  openSUSE_11.0-noarch 1288994215.000000000
         9  openSUSE_11.0-src 1276858154.000000000
        10  openSUSE_11.0-x86_64 1288995398.000000000

    22:34 nirvana:/home/backup/rpms/data> cat 11.0/test.dat
    factory_11.0-i586 1278476230.000000000
    factory_11.0-noarch 1277970584.000000000
    factory_11.0-src 1266840936.000000000
    factory_11.0-x86_64 1278799222.000000000
    openSUSE_11.0-i386 1247502195.000000000
    openSUSE_11.0-i586 1286988092.000000000
    openSUSE_11.0-i686 1272482130.000000000
    openSUSE_11.0-noarch 1288994215.000000000
    openSUSE_11.0-src 1276858154.000000000
    openSUSE_11.0-x86_64 1288995398.000000000
How does ${#TARRAY[@]} = 11 after the read??
Put "set -x" at the top of the script and watch every step it makes; you should see how it happens after you pore over the output enough.

Also, just for future reference, in general you should do:

    DONE=false
    until $DONE ; do
        read .. || DONE=true
        #process line
    done

Because, in the event that the last line has no trailing linefeed, it will cause read to return false, even though it DID read in the line of text. And if the read is the direct test condition for the while, then the while immediately ends the loop, and that last line of text does not get processed.

This way, when read returns false, all that does is set the done flag, and the loop continues and finishes the current iteration. The last line gets processed and THEN the loop sees the done flag and exits.

If you happen to know that the data will always be consistent, such as here where you can count on the fact that stat will always output the same kind of data, it's not a problem. But in general, reading text in other situations, you might beat your head on that the same way you are here with the opposite problem, a mysterious extra value.

As for your extra value, I have to do the same thing in a similar routine (delete a blank record after the end of the loop):

This is part of a generic config/data file reader that loads up simple delimited files, similar to the way /etc/passwd is, into a set of numbered arrays to make a sort of fake two-dimensional array out of 20 arrays, each with an unlimited number of elements (however many lines are in the config/data file), like:

    array1[1] array2[1] array3[1]...array20[1]
    array1[2] array2[2] array3[2]...array20[2]
    array1[3] array2[3] array3[3]...array20[3]
    ...
    array1[?] array2[?] array3[?]...array20[?]
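Coming back to the trailing-linefeed point for a moment: here is a minimal sketch of the difference between the two loop forms. It assumes a throwaway two-line file whose last line has no linefeed. The file contents and the names tmp, plain_count, flag_count and done_flag are made up for the demonstration and are not taken from either script in this thread.

```shell
#!/bin/bash
# Hypothetical demo file: two lines, the second one WITHOUT a trailing linefeed.
tmp=$(mktemp)
printf 'first 1\nsecond 2' > "$tmp"

# Form 1: read is the loop condition. The read of "second 2" fills the
# variables but returns false, so the body never runs for that line.
plain_count=0
while read -r name value; do
    plain_count=$((plain_count + 1))
done < "$tmp"

# Form 2: the done-flag form. A failing read only sets the flag, and the
# body still runs once more for whatever read managed to pick up.
flag_count=0
done_flag=false
until $done_flag; do
    read -r name value || done_flag=true
    [ -n "$name" ] || continue   # nothing was read (plain EOF): skip the empty record
    flag_count=$((flag_count + 1))
done < "$tmp"

rm -f "$tmp"
echo "while read: $plain_count line(s), done flag: $flag_count line(s)"
```

The empty-name check in the second loop is there because, on a file that does end with a linefeed, the final failing read hands back an empty record that should not be counted.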
---------------
# field delimiter: default ":"
ACFG_DELIM=${2:-:}
typeset -i ACFG_n=1
unset CF{1,2,3,4}
ACFG_DONE=false
until $ACFG_DONE ; do
    # a failing read (EOF) only sets the flag; the current record is still examined
    IFS=$ACFG_DELIM read CF{1,2,3,4}[$ACFG_n] junk || ACFG_DONE=true
    # skip comments and blanks WITHOUT advancing the counter
    case "${CF1[ACFG_n]}" in
        '#'*|"") continue ;;
    esac
    ((ACFG_n++))
done < $ACFG_FILE
# drop the "pending" record left at the current index
unset CF{1,2,3,4}[$ACFG_n]
unset ACFG_FILE ACFG_DELIM ACFG_DONE junk
[[ $((--ACFG_n)) -gt 0 ]]
--------------

So, after the parent script sources this, the arrays are in the parent's environment like so:

    CF1[1]  = Field 1, line 1
    CF4[22] = Field 4, line 22
    CF2[37] = Field 2, line 37
    etc.

Field 1 is usually the key/index, like a customer's name or account number. The line number (the array element number) only increments when a line is read and kept. If there were 30 lines in the file, but only 4 lines randomly scattered in the file were not comments or blanks, then the final array would have exactly 4 lines in it, and they would be consecutive, counting from 1 to 4.

I've tried to think of any other way to do this myself, but everything I try ends up having some problem, and this is the best I can come up with too: increment the counter AFTER the read, and thus always have to delete the last "pending" counter and record after the end of the loop.

Hahaha, something about that strikes me as "You're missing something obvious and you're brute-forcing it by sawing off the last bit instead of just not growing it in the first place..." I think there is no real error though. You either gotta have something extra at the beginning or at the end, simple as that.

If the last line of the data file doesn't have a trailing newline, I don't need the final unset CF*, BUT it doesn't hurt either. If the last line does have a newline (as most files usually will), then I do need the final unset, or else I end up with an extra record (an extra set of array elements), all set to blanks, same as you're seeing. So by having it, the routine handles all manner of data reliably.
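The effect of that final unset can be sketched in isolation. This is only an illustration under stated assumptions, not the production routine: load_pairs, F1, F2, pending and kept are hypothetical names, the two-field "key:value" data is invented, and only blank lines are skipped (no comment handling).

```shell
#!/bin/bash
# load_pairs FILE: hypothetical helper, not the production reader.
# Fills the global arrays F1[] and F2[] from "key:value" lines, counting from 1.
load_pairs() {
    typeset -i n=1              # typeset inside a function keeps n local
    local finished=false
    unset F1 F2
    until $finished; do
        # At EOF read returns false but still assigns empty strings to
        # F1[n] and F2[n]; that is where the blank record comes from.
        IFS=: read -r "F1[$n]" "F2[$n]" || finished=true
        [[ -n ${F1[n]} ]] || continue
        ((n++))
    done < "$1"
    pending=${#F1[@]}           # element count before the cleanup
    unset "F1[$n]" "F2[$n]"     # drop the pending slot, if there is one
    kept=${#F1[@]}              # element count after the cleanup
}

tmp=$(mktemp)

printf 'alpha:1\nbeta:2\n' > "$tmp"     # last line ends with a newline
load_pairs "$tmp"
with_nl="$pending -> $kept"

printf 'alpha:1\nbeta:2' > "$tmp"       # last line has no newline
load_pairs "$tmp"
without_nl="$pending -> $kept"

rm -f "$tmp"
echo "with trailing newline:    $with_nl"
echo "without trailing newline: $without_nl"
```

With the trailing newline the count should drop from 3 to 2 after the unset; without it the count is already 2 and the unset changes nothing, so the same code path covers both kinds of file.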
Also, that last line is a slightly unobvious trick. I do need to decrement the counter if I want it to reflect the line count after the script returns to the parent, but the decrement is also inside that -gt 0 test, and that test is the last line of the script, with no action or anything after it, no "then do ...". What that does is make the return value of the script the return value of that [[ ]] test, which reflects whether we read any valid records or not. That's handy here & there.

(btw this is used very heavily in a lot of scripts on a lot of production boxes, so there are definitely no off-by-ones ;)

--
bkw