[opensuse] BASH - better way to get total number of files matching glob??
Guys,
Doing a quick backup of config files (like .bashrc) from within bashrc and I need to find a better way to find the total number of files matching a glob in a directory. Currently I'm using:
total=$(ls $budir/bashrc-* | wc -l)
to determine how many copies I have in my bashrc.tar.bz2 backup + the new one I just copied to the backup location so I can determine the number to delete to just save the last 3 when I tar it up again.
The ls | wc combination works fine, but I don't like the fact that there is a pipe in the middle of it. If there is a cleaner way to do it, let me know.
P.S. Yes, I've added an alias too many times by doing "alias xzy='whatever' > ~/.bashrc" -- thus the quick backup from within bashrc :p Of course if I would just remember >> I'd be just fine....
-- David C. Rankin, J.D.,P.E. Rankin Law Firm, PLLC 510 Ochiltree Street Nacogdoches, Texas 75961 Telephone: (936) 715-9333 Facsimile: (936) 715-9339 www.rankinlawfirm.com -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org For additional commands, e-mail: opensuse+help@opensuse.org
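The "keep only the last 3" step described above could be sketched with a bash array of glob matches, the technique that comes up later in this thread. This is only an illustration under stated assumptions: the backup names are assumed to embed a date stamp so that name order equals age order, the demo directory and file names are made up, and `keep` is a hypothetical variable.

```bash
#!/usr/bin/env bash
# Hypothetical demo data: five backups whose names sort oldest-first.
budir=$(mktemp -d)
for stamp in 20100521 20100522 20100523 20100524 20100525; do
    touch "$budir/bashrc-$stamp"
done
keep=3                             # hypothetical: number of backups to retain

shopt -s nullglob                  # no match -> empty array, not the literal pattern
backups=( "$budir"/bashrc-* )      # glob results come back sorted by name
total=${#backups[@]}

excess=$(( total - keep ))
if (( excess > 0 )); then
    # The first $excess entries are the oldest, given the naming assumption.
    rm -- "${backups[@]:0:excess}"
fi

remaining=( "$budir"/bashrc-* )
shopt -u nullglob
left=${#remaining[@]}
newest=${remaining[left-1]##*/}    # basename of the last (newest) entry
echo "had $total, kept $left, newest is $newest"

rm -rf "$budir"
```

If the names do not sort chronologically, the slice would delete the wrong files, so the naming assumption matters more than the counting itself.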
David C. Rankin wrote:
Guys,
Doing a quick backup of config files (like .bashrc) from within bashrc and I need to find a better way to find the total number of files matching a glob in a directory. Currently I'm using:
total=$(ls $budir/bashrc-* | wc -l)
to determine how many copies I have in my bashrc.tar.bz2 backup + the new one I just copied to the backup location so I can determine the number to delete to just save the last 3 when I tar it up again.
The ls | wc combination works fine, but I don't like the fact that there is a pipe in the middle of it. If there is a cleaner way to do it, let me know.
I can't get rid of the pipe for you, but personally I'd do:
total=$(find $budir -iname bashrc-* -type f | wc -l)
-- Per Jessen, Zürich (16.4°C)
On Wed, 26 May, 2010 at 08:07:21 +0200, Per Jessen wrote:
David C. Rankin wrote:
Guys,
Doing a quick backup of config files (like .bashrc) from within bashrc and I need to find a better way to find the total number of files matching a glob in a directory. Currently I'm using:
total=$(ls $budir/bashrc-* | wc -l)
to determine how many copies I have in my bashrc.tar.bz2 backup + the new one I just copied to the backup location so I can determine the number to delete to just save the last 3 when I tar it up again.
The ls | wc combination works fine, but I don't like the fact that there is a pipe in the middle of it. If there is a cleaner way to do it, let me know.
I can't get rid of the pipe for you, but personally I'd do:
total=$(find $budir -iname bashrc-* -type f | wc -l)
As long as the glob matches (exactly) it can be done within bash, using an array. Something on the order of:
array=(path/to/glob*)
total=${#array[@]}
Haven't tested it, but I'm pretty sure files with whitespace will break this, but you get the idea.
hth
/jon
-- YMMV
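A minimal sketch of the array approach, for illustration only. The temporary directory and file names are invented for the demo. `nullglob` is an addition that the thread does not mention: without it, a pattern that matches nothing stays in the array as a literal string and the count comes out as 1, which is the "as long as the glob matches" caveat.

```bash
#!/usr/bin/env bash
# Demo directory and file names are hypothetical stand-ins for $budir.
budir=$(mktemp -d)
touch "$budir/bashrc-1" "$budir/bashrc-2" "$budir/bashrc-3 with spaces"

shopt -s nullglob               # unmatched globs expand to nothing
backups=( "$budir"/bashrc-* )   # one array element per matching path
others=( "$budir"/nomatch-* )   # no match -> empty array
shopt -u nullglob

total=${#backups[@]}            # number of elements, no pipe, no subprocess
none=${#others[@]}
echo "matches: $total, non-matches: $none"

rm -rf "$budir"
```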
On 05/26/2010 02:05 AM, Jon Clausen wrote:
Haven't tested it, but I'm pretty sure files with whitespace will break this, but you get the idea
That's what IFS=$'\n' is for :p
On Thu, 27 May, 2010 at 03:11:18 -0500, David C. Rankin wrote:
On 05/26/2010 02:05 AM, Jon Clausen wrote:
Haven't tested it, but I'm pretty sure files with whitespace will break this, but you get the idea
That's what IFS=$'\n' is for :p
Indeed... however:
Now I *have* tested, and somewhat surprisingly (to me at least) filenames with whitespace are not a problem:
jon@nx8220:~/tmp/test> ll
total 0
-rw-r--r-- 1 jon users 0 2010-05-27 11:39 file no1
-rw-r--r-- 1 jon users 0 2010-05-27 11:39 file no2
drwxr-xr-x 3 jon users 17 2010-04-21 09:32 Folder1
jon@nx8220:~/tmp/test> array=(*)
jon@nx8220:~/tmp/test> echo ${array[@]}
file no1 file no2 Folder1
jon@nx8220:~/tmp/test> echo ${#array[@]}
3
jon@nx8220:~/tmp/test> echo ${array[0]}
file no1
jon@nx8220:~/tmp/test> echo ${#array[0]}
8
so there :)
/jon
On 05/27/2010 04:47 AM, Jon Clausen wrote:
Indeed... however:
Now I *have* tested, and somewhat surprisingly (to me at least) filenames with whitespace are not a problem:
<snip>
so there :)
/jon
Hmm...
Old dog, new trick! That's pretty cool:
09:55 alchemy:~/tmp> ( myarray=(*); for ((i=0;i<${#myarray[@]};i++)); do printf "myarray[%2d]: %s\n" ${i} "${myarray[${i}]}"; done )
myarray[ 0]: 2010 Narrative Update.txt
myarray[ 1]: 80
myarray[ 2]: 90dcf370cbde.tif
myarray[ 3]: Abstract-DarkGlow.emerald.zip
<snip>
What I am uncomfortable with is the spaces issue, despite having proven it to myself, but as long as globbing holds, there should be no reason this wouldn't be as safe as 'for i in *; do ...xyz...; done'. In fact, it is probably preferred over my usual: SARRAY=( $(ls $SEARCHDIR) ), though I can't lay my hand on a rule telling me which is preferred.
With my usual method, setting IFS to newline is definitely required. What I like about (*) is that globbing handles all the spaces for you when loading the array. However, you would still need IFS set to newline (or religious quoting) for any subsequent processing of names with spaces later on in the script.
New trick added to toolbox :p
(As far as the quoting/IFS issue goes, try the above cli without quotes around "${myarray[${i}]}".)
On Thu, 27 May, 2010 at 10:09:02 -0500, David C. Rankin wrote:
On 05/27/2010 04:47 AM, Jon Clausen wrote:
Indeed... however:
Now I *have* tested, and somewhat surprisingly (to me at least) filenames with whitespace are not a problem:
Hmm...
Old dog, new trick! That's pretty cool:
09:55 alchemy:~/tmp> ( myarray=(*); for ((i=0;i<${#myarray[@]};i++)); do printf "myarray[%2d]: %s\n" ${i} "${myarray[${i}]}"; done ) myarray[ 0]: 2010 Narrative Update.txt myarray[ 1]: 80 myarray[ 2]: 90dcf370cbde.tif myarray[ 3]: Abstract-DarkGlow.emerald.zip
whoa... you didn't say anything about *using* the information afterwards... ;)
What I am uncomfortable with is the spaces issue, despite having proven it to myself, but as long as globbing holds, then there should be no reason this wouldn't be as safe as 'for i in *; do ...xyz...; done. In fact, it is probably preferred over my usual: SARRAY=( $(ls $SEARCHDIR) ), though I can't lay my hand on a rule telling me which is preferred.
I think the argument would be that using bash's globbing to create the list avoids spawning a subprocess to execute 'ls *'. So I guess there might be a teeny little performance gain? But OTOH using globbing like this makes no distinction between various 'object' types. So you get everything that the glob matches in your list: files, directories, links, fifos, the lot. So if one has to do 'file type' operations on the objects later, one would have to amend the code with some [ -f ${array[$i]} ] -ish tests, which might end up eating away whatever performance gain one got to begin with. Actually, come to think of it, the above could be the argument for using Per's suggestion to use 'find', rather than to rely on globbing?
With my usual method, setting IFS to newline is definitely required. What i like about (*) is that globbing handles all the spaces for you when loading the array. However, you would still need IFS set to newline (or religious quoting) for any subsequent processing of names with spaces later on in the script.
Yeah I think that about sums it up. And thinking a little about it, it makes sense that you need either IFS manipulation or quoting to avoid trouble: It's how variable expansion works in bash, regardless of how the content was assigned to the variable in the first place... and I guess array element expansion more or less *should* behave like any other 'variable'...
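The point about expansion can be checked with a small sketch. `count_args` is a hypothetical helper that only reports how many words it received, and the array contents are taken from the file names in the earlier test listing.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments.
count_args() { echo "$#"; }

names=( "file no1" "file no2" "Folder1" )

# Unquoted: each element is split again on spaces, giving 5 words.
unquoted=$(count_args ${names[@]})

# Quoted: exactly one word per array element, giving 3.
quoted=$(count_args "${names[@]}")

# Restricting IFS to newline also prevents the split on spaces.
# The IFS change stays inside the $( ) subshell.
with_ifs=$(IFS=$'\n'; count_args ${names[@]})

echo "unquoted=$unquoted quoted=$quoted with_ifs=$with_ifs"
```

The IFS variant still breaks on names that contain a newline (and unquoted expansions are also subject to globbing), which is one reason quoting is usually the simpler habit.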
New trick added to toolbox :p
To mine too :) I didn't know about ${#array[@]} before, but your question kicked a couple of synapses into action: I'm in the process of porting a shell script to perl, and over there I regularly use $#array... hence the association.
/jon
On Thu, 27 May 2010 19:41:37 +0200, Jon Clausen wrote:
So if one has to do 'file type' operations on the objects later, one would have to amend the code with some [ -f ${array[$i]} ] -ish tests, which might end up eating away whatever performance gain one got to begin with.
Why? Bash has test built in AFAIK so you'd still have the advantage of no subprocess.
Philipp
On Fri, 28 May, 2010 at 00:50:56 +0200, Philipp Thomas wrote:
On Thu, 27 May 2010 19:41:37 +0200, Jon Clausen wrote:
So if one has to do 'file type' operations on the objects later, one would have to amend the code with some [ -f ${array[$i]} ] -ish tests, which might end up eating away whatever performance gain one got to begin with.
Why? Bash has test built in AFAIK so you'd still have the advantage of no subprocess.
CMIIW: unless you explicitly call /usr/bin/test then the shell *will* use the builtin 'test', will it not?
Looking only at:
array=( $path/$glob )
vs.
array=( $(ls $path/$glob) )
- where the resulting arrays should be more or less identical - then the former would 'win' over the latter, in that there are no subprocesses involved.
The (potential) trouble with this is that such lists might need to be 'sanitized', in order to avert trouble. What I meant was that overall one *might* be better off using something like 'find' to generate the list in such a way that it only contains the desired type of objects to begin with.
So it becomes:
1: fast, lightweight generation of a 'dirty' list that must be cleaned
vs.
2: slower and 'heavier' generation of a 'clean' list
I guess it depends on the situation, but at least in some cases I think that taking the 'filtering' penalty up front might be worth it in the end...
/jon
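The two options can be put side by side in a short sketch. All names are hypothetical demo data, and `-maxdepth` is assumed to be available (GNU and BSD find provide it, but it is not in older POSIX find).

```bash
#!/usr/bin/env bash
# Hypothetical demo data: two regular files and one directory match the glob.
dir=$(mktemp -d)
touch "$dir/bashrc-a" "$dir/bashrc-b"
mkdir "$dir/bashrc-dir"

# Option 1: cheap 'dirty' list from the glob, cleaned with the builtin test.
shopt -s nullglob
all=( "$dir"/bashrc-* )
shopt -u nullglob
files=()
for p in "${all[@]}"; do
    if [[ -f $p ]]; then           # builtin, so no subprocess per file
        files+=( "$p" )
    fi
done
glob_total=${#files[@]}

# Option 2: 'clean' list straight from find (external process plus a pipe).
find_total=$(find "$dir" -maxdepth 1 -type f -name 'bashrc-*' | wc -l)

echo "glob+filter: $glob_total  find: $find_total"
rm -rf "$dir"
```

One difference to keep in mind: `[[ -f ]]` follows symbolic links, while `find -type f` does not by default, so the two counts can diverge when links are present.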
Hello, On Thu, 27 May 2010, David C. Rankin wrote:
On 05/27/2010 04:47 AM, Jon Clausen wrote:
Indeed... however:
Now I *have* tested, and somewhat surprisingly (to me at least) filenames with whitespace are not a problem: [..] Old dog, new trick! That's pretty cool:
09:55 alchemy:~/tmp> ( myarray=(*); for ((i=0;i<${#myarray[@]};i++)); do printf "myarray[%2d]: %s\n" ${i} "${myarray[${i}]}"; done )
myarray[ 0]: 2010 Narrative Update.txt
[..]
What I am uncomfortable with is the spaces issue, despite having proven it to myself, but as long as globbing holds, then there should be no reason this wouldn't be as safe as 'for i in *; do ...xyz...; done. In fact, it is probably preferred over my usual: SARRAY=( $(ls $SEARCHDIR) ),
^^^^^^^^^^^ *SLAP* Just for the unquoted "$SEARCHDIR"!
though I can't lay my hand on a rule telling me which is preferred.
The former is a 'no-brainer' to be preferred. Using 'ls' in such context is just plain stupid (or evil). Equivalent to (being forced to be) playing russian roulette with statistically about 5.952 bullets (in 6 chambers) or something. Oh, and BTW: bash-internal globbing as above or with 'for x in GLOB; do' will never result in a "too long" command line error. You might run out of memory putting the result into the array-variable, though.
With my usual method, setting IFS to newline is definitely required. What i like about (*) is that globbing handles all the spaces for you when loading the array. However, you would still need IFS set to newline (or religious quoting) for any subsequent processing of names with spaces later on in the script.
ALWAYS QUOTE VARIABLES! is also a no-brainer! I can't stress that enough. With proper quoting, even newlines and what-not are a non-problem!
$ ls -b
a a\ "b"\ c"\ d a\ '"b'\ c"\ d\ and\ e a\ b a\ b\nc d
$ A=(*)
$ echo "${#A[@]}"
6
$ for f in "${A[@]}"; do echo "-A»$f«"; done-b
-A»a«-b
-A»a "b" c" d«-b
-A»a '"b' c" d and e«-b
-A»a b«-b
-A»a b-b
c-A«-b
-A»d«-b
Note the embedded newline and quotes in the output. Can't get much worse than that (filenames containing '*' or '!' or stuff like that are no problem either). It really _IS_ very simple:
*ALWAYS* QUOTE YOUR VARIABLES! PERIOD!
That's not "religious", that's a no-brainer standard behaviour[1]. That's "defensive" scripting. ;)
Set IFS only if you want to deliberately split at non-standard-IFS places. Or if reading non-standard-IFS separated stuff with 'read' or when using the 'set' builtin. That's basically it, IIRC.
Oh, and BTW, using "${VAR}" instead of just "$VAR" is also very much encouraged, even if more often superfluous (than using quotes). Can't hurt though, so use "${var}" instead of "$var" routinely.
Happy Slapsgiving ;)
HTH,
-dnh
[1] yes, there are cases where you can't quote, but those are catering for not being able to use arrays, i.e. writing for 'sh' compatibility.
-- Get your acts together, guys. Stop blathering and frothing at the mouth. -- Linus Torvalds
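As an illustration of the one case named above where setting IFS is appropriate, here is a minimal sketch of reading colon-separated fields with `read`. The sample record is made up and only mimics the layout of an /etc/passwd line.

```bash
#!/usr/bin/env bash
# Made-up sample record in /etc/passwd layout.
record='jon:x:1000:100:Jon with spaces:/home/jon:/bin/bash'

# IFS is set for this single read command only; the rest of the script
# keeps the default IFS. The "_" fields are placeholders for unused columns.
IFS=: read -r user _ uid _ gecos home shell <<< "$record"

echo "user=$user uid=$uid gecos=$gecos home=$home shell=$shell"
```

Because the assignment is a prefix to `read`, there is nothing to save and restore afterwards.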
Per Jessen wrote:
David C. Rankin wrote:
Guys,
Doing a quick backup of config files (like .bashrc) from within bashrc and I need to find a better way to find the total number of files matching a glob in a directory. Currently I'm using:
total=$(ls $budir/bashrc-* | wc -l)
to determine how many copies I have in my bashrc.tar.bz2 backup + the new one I just copied to the backup location so I can determine the number to delete to just save the last 3 when I tar it up again.
The ls | wc combination works fine, but I don't like the fact that there is a pipe in the middle of it. If there is a cleaner way to do it, let me know.
I can't get rid of the pipe for you, but personally I'd do:
total=$(find $budir -iname bashrc-* -type f | wc -l)
That barfs if you're in the directory with the bashrc files. You need to add single quotes around the -iname arg.
Joachim
-- Joachim Schrod, Email: jschrod@acm.org, Roedermark, Germany
Joachim Schrod wrote:
Per Jessen wrote:
David C. Rankin wrote:
Guys,
Doing a quick backup of config files (like .bashrc) from within bashrc and I need to find a better way to find the total number of files matching a glob in a directory. Currently I'm using:
total=$(ls $budir/bashrc-* | wc -l)
to determine how many copies I have in my bashrc.tar.bz2 backup + the new one I just copied to the backup location so I can determine the number to delete to just save the last 3 when I tar it up again.
The ls | wc combination works fine, but I don't like the fact that there is a pipe in the middle of it. If there is a cleaner way to do it, let me know.
I can't get rid of the pipe for you, but personally I'd do:
total=$(find $budir -iname bashrc-* -type f | wc -l)
That barfs if you're in the directory with the bashrc files. You need to add single quotes around the -iname arg.
Or add a backslash, which I always do:
total=$(find $budir -iname bashrc-\* -type f | wc -l)
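The failure Joachim describes can be reproduced with a short sketch. The directory and file names are invented for the demo; the exact error text differs between find implementations, so only the exit status is checked.

```bash
#!/usr/bin/env bash
# Hypothetical demo directory with two files matching the pattern.
budir=$(mktemp -d)
touch "$budir/bashrc-1" "$budir/bashrc-2"
cd "$budir" || exit 1

# Unquoted: the shell expands bashrc-* to "bashrc-1 bashrc-2" before find
# runs, so find receives a stray extra argument and exits with an error.
if find . -iname bashrc-* -type f >/dev/null 2>&1; then
    unquoted_ok=yes
else
    unquoted_ok=no
fi

# Quoted (a backslash before the * works the same way): find gets the
# pattern itself and does the matching.
total=$(find . -iname 'bashrc-*' -type f | wc -l)

echo "unquoted succeeded: $unquoted_ok, quoted count: $total"
cd / && rm -rf "$budir"
```

With exactly one matching file in the current directory the unquoted form would not fail but would silently search for that one name only, which is arguably worse.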
El 26/05/10 01:47, David C. Rankin escribió:
total=$(ls $budir/bashrc-* | wc -l)
total=$(python -c "import glob; print len(glob.glob('<your pattern>'))")
You have higher level tools, use them :)
Hello, On Mon, 07 Jun 2010, Cristian Rodríguez wrote:
El 26/05/10 01:47, David C. Rankin escribió:
total=$(ls $budir/bashrc-* | wc -l)
total=$(python -c "import glob; print len(glob.glob('<your pattern>'))")
You have higher level tools, use them :)
perl -e 'print scalar @{[glob("*.txt")]};'
or
perl -e 'print $#{[glob("*.txt")]} + 1;'
or
perl -e 'print scalar @{[<*.txt>]};' ### [1]
or
perl -e 'print $#{[<*.txt>]}+1;'
There's a reason "perlgolf" exists :)
But actually, it'd probably be reasonable, to write the rest of the
script in python resp. perl as well. ;) As above is "perl" core only,
it's pretty fast. On this (500MHz Athlon) box, perl 5.10.0 is even
faster than python 2.5. The fastest of about a dozen calls each:
python: real 0m0.137s
perl: real 0m0.053s
I'd guess it's the file-lookup + 'import' of 'glob' in python.
-dnh
[1] that actually is parsed as
use File::Glob (); print scalar @{[glob('*.txt')];};
as '
participants (7)
- Cristian Rodríguez
- David C. Rankin
- David Haller
- Joachim Schrod
- Jon Clausen
- Per Jessen
- Philipp Thomas