[opensuse] supercool BASH input routine for files and directories with globbing without quoting input
Listmates,

Here is an off-the-wall one for the BASH guys. Many times I need to get file or directory information from the command line with wildcards in the filename or path. I have always hated having to quote the filename or path to prevent shell expansion before it gets to my script. I was pecking around on a script for rsync that would take the same type of file and path information with wildcards and hit upon a really cool way to handle the input without having to quote the names with wildcards to prevent expansion. (.. at least I thought it was cool)

Anyway, I thought somebody else could get some use out of it. I've replaced the rsync calls with echo statements so you can test it if you have any use for it. Any script that deals with multiple files or filespecs is a candidate for its use. Take a look.

The process is a two-loop process: the outer loop steps through an array of CLI inputs, and the inner part tests whether the array element holds a directory or a filespec with globbing. If it is a filespec, the fileglob is processed with ls to get the individual files. It works for one file or a thousand, specified in as many chunks as you want to put on the command line:

    #!/bin/bash --norc

    OLDIFS=$IFS
    IFS=$'\n'

    ## Fill an Array with all CLI input
    declare -a CLIARRAY
    CLIARRAY=( "$@" )

    ## Step through CLIARRAY with ls to expand wildcards and process
    ## files specified on the command line sequentially. Rely on
    ## rsync to throw error if bad filename
    ## Simple echo is used for this example

    for ((a=0;a<${#CLIARRAY[@]};a++)); do
        ## if the argument is a directory rsync in 1 shot, else rsync each file
        if [[ -d ${CLIARRAY[${a}]} ]]; then
            echo "directory: ${CLIARRAY[${a}]}"
        else
            for b in $(ls ${CLIARRAY[${a}]}); do
                echo "file: $b"
            done
        fi
    done

    IFS=$OLDIFS

    exit 0

If you are interested in using it with rsync as I had it originally, just replace the first echo with:

    rsync -ruv ${CLIARRAY[${a}]} ${SSUSER}@${DESTHOST}:${DESTPATH}

and the second echo with:

    rsync -ruv $b ${SSUSER}@${DESTHOST}:${DESTPATH}

This allowed me to collect a lot of dispersed information for troubleshooting X issues with a minimum of typing. I had used it like:

    2nv /var/log/mes* /var/log/Xorg.* /var/log/kdm.log

(the script name was 2nv, short for rsync to nirvana). It allowed chunks to be thrown into it and it would nicely rsync the wanted files to their destination.

Script in good health :p

--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
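A minimal sketch of the same directory-versus-filespec split, assuming the pattern reaches the script still unexpanded (that is, it was quoted or escaped by the caller). It is not the posted script: the name classify_args is hypothetical, and the shell's own pathname expansion is swapped in for the `ls` call so that nothing has to parse `ls` output.

```bash
#!/bin/bash
# Sketch only. classify_args is a hypothetical name, not part of the
# posted script. The function body runs in a subshell, so the IFS
# change below does not leak into the caller.
classify_args() (
    # Split only on newlines so a name containing spaces stays whole.
    IFS='
'
    for arg in "$@"; do
        if [ -d "$arg" ]; then
            printf 'directory: %s\n' "$arg"
        else
            # Unquoted on purpose: the shell expands any wildcard in $arg.
            # A pattern that matches nothing is passed through unchanged.
            for f in $arg; do
                printf 'file: %s\n' "$f"
            done
        fi
    done
)

# Example call (paths are illustrative):
#   classify_args /var/log '/var/log/Xorg.*'
```

Replacing the two printf calls with the rsync commands from the post would give the same one-call-per-directory, one-call-per-file behaviour.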
David C. Rankin wrote:
I was pecking around on a script for rsync that would take the same type of file and path information with wildcards and hit upon a really cool way to handle the input without having to quote the names with wildcards to prevent expansion.
Personally, I would probably have used a filelist as input. In particular with rsync where it's difficult to use xargs.

/Per

--
Per Jessen, Zürich (5.6°C)
On Saturday 17 October 2009 02:53:58 am Per Jessen wrote:
Personally, I would probably have used a filelist as input. In particular with rsync where it's difficult to use xargs.
/Per
Per,

You got an example of a filelist I can steal? I'm always looking for better ways to do things. In the snippet I posted, I had just defined:

    SSUSER=me
    DESTHOST=mybox.mydomain.com
    DESTPATH=/the/long/path/to/where/the/files/went

Then with the double loop, I cut all my typing down to

    2box /first/path /next/setof/fil* /etc/X11/x*

and let the two loops and rsync calls handle it from there:

    rsync -ruv ${CLIARRAY[${a}]} ${SSUSER}@${DESTHOST}:${DESTPATH}

or

    rsync -ruv $b ${SSUSER}@${DESTHOST}:${DESTPATH}

If a 'filelist' can do it with more flexibility, I'm all for learning (or at least trying to learn....:p)

--
David C. Rankin
David C. Rankin wrote:
Per,
You got an example of a filelist I can steal? I'm always looking for better ways to do things.
Hi David,

Maybe the filelist doesn't really apply here - I have to admit not having studied your script in greater detail, but when I don't want wildcards expanded on the command line, I just escape them (\* etc), so perhaps I don't really recognize the problem you are solving.
Then with the double loop, I cut all my typing down to
2box /first/path /next/setof/fil* /etc/X11/x*
I think I would have done something like this:

    rsync -av /first/path /next/setof/fil\* /etc/X11/x\* <destination>

If it needs to be wrapped in a script:

    rsync -av $@ <destination>

The filelist option is really when you want to make sure you can specify _any_ number of files.

/Per

--
Per Jessen, Zürich (5.1°C)
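A small sketch of one detail in the wrapper idea above: inside a script, an unquoted $@ is word-split (and globbed) a second time, while a quoted "$@" hands each original argument on unchanged. The three function names here are hypothetical and exist only to make the difference visible.

```bash
#!/bin/bash
# Sketch only; all three names are hypothetical.

# Reports how many arguments it received.
count_words() { echo "$#"; }

# Unquoted $@: each argument is split again on whitespace.
forward_unquoted() { count_words $@; }

# Quoted "$@": each argument arrives exactly as the caller passed it.
forward_quoted() { count_words "$@"; }

# Example:
#   forward_unquoted "my file.log" other.log    # prints 3
#   forward_quoted   "my file.log" other.log    # prints 2
```

For a wrapper that should cope with spaces in file names, that suggests writing the call as rsync -av "$@" <destination>.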
On Sunday 18 October 2009 07:01:07 am Per Jessen wrote:
The filelist option is really when you want to make sure you can specify _any_ number of files.
Well that is exactly what this snippet does. That was the beauty of it.

In the past, as you say, you could always escape or quote wildcards to prevent expansion. That wasn't the point. The point was that this two-loop snippet allows you to handle all of the native wildcards and file globbing WITHOUT having to escape or quote. That allowed the flexibility I was looking for: being able to use the native bash command line as you would with ls or any other command. That makes it idiot proof for users or anyone that might make use of it later, because it works like everything else bash already provides -- no special tricks required.

I have tried a number of ways in the past and this just happened as sort of an accident. I thought it was helpful so I posted it. Also, since rsync wildcards require "double wildcards" (i.e. rsync ~/.bash**), this expansion in the second loop also took care of that problem. So it was the killing-2-birds-with-one-stone combination that made it noteworthy to me.

If you're ever stuck with the requirement of taking input containing wildcards without escapes or quotes, that is where this combo will help :)

--
David C. Rankin
David C. Rankin wrote:
On Sunday 18 October 2009 07:01:07 am Per Jessen wrote:
The filelist option is really when you want to make sure you can specify _any_ number of files.
Well that is exactly what this snippet does. That was the beauty of it.
We're probably splitting hairs, but I meant any number of files regardless of the shell input length restriction. When you don't use --files-from or similar, your rsync command will exceed the max input length with e.g. 32000 files or patterns.

/Per

--
Per Jessen, Zürich (5.2°C)
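A minimal sketch of the filelist approach, assuming rsync's --files-from option with "-" so the list is read from standard input. The names send_filelist and RSYNC_CMD are hypothetical; RSYNC_CMD is only there so the function can be tried without touching a real host. Note that with --files-from the listed paths are taken relative to the source directory (here /) and recreated beneath the destination, so the resulting layout differs from the flat copy in the original script.

```bash
#!/bin/bash
# Sketch only. send_filelist and RSYNC_CMD are hypothetical names.
# Assumes no file name contains a newline.
# Usage: send_filelist DEST PATH...
send_filelist() (
    dest=$1
    shift
    # One path per line on stdin. The names never appear on rsync's own
    # command line, so the argument-length limit does not apply to them.
    printf '%s\n' "$@" | ${RSYNC_CMD:-rsync} -ruv --files-from=- / "$dest"
)

# Example call (host and paths are illustrative):
#   send_filelist me@mybox.mydomain.com:/some/dest /var/log/messages /var/log/kdm.log
```

The list could just as well come from a file or from find, which is where this approach pays off for very large sets of names.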
On Wednesday 21 October 2009 01:29:16 am Per Jessen wrote:
We're probably splitting hairs, but I meant any number of files regardless of the shell input length restriction. When you don't use --files-from or similar, your rsync command will exceed the max input length with e.g. 32000 files or patterns.
Well, to be splitting it, we both have to see what we're splitting. The limit completely evaded me as I have never hit it and I have done some REALLY BIG rsync pulls and pushes from server to server. I guess if I tried to do

    rsync -uav /usr box2:/big/backup

I might have had more of a feel for what you are saying.

The only size limit in the snippet would be the array limit and I've routinely used 12-15,000 element arrays in bash. It's probably the same limit somewhere around 32,000. I just hope I never see that one either ;-)

--
David C. Rankin
David C. Rankin wrote:
Well, to be splitting it, we both have to see what we're splitting. The limit completely evaded me as I have never hit it and I have done some REALLY BIG rsync pulls and pushes from server to server. I guess if I tried to do
rsync -uav /usr box2:/big/backup
I might have had more of a feel for what you are saying.
nah, that wouldn't do it, that command line has only one [relevant] token. however,

    rsync -uav /var/spool/cache/squid/*/*/* box2:/squid/backup

That's gonna do it! /var/spool/cache/squid/*/*/* will be expanded out to produce many, many tokens...

Phil
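A small sketch for checking how many tokens a pattern turns into before handing it to an external command. The helper name count_matches is hypothetical. The limit being discussed is the system's cap on the combined size of the arguments and environment passed to an executed command, which can be queried with getconf ARG_MAX; it is measured in bytes, and it applies when an external program such as rsync is started, not to shell functions or array sizes.

```bash
#!/bin/bash
# Sketch only; count_matches is a hypothetical helper.
# Takes one quoted pattern and prints the number of tokens it expands to.
# A pattern with no match stays literal and therefore counts as 1.
count_matches() (
    # Split only on newlines so names with spaces are not counted twice.
    IFS='
'
    set -- $1      # unquoted on purpose: triggers pathname expansion
    echo "$#"
)

# Example (output depends on the machine):
#   count_matches '/var/log/*'
#   getconf ARG_MAX
```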
Philip Dowie wrote:
nah, that wouldn't do it, that command line has only one [relevant] token. however,
When David later on expands the patterns inside his script, the resulting rsync command will be subject to the same restriction.

/Per

--
Per Jessen, Zürich (8.4°C)
On Friday 23 October 2009 01:01:11 am Per Jessen wrote:
When David later on expands the patterns inside his script, the resulting rsync command will be subject to the same restriction.
/Per
Yes, but since the expansion is handled "before" the rsync call, rsync receives the full file or directory name as needed. Look at the conditional inside the loop. rsync is only called the exact number of times needed as a result. If a directory can be passed to rsync, then it is, so you eliminate any unneeded calls. (there was quite a bit under the seemingly simple hood)

--
David C. Rankin
David C. Rankin wrote:
Yes, but since the expansion is handled "before" the rsync call, rsync receives the full file or directory name as needed. Look at the conditional inside the loop. rsync is only called the exact number of times needed as a result. If a directory can be passed to rsync, then it is, so you eliminate any unneeded calls.
Your script still seems unnecessarily complicated to me (could be my problem). Take your example of calling '2nv' from your first posting:

    2nv /var/log/mes* /var/log/Xorg.* /var/log/kdm.log

Now, your file globbing is resolved by the shell when you call this, so you end up with something more like this:

    2nv /var/log/messages [...] /var/log/Xorg.0.log [...] /var/log/kdm.log

Given that, your script should not need to be more than this:

    #!/bin/sh
    SSUSER=user
    DESTHOST=destination
    DESTPATH=path
    rsync -ruv $@ ${SSUSER}@${DESTHOST}:${DESTPATH}

I'm assuming you're achieving something else with your script, but I just can't see what it is.

/Per

--
Per Jessen, Zürich (11.6°C)
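The point about the calling shell resolving the globs first can be checked with a few lines. This is a sketch only; show_args is a hypothetical stand-in for a script such as 2nv and simply reports what it was handed.

```bash
#!/bin/bash
# Sketch only; show_args is a hypothetical stand-in for the real script.
show_args() {
    echo "received $# argument(s)"
    for a in "$@"; do
        echo "  $a"
    done
}

# Examples (results depend on what exists under /var/log):
#   show_args /var/log/Xorg.*      # caller's shell expands: one argument per match
#   show_args '/var/log/Xorg.*'    # quoted: a single argument containing the *
```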
David,

The arguments are globbed alright, a simple test shows that (I named the script guy.sh) :

    # ./guy.sh ../../work/*
    Number of array elements: 29
    Element 0 : ../../work/ARCSDEsysdesig.pdf
    file: ../../work/ARCSDEsysdesig.pdf
    Element 1 : ../../work/bash
    directory: ../../work/bash
    Element 2 : ../../work/contacts.csv
    file: ../../work/contacts.csv
    Element 3 : ../../work/cpio.txt
    file: ../../work/cpio.txt
    Element 4 : ../../work/dhopor
    directory: ../../work/dhopor
    Element 5 : ../../work/gi1
    file: ../../work/gi1
    Element 6 : ../../work/gi2
    file: ../../work/gi2
    Element 7 : ../../work/g.sh
    file: ../../work/g.sh
    Element 8 : ../../work/guy1
    file: ../../work/guy1
    Element 9 : ../../work/guy2
    file: ../../work/guy2
    Element 10 : ../../work/guy2.orig
    file: ../../work/guy2.orig
    Element 11 : ../../work/guy.in
    file: ../../work/guy.in
    Element 12 : ../../work/guy.out
    file: ../../work/guy.out
    Element 13 : ../../work/guy.sh
    file: ../../work/guy.sh
    Element 14 : ../../work/index.html
    file: ../../work/index.html
    Element 15 : ../../work/input.txt
    file: ../../work/input.txt
    Element 16 : ../../work/java_gcverbose_options.odt
    file: ../../work/java_gcverbose_options.odt
    Element 17 : ../../work/java_gcverbose_output.odt
    file: ../../work/java_gcverbose_output.odt
    Element 18 : ../../work/modresconf
    file: ../../work/modresconf
    Element 19 : ../../work/perltime.sh
    file: ../../work/perltime.sh
    Element 20 : ../../work/portaal.zip
    file: ../../work/portaal.zip
    Element 21 : ../../work/prod_ora_20326.trc.gz
    file: ../../work/prod_ora_20326.trc.gz
    Element 22 : ../../work/sensors_info.txt
    file: ../../work/sensors_info.txt
    Element 23 : ../../work/t
    file: ../../work/t
    Element 24 : ../../work/TABLES_TS_MAP.jpg
    file: ../../work/TABLES_TS_MAP.jpg
    Element 25 : ../../work/TABLES_TS_SPACE_ADVISORY_2008-05-24.jpg
    file: ../../work/TABLES_TS_SPACE_ADVISORY_2008-05-24.jpg
    Element 26 : ../../work/thunderbird
    directory: ../../work/thunderbird
    Element 27 : ../../work/thunderbird.csv.ldif
    file: ../../work/thunderbird.csv.ldif
    Element 28 : ../../work/winterdienst
    directory: ../../work/winterdienst

Only if you quote will you get your argument back, as expected :

    [guy@gz:~/work/bash] # ./guy.sh "../../work/*"
    Number of array elements: 1
    Element 0 : ../../work/*
    file: ../../work/ARCSDEsysdesig.pdf
    file: ../../work/contacts.csv
    file: ../../work/cpio.txt
    file: ../../work/gi1
    file: ../../work/gi2
    file: ../../work/g.sh
    file: ../../work/guy1
    file: ../../work/guy2
    file: ../../work/guy2.orig
    file: ../../work/guy.in
    file: ../../work/guy.out
    file: ../../work/guy.sh
    file: ../../work/index.html
    file: ../../work/input.txt
    file: ../../work/java_gcverbose_options.odt
    file: ../../work/java_gcverbose_output.odt
    file: ../../work/modresconf
    file: ../../work/perltime.sh
    file: ../../work/portaal.zip
    file: ../../work/prod_ora_20326.trc.gz
    file: ../../work/sensors_info.txt
    file: ../../work/t
    file: ../../work/TABLES_TS_MAP.jpg
    file: ../../work/TABLES_TS_SPACE_ADVISORY_2008-05-24.jpg
    file: ../../work/thunderbird.csv.ldif
    file: ../../work/bash:
    file: guy.sh
    file: ../../work/dhopor:

or if you use "set -o noglob" in your shell :

    # set -o noglob
    # ./guy.sh ../../work/*
    Number of array elements: 1
    Element 0 : ../../work/*
    file: ../../work/ARCSDEsysdesig.pdf
    file: ../../work/contacts.csv
    file: ../../work/cpio.txt
    file: ../../work/gi1
    file: ../../work/gi2
    file: ../../work/g.sh
    file: ../../work/guy1
    file: ../../work/guy2
    file: ../../work/guy2.orig
    file: ../../work/guy.in
    file: ../../work/guy.out
    file: ../../work/guy.sh
    file: ../../work/index.html
    file: ../../work/input.txt
    file: ../../work/java_gcverbose_options.odt
    file: ../../work/java_gcverbose_output.odt
    file: ../../work/modresconf
    file: ../../work/perltime.sh
    file: ../../work/portaal.zip
    file: ../../work/prod_ora_20326.trc.gz
    file: ../../work/sensors_info.txt
    file: ../../work/t
    file: ../../work/TABLES_TS_MAP.jpg
    file: ../../work/TABLES_TS_SPACE_ADVISORY_2008-05-24.jpg
    file: ../../work/thunderbird.csv.ldif
    file: ../../work/bash:
    file: guy.sh
    file: ../../work/dhopor:
    file: control_files
    file: data model
    file: dho-portaal_sync.sh
    file: ../../work/thunderbird:
    file: Persoonlijk_adresboek.vcf
    file: ../../work/winterdienst:
    file: aip
    file: aip_reworked.tgz.eds
    file: aip_wd.tgz
    file: control_files
    file: data model
    file: dho-portaal_sync.sh
    file: ../../work/thunderbird:
    file: Persoonlijk_adresboek.vcf
    file: ../../work/winterdienst:
    file: aip
    file: aip_reworked.tgz.eds
    file: aip_wd.tgz

Gtz,
Guy.
On Thursday 22 October 2009 04:41:46 pm Guy wrote:
David,
The arguments are globbed alright, a simple test shows that (I named the script guy.sh) :
Sorry Guy,

The snippet implied setting the internal field separator to only break on end of line. That's how you handle spaces in file names, etc. Just add:

    IFS=$'\n'

immediately before the loops and it will do what you want :-)

--
David C. Rankin
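A short sketch of what the IFS setting changes, and what it does not. It only affects how an unquoted expansion such as $(ls ...) is split into words inside the script; it has no effect on the globbing the calling shell has already done. The helper name count_fields is hypothetical.

```bash
#!/bin/bash
# Sketch only; count_fields is a hypothetical helper.
# Usage: count_fields IFS_VALUE STRING
# Splits STRING using IFS_VALUE and prints the number of resulting words.
count_fields() (
    IFS=$1
    set -f         # turn globbing off so only word splitting is shown
    set -- $2      # unquoted on purpose: this is where splitting happens
    echo "$#"
)

# Example: two names, one of them containing a space.
#   nl='
#   '
#   count_fields " $nl" "my file.log${nl}other.log"   # splits on space too: 3
#   count_fields "$nl"  "my file.log${nl}other.log"   # newline only: 2
```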
David,

David C. Rankin wrote:
Sorry Guy,
The snippet implied setting the internal field separator to only break on end of line. That's how you handle spaces in file names, etc. Just add:
IFS=$'\n'
immediately before the loops and it will do what you want :-)
The script is exactly as you supplied it. Can you :

1) give us your output of "set|grep SHELLOPTS".
2) run this script with the arguments of your choice :

    #!/bin/bash --norc

    OLDIFS=$IFS
    IFS=$'\n'

    echo "No of params : "$#

    ## Fill an Array with all CLI input
    declare -a CLIARRAY
    CLIARRAY=( "$@" )

    ## Step through CLIARRAY with ls to expand wildcards and process
    ## files specified on the command line sequentially. Rely on
    ## rsync to throw error if bad filename
    ## Simple echo is used for this example

    echo "Number of array elements:" ${#CLIARRAY[@]}
    for ((a=0;a<${#CLIARRAY[@]};a++)); do
        echo "Element $a : ${CLIARRAY[${a}]}"
        ## if the argument is a directory rsync in 1 shot, else rsync each file
        if [[ -d ${CLIARRAY[${a}]} ]]; then
            echo "directory: ${CLIARRAY[${a}]}"
        else
            for b in $(ls ${CLIARRAY[${a}]}); do
                echo "file: $b"
            done
        fi
    done

    IFS=$OLDIFS

    exit

This will show us what your script is really doing.

Gtz,
Guy.
participants (4)
- David C. Rankin
- Guy
- Per Jessen
- Philip Dowie