On 06/07/2010 01:28 AM, David C. Rankin wrote:
Guys,
I frequently need to pull multiple files from a remote host and I like to do it with a list of links that I just feed to wget with the -i option (and -b) and let it do its thing. The only pain is building the 'getfile' list. I wrote a little script that helps. If you are interested, you can grab it at:
http://www.3111skyline.com/dl/dev/scr/net/lynxdump.sh
As the name implies, it uses 'lynx -dump' to generate the list and then parses the result to leave just the links in the output file. By default, it writes all links from the URLs given on the command line to a single output file, without including any subdirectories.
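For anyone curious how the parsing step can work, here is a minimal sketch. It is not the code from lynxdump.sh; the function name extract_links and its "dirs" argument are made up for illustration. It assumes the usual 'lynx -dump' layout, where links appear in a numbered References list at the end of the dump.

```shell
#!/bin/sh
# Hypothetical helper, not taken from lynxdump.sh: read 'lynx -dump'
# output on stdin and print one URL per line.
extract_links() {
    # $1 = "dirs" keeps URLs ending in '/'; anything else drops them
    awk -v keep_dirs="$1" '
        # lynx lists links in a References section as "   1. http://host/path"
        /^ *[0-9]+\. (https?|ftp):\/\// {
            url = $2
            if (keep_dirs != "dirs" && url ~ /\/$/) next   # skip subdirectories
            if (!seen[url]++) print url                    # drop duplicates
        }'
}

# Typical use (needs lynx and network access):
#   lynx -dump "http://example.com/pub/" | extract_links > lynxdump.txt
```

Treating every URL that ends in '/' as a subdirectory is an assumption that holds for ordinary directory listings; the real script may decide this differently.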
Update: I have added a new --nodebug flag that works with the --rpm flag to exclude all debugsource and debuginfo rpms from the list of rpm URLs returned. The changes are reflected in the updated help for the lynxdump script:

00:57 nirvana:/srv/http/dl/dev/scr/net> sh lynxdump.sh

  Error: No input URL provided, exiting...

  Usage: lynxdump.sh [-h|--help] [-v|--verbose] [-r|--rpm] [--nodebug] [-d|--dirs] [-b|--base] url-with-links [-o|--outfile outfile]

  lynxdump.sh uses 'lynx -dump' to capture all links from 'url-with-links' and parses the output leaving only the direct URLs. The resulting links written to 'outfile' (default: ./lynxdump.txt) can be used with 'wget -i outfile' to retrieve all files from the remote host.

  Options:

    -h | --help      show this help and exit (must be only option given).
    -b | --base      the next URL provides the baseURL information as well as a
                     directory (i.e. -b http://download.lynx.org/docs). All other
                     urls with the same baseURL need only provide the directory
                     name (i.e. download, svn).
    -d | --dirs      include sub-directories in the list of links.
    -o | --outfile   the following command line option provides the output file name.
    -r | --rpm       changes dump file parsing so that only rpm links are saved.
         --nodebug   excludes debuginfo and debugsource files. (use with -r | --rpm)
    -v | --verbose   additional output of script operations.

  Example:

    lynxdump -b http://download.opensuse.org/repositories/X11/i586 src x86_64 --rpm --nodebug

  creates an output file with the links to rpms in ../X11/i586 ../X11/src and ../X11/x86_64 directories without the debuginfo or debugsource files included.
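To make the --base and --rpm/--nodebug behaviour concrete, here is a rough sketch of the two ideas. These helpers (expand_base, filter_rpms) are hypothetical and are not lifted from lynxdump.sh; they only reflect my reading of the help text above, namely that the base is the first URL with its last directory removed and that debug packages carry '-debuginfo-' or '-debugsource-' in the file name.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the options; not taken from lynxdump.sh.

# Given a full URL followed by sibling directory names, print one URL per
# directory. The base is the first URL with its last path component removed.
expand_base() {
    first=${1%/}          # strip a trailing slash, if any
    base=${first%/*}      # e.g. http://host/repositories/X11
    printf '%s/\n' "$first"
    shift
    for dir in "$@"; do
        printf '%s/%s/\n' "$base" "$dir"
    done
}

# Read URLs on stdin and keep only those ending in .rpm. With "nodebug"
# as $1, also drop debuginfo and debugsource packages.
filter_rpms() {
    if [ "$1" = "nodebug" ]; then
        grep '\.rpm$' | grep -Ev -- '-debug(info|source)-'
    else
        grep '\.rpm$'
    fi
}

# Example:
#   expand_base http://download.opensuse.org/repositories/X11/i586 src x86_64
```

Each URL printed by expand_base would then be dumped with lynx and the combined link list piped through filter_rpms before being written to the output file.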
If you would like to grab a few of your favorite 11.0 rpms before they mysteriously disappear, you can simply issue the command:

  lynxdump -b http://download.opensuse.org/repositories/your_favorite/openSUSE_11.0/i586 src noarch x86_64 -r --nodebug -o getfile.txt

That will create a list of complete URLs to all rpms contained in all 4 of the repository directories, without including any debuginfo or debugsource rpms. Then, to retrieve the packages to your local system, simply use 'wget -i URLsFile', here:

  wget -i getfile.txt -b   ## add the -b to wget to background the retrieval.

Have fun. If you find any bugs, let me know :)

-- 
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com