Christian,
> Script to download several images (not really tested, but should work):
grep ".." imglist.txt | while read file ; do img=$(echo $file | sed 's/^Image://') imgpath=$(echo "$img" | md5sum - | sed 's§^\(.\)\(.\).*§\1/\1\2§') wget "http://old-en.opensuse.org/images/$imgpath/$img" done
> This will download all files listed in imglist.txt with wget. imglist could look like this:
With a lot of pain and sorrow I finally got the sed command to work. The § character was a real pain in the ass, since PuTTY did strange things with it. After a lot of research I figured out you used § as an alternative sed delimiter. I replaced it with @ and it worked!!!

Now the other thing: my md5sum output is different from what the wiki structure uses. I get:

http://old-en.opensuse.org/images/f/fc/Ltsp-firewall1.png

while this file is stored under:

http://old-en.opensuse.org/images/e/e5/Ltsp-firewall1.png

Just without the script, the md5sum is:

echo Ltsp-firewall1.png | md5sum
fc0ffae0dc420b931ad46ea1eaa05d10  -

So finally the script is okay, but the md5sum comes out different. I tried all sorts of different combinations: with or without File:, with or without the .png suffix, different kinds of capitalization. Nothing seems to give the correct md5sum. Are you sure it is md5sum?
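(A likely explanation for the mismatch, not confirmed in this thread: echo appends a trailing newline, while MediaWiki hashes the bare filename without one. Compare:

# echo appends "\n", so this hashes "Ltsp-firewall1.png\n" -> fc0f... (the wrong f/fc path)
echo Ltsp-firewall1.png | md5sum

# echo -n hashes the bare filename; this should yield the hash behind the e/e5 path
echo -n Ltsp-firewall1.png | md5sum

If that is the culprit, adding -n to the echo that feeds md5sum in the script should fix the paths.)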
Matthew,

> How are you determining which files you want to get? If it makes life easier for you, I can just allow index view on files.opensuse.org, which I should probably do anyway. With that, you can download all the files on old-en.opensuse.org into one directory with a single wget command. That would probably make it a lot easier for you to pick and choose the files you need.
Determining the files: I just wanted to copy any red File: link in a transferred article, paste them into a local text file, and let wget download them for me. The only problem is the subdirectories. Being able to browse the subfolders doesn't make life easier; I can already find a file when I follow the image link on its File: page. But that requires a one-by-one action: copy the red link text from the article, search on old-en.o.o, right-click the image, select download, save as. But thanks anyway for opening the index service, it may be of good use for other things.
I have added the Indexes directive to files.opensuse.org. You should be able to go in there and find the uploaded files for all the wikis. More importantly, you can now get all the files together with a single wget command if that will help you out in your quest. Something like the following should work:
wget -r -np -nd "http://files.opensuse.org/opensuse/old-en/"
This will download every file that has been uploaded to the old-en wiki and place it under a single directory. I believe all the filenames should be different, and I believe it is set to no-clobber by default, so this should be a safe command to run. You may also want to consider adding a little wait time between requests (with a -w), so that you don't destroy your bandwidth on this.
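For instance, with an explicit no-clobber flag and a one-second wait between requests (both standard wget options):

wget -r -np -nd -nc -w 1 "http://files.opensuse.org/opensuse/old-en/"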
Well, how many gigs are we talking about? I prefer not to, actually.
Also, if the idea was for multiple people to be able to do something like this, I can even do something like create a temporary location on files.o.o with all the old-en files in one directory.
This would be of GREAT help. If the location is known, it eliminates switching from the new to the old wiki all the time to find and download your missing images. If we can get to multiple uploading as well, then it will be perfect:

* Check transferred articles
* Paste all the red File: links in a text file
* Download all the files in one go with wget
* Upload them again in one go
* 10 pages fixed in no time!

Greets,
Tim

2010/7/15 Matthew Ehle <mehle@novell.com>:
Tim,
I have added the Indexes directive to files.opensuse.org. You should be able to go in there and find the uploaded files for all the wikis. More importantly, you can now get all the files together with a single wget command if that will help you out in your quest. Something like the following should work:
wget -r -np -nd "http://files.opensuse.org/opensuse/old-en/"
This will download every file that has been uploaded to the old-en wiki and place it under a single directory. I believe all the filenames should be different, and I believe it is set to no-clobber by default, so this should be a safe command to run. You may also want to consider adding a little wait time between requests (with a -w), so that you don't destroy your bandwidth on this.
Also, if the idea was for multiple people to be able to do something like this, I can even do something like create a temporary location on files.o.o with all the old-en files in one directory.
-Matt
"Matthew Ehle" <mehle@novell.com> 07/15/10 12:30 AM >>> Tim,
How are you determining which files you want to get? If it makes life easier for you, I can just allow index view on files.opensuse.org, which I should probably do anyway. With that, you can download all the files on old-en.opensuse.org into one directory with a single wget command. That would probably make it a lot easier for you to pick and choose the files you need.
I will look into getting one of these upload extensions installed as soon as I get a handle on this session error bug that is affecting imports in the first place. Since it is possible, however unlikely, that one of the new extensions that we installed is related to the problem, I am wary of installing more extensions until we at least know what is going on.
In any case, I'm going to go ahead and set index views on files.o.o before I go to bed and forget all about it.
-Matt
Christian Boltz 07/14/10 4:35 PM >>>

Hello,
On Wednesday, 14 July 2010, Tim Mohlmann wrote:
I'm still trying to work out a script for downloading multiple images at once, to overcome the sub-directory problem. I tried earlier tips from Rajko and Christian, but it's not working as it should (wget is following a little bit too much).
to get the directory structure:
echo 'Image:foo.jpg' | sed 's/^Image://' | md5sum - | sed 's§^\(.\)\(.\).*§\1/\1\2§'
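For instance, with the filename from Tim's reply earlier in this thread, this prints:

echo 'Image:Ltsp-firewall1.png' | sed 's/^Image://' | md5sum - | sed 's§^\(.\)\(.\).*§\1/\1\2§'
f/fc

(which reproduces the trailing-newline issue described above; the file actually lives under e/e5).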
Script to download several images (not really tested, but should work):
grep ".." imglist.txt | while read file ; do img=$(echo $file | sed 's/^Image://') imgpath=$(echo "$img" | md5sum - | sed 's§^\(.\)\(.\).*§\1/\1\2§') wget "http://old-en.opensuse.org/images/$imgpath/$img" done
This will download all files listed in imglist.txt with wget. imglist could look like this:
Image:Foo.jpg
Bar.jpg          <---- Image: is not required (will be cut away anyway)
Image:Baz.jpg
I assume you know what you are writing into imglist.txt - the script does not have real protection against funny filenames etc. Most important: you have to use _ instead of spaces (or add another sed ;-)
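That extra sed could be folded into the img= line, for example (a hypothetical addition, not from the original mail):

img=$(echo "$file" | sed 's/^Image://; s/ /_/g')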
If you prefer to use command line parameters instead of a file with the filelist, replace the first line with:

for file in "$@" ; do
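Put together, a parameter-driven variant might look like this (an untested sketch: the script name is made up, the @ delimiter is the one Tim found to survive PuTTY, and echo -n avoids the trailing-newline problem discussed earlier in the thread):

#!/bin/bash
# Hypothetical standalone variant.
# Usage: ./getimages.sh Image:Foo.jpg Bar.jpg ...
# Remember: use _ instead of spaces in the names.
for file in "$@" ; do
    img=$(echo "$file" | sed 's/^Image://')
    # echo -n: hash the bare name, without a trailing newline
    imgpath=$(echo -n "$img" | md5sum - | sed 's@^\(.\)\(.\).*@\1/\1\2@')
    wget "http://old-en.opensuse.org/images/$imgpath/$img"
done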
Regards,
Christian Boltz
PS: In case you wonder: the "grep .." is a trick to skip empty (and other too-short) lines ;-)