How do I access a link with wget written in the following form?

http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...

I spent hours trying things, reading man pages and looking for online help. I don't know enough HTML to understand what the ending on that URL means.

Thanks,

--
Jim Sabatke
Hire Me!! - See my resume at http://my.execpc.com/~jsabatke

Do not meddle in the affairs of Dragons, for you are crunchy and good with ketchup.

NOTE: Please do not email me any attachments with Microsoft extensions. They are deleted on my ISP's server before I ever see them, and no bounce message is sent.
Hello,

The actual link is:

http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz

So, try:

wget http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz
----- Original Message -----
From: "Jim Sabatke"
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
I spent hours trying things, reading man pages and looking for online help.
I don't know enough HTML to understand what the ending on that URL means.
On Mon, 20 Sep 2004 07:11:37 -0500
Jim Sabatke
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
Just type it after wget and it'll download the file.

$ wget <the-long-url-here>
I spent hours trying things, reading man pages and looking for online help.
I don't know enough HTML to understand what the ending on that URL means.
It's just passing information to the server, to let the server know which mirror you want to use.

HTH,

--
- E - on SUSE 9.1 | blackbox 0.70b2 | Panasonic CF-L1
Buffalo WLI-PCM-L11GP | copperwalls was here ;)
"Look! I am making all things new." - Revelation 21:5
- Edwin - wrote:
On Mon, 20 Sep 2004 07:11:37 -0500 Jim Sabatke
wrote:

How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
Just type it after wget and it'll download the file.
$ wget <the-long-url-here>
I've tried that, a whole bunch of times. It creates a subdirectory prdownloads.sourceforge.net with the following:

robots.txt abiword/

and abiword/ contains

abiword-2.0.11.tar.gz?mirror=mesh

and it's not the right file. It's very short and an HTML file that lynx won't decode. I've tried every option in man and in the online manual.
Jim Sabatke wrote:
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
I've tried every option in man and in the online manual.
I typically find it necessary to put quotes around a URL that contains a "?".

--
"Blessed is the nation whose God is the Lord." Psalm 33:12 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://members.ij.net/mrmazda/
Felix Miata wrote:
Jim Sabatke wrote:
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
I've tried every option in man and in the online manual.
I typically find it necessary to put quotes around a URL that contains a "?".
You need to escape the ? using a \ because it's a special character, like this:

http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz\?use_mirror=unc

Or try the double-quotes proposed by Felix.

HTH,

Martin

--
Martin Mielke
Senior UNIX SysAdmin
martin.mielke@thales-is.com
THALES Information Systems
http://www.thales-is.com/
Tel.: (+34) 91 556 92 62
TimeZone: GMT+1 :-)
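To spell out both forms concretely (the mirror name "unc" below is just an example; use whichever mirror you picked):

$ wget "http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc"
$ wget http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz\?use_mirror=unc

Either way the shell stops treating the '?' as a glob character and wget receives the same URL (though, as later messages in this thread show, SourceForge may still hand back a mirror-selection page for it).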
Jim,

On Monday 20 September 2004 05:47, Jim Sabatke wrote:
- Edwin - wrote:
On Mon, 20 Sep 2004 07:11:37 -0500
Jim Sabatke
wrote:

How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc
Wget is very powerful. There are many options, but they're not particularly difficult to understand, so you'd be well advised to familiarize yourself with them. The "--help" output is over 100 lines long.
Just type it after wget and it'll download the file.
$ wget <the-long-url-here>
I've tried that, a whole bunch of times. It creates a subdirectory prdownloads.sourceforge.net with the following:
robots.txt abiword/
and abiword/ contains
abiword-2.0.11.tar.gz?mirror=mesh
and it's not the right file. It's very short and an HTML file that lynx won't decode.
You can suppress the creation of directories based on the host name and the leading directory components of the URL by giving wget the "-nd" option. You can suppress just the host-name directory with "-nH", and you can drop a leading sequence of some number of URL directory components with "--cut-dirs=number". You can get rid of URL query elements and, basically, totally override the output file name with the "-O" option, though this is only really useful for single-file retrievals, since it will concatenate the retrieved data from all specified URLs into that file.
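For example, something along these lines should leave a sensibly named file in the current directory (a sketch only; the "unc" mirror name is assumed):

$ wget -nd -O abiword-2.0.11.tar.gz \
      "http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc"

The "-O" name is whatever you choose; wget itself won't strip the "?use_mirror=..." suffix for you.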
I've tried every option in man and in the online manual.
Really? There are dozens of them! In general, it's a good idea to quote long arguments like those typical with wget (or to escape the special characters such as '*', '?', '[', ']', '{', '}', spaces and so on), though the likelihood that a glob character would lead to a match with something on your system is remote. Good luck.
-- Jim Sabatke
Randall Schulz
Randall R Schulz wrote:
Jim,
Wget is very powerful. There are many options, but they're not particularly difficult to understand, so you'd be well advised to familiarize yourself with them. The "--help" output is over 100 lines long.
As I posted, I spent hours reading man, then online manual, and searching google for the answer. This situation isn't covered in any of them that I could find. I think the wget community would be well served by having this type of usage covered by an example.
Jim,

On Monday 20 September 2004 11:59, Jim Sabatke wrote:
Randall R Schulz wrote:
Jim,
Wget is very powerful. There are many options, but they're not particularly difficult to understand, so you'd be well advised to familiarize yourself with them. The "--help" output is over 100 lines long.
As I posted, I spent hours reading man, then online manual, and searching google for the answer. This situation isn't covered in any of them that I could find. I think the wget community would be well served by having this type of usage covered by an example.
Here's an excerpt from the manual page:

-==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==-
--cut-dirs=number

    Ignore number directory components. This is useful for getting a fine-grained control over the directory where recursive retrieval will be saved.

    Take, for example, the directory at ftp://ftp.xemacs.org/pub/xemacs/. If you retrieve it with -r, it will be saved locally under ftp.xemacs.org/pub/xemacs/. While the -nH option can remove the ftp.xemacs.org/ part, you are still stuck with pub/xemacs. This is where --cut-dirs comes in handy; it makes Wget not "see" number remote directory components. Here are several examples of how --cut-dirs option works.

        No options       -> ftp.xemacs.org/pub/xemacs/
        -nH              -> pub/xemacs/
        -nH --cut-dirs=1 -> xemacs/
        -nH --cut-dirs=2 -> .

        --cut-dirs=1     -> ftp.xemacs.org/xemacs/
        ...

    If you just want to get rid of the directory structure, this option is similar to a combination of -nd and -P. However, unlike -nd, --cut-dirs does not lose with subdirectories---for instance, with -nH --cut-dirs=1, a beta/ subdirectory will be placed to xemacs/beta, as one would expect.
-==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==-

To me, wget ain't rocket science. And if you want to get hired for computer work, you're surely going to have to be able to figure this sort of thing out under your own intellectual power, it seems to me.

Randall Schulz
Randall R Schulz wrote:
Jim,
On Monday 20 September 2004 11:59, Jim Sabatke wrote:
Randall R Schulz wrote:
Jim,
Wget is very powerful. There are many options, but they're not particularly difficult to understand, so you'd be well advised to familiarize yourself with them. The "--help" output is over 100 lines long.
As I posted, I spent hours reading man, then online manual, and searching google for the answer. This situation isn't covered in any of them that I could find. I think the wget community would be well served by having this type of usage covered by an example.
Here's an excerpt from the manual page:
-==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==-
--cut-dirs=number
Ignore number directory components. This is useful for getting a fine-grained control over the directory where recursive retrieval will be saved.
Take, for example, the directory at ftp://ftp.xemacs.org/pub/xemacs/. If you retrieve it with -r, it will be saved locally under ftp.xemacs.org/pub/xemacs/. While the -nH option can remove the ftp.xemacs.org/ part, you are still stuck with pub/xemacs. This is where --cut-dirs comes in handy; it makes Wget not "see" number remote directory components. Here are several examples of how --cut-dirs option works.
No options       -> ftp.xemacs.org/pub/xemacs/
-nH              -> pub/xemacs/
-nH --cut-dirs=1 -> xemacs/
-nH --cut-dirs=2 -> .

--cut-dirs=1     -> ftp.xemacs.org/xemacs/
...
If you just want to get rid of the directory structure, this option is similar to a combination of -nd and -P. However, unlike -nd, --cut-dirs does not lose with subdirectories---for instance, with -nH --cut-dirs=1, a beta/ subdirectory will be placed to xemacs/beta, as one would expect.
-==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==-
To me, wget ain't rocket science.
And if you want to get hired for computer work, you're surely going to have to be able to figure this sort of thing out under your own intellectual power, it seems to me.
Randall Schulz
That's an idiotic reply. Who would know that --cut-dirs would have anything to do with the URL mentioned? I read through everything you posted and still don't get the reference. What on earth does that have to do with www.abc.com?something=nothing type structures? I still have no idea why I want to get rid of a directory structure.

As I originally said, I didn't understand the URL structure for the get. I couldn't find docs on it. I still have no idea what the '?' is for. Sometimes it's very easy when you have working knowledge of an area that is helpful, and completely obscure if you don't. I once made a very good living from solving difficult problems, although my area is really project management. Some things just require assistance.

Perhaps your bullying should convince me to slit my wrists and get it over with. The reason I'm not working is because of severe depression. I'm desperately (yes, I am desperate) trying to recover from that by doing things to spur me in a positive direction. In the past couple days I've compiled, installed and configured gnome 2.8 and the latest abiword. That isn't easy for most people, and I did it without asking anyone a thing.
Jim,

On Monday 20 September 2004 13:34, Jim Sabatke wrote:
...
That's an idiotic reply. Who would know that --cut-dirs would have anything to do with the URL mentioned? I read through everything you posted and still don't get the reference. What on earth does that have to do with www.abc.com?something=nothing type structures? I still have no idea why I want to get rid of a directory structure. As I originally said, I didn't understand the URL structure for the get. I couldn't find docs on it. I still have no idea what the '?' is for.
Hey. You're welcome. I also answered earlier about altering the default output file name with "-O".

The question mark introduces query parameters within a URL. It shows up quite commonly these days in web browsing. Multiple query parameters are separated by ampersands. Often individual query parameters take the form of name/value pairs with the name separated from the value by an equal sign, but that is technically outside the specification and is basically just a convention for interpreting a monolithic query string.

If you spent "hours" reading the manual page, how could you have failed to absorb the significance of "-nd", "-nH", "--cut-dirs" and "-O" and their relevance to the problem you were trying to solve?
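Schematically (a generic illustration, not any particular site's URL):

scheme://host/path?name1=value1&name2=value2

In the SourceForge link the path is /abiword/abiword-2.0.11.tar.gz and the query string is the single pair use_mirror=unc, i.e. name "use_mirror", value "unc".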
Sometimes it's very easy when you have working knowledge of an area that is helpful, and completely obscure if you don't. I once made a very good living from solving difficult problems, although my area is really project management. Some things just require assistance.
I guess I figured that basic information about the nature and structure of
a URL was not at issue here. Why would you expect the "wget" man page to
educate you about URLs?
Here's a tutorial on URLs:
Perhaps your bullying should convince me to slit my wrists and get it over with. The reason I'm not working is because of severe depression. I'm desperately (yes, I am desperate) trying to recover from that by doing things to spur me in a positive direction. In the past couple days I've compiled, installed and configured gnome 2.8 and the latest abiword. That isn't easy for most people, and I did it without asking anyone a thing.
I'm sorry about that. I, too, struggle with depression. I have lost or quit jobs because of my depressions. I, too, have at times contemplated suicide. So far, I have never attempted it. Could you tell from my interactions on this list that these things are true? Doubtful. Likewise for me responding to you. Randall Schulz
Jim Sabatke wrote:
Randall R Schulz wrote:
Jim,
Wget is very powerful. There are many options, but they're not particularly difficult to understand, so you'd be well advised to familiarize yourself with them. The "--help" output is over 100 lines long.
As I posted, I spent hours reading man, then online manual, and searching google for the answer. This situation isn't covered in any of them that I could find. I think the wget community would be well served by having this type of usage covered by an example.
May I suggest that you subscribe to the wget list? Perhaps they can solve your problem/confirm the bug has been reported etc. Send a blank mail to: wget-subscribe@sunsite.dk

--
The Little Helper
========================================================================
Hylton Conacher - Linux user # 229959 at http://counter.li.org
Currently using SuSE 9.0 Professional with KDE 3.1
Licenced Windows user
========================================================================
Jim wrote regarding '[SLE] wget question' on Mon, Sep 20 at 07:09:
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
I spent hours trying things, reading man pages and looking for online help.
I don't know enough HTML to understand what the ending on that URL means.
The ending is just a little more pathinfo for a script to generate a redirect. If you look at the generated HTML in the file that wget downloads, you'll see near the top:

<META HTTP-EQUIV="refresh" content="5; URL=http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz">

Now, it's important to note that wget doesn't typically parse the downloaded HTML looking for meta refresh directives. Your typical browser, however, will read that as "pause on this page for 5 seconds and then load the next URL". The "next URL" is the actual file to be downloaded. Sourceforge does that 1) to put a pretty splash screen on the download and 2) to discourage download scripts.

So, to grab that via CLI, you need to perform a couple of steps. Personally, I'm partial to using "lynx -source" and a couple of pipes to filter out what I'd need, but if doing that more than a few times it'd be worthwhile to write a script...

--Danny, who'd use perl and LWP to write that script :)
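A rough sketch of that pipeline, assuming the meta refresh tag looks like the one quoted above (untested against every SourceForge page layout):

$ lynx -source 'http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc' \
      | sed -n 's/.*URL=\([^"]*\)".*/\1/p' \
      | xargs wget -c

The sed pulls the redirect target out of the META tag and xargs hands it to wget, with -c so an interrupted transfer can be resumed.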
On Mon, 2004-09-20 at 07:11, Jim Sabatke wrote:
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
I spent hours trying things, reading man pages and looking for online help.
I don't know enough HTML to understand what the ending on that URL means.
wget http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz

will work for you. I have just tried this in a shell, and the file transferred.

Mike
Mike McMullin wrote:
On Mon, 2004-09-20 at 07:11, Jim Sabatke wrote:
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...
I spent hours trying things, reading man pages and looking for online help.
I don't know enough HTML to understand what the ending on that URL means.
wget http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz
will work for you. I have just tried this in a shell, and the file transferred.
Mike
Actually, that was one of the first things I tried. It downloads a file called abiword-2.0.11.tar.gz, but it is way too small (supposed to be almost 25MB) and it isn't a tar.gz file. It looked like an HTML file, but lynx couldn't decode it. Thanks for trying!
Jim,

On Monday 20 September 2004 17:39, Jim Sabatke wrote:
Mike McMullin wrote: ...
wget http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz
Will work for you. I have just tried this in a shell, and the file transfered.
Mike
Actually, that was one of the first things I tried. It downloads a file called abiword-2.0.11.tar.gz, but it is way too small (supposed to be almost 25MB) and it isn't a tar.gz file. It looked like an HTML file, but lynx couldn't decode it.
Thanks for trying!
Sourceforge has mirror selection built in to their main Web site design, and it is not the most "wget-friendly," if you will.

The link you copied is misleading in its appearance. It actually leads to a Web page from which you select a mirror (buried amid the boilerplate and advertising is the text "You are requesting file: /abiword/abiword-2.0.11.tar.gz Please select a mirror"). The links on that page are the ones that include the query parameter ("?...") that selects the mirror. Without that parameter, you just won't get the actual file you're trying to download.

Thus there's no way around either using the "-O" option to wget or manually renaming the file with the mirror selection query parameter appended.

Randall Schulz
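In command form, Randall's suggestion looks something like this (the mirror name is an assumption, and as Jim's later reply shows, this URL may still return the mirror page rather than the tarball):

$ wget -O abiword-2.0.11.tar.gz \
      "http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc"

Here "-O" only picks the local file name, so you at least aren't left with a file literally called "abiword-2.0.11.tar.gz?use_mirror=unc".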
Randall R Schulz wrote:
Jim,
Sourceforge has mirror selection built in to their main Web site design and it is not the most "wget-friendly," if you will.
The link you copied is misleading in its appearance. It actually leads to a Web page from which you select a mirror (buried amid the boilerplate and advertising is the text "You are requesting file: /abiword/abiword-2.0.11.tar.gz Please select a mirror"). The links on that page are the ones that include the query parameter ("?...") that selects the mirror. Without that parameter, you just won't get the actual file you're trying to download.
Thus there's no way around either using the "-O" option to wget or manually renaming the file with the mirror selection query parameter appended.
Randall Schulz
I think the path I followed was this:

From freshmeat's listing I browsed to:

http://freshmeat.net/redir/abiword/56/url_tgz/abiword-2.0.10.tar.gz

Which took me to the mirror selection page. From there I copied the link for the download from the mirror and used that for input to wget:

http://prdownloads.sourceforge.net/abiword/abiword-2.0.10.tar.gz?use_mirror=...

That was the URL that consumed all the time trying to get the resumed download. I thought quoting it had worked, but I tried it again and it downloads a small file, like the previous attempts.

I didn't want to click on the link, because I didn't want the browser to truncate the 20MB+ that I had already retrieved. Had I moved the file and clicked on the link, I would have seen that the next screen shows the direct link in case the download doesn't start. I used that link, provided by someone else in an earlier email, with wget -c and the download resumed to completion. From now on, I'm just going to start the download, kill it, and then use wget with the backup link.

BTW, a few weeks ago, when I was asking for alternative download programs, several people mentioned that wget has no problem with following this kind of link. I did ask in advance. Then I couldn't figure out how to make it work and focused too hard on making wget do something it apparently can't do. Telling someone they may not be competent to get a job because they couldn't follow the simple man pages was really annoying after doing that much research into the problem.
Jim,

On Monday 20 September 2004 18:46, Jim Sabatke wrote:
...
I think the path I followed was this:
From freshmeat's listing I browsed to:
http://freshmeat.net/redir/abiword/56/url_tgz/abiword-2.0.10.tar.gz
Which took me to the mirror selection page. From there I copied the link for the download from the mirror and used that for input to wget:
http://prdownloads.sourceforge.net/abiword/abiword-2.0.10.tar.gz?use_mirror=mesh
That was the URL that consumed all the time trying to get the resumed download. I thought quoting it had worked, but I tried it again and it downloads a small file, like the previous attempts.
Quoting isn't going to make a difference for this URL unless in the directory in which you issue the wget command there happens to be a directory named "http:" which contains a directory named "prdownloads.sourceforge.net" which contains a directory named "abiword" which contains a file or directory named "abiword-2.0.10.tar.gz?use_mirror=mesh" with any character in the position of the '?'. Otherwise, that argument, which _is_ a shell glob pattern, is passed unchanged. (And even that detail depends on which shell you're using and / or the options you've selected.)
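A quick demonstration of that behavior (hypothetical file names; bash with its default settings):

$ echo abiword-2.0.10.tar.gz?use_mirror=mesh
abiword-2.0.10.tar.gz?use_mirror=mesh
$ touch 'abiword-2.0.10.tar.gzXuse_mirror=mesh'
$ echo abiword-2.0.10.tar.gz?use_mirror=mesh
abiword-2.0.10.tar.gzXuse_mirror=mesh

With no matching file the pattern is passed through untouched; once a file matches, the glob silently expands to its name.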
I didn't want to click on the link, because I didn't want the browser to truncate the 20MB+ that I had already retrieved. Had I moved the file and clicked on the link, I would have seen that the next screen shows the direct link in case the download doesn't start. I used that link, provided by someone else in an earlier email, with wget -c and the download resumed to completion.
From now on, I'm just going to start the download, kill it, and then use wget with the backup link.
That isn't going to work any better than using the browser's "copy link address" command (typically accessed via a context menu from the link itself) and then using the resulting URL with wget (including any necessary directory trimming options, the "-O" to get the right output name, and the "-c" to have wget continue a partial download if an error interrupts it).
BTW, a few weeks ago, when I was asking for alternative download programs, several people mentioned that wget has no problem with following this kind of link. I did ask in advance. Then I couldn't figure out how to make it work and focused too hard on making wget do something it apparenlty can't do. Telling someone they may not be competent to get a job because they couldn't follow the simple man pages was really annoying after doing that much research into the problem.
You were not misinformed. Wget handles those URLs just fine. If you want query strings stripped (or other options such as the directory adjustments and continue options, etc.) you can write a simple shell script to save you the tedium of handling those details manually.

Randall Schulz
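A minimal sketch of such a script (the name "sfget" and its exact behavior are made up for illustration; it assumes the interesting file name is the last path component before the '?'):

#!/bin/sh
# sfget -- fetch a URL, naming the output after the URL's file name
# (query string stripped) and resuming any partial download.
url="$1"
file=$(basename "${url%%\?*}")   # drop the query string, keep the file name
exec wget -c -O "$file" "$url"

Used as: sfget 'http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc'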
On Monday 20 September 2004 14:11, Jim Sabatke wrote:
How do I access a link with wget written in the following form?
http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc
I spent hours trying things, reading man pages and looking for online help.
I don't know enough HTML to understand what the ending on that URL means.
wget 'http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=...'

yields another HTML page, so clearly this is not the right URL to use with wget. Put quotes around the URL because of the '?'.

Let's open the URL with firefox. Hello, that works! I get a nice page; at the top there is this:

    Your download should begin shortly. If it does not, try
    http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz
    or choose a different mirror

(Cancel the dialog that pops up. If JavaScript is off in your browser, then no worries.)

So instead of that first URL, you use wget with the 2nd URL:

wget http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz

Yup, that worked:

leen@ws-02:~> wget http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz
--12:30:33--  http://unc.dl.sourceforge.net/sourceforge/abiword/abiword-2.0.11.tar.gz
           => `abiword-2.0.11.tar.gz'
Resolving unc.dl.sourceforge.net... 152.2.210.121
Connecting to unc.dl.sourceforge.net[152.2.210.121]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26,238,046 [application/x-gzip]

 8% [==>                                ] 2,280,226     90.72K/s    ETA 05:20

leen@ws-02:~>

I don't need that file, so I canceled wget. ;P

Cheers,
Leen
Leen,

On Tuesday 21 September 2004 03:32, Leendert Meyer wrote:
...
wget 'http://prdownloads.sourceforge.net/abiword/abiword-2.0.11.tar.gz?use_mirror=unc' yields another HTML page, so clearly this is not the right URL to use with wget. Put quotes around the URL because of the '?'.
Let's not encourage magical thinking w.r.t. shell glob characters.

The question mark (or an asterisk or a sequence of characters enclosed in square brackets) will only be changed by the shell's glob processing when the full string in which it appears can be replaced by the name of an existing entity in the file system (and one that can be referred to relative to the current directory, whenever the argument string does not start with a slash).

If the shell is set to reject glob patterns that don't match an existing file system entity, then you'll get an explicit error from the shell. If not, the unaltered pattern will be passed on as an argument to the program. So in cases like this, the likelihood that such an argument will be altered silently by shell glob processing is exceedingly remote.
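For example, in bash (a sketch; failglob is off by default):

$ echo no-such-file?here      # no match: the pattern is passed through unchanged
no-such-file?here
$ shopt -s failglob
$ echo no-such-file?here      # with failglob set, the shell rejects it instead
bash: no match: no-such-file?here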
..
Cheers,
Leen
Randall Schulz
participants (10):
- - Edwin -
- Danny Sauer
- Felix Miata
- Hylton Conacher (ZR1HPC)
- Jim Sabatke
- John
- Leendert Meyer
- Martin Mielke
- Mike McMullin
- Randall R Schulz