Mailinglist Archive: opensuse (6210 mails)
Re: [SLE] wget confusion...
- From: Randall R Schulz <rschulz@xxxxxxxxx>
- Date: Sun, 30 Oct 2005 08:06:06 -0700
- Message-id: <200510300706.06875.rschulz@xxxxxxxxx>
Anders,
On Sunday 30 October 2005 06:50, Anders Norrbring wrote:
> On 2005-10-30 15:26 Randall R Schulz wrote:
> > Anders,
> >
> > ...
> >
> > Is it possible that the modification time returned is not that of
> > the underlying files, but rather the current time? I think this
> > will subvert wget's cycle-breaking logic, ...
>
> That would certainly make sense.. Looking in the log (with tail), I
> see that:
>
> Length: 4,573 (4.5K) [image/jpeg]
> Server file no newer than local file `/dir/file.jpg' -- not
> retrieving.
Modification times only really matter for HTML files (for the purposes
of this discussion), since they're the only ones with hyperlinks.

> I have no idea what the server time stamp would be..
You can see what mod time the server is returning by opening the page in
a browser and getting the page's information or properties. In Mozilla,
it's "View -> Page Info...". The "General" tab of the resulting window
includes the modification and expiration times.
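
If a browser isn't handy, the same check can be scripted. What follows is
only a minimal Python sketch, assuming the server answers HEAD requests and
sends a Last-Modified header at all. The function names
(parse_last_modified, fetch_last_modified, looks_like_current_time) and the
five-second tolerance are my own illustrative choices, not anything taken
from wget.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_last_modified(value):
    """Parse an HTTP Last-Modified header value; return None if absent or malformed."""
    if not value:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None


def fetch_last_modified(url, timeout=10):
    """Send a HEAD request and return the parsed Last-Modified time, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_last_modified(response.headers.get("Last-Modified"))


def looks_like_current_time(last_modified, now=None, tolerance_seconds=5):
    """Return True if the server's mod time is within a few seconds of 'now'.

    That pattern suggests the server stamps each response with the request
    time rather than the underlying file's real modification time.
    """
    if last_modified is None:
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    return abs((now - last_modified).total_seconds()) <= tolerance_seconds
```

Calling fetch_last_modified() on the same page twice, a minute or so apart,
and getting two different answers that both track the clock would support
the "current time" theory.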
> >>My intention was that it should do one round, and then exit, but it
> >>seems like it just goes on...
> >
> > Normally it works, as long as the graph formed by the hyperlinks is
> > bounded, either intrinsically or because you gave options that cut
> > off, say, links that go off-site or outside the hierarchy at which
> > you initiated the retrieval.
>
> It would be nice if I knew how to make it break after one cycle, but
> I don't....
Ordinarily, it's not up to you, but rather wget itself. But if the
server is not returning sensible modification times, then it may be
difficult.
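
To illustrate why a bounded link graph normally lets a retrieval finish on
its own, here is a toy traversal in Python. It is a sketch under stated
assumptions, not wget's actual implementation: the link graph is a
hypothetical in-memory dictionary rather than fetched pages, and the
crawl() function and its allowed_prefix parameter are made up for the
example.

```python
from collections import deque


def crawl(start, links, allowed_prefix):
    """Visit every page reachable from 'start' exactly once.

    'links' maps a URL to the URLs it links to (a stand-in for parsing HTML).
    Links outside 'allowed_prefix' are cut off, which keeps the graph bounded.
    """
    seen = {start}            # URLs already queued; this is what breaks cycles
    queue = deque([start])
    visited_order = []
    while queue:
        url = queue.popleft()
        visited_order.append(url)
        for target in links.get(url, []):
            if target.startswith(allowed_prefix) and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited_order


# Hypothetical link graph: two pages that link to each other, plus one
# off-site link that the prefix check discards.
example_links = {
    "http://example.org/a/": ["http://example.org/a/b.html", "http://other.example/"],
    "http://example.org/a/b.html": ["http://example.org/a/"],
}
```

With that graph, crawl("http://example.org/a/", example_links,
"http://example.org/a/") stops after two pages even though they link to
each other, because each URL is queued at most once.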
You might want to try a different mirroring tool to see if the same
problem exists there, too.
> Anders
Randall Schulz