[opensuse] Howto keep track....
...on changed or new files on HTTP/HTML-only servers.
I would like to track some directories to see whether new source tarballs have been added, and then download them automatically. I can already do that for FTP sites with my own scripts, but HTML pages differ from site to site, so each page and directory would need its own custom processing.
I know this is hard to do with basic Linux tools (lynx, curl, wget) and bash scripting, but if someone has a pointer, please let me know. By the way, I am told that Perl and LWP might do the job; I program in various languages, but not in Perl or Python.
Regards, Frans.
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 10/20/2013 03:42 PM, Frans de Boer wrote:
...on changed or new files on HTTP/HTML-only servers.
I would like to track some directories to see whether new source tarballs have been added, and then download them automatically. I can already do that for FTP sites with my own scripts, but HTML pages differ from site to site, so each page and directory would need its own custom processing.
I know this is hard to do with basic Linux tools (lynx, curl, wget) and bash scripting, but if someone has a pointer, please let me know. By the way, I am told that Perl and LWP might do the job; I program in various languages, but not in Perl or Python.
Regards, Frans.
What about using rsync?
--
Duaine Hechler
Piano, Player Piano, Pump Organ - Tuning, Servicing & Rebuilding
(314) 838-5587 / dahechler@att.net / www.hechlerpianoandorgan.com
Home & Business user of Linux - 13 years
On 10/21/2013 02:26 AM, Duaine Hechler wrote:
What about using rsync?
Rsync only works locally, over SSH, or against a remote rsync daemon, and neither remote option is available here. External sites like sourceforge.net are accessed via git, svn, cvs or http. Many projects on sourceforge.net and other sites do not support git, svn or cvs, which leaves only http.
Frans.
On October 20, 2013 at 10:42 PM Frans de Boer <frans@fransdb.nl> wrote:
...on changed or new files on HTTP/HTML-only servers. I would like to track some directories to see whether new source tarballs have been added, and then download them automatically.
I'd start with wget --mirror and play with options like -X and -np to limit the download.
Have fun, Berny
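To make the wget suggestion a bit more concrete, here is a minimal sketch in Python that only assembles the command line and prints it. The helper name build_wget_command, the example.org URL and the excluded directory are made up for illustration. The --accept filter is an extra assumption on top of the options named above; it is there so that only tarballs are kept.

```python
import shlex


def build_wget_command(url, accept_patterns, exclude_dirs=()):
    """Return an argv list for a restricted 'wget --mirror' run.

    Hypothetical helper: it only builds the command and does not run it.
    """
    cmd = [
        "wget",
        "--mirror",     # recursive retrieval with timestamping
        "--no-parent",  # -np: never ascend above the start directory
    ]
    if accept_patterns:
        # -A: comma-separated list of file name patterns to keep
        cmd += ["--accept", ",".join(accept_patterns)]
    if exclude_dirs:
        # -X: comma-separated list of directories to skip
        cmd += ["--exclude-directories", ",".join(exclude_dirs)]
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Placeholder URL and paths, not a real project.
    cmd = build_wget_command(
        "https://example.org/pub/project/",
        ["*.tar.gz", "*.tar.xz"],
        ["/pub/project/old"],
    )
    print(shlex.join(cmd))  # paste into a shell or pass cmd to subprocess.run
```

Because --mirror turns on timestamping, a repeated run (from cron, for example) should skip files that have not changed on the server, which covers the "changed or new files" part as long as the server reports modification times.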
Frans de Boer wrote:
I know this is hard to do with basic Linux tools (lynx, curl, wget) and bash scripting, but if someone has a pointer, please let me know. By the way, I am told that Perl and LWP might do the job; I program in various languages, but not in Perl or Python.
Yeah, Perl will do it, and LWP provides some of the necessary nuts and bolts, but you'd be better off starting from WWW::Mechanize and its friends.
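For anyone who would rather not pick up Perl, the same idea (collect every link on an index page, keep the tarballs, compare against what was seen before) can be sketched with Python's standard library alone. This is not WWW::Mechanize, just a rough equivalent of its link extraction. The names LinkCollector, new_tarballs and TARBALL_SUFFIXES, the suffix list and the example.org URLs are all assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of tarball suffixes; adjust per site.
TARBALL_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".tar.xz")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def new_tarballs(html, base_url, seen):
    """Return sorted tarball URLs found in html that are not in seen."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    found = {u for u in collector.links if u.lower().endswith(TARBALL_SUFFIXES)}
    return sorted(found - set(seen))


if __name__ == "__main__":
    # Inline sample page; a real run would fetch it, e.g. with urllib.request.
    page = (
        '<a href="../">Parent Directory</a> '
        '<a href="foo-1.0.tar.gz">foo-1.0.tar.gz</a> '
        '<a href="foo-1.1.tar.gz">foo-1.1.tar.gz</a>'
    )
    base = "https://example.org/pub/foo/"
    seen = {"https://example.org/pub/foo/foo-1.0.tar.gz"}
    print(new_tarballs(page, base, seen))
```

Since this looks only at link targets rather than page layout, it should work unchanged on most plain directory listings. Sites whose download links do not end in the file name would still need a different filter.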
participants (4)
- Bernhard Voelker
- Dave Howorth
- Duaine Hechler
- Frans de Boer