On Fri, 6 Jun 2008, Carlos E. R. wrote:-
> Well, if the other side is interested in getting info they shouldn't, they can ignore the robots.txt file and do the scan slowly, so as not to be so intrusive ;-)
One interesting little experiment I've yet to try is to add a Disallow entry to robots.txt for a sub-directory that isn't linked from anywhere else, and then see which robots actually try to index it. It might even be fun to build another page detailing which IP addresses visited these hidden locations.
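For anyone wanting to try the same thing, the trap entry itself is only a couple of lines; the directory name below is just a placeholder, and the directory should exist but never be linked to:

  User-agent: *
  Disallow: /uncharted/

Any request for /uncharted/ then comes either from a robot that ignores robots.txt or from someone deliberately poking at the disallowed paths. On a typical Apache setup the offending addresses can be pulled straight out of the access log, e.g.:

  grep '/uncharted/' /var/log/apache2/access_log | awk '{print $1}' | sort -u

(the log path will vary between distributions).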
> Even wget has options for that. The server sees it as normal traffic, unless someone analyses the request pattern.
And with the use of --wait and --random-wait you can (virtually?) eliminate the patterns. By setting the wait time to a minute, wget will wait anywhere up to two minutes between successive fetches; an example invocation is sketched below the signature. The full details, and an explanation of why --random-wait exists, are in the wget man page.

Regards,
David Bolt

-- 
Team Acorn: http://www.distributed.net/  OGR-P2 @ ~100Mnodes  RC5-72 @ ~15Mkeys
SUSE 10.1 32bit  |                     | openSUSE 10.3 32bit | openSUSE 11.0RC1
SUSE 10.1 64bit  | openSUSE 10.2 64bit | openSUSE 10.3 64bit
RISC OS 3.6      | TOS 4.02            | openSUSE 10.3 PPC   | RISC OS 3.11
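As a rough sketch of the sort of invocation being discussed (the URL and recursion depth are just placeholders), something like:

  wget --recursive --level=2 --wait=60 --random-wait -e robots=off http://www.example.com/

would crawl the site while ignoring robots.txt, pausing a randomised interval of up to two minutes between successive fetches, which makes the requests that much harder to pick out of the normal traffic.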