On Thursday 05 June 2008 16:36, Carlos E. R. wrote:
The Thursday 2008-06-05 at 22:41 +0200, jdd sur free wrote:
Google doesn't search way back, so I bet it's protected against robots, but since I could find my mother's e-mail, I bet smart robots can circumvent robots.txt
Not smart, simply twisted. A robot can simply ignore what the robots.txt file says. Compliance is not mandatory, and the server does not enforce it.
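To illustrate the point that honoring robots.txt is voluntary, here is a minimal sketch of what a well-behaved robot does before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" user agent and the example.org URLs are made up for illustration. Nothing on the server side runs this check: a robot that never calls it is not stopped by the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""


def polite_can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url.

    This check happens entirely on the client side. A robot that skips
    it can still request the URL; the file itself blocks nothing.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.org/index.html",
                "http://example.org/archive/msg1.html"):
        print(url, polite_can_fetch(ROBOTS_TXT, "ExampleBot", url))
```

A polite crawler would skip any URL for which this returns False. Actually keeping an impolite one out takes something enforced by the server, such as authentication or blocking by address.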
Not necessarily, but not necessarily not... One day I came to work to find that several of our servers were running "hot" (protracted very high utilization). When we started looking at the logs, we found that a particular IP range was responsible for the bulk of the traffic. Someone was very interested in our retail business database (accessed via a location-specific Web query interface). We traced the traffic back to the business that was doing the very inefficient bulk download, contacted them and made a deal to share a database appropriately indexed for their needs. The bottom line is that organizations that run big server farms keep a close eye on their resource usage and utilization patterns. If necessary, extreme prejudice can be used to deflect inappropriate access...
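The kind of log check described above can be sketched roughly as follows. This is not the tooling that was actually used, only a small illustration. It assumes access-log lines whose first whitespace-separated field is the client's IPv4 address (as in the common Apache-style log format) and groups requests by /24 network. The sample lines use documentation-only addresses and are invented.

```python
import ipaddress
from collections import Counter


def requests_per_network(log_lines, prefix_len=24):
    """Count requests per IPv4 network of the given prefix length.

    Assumes the first whitespace-separated field of each line is the
    client address; lines that do not fit that assumption are skipped.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        if addr.version != 4:
            continue  # keep the sketch simple: IPv4 only
        # strict=False masks off the host bits, e.g. 192.0.2.17 -> 192.0.2.0/24
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [05/Jun/2008:09:00:01 +0000] "GET /query?store=1 HTTP/1.1" 200 512',
    '192.0.2.11 - - [05/Jun/2008:09:00:02 +0000] "GET /query?store=2 HTTP/1.1" 200 498',
    '198.51.100.7 - - [05/Jun/2008:09:00:03 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '192.0.2.12 - - [05/Jun/2008:09:00:04 +0000] "GET /query?store=3 HTTP/1.1" 200 530',
]

if __name__ == "__main__":
    for network, hits in requests_per_network(SAMPLE_LOG).most_common():
        print(network, hits)
```

Sorting the counts with most_common() puts the heaviest networks first, which is usually enough to spot a single range doing a bulk download.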
-- Cheers, Carlos E. R.
Randall Schulz