Hi,

How can I prevent a website from being mirrored by Wget? I have tried a robots.txt with

User-agent: *
Disallow: /

It stops the robots but not Wget, so is there some other way to stop wget users from eating my bandwidth?

Thanks
--
Togan Muftuoglu
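As a small illustration of what the robots.txt above asks for, here is a sketch using Python's standard `urllib.robotparser`. The helper name `allowed` and the sample paths are made up for this example. It only shows how a client that chooses to obey robots.txt would read the rules; nothing in it forces a client to do so.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question: every user agent, whole site disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a robots.txt-compliant client may fetch `path`.

    Hypothetical helper for illustration: it parses the given robots.txt
    text and asks the parser about one user agent and one path.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # A compliant client identifying itself as "Wget" is told to stay out.
    print(allowed(ROBOTS_TXT, "Wget", "/index.html"))
```

The check runs entirely on the client side, so the file is a request rather than an access control.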
> User-agent: *
> Disallow: /
>
> It stops the robots but not Wget so is there some other way to stop wget users eat my bandwidth?
Tough luck. By default, wget honours robots.txt, but that can be turned off. If it couldn't be turned off, wget would only be 1/10 as useful. wget is a great tool for getting around noxious webmasters. I agree, though, that if used carelessly it can be a pain to those webmasters. Btw, you can change the user-agent ID in wget too, so give up.

A web server is a public offering, so restricting the public from making use of it is contradictory and therefore difficult to impossible. I'm afraid you'll have to put up with it. Any countermeasures are difficult to implement and not necessarily reliable (e.g. restricting download volume per originating IP: there could be 10000 users behind that IP).

Volker

--
Volker Kuhlmann is possibly list0570 with the domain in header
http://volker.orcon.net.nz/
Please do not CC list postings to me.
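For readers wondering what "restricting download volume per originating IP" could look like, here is a minimal sketch in Python. It is a plain fixed-window byte counter, not something taken from any particular web server; the class name `PerIpByteBudget`, its parameters, and the sample addresses are all assumptions made for illustration. It also shows the weakness mentioned above: everyone behind one address shares the same budget.

```python
import time


class PerIpByteBudget:
    """Fixed-window cap on bytes served per client IP (illustrative only)."""

    def __init__(self, max_bytes, window_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the sketch can be tested
        self._usage = {}             # ip -> (window_start, bytes_used)

    def allow(self, ip, nbytes):
        """Return True and record the bytes if `ip` is still within budget."""
        now = self._clock()
        start, used = self._usage.get(ip, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired: start counting afresh.
            start, used = now, 0
        if used + nbytes > self.max_bytes:
            self._usage[ip] = (start, used)
            return False
        self._usage[ip] = (start, used + nbytes)
        return True


if __name__ == "__main__":
    budget = PerIpByteBudget(max_bytes=1_000_000, window_seconds=3600)
    # Two different users behind the same proxy address share one budget.
    print(budget.allow("192.0.2.10", 800_000))
    print(budget.allow("192.0.2.10", 800_000))
```

In the usage above, the second request from the same address is refused even if it comes from a different person, which is why such a cap is not necessarily reliable.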