Robots.txt revisited. Also, which robots to disallow?
Wednesday, August 27th, 2008

We've been switching servers and web hosting for a lot of web sites in the last 3 weeks. We allotted a healthy dose of bandwidth to these sites' new shared hosting accounts, knowing that uploads can count toward the traffic quota, even though one of our dedicated server providers doesn't count incoming FTP traffic. So we were surprised to find that one of the sites we recently moved had already eaten up the 5 GB of bandwidth assigned to it. This is a 54 MB site powered by WordPress, where most if not all of the content is text stored in the database. A quick look at the Webalizer traffic logs (provided in cPanel) revealed visits from trusted and known web robots, as well as a new surprise entry…