Weblog Metrics: 'bots redux. Bernie Goldbach writes: "If you do this kind of analysis, ensure you are considering page views and not hits. Spiders will normally grab HTML files; real visitors will grab entire pages. On my blog site, a real person generates at least 13 hits for every visit. A spider normally generates no more than 2 hits per visit, as good spiders take robots.txt along with the page.
"I did a rough analysis of my blog traffic for December 11, 2002. At first pass, 350 blog pages were served. Nearly 60 per cent of those pages from my blog directory were requested by robots. That indicates I am at the lower end of the information feeding chain, getting more bots than bodies."
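Bernie's rule of thumb is easy to sketch in Python. This is only a rough illustration: the function name classify_visit and its two cutoffs are assumptions lifted from his figures for his own site (no more than 2 hits for a spider, at least 13 for a person), and anything in between is left unclassified.

```python
def classify_visit(hits, bot_max=2, human_min=13):
    """Guess who made a visit from the number of hits it generated.

    The thresholds are assumptions taken from Bernie's figures for his
    own site; other sites will need different values.
    """
    if hits <= bot_max:
        return "likely bot"
    if hits >= human_min:
        return "likely human"
    return "unclear"


# Hypothetical hit counts for five visits, made up for illustration.
visits = [1, 2, 13, 27, 6]
labels = [classify_visit(h) for h in visits]
print(labels)
```

A page that pulls in fewer images or stylesheets than Bernie's would need a lower human_min, so treat the defaults as a starting point rather than a rule.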
Bernie's remark and Henry Copeland's comment got me digging through www.gulker.com's logs. On busy days, there are usually a few heavy referrers (likely humans) and a gaggle of referrers with one or two accesses (likely 'bots) and a roughly equal number of search engine requests (likely humans - I don't think the 'bots crawl each other).
Which is to say Bernie and Henry are right: 'bots make up a lot of blog traffic. The relatively heavy robot traffic isn't a bad thing: if the 'bots weren't indexing your pages, you wouldn't be getting the search referrers. On gulker.com this morning, there were 42 referrers at noon: The Independent had sent 79 hits, Bernie's site had sent 5 and the rest were either referred by search engine results or appeared to be 'bots (18 searches and 21 'bots). The logs showed 1845 pages downloaded in 869 visits by 656 unique hosts and identified 'known robots' as 25% of visitor platforms.
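For a quick feel for those log totals, here is the arithmetic as a small Python sketch. The variable names are mine; the three counts are the ones quoted above.

```python
# Totals reported by the gulker.com logs.
pages = 1845
visits = 869
hosts = 656

pages_per_visit = pages / visits
visits_per_host = visits / hosts

print(f"{pages_per_visit:.2f} pages per visit")   # 2.12 pages per visit
print(f"{visits_per_host:.2f} visits per host")   # 1.32 visits per host
```

These are averages over every visitor, 'bots included, so they say nothing on their own about how many pages a human reads.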
This is roughly consistent with gulker.com's referrers when I looked early Sunday morning, when there were no large 'human' results: the number of 'bot crawls was a bit larger than the number of people who had come in following a link from a Google or other search engine.
So the 'bot crawls are the 'tax' you pay to get the search referrals. And it takes one-and-a-fraction 'bot crawls to land one human visit. So, again, it appears inbound links are important for non-celeb sites: more links mean more crawls mean more people find you through searches.
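The "one-and-a-fraction" figure can be checked against the noon referrer counts (18 searches and 21 'bots). A minimal sketch, assuming each 'bot referrer stands for one crawl and each search referrer for one human visit, which is a simplification:

```python
# Noon referrer counts from gulker.com.
bot_referrers = 21
search_referrers = 18

# Assumption: one 'bot referrer = one crawl, one search referrer = one human visit.
crawls_per_search_visit = bot_referrers / search_referrers

print(f"{crawls_per_search_visit:.2f} 'bot crawls per search referral")  # 1.17
```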
BTW, I have been doing an analysis of robot 'noise' which has necessitated restarting my logs process... will relay when I have a better handle on it. Bernie, by the way, has an interesting ISP situation... he pays for a share of 512K by buying his provider 2 pints of Guinness a week... costs him about $70 a year... good deal! Any takers in Menlo Park?
1:06:24 PM