What to think about Cyveillance?
Turned into a longer piece than I had anticipated... here's the whole thing. A few salient excerpts:
You only get one chance to make a first impression... In that light, my introduction to Cyveillance, which firm's bot appears to have visited my Web server an astonishing 985 times since December 15, is a bit troubling. Cyveillance, you see, is like a stranger who comes to your door, and, by way of introduction, lies to you, before doing even more alarming things.
Cyveillance does this by forging the name of its spider software, and otherwise obfuscating its identity when it comes to visit. By comparison... Googlebot is like the UPS driver who comes to the door in a uniform, and will happily show you his ID and business card: Cyveillancebot is more like a coarse, unshaven, itchy guy in a ski mask lurking near your half-open bedroom window...
Indeed, Googlebot is showing extreme politeness and deference by asking if it's even all right to come visit: "GET /robots.txt HTTP/1.0" means that Googlebot is checking for a file in which I can tell robots the terms for visiting my site. It's the place where I can put up the Web equivalent of a 'No Solicitors' or 'Keep Out' sign.
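For the curious, here is roughly what a well-mannered robot does with that file once it has fetched it. This is only a sketch using Python's standard `urllib.robotparser` module; the `ROBOTS_TXT` contents, the paths, the name `SomeOtherBot` and the helper `is_welcome` are all made up for illustration and are not my actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: Googlebot may go anywhere except
# /private/, and every other robot is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_welcome(user_agent: str, path: str) -> bool:
    """Return True if the posted rules allow this robot to fetch this path."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would first fetch
    # them with "GET /robots.txt" from the site it wants to visit.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(is_welcome("Googlebot", "/columns/index.html"))
    print(is_welcome("Googlebot", "/private/notes.html"))
    print(is_welcome("SomeOtherBot", "/columns/index.html"))
```

Nothing in this enforces anything, of course: the robot has to choose to ask, and then choose to obey the answer.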
Cyveillancebot has no such manners: not once in 985 visits has Cyveillancebot asked to see 'robots.txt'. Not only do we start our relationship with a blitz of lies, but Cyveillancebot doesn't seem to much care if it's even welcome hereabouts...
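A tally like that is easy enough to make from an ordinary access log. The sketch below assumes Common Log Format lines and groups visits by client host rather than by the robot's self-reported name, since that name can be forged. The host names, the timestamps (including the year) and the `SAMPLE_LOG` lines are invented for illustration; they are not entries from my server.

```python
import re
from collections import defaultdict

# Matches the start of a Common Log Format line:
# host ident user [timestamp] "METHOD path protocol" ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)

# Invented sample entries, not real log data.
SAMPLE_LOG = [
    'polite.example - - [15/Dec/2000:10:00:00 -0500] "GET /robots.txt HTTP/1.0" 200 120',
    'polite.example - - [15/Dec/2000:10:00:05 -0500] "GET /index.html HTTP/1.0" 200 5120',
    'masked.example - - [15/Dec/2000:10:01:00 -0500] "GET /columns/a.html HTTP/1.0" 200 8040',
    'masked.example - - [15/Dec/2000:10:01:01 -0500] "GET /columns/b.html HTTP/1.0" 200 7911',
    'masked.example - - [15/Dec/2000:10:01:02 -0500] "GET /columns/c.html HTTP/1.0" 200 9002',
]


def summarize(lines):
    """Return {host: (visit_count, asked_for_robots_txt)} for parseable lines."""
    visits = defaultdict(int)
    asked = defaultdict(bool)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip anything that is not in the expected format
        host = match.group("host")
        visits[host] += 1
        if match.group("path") == "/robots.txt":
            asked[host] = True
    return {host: (count, asked[host]) for host, count in visits.items()}


if __name__ == "__main__":
    for host, (count, polite) in summarize(SAMPLE_LOG).items():
        print(host, count, "asked first" if polite else "never asked")
```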
On December 15, Cyveillancebot rifled through the directory that contained all of the columns and articles that I submitted to The Independent for the years 1999 to 2001 - 155 items in all. As it happens, these particular items are copyrighted.
Which is an interesting proposition. 'robots.txt' is a well-documented Web standard that is observed, in my experience, by virtually every crawler that visits my site. It is, in fact, a mechanism for safeguarding content that owners wish to keep private from crawlers.
You wouldn't describe it as a terribly robust mechanism: it's sort of like a weak latch on your door. A thug can easily kick the door in, but a law-abiding citizen, a genteel person, never would.
Cyveillancebot is operated by a company that claims among its clients The Washington Post, Dow Jones, Dell Computer, Ford Motor Company, Levi Strauss, Bell Atlantic, Nextel, Nintendo, Goodyear and VeriSign and God knows how many others. These companies are reported to pay Cyveillance from $30,000 to hundreds of thousands of dollars annually to monitor the Web on their behalf.
But, my goodness, these are fine, upright corporate citizens, the very models of decent, accountable customer-oriented consumer companies, are they not? So what are they doing not only hanging out with a rough and dishonest character like Cyveillancebot, but paying his creators, to boot?
The answer, as it often does, goes to the roots of wealth and power... the powerful seek mechanisms and agents by which to maintain their position against perceived threats. The nature of the mechanisms and agents varies, but they have a common unsavory thread...
These companies use Cyveillance to protect their good names and brands and content from hackers, pirates and all manner of Net 'low life', which seems like a fair thing. If you are getting mugged on the sidewalk every day, no one would blame you for choosing to travel in the company of a large, tough-looking person.
But what happens when the former victim now chooses to pre-emptively mug others? It would seem that Cyveillance performs exactly that role: companies whose safeguards have been violated, and whose digital property has been misappropriated, now hire Cyveillance to do precisely that, and not just to the criminals and thugs who've been nailed with the victim's wallet and keys in their possession.
Cyveillance widely and randomly (a 2000 press release brags that they continually 'mine and analyze' all 2.1 billion pages on the Web) mugs everybody who comes down the street, and if you don't happen to be the type who goes out, they come and find you.
Again, here's the whole rant: it will be interesting to see what Cyveillancebot's patented NetSapien™ Technology does with this particular data...
11:22:09 AM