What to think about Cyveillance?

You only get one chance to make a first impression. Knowing that, most of us tend to treat first meetings with, at the least, a bit of respect and preparation. I, for one, usually run my hand through my normally scraggly locks and make sure my handshaking paw isn't too grubby before being introduced to a new person. I try to be polite, and behave in a socially acceptable manner. You never know where a new connection can lead.

In that light, my introduction to Cyveillance, which firm appears to have visited my Web server an astonishing 985 times since December 15, is a bit troubling. Cyveillance, you see, is like a stranger who comes to your door, and, by way of introduction, lies to you. Cyveillance does this by forging the name of its spider software, and otherwise obfuscating its identity when it comes to visit. Indeed, Cyveillance is a kind of hyperkinetic liar: it frequently identifies itself as a half-dozen different entities in the space of a few seconds. Imagine meeting someone who told you "Hi, I'm Bob Jones, Hi I'm Roger Smith, Hi, I'm Elaine MacPherson" in one gush.

Here's what it looks like in my Web server's log:
63.148.99.232 - - [02/May/2003:13:01:37 -0700] "Mozilla/4.0 (compatible; MSIE 5.05; Windows NT 3.51)"

By comparison, here's how another visitor introduces itself:

64.68.82.39 - - [05/May/2003:15:18:23 -0700] "GET /robots.txt HTTP/1.0" 404 275 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

If you look up Googlebot's and Cyveillancebot's IP addresses, you find this:
[g4_tower:~] gulker% nslookup 64.68.82.39
Name: crawler11.googlebot.com
[g4_tower:~] gulker% nslookup 63.148.99.232
*** galapagos.gulker.com can't find 63.148.99.232: Non-existent host/domain

Googlebot is like the UPS driver who comes to the door in a uniform, and will happily show you his ID and business card; Cyveillancebot is like a coarse, unshaven, itchy guy with his hat pulled down lurking near your half-open bedroom window.

Googlebot not only has nothing to hide but is attempting to be polite, besides - the robot equivalent of patting hair in place and asking how the wife and kids are doing. In the first place, Googlebot is saying who and what version it is, and is leaving a calling card - a Web address where you can get the full scoop on how Googlebot goes about its daily rounds. Indeed, Googlebot is showing extreme politeness and deference by asking if it's even all right to come visit: "GET /robots.txt HTTP/1.0" means that Googlebot is checking for a file in which I can tell robots the terms for visiting my site. It's the place where I can put up the Web equivalent of a 'No Solicitors' or 'Keep Out' sign.

Cyveillancebot has no such manners: not once in 985 visits has Cyveillancebot asked to see 'robots.txt'. Not only do we start our relationship with a blitz of lies, but Cyveillancebot doesn't seem to much care if it's even welcome hereabouts. Indeed, Cyveillancebot is like the rough-looking visitor who, after banging on your door and getting no answer, rattles the doorknob to see if he can get in. And if your door isn't open, he then tries the windows, the back door and the garage. And if any of those happen to be open, he pretty much helps himself to anything he can find.

On December 15, Cyveillancebot rifled through the directory that contained all of the columns and articles that I submitted to The Independent for the years 1999 to 2001 - 155 items in all. As it happens, these particular items are copyrighted. Which is an interesting proposition.
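For anyone who wants to check their own logs for the two habits described above (one address announcing itself under several names, and never asking for 'robots.txt'), here is a minimal Python sketch. It assumes the server writes the common "combined" log format, which is what the Googlebot line above looks like. The function names, the 192.0.2.10 address and the "ExampleAgent" strings are made up for illustration; they are not taken from my log.

```python
import re
from collections import defaultdict

# Assumed layout (combined log format):
# ip ident user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize_visitors(lines):
    """Group log lines by IP: hit count, agent names seen, robots.txt asked."""
    visitors = defaultdict(
        lambda: {"hits": 0, "agents": set(), "asked_robots": False}
    )
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        entry = visitors[match["ip"]]
        entry["hits"] += 1
        entry["agents"].add(match["agent"])
        # The request field looks like: GET /robots.txt HTTP/1.0
        parts = match["request"].split()
        if len(parts) >= 2 and parts[1] == "/robots.txt":
            entry["asked_robots"] = True
    return dict(visitors)


def flag_visitors(summary):
    """Return IPs that used more than one agent name and never asked for robots.txt."""
    return sorted(
        ip
        for ip, entry in summary.items()
        if len(entry["agents"]) > 1 and not entry["asked_robots"]
    )


# Illustrative sample: the first line is the Googlebot entry quoted above;
# the 192.0.2.10 lines are hypothetical and use a documentation-only address.
SAMPLE_LOG = [
    '64.68.82.39 - - [05/May/2003:15:18:23 -0700] "GET /robots.txt HTTP/1.0" '
    '404 275 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [02/May/2003:13:01:37 -0700] "GET /index.html HTTP/1.0" '
    '200 1024 "-" "ExampleAgent/1.0"',
    '192.0.2.10 - - [02/May/2003:13:01:38 -0700] "GET /a.html HTTP/1.0" '
    '200 2048 "-" "ExampleAgent/2.0"',
    '192.0.2.10 - - [02/May/2003:13:01:39 -0700] "GET /b.html HTTP/1.0" '
    '200 512 "-" "OtherAgent/5.5"',
]

if __name__ == "__main__":
    summary = summarize_visitors(SAMPLE_LOG)
    for ip in flag_visitors(summary):
        print(ip, summary[ip]["hits"], sorted(summary[ip]["agents"]))
```

A visitor that changes its agent string is not proof of bad intent by itself (several people behind one proxy would look similar), so treat the flagged list as a starting point for a closer look rather than a verdict.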
'robots.txt' is a well-documented Web standard that is observed, in my experience, by virtually every crawler that visits my site. It is, in fact, a mechanism for safeguarding content that owners wish to keep private from crawlers. You wouldn't describe it as a terribly robust mechanism: it's sort of like a weak latch on your door. A thug can easily kick the door in, but a law-abiding citizen, a genteel person, never would. However strong or weak the latch, breaking and entering is breaking and entering.

Cyveillancebot is operated by a company that claims among its clients The Washington Post, Dow Jones, Dell Computer, Ford Motor Company, Levi Strauss, Bell Atlantic, Nextel, Nintendo, Goodyear, VeriSign and God knows how many others. These companies are reported to pay Cyveillance from $30,000 to hundreds of thousands of dollars annually to monitor the Web on their behalf. But, my goodness, these are fine, upright corporate citizens, the very models of decent, accountable customer-oriented consumer companies, are they not? So what are they doing not only hanging out with a rough and dishonest character like Cyveillancebot, but paying his creators, to boot?

The answer, as it often does, goes to the roots of wealth and power. Heterogeneous networks, human societies for example, exhibit a phenomenon called preferential attachment. What that means is that those who work very hard, or who are very lucky, acquire a bit more than their fellows. Once that occurs, it is normal for the blessed entities to acquire more wealth and power at a faster rate than the competition. And once they have attained a leading position, they normally wish to continue in that role. They have no desire to go back to mere 'everyman' status. So the powerful seek mechanisms and agents by which to maintain their position against perceived threats. The nature of the mechanisms and agents varies, but they have a common, unsavory thread.
So, in search of an understanding of the ethics involved, let's go all the way to an admittedly extreme case: the recently deposed regime in Iraq. Saddam Hussein and his inner circle employed a mechanism - torture, imprisonment and execution - in the hands of an agent - the Baath party, to counter what they perceived as threats to their continued favored position in that society. We don't really need to say much about the relative ethics or moral position of said regime: the recently uncovered mass graves speak more loudly than words ever could, but the basic script is 'powerful organization employs unsavory means to protect its turf'.

So let's go all the way back to our fine, upstanding American corporations, who (we should hope) bear no resemblance to the former Iraqi regime, and who, it would be thought, occupy a position 180 degrees opposed on the morality meter. We nevertheless see them employing an agent who is less than genteel. These companies use Cyveillance to protect their good names and brands and content from hackers, pirates and all manner of Net lowlife, which seems like a fair thing. If you are getting mugged on the sidewalk every day, no one would blame you for choosing to travel in the company of a large, tough-looking person.

But what happens when the former victim now chooses to pre-emptively mug others? It would seem that Cyveillance performs exactly that role: companies whose safeguards have been violated, and whose digital property has been misappropriated, now hire Cyveillance to do precisely that, and not just to the criminals and thugs who've been nailed with the victim's wallet and keys in their possession. Cyveillance widely and randomly (a 2000 press release brags that they continually 'mine and analyze' all 2.1 billion pages on the Web) mugs everybody who comes down the street, and if you don't happen to be the type who goes out, they come and find you. Which brings us to the time-honored slippery slope of morals and ethics.
Stealing is wrong. Misappropriating the property of another, intellectual or otherwise, is wrong. Yet companies with reputations as shining as the Washington Post and Ford Motor Co. don't seem to mind hiring a company to do, in their name, precisely what they so rightly complain about.
Copyright 2003, Chris Gulker
Updated 5/6/03; 1:08:27 PM