Thursday, May 15, 2003

I'm a geek: I read referrer logs along with my copy of the NY Times in the morning. And maybe that data point explains why Cyveillancebot hasn't visited since an apparently real, live human at Cyveillance read this Weblog a couple of days ago.

Why would that cause the apparent (and welcome) cessation of Cyveillancebot activity? Well, one theory would be that Cyveillance knows that it is, technically at least, in the business of infringing copyright, and is lying low now that it knows that I know.

One of the things I know is that access logs show a different pattern when a page is opened from my server than when a copy of that page is opened from a file saved to a hard drive. So here's what it looks like when the page is opened from the server:

63.148.99.229 - - [13/May/2003:12:24:28 -0700] "GET / HTTP/1.0" 304 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

63.148.99.229 - - [13/May/2003:12:24:28 -0700] "GET /graphics/logo_blu_bg_shado_116.png HTTP/1.0" 304 - "http://www.gulker.com/" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

[snip - the full log sequence is here - snip]

63.148.99.229 - - [13/May/2003:12:24:29 -0700] "GET /graphics/right_bg.jpg HTTP/1.0" 304 - "http://www.gulker.com/" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

The first item in the sequence is a request for "/", the short way to request gulker.com's default home page, and all the rest are requests for the graphical bits and pieces that comprise the page. So what happens if someone saves my page to their hard drive, and then opens it in a browser?

What happens is that you get the same sequence in the access log, with one exception: there is no request for "/" - the browser already has it, it only needs the graphics and other 'furniture' to draw the page.

In a recent 2-day period, I noticed that the graphics-only sequence was requested 554 more times than the full sequence. So, more than 250 times a day, a browser somewhere in the world was pulling my page from a local file, rather than from my server.
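For anyone who wants to run the same comparison on their own logs, here is a minimal sketch in Python. It is an illustration, not the method used for the count above. It assumes the log is in the Combined Log Format shown earlier, that a "visit" is a burst of requests from one IP address and user agent with no gap longer than 30 seconds, and that "graphics" means paths ending in common image extensions. The names `classify_bursts`, `gap_seconds` and `IMAGE_EXTS` are made up for this example.

```python
import re
from collections import defaultdict
from datetime import datetime

# Combined Log Format, as in the sample lines above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: these extensions stand in for "graphical bits and pieces".
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif")


def parse_line(line):
    """Return (ip, agent, timestamp, path), or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return m["ip"], m["agent"], ts, m["path"]


def _tally(paths, counts, page_path):
    """Classify one burst of request paths and update the counters."""
    if not paths:
        return
    has_page = page_path in paths
    has_assets = any(p.lower().endswith(IMAGE_EXTS) for p in paths)
    if has_page and has_assets:
        counts["full"] += 1
    elif has_assets:
        counts["graphics_only"] += 1


def classify_bursts(lines, gap_seconds=30, page_path="/"):
    """Count bursts that include the page itself versus graphics-only bursts."""
    by_client = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            ip, agent, ts, path = rec
            by_client[(ip, agent)].append((ts, path))

    counts = {"full": 0, "graphics_only": 0}
    for hits in by_client.values():
        hits.sort()
        burst, last = [], None
        for ts, path in hits:
            # A long pause ends the current burst and starts a new one.
            if last is not None and (ts - last).total_seconds() > gap_seconds:
                _tally(burst, counts, page_path)
                burst = []
            burst.append(path)
            last = ts
        _tally(burst, counts, page_path)
    return counts
```

Feeding it the lines of an access log (for example `classify_bursts(open("access_log"))`, where the file name is hypothetical) returns the two counts; subtracting one from the other gives the kind of excess described above. The 30-second window is a guess, and several people behind one proxy with the same browser would be lumped together, so treat the numbers as rough.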

One reason that this would happen is that the browser has cached my page, but not the graphics files associated with it. I've done some experiments with this (Mozilla, IE, Safari), and the behavior depends on the type of browser and how its cache prefs are set. I'm sure that some of the time, particular browser versions, set in just the right fashion, are causing this behavior, when people come back to my page before it's expired from their cache.

But there is another reason this could happen. If someone were to download my page, store it on their own hard drive or local server, and then open it from there, you would see the same sequence - no "/".

You might see this if, for example, Cyveillance - who have pulled more than a thousand files from my server including essays, articles, research, presos etc. without ever (until the Tuesday human visit) downloading the attendant graphics files - were to post my files to their internal, private network, and allow employees and clients to access them.

If that's what they're doing, I think that is copyright infringement - many of the files they have pulled (and continue to pull down, over and over again) are copyrighted. It seems to me that if Cyveillance were to post something like "This imbecile is spouting anti-DMCA blasphemy at http://www.gulker.com/ " on their private or public servers, that would probably not be a copyright violation - their clients would be reading my opinions from my server. In this case, they are being paid to find inimical opinion on the Web.

But if they place my original work on their server, and then distribute it to the clients who pay them (large) fees, that probably is a copyright violation - they are being paid for distributing my copyrighted material without permission, which, of course, is exactly what they and their clients object to so strenuously. So, next step is a little detective work to figure out who owns the IP addresses that are pulling my stuff in this fashion... 63.148.99.229, BTW, is registered to Cyveillance according to arin.net, and is almost certainly one of their firewall machines...
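The lookup at arin.net can also be done from a script. The sketch below is one way to do it, assuming ARIN's public WHOIS server at whois.arin.net still answers plain queries on TCP port 43 (the WHOIS protocol, RFC 3912). The function names `whois_query` and `parse_fields` are invented for this example, and whatever the registry reports today may not match what it reported in 2003.

```python
import socket


def whois_query(query, server="whois.arin.net", port=43, timeout=10):
    """Send one WHOIS query and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The protocol is a single line of text terminated by CRLF.
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(reply):
    """Collect 'Key: value' lines, keeping the first value seen for each key."""
    fields = {}
    for line in reply.splitlines():
        # Skip comment lines and anything that is not a key/value pair.
        if line.startswith(("#", "%")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), value.strip())
    return fields
```

Calling `parse_fields(whois_query("63.148.99.229"))` should give a dictionary of whatever fields the server returns, typically including things like the network range and the registered organization. Running it over each address that shows the graphics-only pattern is one way to do the detective work.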
10:13:50 AM





Updated 4/16/04; 1:19:23 PM

Chris Gulker's view from Silicon Valley - in words and pictures




