p2, aka the Paperless Project
The Paperless Project, since renamed p2, had its inspiration in events more than a decade ago at Apple, and in a more modern epiphany courtesy of the San Francisco Police Dept. In the mid-’90s, before Windows 95 began to ship and the wheels started coming off Apple’s high-margin Mac OS computer business, cash-rich Apple kept a stable of interesting computer scientists, including Gary Starkweather, inventor of the laser printer, in something called the Advanced Technology Group on the Cupertino campus.
As far as I could tell, ATG was almost completely disconnected from Apple’s actual business and even from Apple R&D. ATG guys were writing sophisticated programming languages that Apple never used, building a version of Linux (!) that Apple, of course, never used, and developing sophisticated network protocols that extended beyond TCP/IP, so that everything, e.g. your keyboard and monitor, had a network address, which Apple never used or promoted.
But enough ATG lore. Gary, an Apple fellow, had considerable funding, and in the year that magnetic storage began to equal the cost of paper storage, hired a raft of grad students to scan in everything he read – his mail, newspapers, books et al. – so that he could see what it was like to really eliminate paper. His verdict: not as bad as it sounded, and he challenged me to read a novel-length document on my laptop (I chose a Voyager HyperCard version of William Gibson’s Sprawl Trilogy). That got me intrigued about a paperless future.
But, truth be known, other than embracing email and the web, I lived no more paperlessly than anyone in ensuing years. That terribly important looking mail that kept coming in from banks, credit card cos., brokerages et al., just had to be kept, somewhere, just in case, right?
Then, a couple of years ago, our Land Rover was stolen while we were on vacation. The car was dumped in San Francisco, and despite our reporting it stolen promptly, SFPD put a bunch of parking tickets on it (which, of course, I never knew about). Eventually the car was recovered, and the unpaid parking ticket notices came rolling in, to the tune of around a thousand bucks. I immediately sent the court a copy of the stolen-vehicle report, figuring that would end the unpaid ticket requests, but, oh, no.
Turns out the courts require that I get a copy of every ticket, and submit each one with a letter and a copy of the police reports showing when the car was reported stolen and recovered. Since the thieves hadn’t bothered to send me the tickets, what was I to do? Short answer was to stand in line at a clerk’s window at SFPD for a couple of hours, and hope I had an accurate list of every ticket number – the only way to retrieve a copy. And it turned out these copies weren’t even in San Francisco – they were scans kept by a company called Choicepoint in Atlanta (yes, the same Choicepoint that was busy selling Californians’ credit card and personal information to Nigerian scammers last year). The clerk could call up scans in a web browser, and print them out for me, slowly and for a fee. This whole thing seemed like a revenue-generation exercise for the City and County of San Francisco, but what was a guy to do?
I called my insurance adjustor to see if there was a quicker way: it turns out there was, since insurance cos. can get into Choicepoint, too. He found all the tickets and emailed them to me as PDFs. I printed them, attached the required documentation, and FedExed the whole giant package to the court (as paper, of course… I wonder if they scanned it all and sent it back to Choicepoint). Case dismissed! The clerk even told me that I was one of a very few who ever managed to get a whole batch dismissed, so onerous was the process. Finding paper documents can be hard: finding electronic ones should be easy. This started the wheels turning… if courts and police departments were happy with PDF copies, why was I saving all this paper?
Another minor epiphany came at Macworld 2005 (I think). While talking to colleagues at the Adobe booth, I noticed a machine in Fujitsu’s nearby booth called a ScanSnap: a motorized scanner that scanned both sides of an 8.5×11 document in about a second, including color. It was impressive, and the last piece of the puzzle, though the whole plan didn’t begin to coalesce until some months later. Indeed, it wasn’t until July, stymied by a blizzard of incoming paper that was stacking up and getting misplaced faster than I could deal with it (part of a house remodel job), that I bought the ScanSnap, mated it with a copy of Acrobat 7 Pro (the scanner ships with Acrobat Standard, but I wanted the batch OCR and better document-assembly features) and began saving everything to a folder hierarchy on the G4 Mac Mini that became the project’s home.
I then tripped over a cool Mac PDF browser called Kip, since renamed Yep for legal reasons. Yep mates a Flickr-like keyword pane and a cursor-magnifier with an iPhoto-like view of a PDF collection. Yep also has some reasonably intelligent AI that begins to make associations for you as it indexes OCR’d documents. For example, the second time or so I added ‘family trust’ as a keyword to a fax from family lawyer Friedman, Yep began doing it for me automatically. Drifts and stacks of paper began to disappear from gulker.com World HQ, especially after I worked up the guts to recycle everything that didn’t have to be saved as paper (like IRS tax receipts).
Eventually, we began to experience small victories pulling up documents from our archive (we’d have been hard-pressed to find the same paper docs in drawers and piles around here). A bigger advantage may actually be that this system satisfies a concept I learned in my David Allen time-management course (Getting Things Done). If you don’t have a trusted system where you can park important things knowing that you can absolutely get them back when you need them, those things will kind of gnaw at you, chew up cycles and brain RAM and otherwise make you less efficient.
So, the new p2 system uses the ScanSnap as the normal front end, and each evening’s mail is opened and dispatched to the recycle bin as soon as relevant docs go through the scanner. It takes only a few seconds or minutes, and I can stop thinking about those documents immediately. Our new Canon MP530 inkjet printer is a multifunction device with a batch scanner that makes it easier to pull in long documents. Every so often I run Acrobat’s batch OCR, which automatically dumps its output into Yep’s ‘Pending’ folder: Yep eventually uses its AI and Spotlight to index everything, and it’s become rare not to be able to find a document quickly.
Meantime, I’ve moved every bank, credit card, brokerage, insurance and you-name-it service to ‘email/electronic statement delivery,’ which usually means PDFs, which go straight into Yep. This year, we’ll even be doing our taxes via Yep’s PDF copies (with paper backups per IRS). I just search on ‘taxes’ and ‘2006’, and up comes everything I need to send the tax man. Yes!
Schematically, p2 looks like this, with paper moving from ScanSnap to PDF (I omit the recycle/shred step):

The ScanSnap bitmap PDF goes to Acrobat, where it’s batch OCR’d (with amazing accuracy, IMHO: Acrobat even catches those thermal-printer credit card receipts surprisingly well) and dumped into a watched folder (Yep calls it ‘Pending’), where the PDF’s text is indexed, a little AI magic (usually a good thing) happens and, voilà, our docs are now searchable.
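If you wanted to script the “dump into a watched folder” hand-off yourself, it might look something like the rough Python sketch below. To be clear, this is not how Acrobat or Yep do it internally: the function name and both folder arguments are my own hypothetical stand-ins, and it simply assumes the OCR’d PDFs land in some output folder before being moved to wherever the indexer is watching.

```python
import shutil
from pathlib import Path


def sweep_to_pending(ocr_out: Path, pending: Path) -> list[Path]:
    """Move every *.pdf in ocr_out into pending and return the new paths.

    Both folders are hypothetical: ocr_out stands for wherever the batch
    OCR writes its results, pending for the folder the indexer watches.
    """
    pending.mkdir(parents=True, exist_ok=True)
    moved = []
    for pdf in sorted(ocr_out.glob("*.pdf")):
        target = pending / pdf.name
        # Never overwrite an earlier scan that happens to share a name;
        # append -1, -2, ... until the name is free.
        counter = 1
        while target.exists():
            target = pending / f"{pdf.stem}-{counter}{pdf.suffix}"
            counter += 1
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved
```

Non-PDF files are left where they are, and a name clash gets a numeric suffix rather than clobbering an older scan, which seems like the right default for an archive you intend to trust.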
Just for safety’s sake, the whole works is backed up – original scans, OCR’d scans, Yep’s data hierarchies – and many documents are copied to my .Mac account as well (nice feature – if you absolutely need a document to prove to someone somewhere that you did something, you can grab it wherever you have connectivity – it’s also offsite backup).
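For anyone rolling their own version of the backup step, here is a minimal Python sketch of an incremental copy. It is only an illustration, not what I actually run: the `back_up` function and its two folder arguments are hypothetical, and it assumes the backup folder lives outside the archive folder (say, on another disk or a synced directory).

```python
import shutil
from pathlib import Path


def back_up(archive: Path, backup: Path) -> int:
    """Copy new or changed files from archive into backup; return the count.

    Assumes backup is NOT inside archive. A file is skipped when the
    backup copy has the same size and is at least as new as the source.
    """
    copied = 0
    for src in archive.rglob("*"):
        if not src.is_file():
            continue
        dest = backup / src.relative_to(archive)
        if dest.exists():
            s, d = src.stat(), dest.stat()
            if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also carries over modification times
        copied += 1
    return copied
```

Nothing is ever deleted from the backup side, so a file removed from the archive by mistake is still recoverable; the trade-off is that the backup only grows.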
You can see how we’re doing with p2 by querying the blog. There was, at least briefly, a small community discussing the project, and one guy even offered an AppleScript to help people with Acrobat Standard process their scans in a more automated fashion. I like the system because it’s easy to set up and use, not a lot of work to maintain, and can do more sophisticated things (like put a bunch of related documents together quickly, in a batch). The system has yet to bite us, but, user beware – we may not have thought of (or encountered) every case. Hopefully p2 will continue to simplify our life.








