It is one of those BIG projects that one tries to realize again and again but never really gets into: the paperless office. For years now I have been collecting and sorting bills, insurance documents, banking documents and so on in two big black paper boxes. New documents first land on my desk, and after dealing with them I throw them in a drawer. Then, once a year, I waste one or two evenings sorting all these documents into folders inside my paper boxes. It’s a hideous task, really.
This is about to end now! (I hope ;-)) I aim to scan all sorted and future documents, save the PDFs, and shred the paper documents afterwards. I will only keep the paper originals of important documents, like certificates and so on.
To reach my goal, I did some searching on the net, gathering ideas for a long-term solution. I came to the conclusion that one needs to take care of three things:
- A scanner
- Scanning software with text recognition (OCR)
- Software to archive, back up and search (also known as a DMS, Document Management System)
Everyone who owns a moderately recent smartphone already owns a scanner, sort of. The App Store and the Play Store offer a multitude of apps that not only scan documents with the smartphone camera, but also help with sorting and searching those documents.
A few examples:
– Readdle Scanner Pro
– Adobe Scan
The results are often surprisingly good if the ambient light is sufficient. Even perspective errors are corrected on the fly. And it is a very fast method: just point your smartphone at the desired document and off you go.
There are two reasons why I chose another solution for my private office.
First, the quality is good, but does not match the resolution and color-accuracy of a decent scanner.
Second, I plan to archive my documents for years, even decades to come, and want to be able to search and view them even 30 years from now. There is absolutely no guarantee that these app developers will support their apps for that long, or that a device like a smartphone will even still exist in 30 years in the form we know now.
That is why I have decided to save my scans as plain .pdf documents in an ordinary folder system. Even with technology advancing this fast, files and folders are not going anywhere anytime soon. How I managed to get cloud support and device-independent access, well, read on.
If one is serious about scanning a lot of old documents, a dedicated document scanner should be considered. For now, I have not taken this step, because my old Brother all-in-one printer has a scanner with a document feeder. But if the latest blog reviews and recommendations from DMS developers are right, the best in class are the ScanSnap scanners from Fujitsu. I personally only have experience with the devices from Epson, as we use these daily in the clinic and they get the job done.
Here are the affiliate links for a Fujitsu and an Epson document scanner, as well as a link to my trusted Brother all-in-one printer, which so far has not disappointed and is a whole lot cheaper.
Scan-software and DMS
Using a smartphone app is already an all-in-one solution. If you buy a dedicated document scanner, chances are you get good software for PDF creation with text recognition via OCR. One still has to take care of sorting and archiving the documents. During my search around the net, I stumbled upon a few interesting software solutions. A lot of them aim at business clients and come with a hefty subscription fee. For home users I found these solutions reasonable:
Regardless of the solution, it is important that the software provides text recognition via OCR. This technique makes the generated PDFs searchable, comparable to the search in modern mail providers like Gmail.
My personal solution
I decided against dedicated document management software for my home office. Again, I don’t trust these developers to still be on the market in 20 to 30 years, and many of them use some kind of proprietary system to manage your documents.
Fortunately, I stumbled upon one of the best open-source tools I have discovered in a long while. The software is called NAPS2, and it provides an easy but still highly customizable one-click scan solution for nearly every scanner. It generates searchable PDFs, and the OCR engine is good enough for my personal needs.
NAPS2 is so good that I instantly donated $5 to the developer, and I will do so again if it holds up over the course of this project.
The generated PDFs are saved directly from NAPS2 to my Google Drive via Drive File Stream. I created a few sub-folders like “banking”, “insurances”, “bills” and so on. The filenames combine the date of the document and a short description, like “20181019-car-insurance.pdf”.
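The naming scheme above is easy to express in a few lines of Python. This is only a minimal sketch and not something NAPS2 does for me: the function name `build_scan_path`, the `scans` root folder and the rule for shortening the description (lowercase, hyphens instead of spaces) are assumptions made up for illustration.

```python
import re
from datetime import date
from pathlib import Path


def build_scan_path(root: Path, category: str, doc_date: date, description: str) -> Path:
    """Return root/category/YYYYMMDD-short-description.pdf (hypothetical helper)."""
    # Lowercase the description and collapse every run of characters that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # The date comes first so that files sort chronologically by name.
    return root / category / f"{doc_date:%Y%m%d}-{slug}.pdf"


if __name__ == "__main__":
    path = build_scan_path(Path("scans"), "insurances", date(2018, 10, 19), "Car insurance")
    print(path.as_posix())  # scans/insurances/20181019-car-insurance.pdf
```

Putting the date first in the YYYYMMDD form means a plain alphabetical listing of a folder is also a chronological one, which should still work in any file browser 30 years from now.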
Theoretically, I wouldn’t even need folders and file descriptions, because Google’s search works like a charm with the searchable PDFs. But again, who knows where Google will stand in 20 years. I will back up the files to a local hard drive, maybe on a yearly basis.
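The yearly backup could be as simple as copying the whole folder tree to the local drive. The following is a rough sketch under a few assumptions: the function name `backup_archive`, the `scans-YYYY` target folder and both example paths are hypothetical, and `dirs_exist_ok` requires Python 3.8 or newer.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_archive(source: Path, backup_root: Path, today: date) -> Path:
    """Copy the scan folder to backup_root/scans-YYYY and return the target path."""
    target = backup_root / f"scans-{today.year}"
    # dirs_exist_ok=True lets a second run in the same year refresh the copy
    # instead of failing because the target folder already exists.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # Both paths are placeholders; adjust them to the real Drive folder
    # and the real backup drive before running.
    source = Path("G:/My Drive/scans")
    if source.exists():
        print(backup_archive(source, Path("D:/backup"), date.today()))
    else:
        print(f"Source folder {source} not found, nothing copied.")
```

Keeping one folder per year means an older backup is never overwritten by a newer one, at the cost of some extra disk space.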
Now I am curious how fast I will be able to scan my whole archive of paper documents from the last 20 years or so. At least it’s a good opportunity to get rid of old or expired documents.
The only limiting factor will probably be my all-in-one printer-scanner, which was most likely not built for a task like this. I will report how far I get.