[Digitalarchivists] Update on book scanning

kprichard kprichard at gmail.com
Wed May 31 21:09:11 UTC 2017


https://noisebridge.net/wiki/30_May_2017:_Test_a_copy_of_PDFScanner

Linked from a page documenting previous work:

https://noisebridge.net/wiki/Book_Scanner_Software

On Wed, May 31, 2017 at 7:39 AM, newmy51 at gmail.com <newmy51 at gmail.com>
wrote:

> Super cool!  Would love to see some photos or screenshots.  Any of this
> excellent progress added to the wiki?
>
> Best from Syracuse,
>
> -Danny
>
> On May 31, 2017 7:08 AM, "kprichard" <kprichard at gmail.com> wrote:
>
>> Tonight I finished rebuilding the dorkroom mac mini by reinstalling macOS
>> Sierra. Previously I replaced the crashed HD with a donated SSD. Specs are:
>> 8GB RAM, Core2Duo 2.4 GHz, 128GB SSD.  It boots quickly and is faster
>> overall.  I renamed it to 'BookScannerMacMini'.
>>
>> Since my last emails I have continued looking for image-to-PDF software,
>> and recently found another option which looks promising: PDFScanner (macOS).
>>
>> I put it through the same test as ABBYY FineReader Pro, writing up a
>> report and producing a PDF (linked on the wiki):
>>
>> https://noisebridge.net/wiki/30_May_2017:_Test_a_copy_of_PDFScanner
>>
>> Results are acceptable. Not nearly as accurate as ABBYY FineReader, but
>> substantially better than Tesseract from the command line.  Sorry there
>> are no exact quantitative results, just my sense from having looked at
>> this problem for more than five minutes.
>>
>> Cost is $16, which I've spent.  Appears to be faster than FineReader.
>>
>> Next steps:
>> - Hook the mini up to the twin Canons and get scan.py working again
>> - Add a post-process pipeline with a filesystem watcher, and a script to
>> pump the image files through ImageMagick or GIMP: autocrop, align, deskew,
>> autolevels, contrast
>> - Run some books through and get PDFs
>>
>> PDFScanner is as close to user-friendly as anything I've seen, certainly
>> more so than ABBYY FineReader.  A set of files can be drag-dropped onto it,
>> and it automatically starts OCRing them.  If they're all oriented and
>> cropped ahead of time, then the only remaining step is to press Cmd-S to
>> export as PDF.
>>
>> We are getting close to having a fully functional book scanner.
>>
>>
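The post-process step in the "Next steps" list above (a watcher plus an ImageMagick pass) could be sketched roughly as follows. This is only an illustration under assumptions, not the scanner's actual script: the function names and the fuzz, deskew, and contrast values are made up for the example, simple directory polling stands in for a real filesystem watcher, and the "align" step is not covered. The ImageMagick options used (-deskew, -fuzz, -trim, +repage, -auto-level, -contrast-stretch) are standard ones, and the program name is a parameter because ImageMagick 7 uses "magick" where older versions use "convert".

```python
import subprocess
import time
from pathlib import Path

# Assumption: the cameras drop files with one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def build_command(src, dst, program="convert"):
    """Return an ImageMagick argument list for one page image."""
    return [
        program, str(src),
        "-deskew", "40%",               # straighten slightly rotated pages
        "-fuzz", "10%", "-trim",        # autocrop near-uniform borders
        "+repage",                      # reset canvas geometry after the trim
        "-auto-level",                  # stretch levels to the full range
        "-contrast-stretch", "1%x1%",   # clip 1% at each end for more contrast
        str(dst),
    ]


def new_images(inbox, seen):
    """Return image files in inbox not yet in seen, and mark them as seen."""
    found = sorted(
        p for p in Path(inbox).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.name not in seen
    )
    seen.update(p.name for p in found)
    return found


def watch(inbox, outbox, interval=2.0, run=subprocess.run):
    """Poll inbox forever, writing a processed copy of each new image to outbox."""
    seen = set()
    Path(outbox).mkdir(parents=True, exist_ok=True)
    while True:
        for src in new_images(inbox, seen):
            run(build_command(src, Path(outbox) / src.name), check=True)
        time.sleep(interval)
```

Calling watch("scans/incoming", "scans/processed") (both paths hypothetical) would then process pages as they arrive. One caveat with polling: a file may be picked up while the camera is still writing it, so a real version would probably want to wait until the file size stops changing.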