BookBridge: Next-generation OCR for Books

Our BookBridge software lets libraries and publishers transform print collections into accessible ebooks in minutes. BookBridge is faster, cheaper, and more reliable than double keying or OCR with manual cleanup.

Technology

BookBridge takes a new approach to book digitization, borrowing ideas from bioinformatics. Our software shines when provided with two versions of the same book (e.g., hardback and large print). In this case, character recognition accuracy often reaches 100% and the resulting ebook is indistinguishable from an ebook prepared by a publisher.

Technical details about our approach are available in Riddell, A. B. (2022). Reliable editions from unreliable components: Estimating ebooks from print editions using profile hidden Markov models. In Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries.
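
To give a rough sense of why two versions of the same book help, here is a minimal sketch in Python. It is not BookBridge's method, which the paper above describes in terms of profile hidden Markov models. It uses the standard library's difflib as a simple stand-in for sequence alignment, and the function name find_disagreements and the sample strings are hypothetical. The idea is that OCR errors in two independently scanned editions rarely fall in the same place, so the spots where the aligned transcriptions disagree are the spots that need a closer look.

```python
from difflib import SequenceMatcher


def find_disagreements(ocr_a: str, ocr_b: str):
    """Align two OCR transcriptions and return the spans where they differ.

    Illustrative only: difflib stands in for a proper sequence-alignment
    model and is not the approach BookBridge itself uses.
    """
    matcher = SequenceMatcher(None, ocr_a, ocr_b, autojunk=False)
    disagreements = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            # Keep the kind of difference and the conflicting text from each source.
            disagreements.append((tag, ocr_a[a_start:a_end], ocr_b[b_start:b_end]))
    return disagreements


# Hypothetical OCR output from two editions, each with a different error.
hardback = "The rnodern reader expects clean text."
large_print = "The modern reader expects c1ean text."

for tag, text_a, text_b in find_disagreements(hardback, large_print):
    print(tag, repr(text_a), repr(text_b))
```

Each reported span is a place where at least one edition was misread. Deciding which reading is correct takes additional evidence, such as a third source or a statistical model of likely OCR errors.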

Performance

BookBridge converts books into accessible, standards-compliant ebooks (EPUB 3.3, WCAG 2.2 AA, eBraille 1.0). EPUBs produced by BookBridge are free from print artifacts such as end-of-line hyphenation and running headers. Our character recognition accuracy meets or exceeds that of Tesseract, AWS Textract, Google Document AI, and Azure Document Intelligence.

Cost

BookBridge costs a fraction of double keying. We currently offer two usage-based plans:

Glyphlab's Mission

There are more than 113 million books in existence. Only a fraction are available in an accessible format: fewer than 10% in the US and fewer than 1% in the Global South.

Glyphlab Inc. is a Public Benefit Corporation founded in 2022. Our mission is to grow the number of accessible books available worldwide.
