Case study · OCR

ROCR: Turning State-of-the-art OCR into Automated Form Processing


Riders for Health offers medical transportation and logistics services in Africa, especially in rural areas, using fleets of motorcycles to handle rough terrain. However, their riders don't just get to be awesome riding motorcycles around: they also have to spend considerable time keeping careful paper logbooks of when, where, and what they collect and transport.

To assist with this, we built ROCR (Riders for Health OCR), a prototype automated form processing and handwriting prediction tool that helps community health workers collect digital data more efficiently. The ultimate goal of the project is to let Riders for Health do what they do best (manage the logistics of getting into remote healthcare outposts and picking up medical samples) by minimizing the time spent in the field writing in paper logbooks.

ROCR extracts and predicts the key handwritten information from regularly used forms. The figure below shows an example of this process. A photo of the World Health Organization case report for confirmed COVID-19 cases is passed through the ROCR tool, which extracts and predicts two key fields: the reporting country and the unique case identifier.

ROCR tool results on WHO COVID-19 case report
Actual ROCR tool results using Azure Form Recognizer OCR model on WHO case report for confirmed COVID-19 cases. Form data shown was filled out by our team and is not real.

This task presents several challenges: How do you find the key field you are searching for in each new image? How do you predict handwriting (which is much more difficult than computer-generated text)? Can you use domain knowledge of the form's use case to improve the predictions?

How ROCR works

To address these challenges, ROCR processes forms using the following steps:

  1. Image Alignment: Warp the input form image to align with the blank template form
  2. OCR / Handwriting Prediction: Predict the form's text
  3. Key Field Extraction: Find the OCR-predicted text in the regions of interest
  4. Post Processing: Apply domain knowledge to improve the OCR prediction outputs

ROCR pipeline: image alignment, OCR prediction, post processing
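To make step 3 more concrete, here is a minimal sketch of key field extraction. It is not ROCR's actual code. It assumes the image has already been aligned, so that OCR bounding boxes and regions of interest share the template's pixel coordinates. The `Box`, `Word`, and `extract_key_fields` names, the region coordinates, and the sample OCR output are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in template pixel coordinates (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def center(self):
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)


@dataclass(frozen=True)
class Word:
    """One OCR prediction: the recognized text plus its bounding box."""
    text: str
    box: Box


def extract_key_fields(words, regions):
    """Collect, per region of interest, the OCR words whose center falls inside it."""
    fields = {}
    for name, region in regions.items():
        hits = [w for w in words if region.contains(*w.box.center())]
        # Read left to right so multi-word answers keep their order.
        hits.sort(key=lambda w: w.box.left)
        fields[name] = " ".join(w.text for w in hits)
    return fields


# Made-up regions and OCR output, not taken from a real form.
REGIONS = {
    "reporting_country": Box(200, 100, 600, 140),
    "case_id": Box(200, 160, 600, 200),
}
WORDS = [
    Word("Reporting", Box(20, 105, 110, 135)),  # printed label, outside every region
    Word("Leone", Box(300, 104, 370, 136)),
    Word("Sierra", Box(210, 104, 290, 136)),
    Word("SL-00123", Box(212, 165, 330, 196)),
]

print(extract_key_fields(WORDS, REGIONS))
# {'reporting_country': 'Sierra Leone', 'case_id': 'SL-00123'}
```

Testing the center of each word rather than its full box is a deliberate simplification: handwriting often spills over a field's printed border, and a center test tolerates that. The trade-off is that it depends heavily on good alignment in step 1.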

ROCR uses two off-the-shelf OCR engines, Google Cloud Vision and Azure Form Recognizer, but it needs "glue" code (proper image alignment, key field extraction, OCR cloud response handling, and post processing) to turn their output into reasonable results.
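As an illustration of the post processing piece, the sketch below shows two simple ways domain knowledge can clean up a raw OCR prediction: snapping a field to a list of allowed values, and replacing letters that look like digits in a field known to be numeric. This is a sketch under assumptions, not the rules ROCR applies. The country list, the lookalike table, the function names, and the idea that an identifier is purely numeric are all hypothetical.

```python
import difflib

# Hypothetical domain knowledge: the values this field is allowed to take.
KNOWN_COUNTRIES = ["Kenya", "Lesotho", "Liberia", "Nigeria", "Zambia", "Zimbabwe"]

# Letters that handwriting OCR can confuse with digits (illustrative, not exhaustive).
DIGIT_LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}
)


def snap_to_vocabulary(prediction, vocabulary, cutoff=0.6):
    """Return the closest allowed value, or the raw prediction if none is close enough."""
    lowered = {value.lower(): value for value in vocabulary}
    matches = difflib.get_close_matches(
        prediction.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else prediction


def clean_numeric_id(prediction):
    """Drop spaces and map digit lookalikes to digits in an all-numeric field."""
    return prediction.replace(" ", "").translate(DIGIT_LOOKALIKES)


print(snap_to_vocabulary("Zarnbia", KNOWN_COUNTRIES))  # Zambia
print(snap_to_vocabulary("???", KNOWN_COUNTRIES))      # ??? (no close match, left as-is)
print(clean_numeric_id("2O2l 05"))                     # 202105
```

Returning the raw prediction when nothing clears the cutoff keeps a bad guess visible for human review instead of silently replacing it with a wrong but plausible value.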

Key technical takeaways


About the project

ROCR was built for Riders for Health, an international nonprofit that offers medical transportation and logistics services in Africa. It's difficult to overstate the challenge of bringing healthcare to rural villages in developing countries. Riders for Health overcomes this challenge mostly with a fleet of motorcycles capable of handling rough terrain effectively and efficiently. Working with Riders for Health is basically like working with superheroes riding motorcycles, with kindness and humility to boot.

Riders for Health health worker on motorcycle
Photo courtesy of www.riders.org

This work is driven by DataKind, a nonprofit committed to the application of data science for social good. DataKind brings together pro bono data scientists and social impact organizations that benefit from their time and expertise. Alex Fried, Karry Lu, Amy Roberts, Alexander Sack, and Anna Dixon formed the team that built and tested the ROCR tool. ROCR is part of a portfolio of projects designed to impact community health workers.