Iris is an open-source distributed optical character recognition (OCR) pipeline that makes it easy to preprocess and OCR scans of text documents in a variety of ways. Iris builds on the Celery distributed task queue to provide a lightweight pipeline that performs reasonably well across a wide range of use cases.
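
To make the idea of a distributed preprocess-then-OCR pipeline concrete, here is a minimal sketch. It does not use Iris or Celery: it swaps in Python's standard-library thread pool so that it runs without a message broker, and every name in it (binarize, recognize, run_pipeline, process_batch) is a hypothetical stand-in rather than part of the Iris API. The "scans" are plain strings and the "OCR" step is a toy transformation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


# Hypothetical stand-ins for pipeline stages; these are NOT Iris functions.
def binarize(page: str) -> str:
    """Toy preprocessing step: collapse runs of whitespace in the 'scan'."""
    return " ".join(page.split())


def recognize(page: str) -> str:
    """Toy OCR step: upper-case the text to mark it as processed."""
    return page.upper()


def run_pipeline(page: str) -> str:
    """Chain the stages for a single page: preprocess, then recognize."""
    return recognize(binarize(page))


def process_batch(pages: List[str], workers: int = 2) -> List[str]:
    """Fan pages out to a pool of workers; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_pipeline, pages))


if __name__ == "__main__":
    print(process_batch(["  hello   world ", "iris\tocr"]))
```

In a Celery-based design, each stage would typically be registered as a task and the worker pool would be separate processes or machines fed by a broker; the chaining and fan-out shown here are only an analogy for that arrangement.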

To get started with Iris, first read The Iris Architecture, then use Installing Iris to start provisioning your pipeline. Using Iris describes the different ways to use the pipeline.

The full documentation tree can be seen here.