Image processing to facilitate Optical Character Recognition
2009
Master Semester Project
Project: 00181

Optical Character Recognition (OCR) is extensively used in the automated digitisation of libraries. This task raises several difficulties. However, since we have strong prior knowledge about the data to be recovered (the text), we expect to be able to recover it fairly well.
In this project, we are interested in developing preprocessing steps that facilitate OCR. In particular, we will focus on the distortions that appear in the central part of a book (see the picture).
Things that can be done:
- Estimate and correct geometric distortions
- Compensate for contrast variations
- Deblur parts that are out of focus
This work could be implemented as a Java plugin for ImageJ.
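As an illustration of the contrast item in the list above, one simple possibility is to divide each pixel by the mean of its neighbourhood, which flattens slowly varying shading such as the shadow near the binding. The sketch below is plain Java without any ImageJ dependency. The class name ShadingCompensation, its methods, and the radius parameter r are hypothetical names chosen for this example, and local-mean normalisation is an assumed technique, not necessarily the one the project would adopt.

```java
/** Hypothetical helper (not part of ImageJ): flattens slowly varying shading on a grayscale page. */
final class ShadingCompensation {

    /** Mean of each pixel's (2r+1) x (2r+1) neighbourhood, with the window clipped at the image border. */
    static double[][] localMean(double[][] img, int r) {
        int h = img.length, w = img[0].length;
        // Summed-area table: sat[y][x] holds the sum of img over rows < y and columns < x.
        double[][] sat = new double[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                sat[y + 1][x + 1] = img[y][x] + sat[y][x + 1] + sat[y + 1][x] - sat[y][x];
            }
        }
        double[][] mean = new double[h][w];
        for (int y = 0; y < h; y++) {
            int y0 = Math.max(0, y - r), y1 = Math.min(h, y + r + 1);
            for (int x = 0; x < w; x++) {
                int x0 = Math.max(0, x - r), x1 = Math.min(w, x + r + 1);
                // Window sum from four table look-ups, then divide by the clipped window area.
                double sum = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0];
                mean[y][x] = sum / ((y1 - y0) * (x1 - x0));
            }
        }
        return mean;
    }

    /** Divides each pixel by its local mean (assumed background estimate). */
    static double[][] compensate(double[][] img, int r) {
        double[][] background = localMean(img, r);
        int h = img.length, w = img[0].length;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // The small floor avoids division by zero in completely black regions.
                out[y][x] = img[y][x] / Math.max(background[y][x], 1e-6);
            }
        }
        return out;
    }
}
```

With r chosen larger than the stroke width of the characters, the paper background maps to values close to 1.0 while text stays darker, whatever the local illumination. Inside a plugin, the input array would be filled from the grayscale pixel data of the scanned page.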
Supervisors:
- Matthieu Guerquin-Kern, matthieu.guerquin-kern@epfl.ch, 35142, BM 4.140
- Michael Unser, michael.unser@epfl.ch, 021 693 51 75, BM 4.136