February 11, 2022

15 tools for document Deskewing and Dewarping

Input for document processing tasks such as OCR, table detection, or text segmentation is sometimes a scan or a hand-held photo that does not have an ideal perspective: the page is rotated (skewed) or spatially distorted in some way (warped).

If you are looking for my recommendations, go straight to the last section of this article, Summary and recommendations.

This article was inspired by the OCR-related projects collected in the awesome-ocr list. To give readers an intuition about the popularity of each project, its number of GitHub stars is included (as of Feb 2022, the time of writing this article). To differentiate actively developed projects from ones that no longer get commits, the date of the last commit is included as well.

The typical approach for deskewing

Deskewing is typically done by using Canny Edge Detection and the Hough Transform to determine the angle of rotation (skew) and then rotating the image by the same angle in the opposite direction.
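
To make the idea concrete, here is a minimal pure-Python sketch of the voting step behind a Hough-based skew estimate. It assumes that edge pixels have already been extracted (for example by Canny Edge Detection) and are passed in as (x, y) pairs. The function name estimate_skew_degrees and its parameters are illustrative and do not come from any of the projects in this article, which rely on optimized library implementations instead.

```python
import math


def estimate_skew_degrees(points, angle_step=0.5, max_skew=45.0):
    """Estimate the dominant line angle (in degrees) of a set of edge points.

    For every candidate angle, each point votes for the offset (rho) of the
    line with that angle passing through it. The angle whose best offset
    collects the most votes wins.
    """
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * max_skew / angle_step)) + 1
    for i in range(steps):
        angle = -max_skew + i * angle_step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        votes = {}
        for x, y in points:
            # Points lying on one line tilted by `angle` share the same rho.
            rho = round(y * cos_t - x * sin_t)
            votes[rho] = votes.get(rho, 0) + 1
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle
```

Rotating the image by the negative of the returned angle levels the text lines. The angle is measured in the coordinate system of the points, so with image coordinates (y pointing down) the visual sense of rotation is flipped.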

The typical approach for dewarping

Reconstruction of the spatial (3D) structure of the document is typically done using Deep Learning.

List of approaches presented in this article

Dewarping

Page dewarp (1.1k stars)

Last commit: Oct 2016, but the reworked version is actively developed (last commit: 24 Jan 2022)

page_dewarp - Page dewarping and thresholding using a “cubic sheet” model

before and after dewarp

Read more here: Page dewarping
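
The "cubic sheet" idea can be illustrated with a few lines of Python. This is only a sketch, under the assumption that the horizontal profile of the page is a cubic polynomial pinned to zero at both edges and controlled by its two end slopes. The names cubic_profile, alpha, and beta are illustrative, and the actual project estimates further parameters that are not shown here.

```python
def cubic_profile(x, alpha, beta):
    """Height of the page surface at x in [0, 1].

    The cubic is zero at x = 0 and x = 1, has slope `alpha` at x = 0
    and slope `beta` at x = 1.
    """
    a = alpha + beta
    b = -2.0 * alpha - beta
    c = alpha
    # Horner evaluation of a*x^3 + b*x^2 + c*x.
    return ((a * x + b) * x + c) * x
```

With alpha = beta = 0 the page is flat, and a positive alpha with a negative beta gives a single smooth bulge, which is roughly what an open book page looks like.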

NOTE: The original project is written in Python 2.

Because mzucker's original work was written in Python 2 and is no longer developed, there was an initiative to renovate the original scripts. The result is page-dewarp, which is available on PyPI and is pip-installable (pip install page-dewarp).

MORAN (579 stars)

Last commit: 30 Jul 2019

MORAN_v2 - A Multi-Object Rectified Attention Network for Scene Text Recognition

Written in Python, using PyTorch

NOTE: The project is only free for academic research purposes.

DewarpNet (291 stars)

Last commit: 6 Sep 2021

DewarpNet - This repository contains the code for DewarpNet training.

See also the DewarpNet project web page. Here is how the authors characterize their solution in the abstract of the paper:

DewarpNet, a deep learning approach for document image unwarping from a single image. Our insight is that the 3D geometry of the document not only determines the warping of its texture but also causes the illumination effects. Therefore, our novelty resides on the explicit modeling of 3D shape for document paper in an end-to-end pipeline.

DewarpNet pre-trained models are available for download from Google Drive.

Document Image Dewarping - algorithm (241 stars)

Last commit: 30 Sep 2019

Document-Image-Dewarping - Document image dewarping is approached by using text lines and line segments.

This repository contains no public source code, only a description of the algorithm and an executable available for download.

Unproject Text (104 stars)

Last commit: 13 Oct 2016

unproject_text - Perspective recovery of text using transformed ellipses.

Strictly speaking, this is perspective correction rather than dewarping, but the two problems are closely related, which is why it was placed in the dewarping section.

Written in Python, it is pretty lightweight: using numpy, scipy, cv2,…

In a nutshell, letters are replaced with ellipses and the axes of ellipses are used to determine what affine transformation is needed to correct perspective:

contours with areas

Image source: repository owner’s writeup

More information about the method can be found in the article and paper by Carlos Merino-Gracia et al.
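
One common way to replace a blob of pixels (such as a letter) with an ellipse is to match its second-order moments. The sketch below shows that step only, in plain Python. It is not code from unproject_text, whose own fitting procedure may differ, and the function name ellipse_from_points is illustrative. The factor of 2 assumes the points fill the region rather than trace its outline.

```python
import math


def ellipse_from_points(points):
    """Return ((cx, cy), semi_major, semi_minor, angle_deg) for a pixel blob.

    The ellipse has the same centroid and covariance as the points,
    assuming the points fill the region uniformly.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2.0
    diff = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major = mean + diff
    lam_minor = max(mean - diff, 0.0)
    # Direction of the major axis, measured from the x axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # For a filled ellipse the variance along a semi-axis r is r^2 / 4.
    semi_major = 2.0 * math.sqrt(lam_major)
    semi_minor = 2.0 * math.sqrt(lam_minor)
    return (cx, cy), semi_major, semi_minor, math.degrees(angle)
```

The lengths and directions of the axes obtained this way for many letters are the kind of input from which a correcting transformation can then be estimated.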

Docuwarp (83 stars)

Last commit: 18 Oct 2021

Docuwarp - An application of high-resolution GANs to dewarp images of perturbed documents. This project focuses on dewarping document images using pix2pixHD, a GAN designed for general image-to-image translation. The objective is to take images of documents that are warped, folded, crumpled, etc., and convert them to a “dewarped” state by using pix2pixHD for both training and inference.

Written in Python.

Book content segmentation and dewarping (under construction) (11 stars)

Last update of code: 2018

Book Content Segmentation and Dewarping - Uses an FCN (fully convolutional network) to segment the image into 3 parts (left page, right page, and background).

The segmentation demo is available here: https://raymondmgwx.github.io/?e=Project_BookContent&&theme=Image-Process-Content.

NOTE: Data augmentation and the dewarping algorithm are still on the TODO list of this project.

Deskewing

Unpaper (770 stars)

Last commit: 21 Jan 2022

unpaper is a post-processing tool for scanned sheets of paper, especially for book pages that have been scanned from previously created photocopies.

The main purpose is to make scanned book pages better readable on screen after conversion to PDF. Additionally, unpaper might be useful to enhance the quality of scanned pages before performing optical character recognition (OCR).

unpaper tries to clean scanned images by removing dark edges that appeared through scanning or copying on areas outside the actual page content (e.g. dark areas between the left-hand side and the right-hand side of a double-sided book-page scan).

The program also tries to detect misaligned centering and rotation of pages and will automatically straighten each page by rotating it to the correct angle. This process is called “deskewing”.

Written mostly in C.

Alyn (222 stars)

Last commit: 14 Jun 2017

Alyn - Skew detection and correction in images containing text. It uses Canny Edge Detection and Hough Transform to determine skew.

How Alyn's skew detection works:

  • Converts the image to greyscale
  • Performs Canny Edge Detection on the image
  • Calculates the Hough Transform values
  • Determines the peaks
  • Determines the deviation of each peak from a 45-degree angle
  • Segregates the detected peaks into bins
  • Chooses the probable skew angle using the values in the bins
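
The last three steps of the list can be sketched in plain Python. This is an illustration rather than Alyn's actual code: it assumes the Hough peaks are already available as angles in degrees within [-90, 90], uses four 45-degree-wide bins, and takes the mean of the fullest bin as the estimate. The function name probable_skew and the bin layout are assumptions.

```python
def probable_skew(peak_angles_deg):
    """Return (skew_estimate, mean_deviation_from_45) for Hough peak angles."""
    if not peak_angles_deg:
        raise ValueError("at least one peak angle is required")
    # Deviation of every peak from the 45-degree diagonal.
    deviations = [abs(45.0 - abs(a)) for a in peak_angles_deg]
    # Segregate the peaks into four 45-degree-wide bins.
    bins = {"0..45": [], "45..90": [], "-45..0": [], "-90..-45": []}
    for a in peak_angles_deg:
        if 0 <= a <= 45:
            bins["0..45"].append(a)
        elif 45 < a <= 90:
            bins["45..90"].append(a)
        elif -45 <= a < 0:
            bins["-45..0"].append(a)
        elif -90 <= a < -45:
            bins["-90..-45"].append(a)
    # The fullest bin is taken as the most probable skew range;
    # its mean is used as the estimate.
    fullest = max(bins.values(), key=len)
    if not fullest:
        raise ValueError("all angles are outside [-90, 90]")
    estimate = sum(fullest) / len(fullest)
    mean_deviation = sum(deviations) / len(deviations)
    return estimate, mean_deviation
```

Voting by bin keeps a single stray peak (for example one caused by a vertical ruling line) from dragging the estimate away from the angle that most peaks agree on.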

Alyn is written in Python and can be installed with pip (pip install alyn).

Deskew (211 stars)

Last commit: 10 Feb 2022

deskew - Library used to deskew a scanned document. It performs skew detection and correction in images containing text. Written in Python, lightweight. Inspired by Alyn.

galfar/deskew (102 stars)

Last commit: 6 Jan 2022

galfar/deskew - Deskew is a command-line tool for deskewing scanned text documents. It uses Hough transform to detect “text lines” in the image. As an output, you get an image rotated so that the lines are horizontal.
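
When an image is rotated to level its text lines, its corners move outside the original frame unless the canvas grows. Below is a minimal sketch of the canvas-size calculation. It is not taken from galfar/deskew, and the function name rotated_canvas_size is illustrative.

```python
import math


def rotated_canvas_size(width, height, angle_deg):
    """Smallest (width, height) that fits the image after rotation by angle_deg."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    new_width = width * cos_t + height * sin_t
    new_height = width * sin_t + height * cos_t
    # The small epsilon keeps floating-point noise (e.g. cos(90 degrees) not
    # being exactly zero) from adding a spurious extra pixel.
    eps = 1e-9
    return math.ceil(new_width - eps), math.ceil(new_height - eps)
```

For the small angles typical of scanned pages the canvas grows only slightly, but skipping this step crops the page corners.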

There are binaries built for these platforms: Win64, Win32, Linux 64bit, macOS, and Linux ARMv7. A GUI frontend for this CLI tool is available as well (Windows, Linux, and macOS).

NOTE: It is written in Pascal.

Skew correction (12 stars)

skew_correction - Deskewing images with slanted content by finding the deviation using Canny Edge Detection.

Deskewing (8 stars)

Last commit: 12 Jan 2014

deskewing - Contains code to deskew images using MLPs, LSTMs and LLS transformations. Written in Python.

Text deskewing (5 stars)

Last commit: 9 Mar 2018

text_deskewing - Rotate text images if they are not straight for better text detection and recognition. Uses Canny Edge Detection and probabilistic Hough Transform.
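
A probabilistic Hough Transform returns line segments rather than infinite lines, so the skew can be read off the segment directions. The sketch below shows one way to do that in plain Python. It does not reproduce the repository's code: it assumes segments arrive as (x1, y1, x2, y2) tuples, and the function name skew_from_segments and the 45-degree cutoff are illustrative choices.

```python
import math
from statistics import median


def skew_from_segments(segments, max_abs_angle=45.0):
    """Median direction (in degrees) of near-horizontal line segments."""
    angles = []
    for x1, y1, x2, y2 in segments:
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        # Fold the direction so a segment and its reverse give the same angle.
        if angle > 90.0:
            angle -= 180.0
        elif angle <= -90.0:
            angle += 180.0
        # Ignore steep segments such as letter stems or table borders.
        if abs(angle) <= max_abs_angle:
            angles.append(angle)
    if not angles:
        return 0.0
    return median(angles)
```

The median is used instead of the mean so that a few badly detected segments have little influence on the result.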

It is written in Python and the repository does not contain a lot of code, so it is easy to follow and to learn how these simple techniques can be used to deskew text.

Summary and recommendations

What to use for Deskewing?

  • If you need to deskew a document and additionally clean up scanning artifacts, use: unpaper
  • If you just need to correct the rotation of the document, use: Alyn or deskew
  • If you want to learn about using Edge Detection and Hough Transform for document deskewing, you might want to have a look at: text_deskewing

What to use for Dewarping?

  • For dewarping book pages that have smooth bends, consider using page-dewarp (a renovated version of the popular page_dewarp).
  • For more complex dewarping including e.g. folded pages use Deep Learning-based solutions such as DewarpNet or Docuwarp.
  • If you are working with flat pages and you just need to correct perspective unproject_text might be the right tool for you.

What to use for Document Segmentation?

Document segmentation was not in the scope of this article. You can check the awesome-ocr section on Document Segmentation.

References:

  • awesome-ocr - a rich collection of OCR-related projects and tools

Credits: