Optical character recognition

Photo by Kristian Strand on Unsplash
Optical character recognition (OCR) converts images of text into machine-encoded text. It is useful for digitizing printed documents and extracting information from images. This post uses OpenCV to clean up the image and Tesseract, through the pytesseract wrapper, to read the text.

Install dependencies

pip install -U opencv-python pytesseract
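
Note that pytesseract only wraps the Tesseract command-line program, so the Tesseract engine itself has to be installed separately (for example through the system package manager). The following is a minimal sketch for checking that the binary is reachable before running any OCR. The helper name `find_tesseract` is hypothetical and not part of pytesseract.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract binary, or None if it is not on PATH.

    `find_tesseract` is an illustrative helper, not a pytesseract function.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; install it before using pytesseract.")
    else:
        print(f"Tesseract found at {path}")
```

If the binary lives outside PATH, pytesseract can be pointed at it by assigning the full path to `pytesseract.pytesseract.tesseract_cmd`.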

Image preprocessing and character extraction

import cv2
import pytesseract

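img = cv2.imread("my_image.png")
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read.
    raise FileNotFoundError("could not read my_image.png")

# Convert to grayscale, then smooth out noise before thresholding.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# The third positional argument is sigmaX, not the border type;
# 0 lets OpenCV derive sigma from the kernel size.
blur = cv2.GaussianBlur(gray, (5, 5), 0)
# Binarize: pixels brighter than 200 become white (255), the rest black.
# The fixed cut-off suits dark text on a bright, evenly lit page.
thresh = cv2.threshold(blur, 200, 255, cv2.THRESH_BINARY)[1]
# Morphological opening removes small white specks. A 1x1 kernel leaves
# the image unchanged; try a larger size such as (3, 3) if specks remain.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 1))
opening = cv2.morphologyEx(
    thresh,
    cv2.MORPH_OPEN,
    kernel,
    iterations=1,
)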

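# --oem 3 selects Tesseract's default engine mode;
# --psm 6 treats the image as a single uniform block of text.
print(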
    pytesseract.image_to_string(
        opening,
        lang="eng",
        config="--oem 3 --psm 6",
    )
)
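
The `config` string above carries two Tesseract command-line options: `--oem` (OCR engine mode, 0 to 3) and `--psm` (page segmentation mode, 0 to 13). As a rough sketch of how the string could be built and validated for different kinds of images, the helper below may be useful. The names `tesseract_config` and `PSM_HINTS` are hypothetical and not part of pytesseract, and the mode descriptions are paraphrased from Tesseract's help output.

```python
# A few page segmentation modes, paraphrased from `tesseract --help-psm`.
PSM_HINTS = {
    3: "fully automatic page segmentation, no orientation detection (default)",
    6: "a single uniform block of text",
    7: "a single text line",
    8: "a single word",
    11: "sparse text, in no particular order",
}


def tesseract_config(oem=3, psm=6):
    """Build a config string for pytesseract.image_to_string.

    Illustrative helper (not a pytesseract API): it only validates the
    documented ranges and formats the two flags.
    """
    if oem not in range(4):
        raise ValueError(f"--oem must be between 0 and 3, got {oem}")
    if psm not in range(14):
        raise ValueError(f"--psm must be between 0 and 13, got {psm}")
    return f"--oem {oem} --psm {psm}"


if __name__ == "__main__":
    for mode, hint in PSM_HINTS.items():
        print(f"{tesseract_config(psm=mode)!r}: {hint}")
```

For an image that contains a single line of text, for instance, `config=tesseract_config(psm=7)` could be passed to `pytesseract.image_to_string` in place of the literal string.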