Optical character recognition

Photo by Kristian Strand on Unsplash
Optical character recognition (OCR) is the process of converting images of text into machine-encoded text. It is useful for digitizing printed documents and extracting information from photos or scans. This post cleans up an image with OpenCV and then extracts the text with Tesseract through pytesseract.

Install dependencies

pip install -U opencv-python pytesseract
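
pytesseract is a wrapper around the Tesseract command-line program, so the tesseract executable has to be installed separately (for example with the system package manager) and be reachable on the PATH. A minimal sketch for checking this before running OCR, using only the standard library; the helper name `tesseract_available` is hypothetical and not part of pytesseract:

```python
import shutil


def tesseract_available(executable: str = "tesseract") -> bool:
    """Return True if the given executable can be found on the PATH."""
    # shutil.which returns the full path of the executable, or None if missing
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("tesseract found")
    else:
        print("tesseract not found; install it separately from the pip packages")
```

If the executable lives somewhere that is not on the PATH, pytesseract can be pointed at it by setting `pytesseract.pytesseract.tesseract_cmd` to its full path.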

Image preprocessing and character extraction

import cv2
import pytesseract

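img = cv2.imread("my_image.png")
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read my_image.png")

# Convert to grayscale, smooth out noise, then binarize with a fixed threshold
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# The third argument is sigmaX, not a border flag; 0 derives it from the kernel size
blur = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.threshold(blur, 200, 255, cv2.THRESH_BINARY)[1]

# Morphological opening removes small bright specks. A 1x1 kernel leaves the
# image unchanged, so try a larger size such as (2, 2) or (3, 3) if specks remain.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 1))
opening = cv2.morphologyEx(
    thresh,
    cv2.MORPH_OPEN,
    kernel,
    iterations=1,
)

# --oem 3: default OCR engine mode; --psm 6: assume a single uniform block of text
print(
    pytesseract.image_to_string(
        opening,
        lang="eng",
        config="--oem 3 --psm 6",
    )
)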
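
The `config` string passes command-line flags through to Tesseract: `--oem` selects the OCR engine mode and `--psm` selects the page segmentation mode, which tells Tesseract what layout to expect. A small sketch of a helper that builds this string for a few commonly used modes follows. The names `build_config` and `PSM_MODES` are illustrative and not part of pytesseract, the table is only a subset, and the descriptions paraphrase Tesseract's help output:

```python
# Subset of Tesseract page segmentation modes (illustrative, not exhaustive)
PSM_MODES = {
    3: "fully automatic page segmentation without orientation detection",
    4: "single column of text of variable sizes",
    6: "single uniform block of text",
    7: "single text line",
    8: "single word",
    11: "sparse text in no particular order",
}


def build_config(psm: int = 6, oem: int = 3) -> str:
    """Build a Tesseract config string such as '--oem 3 --psm 6'."""
    if psm not in PSM_MODES:
        raise ValueError(f"psm {psm} is not in this helper's table: {sorted(PSM_MODES)}")
    # Tesseract accepts engine modes 0 to 3; 3 picks a default from what is installed
    if oem not in range(4):
        raise ValueError(f"oem must be between 0 and 3, got {oem}")
    return f"--oem {oem} --psm {psm}"


if __name__ == "__main__":
    for mode, description in PSM_MODES.items():
        print(build_config(psm=mode), "->", description)
```

For an image that contains a single line of text, the call would become `pytesseract.image_to_string(opening, lang="eng", config=build_config(psm=7))`. Which mode works best depends on the layout of the image, so it is worth trying a few.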