I was also unable to extract the numbers when I tried MATLAB’s “ocr” function on the given image. The result was an empty string, and it couldn’t recognize the numbers as expected.
I found that “ocr” performs best when the text is located on a uniform background and is formatted like a document with dark text on a light background. When the text appears on a non-uniform dark background, additional pre-processing steps are required to get the best OCR results.
To get more accurate results, try the preprocessing steps below:
- Grayscale Conversion
OCR generally performs better on grayscale images than on RGB images:
grayImage = rgb2gray(rgbImage);
- Sharpening
Sharpen the image to enhance the edges of the text and make it more readable for OCR:
sharpenedImage = imsharpen(grayImage);
- Noise Reduction
Use a median filter to suppress noise while preserving edges (a Gaussian filter such as imgaussfilt also reduces noise, but it blurs edges more):
filteredImage = medfilt2(sharpenedImage, [3 3]);
- Binarization (Thresholding)
Convert the grayscale image to a binary image. By default, imbinarize uses a global threshold computed with Otsu's method:
binaryImage = imbinarize(filteredImage);
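- Small Component Removal
Note that the line below does not perform a Stroke Width Transform (SWT). It keeps only connected components of at least 30 pixels, which discards specks that are too small to be characters. bwareafilt operates on white (true) pixels, so the text must be white at this point:
binaryImage = bwareafilt(binaryImage, [30, Inf]);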
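- Morphological Cleanup
Connect broken characters with a dilation followed by an erosion (a morphological closing), using a small structuring element:
se = strel("disk", 1);
binaryImage = imerode(imdilate(binaryImage, se), se);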
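The preprocessing steps above can be chained into one script. This is only a sketch under a few assumptions: "printedtext.png" is a sample image that I assume ships with Image Processing Toolbox (swap in your own file), the 30-pixel area limit and the disk radius of 1 are starting values to tune, and the polarity check assumes that text covers less than half of the image.

```matlab
% Placeholder input: a sample image assumed to ship with Image Processing
% Toolbox. Replace the file name with your own image.
img = imread("printedtext.png");

% Grayscale conversion (skipped if the image is already single-channel)
if size(img, 3) == 3
    img = rgb2gray(img);
end

% Sharpen, then median-filter to suppress isolated noisy pixels
img = medfilt2(imsharpen(img), [3 3]);

% Global threshold (Otsu's method is the imbinarize default)
bw = imbinarize(img);

% Assumption: text covers less than half of the image, so the minority
% class is text. Make the text white, because bwareafilt and the
% morphological operators act on white (true) pixels.
if nnz(bw) > numel(bw) / 2
    bw = ~bw;
end

bw = bwareafilt(bw, [30 Inf]);        % drop components under 30 pixels
bw = imclose(bw, strel("disk", 1));   % dilation followed by erosion

% ocr works best with dark text on a light background, so invert back
ocrReadyImage = ~bw;
```

It is worth checking the result with imshow(ocrReadyImage) before calling “ocr”. If characters are merged or missing, adjust the area limit and the structuring element first.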
You can also set “LayoutAnalysis” to "Block" to instruct “ocr” to treat the image as a single block of text:
results = ocr(binaryImage,LayoutAnalysis="Block");
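Since you only need the numbers, it may also help to restrict recognition to digits with the “CharacterSet” option and to look at the confidence values. The snippet below is a rough sketch: "businessCard.png" is a sample image that I assume ships with Computer Vision Toolbox and only stands in for your preprocessed image, and the 0.5 confidence cutoff is an arbitrary value I picked for illustration.

```matlab
% Placeholder input so the snippet runs on its own; in practice, use the
% preprocessed binary image from the steps above.
img = imread("businessCard.png");
if size(img, 3) == 3
    img = rgb2gray(img);
end
binaryImage = imbinarize(img);

% Treat the image as one block of text and recognize digits only
results = ocr(binaryImage, LayoutAnalysis="Block", CharacterSet="0123456789");

% Full recognized text, without leading or trailing whitespace
recognizedText = strtrim(results.Text);

% Keep only the words recognized with confidence above 0.5
% (arbitrary cutoff, tune it for your images)
confidentWords = results.Words(results.WordConfidences > 0.5);

disp(recognizedText);
disp(confidentWords);
```

If results.Text is still empty, the text polarity is the first thing I would check: “ocr” expects dark text on a light background, so a binary image with white text may need to be inverted with imcomplement first.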
For more information, please refer to the following MATLAB documentation:
I hope this helps!