MATLAB: Image recognition: extracting numbers from white paper

computer vision, Computer Vision Toolbox, image recognition, ocr

I have a large dataset of colour images, each showing one person holding a white piece of paper with a printed number. I am trying to extract the number as a class label for each person. By binarizing the image and removing small areas, I can create an image as shown. But from here, every use of the ocr function I have attempted fails to extract the number. I have also tried corner extraction, but this does not work reliably because some subjects obscure the corners of the paper with their hands. Could anyone provide some tips on how to achieve this?

Code:

clc
clear all
close all
% Load an image
rgbImage = imread('person11.jpeg');
grayImage = rgb2gray(rgbImage);
% Binarize the image.
binaryImage = grayImage > 120;
% Remove small objects.
binaryImage = bwareaopen(binaryImage, 5000);
figure(1)
imshow(binaryImage);
title('Cleaned Binary Image');
% Use the 'CharacterSet' parameter to constrain OCR
results = ocr(binaryImage, 'CharacterSet', '0123456789', 'TextLayout','Block');
results.Text

The output is lots of different numbers, not 11!

Image:

Best Answer

Stephanie:

I'm sure you've got it working by now, but for others (or if you want to compare your algorithm to mine), here is how I would do it (attached). You could make it a lot faster if the background weren't the same color as the sheet of paper the subject is holding. That would save time because we wouldn't have to separate the two with an erosion. Have the background be some vivid color - any color except white, black, or gray.
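For readers without the attachment, here is a rough sketch of the erosion idea mentioned above. It is not the attached code. It assumes the paper does not touch the image border, that the bright background does, and that the paper is the largest bright blob left once the two are separated. The structuring element radius of 15 is a guess to tune per dataset; the file name and threshold come from the question.

```matlab
% Hypothetical sketch: isolate the paper, crop to it, then run OCR on the crop.
rgbImage = imread('person11.jpeg');      % file name from the question
grayImage = rgb2gray(rgbImage);
binaryImage = grayImage > 120;           % threshold from the question

% Erode to break thin bridges between the paper and a bright background.
se = strel('disk', 15);                  % assumed radius; tune for your images
eroded = imerode(binaryImage, se);

% Drop bright regions touching the image border (assumed to be background),
% then keep the largest remaining blob (assumed to be the paper).
eroded = imclearborder(eroded);
paperMask = bwareafilt(eroded, 1);

% Grow the mask back to roughly its original size and fill the digit holes.
paperMask = imdilate(paperMask, se);
paperMask = imfill(paperMask, 'holes');

props = regionprops(paperMask, 'BoundingBox');
if isempty(props)
    error('No paper candidate found; adjust the threshold or erosion radius.');
end

% OCR only the cropped paper region, restricted to digits as in the question.
croppedImage = imcrop(grayImage, props(1).BoundingBox);
results = ocr(croppedImage, 'CharacterSet', '0123456789', 'TextLayout', 'Block');
label = strtrim(results.Text)
```

Cropping first means ocr only sees the sheet and its digits instead of every bright region in the photo, which is one likely reason the whole-image call returns stray numbers.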


Related Solutions

MATLAB: Recognize black dot

Well now that you posted an image, and I see you have black spots not only on the dice but also on the red background, I need to modify my answer to have you threshold on the blue or green channel:

clc;    % Clear the command window.
close all;  % Close all figures (except those of imtool.)
imtool close all;  % Close all imtool figures.
clear;  % Erase all existing variables.
workspace;  % Make sure the workspace panel is showing.
fontSize = 20;
folder = 'C:\Documents and Settings\user\My Documents\Temporary stuff';
% Read in a color demo image.
baseFileName = 'IMG_0584.JPG';
% Get the full filename, with path prepended.
fullFileName = fullfile(folder, baseFileName);
if ~exist(fullFileName, 'file')
  % Didn't find it there.  Check the search path for it.
  fullFileName = baseFileName; % No path this time.
  if ~exist(fullFileName, 'file')
    % Still didn't find it.  Alert user.
    errorMessage = sprintf('Error: %s does not exist.', fullFileName);
    uiwait(warndlg(errorMessage));
    return;
  end
end
rgbImage = imread(fullFileName);
% Get the dimensions of the image.  numberOfColorBands should be = 3.
[rows, columns, numberOfColorBands] = size(rgbImage);
% Display the original color image.
subplot(2, 2, 1);
imshow(rgbImage, []);
title('Original color Image', 'FontSize', fontSize);
% Enlarge figure to full screen.
set(gcf, 'Position', get(0,'Screensize')); 
% Extract the individual red, green, and blue color channels.
redChannel = rgbImage(:, :, 1);
greenChannel = rgbImage(:, :, 2);
blueChannel = rgbImage(:, :, 3);
spots = blueChannel < 128;
subplot(2, 2, 2);
imshow(spots, []);
title('Thresholded Blue Channel', 'FontSize', fontSize);
spots = imclearborder(spots);
subplot(2, 2, 3);
imshow(spots, []);
title('Border Cleared', 'FontSize', fontSize);
% Fill holes
spots = imfill(spots, 'holes');
subplot(2, 2, 4);
imshow(spots, []);
title('Final Spots Image', 'FontSize', fontSize);
% Count them
[labeledImage, numberOfSpots] = bwlabel(spots);
message = sprintf('Done!\nThe number of spots (total on both dice) is %d', numberOfSpots);
msgbox(message);
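To try the same threshold, border-clear, fill, and label steps without the original photo, here is a small sketch on a synthetic image. The spot centers, radius, and the dark strip along the top edge are all made up for illustration; only the processing calls mirror the code above.

```matlab
% Synthetic stand-in for the blue channel: bright background, dark spots.
blueChannel = 200 * ones(100, 100, 'uint8');
[x, y] = meshgrid(1:100, 1:100);
centers = [30 30; 70 30; 50 70];         % three made-up spot centers (x, y)
for k = 1:size(centers, 1)
    inSpot = (x - centers(k, 1)).^2 + (y - centers(k, 2)).^2 <= 8^2;
    blueChannel(inSpot) = 20;            % dark disk of radius 8
end
blueChannel(1:5, :) = 20;                % dark strip touching the border

spots = blueChannel < 128;               % dark pixels become true
spots = imclearborder(spots);            % removes the strip, keeps the disks
spots = imfill(spots, 'holes');          % no-op here, disks are already solid
[~, numberOfSpots] = bwlabel(spots);
fprintf('Number of spots: %d\n', numberOfSpots);   % prints: Number of spots: 3
```

The border strip shows why imclearborder matters: without it, the strip would be counted as a fourth spot.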

MATLAB: How to change 1 channel image to 3 channel

Try cat() to stack the gray scale image into 3 slices (color channels):

rgbImage = cat(3, grayImage, grayImage, grayImage);

The result is an RGB image, though the only colors will be shades of gray since the three color channels are identical. It is a 3-D array whose third dimension is the color channel index.
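A quick way to check the result, sketched on a small made-up matrix instead of a real image. The repmat line is an equivalent alternative, not part of the original answer.

```matlab
grayImage = uint8(magic(4) * 15);        % made-up 4x4 gray image, values 15..240
rgbImage = cat(3, grayImage, grayImage, grayImage);

size(rgbImage)                           % 4 4 3
% All three color channels are identical copies of the gray image.
isequal(rgbImage(:, :, 1), rgbImage(:, :, 2), rgbImage(:, :, 3))   % true
% repmat along the third dimension builds the same array.
isequal(rgbImage, repmat(grayImage, [1 1 3]))                      % true
```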
