
Count Objects Using CounTR Model

Since R2025a

This example demonstrates few-shot object counting using the Counting Transformer (CounTR) model, which counts objects in an image by identifying a few exemplars without additional training.

CounTR is a class-agnostic model that determines object counts by assessing the similarity between image patches and exemplars via an attention mechanism [1]. Unlike class-specific models, which target fixed categories such as people or cars, a class-agnostic model counts whatever objects match a few user-provided exemplar instances, an approach known as few-shot counting. The model generates a density map by comparing image and exemplar features, and derives the total object count from this map. The model tolerates variations in the shape, color, scale, and texture of the exemplars, and counts only objects that match the characteristics of the provided exemplars. The CounTR model used in this example was trained on the FSC-147 dataset, which includes 6,135 images from 147 diverse object categories, with counts ranging from 7 to 3,731 and an average of 56 objects per image [2]. As a result, it can count many common objects in images without further training.

Download Deep NIR Fruit Detection Data Set

This example uses the Deep NIR Fruit Detection Data Set, which contains images of various types of fruit and berries captured using near-infrared (NIR) imaging technology [3]. These images are taken under different lighting and environmental conditions.

Specify dataDir as the location of the data set and dataSetDownloadURL as the URL of the zipped data set. Download the data set and unzip its contents into the specified location using the downloadFruitDetectionData helper function. The function is attached to this example as a supporting file.

dataSetDownloadURL = "https://ssd.mathworks.com/supportfiles/vision/data/deepNIRFruitDetectionDataset.zip";
dataDir = fullfile(tempdir,"FruitDetectionDataset");
downloadFruitDetectionData(dataSetDownloadURL,dataDir);
Downloading FruitDetection data set.
This can take several minutes to download and unzip...
Done.

Select Images Containing Blueberries

Create a pattern that matches any string containing the word "blueberry" with zero or more word characters before and after it. Retrieve a list of all files and folders in the directory specified by dataDir and its subdirectories.

pattern = "\w*blueberry\w*";
fileList = dir(fullfile(dataDir,"**"));

Extract the folder paths and the file and folder names from fileList and store them in the folderNames and fileNames cell arrays, respectively. Then, check each name and create a logical mask, matchedFiles, that selects the entries whose names contain "blueberry" and do not contain "json". The match is case insensitive.

folderNames = {fileList.folder};
fileNames = {fileList.name};
matchedFiles = cellfun(@(x) ~isempty(regexpi(x,pattern,"once")) && isempty(regexpi(x, "json", "once")),fileNames);

Create full file paths for the files containing blueberries by concatenating folderNames and fileNames. Store the image file path results in filteredFileList.

filteredFileList = cellfun(@(x,y) string(x)+filesep+string(y),folderNames(matchedFiles),fileNames(matchedFiles));
fprintf("Found %d images containing blueberries.\n",numel(filteredFileList));
Found 23 images containing blueberries.
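The filtering steps above can be tried on a small, made-up list of names before running them on the full data set. The following is a minimal sketch: the folder and file names are hypothetical, and it builds the paths with fullfile rather than string concatenation, which is an alternative to the approach in the example, not a description of it.

```matlab
% Hypothetical folder and file names, for illustration only.
folderNames = {'/data/train','/data/train','/data/valid','/data/valid'};
fileNames = {'blueberry_001.png','blueberry_001.json','apple_010.png','Blueberry_002.jpg'};

pattern = "\w*blueberry\w*";

% regexpi is case insensitive, so "Blueberry_002.jpg" also matches.
hasBlueberry = ~cellfun(@isempty,regexpi(fileNames,pattern,"once"));
isJson = ~cellfun(@isempty,regexpi(fileNames,"json","once"));

% Keep blueberry images and drop the JSON annotation files.
matchedFiles = hasBlueberry & ~isJson;

% fullfile joins each folder and name with the platform file separator.
filteredFileList = string(fullfile(folderNames(matchedFiles),fileNames(matchedFiles)));
fprintf("Found %d images containing blueberries.\n",numel(filteredFileList));
```

In this toy list, only the two image files pass the filter; the annotation file and the apple image are excluded.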

Select Exemplar Bounding Boxes

Select several exemplars, which are bounding boxes that contain blueberries within the image. The CounTR model uses these exemplars to extract patches for computing similarity with the original image features. Typically, a minimum of three exemplars is necessary to achieve an accurate count. This example uses eight exemplars to count the blueberry instances across all filtered images.

Load Exemplar Image

Load an exemplar image from the filtered image files that contain blueberries.

exemplarImage = filteredFileList(1);
I = imread(exemplarImage);

Select Exemplars from Exemplar Image

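You can select bounding boxes interactively using the selectBoundingBoxes helper function, which is defined at the end of this example. This example specifies the eight exemplar bounding boxes directly. Each row has the form [x y width height], in pixels.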

exemplarBboxes = ...
    [246.0000  349.0000   68.0000   67.0000
    402.0000  189.0000   66.0000   65.0000
    525.0000  330.0000   73.0000   58.0000
    235.0000  439.0000   65.0000   65.0000
    130.0000  246.0000   66.0000   65.0000
    510.0000  426.0000   63.0000   64.0000
    574.0000  442.0000   55.0000   55.0000
    138.0000  555.0000   52.0000   58.0000];

Show the selected exemplar bounding boxes overlaid on the image using the insertShape function.

annotatedImage = insertShape(I,"rectangle",exemplarBboxes,"LineWidth",3);
imshow(annotatedImage)
title("Original Image and Selected Exemplars")
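Before passing exemplars to the model, it can help to confirm that there are enough of them and that each box lies inside the image. The check below is a sketch, not part of the original example: it assumes boxes in the [x y width height] format that insertShape uses for rectangles, and the image size is a hypothetical value (in practice, use size(I,[1 2])).

```matlab
% First three exemplar boxes from the example, in [x y width height] format.
exemplarBboxes = [246 349 68 67
                  402 189 66 65
                  525 330 73 58];

% Hypothetical image size [rows cols]; replace with size(I,[1 2]).
imageSize = [650 700];

% The text suggests at least three exemplars for an accurate count.
minExemplars = 3;
enoughExemplars = size(exemplarBboxes,1) >= minExemplars;

% Right and bottom edges of each box, using an inclusive pixel convention.
xMax = exemplarBboxes(:,1) + exemplarBboxes(:,3) - 1;
yMax = exemplarBboxes(:,2) + exemplarBboxes(:,4) - 1;

% A box is valid if it has positive size and lies inside the image.
insideImage = all(exemplarBboxes(:,3:4) > 0,2) & ...
    exemplarBboxes(:,1) >= 1 & exemplarBboxes(:,2) >= 1 & ...
    xMax <= imageSize(2) & yMax <= imageSize(1);

fprintf("Enough exemplars: %d, boxes inside image: %d of %d\n", ...
    enoughExemplars,nnz(insideImage),numel(insideImage));
```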

Configure CounTR Object

Configure a CounTR model using the exemplar image and bounding box exemplars, and extract image patches corresponding to the exemplar bounding box locations using the counTRObjectCounter object.

counTRObj = counTRObjectCounter(I,exemplarBboxes);

The CounTR model is a vision transformer-based model that requires inputs of size 384-by-384. The model resizes each image to size 384-by-N or N-by-384, depending on the orientation of the original image, where N is the dimension size that preserves the aspect ratio of the original image. To analyze image patches, the model moves a fixed-size window across the image, with each window position overlapping the previous one by 128 pixels. This overlap ensures continuity and helps to detect features that span multiple windows. The model scales the exemplar bounding boxes by the scale factor of the resized image relative to the original image, and uses the scaled boxes to extract exemplar patches. The model then resizes the patches to size 64-by-64 and uses them to extract the features for computing similarity.
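The arithmetic behind this preprocessing can be sketched with plain MATLAB. This is an illustration under stated assumptions, not the internal implementation of counTRObjectCounter: the image size is hypothetical, the shorter side is assumed to be the one resized to 384, and the window placement and rounding rules are guesses that are consistent with the description above.

```matlab
% Hypothetical original image size [rows cols]; not taken from the data set.
imageSize = [600 1000];
targetSize = 384;    % model input size stated in the text
overlap = 128;       % window overlap stated in the text

% Assumption: the shorter side is resized to 384 and the aspect ratio is kept.
scale = targetSize/min(imageSize);
resizedSize = round(imageSize*scale);

% Scale exemplar boxes ([x y width height]) by the same factor.
% This ignores half-pixel offsets, so it is only approximate.
exemplarBboxes = [246 349 68 67; 402 189 66 65];
scaledBboxes = exemplarBboxes*scale;

% Assumption: windows advance by (384 - 128) pixels along the longer side,
% and a final window is aligned with the far edge of the image.
stride = targetSize - overlap;
longSide = max(resizedSize);
lastStart = longSide - targetSize + 1;
windowStarts = unique([1:stride:lastStart, lastStart]);

fprintf("Resized to %d-by-%d, %d windows starting at: %s\n", ...
    resizedSize(1),resizedSize(2),numel(windowStarts),mat2str(windowStarts));
```

For this hypothetical 600-by-1000 image, the resized image is 384-by-640, and two windows starting at columns 1 and 257 cover it with a 128-pixel overlap.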

Count Blueberries in Sample Image

Count the number of blueberries in a sample image using the countObjects object function.

count = countObjects(counTRObj,I);
fprintf("Number of objects detected: %0.3f\n",count);
Number of objects detected: 46.513

Display Density Map

The CounTR model produces a density map at each pass of the sliding window over the image, where each pixel value indicates the estimated density of objects, such as blueberries, in that region of the image. The individual density maps from each sliding window position, or step, are combined or "blended" together to form a final density map for the entire image. The final density map provides a comprehensive view of object density and localization across the image. The model computes the object count by summing up the values in this accumulated density map.

Compute the density map of the image using the densityMap object function, and display it using the imshow function. The model localizes a number of blueberries at varying intensities.

density_map = densityMap(counTRObj,I);
imshow(density_map)
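The relationship between a density map and the object count can be illustrated with a synthetic map. The sketch below does not use the CounTR model: it builds a toy density map from three Gaussian blobs whose centers and width are made-up values, normalizes each blob to sum to 1, and recovers the count by summing the map.

```matlab
% Build a synthetic 100-by-100 density map containing three "objects".
% The centers and sigma are made-up values for illustration.
[X,Y] = meshgrid(1:100,1:100);
centers = [30 30; 60 45; 50 80];
sigma = 4;

densityMapToy = zeros(100,100);
for k = 1:size(centers,1)
    g = exp(-((X-centers(k,1)).^2 + (Y-centers(k,2)).^2)/(2*sigma^2));
    % Normalize so that each blob contributes exactly one object.
    densityMapToy = densityMapToy + g/sum(g(:));
end

% The count is the sum of all values in the density map.
estimatedCount = sum(densityMapToy(:));
roundedCount = round(estimatedCount);
fprintf("Estimated count: %0.3f (rounded: %d)\n",estimatedCount,roundedCount);
```

Because a real density map is a model estimate, its sum is generally not an integer, which is why the count reported in the previous section is fractional. Round the value if you need a whole number of objects.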

Overlay Density Map Over Image

Display the blended output by overlaying the density map onto the original image using the anomalyMapOverlay function. Include a text box in the overlay indicating the total count using the insertText function.

densityOverlayImage = anomalyMapOverlay(I,density_map,"Blend","equal");
fontSize = min(floor(size(densityOverlayImage,2)/30),200);
gutterWidth = fontSize*9;
densityOverlayImageWText = insertText(densityOverlayImage,[size(densityOverlayImage,2)-gutterWidth 5], ...
    sprintf("count=%0.2f",count),FontSize=fontSize);
imshow(densityOverlayImageWText)
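The label placement in the preceding code depends only on the image width, so the arithmetic can be checked in isolation. The sketch below uses a hypothetical image width and the count reported earlier in this example. The cap of 200 presumably reflects the largest font size that insertText accepts.

```matlab
% Hypothetical image width in pixels; in the example this is
% size(densityOverlayImage,2).
imageWidth = 1200;

% The font size grows with image width and is capped at 200.
fontSize = min(floor(imageWidth/30),200);

% Reserve a gutter of nine font widths at the right edge for the label.
gutterWidth = fontSize*9;

% Top-left corner [x y] of the text box passed to insertText.
textPosition = [imageWidth-gutterWidth 5];

% Label text, formatted to two decimal places.
labelText = sprintf("count=%0.2f",46.513);

fprintf("fontSize=%d, position=[%d %d], label=%s\n", ...
    fontSize,textPosition(1),textPosition(2),labelText);
```

Note that for images narrower than 30 pixels this formula yields a font size of 0, so very small images would need a lower bound on fontSize.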

This visualization demonstrates the CounTR model's effectiveness in detecting the majority of blueberries, including those that are occluded or partially visible. Although the intensity for occluded and partial blueberries may be lower in the density map, they still contribute to the overall count.

Display a montage of the image annotated with the exemplar bounding boxes, the density map overlay, and the density map.

annotatedInputImage = insertShape(I,"rectangle",exemplarBboxes,LineWidth=10,ShapeColor="red");
montage({annotatedInputImage,densityOverlayImageWText,rescale(density_map)},Size=[1 3],BorderSize=[0 10])

Compute Object Counts and Density Maps for Image Datastore

Create an image datastore containing the filtered blueberry images using the imageDatastore function.

dsFilteredImages = imageDatastore(filteredFileList,FileExtensions=[".jpg",".png"]);

Count the blueberries in all the images in the datastore by passing the previously constructed CounTR object, counTRObj, and the datastore to the countObjects object function.

counts = countObjects(counTRObj,dsFilteredImages);
Running CounTR Object Counting network
--------------------------------------
* Processed 23 images.

Display Count Values

Display the blueberry count values computed from all the images as a scatter plot.

scatter(1:numel(dsFilteredImages.Files),counts)
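A labeled plot and a few summary values can make the per-image counts easier to interpret. The sketch below uses a short vector of hypothetical counts in place of the counts output, so the numbers are illustrative only and are not results from the data set.

```matlab
% Hypothetical per-image counts standing in for the output of countObjects.
counts = [46.5 12.2 30.8 7.9 21.4];

% Round each estimate to a whole number of objects and total them.
roundedCounts = round(counts);
totalCount = sum(roundedCounts);

% Find the image with the largest estimated count.
[maxCount,maxIdx] = max(counts);

% Plot the counts with labeled axes.
figure
scatter(1:numel(counts),counts,"filled")
xlabel("Image index")
ylabel("Estimated count")
title("Estimated Object Count per Image")

fprintf("Total: %d objects, largest count %0.1f in image %d\n", ...
    totalCount,maxCount,maxIdx);
```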

Compute Density Maps

Compute density maps for all images in the datastore using the densityMap object function. When you pass a datastore, the densityMap object function computes the density maps, writes them to disk as PNG files, and returns a datastore of the results, dsDensityMaps.

dsDensityMaps = densityMap(counTRObj,dsFilteredImages);
Running CounTR Object Counting network
--------------------------------------
* Processed 23 images.

Display Overlaid Density Map on All Images

Combine the image datastore, dsFilteredImages, and the density map datastore, dsDensityMaps, to create a datastore that pairs each image with its density map, dsImageAndDensityMap. Specify the index, imgIdx, of the image to display from the combined datastore. In the live script, you can select this index using a slider. Display the density map and object count overlaid on the selected image using the overlayDensityMap helper function.

numDensityMaps = numpartitions(dsDensityMaps);
imgIdx = 7;
dsImageAndDensityMap = combine(dsFilteredImages,dsDensityMaps);
x = read(subset(dsImageAndDensityMap,imgIdx));
I = x{1};
density_map = x{2};
count = counts(imgIdx);
densityOverlayImageWText = overlayDensityMap(I,density_map,count);
imshow(densityOverlayImageWText)
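The combine, subset, and read pattern used above can be tried on small in-memory datastores. The sketch below substitutes arrayDatastore objects holding made-up numbers for the image and density map datastores, so it shows only how the paired read works, not the actual data.

```matlab
% Two in-memory datastores with made-up values, standing in for the
% image datastore and the density map datastore.
dsA = arrayDatastore([10; 20; 30]);
dsB = arrayDatastore([100; 200; 300]);

% Pair the datastores so that each read returns one element from each.
dsPair = combine(dsA,dsB);

% Select a single index and read the corresponding pair.
idx = 2;
x = read(subset(dsPair,idx));

% x is a 1-by-2 cell array: the first cell comes from dsA and the
% second from dsB.
first = x{1};
second = x{2};
fprintf("Pair %d: %d and %d\n",idx,first,second);
```

Using subset before read avoids iterating through the datastore from the beginning to reach a particular index.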

Helper Functions

overlayDensityMap

function densityOverlayImageWText = overlayDensityMap(I,density_map,count)

    densityOverlayImage = anomalyMapOverlay(I,density_map,"Blend","equal");
    fontSize = min(floor(size(densityOverlayImage,2)/30),200);
    gutterWidth = fontSize*9;
    densityOverlayImageWText = insertText(densityOverlayImage,[size(densityOverlayImage,2)-gutterWidth 5],...
        sprintf("count=%0.2f",count),FontSize=fontSize);
end

selectBoundingBoxes

function boundingBoxes = selectBoundingBoxes(img) %#ok<DEFNU>
    % Display the image
    figure;
    imshow(img);
    title('Select bounding boxes, double-click on the box when done');
    hold on;

    % Initialize an array to store bounding box positions
    boundingBoxes = [];

    % Loop to allow multiple rectangle selections
    while true
        % Use drawrectangle to let the user draw a bounding box
        h = drawrectangle('Color', 'r');

        % Wait for the user to double-click or press Enter to finish drawing
        wait(h);

        % Get the position of the drawn rectangle
        position = h.Position;

        % Append the position to the boundingBoxes array
        boundingBoxes = [boundingBoxes; position]; %#ok<AGROW>

        % Ask the user if they want to select another bounding box
        choice = questdlg('Do you want to select another bounding box?', ...
            'Continue Selection', ...
            'Yes', 'No', 'Yes');

        % Break the loop if the user selects 'No'
        if strcmp(choice, 'No')
            break;
        end
    end

    % Close the figure
    close;

    % Display the selected bounding boxes
    disp('Selected bounding boxes:');
    disp(boundingBoxes);
end

References

[1] Liu, Chang, Yujie Zhong, Andrew Zisserman, and Weidi Xie. "CounTR: Transformer-Based Generalised Visual Counting." arXiv preprint arXiv:2208.13721 (2022).

[2] Ranjan, Viresh, Udbhav Sharma, Thu Nguyen, and Minh Hoai. "Learning to Count Everything." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3394-3403. 2021.

[3] Sa, Inkyu, et al. "deepNIR: Datasets for Generating Synthetic NIR Images and Improved Fruit Detection System Using Deep Learning Techniques." Sensors 22, no. 13 (2022): 4721. https://doi.org/10.3390/s22134721.
