imsegkmeans

K-means clustering based image segmentation

Description

L = imsegkmeans(I,k) segments image I into k clusters by performing k-means clustering and returns the segmented labeled output in L.

[L,centers] = imsegkmeans(I,k) also returns the cluster centroid locations, centers.

L = imsegkmeans(I,k,Name,Value) uses name-value arguments to control aspects of the k-means clustering algorithm.

Examples

Read an image into the workspace.

I = imread("cameraman.tif");
imshow(I)
title("Original Image")

Segment the image into three regions using k-means clustering.

[L,Centers] = imsegkmeans(I,3);
B = labeloverlay(I,L);
imshow(B)
title("Labeled Image")
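
As a minimal sketch of how the two outputs relate, the following code counts the pixels assigned to each label and lists them beside the corresponding centroid intensity. The variable names counts and centers are illustrative choices, not part of the original example.

```matlab
I = imread("cameraman.tif");
[L,centers] = imsegkmeans(I,3);

% Count the pixels assigned to each of the three labels.
% The explicit output size keeps one row per label.
counts = accumarray(double(L(:)),1,[3 1]);

% Column 1: centroid intensity of each cluster. Column 2: pixel count.
disp([double(centers) counts])
```

Because the image is grayscale, centers has one column, so each row is a single intensity value.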

Read an image into the workspace. Reduce the image size to make the example run more quickly.

RGB = imread("kobi.png");
RGB = imresize(RGB,0.5);
imshow(RGB)

Segment the image into two regions using k-means clustering.

L = imsegkmeans(RGB,2);
B = labeloverlay(RGB,L);
imshow(B)
title("Labeled Image")

Several pixels are mislabeled. The rest of the example shows how to improve the k-means segmentation by supplementing the information about each pixel.

Supplement the image with information about the texture in the neighborhood of each pixel. To obtain the texture information, filter a grayscale version of the image with a set of Gabor filters.

Create a set of 24 Gabor filters, covering 6 wavelengths and 4 orientations.

wavelength = 2.^(0:5) * 3;
orientation = 0:45:135;
g = gabor(wavelength,orientation);
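
To confirm what the filter bank contains, a short sketch like the following can list the distinct wavelengths and orientations. The variable names numFilters, bankWavelengths, and bankOrientations are illustrative.

```matlab
wavelength = 2.^(0:5) * 3;     % 3, 6, 12, 24, 48, and 96 pixels/cycle
orientation = 0:45:135;        % 0, 45, 90, and 135 degrees
g = gabor(wavelength,orientation);

% One filter is created for each wavelength-orientation pair.
numFilters = numel(g)
bankWavelengths = unique([g.Wavelength])
bankOrientations = unique([g.Orientation])
```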

Convert the image to grayscale.

I = im2gray(im2single(RGB));

Filter the grayscale image using the Gabor filters. Display the 24 filtered images in a montage.

gabormag = imgaborfilt(I,g);
montage(gabormag,"Size",[4 6])

Smooth each filtered image to remove local variations. Display the smoothed images in a montage.

for i = 1:length(g)
    sigma = 0.5*g(i).Wavelength;
    gabormag(:,:,i) = imgaussfilt(gabormag(:,:,i),3*sigma); 
end
montage(gabormag,"Size",[4 6])

Supplement the information about each pixel with spatial location information. This additional information allows the k-means clustering algorithm to prefer groupings that are close together spatially.

Get the x and y coordinates of all pixels in the input image.

nrows = size(RGB,1);
ncols = size(RGB,2);
[X,Y] = meshgrid(1:ncols,1:nrows);
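
As a small illustration of what meshgrid returns, the following sketch uses a hypothetical grid of 2 rows and 3 columns instead of the image size. X holds the column index of each pixel and Y holds the row index.

```matlab
% Hypothetical 2-by-3 grid, used only to show the layout of X and Y.
[X,Y] = meshgrid(1:3,1:2);

disp(X)   % every row is 1 2 3 (column index)
disp(Y)   % every column is 1 2 (row index)
```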

Concatenate the intensity information, neighborhood texture information, and spatial information about each pixel.

For this example, the feature set includes intensity image I instead of the original color image, RGB. The color information is omitted from the feature set because the yellow color of the dog's fur is similar to the yellow hue of the tiles. The color channels do not provide enough distinct information about the dog and the background to make a clean segmentation.

featureSet = cat(3,I,gabormag,X,Y);

Segment the image into two regions using k-means clustering with the supplemented feature set.

L2 = imsegkmeans(featureSet,2,"NormalizeInput",true);
C = labeloverlay(RGB,L2);
imshow(C)
title("Labeled Image with Additional Pixel Information")

Read an image into the workspace.

I = imread("peppers.png");
imshow(I)
title("Original Image")

Segment the image into 50 regions by using k-means clustering. Return the label matrix L and the cluster centroid locations C. The cluster centroid locations are the RGB values of each of the 50 colors.

[L,C] = imsegkmeans(I,50);

Convert the label matrix into an RGB image. Specify the cluster centroid locations, C, as the colormap for the new image.

J = label2rgb(L,im2double(C));

Display the quantized image.

imshow(J)
title("Color Quantized Image")
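
The quantization step can also be sketched with direct indexing, which makes the role of the centroid matrix explicit: each pixel is replaced by the row of C that its label selects. This is an illustrative alternative to the label2rgb call, and the names J2 and numColors are assumptions introduced for the sketch.

```matlab
I = imread("peppers.png");
[L,C] = imsegkmeans(I,50);

% C(L,:) selects one centroid row per pixel, in column-major pixel order.
% Reshaping restores the rows-by-columns-by-3 layout of the input image.
J2 = reshape(C(L,:),size(I));

% Count the distinct colors that remain after quantization.
numColors = size(unique(reshape(J2,[],3),"rows"),1)
```

The result has at most 50 distinct colors, one per cluster.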

Write the original and quantized images to file. The quantized image file is approximately one quarter the size of the original image file.

imwrite(I,"peppersOriginal.png");
imwrite(J,"peppersQuantized.png");
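
One way to check the file sizes is to compare the byte counts reported by dir. The sketch below is self-contained and writes to the temporary folder returned by tempdir, which is a choice made for this sketch; the example itself writes to the current folder. The exact ratio can vary with the MATLAB release and PNG encoder.

```matlab
I = imread("peppers.png");
[L,C] = imsegkmeans(I,50);
J = label2rgb(L,im2double(C));

% Write both images to a temporary folder (location chosen for this sketch).
origFile = fullfile(tempdir,"peppersOriginal.png");
quantFile = fullfile(tempdir,"peppersQuantized.png");
imwrite(I,origFile);
imwrite(J,quantFile);

% Compare the sizes of the two files on disk.
origInfo = dir(origFile);
quantInfo = dir(quantFile);
sizeRatio = quantInfo.bytes / origInfo.bytes
```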

Read and display an image of tissue stained with hematoxylin and eosin (H&E). This staining method helps pathologists distinguish between tissue types that are stained blue-purple and pink.

he = imread("hestain.png");
imshow(he)
title("H&E Image");
text(size(he,2),size(he,1)+15, ...
     "Image courtesy of Alan Partin, Johns Hopkins University", ...
     FontSize=7,HorizontalAlignment="right");

Convert the image to the L*a*b* color space using the rgb2lab function. The L*a*b* color space separates image luminosity and color. This makes it easier to segment regions by color, independent of lightness.

lab_he = rgb2lab(he);

To segment the image using only color information, limit the image to the a* and b* values in lab_he. Convert the image to data type single for use with imsegkmeans. Use the imsegkmeans function to segment the image into three regions.

ab = lab_he(:,:,2:3);
ab = im2single(ab);
numColors = 3;
L2 = imsegkmeans(ab,numColors);

Display the label image as an overlay on the original image. The label image separates the white, blue-purple, and pink stained tissue regions.

B2 = labeloverlay(he,L2);
imshow(B2)
title("Labeled Image a*b*")
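
To look at one region in isolation, the label matrix can be turned into a logical mask and applied to the original image. This is a minimal sketch; which label number corresponds to which tissue color is not specified here, so label 1 is an arbitrary choice. The names mask1 and cluster1 are illustrative.

```matlab
he = imread("hestain.png");
lab_he = rgb2lab(he);
ab = im2single(lab_he(:,:,2:3));
L2 = imsegkmeans(ab,3);

% Keep only the pixels assigned to label 1 and set all others to black.
mask1 = L2 == 1;
cluster1 = he .* uint8(mask1);

imshow(cluster1)
title("Objects in Cluster 1")
```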

Input Arguments

I

Image to segment, specified as a 2-D grayscale image, 2-D color image, or 2-D multispectral image. If the original image is of data type double, convert the image to data type single by using the im2single function.

Data Types: single | int8 | int16 | uint8 | uint16

k

Number of clusters to create, specified as a positive integer.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: imsegkmeans(I,k,NumAttempts=5) repeats the clustering process five times.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: imsegkmeans(I,k,"NumAttempts",5) repeats the clustering process five times.
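
The two syntaxes can be compared side by side. In this sketch the image and the value of NumAttempts are arbitrary, and the variable names are illustrative. Given the reproducibility noted in Tips, the two calls are expected to return the same labels.

```matlab
I = imread("cameraman.tif");

% Name=Value syntax (R2021a and later).
Lnew = imsegkmeans(I,3,NumAttempts=5);

% Comma-separated syntax with the name in quotes (works in earlier releases too).
Lold = imsegkmeans(I,3,"NumAttempts",5);

sameLabels = isequal(Lnew,Lold)
```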

NormalizeInput

Normalize input data to zero mean and unit variance, specified as a numeric or logical 1 (true) or 0 (false). If you specify true, then imsegkmeans normalizes each channel of the input individually.
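
To illustrate what per-channel normalization to zero mean and unit variance means, the following sketch applies it by hand to a synthetic two-channel input whose channels have very different ranges. It shows the concept only; the exact computation inside imsegkmeans (for example, which standard deviation estimator it uses) is not stated here and is an assumption of the sketch.

```matlab
% Synthetic two-channel input: channel 2 has a range 100 times larger.
featureSet = cat(3,rand(64,64,"single"),100*rand(64,64,"single"));

% Mean and standard deviation of each channel, computed over rows and columns.
mu = mean(featureSet,[1 2]);
sigma = std(featureSet,0,[1 2]);

% After normalization, each channel has zero mean and unit variance,
% so no channel dominates the distance computation because of its scale.
normalized = (featureSet - mu) ./ sigma;
```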

NumAttempts

Number of times to repeat the clustering process using new initial cluster centroid positions, specified as a positive integer.

MaxIterations

Maximum number of iterations, specified as a positive integer.

Threshold

Accuracy threshold, specified as a positive number. The algorithm stops when each of the cluster centers moves less than the threshold value in consecutive iterations.
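
The roles of the iteration limit and the accuracy threshold can be sketched with a generic Lloyd-style k-means loop on synthetic 2-D points. This is not the imsegkmeans implementation. The data, the fixed initial centers, and the variable names maxIterations and threshold are all assumptions made for illustration.

```matlab
rng(0);                                   % fixed seed for repeatable synthetic data
X = [randn(50,2); randn(50,2) + 5];       % two synthetic groups of points
k = 2;
maxIterations = 100;                      % hypothetical iteration limit
threshold = 1e-4;                         % hypothetical accuracy threshold
centers = X([1 51],:);                    % simplistic initialization, one point per group

for iter = 1:maxIterations
    % Assign each point to its nearest center (squared Euclidean distance).
    d = zeros(size(X,1),k);
    for j = 1:k
        d(:,j) = sum((X - centers(j,:)).^2,2);
    end
    [~,labels] = min(d,[],2);

    % Move each center to the mean of the points assigned to it.
    newCenters = centers;
    for j = 1:k
        if any(labels == j)
            newCenters(j,:) = mean(X(labels == j,:),1);
        end
    end

    % Stop when every center moves less than the threshold.
    shift = max(sqrt(sum((newCenters - centers).^2,2)));
    centers = newCenters;
    if shift < threshold
        break
    end
end
```

The loop exits either when the largest center movement falls below the threshold or when the iteration limit is reached, whichever comes first.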

Output Arguments

L

Label matrix, returned as a matrix of positive integers. Pixels with label 1 belong to the first cluster, pixels with label 2 belong to the second cluster, and so on for each of the k clusters. L has the same first two dimensions as image I. The data type of L depends on the number of clusters.

Data Type of L    Number of Clusters
uint8             k <= 255
uint16            256 <= k <= 65535
uint32            65536 <= k <= 2^32-1
double            2^32 <= k

centers

Cluster centroid locations, returned as a numeric matrix of size k-by-c, where k is the number of clusters and c is the number of channels. centers has the same data type as the image I.
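
The output sizes and data types described above can be checked with a short sketch. The image is downsized only to keep the 300-cluster run quick, and the variable names are illustrative.

```matlab
I = imresize(imread("peppers.png"),0.25);   % uint8 RGB image, c = 3 channels

[L3,centers3] = imsegkmeans(I,3);           % k <= 255
[L300,centers300] = imsegkmeans(I,300);     % 256 <= k <= 65535

% Data type of the label matrix for each value of k.
class(L3)
class(L300)

% Size and data type of the centroid matrices.
size(centers3)
size(centers300)
class(centers300)
```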

Tips

  • The function yields reproducible results. The output does not vary across multiple runs given the same input arguments.

  • The imsegkmeans function accepts input images in all supported color spaces. Using a different color space generates different results. If you do not receive satisfactory results for an input image, consider trying an alternative color space. For more information about color spaces in MATLAB®, see Understanding Color Spaces and Color Space Conversion.

  • To perform k-means clustering on images of data type double, convert the image to data type single by using the im2single function. For applications requiring input data of type double, see the kmeans (Statistics and Machine Learning Toolbox) function.
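
The conversion and reproducibility tips can be combined in a minimal sketch. The synthetic double-precision image and the choice of four clusters are assumptions made for illustration.

```matlab
% Synthetic double-precision RGB image, used only for illustration.
Idouble = rand(64,64,3);

% Convert to single before calling imsegkmeans.
Isingle = im2single(Idouble);

% Run the same segmentation twice with identical inputs.
Lfirst = imsegkmeans(Isingle,4);
Lsecond = imsegkmeans(Isingle,4);

% Identical inputs are expected to give identical labels.
isReproducible = isequal(Lfirst,Lsecond)
```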

References

[1] Arthur, David, and Sergei Vassilvitskii. “K-Means++: The Advantages of Careful Seeding.” In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–35. SODA ’07. USA: Society for Industrial and Applied Mathematics, 2007.

Version History

Introduced in R2018b