Image Thresholding and Segmentation

Thresholding

Thresholding is the simplest method of image segmentation. It is a non-linear operation that converts a grayscale image into a binary image: each pixel is assigned one of two levels depending on whether its value is above the specified threshold. In other words, if a pixel value is greater than the threshold, it is assigned one value (for example, white); otherwise it is assigned another value (for example, black). In OpenCV, we use the cv2.threshold() function:

cv2.threshold(src, thresh, maxval, type[, dst])

This function applies fixed-level thresholding to a single-channel array. It is typically used to get a bi-level (binary) image out of a grayscale image (compare() could also be used for this purpose) or to remove noise, that is, to filter out pixels whose values are too small or too large. The function supports several types of thresholding.

The function returns two values: the threshold value that was used and the thresholded image.

src - input array (single-channel, 8-bit or 32-bit floating point). This is the source image, which should be a grayscale image.
thresh - threshold value used to classify the pixel values.
maxval - maximum value to use with the THRESH_BINARY and THRESH_BINARY_INV thresholding types. It is the value assigned to a pixel whose value is greater than the threshold (or, for THRESH_BINARY_INV, not greater than it).
type - thresholding type, one of:
1. cv2.THRESH_BINARY
2. cv2.THRESH_BINARY_INV
3. cv2.THRESH_TRUNC
4. cv2.THRESH_TOZERO
5. cv2.THRESH_TOZERO_INV
dst - output array of the same size and type as src.
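
To make the five thresholding types concrete, here is a minimal NumPy sketch of the per-pixel rule each one applies. It is only an illustration, not OpenCV's implementation: the function name threshold_sketch and the string labels are hypothetical, and the sketch assumes an integer input array.

```python
import numpy as np

def threshold_sketch(src, thresh, maxval, kind):
    """Pure-NumPy illustration of the five fixed-level thresholding rules.

    `kind` is a hypothetical string label, not an OpenCV constant.
    """
    src = np.asarray(src)
    above = src > thresh  # strict "greater than" comparison
    if kind == "BINARY":
        out = np.where(above, maxval, 0)      # maxval above, 0 otherwise
    elif kind == "BINARY_INV":
        out = np.where(above, 0, maxval)      # 0 above, maxval otherwise
    elif kind == "TRUNC":
        out = np.where(above, thresh, src)    # clip to thresh, keep the rest
    elif kind == "TOZERO":
        out = np.where(above, src, 0)         # keep values above, zero the rest
    elif kind == "TOZERO_INV":
        out = np.where(above, 0, src)         # zero values above, keep the rest
    else:
        raise ValueError("unknown thresholding kind: %r" % (kind,))
    return out.astype(src.dtype)

# One row of sample gray levels around the threshold 127
row = np.array([0, 100, 127, 128, 200, 255], dtype=np.uint8)
for kind in ("BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"):
    print(kind, threshold_sketch(row, 127, 255, kind))
```

Note that a pixel exactly equal to the threshold (127 here) counts as "not greater", so it is treated like the darker pixels in every mode.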

Thresholding - code and output

The code looks like this:

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('gradient.png',0)
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img,127,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO_INV)

titles = ['Original Image','BINARY','BINARY_INV','TRUNC','TOZERO','TOZERO_INV']
images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]

for i in range(6):
    plt.subplot(2,3,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])

plt.show()

Output:

Original images: gradient.png and circle.png

Adaptive Thresholding

Using a global threshold value may not be a good choice when an image has different lighting conditions in different areas. In that case, we may want to use adaptive thresholding. The algorithm calculates a threshold for each small region of the image, so different regions of the same image get different thresholds, which gives better results for images with varying illumination.

cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C[, dst])

where:

src - source 8-bit single-channel image.
dst - destination image of the same size and the same type as src.
maxValue - non-zero value assigned to the pixels for which the condition is satisfied.
adaptiveMethod - adaptive thresholding algorithm to use; it decides how the threshold value is calculated. With ADAPTIVE_THRESH_MEAN_C the threshold value is the mean of the neighbourhood area; with ADAPTIVE_THRESH_GAUSSIAN_C it is the weighted sum of the neighbourhood values, where the weights are a Gaussian window.
thresholdType - thresholding type, which must be either THRESH_BINARY or THRESH_BINARY_INV.
blockSize - size of the pixel neighbourhood used to calculate the threshold value for a pixel. It must be an odd number: 3, 5, 7, and so on.
C - constant subtracted from the mean or weighted mean. Normally it is positive, but it may be zero or negative as well.
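
The mean variant can be sketched in a few lines of NumPy to show how blockSize and C interact. This is a rough illustration, not OpenCV's code: the function name adaptive_mean_sketch is hypothetical, the border handling (repeating edge pixels) is an assumption of the sketch, and it needs NumPy 1.20 or later for sliding_window_view.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_sketch(src, max_value, block_size, c):
    """Binary adaptive thresholding with a plain (unweighted) local mean."""
    if block_size < 3 or block_size % 2 != 1:
        raise ValueError("block_size must be an odd number >= 3")
    src = np.asarray(src, dtype=np.float64)
    pad = block_size // 2
    # Repeat edge pixels so that every pixel has a full neighbourhood
    # (an assumption of this sketch).
    padded = np.pad(src, pad, mode="edge")
    windows = sliding_window_view(padded, (block_size, block_size))
    # Per-pixel threshold: local mean minus the constant C
    local_threshold = windows.mean(axis=(-2, -1)) - c
    return np.where(src > local_threshold, max_value, 0).astype(np.uint8)

# Synthetic test image: brightness ramps up from left to right,
# with one darker horizontal stroke across row 4.
cols = np.arange(16)
img = np.tile(60 + 12 * cols, (9, 1)).astype(np.uint8)
img[4, :] -= 40

global_result = np.where(img > 127, 255, 0).astype(np.uint8)
adaptive_result = adaptive_mean_sketch(img, 255, block_size=3, c=5)

print(global_result[4])    # the global threshold loses the stroke on the bright side
print(adaptive_result[4])  # the local threshold marks the stroke across the whole row
```

On this synthetic ramp the global threshold at 127 mostly separates the dark left half from the bright right half, while the local threshold follows the ramp and picks out only the stroke.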

Here is the code for adaptive thresholding:

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('bw.png',0)
img = cv2.medianBlur(img,5)

ret,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
            cv2.THRESH_BINARY,11,2)
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
            cv2.THRESH_BINARY,11,2)

titles = ['Original Image', 'Global Thresholding (v = 127)',
            'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]

for i in range(4):
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()

Original image: bw.png

Separating Bimodal Distributions with Otsu Threshold

Otsu binarization automatically calculates a threshold value from the image histogram of a bimodal image.

It uses the cv2.threshold() function with an extra flag, cv2.THRESH_OTSU. For the threshold value, simply pass zero. The algorithm then finds the optimal threshold value and returns it as the first output, retVal. If Otsu thresholding is not used, retVal is the same as the threshold value we passed in.
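
To show what "optimal" means here, the following NumPy sketch searches the histogram for the threshold that maximises the between-class variance, which is the criterion Otsu's method is based on. The function name otsu_threshold_sketch is hypothetical, an 8-bit image is assumed, and details such as tie-breaking may differ from OpenCV's implementation.

```python
import numpy as np

def otsu_threshold_sketch(gray):
    """Return the threshold t that maximises the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(), minlength=256)
    levels = np.arange(256)
    n = hist.sum()

    count0 = np.cumsum(hist)           # pixels in the lower class for each t
    count1 = n - count0                # pixels in the upper class for each t
    sum0 = np.cumsum(hist * levels)    # sum of gray levels in the lower class
    total = sum0[-1]

    between = np.zeros(256)
    valid = (count0 > 0) & (count1 > 0)   # skip splits with an empty class
    mean0 = sum0[valid] / count0[valid]
    mean1 = (total - sum0[valid]) / count1[valid]
    between[valid] = (count0[valid] / n) * (count1[valid] / n) * (mean0 - mean1) ** 2
    return int(np.argmax(between))

# Synthetic bimodal data: one cluster around 60, another around 190
low = np.repeat(np.arange(55, 66), 10)
high = np.repeat(np.arange(185, 196), 10)
img = np.concatenate([low, high]).astype(np.uint8).reshape(11, 20)

t = otsu_threshold_sketch(img)
binary = np.where(img > t, 255, 0).astype(np.uint8)
print(t)  # a gray level that falls between the two clusters
```

For a histogram with two well-separated peaks, any threshold in the gap gives the same split, so the exact number returned matters less than the fact that it lands between the peaks.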

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('EinStein.jpg',0)

# global thresholding
ret1,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)

# Otsu's thresholding
ret2,th2 = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(5,5),0)
ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# plot all the images and their histograms
images = [img, 0, th1,
          img, 0, th2,
          blur, 0, th3]
titles = ['Original Noisy Image','Histogram','Global Thresholding (v=127)',
          'Original Noisy Image','Histogram',"Otsu's Thresholding",
          'Gaussian filtered Image','Histogram',"Otsu's Thresholding"]

for i in range(3):
    plt.subplot(3,3,i*3+1),plt.imshow(images[i*3],'gray')
    plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+2),plt.hist(images[i*3].ravel(),256)
    plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+3),plt.imshow(images[i*3+2],'gray')
    plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
plt.show()