otsu算法详细推导、实现及Multi Level OTSU算法实现

微信公众号:幼儿园的学霸

目录

简介

OTSU算法也称最大类间差法,有时也称之为大津算法,由大津于1979年提出,被认为是图像分割中阈值选取的最佳算法,计算简单,不受图像亮度和对比度的影响,因此在数字图像处理上得到了广泛的应用。它是按图像的灰度分布特性,将图像分成背景(background)和目标(object)两部分。考虑到方差是灰度分布均匀性的一种度量,理想情况下,对于同一类,其类内方差应该是很小的,同时背景和目标之间的类间方差越大,说明构成图像的两部分的差别越大,当部分目标错分为背景或部分背景错分为目标都会导致两部分差别变小。因此,使类间方差最大的分割意味着错分概率最小。

该方法的基本思想是根据选取的阈值将图像分为目标和背景两个部分,计算该灰度值下的类间方差值,将类间方差最大时对应的阈值/灰度值作为最佳阈值。

推导及实现

常规推导

设图像尺寸为MxN,其二值化的最佳阈值为T,该阈值将图像分为背景和目标2类别。其中属于背景的像素点数量为N0,属于目标的像素点数量为N1;背景像素点数占整幅图像的比例为 ω 0 \omega_0 ω0,其灰度均值为 μ 0 \mu_0 μ0;目标像素点数占整幅图像的比例为 ω 1 \omega_1 ω1,其灰度均值为 μ 1 \mu_1 μ1;整幅图像的灰度均值为 μ \mu μ,则:
ω 0 = N 0 M ∗ N (1) \omega_{0} = \frac{N_0}{M*N} \tag{1} ω0=MNN0(1)

ω 1 = N 1 M ∗ N (2) \omega_{1} = \frac{N_1}{M*N} \tag{2} ω1=MNN1(2)

N 0 + N 1 = M ∗ N (3) N_0 + N_1 = M*N \tag{3} N0+N1=MN(3)
由上面公式(1)(2)(3)得:
ω 0 + ω 1 = 1 (4) \omega_{0} + \omega_{1} = 1 \tag{4} ω0+ω1=1(4)

μ = μ 0 ∗ N 0 + μ 1 ∗ N 1 M ∗ N = μ 0 ω 0 + μ 1 ω 1 (5) \begin{aligned} \mu &= \frac{\mu_0*N_0 + \mu_1*N_1}{M*N} \\ &= \mu_0\omega_0 + \mu_1\omega_1 \tag{5} \end{aligned} μ=MNμ0N0+μ1N1=μ0ω0+μ1ω1(5)
在论文中,类内方差(Within-class variance)公式如下:

σ W 2 = ω 0 σ 0 2 + ω 1 σ 1 2 (6) \sigma_{W}^{2}=\omega_{0} \sigma_{0}^{2}+\omega_{1} \sigma_{1}^{2} \tag{6} σW2=ω0σ02+ω1σ12(6)

从上式可以看出,类内方差指的是两类像素的方差的加权和,这里权指的是该类像素点数量占整个图像像素点数量的比值

类间方差(Between-class variance)的公式如下:

σ B 2 = ω 0 ( μ 0 − μ T ) 2 + ω 1 ( μ 1 − μ T ) 2 (7) \sigma_{B}^{2}=\omega_{0}(\mu_{0}-\mu_{T})^{2}+\omega_{1}(\mu_{1}-\mu_{T})^{2} \tag{7} σB2=ω0(μ0μT)2+ω1(μ1μT)2(7)
可等价为:

σ B 2 = ω 0 ω 1 ( μ 0 − μ 1 ) 2 (8) \sigma_{B}^{2}=\omega_{0} \omega_{1}(\mu_{0}-\mu_{1})^{2} \tag{8} σB2=ω0ω1(μ0μ1)2(8)
其推导过程如下:

σ B 2 = ω 0 ( μ 0 − μ ) 2 + ω 1 ( μ 1 − μ ) 2 = ω 0 ( μ 0 − ( ω 0 μ 0 + ω 1 μ 1 ) ) 2 + ω 1 ( μ 1 − ( ω 0 μ 0 + ω 1 μ 1 ) ) 2 = ω 0 ( μ 0 − ω 0 μ 0 − ω 1 μ 1 ) 2 + ω 1 ( μ 1 − ω 0 μ 0 − ω 1 μ 1 ) 2 = ω 0 ( ( 1 − ω 0 ) μ 0 − ω 1 μ 1 ) 2 + ω 1 ( ( 1 − ω 1 ) μ 1 − ω 0 μ 0 ) 2 = ω 0 ( ω 1 μ 0 − ω 1 μ 1 ) 2 + ω 1 ( ω 0 μ 1 − ω 0 μ 0 ) 2 = ω 0 ( ω 1 ( μ 0 − μ 1 ) ) 2 + ω 1 ( ω 0 ( μ 1 − μ 0 ) ) 2 = ω 0 ω 1 2 ( μ 0 − μ 1 ) 2 + ω 1 ω 0 2 ( μ 1 − μ 0 ) 2 = ( ω 0 + ω 1 ) ω 0 ω 1 ( μ 0 − μ 1 ) 2 = ω 0 ω 1 ( μ 0 − μ 1 ) 2 (9) \begin{aligned} \sigma_{B}^{2} &= \omega_{0}(\mu_{0}-\mu)^{2}+\omega_{1}(\mu_{1}-\mu)^{2} \\ &=\omega_{0}(\mu_{0}-(\omega_{0} \mu_{0}+\omega_{1} \mu_{1}))^{2}+\omega_{1}(\mu_{1}-(\omega_{0} \mu_{0}+\omega_{1} \mu_{1}))^{2} \\ &= \omega_{0}(\mu_{0}-\omega_{0} \mu_{0}-\omega_{1} \mu_{1})^{2}+\omega_{1}(\mu_{1}-\omega_{0} \mu_{0}-\omega_{1} \mu_{1})^{2} \\ &= \omega_{0}((1-\omega_{0}) \mu_{0}-\omega_{1} \mu_{1})^{2}+\omega_{1}((1-\omega_{1}) \mu_{1}-\omega_{0} \mu_{0})^{2} \\ &= \omega_{0}(\omega_{1} \mu_{0}-\omega_{1} \mu_{1})^{2}+\omega_{1}(\omega_{0} \mu_{1}-\omega_{0} \mu_{0})^{2} \\ &= \omega_{0}(\omega_{1}(\mu_{0}-\mu_{1}))^{2}+\omega_{1}(\omega_{0}(\mu_{1}-\mu_{0}))^{2} \\ &= \omega_{0} \omega_{1}^{2}(\mu_{0}-\mu_{1})^{2}+\omega_{1} \omega_{0}^{2}(\mu_{1}-\mu_{0})^{2} \\ &= (\omega_{0}+\omega_{1}) \omega_{0} \omega_{1}(\mu_{0}-\mu_{1})^{2} \\ &= \omega_{0} \omega_{1}(\mu_{0}-\mu_{1})^{2} \tag{9} \end{aligned} σB2=ω0(μ0μ)2+ω1(μ1μ)2=ω0(μ0(ω0μ0+ω1μ1))2+ω1(μ1(ω0μ0+ω1μ1))2=ω0(μ0ω0μ0ω1μ1)2+ω1(μ1ω0μ0ω1μ1)2=ω0((1ω0)μ0ω1μ1)2+ω1((1ω1)μ1ω0μ0)2=ω0(ω1μ0ω1μ1)2+ω1(ω0μ1ω0μ0)2=ω0(ω1(μ0μ1))2+ω1(ω0(μ1μ0))2=ω0ω12(μ0μ1)2+ω1ω02(μ1μ0)2=(ω0+ω1)ω0ω1(μ0μ1)2=ω0ω1(μ0μ1)2(9)
理想情况下,对于同一类,其类内方差应该是很小的,同时背景和目标之间的类间方差越大,说明构成图像的两部分的差别越大。因此采用遍历的方法得到使类间方差最大的阈值T即为所求。

大津法的形象理解:对于直方图有两个峰值的图像,大津法求得的T近似等于两个峰值之间的低谷

算法步骤及实现

步骤

按照上面的公式,可以初步确定一个容易实现的算法过程:
1.根据图像的尺寸确定图像总的像素点数量MxN
2.遍历图像灰度级0-255中的每一个灰度值L,
假设L将图像分为背景(Background)和目标(Object)两部分
2.1 求解背景的像素点数量及像素的灰度值和 N 0 , S u m 0 N_0,Sum_0 N0,Sum0;目标的像素点数量及像素的灰度值和 N 1 , S u m 1 N_1,Sum_1 N1,Sum1;
2.2 根据上面公式(1)(2)求解两类像素点各自占图像的比例 ω 0 , ω 1 \omega_{0},\omega_{1} ω0,ω1;
根据下面的公式计算两类的灰度均值 μ 0 , μ 1 \mu_0,\mu_1 μ0,μ1:

μ 0 = S u m 0 N 0 μ 1 = S u m 1 N 1 \mu_0 = \frac{Sum_0}{N_0} \\ \mu_1 = \frac{Sum_1}{N_1} μ0=N0Sum0μ1=N1Sum1

2.3 根据公式(8)计算类间方差
3.在上面遍历过程中找到使类间方差最大的灰度值L即为所求

上面步骤存在很大的优化空间,仅为说明算法过程

实现

实现代码如下:

//
// Created by liheng on 11/22/20.
//

#include <iostream>
#include <chrono>
#include <opencv2/opencv.hpp>

/*!
 * 大津法求解二值化分割阈值
 * @param image 输入图像.CV_8UC1
 * @return >0 大津法求解的分割阈值
 * @author liheng
 */
int ThresholdOTSU(const cv::Mat &image)
{
    CV_Assert( image.type() == CV_8UC1 );

    int thresh = -1;

    int pixNumber = image.cols * image.rows;//图像总像素点
    int pixCount[256] = { 0 }; //每个灰度值所占像素个数.灰度级为0-255.


    //1°统计每个灰度级中像素的个数.本质:灰度直方图
    {
        //for (int i=0; i<image.rows; ++i)
        //{
        //    for (int j=0; j<image.cols; ++j)
        //        pixCount[image.at<unsigned char>(i,j)]++;
        //}

        cv::Size size(image.size());
        if( image.isContinuous() )
        {
            size.width *= size.height;
            size.height = 1;
        }
        for(int i=0; i<size.height; ++i )
        {
            const uchar* sdata = image.ptr(i);
            for( int j = 0; j < size.width; ++j )
                pixCount[sdata[j]] += 1;
        }
    }

    //2°穷举0-255各灰度,求解使得类间方差最大的灰度值k
    const auto PpixNum = 1.0f/image.rows*image.cols;//像素总数
    float gmax = -FLT_MAX;
    for(int k=0;k<256;++k)
    {
        //背景及前景的像素点数量及和(用于求均值)
        int backNum = 0;
        float backSum = 0;

        int objectNum = 0;
        float objectSum = 0;

        //遍历区分背景和前景
        //for(int i=0;i<256;++i)
        //{
        //    if( i<=k )//背景,此处为i<=k,如果为i<k呢?
        //    {
        //        backNum +=pixCount[i];
        //        backSum += i*pixCount[i];
        //    }
        //    else//前景
        //    {
        //        objectNum += pixCount[i];
        //        objectSum += i*pixCount[i];
        //    }
        //}

        //上面修改为该形式,能够减少for循环中的判断
        //阈值k将像素分为背景和前景2部分
        for(int i=0;i<=k;++i)//背景,此处为i<=k,如果为i<k呢?
        {
            backNum +=pixCount[i];
            backSum += i*pixCount[i];
        }
        for(int i=k+1;i<256;++i)//前景
        {
            objectNum += pixCount[i];
            objectSum += i*pixCount[i];
        }

        //求解类间方差
        const auto w0 = backNum*PpixNum;
        const auto w1 = objectNum*PpixNum;
        //考虑到PpixNum为定值,上面2行可修改为下面形式,但结果可能与opencv结果存在稍许不同!!!
        //const auto w0 = backNum;
        //const auto w1 = objectNum;

        const float u0 = backSum*1.0f/(backNum+FLT_EPSILON);
        const float u1 = objectSum*1.0f/(objectNum+FLT_EPSILON);

        const float sigma = w0*w1*std::pow((u0-u1),2);//w0*w1*(u0-u1)^2
        if( sigma>=gmax )
        {
            thresh = k;
            gmax = sigma;
        }
    }

    //利用概率学相关内容,考虑将上面的双重循环,修改为单层循环


    return thresh;
}

void ShowHist(const cv::Mat& image,cv::Mat& hist_img,int scale=2)
{
    int bins = 256;
    int hist_size[] = {bins};
    float range[] = { 0, 256 };
    const float* ranges[] = { range};
    cv::MatND hist;
    int channels[] = {0};

    cv::calcHist( &image, 1, channels, cv::Mat(), // do not use mask
              hist, 1, hist_size, ranges,
              true, // the histogram is uniform
              false );

    double max_val;
    cv::minMaxLoc(hist, nullptr, &max_val, nullptr, nullptr);
    const int hist_height=image.rows;
    hist_img = cv::Mat::zeros(hist_height,bins*scale, CV_8UC3);
    for(int i=0;i<bins;i++)
    {
        float bin_val = hist.at<float>(i);
        int intensity = cvRound(bin_val*hist_height/max_val);  //要绘制的高度
        cv::rectangle(hist_img,cv::Point(i*scale,hist_height-1),
                  cv::Point((i+1)*scale - 1, hist_height - intensity),
                  CV_RGB(255,255,255));
    }
    
    //cv::copyMakeBorder(hist_img,hist_img,10,10,10,10,cv::BORDER_CONSTANT,cv::Scalar(0,0,0));
}

int main()
{
    cv::Mat image = cv::imread("./input.bmp",cv::IMREAD_GRAYSCALE);
    cv::imshow("src",image);

    //求解直方图
    cv::Mat hist_img;
    int scale = 2;
    ShowHist(image,hist_img);

    cv::Mat ourdst,cvdst;

    auto th = ThresholdOTSU(image);
    cv::threshold(image,ourdst,th,255,cv::THRESH_BINARY);
    
    //在直方图上绘制阈值位置
    cv::line(hist_img,cv::Point(th*scale,0),cv::Point(th*scale,hist_img.rows-1),cv::Scalar(0,0,255),2);

    auto cvTh = cv::threshold(image,cvdst,0,255,cv::THRESH_OTSU);
    
    cv::imshow("hist",hist_img);
    cv::imshow("ourdst",ourdst);
    cv::imshow("cvdst",cvdst);



    cv::waitKey(0);

    return 0;
}

输入图像如下所示:
输入图像

OTSU阈值分割结果如下(和OpenCV分割结果一致,OpenCV源码分割结果不在展示):
OTSU分割结果

输入图像的灰度直方图和OTSU阈值如下图所示:图像中的红线位置为OTSU算法求解的阈值:
图片灰度值直方图及OTSU阈值位置

可以看到OTSU算法求解的阈值和灰度直方图中的2个双峰均值比较接近。

如果选择两个波峰之间的波谷作为阈值,就能轻松地把这两类像素分开。但是图像的直方图往往是不连续的,有非常多尖峰和抖动,要找到准确的极值点十分困难

从概率的角度解释

在上面的介绍中对OTSU算法的实现思想已经有了一定的认知。接着从概率学的角度继续探讨该算法。

推导

设待分割图像有L个灰度级,灰度值范围为[0,1,2,...,L-2,L-1],其中灰度值为i的像素个数为 n i n_i ni,那么可以得到图像总的像素个数为:
N = n 1 + n 2 + ⋯ + n L − 2 + n L − 1 = ∑ i = 0 L − 1 n i (11) \begin{aligned} N &=n_1+n_2+\cdots+n_{L-2}+n_{L-1} \\ &=\sum_{i=0}^{L-1} n_{i} \end{aligned} \tag{11} N=n1+n2++nL2+nL1=i=0L1ni(11)
则灰度值为i的像素在图像中出现的概率为:
p i = n i N i , p i ≥ 0 , ∑ i = 0 L − 1 p i = 1 (12) p_{i}=\frac{n_i}{N_i}, p_{i} \geq 0, \sum_{i=0}^{L-1} p_{i}=1 \tag{12} pi=Nini,pi0,i=0L1pi=1(12)
假设灰度值(阈值)k将图像中的像素划分为2类: C 0 , C 1 C_0,C_1 C0,C1,其中类 C 0 C_0 C0中像素的灰度级为 [ 0 , 1 , ⋯   , k − 1 , k ] [0,1,\cdots,k-1,k] [0,1,,k1,k],类 C 1 C_1 C1中像素的灰度级为 [ k + 1 , k + 2 , ⋯   , L − 2 , L − 1 ] [k+1,k+2,\cdots,L-2,L-1] [k+1,k+2,,L2,L1]
在图像中,某一类别出现的概率为:
ω 0 = P r ( C 0 ) = ∑ i = 0 k p i = ω ( k ) ω 1 = P r ( C 1 ) = ∑ i = k + 1 L − 1 p i = 1 − ω ( k ) (13) \begin{aligned} \omega_{0} &= P_r(C_{0})=\sum_{i=0}^{k} p_{i}=\omega(k) \\ \omega_{1} &= P_r(C_{1})=\sum_{i=k+1}^{L-1} p_{i}=1-\omega(k) \tag{13} \end{aligned} ω0ω1=Pr(C0)=i=0kpi=ω(k)=Pr(C1)=i=k+1L1pi=1ω(k)(13)
上面公式中,记为 ω ( k ) \omega(k) ω(k)是为了将出现概率和分割阈值k之间以函数的形式联系起来。后述公式是同样的道理。
根据期望公式,图像整体灰度值为:
μ = μ ( L ) = ∑ i = 0 L − 1 i ∗ p i (14) \mu = \mu(L)=\sum_{i=0}^{L-1} i * p_{i} \tag{14} μ=μ(L)=i=0L1ipi(14)
根据条件概率公式,某一类别的平均灰度值为:
μ 0 = ∑ i = 0 k i ∗ P r ( i ∣ C 0 ) = ∑ i = 0 k i ∗ p i ω 0 = ∑ i = 0 k i p i ω ( k ) = 1 ω ( k ) ∑ i = 0 k i p i = μ ( k ) ω ( k ) (15) \begin{aligned} \mu_{0} &=\sum_{i=0}^{k} i * P_r(i \mid C_{0})=\sum_{i=0}^{k} i*\frac{p_i}{\omega_0} \\ &= \sum_{i=0}^{k} \frac{i p_i}{\omega(k)} = \frac{1}{{\omega(k)}} \sum_{i=0}^{k}{i p_i} \\ &= \frac{\mu(k)}{\omega(k)} \tag{15} \end{aligned} μ0=i=0kiPr(iC0)=i=0kiω0pi=i=0kω(k)ipi=ω(k)1i=0kipi=ω(k)μ(k)(15)
同理,
μ 1 = ∑ i = k + 1 L − 1 i ∗ P r ( i ∣ C 1 ) = ∑ i = k + 1 L − 1 i ∗ p i ω 1 = μ − μ ( k ) 1 − ω ( k ) (16) \begin{aligned} \mu_{1} &=\sum_{i=k+1}^{L-1} i*P_r(i \mid C_{1})=\sum_{i=k+1}^{L-1} i*\frac{p_{i}}{\omega_{1}} \\ &=\frac{\mu-\mu(k)}{1-\omega(k)} \end{aligned} \tag{16} μ1=i=k+1L1iPr(iC1)=i=k+1L1iω1pi=1ω(k)μμ(k)(16)
根据上述公式(13)~(16),对于任一分割阈值k,有下面公式成立:
ω 0 μ 0 + ω 1 μ 1 = μ , ω 0 + ω 1 = 1 (17) \omega_{0} \mu_{0}+\omega_{1} \mu_{1}=\mu, \quad \omega_{0}+\omega_{1}=1 \tag{17} ω0μ0+ω1μ1=μ,ω0+ω1=1(17)

除了期望,还可以计算图像的方差和各类别的方差,如下:
σ 2 = ∑ i = 0 L − 1 ( i − μ ) 2 p i σ 0 2 = ∑ i = 0 k ( i − μ 0 ) 2 P r ( i ∣ C 0 ) = ∑ i = 0 k ( i − μ 0 ) 2 p i ω 0 σ 1 2 = ∑ i = k + 1 L − 1 ( i − μ 1 ) 2 P r ( i ∣ C 1 ) = ∑ i = k + 1 L − 1 ( i − μ 1 ) 2 p i ω 1 (18) \begin{gathered} \sigma^{2}=\sum_{i=0}^{L-1}(i-\mu)^{2} p_{i} \\ \sigma_{0}^{2}=\sum_{i=0}^{k}(i-\mu_{0})^{2} P_r(i \mid C_{0})=\sum_{i=0}^{k} \frac{(i-\mu_{0})^{2} p_i} {\omega_{0}} \\ \sigma_{1}^{2}=\sum_{i=k+1}^{L-1}(i-\mu_{1})^{2} P_r(i \mid C_{1})=\sum_{i=k+1}^{L-1}\frac{(i-\mu_{1})^{2} p_{i}} {\omega_{1}} \end{gathered} \tag{18} σ2=i=0L1(iμ)2piσ02=i=0k(iμ0)2Pr(iC0)=i=0kω0(iμ0)2piσ12=i=k+1L1(iμ1)2Pr(iC1)=i=k+1L1ω1(iμ1)2pi(18)

再对公式(8)进行进一步整理:
σ B 2 = ω 0 ω 1 ( μ 0 − μ 1 ) 2 = ω 0 ω 1 ( μ ( k ) ω ( k ) − μ − μ ( k ) 1 − ω ( k ) ) 2 = ω ( k ) ( 1 − ω ( k ) ) ( μ ( k ) ∗ ( 1 − ω ( k ) ) − ω ( k ) ∗ ( μ − μ ( k ) ) ω ( k ) ∗ ( 1 − ω ( k ) ) ) 2 = ω ( k ) ( 1 − ω ( k ) ) ( μ ( k ) − μ ∗ ω ( k ) ω ( k ) ∗ ( 1 − ω ( k ) ) ) 2 = ( μ ( k ) − μ ∗ ω ( k ) ) 2 ω ( k ) ( 1 − ω ( k ) ) (19) \begin{aligned} \sigma_{B}^{2} &= \omega_{0} \omega_{1}(\mu_{0}-\mu_{1})^{2} \\ &= \omega_{0} \omega_{1}\left(\frac{\mu(k)}{\omega(k)} -\frac{\mu-\mu(k)}{1-\omega(k)} \right)^2 \\ &= \omega(k)(1-\omega(k)) \left(\frac{\mu(k)*(1-\omega(k)) - \omega(k)*(\mu-\mu(k)) }{\omega(k)*(1-\omega(k))}\right)^2 \\ &= \omega(k)(1-\omega(k)) \left(\frac{\mu(k) - \mu*\omega(k) }{\omega(k)*(1-\omega(k))}\right)^2 \\ &= \frac{\left( \mu(k) - \mu*\omega(k) \right)^2}{\omega(k)(1-\omega(k))} \end{aligned} \tag{19} σB2=ω0ω1(μ0μ1)2=ω0ω1(ω(k)μ(k)1ω(k)μμ(k))2=ω(k)(1ω(k))(ω(k)(1ω(k))μ(k)(1ω(k))ω(k)(μμ(k)))2=ω(k)(1ω(k))(ω(k)(1ω(k))μ(k)μω(k))2=ω(k)(1ω(k))(μ(k)μω(k))2(19)
因此求解最大类间方差可写作:
σ B 2 ( k ∗ ) = max ⁡ 0 ≤ k < L σ B 2 ( k ) (20) \sigma_{B}^{2}\left(k^{*}\right)=\max _{0 \leq k<L} \sigma_{B}^{2}(k) \tag{20} σB2(k)=0k<LmaxσB2(k)(20)

公式(19)以函数的表达形式展示类间方差和分割阈值k之间的关系. 图像整体的灰度均值和分割阈值是无关的,这样能够进一步优化代码实现,优化后的代码附后。

在论文中还有下列公式成立:
σ W 2 + σ B 2 = σ 2 (21) \sigma_W^2 + \sigma_B^2 = \sigma^2 \tag{21} σW2+σB2=σ2(21)
类间方差和类内方差的和为定值,也即最大化类间差异就是最小化类内差异
下面对公式(20)进行推导
σ W 2 + σ B 2 = ( ω 0 σ 0 2 + ω 1 σ 1 2 ) + ( ω 0 ( μ 0 − μ ) 2 + ω 1 ( μ 1 − μ ) 2 ) = ( ω 0 σ 0 2 + ω 0 ( μ 0 − μ ) 2 ) + ( ω 1 σ 1 2 + ω 1 ( μ 1 − μ ) 2 ) (22) \begin{aligned} \sigma_W^2 + \sigma_B^2 &= \left(\omega_{0} \sigma_{0}^{2}+\omega_{1} \sigma_{1}^{2}\right) + \left(\omega_{0}(\mu_{0}-\mu)^{2}+\omega_{1}(\mu_{1}-\mu)^{2} \right) \\ &= \left(\omega_{0} \sigma_{0}^{2}+ \omega_{0}(\mu_{0}-\mu)^{2} \right) + \left( \omega_{1} \sigma_{1}^{2} +\omega_{1}(\mu_{1}-\mu)^{2} \right) \end{aligned} \tag{22} σW2+σB2=(ω0σ02+ω1σ12)+(ω0(μ0μ)2+ω1(μ1μ)2)=(ω0σ02+ω0(μ0μ)2)+(ω1σ12+ω1(μ1μ)2)(22)

其中:

ω 0 σ 0 2 = ω 0 [ 1 ω 0 ∗ ∑ i = 0 k ( i − μ 0 ) 2 p i ] = ω 0 [ 1 ω 0 ∗ ∑ i = 0 k ( ( i − μ ) + ( μ − μ 0 ) ) 2 p i ] = ω 0 [ 1 ω 0 ∗ ∑ i = 0 k ( ( i − μ ) 2 + 2 ( i − μ ) ( μ − μ 0 ) + ( μ − μ 0 ) 2 ) p i ] = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) + ( ∑ i = 0 k 2 ( i − μ ) ( μ − μ 0 ) ∗ p i ) + ( ( μ − μ 0 ) 2 ∗ ∑ i = 0 k p i ) = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) + ( 2 ( μ − μ 0 ) ∑ i = 0 k ( i − μ ) p i ) + ( ω 0 ( μ − μ 0 ) 2 ) = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) + 2 ( μ − μ 0 ) [ ∑ i = 0 k i p i − μ ∑ i = 0 k p i ] + ( ω 0 ( μ − μ 0 ) 2 ) = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) + ( 2 ( μ − μ 0 ) ∑ i = 0 k ( i − μ ) p i ) + ( ω 0 ( μ − μ 0 ) 2 ) = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) + 2 ( μ − μ 0 ) [ ∑ i = 0 k i p i − μ ∑ i = 0 k p i ] + ( ω 0 ( μ − μ 0 ) 2 ) = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) + 2 ( μ − μ 0 ) ( ω 0 μ 0 − μ ω 0 ) + ( ω 0 ( μ − μ 0 ) 2 ) 根据公式(15),(13)进行替换而来 = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) − 2 ω 0 ( μ 0 − μ ) 2 + ( ω 0 ( μ − μ 0 ) 2 ) = ( ∑ i = 0 k ( i − μ ) 2 ∗ p i ) − ω 0 ( μ 0 − μ ) 2 (23) \begin{aligned} \omega_{0} \sigma_{0}^{2} &= \omega_{0}\left[\frac{1}{\omega_{0}} * \sum_{i=0}^{k}\left(i-\mu_{0}\right)^{2} p_{i}\right]=\omega_{0}\left[\frac{1}{\omega_{0}} * \sum_{i=0}^{k}\left(\left(i-\mu\right)+\left(\mu-\mu_{0}\right)\right)^{2} p_{i}\right] \\ &=\omega_{0}\left[\frac{1}{\omega_{0}} * \sum_{i=0}^{k}\left(\left(i-\mu\right)^{2}+2\left(i-\mu\right)\left(\mu-\mu_{0}\right)+\left(\mu-\mu_{0}\right)^{2}\right) p_{i}\right] \\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) +\left(\sum_{i=0}^{k} 2(i-\mu)(\mu-\mu_{0}) * p_{i} \right)+ \left((\mu-\mu_{0})^{2} * \sum_{i=0}^{k} p_{i}\right) \\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) + \left(2(\mu-\mu_{0}) \sum_{i=0}^{k}(i-\mu) p_{i} \right)+ \left(\omega_{0}(\mu-\mu_{0})^{2}\right)\\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) + 2(\mu-\mu_{0})\left[\sum_{i=0}^{k} i p_{i}-\mu \sum_{i=0}^{k} p_{i}\right] + \left(\omega_{0}(\mu-\mu_{0})^{2}\right) \\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) + \left(2(\mu-\mu_{0}) \sum_{i=0}^{k}(i-\mu) p_{i} \right)+ \left(\omega_{0}(\mu-\mu_{0})^{2}\right)\\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) + 2(\mu-\mu_{0})\left[\sum_{i=0}^{k} i p_{i}-\mu \sum_{i=0}^{k} p_{i}\right] + \left(\omega_{0}(\mu-\mu_{0})^{2}\right) \\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) + 2(\mu-\mu_{0})\left(\omega_0\mu_0-\mu \omega_0\right) + \left(\omega_{0}(\mu-\mu_{0})^{2}\right) \text{根据公式(15),(13)进行替换而来} \\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) - 2\omega_0(\mu_0-\mu)^2 + \left(\omega_{0}(\mu-\mu_{0})^{2}\right) \\ &= \left(\sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} \right) - \omega_0(\mu_0-\mu)^2 \end{aligned} \tag{23} ω0σ02=ω0[ω01i=0k(iμ0)2pi]=ω0[ω01i=0k((iμ)+(μμ0))2pi]=ω0[ω01i=0k((iμ)2+2(iμ)(μμ0)+(μμ0)2)pi]=(i=0k(iμ)2pi)+(i=0k2(iμ)(μμ0)pi)+((μμ0)2i=0kpi)=(i=0k(iμ)2pi)+(2(μμ0)i=0k(iμ)pi)+(ω0(μμ0)2)=(i=0k(iμ)2pi)+2(μμ0)[i=0kipiμi=0kpi]+(ω0(μμ0)2)=(i=0k(iμ)2pi)+(2(μμ0)i=0k(iμ)pi)+(ω0(μμ0)2)=(i=0k(iμ)2pi)+2(μμ0)[i=0kipiμi=0kpi]+(ω0(μμ0)2)=(i=0k(iμ)2pi)+2(μμ0)(ω0μ0μω0)+(ω0(μμ0)2)根据公式(15),(13)进行替换而来=(i=0k(iμ)2pi)2ω0(μ0μ)2+(ω0(μμ0)2)=(i=0k(iμ)2pi)ω0(μ0μ)2(23)

同样可推导:

ω 1 σ 1 2 = ( ∑ i = k + 1 L − 1 ( i − μ ) 2 ∗ p i ) − ω 1 ( μ 1 − μ ) 2 (24) \begin{aligned} \omega_{1} \sigma_{1}^{2} = \left(\sum_{i=k+1}^{L-1}\left(i-\mu\right)^{2} * p_{i} \right) - \omega_1(\mu_1-\mu)^2 \end{aligned} \tag{24} ω1σ12=(i=k+1L1(iμ)2pi)ω1(μ1μ)2(24)

因此,式(21)可继续展开:
σ W 2 + σ B 2 = ( ω 0 σ 0 2 + ω 0 ( μ 0 − μ ) 2 ) + ( ω 1 σ 1 2 + ω 1 ( μ 1 − μ ) 2 ) = ∑ i = 0 k ( i − μ ) 2 ∗ p i + ∑ i = k + 1 L − 1 ( i − μ ) 2 ∗ p i = ∑ i = 0 L − 1 ( i − μ ) 2 ∗ p i = σ 2 (25) \begin{aligned} \sigma_W^2 + \sigma_B^2 &= \left(\omega_{0} \sigma_{0}^{2}+ \omega_{0}(\mu_{0}-\mu)^{2} \right) + \left( \omega_{1} \sigma_{1}^{2} +\omega_{1}(\mu_{1}-\mu)^{2} \right) \\ &= \sum_{i=0}^{k}\left(i-\mu\right)^{2} * p_{i} + \sum_{i=k+1}^{L-1}\left(i-\mu\right)^{2} * p_{i} \\ &= \sum_{i=0}^{L-1}\left(i-\mu\right)^{2} * p_{i} \\ &= \sigma^2 \end{aligned} \tag{25} σW2+σB2=(ω0σ02+ω0(μ0μ)2)+(ω1σ12+ω1(μ1μ)2)=i=0k(iμ)2pi+i=k+1L1(iμ)2pi=i=0L1(iμ)2pi=σ2(25)

公式得证。

实现

按照公式(19)进行实现的代码如下:

int ThresholdOTSU2(const cv::Mat &image)
{
    CV_Assert( image.type() == CV_8UC1 );

    int thresh = -1;

    int pixNumber = image.cols * image.rows;//图像总像素点
    float Ppix[256] = { 0 }; //每一个灰度值出现的概率
    float u = 0;//图像整体均值
    //公式12,每一灰度值出现的概率
    {
        //求各灰度值出现的次数
        cv::Size size(image.size());
        if( image.isContinuous() )
        {
            size.width *= size.height;
            size.height = 1;
        }

        for(int i=0; i<size.height; ++i )
        {
            const uchar* sdata = image.ptr(i);
            for( int j = 0; j < size.width; ++j )
                Ppix[sdata[j]] += 1;
        }

        //求灰度值出现的概率及整体均值
        const auto p = 1.0/pixNumber;
        for(int i=0;i<256;++i)
        {
            Ppix[i] *= p;
            u += i*Ppix[i];
        }

    }
    
    //遍历
    float wk=0;
    float uk=0;
    float gmax = -FLT_MAX;
    for(int k=0;k<256;++k)
    {
        //注意此处.没有在每个循环中遍历求0-k的和,
        // 而是在上一次求解的基础上累加,防止重复计算
        wk += Ppix[k];//0-k灰度值的像素 出现的概率.Note:包含k
        uk += k*Ppix[k];

        const auto temp = uk-u*wk;
        const auto sigma = temp*temp/(wk*(1-wk)+FLT_EPSILON);
        if( sigma>=gmax )//考虑对相同的最大方差下的阈值进行存储,然后求这些阈值的均值
        {
            thresh = k;
            gmax = sigma;
        }

    }

    return thresh;
}

扩展-MultiLevel OTSU

在上面的算法中,讨论的是2个类别的分割,针对图像来说就是二值化处理。如果对该问题进行扩展,我们想把图像像素划分为M类,则对应的目标函数可以扩展为:
σ B 2 = ∑ k = 0 M − 1 ω k ( μ k − μ ) 2 \sigma_{B}^{2}=\sum_{k=0}^{M-1} \omega_{k}\left(\mu_{k}-\mu\right)^{2} σB2=k=0M1ωk(μkμ)2

对大津算法的多级推广成为多大津算法(multi Otsu method)
论文: A Fast Algorithm for Multilevel Thresholding. 来自台湾的一篇论文,说明还是比较详细.

我们需要找到 ( t 0 , t 1 , ⋯   , t M − 1 , ∣ t 0 < t 1 < ⋯ < t M − 1 ) (t_0,t_1,\cdots,t_{M-1},\mid t_0< t_1 <\cdots< t_{M-1}) (t0,t1,,tM1,t0<t1<<tM1)使得目标函数(类间方差)取最大值
根据上一节的结论:类间方差和类内方差的和为定值,因此最小化类内方差和最大化类间方差是相同的,因此目标函数可以进行修改如下:
σ W 2 = ∑ k = 0 M − 1 w k μ k 2 \sigma_{W}^{2}=\sum_{k=0}^{M-1} w_{k} \mu_{k}^{2} σW2=k=0M1wkμk2

找出使上公式最小的一系列阈值即可。

此处为我的理解,正好和原论文中结论相反

关于多阈值OTSU算法在上面提到的论文中有比较详细的说明,里面讨论了其算法实现、简化过程,以实现fast的目的.该部分暂时不赘述,看后面的情况,有时间在补上一篇.
此处将按照论文实现的算法贴出,有需要的可以浏览一番.

#include <iostream>
#include <chrono>
#include <opencv2/opencv.hpp>
#include <numeric>
cv::Mat SegImage(const cv::Mat &src,const std::vector<int> &thresholds,const std::vector<cv::Vec3b> &colors);


/*!
 * 多级大津算法
 * @param image 输入图像.CV_8UC1
 * @param classes 分类的类别
 * @param thresholds 分割阈值
 * @author liheng
 * @Reference http://github.com/hipersayanX/MultiOtsuThreshold
 */
void MultiOtsuThreshold(const cv::Mat &image, unsigned int classes, std::vector<int> &thresholds)
{
    CV_Assert( image.type() == CV_8UC1 );

    float maxSum = 0;
    std::vector<int>(classes-1,0).swap(thresholds);

    //cal histogram. 求解直方图
    std::vector<int> histogram(256, 0);
    {
        cv::Size size(image.size());
        if( image.isContinuous() )
        {
            size.width *= size.height;
            size.height = 1;
        }
        for(int i=0; i<size.height; ++i )
        {
            const uchar* sdata = image.ptr(i);
            for( int j = 0; j < size.width; ++j )
                histogram[sdata[j]] += 1;
        }

        // Since we use sum tables add one more to avoid unexistent colors.
        //for (int i = 0; i < histogram.size(); i++)
        //    histogram[i]++;
    }

    //buildTables
    std::vector<float> H(histogram.size() * histogram.size(), 0.f);
    {
        // Create cumulative sum tables.
        std::vector<int> P(histogram.size() + 1);
        std::vector<int> S(histogram.size() + 1);
        P[0] = 0;
        S[0] = 0;

        int sumP = 0;
        int sumS = 0;

        for (int i = 0; i < histogram.size(); i++) {
            sumP += int(histogram[i]);
            sumS += int(i * histogram[i]);
            P[i + 1] = sumP;
            S[i + 1] = sumS;
        }

        // Calculate the between-class variance for the interval u-v
        //std::vector<float> H(histogram.size() * histogram.size(), 0.f);

        for (int u = 0; u < histogram.size(); u++)
        {
            float *hLine = H.data() + u * histogram.size();

            for (int v = u + 1; v < histogram.size(); v++)
                hLine[v] = std::pow(S[v]-S[u], 2)*1.0f/(P[v] - P[u]+FLT_EPSILON);
        }
    }

    std::vector<int> index(classes + 1);
    index[0] = 0;
    index[index.size() - 1] = histogram.size() - 1;


    std::function<void(float *maxSum,
                       std::vector<int> *lhthres,
                       const std::vector<float> &H,
                       int u,int vmax,int level,
                       int levels,std::vector<int> *index)> lhOTSU_for_loop =
            [&](float *maxSum, std::vector<int> *lhthres,const std::vector<float> &H,
                int u,
                int vmax,
                int level,
                int levels,
                std::vector<int> *index)
            {
                int classes = index->size() - 1;

                for (int i = u; i < vmax; i++)
                {
                    (*index)[level] = i;

                    if (level + 1 >= classes)
                    {
                        // Reached the end of the for loop.

                        // Calculate the quadratic sum of al intervals.
                        float sum = 0.;
                        for (int c = 0; c < classes; c++)
                        {
                            int u = index->at(c);
                            int v = index->at(c + 1);
                            sum += H[v + u * levels];
                        }

                        if (*maxSum < sum)
                        {
                            // Return calculated threshold.
                            *lhthres = std::vector<int>(index->begin()+1,index->begin()+lhthres->size()+1);
                            *maxSum = sum;
                        }
                    }
                    else
                    {
                        // Start a new for loop level, one position after current one.
                        lhOTSU_for_loop(maxSum,
                                        lhthres,
                                        H,
                                        i + 1,
                                        vmax + 1,
                                        level + 1,
                                        levels,
                                        index);
                    }
                }
            };

    lhOTSU_for_loop(&maxSum,
                    &thresholds,
                    H,
                    1,
                    histogram.size() - classes + 1,
                    1,
                    histogram.size(), &index);

}

int main(int argc, char *argv[])
{
    cv::Mat image = cv::imread("./1.bmp",cv::IMREAD_GRAYSCALE);
    cv::imshow("src",image);

    //灰度直方图
    cv::Mat hist_img;
    int scale = 2;
    ShowHist(image,hist_img);


    //MultiLevel OTSU
    unsigned int classes = 4;
    std::vector<cv::Vec3b> colors(classes);
    colors[0] = cv::Vec3b(0,0,0);
    colors[1] = cv::Vec3b(255,0,0);
    colors[2] = cv::Vec3b(0,255,0);
    colors[3] = cv::Vec3b(0,0,255);


    std::vector<int> thresholds;
    MultiOtsuThreshold(image,classes,thresholds);
    //在直方图上绘制阈值位置
    for( const auto th:thresholds)
        cv::line(hist_img,cv::Point(th*scale,0),cv::Point(th*scale,hist_img.rows-1),cv::Scalar(0,0,255),2);

    for(int i=0;i<thresholds.size();++i)
        std::cout<<thresholds[i] <<" ";
    std::cout<<std::endl;

    cv::Mat thresholded = SegImage(image, thresholds, colors);
    cv::imshow("4otsu-dst",thresholded);
    //cv::imwrite("./res.bmp",thresholded);

    //cv::Mat otsudst;
    //auto th = ThresholdOTSU(image);
    //cv::threshold(image,otsudst,th,255,cv::THRESH_BINARY);
    //cv::imshow("2otsu-dst",otsudst);

    cv::imshow("hist",hist_img);
    cv::waitKey(0);


    return 0;
}


cv::Mat SegImage(const cv::Mat &src,
                 const std::vector<int> &thresholds,
                 const std::vector<cv::Vec3b> &colors)
{
    cv::Mat dst = cv::Mat::zeros(src.size(),CV_8UC3);

    std::vector<cv::Vec3b> colorTable(256);
    int j = 0;

    for (int i = 0; i < colorTable.size(); i++)
    {
        if (j < thresholds.size() && i >= thresholds[j])
            j++;

        colorTable[i] = colors[j];
    }

//    for (int y = 0; y < src.rows; y++)
//    {
//        const uchar *srcLine = src.ptr(y);
//        auto *dstLine = dst.ptr<cv::Vec3b>(y);
//
//        for (int x = 0; x < src.cols; x++)
//            dstLine[x] = colorTable[srcLine[x]];
//    }

    cv::cvtColor(src,dst,cv::COLOR_GRAY2BGR);
    cv::LUT(dst,colorTable,dst);
    return dst;
}

输入图像如下:
在这里插入图片描述

分割结果如下:

MultiLevelOTSU

延伸思考

从OTSU算法的思想、实现过程来看,其虽然是对图像进行分割,但是算法的核心输入为灰度直方图/概率直方图,算法过程也是对概率直方图进行处理。因此我们该算法不仅仅适用于图像分割领域,也可以适用于其他能够求解概率直方图的场合,求解最大类间方差,如点云分类等。

算法评价

  • 应用:是求图像全局阈值的最佳方法,应用不言而喻,适用于大部分需要求图像全局阈值的场合
  • 优点:计算简单快速,不受图像亮度和对比度的影响
  • 缺点:对图像噪声敏感,只能针对单一目标分割,当目标和背景大小比例(面积)悬殊、类间方差函数可能呈现双峰或者多峰,这个时候效果不好
  • 解释:当图像中的目标与背景的面积相差很大时,表现为直方图没有明显的双峰,或者两个峰的大小相差很大,分割效果不佳,或者目标与背景的灰度有较大的重叠时也不能准确的将目标与背景分开。导致这种现象出现的原因是该方法忽略了图像的空间信息,同时该方法将图像的灰度分布作为分割图像的依据,因而对噪声也相当敏感。所以,在实际应用中,总是将其与其他方法结合起来使用。

含有噪声滤波效果对比

参考资料

1.原论文链接
2.otsu阈值分割算法原理_大津法—OTSU算法
3.最大类间方差法(大津法OTSU)
4.大津法(OTSU 最大类间方差法)公式推导

5.大津法(OTSU 最大类间方差法)详细数学推导(公式繁杂,欢迎讨论)

该文最后的推导中公式有处笔误,整体没有问题

6.A Fast Algorithm for Multilevel Thresholding
7.多类别的最大类间方差法(Otsu’s method)



下面的是我的公众号二维码图片,按需关注
图注:幼儿园的学霸

Logo

开放原子开发者工作坊旨在鼓励更多人参与开源活动,与志同道合的开发者们相互交流开发经验、分享开发心得、获取前沿技术趋势。工作坊有多种形式的开发者活动,如meetup、训练营等,主打技术交流,干货满满,真诚地邀请各位开发者共同参与!

更多推荐