假设我们要拟合一个一元函数,目前我们知道的自变量及因变量的值是 ( x 1 , y 1 ) , ( x 2 , y 2 ) , ⋯   , ( x n , y n ) (x_1,y_1),(x_2,y_2),\cdots,(x_n,y_n) (x1,y1),(x2,y2),,(xn,yn),而我们要求 y = f ( x ) y=f(x) y=f(x),使得用它作为近似函数时,函数值与真实值的偏差 r i = f ( x i ) − y i r_i=f(x_i)-y_i ri=f(xi)yi的平方和 ∑ i = 1 n r i 2 = ∑ i = 1 n ( f ( x i ) − y i ) 2 \sum\limits_{i=1}^nr_i^2=\sum\limits_{i=1}^n(f(x_i)-y_i)^2 i=1nri2=i=1n(f(xi)yi)2最小。

假如我们已经选定了一类函数(模型)去拟合,现在要确定其中参数的值。比如我们选择了 f ( x ) = e a x + b f(x)=e^{ax+b} f(x)=eax+b这种模型(一般用于拟合人口增长),就要确定参数 a , b a,b a,b的值。设参数为 a 1 , a 2 , ⋯   , a m a_1,a_2,\cdots,a_m a1,a2,,am,将 f f f表示为 f ( x , a 1 , a 2 , ⋯   , a m ) f(x,a_1,a_2,\cdots,a_m) f(x,a1,a2,,am)。损失函数,即偏差的平方和为 Q ( a 1 , a 2 , ⋯   , a m ) = ∑ i = 1 n ( f ( x i , a 1 , a 2 , ⋯   , a m ) − y i ) 2 Q(a_1,a_2,\cdots,a_m)=\sum\limits_{i=1}^n(f(x_i,a_1,a_2,\cdots,a_m)-y_i)^2 Q(a1,a2,,am)=i=1n(f(xi,a1,a2,,am)yi)2。注意,这里 Q Q Q是参数 a 1 , a 2 , ⋯   , a m a_1,a_2,\cdots,a_m a1,a2,,am的函数,而不是 x , y x,y x,y的函数。让 Q Q Q取得极值,就需要让 Q Q Q a 1 , a 2 , ⋯   , a m a_1,a_2,\cdots,a_m a1,a2,,am的偏导数都为 0 0 0。对于参数 a k a_k ak Q Q Q对它的偏导数为 ∂ Q ∂ a k = 2 ∑ i = 1 n [ ( f ( x i , a 1 , a 2 , ⋯   , a m ) − y i ) ∂ f ∂ a k ∣ x i ] \frac{\partial Q}{\partial a_k}=2\sum\limits_{i=1}^n\left[\left(f(x_i,a_1,a_2,\cdots,a_m)-y_i\right)\left.\frac{\partial f}{\partial a_k}\right|_{x_i}\right] akQ=2i=1n[(f(xi,a1,a2,,am)yi)akfxi]让它等于 0 0 0,就是要使得 ∀ k = 1 , 2 , ⋯   , m , ∑ i = 1 n [ ( f ( x i , a 1 , a 2 , ⋯   , a m ) − y i ) ∂ f ∂ a k ∣ x i ] = 0 \forall k=1,2,\cdots,m,\\\sum\limits_{i=1}^n\left[\left(f(x_i,a_1,a_2,\cdots,a_m)-y_i\right)\left.\frac{\partial f}{\partial a_k}\right|_{x_i}\right]=0 k=1,2,,m,i=1n[(f(xi,a1,a2,,am)yi)akfxi]=0


下面以一次函数拟合为例。设 f ( x ) = a x + b f(x)=ax+b f(x)=ax+b,要通过已知的数据点 ( x 1 , y 1 ) , ( x 2 , y 2 ) , ⋯   , ( x n , y n ) (x_1,y_1),(x_2,y_2),\cdots,(x_n,y_n) (x1,y1),(x2,y2),,(xn,yn)来确定参数 a , b a,b a,b的值。损失函数 Q ( a , b ) = ∑ i = 1 n ( a x i + b − y i ) 2 Q(a,b)=\sum\limits_{i=1}^n(ax_i+b-y_i)^2 Q(a,b)=i=1n(axi+byi)2 ∂ Q ∂ a = 2 ∑ i = 1 n ( a x i + b − y i ) x i = 0 ① ∂ Q ∂ b = 2 ∑ i = 1 n ( a x i + b − y i ) = 0 ② \begin{aligned}\frac{\partial Q}{\partial a}&=2\sum\limits_{i=1}^n(ax_i+b-y_i)x_i=0\qquad&①\\\frac{\partial Q}{\partial b}&=2\sum\limits_{i=1}^n(ax_i+b-y_i)=0&②\end{aligned} aQbQ=2i=1n(axi+byi)xi=0=2i=1n(axi+byi)=0 ② ② a ∑ i = 1 n x i + n b = ∑ i = 1 n y i a\sum\limits_{i=1}^nx_i+nb=\sum\limits_{i=1}^ny_i ai=1nxi+nb=i=1nyi x ‾ = ∑ i = 1 n x i n \overline x=\frac{\sum\limits_{i=1}^n x_i}{n} x=ni=1nxi y ‾ = ∑ i = 1 n y i n \overline y=\frac{\sum\limits_{i=1}^n y_i}{n} y=ni=1nyi(分别为 x , y x,y x,y的平均值),则有 a x ‾ + b = y ‾ ③ a\overline x+b=\overline y\qquad ③ ax+b=y ① ① a ∑ i = 1 n x i 2 + b ∑ i = 1 n x i = ∑ i = 1 n x i y i a\sum\limits_{i=1}^n x_i^2+b\sum\limits_{i=1}^n x_i=\sum\limits_{i=1}^n x_iy_i ai=1nxi2+bi=1nxi=i=1nxiyi x 2 ‾ = ∑ i = 1 n x i 2 n , x y ‾ = ∑ i = 1 n x i y i n \overline{x^2}=\frac{\sum\limits_{i=1}^n x_i^2}{n},\overline{xy}=\frac{\sum\limits_{i=1}^n x_iy_i}{n} x2=ni=1nxi2,xy=ni=1nxiyi,则有 a x 2 ‾ + b x ‾ = x y ‾ ④ a\overline{x^2}+b\overline{x}=\overline{xy}\qquad④ ax2+bx=xy ③ × x ‾ ③\times\overline x ×x a x ‾ 2 + b x ‾ = x ˉ y ˉ ⑤ a{\overline x}^2+b\overline x=\bar x\bar y\qquad⑤ ax2+bx=xˉyˉ ⑤ − ④ ⑤-④ a ( x ‾ 2 − x 2 ‾ ) = x ˉ y ˉ − x y ‾ a({\overline x}^2-\overline{x^2})=\bar x\bar y-\overline{xy} a(x2x2)=xˉyˉxy由此算出直线的斜率 a = x ˉ y ˉ − x y ‾ x ‾ 2 − x 2 ‾ a=\frac{\bar x\bar y-\overline{xy}}{{\overline x}^2-\overline{x^2}} a=x2x2xˉyˉxy ③ ③ b = y ‾ − a x ‾ b=\overline y-a\overline x b=yax


一次函数拟合的Python代码实现:

# Least Squares Method (Linear)
# Author: seh_sjij

import matplotlib.pyplot as plt
import numpy as np

class LeastSquareMethod(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.n = len(self.x)
        if len(self.y) != self.n:
            raise Exception(
                'LeastSquareMethod: len(x) != len(y)')

    def Calculate(self):
        self.xmean = self.xsquaremean = 0
        # average of x and x^2
        for x_i in self.x:
            self.xmean += x_i
            self.xsquaremean += x_i * x_i
        self.xmean /= self.n
        self.xsquaremean /= self.n
        self.ymean = 0
        # average of y
        for y_i in self.y:
            self.ymean += y_i
        self.ymean /= self.n
        self.xymean = 0
        # average of xy
        for i in range(0, self.n):
            self.xymean += self.x[i] * self.y[i]
        self.xymean /= self.n
        a = (self.xmean * self.ymean - self.xymean) \
            / (self.xmean * self.xmean - self.xsquaremean)
        b = self.ymean - a * self.xmean
        return (a, b) # y = ax + b

if __name__ == '__main__':
    x = [1, 2, 4, 6, 7, 9, 10, 12]
    y = [0.7, 2.25, 4.64, 5.69, 7.40, 8.57, 10.72, 11.64]
    lsm = LeastSquareMethod(x, y)
    a, b = lsm.Calculate()
    plt.scatter(x, y)
    plt.plot(x, [a * x_i + b for x_i in x])
    plt.title('Least Squares Method: y = %fx + %f' % (a, b))
    plt.show()

执行结果:
在这里插入图片描述
时间复杂度: Θ ( n ) \Theta(n) Θ(n)

Logo

开放原子开发者工作坊旨在鼓励更多人参与开源活动,与志同道合的开发者们相互交流开发经验、分享开发心得、获取前沿技术趋势。工作坊有多种形式的开发者活动,如meetup、训练营等,主打技术交流,干货满满,真诚地邀请各位开发者共同参与!

更多推荐