Following the previous post on handwritten digit recognition with a plain neural network, this time let's try cat-vs-dog classification with a convolutional neural network.

What is a convolutional neural network?

First, a short video gives a rough idea of what a CNN (Convolutional Neural Network) is: 《什么是卷积神经网络CNN?》 ("What is a Convolutional Neural Network?").

As the video explains, the most obvious difference between a CNN and the plain NN we used for digit recognition is that a CNN slides a small square matrix, called a convolution kernel, across the input and performs a multiply-accumulate at every position. Because this process resembles the convolution operation, the network is called a convolutional neural network.
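
To make the sliding multiply-accumulate concrete, here is a minimal NumPy sketch of a 2-D "valid" convolution (strictly speaking a cross-correlation, which is what deep-learning frameworks actually compute); the 5×5 input and 3×3 kernel are made-up values just for illustration:

import numpy as np

def conv2d_naive(x, k):
    # slide the kernel over the input and multiply-accumulate at every position (no padding, stride 1)
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

x = np.arange(25, dtype=np.float32).reshape(5, 5)                     # toy 5x5 "image"
k = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=np.float32)  # toy vertical-edge kernel
print(conv2d_naive(x, k))                                             # 3x3 output feature map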

Why introduce a convolution kernel at all?
Recall from the handwritten digit example that the network flattened each image into a one-dimensional vector and then processed that vector. The problem is that this throws away the spatial relationship between a pixel and its neighbors.

So in a CNN we do not flatten the input image into a one-dimensional vector; we keep its two-dimensional structure and process it with a two-dimensional convolution kernel to extract spatial features. This makes a CNN much better suited than a plain NN to image tasks.
That, roughly, is where CNNs come from.

This may raise a new question: if we want to capture the spatial information between pixels, why not use a kernel the same size as the input image, instead of sliding a small kernel across the image and doing multiply-accumulates?

Consider how large an image is: a given pixel is not spatially related to every other pixel.
In a selfie, for example, a pixel on the eye very likely has some spatial relationship with pixels on the eyebrow, but probably none with some object in the background.

So when a convolutional layer uses a small kernel to extract features from an image, its receptive field is also small. We then follow the convolutional layer with a pooling (downsampling) layer, which reduces the number of features. When the next convolutional layer is applied, the kernel size is unchanged but the input is smaller, so relative to the original image the kernel's receptive field has effectively grown.

Repeating this pattern, as the network gets deeper and passes through more conv-pool blocks, each neuron's receptive field keeps growing, and feature extraction moves from local details toward the whole image.
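
A quick worked calculation makes this concrete. A standard way to track the receptive field is r_out = r_in + (k - 1) * j and j_out = j_in * s, where r is the receptive field, j is the accumulated stride, k the kernel size and s the stride of the current layer. A minimal sketch for two "double 3×3 conv + 2×2 pool" blocks:

def receptive_field(layers):
    # layers: list of (kernel_size, stride) pairs, applied in order
    r, j = 1, 1
    for k, s in layers:
        r = r + (k - 1) * j
        j = j * s
    return r

# two blocks of: 3x3 conv, 3x3 conv (stride 1), then 2x2 max pool (stride 2)
layers = [(3, 1), (3, 1), (2, 2)] * 2
print(receptive_field(layers))   # 16: a neuron after the second block sees a 16x16 patch of the input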



Now let's try cat-vs-dog classification with a CNN.

First, import the necessary libraries:

import paddle
import paddle.fluid as fluid
import numpy as np
from PIL import Image
import sys
from multiprocessing import cpu_count
import matplotlib.pyplot as plt
import os

I. Data Preparation

Dataset used in this post: cifar-10-batches-py.zip

The dataset contains 60,000 color images, each 32×32 pixels, split into 10 classes with 6,000 images per class:
[image]
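
For reference, each data_batch file unpickles to a dict whose b'data' entry is a (10000, 3072) uint8 array (each row is one 32×32 image with the red, green and blue planes stored one after another) and whose b'labels' entry is a list of 10,000 integers from 0 to 9. A quick sanity-check sketch, assuming the archive is extracted to /home/aistudio as in the readers below:

import pickle
import numpy as np

with open('/home/aistudio/cifar-10-batches-py/data_batch_1', 'rb') as fo:
    batch = pickle.load(fo, encoding='bytes')

data, labels = batch[b'data'], batch[b'labels']
print(data.shape, data.dtype)                 # (10000, 3072) uint8
print(len(labels), min(labels), max(labels))  # 10000 0 9
# one row reshapes to channel-first (3, 32, 32), matching data_shape = [3, 32, 32] used later
print(data[0].reshape(3, 32, 32).shape)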

def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict
def test_mapper(sample):
    img, label = sample
    # normalize the img array to values between 0 and 1
    img = img.flatten().astype('float32')/255.0
    return img, label

def train_mapper(sample):
    img, label = sample
    img = img.flatten().astype('float32')/255.0
    return img, label
# create a reader for the training set of the custom dataset
def train_r(buffered_size=1024):
    def reader():
        xs=[]
        ys=[]
        for i in range(1, 6):
            train_dict = unpickle("/home/aistudio/cifar-10-batches-py/data_batch_%d" % (i,))
            xs.append(train_dict[b'data'])
            ys.append(train_dict[b'labels'])
        Xtr = np.concatenate(xs)
        Ytr = np.concatenate(ys)
        for (x, y) in zip(Xtr, Ytr):
            yield x, int(y)
    return paddle.reader.xmap_readers(train_mapper, reader, cpu_count(), buffered_size)

# create a reader for the test set of the custom dataset
def test_r(buffered_size=1024):
    def reader():
        test_dict = unpickle("/home/aistudio/cifar-10-batches-py/test_batch")
        X = test_dict[b'data']
        Y = test_dict[b'labels']
        for (x, y) in zip(X, Y):
            yield x, int(y)
    return paddle.reader.xmap_readers(test_mapper, reader, cpu_count(), buffered_size)

BATCH_SIZE = 128
# data provider for training
train_reader = train_r()
train_reader = paddle.batch(
    paddle.reader.shuffle(
        reader = train_reader, buf_size = BATCH_SIZE*100),
    batch_size = BATCH_SIZE
    )
# data provider for testing
test_reader = test_r()
test_reader = paddle.batch(
    test_reader,
    batch_size = BATCH_SIZE
    )

II. Network Configuration

There are many CNN architectures to choose from, including LeNet, AlexNet, VGGNet, ResNet, and so on.

VGGNet was proposed by the Visual Geometry Group at the University of Oxford together with Google DeepMind. With VGGNet, the researchers showed that stacking small convolution kernels and increasing network depth can effectively improve model performance.

VGGNet introduced a "modular" design: individual layers are combined into simple modules, and the complete network is assembled from these modules rather than layer by layer.
In the VGG-19 diagram below, each color marks one module.

[image]
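
To make the modular idea concrete, here is a small helper of my own (a sketch, not part of the original code) that builds one "double convolution + max pooling + batch norm" module out of the same fluid layers used later in this post; a VGG-13-style network is then just five calls to it with 64, 128, 256, 512 and 512 filters, followed by three fully connected layers:

import paddle.fluid as fluid

def conv_pool_bn_block(ipt, num_filters):
    # two 3x3 convolutions, then 2x2 max pooling, then batch normalization
    conv_a = fluid.layers.conv2d(input=ipt, filter_size=3, num_filters=num_filters,
                                 padding=1, act='relu')
    conv_b = fluid.layers.conv2d(input=conv_a, filter_size=3, num_filters=num_filters,
                                 padding=1, act='relu')
    pool = fluid.layers.pool2d(input=conv_b, pool_size=2, pool_type='max', pool_stride=2)
    return fluid.layers.batch_norm(pool)
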
VGGNet comes in several configurations. After some experimenting, I found that for this dataset the best results came not from VGG-19 but from VGG-13, i.e. the "B" configuration.

[image]

1. Define the network

Build the network following the VGG-13 diagram above.
The first module is two (3×3×64) convolutional layers plus one max pooling layer.
(Batch normalization is applied after pooling; during training it keeps the input distribution of each layer roughly stable. A small numerical sketch of what it computes follows the code block below.)

def convolutional_neural_network(img):
    # Module 1: two 3x3x64 convolutions + pooling
    conv1 = fluid.layers.conv2d(input=img,          # input image
                                filter_size=3,      # kernel size
                                num_filters=64,     # number of kernels (output channels)
                                padding=1,
                                act='relu')         # activation function

    conv2 = fluid.layers.conv2d(input=conv1,        # input feature map
                                filter_size=3,      # kernel size
                                num_filters=64,     # number of kernels (output channels)
                                padding=1,
                                act='relu')         # activation function

    pool1 = fluid.layers.pool2d(input=conv2,        # input
                                pool_size=2,        # pooling window size
                                pool_type='max',    # pooling type
                                pool_stride=2)      # pooling stride
    # batch normalization
    conv_pool_1 = fluid.layers.batch_norm(pool1)

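As an aside, here is a minimal NumPy sketch (with made-up numbers) of what batch normalization computes per channel: subtract the batch mean, divide by the batch standard deviation, then apply a learned scale and shift. fluid.layers.batch_norm does this for us (and also tracks running statistics for inference):

import numpy as np

x = np.random.randn(8, 4) * 3.0 + 5.0     # toy batch: 8 samples, 4 channels
gamma, beta, eps = 1.0, 0.0, 1e-5         # learned scale, learned shift, stability term
x_hat = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
y = gamma * x_hat + beta
print(x_hat.mean(axis=0).round(6), x_hat.std(axis=0).round(6))   # roughly 0 and 1 per channel
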
The second module is two (3×3×128) convolutional layers plus one max pooling layer.

    # Module 2: two 3x3x128 convolutions + pooling
    conv3 = fluid.layers.conv2d(input=conv_pool_1,
                                filter_size=3,
                                num_filters=128,
                                padding=1,
                                act='relu')

    conv4 = fluid.layers.conv2d(input=conv3,
                                filter_size=3,
                                num_filters=128,
                                padding=1,
                                act='relu')

    pool2 = fluid.layers.pool2d(input=conv4,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    conv_pool_2 = fluid.layers.batch_norm(pool2)

The third module is two (3×3×256) convolutional layers plus one max pooling layer.

    # Module 3: two 3x3x256 convolutions + pooling
    conv5 = fluid.layers.conv2d(input=conv_pool_2,
                                filter_size=3,
                                num_filters=256,
                                padding=1,
                                act='relu')

    conv6 = fluid.layers.conv2d(input=conv5,
                                filter_size=3,
                                num_filters=256,
                                padding=1,
                                act='relu')
    
    pool3 = fluid.layers.pool2d(input=conv6,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    conv_pool_3 = fluid.layers.batch_norm(pool3)

The fourth module is two (3×3×512) convolutional layers plus one max pooling layer.

    # Module 4: two 3x3x512 convolutions + pooling
    conv7 = fluid.layers.conv2d(input=conv_pool_3,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    conv8 = fluid.layers.conv2d(input=conv7,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    pool4 = fluid.layers.pool2d(input=conv8,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    conv_pool_4 = fluid.layers.batch_norm(pool4)

The fifth module is two (3×3×512) convolutional layers plus one max pooling layer.

    # Module 5: two 3x3x512 convolutions + pooling
    conv9 = fluid.layers.conv2d(input=conv_pool_4,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    conv10 = fluid.layers.conv2d(input=conv9,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    pool5 = fluid.layers.pool2d(input=conv10,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

Finally, add three fully connected layers; the last one uses softmax as its activation function.

    # fully connected output layer with softmax activation; 10 classes give 10 outputs
    fc1 = fluid.layers.fc(input=pool5,
                                 size=1000,
                                 act='relu')
    fc2 = fluid.layers.fc(input=fc1,
                                 size=1000,
                                 act='relu')
    prediction = fluid.layers.fc(input=fc2,
                                 size=10,
                                 act='softmax')
    return prediction
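
As a sanity check on the shapes: every 3×3 convolution here uses padding=1 and stride 1, so it preserves the spatial size, while each 2×2 max pool with stride 2 halves it. The 32×32 input therefore shrinks to 16, 8, 4, 2 and finally 1 over the five modules, and pool5 has shape [N, 512, 1, 1] before the fully connected layers:

size = 32
for module in range(1, 6):
    size //= 2    # the convolutions keep the size; the 2x2 / stride-2 pool halves it
    print('after module %d: %dx%d' % (module, size, size))
# after module 5: 1x1 -> pool5 is [N, 512, 1, 1]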

2. Define the input data format

# define the input data
data_shape = [3, 32, 32]
paddle.enable_static()
images = fluid.layers.data(name='images', shape=data_shape, dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')

# get the classifier
predict = convolutional_neural_network(images)

3. Define the loss function and accuracy

# define the loss function and accuracy
cost = fluid.layers.cross_entropy(input=predict, label=label)   # cross entropy
avg_cost = fluid.layers.mean(cost)                              # mean over all elements of cost
acc = fluid.layers.accuracy(input=predict, label=label)         # accuracy from predictions and labels
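
For an integer class label, the cross entropy computed above is just the negative log of the probability that the softmax assigns to the true class. A tiny NumPy check with made-up numbers:

import numpy as np

probs = np.array([0.05, 0.10, 0.70, 0.15])   # hypothetical softmax output for one sample
label = 2                                    # true class index
print(-np.log(probs[label]))                 # 0.357 = cross entropy for this sample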

4. Define the optimization method

# define the optimization method
test_program = fluid.default_main_program().clone(for_test=True)    # clone a program for testing
optimizer = fluid.optimizer.Adam(learning_rate=0.001)               # use the Adam optimizer
optimizer.minimize(avg_cost)
print("Done")

III. Model Training & Evaluation

# create the Executor
use_cuda = True # whether to use the GPU; set use_cuda=False to run on the CPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

# define the data feeder
feeder = fluid.DataFeeder(feed_list=[images, label], place=place)
# helper to plot the training loss and accuracy curves
all_train_iter = 0
all_train_iters = []
all_train_costs = []
all_train_accs = []

def draw_train_process(title, iters, costs, accs, label_cost, label_acc):
    plt.title(title, fontsize=24)
    plt.xlabel("iter", fontsize=20)
    plt.ylabel("cost/acc", fontsize=20)
    plt.plot(iters, costs, color='red', label=label_cost)
    plt.plot(iters, accs, color='green', label=label_acc)
    plt.legend()
    plt.grid()
    plt.show()

Start training the model:

# train and save the model
EPOCH_NUM = 3
model_save_dir = "/home/aistudio/model/DogCat_Detection.inference.model"

for pass_id in range(EPOCH_NUM):
    # start training
    for batch_id, data in enumerate(train_reader()):                            # iterate over train_reader
        train_cost, train_acc = exe.run(program=fluid.default_main_program(),   # run the main program
                                        feed=feeder.feed(data),                 # feed one batch of data
                                        fetch_list=[avg_cost, acc])             # fetch the average loss and accuracy

        all_train_iter = all_train_iter + BATCH_SIZE
        all_train_iters.append(all_train_iter)
        all_train_costs.append(train_cost[0])
        all_train_accs.append(train_acc[0])

        # print training stats every 20 batches
        if batch_id % 20 == 0:
            print('Pass:%d, Batch:%d, Cost:%0.5f, Accuracy:%0.5f' %
                  (pass_id, batch_id, train_cost[0], train_acc[0]))

    # start testing
    test_costs = [] # test losses
    test_accs = []  # test accuracies
    for batch_id, data in enumerate(test_reader()):
        test_cost, test_acc = exe.run(program=test_program,         # run the test program
                                      feed=feeder.feed(data),       # feed data
                                      fetch_list=[avg_cost, acc])   # fetch loss and accuracy
        test_costs.append(test_cost[0])                             # record each batch's loss
        test_accs.append(test_acc[0])                               # record each batch's accuracy

    test_cost = (sum(test_costs) / len(test_costs)) # average test loss
    test_acc = (sum(test_accs) / len(test_accs))    # average test accuracy
    print('Test:%d, Cost:%0.5f, ACC:%0.5f' % (pass_id, test_cost, test_acc))

Save the model:

# save the model
if not os.path.exists(model_save_dir):
    os.makedirs(model_save_dir)
print('save models to %s' % (model_save_dir))
fluid.io.save_inference_model(model_save_dir,
                              ['images'],
                              [predict],
                              exe)

print('Model saved')
draw_train_process("training", all_train_iters, all_train_costs, all_train_accs, "training cost", "training acc")

Training results:

Pass:0, Batch:0, Cost:2.43514, Accuracy:0.10156
Pass:0, Batch:20, Cost:1.74249, Accuracy:0.35938
Pass:0, Batch:40, Cost:1.68267, Accuracy:0.32812
Pass:0, Batch:60, Cost:1.61223, Accuracy:0.41406
Pass:0, Batch:80, Cost:1.48359, Accuracy:0.46094
Pass:0, Batch:100, Cost:1.39801, Accuracy:0.42188
Pass:0, Batch:120, Cost:1.25192, Accuracy:0.50781
Pass:0, Batch:140, Cost:1.34669, Accuracy:0.49219
Pass:0, Batch:160, Cost:1.30619, Accuracy:0.55469
Pass:0, Batch:180, Cost:1.46351, Accuracy:0.43750
Pass:0, Batch:200, Cost:1.31915, Accuracy:0.54688
Pass:0, Batch:220, Cost:1.43963, Accuracy:0.46094
Pass:0, Batch:240, Cost:1.27932, Accuracy:0.60938
Pass:0, Batch:260, Cost:1.15979, Accuracy:0.54688
Pass:0, Batch:280, Cost:1.17669, Accuracy:0.60156
Pass:0, Batch:300, Cost:0.90510, Accuracy:0.64844
Pass:0, Batch:320, Cost:1.19685, Accuracy:0.58594
Pass:0, Batch:340, Cost:1.01533, Accuracy:0.63281
Pass:0, Batch:360, Cost:0.90877, Accuracy:0.71094
Pass:0, Batch:380, Cost:0.80900, Accuracy:0.70312
Test:0, Cost:1.00458, ACC:0.65694
Pass:1, Batch:0, Cost:0.97145, Accuracy:0.72656
Pass:1, Batch:20, Cost:0.95520, Accuracy:0.68750
Pass:1, Batch:40, Cost:0.77640, Accuracy:0.69531
Pass:1, Batch:60, Cost:0.83274, Accuracy:0.74219
Pass:1, Batch:80, Cost:0.80683, Accuracy:0.68750
Pass:1, Batch:100, Cost:1.03590, Accuracy:0.64844
Pass:1, Batch:120, Cost:0.85940, Accuracy:0.71875
Pass:1, Batch:140, Cost:0.76278, Accuracy:0.71875
Pass:1, Batch:160, Cost:0.76663, Accuracy:0.74219
Pass:1, Batch:180, Cost:0.89417, Accuracy:0.71094
Pass:1, Batch:200, Cost:0.78458, Accuracy:0.71094
Pass:1, Batch:220, Cost:0.71163, Accuracy:0.72656
Pass:1, Batch:240, Cost:0.52018, Accuracy:0.85156
Pass:1, Batch:260, Cost:0.57307, Accuracy:0.82031
Pass:1, Batch:280, Cost:0.81890, Accuracy:0.71094
Pass:1, Batch:300, Cost:0.60540, Accuracy:0.76562
Pass:1, Batch:320, Cost:0.67661, Accuracy:0.75000
Pass:1, Batch:340, Cost:0.73857, Accuracy:0.75781
Pass:1, Batch:360, Cost:0.56550, Accuracy:0.81250
Pass:1, Batch:380, Cost:0.81765, Accuracy:0.66406
Test:1, Cost:0.71891, ACC:0.76266
Pass:2, Batch:0, Cost:0.57188, Accuracy:0.73438
Pass:2, Batch:20, Cost:0.70279, Accuracy:0.74219
Pass:2, Batch:40, Cost:0.68699, Accuracy:0.77344
Pass:2, Batch:60, Cost:0.63695, Accuracy:0.78125
Pass:2, Batch:80, Cost:0.53625, Accuracy:0.79688
Pass:2, Batch:100, Cost:0.64377, Accuracy:0.75781
Pass:2, Batch:120, Cost:0.59699, Accuracy:0.78906
Pass:2, Batch:140, Cost:0.61716, Accuracy:0.77344
Pass:2, Batch:160, Cost:0.77618, Accuracy:0.75781
Pass:2, Batch:180, Cost:0.50643, Accuracy:0.82812
Pass:2, Batch:200, Cost:0.62452, Accuracy:0.81250
Pass:2, Batch:220, Cost:0.57345, Accuracy:0.81250
Pass:2, Batch:240, Cost:0.58108, Accuracy:0.82812
Pass:2, Batch:260, Cost:0.53484, Accuracy:0.81250
Pass:2, Batch:280, Cost:0.51502, Accuracy:0.80469
Pass:2, Batch:300, Cost:0.58211, Accuracy:0.81250
Pass:2, Batch:320, Cost:0.61472, Accuracy:0.82812
Pass:2, Batch:340, Cost:0.42609, Accuracy:0.84375
Pass:2, Batch:360, Cost:0.39782, Accuracy:0.85938
Pass:2, Batch:380, Cost:0.48162, Accuracy:0.85938
Test:2, Cost:0.64629, ACC:0.78619

As you can see, the model performs quite well, reaching nearly 79% accuracy on the test set.

IV. Model Prediction

# create an Executor for inference
infer_exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()

Resize the test image so it matches the size of the training data:

# image preprocessing
def load_image(file):
    im = Image.open(file)
    # make sure the image has 3 channels (RGB), in case of grayscale or RGBA input
    im = im.convert('RGB')
    # resize the image to the same size as the training data
    # ANTIALIAS gives anti-aliased resampling
    im = im.resize((32, 32), Image.ANTIALIAS)
    # build the image array as float32
    im = np.array(im).astype(np.float32)
    # transpose from HWC to CHW
    im = im.transpose((2, 0, 1))
    # scale pixel values from 0-255 to 0-1
    im = im / 255.0
    im = np.expand_dims(im, axis=0)
    # now shaped like the network input: [1, 3, 32, 32]
    print('im shape:', im.shape)
    return im

Run the prediction:

# run inference
with fluid.scope_guard(inference_scope):
    # load the inference model from the save directory
    [inference_program, # the program used for inference
     feed_target_names, # feed_target_names: list of variable names that must be fed to the inference program
     fetch_targets] = fluid.io.load_inference_model(model_save_dir, # fetch_targets: list of Variables holding the inference results
                                                    infer_exe)      # infer_exe: the executor that runs the inference model
    
    infer_path = 'C:/Users/Desktop/DogCat_pic/dog/dog1.jpg'

    img = Image.open(infer_path)
    plt.imshow(img)
    plt.show()

    img = load_image(infer_path)

    results = infer_exe.run(inference_program,
                            feed={feed_target_names[0]: img},
                            fetch_list=fetch_targets)
    print('results', results)
    label_list = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
    print("infer results: %s" % label_list[np.argmax(results[0])])    

Running the code above predicts on this image:

[image]
The prediction is a dog:

[image]
Or we can predict on this image:

[image]
The prediction is a cat:

[image]
Of course, since the dataset contains not just dogs and cats but 10 different classes of objects, this model should in principle be able to classify all 10 classes, and in practice it does.

All of the examples shown below are classified correctly:
[image]
You can also download the model and play with it: cat-dog classification model (trained with the VGG-13 network above).

Or train one yourself; the complete code is below.

V. Complete Code

import paddle
import paddle.fluid as fluid
import numpy as np
from PIL import Image
import sys
from multiprocessing import cpu_count
import matplotlib.pyplot as plt
import os

########## Data preparation ##########
def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

def test_mapper(sample):
    img, label = sample
    # normalize the img array to values between 0 and 1
    img = img.flatten().astype('float32')/255.0
    return img, label

def train_mapper(sample):
    img, label = sample
    img = img.flatten().astype('float32')/255.0
    return img, label

# create a reader for the training set of the custom dataset
def train_r(buffered_size=1024):
    def reader():
        xs=[]
        ys=[]
        for i in range(1, 6):
            train_dict = unpickle("/home/aistudio/cifar-10-batches-py/data_batch_%d" % (i,))
            xs.append(train_dict[b'data'])
            ys.append(train_dict[b'labels'])
        Xtr = np.concatenate(xs)
        Ytr = np.concatenate(ys)
        for (x, y) in zip(Xtr, Ytr):
            yield x, int(y)
    return paddle.reader.xmap_readers(train_mapper, reader, cpu_count(), buffered_size)

# create a reader for the test set of the custom dataset
def test_r(buffered_size=1024):
    def reader():
        test_dict = unpickle("/home/aistudio/cifar-10-batches-py/test_batch")
        X = test_dict[b'data']
        Y = test_dict[b'labels']
        for (x, y) in zip(X, Y):
            yield x, int(y)
    return paddle.reader.xmap_readers(test_mapper, reader, cpu_count(), buffered_size)

BATCH_SIZE = 128
# data provider for training
train_reader = train_r()
train_reader = paddle.batch(
    paddle.reader.shuffle(
        reader = train_reader, buf_size = BATCH_SIZE*100),
    batch_size = BATCH_SIZE
    )
# data provider for testing
test_reader = test_r()
test_reader = paddle.batch(
    test_reader,
    batch_size = BATCH_SIZE
    )

########## Network configuration ##########
def convolutional_neural_network(img):
    # Module 1: two 3x3x64 convolutions + pooling
    conv1 = fluid.layers.conv2d(input=img,          # input image
                                filter_size=3,      # kernel size
                                num_filters=64,     # number of kernels (output channels)
                                padding=1,
                                act='relu')         # activation function

    conv2 = fluid.layers.conv2d(input=conv1,        # input feature map
                                filter_size=3,      # kernel size
                                num_filters=64,     # number of kernels (output channels)
                                padding=1,
                                act='relu')         # activation function

    pool1 = fluid.layers.pool2d(input=conv2,        # input
                                pool_size=2,        # pooling window size
                                pool_type='max',    # pooling type
                                pool_stride=2)      # pooling stride

    conv_pool_1 = fluid.layers.batch_norm(pool1)

    # Module 2: two 3x3x128 convolutions + pooling
    conv3 = fluid.layers.conv2d(input=conv_pool_1,
                                filter_size=3,
                                num_filters=128,
                                padding=1,
                                act='relu')

    conv4 = fluid.layers.conv2d(input=conv3,
                                filter_size=3,
                                num_filters=128,
                                padding=1,
                                act='relu')

    pool2 = fluid.layers.pool2d(input=conv4,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    conv_pool_2 = fluid.layers.batch_norm(pool2)

    # Module 3: two 3x3x256 convolutions + pooling
    conv5 = fluid.layers.conv2d(input=conv_pool_2,
                                filter_size=3,
                                num_filters=256,
                                padding=1,
                                act='relu')

    conv6 = fluid.layers.conv2d(input=conv5,
                                filter_size=3,
                                num_filters=256,
                                padding=1,
                                act='relu')
    
    pool3 = fluid.layers.pool2d(input=conv6,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    conv_pool_3 = fluid.layers.batch_norm(pool3)

    # Module 4: two 3x3x512 convolutions + pooling
    conv7 = fluid.layers.conv2d(input=conv_pool_3,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    conv8 = fluid.layers.conv2d(input=conv7,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    pool4 = fluid.layers.pool2d(input=conv8,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    conv_pool_4 = fluid.layers.batch_norm(pool4)

    # Module 5: two 3x3x512 convolutions + pooling
    conv9 = fluid.layers.conv2d(input=conv_pool_4,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    conv10 = fluid.layers.conv2d(input=conv9,
                                filter_size=3,
                                num_filters=512,
                                padding=1,
                                act='relu')

    pool5 = fluid.layers.pool2d(input=conv10,
                                pool_size=2,
                                pool_type='max',
                                pool_stride=2,
                                global_pooling=False)

    # fully connected output layer with softmax activation; 10 classes give 10 outputs
    fc1 = fluid.layers.fc(input=pool5,
                                 size=1000,
                                 act='relu')
    fc2 = fluid.layers.fc(input=fc1,
                                 size=1000,
                                 act='relu')
    prediction = fluid.layers.fc(input=fc2,
                                 size=10,
                                 act='softmax')
    return prediction

# define the input data
data_shape = [3, 32, 32]
paddle.enable_static()
images = fluid.layers.data(name='images', shape=data_shape, dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')

# get the classifier
predict = convolutional_neural_network(images)

# define the loss function and accuracy
cost = fluid.layers.cross_entropy(input=predict, label=label)   # cross entropy
avg_cost = fluid.layers.mean(cost)                              # mean over all elements of cost
acc = fluid.layers.accuracy(input=predict, label=label)         # accuracy from predictions and labels

# define the optimization method
test_program = fluid.default_main_program().clone(for_test=True)    # clone a program for testing
optimizer = fluid.optimizer.Adam(learning_rate=0.001)               # use the Adam optimizer
optimizer.minimize(avg_cost)
print("Done")

########## Model training & evaluation ##########
# create the Executor
use_cuda = True # whether to use the GPU; set use_cuda=False to run on the CPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

# define the data feeder
feeder = fluid.DataFeeder(feed_list=[images, label], place=place)

# helper to plot the training loss and accuracy curves
all_train_iter = 0
all_train_iters = []
all_train_costs = []
all_train_accs = []

def draw_train_process(title, iters, costs, accs, label_cost, label_acc):
    plt.title(title, fontsize=24)
    plt.xlabel("iter", fontsize=20)
    plt.ylabel("cost/acc", fontsize=20)
    plt.plot(iters, costs, color='red', label=label_cost)
    plt.plot(iters, accs, color='green', label=label_acc)
    plt.legend()
    plt.grid()
    plt.show()

# train and save the model
EPOCH_NUM = 3
model_save_dir = "/home/aistudio/model/DogCat_Detection.inference.model"

for pass_id in range(EPOCH_NUM):
    # start training
    for batch_id, data in enumerate(train_reader()):                            # iterate over train_reader
        train_cost, train_acc = exe.run(program=fluid.default_main_program(),   # run the main program
                                        feed=feeder.feed(data),                 # feed one batch of data
                                        fetch_list=[avg_cost, acc])             # fetch the average loss and accuracy

        all_train_iter = all_train_iter + BATCH_SIZE
        all_train_iters.append(all_train_iter)
        all_train_costs.append(train_cost[0])
        all_train_accs.append(train_acc[0])

        # print training stats every 20 batches
        if batch_id % 20 == 0:
            print('Pass:%d, Batch:%d, Cost:%0.5f, Accuracy:%0.5f' %
                  (pass_id, batch_id, train_cost[0], train_acc[0]))

    # start testing
    test_costs = [] # test losses
    test_accs = []  # test accuracies
    for batch_id, data in enumerate(test_reader()):
        test_cost, test_acc = exe.run(program=test_program,         # run the test program
                                      feed=feeder.feed(data),       # feed data
                                      fetch_list=[avg_cost, acc])   # fetch loss and accuracy
        test_costs.append(test_cost[0])                             # record each batch's loss
        test_accs.append(test_acc[0])                               # record each batch's accuracy

    test_cost = (sum(test_costs) / len(test_costs)) # average test loss
    test_acc = (sum(test_accs) / len(test_accs))    # average test accuracy
    print('Test:%d, Cost:%0.5f, ACC:%0.5f' % (pass_id, test_cost, test_acc))

# save the model
if not os.path.exists(model_save_dir):
    os.makedirs(model_save_dir)
print('save models to %s' % (model_save_dir))
fluid.io.save_inference_model(model_save_dir,
                              ['images'],
                              [predict],
                              exe)

print('Model saved')
draw_train_process("training", all_train_iters, all_train_costs, all_train_accs, "training cost", "training acc")