Training Keras Mask R-CNN on Your Own Data



hhhuua · published 2020-10-16 15:11:31

Based on the Keras version of Mask R-CNN: https://github.com/matterport/Mask_RCNN

Anaconda is assumed to be installed already.

Version information

python 3.6.9

keras 2.2.0

I. Setting up the environment

1. Create a virtual environment

# create the virtual environment mrcnn
conda create -n mrcnn python=3.6.9

# activate the virtual environment
conda activate mrcnn

2. Download the code

git clone https://github.com/matterport/Mask_RCNN

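Bugs:

bug1: ImportError: cannot import name 'saving'

This shows up when running demo.ipynb:

from keras.engine import saving

ImportError: cannot import name 'saving'

Cause: a version mismatch. The local Keras was 2.1.6 and needs to be switched to 2.2.0.

Fix: pip uninstall keras

pip install keras==2.2.0

(Running pip install keras== with no version number makes pip list every available Keras version.)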

(mrcnn) myuser@ubuntu:/data/lxh/dl/Mask_RCNN$ pip install keras==

Looking in indexes: http://mirrors.aliyun.com/pypi/simple/

ERROR: Could not find a version that satisfies the requirement keras== (from versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.3.3, 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.2.1, 1.2.2, 2.0.0, 2.0.1, 2.0.2, 2.0.3, 2.0.4, 2.0.5, 2.0.6, 2.0.7, 2.0.8, 2.0.9, 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.1.4, 2.1.5, 2.1.6, 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.2.4, 2.2.5, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 2.4.2, 2.4.3)

ERROR: No matching distribution found for keras==
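
Before reinstalling, it can help to confirm which Keras version the environment actually has. The following is only a sketch and not part of the original workflow: version_tuple, meets_minimum_keras and installed_keras_version are hypothetical helper names, and the 2.2.0 threshold is simply the version named in the fix above.

```python
def version_tuple(version):
    """Turn a version string such as '2.2.0' into (2, 2, 0)."""
    return tuple(int(part) for part in version.split(".")[:3])


def meets_minimum_keras(version, minimum="2.2.0"):
    """Assumption taken from the fix above: 2.1.6 is too old, 2.2.0 works."""
    return version_tuple(version) >= version_tuple(minimum)


def installed_keras_version():
    """Return the installed Keras version string, or None if Keras is missing."""
    try:
        import keras
    except ImportError:
        return None
    return keras.__version__


print(meets_minimum_keras("2.1.6"))  # False: the version that raised ImportError
print(meets_minimum_keras("2.2.0"))  # True: the version installed by the fix
```

The check only covers the lower bound. This post pins exactly 2.2.0, and newer releases are not covered by it.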

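bug2: labelme TypeError: rectangle() got an unexpected keyword argument 'width'

Cause: the installed Pillow is too old. Newer versions accept the width argument, which sets the line width of the rectangle outline.

The fix is therefore to upgrade Pillow to 5.3.0.

Local version: 5.0.0

Fix: pip uninstall Pillow

pip install Pillow==5.3.0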

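II. Annotating the data

1. Install labelme

(As of 2020-07-28, labelme==3.16.2 is the recommended version.)

pip install labelme==3.16.2

bug: installing the latest version with a plain pip install labelme means info.yaml is not generated in the later steps.

The cause is that json_to_dataset.py under C:\ProgramData\Anaconda3\Lib\site-packages\labelme\cli has changed.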

# newer labelme
    PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
    utils.lblsave(osp.join(out_dir, 'label.png'), lbl)
    PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))

    with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
        for lbl_name in label_names:
            f.write(lbl_name + '\n')
            
    logger.info('Saved to: {}'.format(out_dir))

if __name__ == '__main__':
    main()

Compare with the older version of the script:

# older labelme
    PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
    utils.lblsave(osp.join(out_dir, 'label.png'), lbl)
    PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))

    with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
        for lbl_name in label_names:
            f.write(lbl_name + '\n')

    # the part that is missing from the newer version
    logger.warning('info.yaml is being replaced by label_names.txt')
    info = dict(label_names=label_names)
    with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
        yaml.safe_dump(info, f, default_flow_style=False)

    logger.info('Saved to: {}'.format(out_dir))

if __name__ == '__main__':
    main()

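Fix: copy the part that writes the yaml file from the older version into the newer one (remember to import yaml at the top).

# at the top of the file
import yaml


# near the end, add the part that writes info.yaml; mind the position
# (inside main(), right before the final logger.info call)
    logger.warning('info.yaml is being replaced by label_names.txt')
    info = dict(label_names=label_names)
    with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
        yaml.safe_dump(info, f, default_flow_style=False)

After this change, converting a json file produces info.yaml again:

labelme_json_to_dataset <filename>.json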
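
If patching the installed labelme is not an option, note that the newer script shown above still writes label_names.txt with one label per line, which is the same list that info.yaml stores under label_names. A minimal sketch of reading it, assuming that file layout; read_label_names is a hypothetical helper, not part of labelme, and the label values in the demo are made up:

```python
import os
import tempfile


def read_label_names(out_dir):
    """Read label_names.txt from a labelme_json_to_dataset output directory."""
    path = os.path.join(out_dir, "label_names.txt")
    with open(path, encoding="utf-8") as f:
        # one label per line; skip empty lines such as a trailing newline
        return [line.strip() for line in f if line.strip()]


# Self-contained demo: a temporary directory stands in for <name>_json
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "label_names.txt"), "w", encoding="utf-8") as f:
        f.write("_background_\n6mm\ncar\n")
    print(read_label_names(demo_dir))  # ['_background_', '6mm', 'car']
```

The training script in this post reads info.yaml, so using this route would mean changing its label-loading method accordingly.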

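2. Annotate the data

Open the labelme annotation tool and start annotating.

1) Create a pic folder and a json folder. In labelme's File menu (top left), enable Save Automatically and set Change Output Dir.

pic: holds the images to annotate (note: image file names in pic must not contain an underscore "_")

json: the output directory for the json files produced by annotation

When annotation is finished, the json folder contains one json file per image.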
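
Both conditions above (no underscore in image names, one json file per image) are easy to check before moving on. The sketch below is one possible check; check_annotation_names is a hypothetical helper, and the file names in the demo are made up.

```python
import os


def check_annotation_names(pic_names, json_names):
    """Return (image names containing '_', image names without a json file)."""
    json_stems = {os.path.splitext(name)[0] for name in json_names}
    with_underscore = []
    missing_json = []
    for name in sorted(pic_names):
        stem = os.path.splitext(name)[0]
        if "_" in stem:
            with_underscore.append(name)
        if stem not in json_stems:
            missing_json.append(name)
    return with_underscore, missing_json


# In practice the two lists would come from os.listdir("pic") and os.listdir("json")
bad, missing = check_annotation_names(
    ["01.jpg", "02.jpg", "my_car.jpg"], ["01.json", "my_car.json"]
)
print(bad)      # ['my_car.jpg']
print(missing)  # ['02.jpg']
```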

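3. Generate the dataset in batch

The directory structure is as follows: train_data contains the subdirectories cv2_mask, pic, json, and labelme_json.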

3.1 python makedir.py

# makedir.py
'''
Create the directories needed under train_data
'''
import os


dir_list = ['train_data/cv2_mask', 'train_data/pic', 'train_data/json', 'train_data/labelme_json']

for dir_path in dir_list:
    print(dir_path)
    # exist_ok=True makes the script safe to run more than once
    os.makedirs(dir_path, exist_ok=True)

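3.2 python transform_json.py (convert the json files)

Mind the path setting. With path = 'json' as in the script, a directory for each image is generated inside the json folder.
Copy those directories into transform_json.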

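# 2transform_json.py
'''
Convert all json files in a folder in one go
'''
import os
import subprocess

path = 'json'  # directory that holds the json files
# run this inside the environment where labelme is installed
for file in os.listdir(path):
    # skip the <name>_json output directories left by earlier runs
    if not file.endswith('.json'):
        continue
    filename = os.path.join(path, file)
    # passing a list avoids shell quoting problems with odd file names
    subprocess.run(['labelme_json_to_dataset', filename])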

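3.3 python rename_cv2_mask.py (copy label.png)

(Copies label.png from each directory generated in the previous step into cv2_mask and renames it after the image.)
Mind the source directory path: it is called transform_json above, while the script uses dir_name = 'train_data/labelme_json'. The directories generated in step 3.2 must already be in whichever directory dir_name points to.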

# 3rename_cv2_mask.py
import os
import shutil

# Copy label.png from every <name>_json folder to train_data/cv2_mask/<name>.png
dir_name = 'train_data/labelme_json'
for sub_dir in os.listdir(dir_name):
    print(sub_dir)
    # 01_json --> 01 --> 01.png
    # strip only the trailing '_json' so that names containing '_' survive
    base = sub_dir[:-len('_json')] if sub_dir.endswith('_json') else sub_dir
    new_name = base + '.png'
    old_name = os.path.join(dir_name, sub_dir, 'label.png')
    # label.png --> 01.png, copied to train_data/cv2_mask
    shutil.copy(old_name, os.path.join('train_data/cv2_mask', new_name))
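
A naming rule based on split('_')[0] keeps only the text before the first underscore, which is why underscores in image names are a problem for it. The sketch below contrasts that rule with removing only the trailing _json suffix. Both function names are hypothetical and only serve the illustration.

```python
def mask_name_by_split(dir_name):
    """Rule based on split('_'): keeps only the text before the first '_'."""
    return dir_name.split("_")[0] + ".png"


def mask_name_by_suffix(dir_name, suffix="_json"):
    """Alternative rule: remove only the trailing '_json' suffix."""
    stem = dir_name[:-len(suffix)] if dir_name.endswith(suffix) else dir_name
    return stem + ".png"


for name in ["01_json", "my_car_json"]:
    print(name, mask_name_by_split(name), mask_name_by_suffix(name))
# 01_json 01.png 01.png
# my_car_json my.png my_car.png
```

With the split rule, my_car_json and my_bike_json would both map to my.png and overwrite each other.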

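3.4 train_data: pic, json, labelme_json (copy the corresponding files into each directory)

Put the images and json files into pic and json under train_data.
Copy the directories generated in step 3.2 into train_data/labelme_json.

This yields the training set train_data. Place this directory in the root directory of Mask_RCNN.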
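
The copying in step 3.4 can also be scripted. The following is a sketch under the assumption stated in step 3.2 that the generated <name>_json directories sit inside the json folder; collect_train_data is a hypothetical helper and the demo uses empty placeholder files.

```python
import os
import shutil
import tempfile


def collect_train_data(pic_dir, json_dir, train_root):
    """Copy images, json files and <name>_json directories into train_root."""
    for sub in ("pic", "json", "labelme_json"):
        os.makedirs(os.path.join(train_root, sub), exist_ok=True)
    for name in os.listdir(pic_dir):
        shutil.copy(os.path.join(pic_dir, name), os.path.join(train_root, "pic", name))
    for name in os.listdir(json_dir):
        src = os.path.join(json_dir, name)
        if os.path.isdir(src) and name.endswith("_json"):
            # directory generated by labelme_json_to_dataset
            dst = os.path.join(train_root, "labelme_json", name)
            if not os.path.exists(dst):
                shutil.copytree(src, dst)
        elif name.endswith(".json"):
            shutil.copy(src, os.path.join(train_root, "json", name))


# Demo on a throwaway directory tree with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    pic, js = os.path.join(tmp, "pic"), os.path.join(tmp, "json")
    os.makedirs(pic)
    os.makedirs(os.path.join(js, "01_json"))
    for path in (os.path.join(pic, "01.jpg"), os.path.join(js, "01.json"),
                 os.path.join(js, "01_json", "label.png")):
        open(path, "w").close()
    root = os.path.join(tmp, "train_data")
    collect_train_data(pic, js, root)
    print(sorted(os.listdir(os.path.join(root, "labelme_json"))))  # ['01_json']
```

Since rename_cv2_mask.py reads from train_data/labelme_json, this copy has to happen before step 3.3 is run.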

IV. Training

4.1 Download the pretrained weights

Download mask_rcnn_coco.h5 and put it in the root directory.

(Search for it online, or run demo.ipynb.)

4.2 Training script train.py

# -*- coding: utf-8 -*-

import os
import sys
import random
import math
import re
import time
import numpy as np
import cv2
import matplotlib
import matplotlib.pyplot as plt
import tensorflow as tf
from mrcnn.config import Config
# import utils
from mrcnn import model as modellib, utils
from mrcnn import visualize
import yaml
from mrcnn.model import log
from PIL import Image

# os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# Root directory of the project
ROOT_DIR = os.getcwd()

# ROOT_DIR = os.path.abspath("../")
# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

iter_num = 0

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)


class ShapesConfig(Config):
    """Configuration for training on the toy shapes dataset.
    Derives from the base Config class and overrides values specific
    to the toy shapes dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Train on 1 GPU with 1 image per GPU, so the batch size is 1
    # (batch size = GPU_COUNT * IMAGES_PER_GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Number of classes (including background)
    NUM_CLASSES = 1 + 2  # background + 2 shapes

    # Use small images for faster training. Set the limits of the small side
    # the large side, and that determines the image shape.
    IMAGE_MIN_DIM = 320
    IMAGE_MAX_DIM = 384

    # Anchor scales: 6x the toy-shapes values (8, 16, 32, 64, 128), i.e. 48 to 768 pixels
    RPN_ANCHOR_SCALES = (8 * 6, 16 * 6, 32 * 6, 64 * 6, 128 * 6)  # anchor side in pixels

    # Reduce training ROIs per image because the images are small and have
    # few objects. Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 100

    # Use a small epoch since the data is simple
    STEPS_PER_EPOCH = 100

    # use small validation steps since the epoch is small
    VALIDATION_STEPS = 50


config = ShapesConfig()
config.display()


class DrugDataset(utils.Dataset):
    # Number of instances in the image: the largest label value (0 is background)
    def get_obj_index(self, image):
        n = np.max(np.array(image))
        return n

    # Parse the info.yaml written by labelme to get the label of each mask layer
    def from_yaml_get_class(self, image_id):
        info = self.image_info[image_id]
        with open(info['yaml_path']) as f:
            temp = yaml.safe_load(f)
            labels = temp['label_names']
            del labels[0]  # drop the first entry, the background label
        return labels

    # Re-implemented draw_mask: layer k of the mask marks the pixels whose label value is k + 1
    def draw_mask(self, num_obj, mask, image, image_id):
        # print("draw_mask-->",image_id)
        # print("self.image_info",self.image_info)
        info = self.image_info[image_id]
        # print("info-->",info)
        # print("info[width]----->",info['width'],"-info[height]--->",info['height'])
        for index in range(num_obj):
            for i in range(info['width']):
                for j in range(info['height']):
                    # print("image_id-->",image_id,"-i--->",i,"-j--->",j)
                    # print("info[width]----->",info['width'],"-info[height]--->",info['height'])
                    at_pixel = image.getpixel((i, j))
                    if at_pixel == index + 1:
                        mask[j, i, index] = 1
        return mask

    # Re-implemented load_shapes with our own classes.
    # Adds path, mask_path and yaml_path to self.image_info.
    # Example arguments:
    # dataset_root_path = "/tongue_dateset/"
    # img_floder = dataset_root_path + "rgb"
    # mask_floder = dataset_root_path + "mask"
    def load_shapes(self, count, img_floder, mask_floder, imglist, dataset_root_path):
        """Register the annotated images with the dataset.
        count: number of images to register.
        imglist: file names of the images inside img_floder.
        """
        # Add classes
        self.add_class("shapes", 1, "6mm")
        self.add_class("shapes", 2, "car")

        for i in range(count):
            # Read each image to get its width and height
            print(i)
            filestr = imglist[i].split(".")[0]
            # print(imglist[i],"-->",cv_img.shape[1],"--->",cv_img.shape[0])
            # print("id-->", i, " imglist[", i, "]-->", imglist[i],"filestr-->",filestr)
            # filestr = filestr.split("_")[1]
            mask_path = mask_floder + "/" + filestr + ".png"
            yaml_path = dataset_root_path + "labelme_json/" + filestr + "_json/info.yaml"
            print(dataset_root_path + "labelme_json/" + filestr + "_json/img.png")
            cv_img = cv2.imread(dataset_root_path + "labelme_json/" + filestr + "_json/img.png")

            self.add_image("shapes", image_id=i, path=img_floder + "/" + imglist[i],
                           width=cv_img.shape[1], height=cv_img.shape[0], mask_path=mask_path, yaml_path=yaml_path)

    # Overridden load_mask
    def load_mask(self, image_id):
        """Build the instance masks and class IDs for the given image ID.
        """
        global iter_num
        print("image_id", image_id)
        info = self.image_info[image_id]
        count = 1  # with count = 1 the occlusion loop below never runs
        img = Image.open(info['mask_path'])
        num_obj = self.get_obj_index(img)
        mask = np.zeros([info['height'], info['width'], num_obj], dtype=np.uint8)
        mask = self.draw_mask(num_obj, mask, img, image_id)
        occlusion = np.logical_not(mask[:, :, -1]).astype(np.uint8)
        for i in range(count - 2, -1, -1):
            mask[:, :, i] = mask[:, :, i] * occlusion

            occlusion = np.logical_and(occlusion, np.logical_not(mask[:, :, i]))
        # Class label of every mask layer, background already removed
        labels = self.from_yaml_get_class(image_id)
        labels_form = []
        for i in range(len(labels)):
            if labels[i].find("6mm") != -1:
                labels_form.append("6mm")
            elif labels[i].find("car") != -1:
                labels_form.append("car")
            # elif labels[i].find("other_class_name") != -1:
            #     labels_form.append("other_class_name")
        class_ids = np.array([self.class_names.index(s) for s in labels_form])
        return mask, class_ids.astype(np.int32)


def get_ax(rows=1, cols=1, size=8):
    """Return a Matplotlib Axes array to be used in
    all visualizations in the notebook. Provide a
    central point to control graph sizes.
    Change the default size attribute to control the size
    of rendered images
    """
    _, ax = plt.subplots(rows, cols, figsize=(size * cols, size * rows))
    return ax


# Basic settings
dataset_root_path = "train_data/"
img_floder = dataset_root_path + "pic"
mask_floder = dataset_root_path + "cv2_mask"
# yaml_floder = dataset_root_path
imglist = os.listdir(img_floder)
count = len(imglist)

# Prepare the train and val datasets (both load the same images here)
dataset_train = DrugDataset()
dataset_train.load_shapes(count, img_floder, mask_floder, imglist, dataset_root_path)
dataset_train.prepare()

# print("dataset_train-->",dataset_train._image_ids)

dataset_val = DrugDataset()
dataset_val.load_shapes(count, img_floder, mask_floder, imglist, dataset_root_path)
dataset_val.prepare()

# print("dataset_val-->",dataset_val._image_ids)

# Load and display random samples
# image_ids = np.random.choice(dataset_train.image_ids, 4)
# for image_id in image_ids:
#    image = dataset_train.load_image(image_id)
#    mask, class_ids = dataset_train.load_mask(image_id)
#    visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)

# Create model in training mode
model = modellib.MaskRCNN(mode="training", config=config,
                          model_dir=MODEL_DIR)

# Which weights to start with?
init_with = "coco"  # imagenet, coco, or last

if init_with == "imagenet":
    model.load_weights(model.get_imagenet_weights(), by_name=True)
elif init_with == "coco":
    # Load weights trained on MS COCO, but skip layers that
    # are different due to the different number of classes
    # See README for instructions to download the COCO weights
    # print(COCO_MODEL_PATH)
    model.load_weights(COCO_MODEL_PATH, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
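elif init_with == "last":
    # Load the last model you trained and continue training.
    # In the current Mask_RCNN code find_last() returns the checkpoint path itself;
    # if your copy returns (dir, path) instead, use model.find_last()[1].
    model.load_weights(model.find_last(), by_name=True)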

# Train the head branches
# Passing layers="heads" freezes all layers except the head
# layers. You can also pass a regular expression to select
# which layers to train by name pattern.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=50,
            layers='heads')

# Fine tune all layers
# Passing layers="all" trains all layers. You can also
# pass a regular expression to select which layers to
# train by name pattern. epochs is the total epoch count, so this
# stage continues from epoch 50 up to epoch 100.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=100,
            layers="all")
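
The nested loops in draw_mask call getpixel once per pixel and per instance, which can be slow on larger images. A vectorized variant with NumPy might look like the sketch below. It assumes label.png is a single-channel image whose pixel values are 0 for background and 1..n for the instances, as the loop version already implies; draw_mask_vectorized is a hypothetical name, not part of Mask_RCNN.

```python
import numpy as np


def draw_mask_vectorized(label_array, num_obj):
    """Build an (H, W, num_obj) uint8 mask from an (H, W) label image.

    Layer k is 1 where the label image equals k + 1, as in draw_mask.
    """
    height, width = label_array.shape
    mask = np.zeros([height, width, num_obj], dtype=np.uint8)
    for index in range(num_obj):
        mask[:, :, index] = (label_array == index + 1)
    return mask


# Tiny stand-in for np.array(Image.open(mask_path)); 0 is background
label = np.array([[0, 1, 1],
                  [2, 2, 0]])
mask = draw_mask_vectorized(label, int(label.max()))
print(mask.shape)               # (2, 3, 2)
print(mask[:, :, 0].tolist())   # [[0, 1, 1], [0, 0, 0]]
print(mask[:, :, 1].tolist())   # [[0, 0, 0], [1, 1, 0]]
```

Inside load_mask this would be called with np.array(img) in place of the PIL image.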

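What to modify:

4.2.1 Path to the pretrained weights

Line 33: COCO_MODEL_PATH

4.2.2 Number of classes

Using two classes as the example: "6mm" and "car"

Line 53: NUM_CLASSES

NUM_CLASSES = 1 (background) + number of annotated classes

4.2.3 Data source path

Line 183: dataset_root_path

4.2.4 Names of the annotated classes

There are two places:

1) Change the add_class calls to your annotated class names, here "6mm" and "car";

Lines 122-123 (the self.add_class calls in load_shapes):

IDs start at 1. With more than two classes, keep adding them in order,

for example: self.add_class("shapes", 3, "other_class_name")

2) Change the names to your own annotated class names

Lines 161-164 (the if/elif chain in load_mask):

In labels[i].find(class_name), class_name is simply the class name you used when annotating.

With more classes, copy the elif branch and fill in your own class name.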
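
Because the class names appear in several places, one way to keep them consistent is to define them once and derive the rest. The sketch below is an illustration only: CLASS_NAMES and match_class are hypothetical names, and the instance labels in the demo are made up.

```python
CLASS_NAMES = ("6mm", "car")          # the example classes from this post
NUM_CLASSES = 1 + len(CLASS_NAMES)    # background + annotated classes


def match_class(label, class_names=CLASS_NAMES):
    """Return the first class name contained in label, or None if none matches."""
    for name in class_names:
        if label.find(name) != -1:
            return name
    return None


labels = ["car1", "car2", "6mm"]          # labels as they might appear in info.yaml
print([match_class(s) for s in labels])   # ['car', 'car', '6mm']
print(match_class("truck"))               # None
print(NUM_CLASSES)                        # 3
```

A None result points at a label that no class covers. In the if/elif chain such a label is skipped silently, which leaves fewer class IDs than mask layers.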

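4.2.5 Other training hyperparameters

Lines 45-71 (the body of ShapesConfig):

Change the image size, batch size, steps per epoch, and so on (the number of epochs is set in the model.train calls).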
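
How these settings relate to each other can be worked out in a few lines. This is a sketch with hypothetical helper names and a made-up image count of 250. The batch-size formula comes from the config comments, while the divisibility check reflects an assumption about Mask_RCNN (image sizes must be divisible by 2 at least six times) that is worth verifying against your copy of the library.

```python
import math


def batch_size(gpu_count, images_per_gpu):
    """Effective batch size: GPU_COUNT * IMAGES_PER_GPU."""
    return gpu_count * images_per_gpu


def steps_to_cover(num_images, gpu_count, images_per_gpu):
    """Steps needed for one pass over num_images (one possible STEPS_PER_EPOCH)."""
    return math.ceil(num_images / batch_size(gpu_count, images_per_gpu))


def divisible_by_64(dim):
    """Assumed Mask_RCNN constraint: the image size divides by 2 six times."""
    return dim % 64 == 0


print(batch_size(1, 1))           # 1
print(steps_to_cover(250, 1, 2))  # 125
print(divisible_by_64(384))       # True
print(divisible_by_64(400))       # False
```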

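5. Train

Run it directly:

python train.py

If an error mentions a shape or dimension mismatch, the cause is almost always that the number of classes in your data does not match NUM_CLASSES in the code.

Other common errors are covered at: https://blog.csdn.net/lovebyz/article/details/80138261

[Screenshot: output of a normal training run]

After it has been running for a while (with the settings above a model is saved every 100 training steps, that is, once per epoch with STEPS_PER_EPOCH = 100), the trained h5 model files appear under the logs folder.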
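
Locating the newest h5 file by hand gets tedious after a few runs. The sketch below picks it by name, assuming the layout visible in this post's test script (one run directory per training run, files named mask_rcnn_<name>_<epoch>.h5 with zero-padded epoch numbers). find_latest_checkpoint is a hypothetical helper, and the demo builds a throwaway logs directory with empty files.

```python
import os
import tempfile


def find_latest_checkpoint(logs_dir, prefix="mask_rcnn_"):
    """Return the path of the last .h5 file in the last run directory, or None."""
    run_dirs = sorted(
        d for d in os.listdir(logs_dir) if os.path.isdir(os.path.join(logs_dir, d))
    )
    for run in reversed(run_dirs):
        files = sorted(
            f for f in os.listdir(os.path.join(logs_dir, run))
            if f.startswith(prefix) and f.endswith(".h5")
        )
        if files:
            return os.path.join(logs_dir, run, files[-1])
    return None


with tempfile.TemporaryDirectory() as logs:
    run = os.path.join(logs, "shapes20201014T2037")
    os.makedirs(run)
    for name in ("mask_rcnn_shapes_0001.h5", "mask_rcnn_shapes_0002.h5"):
        open(os.path.join(run, name), "w").close()
    print(os.path.basename(find_latest_checkpoint(logs)))  # mask_rcnn_shapes_0002.h5
```

Sorting by name only orders runs chronologically when they share the same config NAME.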

6. Test

Test with the trained model.

Create a file fortest.py in the project root directory (the same directory as train.py).

Then copy the following code into it:

# -*- coding: utf-8 -*-
import os
import sys
import random
import math
import numpy as np
import skimage.io
import matplotlib
import matplotlib.pyplot as plt
import cv2
import time
from mrcnn.config import Config
from datetime import datetime

os.environ['CUDA_VISIBLE_DEVICES'] = "0"
# Root directory of the project
ROOT_DIR = os.getcwd()

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize

# Import COCO config
# sys.path.append(os.path.join(ROOT_DIR, "samples/coco/"))  # To find local version
# from samples.coco import coco


# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to the trained weights file (a checkpoint written by train.py)
COCO_MODEL_PATH = os.path.join(MODEL_DIR, "shapes20201014T2037/mask_rcnn_shapes_0200.h5")
# The trained weights must already exist. Downloading the COCO weights to
# this path would not help: they were trained for a different class count.
if not os.path.exists(COCO_MODEL_PATH):
    raise FileNotFoundError("trained weights not found: " + COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")


class ShapesConfig(Config):
    """Configuration for training on the toy shapes dataset.
    Derives from the base Config class and overrides values specific
    to the toy shapes dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Use 1 GPU with 1 image per GPU, so the batch size is 1
    # (batch size = GPU_COUNT * IMAGES_PER_GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Number of classes (including background)
    NUM_CLASSES = 1 + 2  # background + 2 classes ("6mm" and "car")

    # Use small images for faster training. Set the limits of the small side
    # the large side, and that determines the image shape.
    IMAGE_MIN_DIM = 320
    IMAGE_MAX_DIM = 384

    # Use smaller anchors because our image and objects are small
    RPN_ANCHOR_SCALES = (8 * 6, 16 * 6, 32 * 6, 64 * 6, 128 * 6)  # anchor side in pixels

    # Reduce training ROIs per image because the images are small and have
    # few objects. Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 100

    # Use a small epoch since the data is simple
    STEPS_PER_EPOCH = 100

    # use small validation steps since the epoch is small
    VALIDATION_STEPS = 50


# import train_tongue
# class InferenceConfig(coco.CocoConfig):
class InferenceConfig(ShapesConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


config = InferenceConfig()

# Create model object in inference mode.
# The batch size is GPU_COUNT * IMAGES_PER_GPU = 1 here, and model.detect
# expects exactly that many images per call.
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

# Load the weights produced by train.py
model.load_weights(COCO_MODEL_PATH, by_name=True)

# Class names
# Index of the class in the list is its ID. For example, to get ID of
# the car class, use: class_names.index('car')
class_names = ['BG', '6mm', 'car']
# Load the test image; file_names lists everything in IMAGE_DIR
file_names = next(os.walk(IMAGE_DIR))[2]
image = skimage.io.imread("images/car_demo.jpg")

a = datetime.now()
# Run detection
results = model.detect([image], verbose=1)
b = datetime.now()
# Print the elapsed time, then visualize the results
print("elapsed seconds", (b - a).total_seconds())
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            class_names, r['scores'])
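
Besides the visualization, the result dictionary can be turned into a plain list for logging or further processing. The sketch below uses the keys that appear in the code above (rois, class_ids, scores). summarize_detections is a hypothetical helper, the demo values are made up, and the y1, x1, y2, x2 box order is an assumption to check against the library.

```python
def summarize_detections(result, class_names):
    """Return one (class name, score, box) tuple per detected instance."""
    summary = []
    for class_id, score, box in zip(result["class_ids"], result["scores"], result["rois"]):
        summary.append((class_names[int(class_id)], float(score), [int(v) for v in box]))
    return summary


# Hand-made stand-in for results[0]; the real values are NumPy arrays
fake_result = {
    "class_ids": [2, 1],
    "scores": [0.98, 0.87],
    "rois": [[10, 20, 110, 220], [30, 40, 60, 90]],  # assumed order: y1, x1, y2, x2
}
for name, score, box in summarize_detections(fake_result, ["BG", "6mm", "car"]):
    print(name, round(score, 2), box)
# car 0.98 [10, 20, 110, 220]
# 6mm 0.87 [30, 40, 60, 90]
```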

Three paths need to be changed:

1) Path to the model file

Line 34 (COCO_MODEL_PATH): the h5 file under the logs directory