Writing and Loading YAML Files (1)

@BangBang · Published 2024-08-13 11:25:09

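YAML is a language designed for writing configuration files, and I personally find it more convenient than JSON. In Python, the PyYAML library is the usual way to work with YAML files. YAML (YAML Ain't Markup Language) is a data serialization format that represents data structures such as mappings (hash tables), lists, and scalars in a human-readable way.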

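1. Writing YAML files

1.1 YAML syntax basics

YAML is case sensitive.
Indentation: YAML indents with spaces, usually two per level; tabs are not allowed.
Hash (#): starts a comment.
Hyphen (-): starts a list item.
Colon (:): separates a key from its value.
Elements at the same level must be left-aligned.
Strings usually do not need quotes; quote them when the value could be read as another type (for example true, 30, or null) or contains special characters such as ": " or " #".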
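
The rule about unquoted strings has a catch that is easier to see in code. The sketch below is an illustration added here, with made-up keys: it shows how PyYAML, which follows YAML 1.1 type resolution, turns some unquoted values into non-string types, and how quoting keeps them as strings.

```python
import yaml

doc = """
plain: hello world
flag: yes
quoted_flag: "yes"
version: 1.10
quoted_version: "1.10"
empty:
"""

data = yaml.safe_load(doc)

# Print each value together with the Python type PyYAML chose for it
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

Unquoted yes becomes a boolean and unquoted 1.10 becomes the float 1.1, so values such as version numbers are safer in quotes.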

1.2 Installing PyYAML

pip install pyyaml

1.3 Examples

Example 1

The contents of test.yaml:

# This is a comment
person:
  name: John Doe
  age: 30
  skills:
    - Python
    - YAML
    - Linux
is_employee: true

Loading test.yaml:

import yaml
with open('test.yaml', 'r', encoding='utf-8') as file:
    try:
        data = yaml.safe_load(file)
        print(data)
    except yaml.YAMLError as exc:
        print(exc)

>> {'person': {'name': 'John Doe', 'age': 30, 'skills': ['Python', 'YAML', 'Linux']}, 'is_employee': True}

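As the output shows, a hyphen followed by a space introduces a list item, and a colon followed by a space separates a key from its value.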

Example 2

Dictionaries nested inside a list:

-
 name: 吴彦祖
 age: 21
-
 A: apple

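Reading the YAML file:

import yaml
# Read as UTF-8 so the non-ASCII name decodes correctly on any platform
with open('test.yaml', 'r', encoding='utf-8') as file: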
    try:
        data = yaml.safe_load(file)
        print(data)
    except yaml.YAMLError as exc:
        print(exc)

>>  [{'name': '吴彦祖', 'age': 21}, {'A': 'apple'}]

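Example 3

A larger configuration that combines block style with flow-style lists ([...]) and mappings ({...}):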

TASKS:
  - arrow
  - crosswalk
  - lane
  - stopline

EVALUATION:
  - name: LaneBEVEvaluator
    interp_interval: 0.1
    filter_region: [100, 5, 15, -15, 10, -10]
    filter_ignore: False
    # match_distance: [0.6, 0.6, 0]  # xy, xz, rv
    match_distance: [1, 1, 0]  # xy, xz, rv
    match_attribute: [road_curb, lane_type.color, lane_type.dashed, lane_type.shape]
    chamfer_config:
        decay_ratio: 0
        region_range: [100, 5, 15, -15, 10, -10]
        step: 1
        use_common_dt_gt: True
        keep_ratio: 0.9
        mode: d<->g
    eval_attributes: {
        road4: {loc_type: [right1, left2, left1, right2]},
        road2: {loc_type: [right1, left1]},
        road_curb: {road_curb: [1]},
        water_barrier: [{shuima: [1]}, {water_barrier: [1]}],
        yellow_line: {lane_type.color: [yellow]},
        white_line: {lane_type.color: [white]},
        dotted_line: {lane_type.dashed: [dotted]},
        solid_line: {lane_type.dashed: [solid]},
        single_line: {lane_type.shape: [single]},
        double_line: {lane_type.shape: [double]},
        single_white_dotted_line: {
            lane_type.shape: [single],
            lane_type.color: [white],
            lane_type.dashed: [dotted]
        },
        single_white_solid_line: {
            lane_type.shape: [single],
            lane_type.color: [white],
            lane_type.dashed: [solid]
        },
        road4_white: {
            lane_type.color: [white],
            loc_type: [right1, left2, left1, right2]
        },
        road4_yellow: {
            lane_type.color: [yellow],
            loc_type: [right1, left2, left1, right2]
        }
    }
    eval_threshold: 0.3

  - name: StoplineBEVEvaluator
    interp_interval: 0.1
    filter_region: [100, 5, 15, -15, 10, -10]
    filter_ignore: False
    match_distance: [0.6, 0.6, 0]  # xy, xz, rv
    chamfer_config:
        decay_ratio: 0
        region_range: [100, 5, 15, -15, 10, -10]
        step: 1
        use_common_dt_gt: True
        keep_ratio: 0.9
        mode: d<->g
    eval_threshold: 0.3

  - name: CrosswalkBEVEvaluator
    bev_region: [60, 0, 15, -15]
    resolution: [0.15, 0.15]

  - name: ArrowBEVEvaluator
    categories: [forbid_left, forbid_right, forbid_through, forbid_turn, 
                 guide_left, guide_right, guide_through, guide_turn]
    bev_region: [60, 0, 15, -15]
    resolution: [0.15, 0.15]
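
To show how a config like this might be consumed, here is a minimal sketch that loads a trimmed-down copy of the file above and reads a few nested values. The trimming and the evaluators lookup dict are choices made for this illustration, not part of the original config.

```python
import yaml

snippet = """
TASKS:
  - arrow
  - lane

EVALUATION:
  - name: LaneBEVEvaluator
    filter_ignore: False
    match_distance: [1, 1, 0]  # xy, xz, rv
    chamfer_config:
        step: 1
        mode: d<->g
    eval_attributes: {
        road2: {loc_type: [right1, left1]},
        yellow_line: {lane_type.color: [yellow]}
    }
  - name: CrosswalkBEVEvaluator
    resolution: [0.15, 0.15]
"""

cfg = yaml.safe_load(snippet)

# EVALUATION is a list of dicts; index it by evaluator name for easy lookup
evaluators = {entry["name"]: entry for entry in cfg["EVALUATION"]}
lane = evaluators["LaneBEVEvaluator"]

print(cfg["TASKS"])
print(lane["filter_ignore"], lane["chamfer_config"]["mode"])
print(lane["eval_attributes"]["yellow_line"])
```

Block-style and flow-style nodes produce the same Python structures, so the flow-style eval_attributes mapping is read like any other nested dict.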

1.4 Example

YOLOv8 dataset configuration, DOTAv1.yaml:

# Ultralytics YOLO 🚀, AGPL-3.0 license
# DOTA 1.0 dataset https://captain-whu.github.io/DOTA/index.html for object detection in aerial images by Wuhan University
# Documentation: https://docs.ultralytics.com/datasets/obb/dota-v2/
# Example usage: yolo train model=yolov8n-obb.pt data=DOTAv1.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── dota1  ← downloads here (2GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/DOTAv1 # dataset root dir
train: images/train # train images (relative to 'path') 1411 images
val: images/val # val images (relative to 'path') 458 images
test: images/test # test images (optional) 937 images

# Classes for DOTA 1.0
names:
  0: plane
  1: ship
  2: storage tank
  3: baseball diamond
  4: tennis court
  5: basketball court
  6: ground track field
  7: harbor
  8: bridge
  9: large vehicle
  10: small vehicle
  11: helicopter
  12: roundabout
  13: soccer ball field
  14: swimming pool

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/DOTAv1.zip

Loading this file with yaml.safe_load produces:
>> {'path': '../datasets/DOTAv1', 'train': 'images/train', 'val': 'images/val', 'test': 'images/test', 'names': {0: 'plane', 1: 'ship', 2: 'storage tank', 3: 'baseball diamond', 4: 'tennis court', 5: 'basketball court', 6: 'ground track field', 7: 'harbor', 8: 'bridge', 9: 'large vehicle', 10: 'small vehicle', 11: 'helicopter', 12: 'roundabout', 13: 'soccer ball field', 14: 'swimming pool'}, 'download': 'https://github.com/ultralytics/assets/releases/download/v0.0.0/DOTAv1.zip'}
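
One detail in this output is easy to miss: the keys under names are Python integers, not strings. The sketch below uses a shortened copy of the config (three classes only, chosen for brevity) to show that, and one possible way to turn the mapping into an ordered class list and a training path.

```python
import yaml
from pathlib import PurePosixPath

doc = """
path: ../datasets/DOTAv1
train: images/train
names:
  0: plane
  1: ship
  2: storage tank
"""

cfg = yaml.safe_load(doc)
names = cfg["names"]

# Unquoted numeric keys are loaded as int, so index with 0, not "0"
class_list = [names[i] for i in sorted(names)]

# 'train' is documented in the file as relative to 'path'
train_dir = PurePosixPath(cfg["path"]) / cfg["train"]

print(class_list)
print(train_dir)
```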

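2. Reading and writing YAML files

2.1 yaml.load() and yaml.safe_load()

In PyYAML, yaml.safe_load() and yaml.load() both parse a YAML file or string, but they differ in some important ways:

yaml.safe_load

safe_load loads YAML safely. It constructs only standard YAML types (mappings, lists, strings, numbers, booleans, null, and dates). It does not silently skip custom tags: an unrecognized tag such as !!python/object raises yaml.constructor.ConstructorError.
It is considered safe because it never executes code embedded in the YAML.
It is intended for files that contain no custom tags, which covers most configuration files.
yaml.safe_load is the recommended default.

Code example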

import yaml

with open('data.yaml', 'r') as file:
    try:
        data = yaml.safe_load(file)
    except yaml.YAMLError as exc:
        print(exc)

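yaml.load()

load takes a Loader argument that decides which tags are constructed, including custom and Python-specific ones.
With yaml.UnsafeLoader (or the legacy yaml.Loader), a tag such as !!python/object/apply can call an arbitrary Python callable, so loading a malicious file can execute arbitrary code.
load with a permissive loader is generally not recommended unless you fully trust the source of the YAML file.

Code example

import yaml

with open('data.yaml', 'r') as file:
    try:
        # Loader is required since PyYAML 6.0; a bare yaml.load(file) raises TypeError
        data = yaml.load(file, Loader=yaml.FullLoader)
    except yaml.YAMLError as exc:
        print(exc)

In PyYAML 5.1 through 5.4, calling yaml.load() without a Loader emitted a warning and fell back to FullLoader; since 6.0 the argument is mandatory. FullLoader resolves more tags than SafeLoader, but recent releases no longer let it construct arbitrary Python objects; that requires yaml.UnsafeLoader. yaml.load(file, Loader=yaml.SafeLoader) is equivalent to yaml.safe_load(file): custom tags are never executed, and an unrecognized tag still raises an exception.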
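
The difference between the loaders can be seen with a deliberately harmless tagged document. This is a sketch for illustration only: the tag calls os.getcwd, which merely returns the current directory, but the same mechanism could call any callable.

```python
import os
import yaml

# A tagged document asking the loader to call os.getcwd() with no arguments
doc = "!!python/object/apply:os.getcwd []"

# safe_load refuses the tag instead of calling the function
try:
    yaml.safe_load(doc)
    safe_rejected = False
except yaml.YAMLError as exc:
    safe_rejected = True
    print("safe_load rejected the document:", type(exc).__name__)

# Passing SafeLoader explicitly behaves the same way
try:
    yaml.load(doc, Loader=yaml.SafeLoader)
    explicit_rejected = False
except yaml.YAMLError:
    explicit_rejected = True

# The unsafe loader actually executes the call embedded in the YAML
unsafe_result = yaml.unsafe_load(doc)
print("unsafe_load returned:", unsafe_result)
```

Replace os.getcwd with os.system in an attacker's file and the consequence of using the unsafe loader on untrusted input becomes clear.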

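Security

safe_load is safe because it never runs constructors for custom tags.
load can be unsafe because, depending on the Loader, it may run them.

Functionality

safe_load does not resolve custom tags, but it handles the vast majority of standard YAML data.
load, given a suitable Loader, provides full YAML parsing, including custom tags.

When to use which

Use safe_load when you are unsure of a YAML file's origin or contents.
Use load only when you fully trust the source of the file and need custom tags resolved.
If in doubt, inspect the file by hand. A custom tag starts with ! followed by the tag name and appears immediately before the value it applies to, often at the top of the file. For example: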

!!python/object/apply:mymodule.MyClass
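
Manual inspection can also be automated. The following is a rough sketch, not an established PyYAML recipe: it uses yaml.scan, which only tokenizes the text and constructs no objects, to list every explicit tag in a document. The find_tags helper name is made up for this example.

```python
import yaml
from yaml.tokens import TagToken


def find_tags(text):
    """Return every explicit tag found in a YAML string."""
    tags = []
    # yaml.scan yields raw tokens; nothing is constructed or executed
    for token in yaml.scan(text):
        if isinstance(token, TagToken):
            handle, suffix = token.value  # e.g. ('!!', 'python/object/apply:...')
            tags.append((handle or "") + suffix)
    return tags


suspicious = "obj: !!python/object/apply:mymodule.MyClass []\n"
plain = "obj: just a string\n"

print(find_tags(suspicious))
print(find_tags(plain))
```

A non-empty result is a signal to look at the file more closely before choosing anything other than safe_load.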

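2.2 yaml.dump() and yaml.safe_dump()

yaml.dump() and yaml.safe_dump() both serialize Python objects to a YAML string or write them to a file. They differ mainly in which objects they accept and which tags they emit:

yaml.dump()

yaml.dump() serializes a Python object to a YAML string and can also write it to a file.
It accepts a custom Dumper class, so you can customize serialization, for example by registering custom representers.
yaml.dump() can serialize arbitrary Python objects by emitting Python-specific tags such as !!python/object. Dumping does not execute code itself, but the resulting file can only be read back with an unsafe loader, which is where the risk of arbitrary code execution arises.

Usage example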

import yaml

data = {'key': 'value'}

# Serialize to a YAML string
yaml_str = yaml.dump(data, default_flow_style=False)
print(yaml_str)

# Write to a file
with open('data.yaml', 'w') as f:
    yaml.dump(data, f, default_flow_style=False)

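yaml.safe_dump()

yaml.safe_dump() works like yaml.dump() but emits only standard YAML tags.
It serializes basic Python types (dict, list, str, int, float, bool, None, and the like) and raises yaml.representer.RepresenterError for objects it cannot represent, so its output can be read back with yaml.safe_load().
yaml.safe_dump() can likewise return a string or write to a file, but it always uses SafeDumper and does not accept a custom Dumper class.
yaml.safe_dump() is the recommended default.

Code example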

import yaml

data = {'key': 'value'}

# Safely serialize to a YAML string
yaml_str = yaml.safe_dump(data, default_flow_style=False)
print(yaml_str)

# Safely write to a file
with open('data.yaml', 'w') as f:
    yaml.safe_dump(data, f, default_flow_style=False)
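
Two keyword arguments are often useful alongside the calls above. The sketch below assumes PyYAML 5.1 or later (where sort_keys is available) and reuses the name from Example 2: allow_unicode=True writes non-ASCII text as-is instead of escaping it, and sort_keys=False keeps the insertion order of the dict.

```python
import yaml

data = {"name": "吴彦祖", "age": 21, "skills": ["Python", "YAML"]}

# Default: non-ASCII characters are escaped and keys are sorted alphabetically
escaped = yaml.safe_dump(data)

# Human-friendly: keep the characters and the original key order
readable = yaml.safe_dump(data, allow_unicode=True, sort_keys=False)

print(escaped)
print(readable)

# Round trip: loading the dumped text gives back an equal object
restored = yaml.safe_load(readable)
print(restored == data)
```

When writing such output to a file, opening it with encoding='utf-8' keeps the result portable across platforms.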

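Main differences

Safety: yaml.dump() may emit Python-specific tags such as !!python/object, which only an unsafe loader can read back. yaml.safe_dump() emits standard tags only and refuses objects that would need anything else.
Customization: yaml.dump() accepts a custom Dumper class to customize serialization; yaml.safe_dump() does not.
When to use which: use yaml.dump() if you need custom serialization logic and trust everyone who will load the result; otherwise use yaml.safe_dump() to avoid potential security risks.

Security advice

For safety, prefer yaml.safe_dump() unless you have a specific reason to use yaml.dump() and fully trust your data source.
If you do use yaml.dump(), never load untrusted YAML with an unsafe loader, to avoid the risk of code execution.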
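
The practical difference shows up as soon as a custom class is dumped. In this sketch the Point class is a made-up example: yaml.dump() writes it with a Python-specific tag, yaml.safe_dump() refuses it, and converting the object to a plain dict first is one simple way to stay on the safe path.

```python
import yaml


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# dump() emits a !!python/object tag naming the class
dumped = yaml.dump(p)
print(dumped)

# safe_dump() raises instead of emitting a Python-specific tag
try:
    yaml.safe_dump(p)
    safe_dump_failed = False
except yaml.YAMLError as exc:
    safe_dump_failed = True
    print("safe_dump refused:", type(exc).__name__)

# Workaround: dump the attributes as an ordinary mapping
plain = yaml.safe_dump(vars(p))
print(plain)
```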

3. Executing code stored in a YAML file

coco-pose.yaml

# Ultralytics YOLO 🚀, AGPL-3.0 license
# COCO 2017 dataset https://cocodataset.org by Microsoft
# Documentation: https://docs.ultralytics.com/datasets/pose/coco/
# Example usage: yolo train data=coco-pose.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco-pose  ← downloads here (20.1 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco-pose # dataset root dir
train: train2017.txt # train images (relative to 'path') 118287 images
val: val2017.txt # val images (relative to 'path') 5000 images
test: test-dev2017.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

# Keypoints
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

# Classes
names:
  0: person

# Download script/URL (optional)
download: |
  from ultralytics.utils.downloads import download
  from pathlib import Path

  # Download labels
  dir = Path(yaml['path'])  # dataset root dir
  url = 'https://github.com/ultralytics/assets/releases/download/v0.0.0/'
  urls = [url + 'coco2017labels-pose.zip']  # labels
  download(urls, dir=dir.parent)
  # Download data
  urls = ['http://images.cocodataset.org/zips/train2017.zip',  # 19G, 118k images
          'http://images.cocodataset.org/zips/val2017.zip',  # 1G, 5k images
          'http://images.cocodataset.org/zips/test2017.zip']  # 7G, 41k images (optional)
  download(urls, dir=dir / 'images', threads=3)

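The value of the download key is Python source code, stored as a literal block scalar (|). It can be run with exec:

import yaml

with open('coco-pose.yaml', 'r', encoding='utf-8') as file:
    try:
        data = yaml.safe_load(file)
        print(data)
    except yaml.YAMLError as exc:
        print(exc)
        raise  # data is undefined if parsing failed

# The script refers to the parsed config as `yaml` (see yaml['path']),
# so pass the dict in under that name. exec runs arbitrary code:
# only do this for files you trust.
exec(data['download'], {'yaml': data})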
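
The same pattern can be tried without downloading anything. The sketch below is a self-contained illustration with a made-up, harmless script: it shows that a block scalar loads as an ordinary string, and that passing a namespace dict to exec both supplies the yaml name the script expects and collects whatever the script defines.

```python
import yaml

doc = """
path: ../datasets/demo
download: |
  from pathlib import Path
  root = Path(yaml['path'])  # 'yaml' is the parsed config, not the module
  result = root.name
"""

cfg = yaml.safe_load(doc)

# The block scalar is just a multi-line string until it is executed
print(type(cfg["download"]).__name__)

# Run the script in its own namespace so it cannot touch our globals
namespace = {"yaml": cfg}
exec(cfg["download"], namespace)

print(namespace["result"])
```

Note that safe_load only protects the parsing step; once the string is handed to exec, the file's author has full control, so this is appropriate only for trusted configs.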