When training a model with TensorFlow you need to keep an eye on the learning curves to avoid overfitting, and TensorBoard makes it very convenient to visualize and analyze all kinds of data produced during training. Fortunately, with tensorboardX we can get the same visualization experience in PyTorch. (Even though tensorboardX is used under PyTorch, it still relies on TensorFlow's TensorBoard under the hood, so TensorFlow has to be installed first.)

1 Install the required tools

pip install tensorboardX
pip install tensorboard
pip install tensorflow

2 Usage example in code

To use it, first create a SummaryWriter instance:

from tensorboardX import SummaryWriter

# Create a SummaryWriter instance; everything recorded with this writer is saved in the ./Result folder
writer = SummaryWriter('./Result')

"""
使用add_scalar添加数字记录

tag (string): 数据名称,不同名称的数据使用不同曲线展示
scalar_value (float): 数字常量值
global_step (int, optional): 训练的 step
walltime (float, optional): 记录发生的时间,默认为 time.time()
"""
writer.add_scalar(tag_name="tag_name", scaler_value=value, global_step=cur_step)

SummaryWriter has many configurable parameters; the most commonly used ones are:

  • logdir: where the data is saved; if it is not given, a runs folder is created in the current directory and the results are saved there automatically
  • comment: if logdir is not given, the content of comment is appended to the end of the default log directory name; if logdir is given, this argument has no effect

For details, see the docstring below:

SummaryWriter(
    logdir=None,
    comment='',
    purge_step=None,
    max_queue=10,
    flush_secs=120,
    filename_suffix='',
    write_to_disk=True,
    log_dir=None,
    **kwargs,
)

Docstring:     
Writes entries directly to event files in the logdir to be
consumed by TensorBoard.

The `SummaryWriter` class provides a high-level API to create an event file
in a given directory and add summaries and events to it. The class updates the
file contents asynchronously. This allows a training program to call methods
to add data to the file directly from the training loop, without slowing down
training.
Init docstring:
Creates a `SummaryWriter` that will write out events and summaries
to the event file.

Args:
    logdir (string): Save directory location. Default is
      runs/**CURRENT_DATETIME_HOSTNAME**, which changes after each run.
      Use hierarchical folder structure to compare
      between runs easily. e.g. pass in 'runs/exp1', 'runs/exp2', etc.
      for each new experiment to compare across them.
    comment (string): Comment logdir suffix appended to the default
      ``logdir``. If ``logdir`` is assigned, this argument has no effect.
    purge_step (int):
      When logging crashes at step :math:`T+X` and restarts at step :math:`T`,
      any events whose global_step larger or equal to :math:`T` will be
      purged and hidden from TensorBoard.
      Note that crashed and resumed experiments should have the same ``logdir``.
    max_queue (int): Size of the queue for pending events and
      summaries before one of the 'add' calls forces a flush to disk.
      Default is ten items.
    flush_secs (int): How often, in seconds, to flush the
      pending events and summaries to disk. Default is every two minutes.
    filename_suffix (string): Suffix added to all event filenames in
      the logdir directory. More details on filename construction in
      tensorboard.summary.writer.event_file_writer.EventFileWriter.
    write_to_disk (boolean):
      If pass `False`, SummaryWriter will not write to disk.

Examples::

    from tensorboardX import SummaryWriter

    # create a summary writer with automatically generated folder name.
    writer = SummaryWriter()
    # folder location: runs/May04_22-14-54_s-MacBook-Pro.local/

    # create a summary writer using the specified folder name.
    writer = SummaryWriter("my_experiment")
    # folder location: my_experiment

    # create a summary writer with comment appended.
    writer = SummaryWriter(comment="LR_0.1_BATCH_16")
    # folder location: runs/May04_22-14-54_s-MacBook-Pro.localLR_0.1_BATCH_16/

Try it out with the complete example below.

After running the following code, a runs folder will appear in the current directory; copy its path:

from tensorboardX import SummaryWriter
import random

writer = SummaryWriter('./runs/examples')

for i in range(100):
    writer.add_scalar('example1', i**2, global_step=i)             # a smooth quadratic curve
    writer.add_scalar('example2', random.random(), global_step=i)  # a noisy random curve

writer.close()  # flush pending events to disk

Then type the following in the console:

tensorboard --logdir=YOUR_PATH
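
For the example above, YOUR_PATH is the directory passed to SummaryWriter (or any parent directory of it, since TensorBoard searches it recursively), so something like this should work:

tensorboard --logdir=./runs/examples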

If everything works, you will get a local address such as http://desktop-chh6vvf:6006/ or http://localhost:6006/; copy it into a browser to open the TensorBoard page.

There you can click open and inspect whichever data you are interested in.

3 Other features

Besides scalars, you can also use add_image(), add_histogram(), add_graph(), add_embedding(), etc. to view images, histograms, model graphs, embedding results and more in TensorBoard.
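
As a quick sketch of the first two (the './runs/extras' path and the random tensors are made-up placeholders, just for illustration), logging an image and a histogram could look like this:

import torch
from tensorboardX import SummaryWriter

writer = SummaryWriter('./runs/extras')  # illustrative log directory

# add_image expects an image tensor in CHW layout (here 3 x 64 x 64) with float values in [0, 1]
writer.add_image('random_image', torch.rand(3, 64, 64), global_step=0)

# add_histogram records the distribution of a tensor, e.g. a layer's weights
writer.add_histogram('random_weights', torch.randn(1000), global_step=0)

writer.close()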

For example, with add_graph, running the code below lets you view the model's structure as a graph in TensorBoard:

import torch
import torch.nn as nn
from tensorboardX import SummaryWriter

class MultipleInput(nn.Module):
    def __init__(self):
        super(MultipleInput, self).__init__()
        self.Linear_1 = nn.Linear(3, 5)

    def forward(self, x, y):
        return self.Linear_1(x + y)

with SummaryWriter(comment='MultipleInput') as w:
    w.add_graph(MultipleInput(), (torch.zeros(1, 3), torch.zeros(1, 3)), True)

For more examples, see the official tensorboardX demo code: https://github.com/lanpa/tensorboardX/blob/master/examples/demo_graph.py

 

