Filter Grafting for Deep Neural Networks

Paper: https://arxiv.org/pdf/2001.05868.pdf
GitHub: https://github.com/fxmeng/filter-grafting
Abstract

The paper states: "While filter pruning removes these invalid filters for efficiency consideration, filter grafting re-activates them from an accuracy boosting perspective." In other words, replacing useless filters with useful ones can improve accuracy.
"To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks."
That is, beyond the grafting idea itself, the paper proposes an entropy-based criterion for evaluating filters and an adaptive weighting strategy for balancing filter replacement across multiple networks.

Introduction
Removing certain filters could accelerate the inference of DNNs without hurting much performance. This discovery inspires many works studying how to decide which filters are unimportant [13] and how to effectively remove the filters with a tolerable performance drop.
The paper raises two questions:

  1. It is unclear whether directly abandoning such filters and components is the best choice.
  2. Besides, given multiple networks, it is unclear whether one network can learn from the others.

Related Work

  • Filter Pruning.
    Purpose of pruning: filter pruning aims to remove the invalid filters to accelerate the inference of the network. Criteria used for pruning include the L1 norm criterion, Principal Component Analysis (PCA), and subspace clustering applied to feature maps.
  • Distillation and Mutual Learning
  • RePr
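    [Table: comparison of filter grafting with distillation, mutual learning, and RePr]
    Distillation, mutual learning, and RePr are close in spirit to the approach proposed in the paper; the differences are summarized in the table above.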

Filter Grafting

  • 1. Information Source for Grafting
    The paper holds that the invalid filters have a smaller l1 norm and little effect on the output. The experiments show that, after grafting, the invalid filters have a larger l1 norm and begin to have more effect on the DNN.
  • 2. Replacement among Internal Filters
  • 3. External Filters as Scions
    The paper proposes: "In response to the shortcomings of adding random noise and weights inside a single network, we select external filters from other networks as scions."

Criterions for Calculating Information of Filters and Layers

  • L1 norm

  • Entropy
    While the l1 norm criterion only concentrates on the absolute value of a filter's weights, we pay more attention to the variation of the weights.

  • Adaptive Weighting in Grafting
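
A minimal sketch of adaptive weighting between two networks may help here. It assumes that a grafted layer is a weighted average alpha * W_self + (1 - alpha) * W_other, and that alpha grows with the entropy gap between the two layers through an arctan. The a / pi scaling (which keeps alpha inside (0.5 - a/2, 0.5 + a/2)), the values a = 0.4 and c = 500, the 10-bin histogram entropy, and all function names are assumptions made for illustration; the paper and repository should be consulted for the exact form.

```python
import numpy as np


def weight_entropy(weights, n_bins=10):
    """Shannon entropy (in nats) of a histogram of the weight values."""
    w = np.asarray(weights, dtype=np.float64).ravel()
    counts, _ = np.histogram(w, bins=n_bins)  # equal-width bins over [min, max]
    p = counts / counts.sum()
    p = p[p > 0]  # empty bins contribute nothing
    return float(np.sum(p * np.log(1.0 / p)))


def grafting_coefficient(h_self, h_other, a=0.4, c=500.0):
    """Weight given to the network's own layer (hypothetical form).

    Equals 0.5 when both entropies match and moves toward
    0.5 + a/2 as the own layer carries more information.
    """
    return a / np.pi * np.arctan(c * (h_self - h_other)) + 0.5


def graft_layer(w_self, w_other, a=0.4, c=500.0):
    """Blend one layer of two networks; returns (grafted weights, alpha)."""
    alpha = grafting_coefficient(
        weight_entropy(w_self), weight_entropy(w_other), a, c
    )
    return alpha * w_self + (1.0 - alpha) * w_other, alpha


# Toy layers: w_rich spreads its weights over many bins,
# w_poor has almost all of its weights in a single bin.
rng = np.random.default_rng(0)
w_rich = rng.normal(size=(4, 2, 3, 3))
w_poor = np.zeros((4, 2, 3, 3))
w_poor[0, 0, 0, 0] = 1.0

grafted, alpha = graft_layer(w_rich, w_poor)
print("alpha for the information-rich layer:", alpha)
```

With equal entropies both networks contribute equally; under these assumptions a larger c makes the coefficient saturate faster as the entropy gap grows.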

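Returning to the two criteria listed above, the contrast between the L1 norm and entropy can be illustrated with a small NumPy sketch. It assumes the l1 norm of a filter is the sum of the absolute values of its weights, and it estimates entropy by binning the weights into equal-width bins and taking the Shannon entropy of the bin frequencies. The function names and the choice of 10 bins are illustrative assumptions, not taken from these notes.

```python
import numpy as np


def filter_l1_norms(weight):
    """l1 norm of every filter in a conv layer.

    weight has shape (out_channels, in_channels, k_h, k_w);
    weight[j] is the j-th filter.
    """
    w = np.asarray(weight, dtype=np.float64)
    return np.abs(w).reshape(w.shape[0], -1).sum(axis=1)


def weight_entropy(weights, n_bins=10):
    """Shannon entropy (in nats) of a histogram of the weight values."""
    w = np.asarray(weights, dtype=np.float64).ravel()
    counts, _ = np.histogram(w, bins=n_bins)  # equal-width bins over [min, max]
    p = counts / counts.sum()
    p = p[p > 0]  # empty bins contribute nothing
    return float(np.sum(p * np.log(1.0 / p)))


# Two hypothetical 1x3x3 filters stacked into one layer:
# filter 0 is constant (large weights, no variation),
# filter 1 has smaller weights spread over a range.
constant_filter = np.full((1, 3, 3), 5.0)
varied_filter = np.linspace(-1.0, 1.0, 9).reshape(1, 3, 3)
layer = np.stack([constant_filter, varied_filter])

norms = filter_l1_norms(layer)
entropies = [weight_entropy(f) for f in layer]
print("l1 norms: ", norms)
print("entropies:", entropies)
```

In this toy case the constant filter has the larger l1 norm but zero entropy, while the varied filter has the smaller norm and the higher entropy, which is the kind of difference the entropy criterion is meant to capture.
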
Experiment

  • Selecting Useful Information Source
    • Comparison of L1 norm & Entropy Criterions
    • Evaluation of Training Diversity in Grafting
    • Comparing Grafting with Other Methods
  • Grafting with Multiple Networks

Conclusion and Discussion
The authors highlight two key points of the grafting algorithm:

  • How to choose a proper criterion to calculate the inherent information of filters in DNNs.
  • How to balance the coefficients of information among networks.