Filter Grafting for Deep Neural Networks

CV/NLP大虾 · Published 2020-04-07 10:22:49

Paper: https://arxiv.org/pdf/2001.05868.pdf
GitHub: https://github.com/fxmeng/filter-grafting
Abstract

The paper states: "While filter pruning removes these invalid filters for efficiency consideration, filter grafting re-activates them from an accuracy boosting perspective." In other words, replacing useless filters with useful ones can improve accuracy.
"To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks."
That is, the paper also proposes an entropy-based criterion for evaluating filters and an adaptive weighting strategy to balance the filter replacement among multiple networks.

Introduction
"Removing certain filters could accelerate the inference of DNNs without hurting much performance. This discovery inspires many works studying how to decide which filters are unimportant [13] and how to effectively remove the filters with tolerable performance drop."
The paper raises two questions:

It is unclear whether directly abandoning such filters and components is the best choice.
Besides, given multiple networks, it is unclear whether one network can learn from the others.

Related Work

Filter Pruning.
Purpose of pruning: "Filter pruning aims to remove the invalid filters to accelerate the inference of the network." Criteria used to evaluate filters for pruning include the L1 norm criterion, Principal Component Analysis (PCA), and subspace clustering applied to feature maps.
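To make the L1 norm criterion concrete, here is a minimal sketch that ranks the filters of one layer by the sum of the absolute values of their weights. The function names and the toy weights are illustrative assumptions, not code from the paper or its repository; each filter is represented as a flat list of floats.

```python
def l1_norm(filter_weights):
    """Sum of absolute values of one filter's (flattened) weights."""
    return sum(abs(w) for w in filter_weights)


def rank_filters_by_l1(layer):
    """Return filter indices ordered from smallest to largest L1 norm."""
    return sorted(range(len(layer)), key=lambda i: l1_norm(layer[i]))


# Toy layer with three filters (hypothetical values).
layer = [
    [0.5, -0.4, 0.3],    # filter 0: L1 norm 1.2
    [0.01, -0.02, 0.0],  # filter 1: L1 norm 0.03, a candidate "invalid" filter
    [-0.2, 0.1, 0.05],   # filter 2: L1 norm 0.35
]

order = rank_filters_by_l1(layer)
print(order)  # [1, 2, 0]
```

Under this criterion a pruning method would drop the filters at the front of the ranking, whereas grafting tries to re-activate them.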
Distillation and Mutual Learning
RePr

Of these, Distillation and Mutual Learning and RePr are close in spirit to the approach proposed in the paper; the differences are shown in the table above.

Filter Grafting

1. Information Source for Grafting
The paper argues that "the invalid filters have smaller l1 norm and have little effects for the output". However, the experiments show: "But after grafting, the invalid filters have larger l1 norm and begin to make more effects to DNNs".
2. Replacement among Internal Filters
3. External Filters as Scions
The paper proposes: "In response to the shortcomings of adding random noise and weights inside a single network, we select external filters from other networks as scions."

Criterions for Calculating Information of Filters and Layers

L1 norm
Entropy
"While l1 norm criterion only concentrates on the absolute value of filter’s weight, we pay more attention to the variation of the weight."
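One way to measure the "variation of the weight" is to bin the weight values into a histogram and compute the Shannon entropy of the bin frequencies. The sketch below does this for a flat list of weights. The function name `weight_entropy` and the default of 10 bins are assumptions made for illustration and should be checked against the paper and repository before being relied on.

```python
import math


def weight_entropy(weights, n_bins=10):
    """Shannon entropy (natural log) of a histogram over the weight values."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights identical: a single occupied bin, so no variation.
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for w in weights:
        # Clamp so that the maximum value falls into the last bin.
        k = min(int((w - lo) / width), n_bins - 1)
        counts[k] += 1
    total = len(weights)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


flat = [0.5] * 8                    # large L1 norm (4.0) but no variation
spread = [i / 8 for i in range(8)]  # smaller L1 norm (3.5) but varied values

print(weight_entropy(flat))    # 0.0
print(weight_entropy(spread))  # log(8), about 2.079
```

The example shows why the two criteria can disagree: `flat` has the larger L1 norm but zero entropy, while `spread` has a smaller L1 norm and higher entropy.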
Adaptive Weighting in Grafting
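The idea of adaptive weighting can be sketched as a weighted sum of the corresponding layers of two networks, where the network whose layer carries more information (higher entropy) gets the larger coefficient. The arctan-based form below, the names `grafting_coefficient` and `graft_layer`, and the values of `A` and `c` are all assumptions made for this sketch; the exact formula and hyperparameters should be taken from the paper and repository.

```python
import math


def grafting_coefficient(h_self, h_other, A=0.4, c=500.0):
    """Map an entropy difference to a coefficient in (0.5 - A/2, 0.5 + A/2).

    Assumed form: alpha = 0.5 + (A / pi) * arctan(c * (h_self - h_other)).
    Equal entropies give alpha = 0.5; A and c are hypothetical defaults.
    """
    return 0.5 + (A / math.pi) * math.atan(c * (h_self - h_other))


def graft_layer(w_self, w_other, alpha):
    """Weighted sum of two layers' flattened weights: alpha*self + (1-alpha)*other."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(w_self, w_other)]


# Equal information: both layers contribute equally.
alpha_equal = grafting_coefficient(1.0, 1.0)
print(alpha_equal)                                        # 0.5
print(graft_layer([1.0, 2.0], [3.0, 4.0], alpha_equal))  # [2.0, 3.0]

# The layer with higher entropy keeps more of its own weights.
alpha_more = grafting_coefficient(2.0, 1.0)
print(0.5 < alpha_more < 0.7)                             # True
```

Bounding the coefficient away from 0 and 1 means neither network's layer is ever discarded entirely, which is what "balancing the grafted information among networks" suggests.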

Experiment

Selecting Useful Information Source
- Comparison of L1 norm & Entropy Criterions
- Evaluation of Training Diversity in Grafting
- Comparing Grafting with Other Methods
Grafting with Multiple Networks