Filter Grafting for Deep Neural Networks
Paper: https://arxiv.org/pdf/2001.05868.pdf
GitHub: https://github.com/fxmeng/filter-grafting
Abstract
The paper proposes: "While filter pruning removes these invalid filters for efficiency consideration, filter grafting re-activates them from an accuracy boosting perspective." In other words, replacing useless filters with useful ones can improve accuracy.
"To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks."
In addition, the paper proposes an entropy-based criterion for evaluating filters and an adaptive weighting strategy for balancing the replacement of filters across multiple networks.
Introduction
Removing certain filters could accelerate the inference of DNNs without hurting much performance. This discovery inspires many works studying how to decide which filters are unimportant [13] and how to effectively remove the filters with tolerable performance drop.
The paper raises two questions:
- It is unclear whether directly abandoning such filters and components is the best choice.
- Besides, given multiple networks, it is unclear whether one network can learn from the others.
Related Work
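- Filter Pruning
The purpose of pruning: "Filter pruning aims to remove the invalid filters to accelerate the inference of the network." Criteria used to evaluate filters for pruning include the L1 norm criterion, Principal Component Analysis (PCA), and subspace clustering applied to feature maps.
- Distillation and Mutual Learning
- RePr
Distillation, Mutual Learning and RePr follow ideas similar to the one proposed in the paper; the differences are summarized in the paper's comparison table.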
Filter Grafting
- 1. Information Source for Grafting
The paper's view is that "the invalid filters have smaller l1 norm and have little effects for the output", whereas the experiments show: "But after grafting, the invalid filters have larger l1 norm and begin to make more effects to DNNs".
- 2. Replacement among Internal Filters
- 3. External Filters as Scions
The paper proposes: "In response to the shortcomings of adding random noise and weights inside a single network, we select external filters from other networks as scions."
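The l1-norm view of invalid filters and the replacement among internal filters (item 2) can be illustrated with a small sketch. This is not the authors' code. It assumes each filter is flattened into a plain list of weights and that the weakest filters are simply overwritten with copies of the strongest ones; `graft_internal` and `num_invalid` are hypothetical names, with `num_invalid` standing in for whatever threshold decides which filters count as invalid.

```python
def l1_norm(filter_weights):
    """Sum of absolute weights of one (flattened) filter."""
    return sum(abs(w) for w in filter_weights)


def graft_internal(layer, num_invalid):
    """Overwrite the `num_invalid` weakest filters with copies of the strongest.

    `layer` is a list of filters. `num_invalid` is an assumed knob and should
    not exceed half the number of filters, so sources and targets never overlap.
    """
    # Filter indices sorted from smallest to largest l1 norm.
    order = sorted(range(len(layer)), key=lambda i: l1_norm(layer[i]))
    weakest = order[:num_invalid]
    strongest = order[::-1][:num_invalid]
    grafted = [list(f) for f in layer]  # copy so the input is left untouched
    for dst, src in zip(weakest, strongest):
        grafted[dst] = list(layer[src])
    return grafted


layer = [
    [0.9, -0.8, 0.7],    # large l1 norm
    [0.01, -0.02, 0.0],  # tiny l1 norm, treated as invalid
    [0.5, 0.4, -0.3],    # medium l1 norm
    [0.0, 0.03, -0.01],  # tiny l1 norm, treated as invalid
]
norms = [l1_norm(f) for f in layer]
grafted = graft_internal(layer, num_invalid=2)
```

After the call, the two near-zero filters carry the weights of the two strongest filters, so their l1 norm is no longer small. Because the copies come from inside the same network, no new information enters it, which is the shortcoming that motivates taking scions from external networks.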
Criterions for Calculating Information of Filters and Layers
- L1 norm
- Entropy
While l1 norm criterion only concentrates on the absolute value of filter's weight, we pay more attention to the variation of the weight.
- Adaptive Weighting in Grafting
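The entropy criterion and the adaptive weighting listed above can be sketched together. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes that a layer's weights are binned into equal-width bins and scored with Shannon entropy, that two networks are blended layer by layer as alpha * W_self + (1 - alpha) * W_other, and that alpha takes the arctan-based form A * arctan(c * (H_self - H_other)) / pi + 0.5, which should be checked against the paper. The function names, the bin count and the values of `A` and `c` are placeholders chosen for the example.

```python
import math


def entropy(weights, num_bins=10):
    """Shannon entropy of a weight list after equal-width binning."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return 0.0  # all weights identical: one occupied bin, zero entropy
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for w in weights:
        # Clamp so the maximum value falls into the last bin.
        k = min(int((w - lo) / width), num_bins - 1)
        counts[k] += 1
    total = len(weights)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def adaptive_alpha(h_self, h_other, A=0.4, c=500.0):
    """Assumed form: alpha = A * arctan(c * (H_self - H_other)) / pi + 0.5."""
    return A * math.atan(c * (h_self - h_other)) / math.pi + 0.5


def graft_layer(w_self, w_other, A=0.4, c=500.0):
    """Blend two layers; the layer with more entropy gets the larger weight."""
    alpha = adaptive_alpha(entropy(w_self), entropy(w_other), A, c)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(w_self, w_other)]


w_rich = [-0.4, -0.2, 0.0, 0.2, 0.4]    # weights spread over many bins
w_flat = [0.01, 0.01, 0.01, 0.01, 0.5]  # weights concentrated in one bin
alpha = adaptive_alpha(entropy(w_rich), entropy(w_flat))
blended = graft_layer(w_rich, w_flat)
```

With these placeholder values alpha stays strictly between 0.5 - A/2 and 0.5 + A/2, so neither network's layer is ever discarded entirely. When both layers have equal entropy, alpha is exactly 0.5 and the blend is a plain average.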
Experiment
- Selecting Useful Information Source
- Comparison of L1 norm & Entropy Criterions
- Evaluation of Training Diversity in Grafting
- Comparing Grafting with Other Methods
- Grafting with Multiple Networks
Conclusion and Discussion
The authors identify two key points in the grafting algorithm:
- How to choose a proper criterion to calculate the inherent information of filters in DNNs.
- How to balance the coefficients of information among networks.