Take the sensitivity(std(weight) * s) as threshold and prune the filters with small L1-norm.
Test on cifar10_resnet56_v1 model(top1-acc=93.6%).
- Increasing
s, without regularization
Prune atstep=[0, 1200, 2400, 3600, 4800, 6000, 7200]withs=[.4, .45, .5, .55, .6, .65, .7].

(top1-acc=86.32%, pruned_MAC=45.47%, pruned_params=46.63%) - Fixed
s, with regularization
Prune atstep=[0, 1200, 2400, 3600, 4800, 6000, 7200]with fixeds=.4.
But introduce group-lasso(regularization = (weight ** 2).sum(axis=(1, 2, 3)).sqrt()) to loss function.
(top1-acc=88.54%, pruned_MAC=41.26%, pruned_params=48.24%)