Learning Activation Functions in Deep (Spline) Neural Networks

P. Bohra, J. Campos, H. Gupta, S. Aziznejad, M. Unser

IEEE Open Journal of Signal Processing, vol. 1, pp. 295-309, November 19, 2020.



We develop an efficient computational solution to train deep neural networks (DNN) with free-form activation functions. To make the problem well-posed, we augment the cost functional of the DNN by adding an appropriate shape regularization: the sum of the second-order total-variations of the trainable nonlinearities. The representer theorem for DNNs tells us that the optimal activation functions are adaptive piecewise-linear splines, which allows us to recast the problem as a parametric optimization. The challenging point is that the corresponding basis functions (ReLUs) are poorly conditioned and that the determination of their number and positioning is also part of the problem. We circumvent the difficulty by using an equivalent B-spline basis to encode the activation functions and by expressing the regularization as an ℓ1-penalty. This results in the specification of parametric activation function modules that can be implemented and optimized efficiently on standard development platforms. We present experimental results that demonstrate the benefit of our approach.
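As a rough illustration of the parametrization described in the abstract, the sketch below encodes a piecewise-linear activation by its values at uniformly spaced knots (the coefficients of a linear B-spline expansion) and computes its second-order total variation as an ℓ1 norm of second-order finite differences of those coefficients. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the uniform grid centered at zero, the linear extrapolation outside the grid, and the names `linear_spline_activation` and `tv2_penalty` are all hypothetical choices made for illustration.

```python
import numpy as np


def linear_spline_activation(x, coeffs, step):
    """Evaluate a piecewise-linear spline at x.

    Assumption (illustrative): the knots lie on a uniform grid of spacing
    `step`, centered at 0, and coeffs[k] is the spline value at knot k.
    Outside the grid, the boundary segments are extended linearly.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    x = np.asarray(x, dtype=float)
    n = coeffs.size
    # Position of x in grid units, shifted so that knot 0 sits at t = 0.
    t = x / step + (n - 1) / 2
    # Index of the left knot, clipped so out-of-range inputs reuse the
    # first or last segment (linear extrapolation).
    k = np.clip(np.floor(t).astype(int), 0, n - 2)
    frac = t - k
    # Linear B-splines reduce to linear interpolation between neighbors.
    return coeffs[k] * (1.0 - frac) + coeffs[k + 1] * frac


def tv2_penalty(coeffs, step):
    """Second-order total variation of the spline.

    The second derivative of a piecewise-linear spline is a sum of Dirac
    impulses at the knots, weighted by the slope changes, so the penalty
    is the l1 norm of the second-order differences divided by the step.
    """
    second_diff = np.diff(np.asarray(coeffs, dtype=float), n=2)
    return np.abs(second_diff).sum() / step


# Usage: a ReLU sampled on five knots at -2, -1, 0, 1, 2.
knots = np.arange(-2, 3) * 1.0
relu_coeffs = np.maximum(knots, 0.0)
values = linear_spline_activation([-3.0, -0.5, 0.5, 3.0], relu_coeffs, 1.0)
penalty = tv2_penalty(relu_coeffs, 1.0)
```

In this encoding, a ReLU has a single slope change of 1 and therefore a penalty of 1, while any purely linear activation has a penalty of 0. Penalizing the ℓ1 norm of the slope changes favors activations with few active knots.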


@ARTICLE(http://bigwww.epfl.ch/publications/bohra2003.html,
AUTHOR="Bohra, P. and Campos, J. and Gupta, H. and Aziznejad, S. and
        Unser, M.",
TITLE="Learning Activation Functions in Deep (Spline) Neural Networks",
JOURNAL="{IEEE} Open Journal of Signal Processing",
YEAR="2020",
volume="1",
number="",
pages="295--309",
month="November 19,",
note="")

© 2020 The Authors. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from The Authors.