ALGO Talk - Multiobjective optimization methods for neural network training

Talk by Kathrin Klamroth, University of Wuppertal, 13 October 2021

Abstract:

The training of neural networks is a topical example of a large-scale optimization problem for which the specification of “the one” optimization goal remains a major challenge. Indeed, classical error functions such as the mean squared error or the cross entropy often lead to overfitting and/or vulnerability to adversarial attacks and data errors. This can be partially explained by the fact that error functions typically ignore relevant optimization criteria such as the robustness of the neural network or its complexity. Network complexity in particular is a pivotal criterion, e.g., in applications with limited hardware resources such as autonomous driving. Moreover, reducing the network complexity has a regularizing effect and has been shown to reduce overfitting in practice. In this talk, we consider the simultaneous optimization of an error function and a regularizing cost function in a truly biobjective model. We discuss several algorithmic strategies, including a multiobjective stochastic gradient descent algorithm and a bisection-enhanced dichotomic search (BEDS) approach that aims at identifying potential knees of the Pareto front. The methods are combined with a pruning strategy that is fully integrated into the training process and requires only marginal extra computational cost. This provides a new perspective on automated machine learning, helps to reduce the time-consuming determination of hyperparameters, and thus paves the way for an adaptive decision support tool for preferable network architectures.
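
To make the biobjective idea concrete, the following is a minimal sketch (not the speaker's method) of one multiobjective stochastic gradient step on a toy logistic-regression model: the error objective is the cross entropy, the complexity objective is an L1 sparsity surrogate, and the two gradients are combined via the standard closed-form two-objective common descent direction (Désidéri-style minimum-norm weighting). All names, the toy data, and the choice of L1 as the complexity measure are illustrative assumptions.

# Sketch under the assumptions stated above; illustrative only.
import numpy as np

def biobjective_sgd_step(w, X, y, lr=0.1, eps=1e-12):
    # Objective 1: cross entropy of a logistic model.
    p = 1.0 / (1.0 + np.exp(-X @ w))
    g_err = X.T @ (p - y) / len(y)            # gradient of the error term
    # Objective 2: L1 norm as a simple complexity / sparsity surrogate.
    g_reg = np.sign(w)                        # (sub)gradient of ||w||_1
    # Closed-form alpha in [0, 1] minimizing ||alpha*g_err + (1-alpha)*g_reg||,
    # i.e. the minimum-norm convex combination of the two gradients.
    diff = g_err - g_reg
    alpha = np.clip((g_reg - g_err) @ g_reg / (diff @ diff + eps), 0.0, 1.0)
    d = alpha * g_err + (1.0 - alpha) * g_reg
    return w - lr * d                         # common descent step

# Toy usage: synthetic data, a few descent iterations.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))
y = (X[:, 0] > 0).astype(float)
w = 0.1 * rng.normal(size=10)
for _ in range(100):
    w = biobjective_sgd_step(w, X, y)

Repeating such steps traces out trade-offs between error and complexity; how the talk's method selects and prunes along the Pareto front (e.g., via BEDS and integrated pruning) goes beyond this sketch.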