The University of Southampton

Project: Achieving Progressive Intelligence for Low Power Systems 

Key information:

Student: Christopher Subia-Waud
Academic Supervisor:
Cohort: 1
Pure Link: Active Project


It is the movement of data, not arithmetic operations, that dominates the energy cost of deep learning inference. In this work, we focus on reducing these data-movement costs by reducing the number of unique weights in a network. The thinking goes that if the number of unique weights is kept small enough, the entire network can be distributed and stored on the processing elements (PEs) within accelerators, substantially reducing the data-movement cost of weight reads.

To this end, we investigate the merits of a method we call Weight Fixing Networks (WFN). We design the approach to realise four model outcome objectives: i) very few unique weights, ii) low-entropy weight encodings, iii) unique weight values amenable to energy-saving versions of hardware multiplication, and iv) lossless task performance.
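Objectives (i) and (ii) are directly measurable. As a minimal sketch (the function name and toy weights are illustrative, not from the project), the two quantities can be computed from a flattened weight tensor as follows:

```python
import numpy as np

def weight_stats(weights):
    """Count unique weight values (objective i) and compute the entropy,
    in bits, of the empirical weight distribution (objective ii).
    `weights` is a flat array of all network weights."""
    values, counts = np.unique(weights, return_counts=True)
    probs = counts / counts.sum()
    entropy = -np.sum(probs * np.log2(probs))
    return len(values), entropy

# Toy "network" whose weights take only 4 distinct values.
w = np.array([0.5, -0.5, 0.5, 0.25, 0.0, 0.5, -0.5, 0.0])
n_unique, h = weight_stats(w)
# n_unique is 4; the entropy is bounded above by log2(4) = 2 bits
```

A low entropy means the fixed weight values can be encoded cheaply, which is what makes storing the whole codebook on-accelerator attractive.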

Some of these goals conflict. To balance them, we combine several novel (and some well-trodden) techniques: a novel regularisation term (i, ii), a view of clustering cost as relative distance change (i, ii, iv), and a focus on whole-network re-use of weights (i, iii). The method is applied iteratively, and we achieve state-of-the-art (SOTA) results across the relevant metrics. Our ImageNet experiments demonstrate lossless compression with 50x fewer unique weights and half the weight-space entropy of SOTA quantisation approaches.
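To make the "clustering cost as relative distance change" idea concrete, here is an illustrative sketch (not the project's exact algorithm; the threshold and codebook values are assumptions): each weight is snapped to the nearest value in a small shared codebook only if the relative change it suffers stays below a tolerance, and weights that would move too far are left free for a later iteration.

```python
import numpy as np

def fix_weights(weights, centres, rel_tol=0.1):
    """Snap each weight to its nearest codebook centre, but only when
    the relative distance change |w - c| / |w| is at most rel_tol.
    Weights that cannot be moved cheaply are left unfixed, mirroring
    the iterative application of the method."""
    fixed = weights.copy()
    for i, w in enumerate(weights):
        c = centres[np.argmin(np.abs(centres - w))]  # nearest codebook value
        if abs(w) > 0 and abs(w - c) / abs(w) <= rel_tol:
            fixed[i] = c
    return fixed

w = np.array([0.52, -0.49, 0.26, 0.9])
centres = np.array([-0.5, 0.0, 0.25, 0.5])
fixed = fix_weights(w, centres, rel_tol=0.1)
# 0.52, -0.49 and 0.26 snap to 0.5, -0.5 and 0.25;
# 0.9 is relatively far from every centre and stays free
```

In practice the free weights would be retrained before the next fixing round, so the codebook gradually absorbs the whole network.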