R5: Efficient DL Architectures

Research line leader: Sander Bohte (CWI)

R5 challenges are:

How to optimize DNNs at the architecture level, enabling efficient mappings to low-power neural architectures and computational paradigms, without loss of quality?
How to efficiently integrate asynchronous deep learning computation into large-scale I/O settings?
How to dynamically optimize and allocate neural computation resources such as coding precision and responsiveness?

To what degree can we improve the energy and computational efficiency of deep neural network architectures themselves, using the building blocks of R4? Techniques include network compression, fusion of layers, breadth/depth tradeoffs, and low-rank decomposition.
Another source of inspiration is our brain, which is ruthlessly efficient compared to current DL implementations. Several architectural and computational principles contribute: 1) neural resources are selectively focused, e.g., inspecting only part of a large image with great precision; 2) communication between neurons is sparse and of limited precision (binary spikes, average firing rates of 1-3 Hz, stochastic synaptic weights); 3) computation is asynchronous and event-driven: neural processing is triggered by the presentation of input, which greatly simplifies processing large numbers of sensors and actuators; 4) connectivity is continuously optimized as connections are spawned and pruned. Together, these paradigms potentially offer huge efficiency gains and are the subject of active research. For example, very large deep networks of spiking neurons (SNNs) have demonstrated state-of-the-art performance while firing sparsely and, on dedicated hardware, staying within low power envelopes.
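Of the conventional techniques listed above, low-rank decomposition is perhaps the easiest to illustrate. The following is a minimal sketch, not an R5 deliverable: it assumes a single dense layer whose weight matrix is close to low rank, and it uses a truncated SVD from NumPy to replace the layer with two thinner ones. The layer sizes, the chosen rank, and the function name `low_rank_factorize` are hypothetical and picked only for illustration.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Approximate `weight` (out x in) by factors a (out x rank) and b (rank x in)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the kept singular values into the left factor
    b = vt[:rank, :]
    return a, b


rng = np.random.default_rng(0)

# Hypothetical 256 x 512 layer that is nearly rank 16, plus a little noise.
weight = rng.standard_normal((256, 16)) @ rng.standard_normal((16, 512))
weight += 0.01 * rng.standard_normal(weight.shape)

a, b = low_rank_factorize(weight, rank=16)

# Compare the dense layer with the two-step factored layer on one input.
x = rng.standard_normal(512)
dense_out = weight @ x
factored_out = a @ (b @ x)

params_dense = weight.size          # 256 * 512
params_factored = a.size + b.size   # 256 * 16 + 16 * 512
rel_error = np.linalg.norm(dense_out - factored_out) / np.linalg.norm(dense_out)

print(f"parameters: {params_dense} -> {params_factored}")
print(f"relative output error: {rel_error:.4f}")
```

In this constructed case the factored layer needs roughly a tenth of the parameters and multiply-accumulates. How much a trained network can be compressed this way depends on how quickly its singular values decay, which the sketch does not address.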
R5 develops the techniques needed to bring these paradigms to efficient real-world applications, while exploiting conventional optimizations and taking into account the structure of the compute platform. Efficient large-scale asynchronous processing requires integration with sensory processing packages. Additionally, neural resource allocation techniques for sparse and spiking DL need to be developed to dynamically trade off energy against precision.
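To make the energy/precision trade-off concrete, the sketch below quantizes a batch of activations at several bit widths and reports the resulting error. It rests on stated assumptions rather than on any R5 method: activations are taken to lie in [-1, 1], a plain uniform quantizer stands in for "coding precision", and the bit width serves only as a rough proxy for communication or energy cost. The function name `quantize` and the sample data are hypothetical.

```python
import numpy as np


def quantize(values, bits):
    """Uniformly quantize values in [-1, 1] onto 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1  # number of intervals between the 2**bits levels
    clipped = np.clip(values, -1.0, 1.0)
    # Map to [0, steps], round to the nearest level, map back to [-1, 1].
    return np.round((clipped + 1.0) / 2.0 * steps) / steps * 2.0 - 1.0


rng = np.random.default_rng(0)
activations = rng.uniform(-1.0, 1.0, size=10_000)  # hypothetical activations

errors = {}
for bits in (2, 4, 8):
    approx = quantize(activations, bits)
    errors[bits] = float(np.mean(np.abs(approx - activations)))
    print(f"{bits} bits: mean absolute error {errors[bits]:.5f}")
```

A dynamic allocation scheme would choose the bit width per layer, per input, or per time step instead of fixing it in advance. The sketch only shows the cost side of that choice: fewer bits mean a larger approximation error.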