Research line leader: Henk Corporaal (TUe)

R4 challenges are:

  1. How can we design and develop optimized, large-scale, energy-efficient and flexible processing components for the huge design space of DNNs?
  2. How can we achieve real-time inference, given the limited power budget of mobile applications?
  3. How can we perform efficient DSE, taking the trade-offs between energy, area, quality and flexibility into account?
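The multi-objective trade-off in challenge 3 can be sketched as a Pareto-front filter over candidate design points. The sketch below is illustrative only: the metric names (energy, area, quality) and the example design points are hypothetical, not taken from an actual R4 toolflow.

```python
# Minimal multi-objective DSE sketch: keep only Pareto-optimal design points.
# A design point is a dict of metrics; energy and area are minimized,
# inference quality is maximized. All numbers are hypothetical.

def dominates(a, b):
    """True if design a is at least as good as b on every metric and
    strictly better on at least one (energy/area: lower is better,
    quality: higher is better)."""
    at_least = (a["energy"] <= b["energy"] and a["area"] <= b["area"]
                and a["quality"] >= b["quality"])
    strictly = (a["energy"] < b["energy"] or a["area"] < b["area"]
                or a["quality"] > b["quality"])
    return at_least and strictly

def pareto_front(points):
    """Return the design points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

designs = [
    {"name": "A", "energy": 1.0, "area": 2.0, "quality": 0.90},
    {"name": "B", "energy": 0.5, "area": 3.0, "quality": 0.88},
    {"name": "C", "energy": 1.2, "area": 2.5, "quality": 0.85},  # dominated by A
]

front = pareto_front(designs)
print([p["name"] for p in front])  # A and B survive; C is dominated by A
```

Real DSE flows evaluate far larger spaces with analytical or learned cost models, but the dominance test at the core is the same.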

Performance and low power dissipation are critical in mobile, embedded, real-time and large-scale applications. Modern processing platforms typically contain multiple levels of compute elements, each requiring different parallelization strategies. Moreover, minimizing data transports has become essential for performance and energy efficiency: an external DRAM access, for example, costs O(10^3) times more energy than an addition operation. Better mapping of DL algorithms and networks onto hardware platforms is therefore key to increasing efficiency. This involves optimizing the hardware itself (e.g. using low-power modes and approximate computing units), the compilers (including efficient graph compilation), the runtime systems and the DL libraries, as well as improving the exploitation of data and processing locality (e.g. by better layer fusion and partial recomputation).

In R4, we develop computationally efficient techniques for DL, including solutions with specialized neuromorphic hardware and FPGAs. We research new techniques for mapping deep inference and learning onto such hardware, which is typically restricted in topology, memory and precision, and we compare the resulting performance with commodity architectures such as GPUs.

These techniques must be investigated in the context of architecturally efficient DNNs, as researched in R5. Such DNNs will be sparse, irregular and highly quantized. This, together with the many design parameters of DNNs, requires very flexible processing platforms; otherwise we will observe huge discrepancies between peak and actual performance and efficiency. The huge design space requires research into high-level DSE (Design Space Exploration), using adequate models for energy, area and inference quality.
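A back-of-the-envelope energy model makes the locality argument concrete: if a DRAM word access costs on the order of 10^3 times the energy of an arithmetic operation, fusing two layers so that the intermediate tensor stays on-chip removes one DRAM write and one DRAM read per element. The per-operation energies below are assumed order-of-magnitude values for illustration, not measurements of any specific platform.

```python
# Back-of-the-envelope energy model for fusing two element-wise layers.
# The per-operation costs are illustrative assumptions only
# (DRAM access ~O(10^3) x the energy of an addition).

E_OP = 1.0        # pJ per arithmetic operation (assumed)
E_DRAM = 1000.0   # pJ per DRAM word access (assumed, ~1000x E_OP)

def energy_unfused(n_elems, ops_per_elem=1):
    """Layer 1 writes its output tensor to DRAM; layer 2 reads it back."""
    compute = 2 * n_elems * ops_per_elem * E_OP
    traffic = 2 * n_elems * E_DRAM          # one write + one read per element
    return compute + traffic

def energy_fused(n_elems, ops_per_elem=1):
    """The intermediate tensor stays in on-chip buffers (buffer cost neglected)."""
    return 2 * n_elems * ops_per_elem * E_OP

n = 1_000_000  # elements in the intermediate feature map
print(f"unfused: {energy_unfused(n)/1e6:.0f} uJ, fused: {energy_fused(n)/1e6:.0f} uJ")
```

Under these assumptions the unfused schedule spends roughly a thousand times more energy, almost all of it on DRAM traffic, which is exactly why better layer fusion and locality exploitation are central to R4.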