Tim Bakker

University of Amsterdam
Institute of Informatics, AM Lab

EDL P16-25 P4: Deep Learning for High-Tech Systems and Materials

Research assignment:
Active learning for data-efficient deep learning

In real-life processes, acquiring properly labeled data for supervised learning is difficult and expensive. Data collection frequently requires human experts to inspect or propose annotations, and often results in datasets containing imbalanced clusters.
Active learning is a machine learning method in the subfield of semi-supervised learning that aims to tackle this problem. At its core, active learning uses a small, correctly labeled dataset to suggest which unlabeled data points to annotate next. These suggestions help human experts focus their labeling efforts on the data points that matter most for further model improvement.
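As an illustration of the selection step described above, the following is a minimal sketch of entropy-based uncertainty sampling, one common active learning strategy and not necessarily the one this project will use. It assumes a model trained on the small labeled dataset has already produced class probabilities for the unlabeled pool; the function name select_queries and the values in pool_probs are hypothetical and chosen only for this example.

```python
import numpy as np


def select_queries(probs, k):
    """Return the indices of the k pool points with the highest predictive entropy."""
    probs = np.asarray(probs, dtype=float)
    # Shannon entropy per row; the small constant avoids log(0).
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # Sort by entropy in descending order and keep the k most uncertain points.
    return np.argsort(entropy)[::-1][:k]


# Hypothetical class probabilities for four unlabeled points and three classes,
# as they might come from a model trained on the small labeled dataset.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, so most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

# Indices of the two points that would be proposed to the human expert.
queries = select_queries(pool_probs, k=2)
print(queries)
```

In a full loop, the expert would label the proposed points, the points would move from the pool to the labeled dataset, and the model would be retrained before the next round of selection.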
Variants of active learning have applications in a wide range of high-tech systems, such as MR imaging and our primary use case: TATA Steel’s steel production pipeline, where correctly classifying observed defects in the produced steel plates is essential.
Our goal is to analyse existing active learning methods and develop novel ones for tackling these and other problems, with a focus on dealing with imbalanced clusters.


Personal information: