Pietro GORI

Télécom ParisTech

Tutorial 4.B

Self-supervised learning in computer vision and medical imaging

(July 6, 8:30 AM - 12:00 PM)


Many tasks in Computer Vision and Medical Imaging, such as object detection, image classification, or semantic segmentation, have reached astonishing results in recent years. This has been possible mainly because large (N > 10^6) labeled datasets were available. When dealing with small labeled datasets, a common strategy consists in pre-training a model on a large dataset and then transferring it to the small target dataset. This is commonly called Transfer Learning. Supervised pre-training, namely pre-training on a large labeled dataset such as ImageNet, is the de facto standard technique. However, recent studies have shown that its usefulness, namely feature reuse, is important only when there is a high visual similarity between the pre-training and target domains, namely a small domain gap. This might not be the case in many applications, in particular when using 3D data or in Medical Imaging. To reduce the domain gap, several self-supervised pre-training strategies have recently emerged. They leverage annotation-free pretext tasks to provide surrogate supervision signals for feature learning.
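To make the idea of an annotation-free pretext task concrete, here is a minimal NumPy sketch of one classic example, rotation prediction (in the style of Gidaris et al.): each image is rotated by a random multiple of 90 degrees, and the rotation index serves as a free 4-class label. The function name and interface are illustrative, not the tutorial's actual code.

```python
import numpy as np

def make_rotation_batch(images, rng):
    """Rotation-prediction pretext task: rotate each image by
    0/90/180/270 degrees and use the rotation index as a free
    4-class label -- no human annotation is needed."""
    xs, ys = [], []
    for img in images:
        k = int(rng.integers(4))          # rotation index in {0, 1, 2, 3}
        xs.append(np.rot90(img, k=k))     # rotate by k * 90 degrees
        ys.append(k)
    return np.stack(xs), np.array(ys)
```

A network trained to predict these labels must learn about object shape and orientation, which produces features that transfer to downstream tasks.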
In the first part of this tutorial, you will learn the most important and widely used self-supervised strategies for computer vision and medical imaging. In particular, we will study contrastive learning in depth using a geometric approach. In the second part, you will test the methods on both toy examples and real data using PyTorch.
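As a preview of the contrastive-learning part, the sketch below shows the InfoNCE (NT-Xent) loss used by methods such as SimCLR, written in plain NumPy so it is self-contained. It assumes two L2-normalised embedding matrices of the same batch under two augmentations, where matching rows are positive pairs; the function name and temperature value are illustrative, not the tutorial's actual code.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE / NT-Xent contrastive loss.

    z1, z2: (N, d) L2-normalised embeddings of two augmented views of
    the same N images; row i of z1 and row i of z2 form a positive pair,
    all other rows in the batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)      # (2N, d)
    sim = z @ z.T / temperature               # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)            # exclude self-similarity
    N = z1.shape[0]
    # positive index for each row: i pairs with i+N, and i+N with i
    pos = np.concatenate([np.arange(N, 2 * N), np.arange(N)])
    # cross-entropy of the softmax over similarities, positive as target
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * N), pos] - logsumexp)
    return loss.mean()
```

Minimising this loss pulls the two views of each image together in embedding space while pushing apart views of different images, which is the geometric picture developed in the first part of the tutorial.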