Major research topic

Data Generation Methods for Training Deep Learning Detection and Segmentation Models under Scarce Supervision

Abstract

Deep Learning (DL) has proven highly effective at producing models that tackle a wide variety of problems, including object detection and segmentation. However, to achieve good performance, DL frameworks need large amounts of data to train domain-specific models, and in many real-world applications obtaining such data is challenging. This motivates data augmentation strategies, which aim to train powerful DL models when data for the task at hand is scarce. When data is extremely scarce, however, even these methods may prove insufficient.

This thesis aims to develop novel techniques for training models under data scarcity, combining geometrical and statistical methods with generative AI frameworks to create new data for training Computer Vision models. It focuses especially on biomedical imaging, where collecting and annotating application-specific data is costly and time-consuming.

The methodological approach is based on three steps:

  1. Instance Generation: realistically generate Ground Truth (GT) annotations for single objects
  2. Mask Composition: combine the GTs created in step 1 to create realistic mask annotations
  3. Texture Synthesis: create a realistic image or volume from the masks created in step 2 using DL generative frameworks

In particular, steps 1 and 2 will exploit geometrical and statistical priors derived from small datasets, and will incorporate expert domain knowledge to compensate for the lack of real data. Separating the process into phases will also improve the controllability and explainability of dataset generation, the lack of which is one of the major drawbacks of methods based entirely on Deep Learning, especially in medical applications. For this reason, the research also aims to study how explainability methods can be included in step 3 to obtain a fully explainable generation process. The method will be applied to generate annotated datasets in 2D (images), 3D (volumes), and 2D+time or 3D+time (videos).
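As a purely illustrative sketch of how the three steps could fit together in 2D, the toy pipeline below samples elliptical instances from simple shape priors, places them without overlap to form an instance-label mask, and then renders an image from that mask. Everything in it is an assumption made for illustration and not the thesis method: the function names, the elliptical shape model, the prior parameters, and above all the last step, where mask-conditioned Gaussian noise stands in for the DL generative framework that the texture synthesis step would actually use.

```python
import numpy as np


def generate_instance(rng, mean_radius=6.0, std_radius=1.5, max_eccentricity=0.5):
    """Step 1 (toy): sample one binary object mask, an ellipse, from shape priors."""
    # Semi-major axis drawn from a Gaussian prior, clipped to a sane range.
    a = float(np.clip(rng.normal(mean_radius, std_radius), 2.0, 2.0 * mean_radius))
    b = a * (1.0 - rng.uniform(0.0, max_eccentricity))  # semi-minor axis
    theta = rng.uniform(0.0, np.pi)                      # random orientation
    half = int(np.ceil(a)) + 1
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the pixel grid, then test the ellipse equation.
    xr = xx * np.cos(theta) + yy * np.sin(theta)
    yr = -xx * np.sin(theta) + yy * np.cos(theta)
    return (xr / a) ** 2 + (yr / b) ** 2 <= 1.0


def compose_mask(rng, shape=(128, 128), n_objects=10, max_tries=200):
    """Step 2 (toy): place instances without overlap; returns an instance-label map."""
    labels = np.zeros(shape, dtype=np.int32)
    next_id = 1
    for _ in range(max_tries):
        if next_id > n_objects:
            break
        inst = generate_instance(rng)
        h, w = inst.shape
        y0 = rng.integers(0, shape[0] - h + 1)
        x0 = rng.integers(0, shape[1] - w + 1)
        window = labels[y0:y0 + h, x0:x0 + w]  # view into labels
        if np.any(window[inst] != 0):
            continue  # reject placements that overlap an existing object
        window[inst] = next_id
        next_id += 1
    return labels


def synthesize_texture(rng, labels, fg_mean=0.7, bg_mean=0.2, noise_std=0.05):
    """Step 3 stand-in: mask-conditioned noise, NOT a generative model."""
    image = np.where(labels > 0, fg_mean, bg_mean).astype(np.float32)
    image += rng.normal(0.0, noise_std, size=labels.shape).astype(np.float32)
    return np.clip(image, 0.0, 1.0)


rng = np.random.default_rng(0)
labels = compose_mask(rng)               # synthetic GT annotation
image = synthesize_texture(rng, labels)  # image paired with that annotation
```

Because the annotation is produced before the image, every synthetic image comes with an exact mask by construction, and the parameters of the first two steps (size, eccentricity, object count) remain directly inspectable and adjustable, which is the kind of controllability the separation into phases is meant to provide.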
