Abstract: Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning - the ability to scale to large amounts of data, which should be possible precisely because self-supervision requires no manual labels. In this work, we revisit this principle and scale two popular self-supervised approaches to 100 million images. Scaling these methods also provides many interesting insights into the limitations of current self-supervised techniques and evaluations. We conclude that current self-supervised methods are not complex enough to take full advantage of large-scale data and do not seem to learn effective high-level semantic representations. Finally, we show how scaling current self-supervised methods provides state-of-the-art results that sometimes match or surpass supervised representations on tasks such as object detection, surface normal estimation, and visual navigation.
Bio: Ishan is a Research Scientist at Facebook AI Research. He graduated from Carnegie Mellon University, where his PhD thesis, titled "Visual Learning with Minimal Human Supervision", received the runner-up SCS Distinguished Dissertation Award. This work was about learning recognition models with minimal supervision by exploiting structure and biases in the labels (multi-task learning), classifiers (meta-learning), and data (self-supervision). His current research interests are in self-supervised approaches, understanding vision-and-language models, and compositional models for small-sample learning.
Website - http://imisra.github.io/