
What's the intuition behind contrastive learning or approach?

Cross Validated Asked on November 18, 2021

Maybe a noob's query, but recently I have seen a surge of papers on contrastive learning (commonly used as a form of self-supervised learning).

Some of the prominent recent research papers I have read that detail this approach are:

Could you give a detailed explanation of this approach versus transfer learning and other approaches?
Also, why is it gaining traction in the ML research community?

2 Answers

Contrastive learning is very intuitive. If I ask you to find the matching animal in the photo below, you can do so quite easily. You understand that the animal on the left is a "cat" and you want to find another "cat" image on the right side. In other words, you can contrast similar and dissimilar things.

[Image: a cat on the left, and a set of candidate animal photos on the right from which the matching cat is to be picked]

Contrastive learning is an approach to formulating this task of finding similar and dissimilar things for a machine. You can train a machine learning model to classify between similar and dissimilar images. There are several design choices to make, illustrated in the sketch after this list:

  1. Encoder Architecture: To convert the image into representations
  2. Similarity measure between two images: mean squared error, cosine similarity, content loss
  3. Generating the Training Pairs: manual annotation, self-supervised methods
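To make these three choices concrete, here is a minimal PyTorch/torchvision sketch (not taken from any specific paper; the encoder, projection head sizes, and augmentations are illustrative assumptions): a ResNet-18 backbone as the encoder, cosine similarity as the similarity measure, and two random augmentations of the same image as a self-supervised positive pair.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms, models

# Choice 1 -- Encoder: a ResNet-18 backbone with its classifier replaced
# by a small projection MLP (names and sizes are illustrative).
class Encoder(nn.Module):
    def __init__(self, proj_dim=128):
        super().__init__()
        backbone = models.resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()          # keep only the feature extractor
        self.backbone = backbone
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, proj_dim)
        )

    def forward(self, x):
        return self.projector(self.backbone(x))

# Choice 3 -- Training pairs: in a self-supervised setup, two random
# augmentations ("views") of the same image form a positive pair.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4),
    transforms.ToTensor(),
])

encoder = Encoder()
# Random tensors stand in for two augmented views of the same batch of images.
x1 = torch.randn(4, 3, 224, 224)
x2 = torch.randn(4, 3, 224, 224)
z1, z2 = encoder(x1), encoder(x2)

# Choice 2 -- Similarity measure: cosine similarity between representations;
# training pushes this up for positive pairs and down for negative pairs.
similarity = F.cosine_similarity(z1, z2)
```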

This blog post explains the intuition behind contrastive learning and how it is applied in recent papers like SimCLR in more detail.

Answered by Amit Chaudhary on November 18, 2021

Contrastive learning is a framework that learns representations from data organized into similar/dissimilar pairs. It can be formulated as a dictionary look-up problem.

Both MoCo and SimCLR use variants of a contrastive loss function, such as InfoNCE from the paper Representation Learning with Contrastive Predictive Coding:

$$\mathcal{L}_{q,\, k^+,\, \{k^-\}} = -\log \frac{\exp(q \cdot k^+ / \tau)}{\exp(q \cdot k^+ / \tau) + \sum\limits_{k^-} \exp(q \cdot k^- / \tau)}$$

Here $q$ is a query representation, $k^+$ is the representation of the positive (similar) key sample, and $\{k^-\}$ are the representations of the negative (dissimilar) key samples; $\tau$ is a temperature hyper-parameter. In the instance discrimination pretext task (used by MoCo and SimCLR), a query and a key form a positive pair if they are data-augmented versions of the same image, and otherwise form a negative pair.

[Figure: comparison of the SimCLR (end-to-end) and MoCo mechanisms]
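As a minimal sketch (not code from either paper; the shapes and the temperature value are illustrative assumptions), the loss above can be computed in PyTorch by placing the positive logit at index 0 and taking a cross-entropy against label 0, which is algebraically the same as the formula:

```python
import torch
import torch.nn.functional as F

def info_nce(q, k_pos, k_negs, tau=0.07):
    """q: (N, D) queries, k_pos: (N, D) positive keys, k_negs: (K, D) negative keys."""
    l_pos = (q * k_pos).sum(dim=1, keepdim=True)   # (N, 1): q . k+
    l_neg = q @ k_negs.t()                          # (N, K): q . k- for each negative
    logits = torch.cat([l_pos, l_neg], dim=1) / tau # (N, 1+K)
    # The positive sits at index 0, so cross-entropy against label 0 equals
    # -log( exp(q.k+/tau) / (exp(q.k+/tau) + sum_{k-} exp(q.k-/tau)) ).
    labels = torch.zeros(q.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Example with random, L2-normalized vectors standing in for representations:
q      = F.normalize(torch.randn(8, 128), dim=1)
k_pos  = F.normalize(torch.randn(8, 128), dim=1)
k_negs = F.normalize(torch.randn(4096, 128), dim=1)
loss = info_nce(q, k_pos, k_negs)
```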

The contrastive loss can be minimized by various mechanisms that differ in how the keys are maintained.

In an end-to-end mechanism (Fig. 1a), the negative keys are from the same batch and updated end-to-end by back-propagation. SimCLR is based on this mechanism and requires a large batch to provide a large set of negatives.

In the MoCo (Momentum Contrast) mechanism (Fig. 1b), the negative keys are maintained in a queue, and only the queries and positive keys are encoded in each training batch.

Quoted from a recent research paper, Improved Baselines with Momentum Contrastive Learning (https://arxiv.org/abs/2003.04297).
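To make the quoted mechanism concrete, here is a minimal sketch (not the paper's code; the stand-in linear encoders, the queue size, and the momentum coefficient are illustrative assumptions) of MoCo's two ingredients: a momentum-updated key encoder that is never trained by back-propagation, and a fixed-size FIFO queue of negative keys.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim, K, m = 128, 4096, 0.999           # feature dim, queue size, momentum coefficient
encoder_q = nn.Linear(512, dim)         # stand-in query encoder (trained by backprop)
encoder_k = nn.Linear(512, dim)         # stand-in key encoder (momentum copy)
encoder_k.load_state_dict(encoder_q.state_dict())
queue = F.normalize(torch.randn(K, dim), dim=1)   # queue of negative keys

@torch.no_grad()
def momentum_update():
    # The key encoder slowly follows the query encoder instead of receiving gradients.
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data = m * p_k.data + (1.0 - m) * p_q.data

def training_step(x_q, x_k, tau=0.07):
    global queue
    q = F.normalize(encoder_q(x_q), dim=1)          # queries: gradients flow here
    with torch.no_grad():
        momentum_update()
        k = F.normalize(encoder_k(x_k), dim=1)      # positive keys: no gradients
    # Negatives come from the queue, not from the current batch.
    l_pos = (q * k).sum(dim=1, keepdim=True)
    l_neg = q @ queue.t()
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    loss = F.cross_entropy(logits, torch.zeros(q.size(0), dtype=torch.long))
    # Enqueue the new keys and dequeue the oldest ones (FIFO, size stays K).
    queue = torch.cat([k.detach(), queue], dim=0)[:K]
    return loss

# Usage with random tensors standing in for two augmented views of a batch:
loss = training_step(torch.randn(8, 512), torch.randn(8, 512))
```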

Answered by CATALUNA84 on November 18, 2021
