
What's the intuition behind contrastive learning or approach?

Cross Validated Asked on November 18, 2021

Maybe a noob's query, but recently I have seen a surge of papers on contrastive learning (commonly used as a form of self-supervised learning).

Some of the prominent recent research papers I have read that detail this approach are:

Could you give a detailed explanation of this approach versus transfer learning and other approaches?
Also, why is it gaining traction in the ML research community?

2 Answers

Contrastive learning is very intuitive. If I ask you to find the matching animal in the photo below, you can do so quite easily. You understand that the animal on the left is a "cat" and you want to find another "cat" image on the right side. In other words, you can contrast similar and dissimilar things.

[Image: a cat on the left, and a set of candidate animal photos on the right from which the matching cat is to be picked]

Contrastive learning is an approach to formulating this task of finding similar and dissimilar things for a machine. You can train a machine learning model to classify between similar and dissimilar images. There are several design choices to make, illustrated in the sketch after this list:

  1. Encoder Architecture: To convert the image into representations
  2. Similarity measure between two images: mean squared error, cosine similarity, content loss
  3. Generating the Training Pairs: manual annotation, self-supervised methods
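To make these three choices concrete, here is a minimal PyTorch/torchvision sketch (not taken from any specific paper; the encoder, projection head sizes, and augmentations are illustrative assumptions): a ResNet-18 backbone as the encoder, cosine similarity as the similarity measure, and two random augmentations of the same image as a self-supervised positive pair.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms, models

# Choice 1 -- Encoder: a ResNet-18 backbone with its classifier replaced
# by a small projection MLP (names and sizes are illustrative).
class Encoder(nn.Module):
    def __init__(self, proj_dim=128):
        super().__init__()
        backbone = models.resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()          # keep only the feature extractor
        self.backbone = backbone
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, proj_dim)
        )

    def forward(self, x):
        return self.projector(self.backbone(x))

# Choice 3 -- Training pairs: in a self-supervised setup, two random
# augmentations ("views") of the same image form a positive pair.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4),
    transforms.ToTensor(),
])

encoder = Encoder()
# Random tensors stand in for two augmented views of the same batch of images.
x1 = torch.randn(4, 3, 224, 224)
x2 = torch.randn(4, 3, 224, 224)
z1, z2 = encoder(x1), encoder(x2)

# Choice 2 -- Similarity measure: cosine similarity between representations;
# training pushes this up for positive pairs and down for negative pairs.
similarity = F.cosine_similarity(z1, z2)
```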

This blog post explains the intuition behind contrastive learning and how it is applied in recent papers like SimCLR in more detail.

Answered by Amit Chaudhary on November 18, 2021

Contrastive learning is a framework that learns representations from data organized into similar/dissimilar pairs. It can be formulated as a dictionary look-up problem.

Both MoCo and SimCLR use variants of a contrastive loss function, such as InfoNCE from the paper Representation Learning with Contrastive Predictive Coding:

$$\mathcal{L}_{q,\, k^+,\, \{k^-\}} = -\log \frac{\exp(q \cdot k^+ / \tau)}{\exp(q \cdot k^+ / \tau) + \sum\limits_{k^-} \exp(q \cdot k^- / \tau)}$$

Here $q$ is a query representation, $k^+$ is the representation of the positive (similar) key sample, and $\{k^-\}$ are the representations of the negative (dissimilar) key samples; $\tau$ is a temperature hyper-parameter. In the instance discrimination pretext task (used by MoCo and SimCLR), a query and a key form a positive pair if they are data-augmented versions of the same image, and otherwise form a negative pair.

[Figure: comparison of the SimCLR (end-to-end) and MoCo mechanisms]
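As a minimal sketch (not code from either paper; the shapes and the temperature value are illustrative assumptions), the loss above can be computed in PyTorch by placing the positive logit at index 0 and taking a cross-entropy against label 0, which is algebraically the same as the formula:

```python
import torch
import torch.nn.functional as F

def info_nce(q, k_pos, k_negs, tau=0.07):
    """q: (N, D) queries, k_pos: (N, D) positive keys, k_negs: (K, D) negative keys."""
    l_pos = (q * k_pos).sum(dim=1, keepdim=True)   # (N, 1): q . k+
    l_neg = q @ k_negs.t()                          # (N, K): q . k- for each negative
    logits = torch.cat([l_pos, l_neg], dim=1) / tau # (N, 1+K)
    # The positive sits at index 0, so cross-entropy against label 0 equals
    # -log( exp(q.k+/tau) / (exp(q.k+/tau) + sum_{k-} exp(q.k-/tau)) ).
    labels = torch.zeros(q.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Example with random, L2-normalized vectors standing in for representations:
q      = F.normalize(torch.randn(8, 128), dim=1)
k_pos  = F.normalize(torch.randn(8, 128), dim=1)
k_negs = F.normalize(torch.randn(4096, 128), dim=1)
loss = info_nce(q, k_pos, k_negs)
```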

The contrastive loss can be minimized by various mechanisms that differ in how the keys are maintained.

In an end-to-end mechanism (Fig. 1a), the negative keys are from the same batch and updated end-to-end by back-propagation. SimCLR is based on this mechanism and requires a large batch to provide a large set of negatives.

In the MoCo (Momentum Contrast) mechanism (Fig. 1b), the negative keys are maintained in a queue, and only the queries and positive keys are encoded in each training batch.

Quoted from a recent research paper, Improved Baselines with Momentum Contrastive Learning (https://arxiv.org/abs/2003.04297).
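To make the quoted mechanism concrete, here is a minimal sketch (not the paper's code; the stand-in linear encoders, the queue size, and the momentum coefficient are illustrative assumptions) of MoCo's two ingredients: a momentum-updated key encoder that is never trained by back-propagation, and a fixed-size FIFO queue of negative keys.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim, K, m = 128, 4096, 0.999           # feature dim, queue size, momentum coefficient
encoder_q = nn.Linear(512, dim)         # stand-in query encoder (trained by backprop)
encoder_k = nn.Linear(512, dim)         # stand-in key encoder (momentum copy)
encoder_k.load_state_dict(encoder_q.state_dict())
queue = F.normalize(torch.randn(K, dim), dim=1)   # queue of negative keys

@torch.no_grad()
def momentum_update():
    # The key encoder slowly follows the query encoder instead of receiving gradients.
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data = m * p_k.data + (1.0 - m) * p_q.data

def training_step(x_q, x_k, tau=0.07):
    global queue
    q = F.normalize(encoder_q(x_q), dim=1)          # queries: gradients flow here
    with torch.no_grad():
        momentum_update()
        k = F.normalize(encoder_k(x_k), dim=1)      # positive keys: no gradients
    # Negatives come from the queue, not from the current batch.
    l_pos = (q * k).sum(dim=1, keepdim=True)
    l_neg = q @ queue.t()
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    loss = F.cross_entropy(logits, torch.zeros(q.size(0), dtype=torch.long))
    # Enqueue the new keys and dequeue the oldest ones (FIFO, size stays K).
    queue = torch.cat([k.detach(), queue], dim=0)[:K]
    return loss

# Usage with random tensors standing in for two augmented views of a batch:
loss = training_step(torch.randn(8, 512), torch.randn(8, 512))
```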

Answered by CATALUNA84 on November 18, 2021
