
Contrastive MNIST Experiments

Goal

Obtain a smooth 2D latent representation of MNIST images that clusters naturally by digit, without using the labels (self-supervised learning, SSL).
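
For orientation, a minimal sketch of an encoder that maps a 28x28 MNIST image to a 2D latent point; the architecture here is purely illustrative and may differ from the one in the repository.

```python
# Illustrative encoder: 28x28 grayscale image -> 2D latent embedding.
# Not the repository's architecture, just a minimal example of the setup.
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 2),                               # 2D latent point
)
```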

Performed a few simple experiments focusing on the following questions:

  • Which augmentations are conducive to a meaningful latent space?
  • How does a contrastive loss compare to a reconstructive loss?
  • How do other parameters (temperature, batch size, regularisation of the latent space, ...) affect the result?

Evaluation is by visual inspection of the latent space and by k-NN accuracy on a validation set with k = 1, 5, and 20 (see the sketch below).
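
A minimal sketch of the k-NN evaluation, assuming the trained encoder has already produced embedding arrays and digit labels; the variable names here are illustrative, not from the repository.

```python
# Fit a k-NN classifier on training embeddings, report validation accuracy.
# train_z, val_z: (N, 2) arrays of latent embeddings; train_y, val_y: labels.
from sklearn.neighbors import KNeighborsClassifier

def knn_accuracy(train_z, train_y, val_z, val_y, ks=(1, 5, 20)):
    scores = {}
    for k in ks:
        clf = KNeighborsClassifier(n_neighbors=k)
        clf.fit(train_z, train_y)
        scores[k] = clf.score(val_z, val_y)
    return scores
```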

Key Findings

  • Contrastive vs. reconstructive loss: the contrastive loss works much better. It uses the latent space more efficiently and achieves higher k-NN accuracy (see the loss sketch after this list).

    • Interpretation: reconstruction requires encoding far more information in the latent space than is necessary for class identity, which is what we actually care about.
  • Subtle augmentations that mimic "natural" variation between digits work better than more "artificial" ones.

    • Interpretation: data augmentation lets us guide the model towards specific invariances. If those invariances correspond to within-label variance, the representation gravitates towards clustering by label.
  • L2 regularisation improves k-NN performance (it is included in the loss sketch below).

    • Interpretation: L2 regularisation forces the model to cluster more tightly, improving the global coherence of the representation.
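
The sketch below shows a standard SimCLR-style NT-Xent contrastive loss with an L2 penalty on the embedding norms, as one concrete instance of the setup described above. The exact loss formulation, temperature, and the way L2 regularisation is applied in the repository may differ (it could, for instance, be weight decay on the model parameters instead).

```python
# SimCLR-style NT-Xent contrastive loss with an L2 penalty on the embeddings.
# Illustrative only; the repository's loss and hyperparameters may differ.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5, l2_weight=1e-3):
    """z1, z2: (batch, 2) embeddings of two augmented views of the same images."""
    batch = z1.shape[0]
    z = torch.cat([z1, z2], dim=0)                                       # (2B, 2)
    sim = F.cosine_similarity(z.unsqueeze(1), z.unsqueeze(0), dim=-1)    # (2B, 2B)
    sim = sim / temperature
    sim.fill_diagonal_(float("-inf"))                                    # drop self-similarity
    # The positive for sample i is its other augmented view.
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)])
    contrastive = F.cross_entropy(sim, targets.to(z.device))
    # L2 penalty pulls embeddings towards the origin, tightening the clusters.
    l2_penalty = l2_weight * z.pow(2).sum(dim=1).mean()
    return contrastive + l2_penalty
```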

Examples / Figures

Latent Space (Best Config)

Latent space at epoch 50 with L2 regularisation

Clear clusters of similar digits. For example, the "1" (orange) cluster is well separated. Local alignment is strong, though the global structure still has room for improvement.

Augmentations in the Best Config

Augmentations in the good configuration

Original images are shown in the left-most column; remaining columns display applied augmentations.
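
As an illustration, a torchvision pipeline of the kind of subtle, handwriting-like augmentations described above might look as follows; the actual transforms and parameters used in the repository may differ.

```python
# Subtle augmentations mimicking natural handwriting variation:
# small rotations, slight translation/scaling/shear, mild blur.
import torchvision.transforms as T

subtle_augment = T.Compose([
    T.RandomAffine(degrees=10, translate=(0.05, 0.05), scale=(0.9, 1.1), shear=5),
    T.GaussianBlur(kernel_size=3, sigma=(0.1, 1.0)),
    T.ToTensor(),
])
```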

Training Behaviour (Best Config)

Training curves for the good configuration

Training Behaviour (Bad Config, Unsuitable Augmentations)

Training curves with unsuitable augmentations

Augmentation Examples (Bad Config, Unsuitable Augmentations)

Augmentation examples with strong affine transforms

Strong positional translations did not work well, presumably because MNIST digits are already well centred, so the generated variation does not correspond to anything the representation needs to be invariant to.
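
For contrast, an example of the kind of strong positional translation that performed poorly; the values here are illustrative, not the repository's settings.

```python
# "Bad config" style augmentation: large translations shift digits far off-centre,
# producing variation that is not useful for clustering by digit identity.
import torchvision.transforms as T

strong_augment = T.Compose([
    T.RandomAffine(degrees=0, translate=(0.4, 0.4)),
    T.ToTensor(),
])
```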

Next Steps

  • It might be interesting to find ways to perturb the latent space so as to improve global coherence, though it is unclear how.
  • A VAE-inspired approach that embeds each image as a distribution rather than a single point could also be worth exploring.
