This repository was archived by the owner on Dec 24, 2025. It is now read-only.

sunsided/texture-extraction

Texture Patch Clustering

This repository contains code for the texture patch classification research project. From all given input images, texture patches are extracted and stored into TFRecord files for later processing; this is explained below.

The following experiments can be found:

  • som.py contains a Self-Organizing Maps (Kohonen Maps) clustering approach.
  • ae.py contains a (non-convolutional) tied-weights autoencoder that treats each input as an n-dimensional vector and compresses it to some m < n dimensions using standard MLP-like, tanh-activated axpy operations.
  • cae.py contains a convolutional autoencoder that attempts to do the same using 2D convolutions (sigmoid activation) on 2D RGB images, rather than flat vectors.
  • cae_tw.py is like cae.py, but uses a tied-weights approach.
  • show_energy.py and show_entropy.py demonstrate energy and entropy based cropping, respectively, and are showcased below.
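To illustrate what "tied weights" means for the dense autoencoder, here is a minimal pure-Python sketch. It is not the code from ae.py: the function names, the initialization, and the tanh on the decoder output are assumptions made for illustration. The point is only that the decoder reuses the transpose of the encoder matrix instead of learning a second matrix.

```python
import math
import random


def make_tied_autoencoder(n, m, seed=0):
    """Create parameters for an n -> m -> n autoencoder with one shared matrix.

    Hypothetical helper: returns (W, b_enc, b_dec) where W has shape m x n.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n)
    W = [[rng.uniform(-scale, scale) for _ in range(n)] for _ in range(m)]
    return W, [0.0] * m, [0.0] * n


def encode(W, b_enc, x):
    # h = tanh(W x + b_enc): one axpy-style affine step followed by tanh.
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W, b_enc)
    ]


def decode(W, b_dec, h):
    # x_hat = tanh(W^T h + b_dec): the decoder reuses W transposed,
    # so no separate decoder matrix exists.
    return [
        math.tanh(sum(W[j][i] * h[j] for j in range(len(h))) + b_dec[i])
        for i in range(len(b_dec))
    ]


W, b_enc, b_dec = make_tied_autoencoder(n=8, m=3)
x = [0.1 * i for i in range(8)]
code = encode(W, b_enc, x)        # 3 values (m < n)
recon = decode(W, b_dec, code)    # 8 values, same length as the input
```

Training (not shown) would adjust W and the two bias vectors to minimize the reconstruction error between x and recon.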

Energy-/entropy-based cropping

show_energy.py performs energy-based cropping of an image. The following image shows the energy computed with a Scharr edge detector.

Energy
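The idea can be sketched in a few lines of pure Python. This is an illustration under assumptions, not the code from show_energy.py: the energy is taken as the Scharr gradient magnitude of a grayscale image, the crop is the bounding box of all pixels above a threshold, and the names scharr_energy and crop_box are hypothetical. How the repository picks its threshold automatically is not shown here.

```python
import math

# Standard 3x3 Scharr kernels for horizontal and vertical gradients.
SCHARR_X = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]
SCHARR_Y = [[-3, -10, -3], [0, 0, 0], [3, 10, 3]]


def scharr_energy(gray):
    """Gradient magnitude per pixel; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    v = gray[y + dy - 1][x + dx - 1]
                    gx += SCHARR_X[dy][dx] * v
                    gy += SCHARR_Y[dy][dx] * v
            energy[y][x] = math.hypot(gx, gy)
    return energy


def crop_box(energy, threshold):
    """Inclusive (top, left, bottom, right) box of pixels above threshold."""
    rows = [y for y, row in enumerate(energy) if any(e > threshold for e in row)]
    cols = [
        x for x in range(len(energy[0]))
        if any(row[x] > threshold for row in energy)
    ]
    if not rows:
        return None  # nothing exceeds the threshold
    return rows[0], cols[0], rows[-1], cols[-1]


# An 8x8 black image with a bright 2x2 square in the middle.
gray = [[0.0] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (3, 4):
        gray[y][x] = 1.0

energy = scharr_energy(gray)
box = crop_box(energy, threshold=0.0)
```

The resulting box encloses the square plus a one-pixel margin, because the 3x3 kernel responds wherever its window touches an edge.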

Alternatively, show_entropy.py determines the local entropy given a window size and crops based on this. The following image shows the entropy given a window of 10 pixels.

Entropy
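As a rough sketch of the local entropy measure, the following pure-Python function computes the Shannon entropy (in bits) of the intensity values inside a square window around each pixel. It is an assumption-laden illustration rather than the code from show_entropy.py: the function name, the border clipping, and the use of raw intensity values as histogram bins are all choices made here.

```python
import math
from collections import Counter


def local_entropy(gray, window):
    """Shannon entropy in bits of the values in a window x window region.

    The window starts window // 2 pixels up and left of each pixel
    and is clipped at the image borders.
    """
    h, w = len(gray), len(gray[0])
    half = window // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, x0 = y - half, x - half
            counts = Counter(
                gray[yy][xx]
                for yy in range(max(0, y0), min(h, y0 + window))
                for xx in range(max(0, x0), min(w, x0 + window))
            )
            total = sum(counts.values())
            out[y][x] = -sum(
                c / total * math.log2(c / total) for c in counts.values()
            )
    return out


# Left half black, right half white: entropy is non-zero only at the boundary.
gray = [[0, 0, 255, 255] for _ in range(4)]
entropy = local_entropy(gray, window=2)
```

Flat regions score zero and textured or boundary regions score higher, which is what makes the measure usable as a cropping criterion.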

Patch Extraction

The extract_random_patches.py program loops through all files in the product-pictures bucket of the Google Cloud project determined by google-credentials.json. It assigns an int64 ID number ("iid") to each file and records the mapping in the data/index.tsv file.

The index is a tab-separated plain-text file containing the ID in the first column and the file name within the bucket in the second column, e.g.

0	000/571e174e9fcd67000f55d000.jpg
1	000/571e28bc9fcd67000f55e000.jpg
2	000/571e74e39fcd67000f562000.jpg
3	000/571e976b9fcd67000f564000.jpg
...
368442	fff/5852c2efb2f46e00010dafff.jpg
368443	fff/5852ec9ab2f46e00010dbfff.jpg
368444	fff/58535a17b2f46e00010ddfff.jpg
368445	socks.jpg
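Reading the index back is straightforward with the standard library. The sketch below assumes the two-column layout shown above; read_index is a hypothetical helper name, not a function from the repository.

```python
import csv
import io


def read_index(fileobj):
    """Parse a tab-separated index into a dict mapping iid -> file name."""
    index = {}
    for row in csv.reader(fileobj, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        index[int(row[0])] = row[1]
    return index


# Two lines taken from the example above, used here instead of the real file.
sample = "0\t000/571e174e9fcd67000f55d000.jpg\n368445\tsocks.jpg\n"
index = read_index(io.StringIO(sample))
```

For the real file, the same function can be called as `read_index(open("data/index.tsv", newline=""))`.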

Each of these files is downloaded, cropped using an automatically determined energy threshold and then run through patch extraction. The algorithm randomly samples RGB patches of sizes 32x32, 64x64 and 128x128, resamples them to 32x32 and discards patches that are not distinct (per batch). Each remaining patch is assigned an increasing int64 ID ("pid"), so that a specific patch can be traced back to the image that contained it.
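The sampling step can be sketched as follows. This is a simplified illustration and not the code from extract_random_patches.py: images are plain nested lists of (r, g, b) tuples, resampling is done by nearest-neighbour subsampling, and "not distinct" is interpreted as "exact duplicate within the batch". All three are assumptions, as are the function names.

```python
import random

PATCH_SIZES = (32, 64, 128)
TARGET = 32


def extract_patch(image, top, left, size):
    """Cut a size x size patch and subsample it to TARGET x TARGET."""
    step = size // TARGET  # 1, 2 or 4 for the sizes above
    return tuple(
        tuple(image[top + r * step][left + c * step] for c in range(TARGET))
        for r in range(TARGET)
    )


def sample_distinct_patches(image, count, rng):
    """Draw `count` random patches and keep each distinct one once."""
    h, w = len(image), len(image[0])
    sizes = [s for s in PATCH_SIZES if s <= min(h, w)]
    seen = set()
    patches = []
    for _ in range(count):
        size = rng.choice(sizes)
        top = rng.randrange(h - size + 1)
        left = rng.randrange(w - size + 1)
        patch = extract_patch(image, top, left, size)
        if patch not in seen:  # drop exact duplicates within this batch
            seen.add(patch)
            patches.append(patch)
    return patches


# Synthetic 128x128 gradient image standing in for a downloaded picture.
image = [[(y, x, 0) for x in range(128)] for y in range(128)]
patches = sample_distinct_patches(image, count=20, rng=random.Random(0))

# Assign increasing patch IDs, continuing from a running counter.
next_pid = 0
numbered = list(enumerate(patches, start=next_pid))
```

On a completely uniform image every sampled patch is identical, so only one survives the distinctness filter.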

The patches are then formatted as TFExample shaped protocol buffers using the following structure:

iid:  int64     the image ID
pid:  int64     the patch ID
raw: [uint8]    the pixel values in columnwise RGB order

The TFExamples are collected and written to gzipped TFRecord files of 100000 example entries each (resulting in approx. 150 MB per file) and stored as data/[0-9]{5,}\.tfrecord\.gz (e.g. data/00000.tfrecord.gz). The load_patches.py file demonstrates loading the examples back in.
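The naming scheme and the file size can be checked with a little arithmetic. The sketch below assumes that files are numbered consecutively starting at 00000 and filled in order; shard_path is a hypothetical helper, not part of the repository.

```python
import re

EXAMPLES_PER_FILE = 100_000
SHARD_PATTERN = re.compile(r"data/[0-9]{5,}\.tfrecord\.gz")


def shard_path(example_number):
    """Path of the file that would hold the example with this 0-based number."""
    return f"data/{example_number // EXAMPLES_PER_FILE:05d}.tfrecord.gz"


# A 32x32 RGB patch stored as uint8 occupies 32 * 32 * 3 = 3072 bytes.
RAW_BYTES_PER_PATCH = 32 * 32 * 3
raw_bytes_per_file = EXAMPLES_PER_FILE * RAW_BYTES_PER_PATCH  # 307,200,000
```

The uncompressed pixel data alone is therefore about 307 MB per file, so the quoted 150 MB corresponds to gzip roughly halving the size.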
