Books and Their Covers: SIFT -> Codebook -> Features -> Classification

My last update was about moving forward with feature extraction and logistic regression classification now that VLFeat is installed.

For each frame it detects, vl_sift outputs the X and Y coordinates of the frame center, a frame scale, and a frame orientation in radians, along with a descriptor vector for that frame. When mapped onto an image, SIFT features look something like:

The yellow circles represent the frames (the position, scale, and orientation values mentioned above), and the green boxes represent the descriptor vectors, which are calculated from local gradient information.
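As a rough sketch of how this output is produced and plotted, assuming VLFeat is already set up on the MATLAB path: the snippet below uses peppers.png, a sample image that ships with MATLAB, purely as a stand-in for a book cover, and the variable names are my own choices for illustration.

```matlab
% Sketch: extract and plot SIFT frames and descriptors for one image.
% 'peppers.png' is a stand-in sample image, not part of this project's data.
I = imread('peppers.png');
if size(I, 3) == 3
    I = rgb2gray(I);              % vl_sift works on grayscale images
end
I = im2single(I);                 % ... stored in single precision

[frames, descriptors] = vl_sift(I);

% frames is 4 x N: each column is [x; y; scale; orientation in radians]
% descriptors is 128 x N: one uint8 descriptor vector per frame
x           = frames(1, :);
y           = frames(2, :);
scale       = frames(3, :);
orientation = frames(4, :);

% Overlay a subset of the features on the image
n   = min(50, size(frames, 2));
sel = 1:n;
imagesc(I); colormap gray; axis image; hold on;
hFrames = vl_plotframe(frames(:, sel));
set(hFrames, 'color', 'y', 'linewidth', 2);
hDesc = vl_plotsiftdescriptor(descriptors(:, sel), frames(:, sel));
set(hDesc, 'color', 'g');
```

Plotting only the first 50 frames is an arbitrary choice to keep the figure readable.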

These raw position/scale/orientation values won't work directly as logistic regression inputs, because each value on its own doesn't mean anything and every image produces a different number of frames. I need a one-value representation of each SIFT frame. One way of doing this is to generate a "codebook" of similar features across the training data and use the codebook to build a histogram for each image, which then serves as the input for classification. This concept is visualized below from Kinnunen et al. 2009:

A commonly used codebook creation method is to run k-means and then use the cluster centers as codevectors (for SIFT we could run k-means on the descriptor vectors). Having chosen the number of codevectors, one can then define similarity bins centered on each codevector, so that every feature falls into the bin of its nearest codevector.
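Here is a minimal sketch of that idea, not the final implementation. It assumes VLFeat's vl_kmeans is available, uses random numbers in place of real pooled SIFT descriptors so it runs on its own, and picks K = 50 arbitrarily; all variable names are hypothetical.

```matlab
% Sketch of a k-means codebook and a per-image histogram of codewords.
% trainDescriptors stands in for SIFT descriptors pooled over all
% training images (128 x N); random data is used here for illustration.
trainDescriptors = single(randi([0 255], 128, 2000));
K = 50;                                   % number of codevectors (assumed)

% The cluster centers are the codevectors: codebook is 128 x K
codebook = vl_kmeans(trainDescriptors, K);

% Descriptors of one image to encode (stand-in data again)
imgDescriptors = single(randi([0 255], 128, 300));

% Squared Euclidean distance from every codevector to every descriptor,
% giving a K x M matrix
d2 = bsxfun(@plus, sum(codebook.^2, 1)', sum(imgDescriptors.^2, 1)) ...
     - 2 * (codebook' * imgDescriptors);

% Each descriptor becomes one value: the index of its nearest codevector
[~, words] = min(d2, [], 1);

% Count how many descriptors fall into each bin, then normalize so that
% images with different numbers of frames are comparable
h = accumarray(words(:), 1, [K 1]);
h = h / sum(h);
```

The vector h has the same length K for every image, which is what makes it usable as a feature vector for logistic regression.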

So for next week I'll focus on generating a codebook of SIFT features, creating feature histograms for each image, and running logistic regression on SIFT and HoC.

My MATLAB aside:

MATLAB can store images in 64-bit (double), 32-bit (single), 16-bit (uint16), or 8-bit (uint8) form. double is MATLAB's default 64-bit representation of numeric data. Storing images in uint8 form uses less storage space, but many MATLAB operations are carried out on the double form.

To convert an image I to a double:

I = im2double(I)

The same kind of conversion works for the other image types:

I = im2uint8(I)

I = im2uint16(I)

I = im2single(I)

Note that vl_sift asks for a grayscale image in single form.
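One thing worth seeing in a quick sketch: the im2* functions rescale values when converting from integer types, whereas a plain cast such as double(I) only changes the storage class. The three-pixel "image" below is a made-up example.

```matlab
% Sketch: im2double rescales, double() only casts.
I8 = uint8([0 128 255]);        % a tiny stand-in "image"

D_scaled = im2double(I8);       % rescaled to [0, 1]: 0 -> 0, 255 -> 1
D_cast   = double(I8);          % cast only: values stay 0, 128, 255

S    = im2single(I8);           % single in [0, 1], the form vl_sift takes
back = im2uint8(D_scaled);      % round trip recovers the original values

disp(class(S));                 % single
disp(isequal(back, I8));        % 1
```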

1 comment:

raj007, November 10, 2013 at 10:44 AM
Nice article, thanks for the help! I am working on this for my college project, but the project is too big for me to write the code from scratch, so can you please help me with color SIFT (CSIFT) feature extraction, the codebook, and the corresponding histogram? I don't think VLFeat includes CSIFT. Can you provide me with code that takes a dataset of images and returns the corresponding histograms, assigning them words? It's really urgent, hope you understand!

Wednesday, April 25, 2012
