Wednesday, April 25, 2012

SIFT -> Codebook -> Features -> Classification

Last time my update was about moving forward with feature extraction and logistic regression classification upon having completed installation of VLFeat

vl_sift outputs X, Y frame center position coordinates, a frame scale coordinate, and a frame orientation in radians. When mapped onto an image, sift features look something like:

The yellow circles represent the output values mentioned above, and the green boxes represent descriptor vectors calculated using local gradient information.

These raw position/scale/orientation values won't immediately work for logistic regression purposes because each value on its own doesn't mean anything - I need a one-value representation of each SIFT frame. A way of doing this is generating a "codebook" of similar features across the training data and using the codebook to generate histograms for classification. This concept is visualized below from Kinnunen et al. 2009:


A commonly used codebook creation method is to run k-means and then use the cluster centers as a codevectors (for SIFT we could run k-means on the descriptor vectors). Defining a number of codevectors, one can then define similarity bins centered around each codevector.

So for next week I'll focus on generating a codebook of SIFT features, creating feature histograms for each image, and running logistic regression on SIFT and HoC.

My MATLAB aside:

MATLAB can store images in 64-bit (double), 32-bit (single), 16-bit (uint16), or 8-bit (uint8) form. double is MATLAB's standard 64-bit representation of numeric data. Storing images in uint8 form is a good idea for using less storage space, but many MATLAB operations are carried out on the double form.

To convert an image I to a double:

I = im2double(I)

The same can be carried out with any of the image types:

I = im2uint8(I)

I = im2uint16(I)

I = im2single(I)

Note that VLFeat asks for image input in single form.


1 comment:

  1. Nice article,Thanks for the help!!! I am working on this in my college project but the project is too big that i can't write the code form scratch,so can you please help me with color SIFT(CSIFT)feature extraction,codebook and the corresponding histogram. I don't think VLFeat includes CSIFT. Can you provide me with the code that takes dataset of images and returns corresponding histogram assinging them words....ITS really urgent,Hope you understand!!!

    ReplyDelete