My last update was about moving forward with feature extraction and logistic regression classification now that VLFeat is installed.
The yellow circles represent the output values mentioned above, and the green boxes represent descriptor vectors calculated using local gradient information.
These raw position/scale/orientation values won't work directly for logistic regression because each value on its own doesn't mean anything - I need a one-value representation of each SIFT frame. One way of doing this is to generate a "codebook" of similar features across the training data and then use the codebook to build histograms for classification. This concept is visualized below, from Kinnunen et al. 2009:
A commonly used codebook creation method is to run k-means and then use the cluster centers as codevectors (for SIFT we could run k-means on the descriptor vectors). Once the number of codevectors is chosen, each codevector defines a similarity bin: a feature falls into the bin of its nearest codevector, and counting the features in each bin gives the histogram for an image.
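The codebook idea can be sketched in plain MATLAB. This is only an illustration under stated assumptions: it uses made-up 2-D points in place of 128-D SIFT descriptors, a fixed K = 2, a hand-picked initialization, and a bare-bones k-means loop rather than a library routine. All variable names are hypothetical.

```matlab
% Toy stand-in for SIFT descriptors: one column per descriptor.
% (Assumption: 2-D points here; real SIFT descriptors are 128-D.)
descriptors = [0 0 1 10 10 11;
               0 1 0 10 11 10];
K = 2;                              % number of codevectors (assumed)
centers = descriptors(:, [1 4]);    % deterministic initialization (assumed)

% Bare-bones k-means: assign each descriptor to its nearest center,
% then move each center to the mean of its members.
for iter = 1:10
    d2 = bsxfun(@plus, sum(descriptors.^2, 1)', sum(centers.^2, 1)) ...
         - 2 * (descriptors' * centers);      % N x K squared distances
    [~, assignment] = min(d2, [], 2);
    for k = 1:K
        members = descriptors(:, assignment == k);
        if ~isempty(members)
            centers(:, k) = mean(members, 2);
        end
    end
end

% Histogram for one (made-up) image: count how many of its descriptors
% fall into the bin of each codevector, then normalize.
imageDescriptors = [0.5 9 10.5 11;
                    0.5 9 10   12];
d2 = bsxfun(@plus, sum(imageDescriptors.^2, 1)', sum(centers.^2, 1)) ...
     - 2 * (imageDescriptors' * centers);
[~, words] = min(d2, [], 2);        % index of nearest codevector per descriptor
h = accumarray(words, 1, [K 1])';   % raw counts per bin
h = h / sum(h);                     % normalized histogram
disp(h);
```

With real data, a library k-means (VLFeat provides vl_kmeans) would likely replace the loop, and h would be the per-image feature vector handed to logistic regression.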
So for next week I'll focus on generating a codebook of SIFT features, creating feature histograms for each image, and running logistic regression on SIFT and HoC.
My MATLAB aside:
MATLAB can store images in 64-bit (double), 32-bit (single), 16-bit (uint16), or 8-bit (uint8) form. double is MATLAB's standard 64-bit representation of numeric data. Storing images as uint8 saves storage space, but many MATLAB operations are carried out on double data.
To convert an image I to a double:
I = im2double(I)
The same can be carried out with any of the image types:
I = im2uint8(I)
I = im2uint16(I)
I = im2single(I)
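One detail worth seeing in action: the im2* functions rescale the values as well as changing the class, whereas a plain cast does not. A minimal sketch, assuming im2double and im2single are available in your installation and using a made-up three-pixel "image":

```matlab
I8 = uint8([0 51 255]);     % tiny stand-in for an 8-bit image

Id    = im2double(I8);      % rescaled into [0, 1]: 0, 0.2, 1
Icast = double(I8);         % plain cast keeps the raw values: 0, 51, 255
Is    = im2single(I8);      % class single, also rescaled into [0, 1]

disp(class(Id));            % double
disp(class(Is));            % single
```

The difference matters when code assumes intensities in [0, 1]: double(I) on a uint8 image leaves values up to 255.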
Note that VLFeat asks for image input in single form.
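Putting the conversion together with feature extraction might look like the sketch below. It assumes VLFeat is already on the MATLAB path (via vl_setup) and uses a synthetic image rather than a real photo, so the keypoints it finds are not meaningful; the variable names are hypothetical.

```matlab
% Synthetic grayscale test image: a bright square on a dark background.
I = zeros(128, 128, 'uint8');
I(40:90, 40:90) = 255;

% vl_sift expects a single-precision grayscale image.
Is = im2single(I);
[frames, descriptors] = vl_sift(Is);

% frames:      4 x N, one column per keypoint: [x; y; scale; orientation]
% descriptors: 128 x N, one uint8 descriptor column per keypoint
fprintf('%d keypoints found\n', size(frames, 2));

% Clustering routines generally want floating-point input, so convert
% the uint8 descriptors before building a codebook.
descriptorsSingle = single(descriptors);
```

For a color photo, the image would first need to be reduced to grayscale (for example with rgb2gray) before the im2single call.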