Wednesday, May 2, 2012

SIFT histograms, EMD, and classification

I found a paper, Home Interior Classification Using SIFT Keypoint Histograms, in which the authors used SIFT histograms to classify indoor scenes. This was great because they had the same problem I have, and they implemented and compared a few different solutions.

For clarification, right now I'm playing around with a toy dataset of 40 images (10 from each of 4 genres).

The problem is that those 40 images yield about 26,000 SIFT keypoints, each with a 128-dimensional descriptor. One way to compare images would be to classify each keypoint individually and then label each image with whichever class the most of its keypoints fall into, but that's very computationally expensive.
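For concreteness, here's a minimal sketch of how a pooled keypoint set like that could be built with OpenCV's Python bindings. The images list of (path, genre) pairs is hypothetical, and the constructor name cv2.SIFT_create is the one in recent OpenCV versions (older builds spelled it differently):

```python
import cv2
import numpy as np

# images: hypothetical list of (filepath, genre_label) pairs for the 40 images
sift = cv2.SIFT_create()
descriptors, kp_image, kp_labels = [], [], []
for i, (path, genre) in enumerate(images):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    kp, des = sift.detectAndCompute(gray, None)  # des has shape (n_i, 128)
    descriptors.append(des)
    kp_image.extend([i] * len(kp))       # which image each keypoint came from
    kp_labels.extend([genre] * len(kp))  # that image's genre, per keypoint
descriptors = np.vstack(descriptors)     # pooled: roughly (26000, 128)
kp_image, kp_labels = np.array(kp_image), np.array(kp_labels)
```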

In the aforementioned Ayers and Boutell paper, the best solution they found to this problem was to reduce the dimensionality of the SIFT descriptors from 128 to c-1 (where c is the number of classes) using Fisher's Linear Discriminant, build histograms from the lower-dimensional descriptors, and then classify the images using AdaBoost.

I thought I'd start there, so I used Linear Discriminant Analysis (LDA), which finds the projection onto a (c-1)-dimensional subspace that maximizes the between-class variance and minimizes the within-class variance, to reduce the dimensionality of the keypoint descriptors. I then made histograms with 3 bins for each dimension and ran AdaBoost on the results. It gets reasonable output right now; I just need to make sure I'm using LDA correctly, do cross-validation, and play around with the histogram binning.
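A minimal sketch of that pipeline, assuming scikit-learn (the estimator names and the 50-round AdaBoost are illustrative choices, not my exact setup) and reusing the arrays from the snippet above:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier

n_classes = 4
# Project the 128-dim descriptors down to c-1 = 3 dimensions
lda = LinearDiscriminantAnalysis(n_components=n_classes - 1)
projected = lda.fit_transform(descriptors, kp_labels)

# Shared bin edges per dimension so histograms are comparable across images
edges = [np.histogram_bin_edges(projected[:, d], bins=3)
         for d in range(projected.shape[1])]

def image_histogram(points):
    """Concatenate a 3-bin histogram of each LDA dimension,
    normalized by the image's keypoint count."""
    feats = [np.histogram(points[:, d], bins=edges[d])[0] / len(points)
             for d in range(points.shape[1])]
    return np.concatenate(feats)

n_images = kp_image.max() + 1
X = np.array([image_histogram(projected[kp_image == i])
              for i in range(n_images)])
y = np.array([kp_labels[kp_image == i][0] for i in range(n_images)])
clf = AdaBoostClassifier(n_estimators=50).fit(X, y)
```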

A problem I've run into with AdaBoost is that it gets very, very slow with ~250 features as input. Does anybody know the time complexity of AdaBoost? With decision stumps as the weak learner, I'd expect each boosting round to scan every feature over every sample, so training should grow roughly with rounds × features × samples (plus any per-round sorting).

In this case, AdaBoost eliminates the need to compare histograms directly. If I do compare them directly (which I probably will at some point), Serge pointed me in the direction of Earth Mover's Distance (EMD) as a distance metric between histograms. It's better than the histogram intersection method I previously used for color histograms because it takes the distance between bins (local bin similarity) into account.
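For 1-D histograms over the same equal-width bins, EMD has a neat closed form: it's just the L1 distance between the two cumulative distributions. A minimal sketch:

```python
import numpy as np

def emd_1d(p, q, bin_width=1.0):
    """EMD between two 1-D histograms over the same bins: the total
    mass-times-distance 'work' needed to reshape p into q, which in 1-D
    reduces to summing |CDF(p) - CDF(q)| over the bins."""
    p = np.asarray(p, float) / np.sum(p)  # normalize so total mass matches
    q = np.asarray(q, float) / np.sum(q)
    return np.abs(np.cumsum(p - q)).sum() * bin_width
```

As a sanity check, histogram intersection treats [1,0,0] vs. [0,1,0] the same as [1,0,0] vs. [0,0,1] (zero overlap either way), while emd_1d returns 1 and 2 respectively, reflecting how far the mass actually has to move.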

Something I could try with EMD is to use it as the distance in k-means to get cluster centers, and then use a nearest-mean classifier.
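A sketch of what that might look like, reusing emd_1d from above. One caveat I'd have to deal with: the k-means update step under EMD isn't a simple average (a true EMD barycenter is harder to compute), so taking the arithmetic mean of each cluster here is only an approximation:

```python
import numpy as np

def kmeans_emd(hists, k, n_iter=20, seed=0):
    """Lloyd-style k-means on histograms: assignment uses EMD, while the
    update step takes the arithmetic mean as a stand-in for an EMD centroid."""
    rng = np.random.default_rng(seed)
    centers = hists[rng.choice(len(hists), size=k, replace=False)]
    for _ in range(n_iter):
        assign = np.array([min(range(k), key=lambda j: emd_1d(h, centers[j]))
                           for h in hists])
        centers = np.array([hists[assign == j].mean(axis=0)
                            if np.any(assign == j) else centers[j]
                            for j in range(k)])
    return centers

def nearest_mean(hist, class_means):
    """Assign hist to the label whose mean histogram is closest under EMD."""
    return min(class_means, key=lambda label: emd_1d(hist, class_means[label]))
```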

Histograms, histograms everywhere!

Preliminary classification results will definitely be out next week.
