Wednesday, May 16, 2012

Classification Fail

Classification Results


The only somewhat promising classification result I obtained was using color histograms to classify by genre. The error rate is around .5 (at least better than chance, which would be .75 for 4 classes!), and the area under the ROC curve was highest for DietFitness (~.8), second highest for History (~.75), and lowest for Romance and SciFiFantasy (~.65).
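For reference, a per-class area under the ROC curve can be computed one-vs-rest directly from classifier scores. This is only a minimal sketch, not the code behind the numbers above; the scores and labels are made-up values.

```matlab
% One-vs-rest AUC as the probability that a random positive outranks a random negative.
% scores and isPos are hypothetical example values, not data from this project.
scores = [0.9 0.8 0.7 0.6 0.55 0.4]';    % classifier confidence for one class (e.g. DietFitness)
isPos  = logical([1 1 0 1 0 0]');         % true where the image really belongs to that class

sp = scores(isPos);                       % scores of the positive examples
sn = scores(~isPos);                      % scores of the negative examples

% Compare every positive score against every negative score; ties count as half.
wins = bsxfun(@gt, sp, sn');
ties = bsxfun(@eq, sp, sn');
auc  = (sum(wins(:)) + 0.5 * sum(ties(:))) / (numel(sp) * numel(sn));
```

An AUC of .5 means the scores rank positives and negatives no better than a coin flip, while 1 means every positive outranks every negative.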



AdaBoost classification using LDA-reduced SIFT histograms performs at around chance. For classification by both rating and genre, the error rate (number of misclassified images / total number of images) never goes below .7 on testing data, where chance would be .75 for 4 classes. The area under the ROC curve also hovers around .5 for all classes.
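The error rate and chance level quoted here can be sketched as below. The labels are invented for illustration, and the chance level assumes uniform random guessing over the classes.

```matlab
% Error rate and confusion matrix for a k-class problem.
% truth and pred are hypothetical labels in 1..k, not results from this project.
k     = 4;
truth = [1 2 3 4 1 2 3 4]';
pred  = [1 2 1 1 3 3 3 2]';

errRate = mean(pred ~= truth);            % misclassified images / total images
chance  = 1 - 1/k;                        % expected error when guessing uniformly at random

% C(i, j) counts images of true class i that were predicted as class j.
C = accumarray([truth pred], 1, [k k]);
```

The diagonal of C holds the correct predictions, so the error rate is also 1 - trace(C) / sum(C(:)).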

I took a look at the LDA-projected data, and it's clear that the SIFT histograms do not separate the classes well (whether the classes are defined by genre or by rating).

Figure 1: First two dimensions of LDA-reduced SIFT descriptors (color-coded by rating)


Figure 2: First two dimensions of LDA-reduced SIFT descriptors (color-coded by genre)

This could mean either that (a) SIFT features are not doing a good job of capturing class differences, or (b) there are no clear class differences in the cover images.

In support of (b), I also tried AdaBoost with the color histograms as feature input and rating as the class label, with very similar results: ~.7 error rate on testing data and area under the ROC curve ~.5 for all classes.


To Do


- Get actual ROC curves / confusion matrices for classification
- Take a look at which colors best discriminate between genres
- Try classification by year/number of reviews
- Suggestions?

Matlab Tip


Check it out:


keyboard

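A life-changing debugging tool: put it in a script or function and execution pauses at that line, giving you a command prompt inside that workspace (type dbcont to resume).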


Preprocessing


SIFT descriptors were extracted using VLFeat and then reduced to k-1 dimensions (where k = number of classes) using Linear Discriminant Analysis (LDA). The LDA output was then binned into histograms, which form the final features for input to AdaBoost.
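The LDA reduction step can be sketched roughly as follows. This is a textbook Fisher LDA on random stand-in data, not the exact code used here; the sizes, the variable names, and the assumption that each descriptor carries its image's class label are all illustrative. In the real pipeline the rows of X would be the 128-dimensional descriptors returned by VLFeat's vl_sift.

```matlab
% Fisher LDA on stand-in data: project d-dimensional descriptors to k-1 dimensions.
% Sizes and data are hypothetical; real SIFT descriptors are 128-dimensional.
k = 4; d = 8; nPerClass = 50;
X = zeros(k * nPerClass, d);
y = zeros(k * nPerClass, 1);
for c = 1:k
    rows = (c - 1) * nPerClass + (1:nPerClass);
    X(rows, :) = randn(nPerClass, d) + repmat(3 * randn(1, d), nPerClass, 1);  % random class mean
    y(rows)    = c;
end

mu = mean(X, 1);
Sw = zeros(d);                            % within-class scatter
Sb = zeros(d);                            % between-class scatter
for c = 1:k
    Xc  = X(y == c, :);
    muc = mean(Xc, 1);
    Xc0 = Xc - repmat(muc, size(Xc, 1), 1);
    Sw  = Sw + Xc0' * Xc0;
    Sb  = Sb + size(Xc, 1) * (muc - mu)' * (muc - mu);
end

% Generalized eigenproblem Sb*v = lambda*Sw*v; keep the k-1 leading directions.
[V, D] = eig(Sb, Sw);
[~, order] = sort(real(diag(D)), 'descend');
P = real(V(:, order(1:k-1)));             % projection matrix, d x (k-1)
Z = X * P;                                % projected data, one row per descriptor
```

Since the between-class scatter has rank at most k-1, there are no more than k-1 useful directions, which is why the output has k-1 dimensions.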


Labels


Rating


Based on the distribution of ratings for my set of book cover images (see Figure 3), I chose 4 classes:

1) "Bad" rating <= 3.5
2) "Okay" 3.5 < rating <= 4
3) "Good" 4 < rating <= 4.5
4) "Great" rating > 4.5
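The thresholds above map to class labels as in this small sketch. The example ratings are invented, and the variable names are only for illustration.

```matlab
% Map average ratings to the 4 rating classes using the thresholds listed above.
% ratings is a hypothetical example vector.
edges   = [3.5 4 4.5];                    % upper bounds of "Bad", "Okay", "Good"
names   = {'Bad', 'Okay', 'Good', 'Great'};
ratings = [3.2 3.5 3.9 4.0 4.3 4.5 4.8]';

% Label = 1 + number of thresholds the rating strictly exceeds,
% so a rating exactly on a threshold stays in the lower class.
labels     = 1 + sum(bsxfun(@gt, ratings, edges), 2);
labelNames = names(labels);
```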



Genre

I'm still working with 4 different genres:

1) Romance
2) History
3) DietFitness
4) SciFiFantasy


Histogram Construction


Since the projected SIFT data may not follow any parametric distribution, I chose the histogram bin width W from the interquartile range (IQR) of the data being binned and the number of samples N (the average number of SIFT descriptors per image is ~700). This is the Freedman-Diaconis rule:


$$W = 2 \cdot \mathrm{IQR} \cdot N^{-1/3}$$
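As a rough sketch of how this bin width turns into a histogram, the snippet below works through one projected dimension. The data is a placeholder rather than real projected descriptors, and the quartiles are computed by hand so that no toolbox is needed.

```matlab
% Bin width W = 2 * IQR * N^(-1/3) for one projected dimension.
% x is a hypothetical stand-in for the LDA-projected descriptor values of one image.
N  = 700;                                 % roughly the average descriptor count per image
x  = (1:N)' / N;                          % placeholder data, evenly spread over (0, 1]

% Quartiles by linear interpolation between order statistics.
xs   = sort(x);
pos  = N * [0.25 0.75] + 0.5;             % fractional ranks of the 25th and 75th percentiles
q    = interp1((1:N)', xs, pos);
iqrX = q(2) - q(1);

W     = 2 * iqrX * N^(-1/3);              % bin width
nBins = ceil((max(x) - min(x)) / W);      % bins needed to cover the data range

% Assign each value to a bin and count; the maximum is clamped into the last bin.
idx    = min(floor((x - min(x)) / W) + 1, nBins);
counts = accumarray(idx, 1, [nBins 1]);
```

With N = 700, N^(-1/3) is about 0.113, so the bin width comes out to roughly 0.23 times the IQR.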





