After initially attempting to write my own color histogram code (wouldn't recommend it), I found some effective code (getPatchHist.m) from a computer vision source code website, the method of which matched what I'd seen elsewhere. Essentially RGB values for each image are transformed into one value using a weighted sum of RGB values at each pixel. These values are then counted into a series of bins for each image.
Using 16^3 bins, the color histogram for Adora looks like:
The Histogram Intersection between each image was then computed by taking the sum of the intersection between all the color bins of each image, followed by summing and normalizing them.
Taking a look at the histogram intersections using imagesc, we can double-check that histograms intersect completely with themselves (diagonal values are 1). We also see that most color histograms are only distantly related, with a few that are very similar.
Running k-means on this intersection matrix, using 4 clusters and taking the closest 8 images from each cluster center, we get the following:
Cluster 1
Big black areas: 33% Romance, 37% History, 13% DietFitness, 17% SciFiFantasy
Cluster 2
Pastel colors: 24% Romance, 28% History, 36% DietFitness, 12% SciFiFantasy
Cluster 3
White with font: 11% Romance, 28% History, 56% DietFitness, 1% SciFiFantasy
Cluster 4
Bright colors: 29% Romance, 24% History, 21% DietFitness, 27% SciFiFantasy
Cluster Visualization
Taking the first 2 principle components of the histogram intersection matrix, and color-coding them according to cluster, we get:
Goals for Wed
- get familiar with VLfeat for more features
Using 16^3 bins, the color histogram for Adora looks like:
The Histogram Intersection between each image was then computed by taking the sum of the intersection between all the color bins of each image, followed by summing and normalizing them.
Taking a look at the histogram intersections using imagesc, we can double-check that histograms intersect completely with themselves (diagonal values are 1). We also see that most color histograms are only distantly related, with a few that are very similar.
Running k-means on this intersection matrix, using 4 clusters and taking the closest 8 images from each cluster center, we get the following:
Cluster 1
Big black areas: 33% Romance, 37% History, 13% DietFitness, 17% SciFiFantasy
Cluster 2
Pastel colors: 24% Romance, 28% History, 36% DietFitness, 12% SciFiFantasy
Cluster 3
White with font: 11% Romance, 28% History, 56% DietFitness, 1% SciFiFantasy
Cluster 4
Bright colors: 29% Romance, 24% History, 21% DietFitness, 27% SciFiFantasy
Cluster Visualization
Taking the first 2 principle components of the histogram intersection matrix, and color-coding them according to cluster, we get:
Goals for Wed
- get familiar with VLfeat for more features









No comments:
Post a Comment