DOI: 10.14704/nq.2018.16.6.1612

Automatic Image Annotation via Combining Low-level Colour Feature with Features Learned from Convolutional Neural Networks

Yi Lin, Honggang Zhang


In this paper, we propose a feature combination approach to annotating and retrieving images. In addition to low-level colour features extracted from the original images, we use features learned by convolutional neural networks (CNNs). We find that these two feature sets are complementary for automatic image annotation (AIA). On both the single-label CIFAR-10 and the multi-label COREL-5K AIA tasks, the CNN-learned features perform slightly better than the low-level colour features. Finally, when the two feature sets are combined as input to deep neural network-based AIA systems, we obtain the best performance in both cases.
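The combination step described in the abstract — concatenating a low-level colour descriptor with CNN-learned features to form the input of the annotation classifier — can be sketched as below. This is an illustrative sketch, not the authors' exact pipeline: the per-channel histogram binning, the 512-dimensional stand-in CNN feature, and the function names are all assumptions made for the example.

```python
import numpy as np

def colour_histogram(image, bins=8):
    """Low-level colour feature: a per-channel intensity histogram,
    normalised so each channel's histogram sums to 1.
    (Binning choice is illustrative, not taken from the paper.)"""
    feats = []
    for c in range(image.shape[-1]):
        hist, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())
    return np.concatenate(feats)

def combine_features(colour_feat, cnn_feat):
    """Feature combination by simple concatenation; the joint vector
    would be fed to the DNN-based annotation system."""
    return np.concatenate([colour_feat, cnn_feat])

# Toy example: a random 32x32 RGB image (CIFAR-10 size) and a stand-in
# CNN feature vector (e.g. a penultimate-layer activation of assumed
# dimension 512).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3))
cnn_feat = rng.standard_normal(512)

combined = combine_features(colour_histogram(image), cnn_feat)
print(combined.shape)  # (536,): 3 channels x 8 bins + 512
```

Concatenation is the simplest fusion strategy; because the two descriptors are complementary, even this naive joint vector lets the downstream classifier exploit information that neither feature set carries alone.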


Automatic Image Annotation, Deep Learning, Convolutional Neural Networks, Feature Combination




Barnard K, Duygulu P, Forsyth D, Freitas ND, Blei DM, Jordan MI. Matching words and pictures. Journal of Machine Learning Research 2003; 3(2): 1107–35.

Duygulu P, Barnard K, Freitas JFGD, Forsyth DA. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. European Conference on Computer Vision 2002; 2353: 97-112.

Globerson A, Roweis ST. Metric learning by collapsing classes. Advances in Neural Information Processing Systems 2006: 451-58.

Grangier D, Bengio S. A discriminative kernel based approach to rank images from text queries. IEEE transactions on Pattern Analysis and Machine Intelligence 2008; 30(8): 1371-84.

Guillaumin M, Mensink T, Verbeek J, Schmid C. TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation. IEEE International Conference on Computer Vision (ICCV) 2009: 309-16.

Gupta A, Verma Y, Jawahar CV. Choosing linguistics over vision to describe images. AAAI Conference on Artificial Intelligence 2012; 5(1): 606-12.

He X, Zemel RS. Multiscale conditional random fields for image labeling. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2004: 695-703.

Jeon J, Lavrenko V, Manmatha R. Automatic image annotation and retrieval using cross-media relevance models. Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval 2003: 119-26.

Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 2012: 1097-105.

Liu J, Li M, Liu Q, Lu H, Ma S. Image annotation via graph learning. Pattern Recognition 2009; 42(2): 218-28.

Makadia A, Pavlovic V, Kumar S. A new baseline for image annotation. European Conference on Computer Vision 2008; 5304: 316-29.

McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 1943; 5(4): 115-33.

Murthy VN, Can EF, Manmatha R. A hybrid model for automatic image annotation. International Conference on Multimedia Retrieval 2014; 2014: 369-76.

Nakayama H. Linear distance metric learning for large-scale generic image recognition. PhD thesis, The University of Tokyo, Japan, 2011.

Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986; 323(6088): 533-38.

Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014.

Zhang S, Huang J, Huang Y. Automatic image annotation using group sparsity. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2010: 3312-19.


| NeuroScience + QuantumPhysics> NeuroQuantology :: Copyright 2001-2018