Volume 20 No 12 (2022)
FILTERING INSTAGRAM HASHTAGS THROUGH CROWD TAGGING AND THE HITS ALGORITHM
CHAMALA CHINNA SIVAKUMAR REDDY, KAMBHAM SALIVAHANA REDDY
Abstract
Instagram is a rich source for mining descriptive tags for images and multimedia in general. The
tag-image pairs can be used to train automatic image annotation (AIA) systems in accordance with the
learning-by-example paradigm. In previous studies, we concluded that, on average, 20% of
Instagram hashtags are related to the actual visual content of the image they accompany, i.e., they are
descriptive hashtags, while many others are irrelevant, i.e., stop-hashtags, which are used
across totally different images just to gather clicks and enhance searchability. In this paper,
we present a novel methodology, based on the principles of collective intelligence, that helps in locating
those hashtags. In particular, we show that applying a modified version of the well-known
hyperlink-induced topic search (HITS) algorithm in a crowd tagging context provides an effective and
consistent way of finding pairs of Instagram images and hashtags that lead to representative and
noise-free training sets for content-based image retrieval. As a proof of concept, we used the
crowdsourcing platform Figure-eight to gather collective intelligence in the form of tag selection
(crowd tagging) for Instagram hashtags. The crowd tagging data from Figure-eight are used to form
bipartite graphs in which the first type of node corresponds to the annotators and the second type to
the hashtags they selected. The HITS algorithm is first used to rank the annotators in terms of their
effectiveness in the crowd tagging task and then to identify the right hashtags per image.
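
The bipartite ranking described above can be illustrated with a minimal sketch. It runs the standard HITS power iteration, not the modified version the paper refers to, whose details are not given here. Annotators play the role of hubs and hashtags the role of authorities. The function name `hits_bipartite`, the annotator identifiers, and the sample hashtags are all hypothetical and chosen only for illustration.

```python
import math

def hits_bipartite(selections, iterations=50):
    """Standard HITS on an annotator-hashtag bipartite graph.

    selections: dict mapping each annotator to the set of hashtags
    that annotator selected for one image (hypothetical input format).
    Returns (hub_scores, authority_scores) as dicts.
    """
    annotators = sorted(selections)
    hashtags = sorted({tag for tags in selections.values() for tag in tags})
    hub = {a: 1.0 for a in annotators}
    auth = {t: 1.0 for t in hashtags}

    for _ in range(iterations):
        # Authority update: a hashtag scores high when good annotators pick it.
        auth = {t: sum(hub[a] for a in annotators if t in selections[a])
                for t in hashtags}
        norm = math.sqrt(sum(v * v for v in auth.values()))
        auth = {t: v / norm for t, v in auth.items()}

        # Hub update: an annotator scores high when picking well-supported hashtags.
        hub = {a: sum(auth[t] for t in selections[a]) for a in annotators}
        norm = math.sqrt(sum(v * v for v in hub.values()))
        hub = {a: v / norm for a, v in hub.items()}

    return hub, auth

# Hypothetical crowd tagging data for a single image.
selections = {
    "annotator_1": {"#dog", "#puppy"},
    "annotator_2": {"#dog", "#puppy", "#park"},
    "annotator_3": {"#dog", "#likeforlike"},
}

hub_scores, authority_scores = hits_bipartite(selections)
ranked_hashtags = sorted(authority_scores, key=authority_scores.get, reverse=True)
ranked_annotators = sorted(hub_scores, key=hub_scores.get, reverse=True)
print(ranked_hashtags)
print(ranked_annotators)
```

In this sketch a single run yields both rankings: the hub scores order the annotators and the authority scores order the candidate hashtags. How the two scores are combined or thresholded to select the final hashtags per image is left open, since the abstract does not specify it.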
Keywords
HITS, Hashtags, CNN, ML
Copyright
Copyright © Neuroquantology
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Articles published in Neuroquantology are available under the Creative Commons Attribution Non-Commercial No Derivatives Licence (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.