Volume 20 No 8 (2022)
Multi-label Text Document Classification using Fuzzy Rough Set based on Robust Nearest Neighbor Method
Bichitranada Behera, G. Kumaravelan, V.K. Mixymol
Abstract
Multi-label text document classification is more challenging than multi-class classification because it requires
simultaneously assigning many class labels to unlabeled text documents rather than just one. The development of an
efficient method for multi-label classification is therefore in continual demand. The fuzzy rough set (FRS), meanwhile, is
an effective mathematical tool for handling uncertain data, and it has numerous applications in classification,
dimensionality reduction, and feature selection. The Fuzzy Rough Set Based on Robust Nearest Neighbor (FRS-RNN)
approach has been demonstrated in the literature to be a suitable classification tool for multi-class classification on
both real-valued and text datasets. Because FRS-RNN has not yet been studied on multi-label text datasets, this paper proposes ML-FRSRNN, a new multi-label algorithm based on FRS-RNN. In
general, the process of classifying text documents has two crucial steps: extracting features and building a classifier
model. For feature extraction, techniques based on TF-IDF and convolutional neural networks (CNNs) are mainly used. Of the two, the CNN
provides the better feature engineering because it pre-processes documents well and uses pre-trained word embeddings to
represent them. Using the CNN structure for feature extraction, the proposed ML-FRSRNN has been effectively
applied to classify multi-label text documents. The classification performance of the proposed method is evaluated and
compared with state-of-the-art multi-label machine learning (ML) models such as Binary Relevance (BR), Label Power
Set (LP), Classifier Chains (CC), and ML-KNN using well-defined multi-label classification metrics, including Hamming loss,
recall, precision, subset accuracy, and F1 score. The proposed ML-FRSRNN outperforms the aforementioned multi-label ML models according to the experimental findings and empirical evaluation.
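As a rough illustration of the evaluation metrics named in the abstract, the following minimal sketch computes Hamming loss, subset accuracy, and precision, recall, and F1 on binary label matrices. It is not the authors' evaluation code. The function names, the toy matrices `Y_TRUE` and `Y_PRED`, and the choice of micro-averaging are assumptions made here for illustration; the abstract does not say which averaging scheme was used.

```python
# Hypothetical sketch of common multi-label metrics (not the paper's code).
# Each row is one document; each column is one label (1 = assigned, 0 = not).


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true, y_pred):
    """Fraction of documents whose whole label set is predicted exactly."""
    exact = sum(
        list(true_row) == list(pred_row)
        for true_row, pred_row in zip(y_true, y_pred)
    )
    return exact / len(y_true)


def micro_precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 pooled over all label slots.

    Micro-averaging is an assumption of this sketch.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy data (invented for illustration): 3 documents, 3 labels.
Y_TRUE = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
Y_PRED = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]

if __name__ == "__main__":
    print("Hamming loss:", hamming_loss(Y_TRUE, Y_PRED))
    print("Subset accuracy:", subset_accuracy(Y_TRUE, Y_PRED))
    print("Micro P/R/F1:", micro_precision_recall_f1(Y_TRUE, Y_PRED))
```

Lower is better for Hamming loss, while higher is better for the other metrics. Subset accuracy is the strictest of the group, since a document counts as correct only when every one of its labels matches.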
Keywords
Multi-label text Classification, Convolutional Neural Network, Fuzzy Rough Set, Natural Language Processing, Machine Learning
Copyright
Copyright © Neuroquantology
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Articles published in the Neuroquantology are available under Creative Commons Attribution Non-Commercial No Derivatives Licence (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.