Volume 21 No 6 (2023)
Telugu Dialect Identification Using Machine Learning Models with Cross-Validation: An Automated Approach to Preserving Linguistic Diversity
S. Shiva Prasad, Dr. Ramu Vankudoth, Dr. V. Madhukar, Dr. Hanmanthu Bhukya, Chevuru Madhu Babu
Abstract
Telugu, one of the major Dravidian languages spoken in South India, exhibits a rich diversity of dialects that vary across regions in pronunciation, vocabulary, grammar, and sentence structure. Understanding and identifying these dialects plays a crucial role in linguistic research, cultural preservation, and language planning. This study focuses on the identification of Telugu dialects using several machine learning models, namely Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Naive Bayesian, Random Forest, and Gradient Boosting. To accomplish this, we developed a comprehensive database of Telugu dialects consisting of diverse speech samples from different regions. The data was pre-processed, and relevant features were extracted to train the machine learning models. Each model was trained using the algorithms and techniques appropriate to it, and the results were analyzed under different cross-validation settings. Among the evaluated models, SVM exhibited the highest accuracy in identifying Telugu dialects.
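The evaluation procedure summarized above, training a classifier and scoring it with cross-validation, can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the data are synthetic Gaussian clusters standing in for extracted speech features, the dialect labels are placeholders, the fold count and the value of k are arbitrary, and only a simple K-Nearest Neighbors classifier is shown rather than the full set of models.

```python
import random
from collections import Counter
from math import dist


def make_synthetic_data(n_per_class=30, n_features=4, seed=0):
    """Hypothetical stand-in for extracted speech features:
    one Gaussian cluster per placeholder dialect label."""
    rng = random.Random(seed)
    labels = ["dialect_a", "dialect_b", "dialect_c"]  # placeholder names
    data = []
    for idx, label in enumerate(labels):
        center = [3.0 * idx] * n_features
        for _ in range(n_per_class):
            features = [rng.gauss(c, 1.0) for c in center]
            data.append((features, label))
    rng.shuffle(data)
    return data


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(data, n_folds=5, k=3):
    """Return one accuracy per fold; every sample is tested exactly once."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        correct = sum(knn_predict(train, x, k) == y for x, y in test_fold)
        scores.append(correct / len(test_fold))
    return scores


data = make_synthetic_data()
scores = k_fold_accuracy(data, n_folds=5, k=3)
mean_accuracy = sum(scores) / len(scores)
print(f"fold accuracies: {scores}")
print(f"mean accuracy: {mean_accuracy:.3f}")
```

Comparing several models under this scheme would amount to swapping `knn_predict` for another classifier while keeping the same folds, so that each model is scored on identical held-out samples.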
Keywords
Telugu, dialects, SVM, KNN, Naive Bayesian, Random Forest, Gradient Boosting, dialect identification, linguistic variations.
Copyright
Copyright © Neuroquantology

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Articles published in Neuroquantology are available under the Creative Commons Attribution Non-Commercial No Derivatives Licence (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.