Volume 18 No 8 (2020)
Developing a Data Mining Platform in a Big Data Environment: Integration of R Language and Hadoop
K.Sindhuja, S.Saroja Devi, S.Madeline Arockiya Shiney
Abstract
As data volumes grow exponentially across industries, effective data mining platforms have become essential. This research constructs a data mining platform for processing large-scale, diverse data sets. The platform combines the R language with the scalability and processing capabilities of Hadoop. The system architecture consists of four layers: a physical layer, a virtualization layer, a service layer, and an application layer. Heterogeneous hardware resources are deployed at the physical layer, and virtual machines are created and managed with CloudStack at the virtualization layer. The service layer integrates the R language, which implements the platform's data mining functions. The application layer gives users an interface for customizing flow paths and configuring parameters. The proposed method processes big data effectively, supports comprehensive data analysis, and achieves high processing efficiency.
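To make the relationship between the application layer and the service layer more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a flow path can be modeled as an ordered list of named steps with parameters, and that the service layer can be modeled as a registry of functions. The names `SERVICE_LAYER`, `run_flow`, `drop_outliers`, and `scale` are hypothetical and do not appear in the paper. In the described platform the steps would be R routines executed on Hadoop rather than in-process Python functions.

```python
from typing import Any, Callable, Dict, List, Tuple

# A step takes the current data and a parameter dictionary and returns new data.
Step = Callable[[List[float], Dict[str, Any]], List[float]]


def drop_outliers(data: List[float], params: Dict[str, Any]) -> List[float]:
    """Hypothetical step: keep values whose magnitude is within 'limit'."""
    limit = params.get("limit", 100.0)
    return [x for x in data if abs(x) <= limit]


def scale(data: List[float], params: Dict[str, Any]) -> List[float]:
    """Hypothetical step: multiply every value by 'factor'."""
    factor = params.get("factor", 1.0)
    return [x * factor for x in data]


# Hypothetical stand-in for the service layer: a registry of mining functions.
SERVICE_LAYER: Dict[str, Step] = {
    "drop_outliers": drop_outliers,
    "scale": scale,
}


def run_flow(data: List[float],
             flow: List[Tuple[str, Dict[str, Any]]]) -> List[float]:
    """Apply each configured step in order, as a user-defined flow path might."""
    for name, params in flow:
        step = SERVICE_LAYER[name]  # raises KeyError for an unknown step name
        data = step(data, params)
    return data


if __name__ == "__main__":
    # A user-configured flow: filter first, then rescale.
    flow = [("drop_outliers", {"limit": 10.0}), ("scale", {"factor": 2.0})]
    print(run_flow([1.0, 50.0, 3.0], flow))  # prints [2.0, 6.0]
```

Keeping the flow as plain data (step names plus parameters) is one way an interface could let users reorder steps or change parameters without touching the underlying functions.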
Keywords
Data mining platform, Big data, R language, Hadoop, CloudStack, Scalability, Processing efficiency
Copyright
Copyright © Neuroquantology
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Articles published in Neuroquantology are available under the Creative Commons Attribution-NonCommercial-NoDerivatives License (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.