Volume 20 No 10 (2022)
A Comparative Study Enhancing XML Document Clustering Performance with Delta Compression
Wanjari Ravindra Shankar, Dr. F. Rahman
Abstract
This work presents a time-efficient method for document clustering that uses compressed delta representations to update clustering solutions without fully decompressing documents. The proposed approach re-evaluates pairwise distances between documents using the distances known before modification and the set of changes stored in the compressed delta. The technique is implemented in Java and tested on XML documents of different sizes from a data source with an average depth of four levels. Rather than decompressing the changes that produce document versions, the method keeps them stored in the compressed delta. Test results show that the proposed method outperforms FDC in time efficiency, particularly when evaluating new distances between document versions, and that the compression method considerably reduces document sizes. According to the findings, the proposed approach significantly reduces processing time without sacrificing clustering precision.
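The core idea of re-evaluating a pairwise distance from the previously known distance plus a set of changes can be illustrated with a minimal Java sketch. The abstract does not specify the distance measure or the delta format, so everything here is an assumption made for illustration: each document is summarized as a map from element path to occurrence count, the distance is the L1 distance between those counts, and a delta is a map from path to a signed count change. The class and method names are hypothetical and are not taken from the paper.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: the path-count summary, the L1 distance, and the
// delta format are assumptions, not the paper's actual data structures.
final class DeltaDistanceSketch {

    // Full recomputation: L1 distance between two path-count summaries.
    static int fullDistance(Map<String, Integer> a, Map<String, Integer> b) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            sum += Math.abs(e.getValue() - b.getOrDefault(e.getKey(), 0));
        }
        for (Map.Entry<String, Integer> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += Math.abs(e.getValue());
            }
        }
        return sum;
    }

    // Incremental update: start from the known distance and adjust only the
    // terms for paths that appear in the delta applied to document a.
    static int updatedDistance(int oldDistance,
                               Map<String, Integer> a,
                               Map<String, Integer> b,
                               Map<String, Integer> delta) {
        int distance = oldDistance;
        for (Map.Entry<String, Integer> e : delta.entrySet()) {
            int before = a.getOrDefault(e.getKey(), 0);
            int after = before + e.getValue();
            int other = b.getOrDefault(e.getKey(), 0);
            distance += Math.abs(after - other) - Math.abs(before - other);
        }
        return distance;
    }

    // Applies a delta to a summary; used here only to cross-check the update.
    static Map<String, Integer> applyDelta(Map<String, Integer> a,
                                           Map<String, Integer> delta) {
        Map<String, Integer> result = new HashMap<>(a);
        for (Map.Entry<String, Integer> e : delta.entrySet()) {
            result.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = Map.of("/a/b", 2, "/a/c", 1);
        Map<String, Integer> b = Map.of("/a/b", 1, "/a/d", 3);
        Map<String, Integer> delta = Map.of("/a/c", -1, "/a/d", 2);

        int oldDistance = fullDistance(a, b);
        int incremental = updatedDistance(oldDistance, a, b, delta);
        int recomputed = fullDistance(applyDelta(a, delta), b);

        System.out.println("old=" + oldDistance
                + " incremental=" + incremental
                + " recomputed=" + recomputed);
    }
}
```

Under these assumptions the update visits only the paths named in the delta, so its cost grows with the size of the change rather than with the size of the documents. This is the kind of saving the abstract attributes to working from compressed deltas.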
Keywords
Documents, Clustering, Time, Compression, Efficiency
Copyright
Copyright © Neuroquantology

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Articles published in Neuroquantology are available under the Creative Commons Attribution Non-Commercial No Derivatives Licence (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.