A comparison of tree data structures in the streaming data clustering issue

dc.authorid: SENOL, Ali/0000-0003-0364-2837
dc.authorid: KAYA, Mahmut/0000-0002-7846-1769
dc.contributor.author: Senol, Ali
dc.contributor.author: Kaya, Mahmut
dc.contributor.author: Canbay, Yavuz
dc.date.accessioned: 2024-12-24T19:30:23Z
dc.date.available: 2024-12-24T19:30:23Z
dc.date.issued: 2024
dc.department: Siirt Üniversitesi
dc.description.abstract: Processing streaming data is challenging because time and resources are limited. Clustering is an efficient technique for analyzing this kind of data. This study proposes two new streaming data clustering algorithms, BT-AR Stream and VP-AR Stream, inspired by the KD-AR Stream clustering algorithm [32]. The proposed algorithms use the Ball-Tree and Vantage-Point Tree data structures, respectively, instead of the KD-Tree. To assess their efficiency, we tested the algorithms on 18 benchmark datasets in terms of clustering quality and runtime performance, and then compared the results with those of the KD-AR Stream algorithm. According to the results, the BT-AR Stream algorithm was the most successful in terms of both clustering quality and runtime performance, as illustrated in Figure A.
Purpose: This study analyzes and compares the efficiency of tree data structures in data stream clustering, in terms of both clustering quality and runtime performance.
Theory and Methods: To compare the efficiency of tree data structures in data stream clustering, we replaced the KD-Tree used by KD-AR Stream with the Ball-Tree and the Vantage-Point Tree, which yielded two new stream clustering algorithms named BT-AR Stream and VP-AR Stream. We tested the three algorithms on 18 benchmark datasets and compared them in terms of clustering quality and runtime performance.
Results: According to the experimental results, the BT-AR Stream algorithm, which uses the Ball-Tree, was the most successful in both clustering quality and runtime performance on KDD, a high-dimensional dataset. On the other datasets, the clustering quality of all algorithms was good.
Conclusion: Although the clustering quality of all three algorithms was good, the BT-AR Stream algorithm was the most successful overall because it performed best on the high-dimensional KDD dataset. It was also the fastest of the three algorithms.
dc.identifier.doi: 10.17341/gazimmfd.1144533
dc.identifier.endpage: 231
dc.identifier.issn: 1300-1884
dc.identifier.issn: 1304-4915
dc.identifier.issue: 1
dc.identifier.scopus: 2-s2.0-85174177582
dc.identifier.scopusquality: Q2
dc.identifier.startpage: 217
dc.identifier.trdizinid: 1249335
dc.identifier.uri: https://doi.org/10.17341/gazimmfd.1144533
dc.identifier.uri: https://search.trdizin.gov.tr/tr/yayin/detay/1249335
dc.identifier.uri: https://hdl.handle.net/20.500.12604/7516
dc.identifier.volume: 39
dc.identifier.wos: WOS:001058089000018
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.indekslendigikaynak: TR-Dizin
dc.language.iso: en
dc.publisher: Gazi Univ, Fac Engineering Architecture
dc.relation.ispartof: Journal of The Faculty of Engineering and Architecture of Gazi University
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.snmz: KA_20241222
dc.subject: Streaming data
dc.subject: Clustering
dc.subject: Tree data structures
dc.title: A comparison of tree data structures in the streaming data clustering issue
dc.title.alternative: Akan veri kümeleme probleminde ağaç veri yapılarının performans karşılaştırması
dc.type: Article
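
The abstract names the tree structures that were swapped in for the KD-Tree but does not show how such a tree answers a neighbourhood query. The sketch below is an illustration only and is not the authors' implementation: it assumes the clustering step needs every stored point within a radius r of a query point, and it builds a minimal vantage-point tree for that purpose. The names `VPNode`, `build_vp_tree`, and `range_query` are hypothetical, and the distance is assumed to be Euclidean.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    """One vantage point plus the median distance that splits its subtree."""

    def __init__(self, point, radius, inside, outside):
        self.point = point      # the vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build_vp_tree(points, rng):
    """Recursively build a vantage-point tree (hypothetical helper)."""
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [euclidean(vantage, p) for p in points]
    median = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d <= median]
    outside = [p for p, d in zip(points, dists) if d > median]
    return VPNode(vantage, median,
                  build_vp_tree(inside, rng),
                  build_vp_tree(outside, rng))


def range_query(node, center, r, out):
    """Append to `out` every stored point within distance r of `center`."""
    if node is None:
        return out
    d = euclidean(center, node.point)
    if d <= r:
        out.append(node.point)
    # Triangle inequality: the inner ball can only contain a match
    # if the query ball reaches inside the median radius.
    if d - r <= node.radius:
        range_query(node.inside, center, r, out)
    # Likewise, the outer shell can only contain a match
    # if the query ball extends past the median radius.
    if d + r > node.radius:
        range_query(node.outside, center, r, out)
    return out


# Compare the tree search with a brute-force scan on random 3-D points.
rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]
tree = build_vp_tree(points, rng)
center, r = (0.5, 0.5, 0.5), 0.25
found = sorted(range_query(tree, center, r, []))
brute = sorted(p for p in points if euclidean(center, p) <= r)
```

Here `found` and `brute` hold the same points; the tree simply skips subtrees that the triangle inequality rules out. A Ball-Tree or KD-Tree would expose the same radius query with a different partitioning rule, which is the kind of substitution the abstract describes.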
