DSpace Repository :: Browsing by Author "Sharma, Parichit"

Browse by Author "Sharma, Parichit"

Showing 1 - 2 of 2

DCEM: An R package for clustering big data via data-centric modification of Expectation Maximization
(Elsevier, 2022) Sharma, Parichit; Kurban, Hasan; Dalkilic, Mehmet
Clustering is intractable, so techniques exist to give a best approximation. Expectation Maximization (EM), initially used to impute missing data, is among the most popular. The parameters of a fixed number of probability density functions (PDFs), together with the probability of a datum belonging to each PDF, are computed iteratively. EM does not scale with data size, and this has hampered its current use. Using a data-centric approach, we insert hierarchical structures within the algorithm to separate high expressive data (HE) from low expressive data (LE): the former greatly affects the objective function at some iteration i, while LE does not. By alternating between using HE only and using HE+LE, we significantly reduce run-time for EM. We call this new, data-centric EM, EM*. We have designed and developed an R package called DCEM (Data Clustering with Expectation Maximization) to emphasize that data is driving the algorithm. DCEM is superior to EM as we vary size, dimensions, and separability, independent of the scientific domain. DCEM is modular and can be used as either a stand-alone program or a pluggable component. DCEM includes our implementation of the original EM as well. To the best of our knowledge, there is no open source software that specifically focuses on improving EM clustering without explicit parallelization, modified seeding, or data reduction. DCEM is freely accessible on CRAN (Comprehensive R Archive Network). © 2021 The Author(s). Published by Elsevier B.V.
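The iterative computation described in the abstract above can be illustrated with a minimal sketch of plain EM for a one-dimensional Gaussian mixture. This is not the DCEM package or the EM* variant: the function names (`gaussian_pdf`, `em_gmm_1d`), the quantile-based starting means, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k=2, iterations=50):
    """Plain EM for a 1-D mixture of k Gaussians (illustrative only).

    Hypothetical helper: starting means are taken from evenly spaced
    quantiles of the data, which is an assumption of this sketch.
    """
    n = len(data)
    ordered = sorted(data)
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: probability that each datum belongs to each component.
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # keep the variance strictly positive

    return weights, means, variances


# Usage on synthetic data: two well-separated groups around 0 and 10.
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances = em_gmm_1d(data)
print(weights, means, variances)
```

Every iteration of this sketch visits every datum in both steps, which is the cost that grows with data size.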
Predicting atom types of anatase TiO2 nanoparticles with machine learning
(Trans Tech Publications Ltd, 2021) Kurban, Hasan; Kurban, Mustafa; Sharma, Parichit; Dalkilic, Mehmet M.
Machine learning (ML) has recently made a major contribution to the field of materials science (MS). In this study, ML algorithms are used to learn atom types from structural geometrical data of anatase TiO2 nanoparticles produced at different temperature levels with the density-functional tight-binding (DFTB) method. Specifically, Random Forest (RF), Decision Trees (DT), K-Nearest Neighbor (KNN), and Naïve Bayes (NB), which are among the most popular ML algorithms, were run to learn titanium (Ti) and oxygen (O) atoms. RF outperforms the other algorithms, learning this skewed data set almost perfectly. Using ML algorithms with data sets compatible with their mathematical design increases their learning performance. We therefore find it remarkable that a certain type of ML algorithm performs almost perfectly, because it can help materials scientists predict the behavior and the structural and electronic properties of atoms at different temperatures. © 2021 Trans Tech Publications Ltd, Switzerland.
