ccImpute: an accurate and scalable consensus clustering based algorithm to impute dropout events in the single-cell RNA-seq data

dc.authoridKURBAN, HASAN/0000-0003-3142-2866
dc.contributor.authorMalec, Marcin
dc.contributor.authorKurban, Hasan
dc.contributor.authorDalkilic, Mehmet
dc.date.accessioned2024-12-24T19:29:53Z
dc.date.available2024-12-24T19:29:53Z
dc.date.issued2022
dc.departmentSiirt Üniversitesi
dc.description.abstractBackground: In recent years, the introduction of single-cell RNA sequencing (scRNA-seq) has enabled the analysis of a cell's transcriptome at an unprecedented granularity and processing speed. The experimental outcome of applying this technology is an M x N matrix containing aggregated mRNA expression counts of M genes and N cell samples. From this matrix, scientists can study how cell protein synthesis changes in response to various factors, for example, disease versus non-disease states in response to a treatment protocol. This technology's critical challenge is detecting and accurately recording lowly expressed genes. As a result, low expression levels tend to be missed and recorded as zero, an event known as dropout. This makes the lowly expressed genes indistinguishable from true zero expression and different from the low expression present in cells of the same type. This issue makes any subsequent downstream analysis difficult. Results: To address this problem, we propose an approach to measure cell similarity using consensus clustering and demonstrate an effective and efficient algorithm that takes advantage of this new similarity measure to impute the most probable dropout events in the scRNA-seq datasets. We demonstrate that our approach exceeds the performance of existing imputation approaches while introducing the least amount of new noise as measured by clustering performance characteristics on datasets with known cell identities. Conclusions: ccImpute is an effective algorithm to correct for dropout events and thus improve downstream analysis of scRNA-seq data. ccImpute is implemented in R and is available at https://github.com/khazum/ccImpute.
dc.description.sponsorshipLilly Endowment, Inc.
dc.description.sponsorshipThis research was supported in part by Lilly Endowment, Inc. through its support for the Indiana University Pervasive Technology Institute.
dc.identifier.doi10.1186/s12859-022-04814-8
dc.identifier.issn1471-2105
dc.identifier.issue1
dc.identifier.pmid35869420
dc.identifier.scopus2-s2.0-85134567172
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.1186/s12859-022-04814-8
dc.identifier.urihttps://hdl.handle.net/20.500.12604/7295
dc.identifier.volume23
dc.identifier.wosWOS:000829032300001
dc.identifier.wosqualityQ2
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.indekslendigikaynakPubMed
dc.language.isoen
dc.publisherBMC
dc.relation.ispartofBMC Bioinformatics
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_20241222
dc.subjectscRNA
dc.subjectImputation
dc.subjectSingle-cell
dc.subjectDropout event
dc.subjectDownstream analysis
dc.subjectNext generation sequencing
dc.titleccImpute: an accurate and scalable consensus clustering based algorithm to impute dropout events in the single-cell RNA-seq data
dc.typeArticle
