A Clustering Algorithm Based on an Ensemble of Dissimilarities: An Application in the Bioinformatics Domain

Martín Merino, Manuel; López Rivero, Alfonso José; Alonso, Vidal; Vallejo, Marcelo; Ferreras, Antonio

dc.contributor.author	Martín Merino, Manuel
dc.contributor.author	López Rivero, Alfonso José
dc.contributor.author	Alonso, Vidal
dc.contributor.author	Vallejo, Marcelo
dc.contributor.author	Ferreras, Antonio
dc.date	2022-09
dc.date.accessioned	2022-12-13T12:20:55Z
dc.date.available	2022-12-13T12:20:55Z
dc.identifier.issn	1989-1660
dc.identifier.uri	https://reunir.unir.net/handle/123456789/13904
dc.description.abstract	Clustering algorithms such as k-means depend heavily on choosing an appropriate distance metric that accurately reflects the proximities between objects. A wide range of dissimilarities may be defined, and they often lead to different clustering results. Choosing the best dissimilarity is an ill-posed problem, and learning a general distance from the data is a complex task, particularly for high-dimensional problems. Therefore, an appealing approach is to learn an ensemble of dissimilarities. In this paper, we have developed a semi-supervised clustering algorithm that learns a linear combination of dissimilarities considering incomplete knowledge in the form of pairwise constraints. The minimization of the loss function is based on a robust and efficient quadratic optimization algorithm. In addition, a regularization term controls the complexity of the learned distance metric, which helps avoid overfitting. The algorithm has been applied to the identification of tumor samples using gene expression profiles, where domain experts often provide incomplete knowledge in the form of pairwise constraints. We report that the proposed algorithm outperforms a standard semi-supervised clustering technique available in the literature, as well as clustering results based on a single dissimilarity. The improvement is particularly relevant for applications with a high level of noise.	es_ES
dc.language.iso	eng	es_ES
dc.publisher	International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI)	es_ES
dc.relation.ispartofseries	;vol. 7, nº 6
dc.relation.uri	https://ijimai.org/journal/bibcite/reference/3181	es_ES
dc.rights	openAccess	es_ES
dc.subject	bioinformatics	es_ES
dc.subject	clustering	es_ES
dc.subject	kernel methods	es_ES
dc.subject	machine learning	es_ES
dc.subject	metric learning	es_ES
dc.subject	IJIMAI	es_ES
dc.title	A Clustering Algorithm Based on an Ensemble of Dissimilarities: An Application in the Bioinformatics Domain	es_ES
dc.type	article	es_ES
reunir.tag	~IJIMAI	es_ES
dc.identifier.doi	https://doi.org/10.9781/ijimai.2022.09.007
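
The idea summarized in the abstract, learning a weighted linear combination of dissimilarities from pairwise constraints with a regularization term, can be illustrated with a minimal sketch. This is not the authors' algorithm: as a stand-in it uses ridge-regularized least squares, where must-link pairs are pushed toward a combined dissimilarity of 0 and cannot-link pairs toward 1. The function names `learn_weights` and `combine`, the parameter `lam`, the 0/1 targets, and the toy matrices are all assumptions made for illustration.

```python
import numpy as np

def learn_weights(dissims, must_link, cannot_link, lam=1e-3):
    """Fit one weight per dissimilarity matrix from pairwise constraints.

    Assumption: a ridge-regression stand-in, not the paper's optimizer.
    Must-link pairs get target 0, cannot-link pairs get target 1.
    """
    pairs = list(must_link) + list(cannot_link)
    targets = np.array([0.0] * len(must_link) + [1.0] * len(cannot_link))
    # One row per constrained pair, one column per dissimilarity.
    X = np.array([[D[i, j] for D in dissims] for i, j in pairs])
    # Regularized normal equations: (X^T X + lam * I) w = X^T t.
    A = X.T @ X + lam * np.eye(len(dissims))
    w = np.linalg.solve(A, X.T @ targets)
    # Crude projection onto nonnegative weights (an assumption of this sketch).
    return np.clip(w, 0.0, None)

def combine(dissims, w):
    """Weighted linear combination of the dissimilarity matrices."""
    return sum(wk * D for wk, D in zip(w, dissims))

# Toy data: four objects in two groups, {0, 1} and {2, 3}.
D_good = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [1, 1, 0, 0],
                   [1, 1, 0, 0]], dtype=float)   # separates the groups
D_bad = 0.5 * (1.0 - np.eye(4))                   # uninformative
dissims = [D_good, D_bad]

w = learn_weights(dissims,
                  must_link=[(0, 1), (2, 3)],
                  cannot_link=[(0, 2), (1, 3)])
D = combine(dissims, w)
```

In this toy case the informative dissimilarity receives almost all of the weight, so the combined matrix `D` keeps within-group pairs close and between-group pairs far apart. The combined matrix could then be passed to any clustering method that accepts precomputed dissimilarities.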

Files in this item

Name: ijimai7_6_1.pdf
Size: 2.477Mb
Format: PDF

This item appears in the following collection(s)

vol. 7, nº 6, September 2022

