Use of Data Mining for Intelligent Evaluation of Imputation Methods
Author:
la Red Martínez, David L.; Primorac, Carlos R.
Date:
01/06/2025
Journal / publisher:
UNIR
Citation:
D. L. la Red Martínez, C. R. Primorac. Use of Data Mining for Intelligent Evaluation of Imputation Methods, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 9, no. 3, pp. 82-95, 2025, http://dx.doi.org/10.9781/ijimai.2023.03.002
Item type:
article
Web address:
https://www.ijimai.org/index.php/ijimai/article/view/243
Abstract:
In real-world situations, researchers frequently face the difficulty of missing values (MV), i.e., values not observed in a data set. Data imputation techniques estimate MV using different algorithms, so that important data can be imputed for a particular instance. Most of the literature in this field deals with individual imputation methods. However, few studies offer a comparative evaluation of the different methods so as to provide more appropriate guidelines for selecting the method to apply in a specific situation. The objective of this work is to present a methodology for evaluating the performance of imputation methods by means of new metrics derived from the quality metrics of data mining models. We started from a complete dataset, which was amputated using different amputation mechanisms to generate 63 datasets with MV; these were imputed using the Median, k-NN, k-Means and Hot-Deck imputation methods. The performance of the imputation methods was evaluated using new metrics derived from the quality metrics of the data mining processes performed on the original complete file and on the imputed files. This evaluation is not based on measuring the imputation error (the usual approach), but on the similarity between the quality metrics of the data mining processes obtained with the original file and those obtained with the imputed files. The results show that, considered globally and according to the newly proposed metric, the imputation methods with the best performance were k-NN and k-Means. An additional advantage of the proposed methodology is that it provides predictive data mining models that can be used a posteriori.
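The evaluation idea described in the abstract can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the synthetic dataset, the MCAR amputation mechanism, the nearest-centroid classifier, the use of accuracy as the quality metric, and the absolute accuracy gap as the comparison measure are all assumptions made for illustration, and every function name is hypothetical. Only two of the four imputation methods (Median and a simple k-NN variant) are shown.

```python
import random
import statistics


def make_dataset(n=200, seed=0):
    """Hypothetical complete dataset: two Gaussian classes, three features."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        rows.append(([rng.gauss(center, 1.0) for _ in range(3)], label))
    return rows


def amputate_mcar(rows, rate=0.2, seed=1):
    """Assumed mechanism: delete feature values completely at random (None = missing)."""
    rng = random.Random(seed)
    return [([None if rng.random() < rate else v for v in x], y) for x, y in rows]


def impute_median(rows):
    """Replace each missing value with the median of the observed values of that feature."""
    n_feat = len(rows[0][0])
    med = [statistics.median(x[j] for x, _ in rows if x[j] is not None)
           for j in range(n_feat)]
    return [([med[j] if v is None else v for j, v in enumerate(x)], y) for x, y in rows]


def impute_knn(rows, k=5):
    """Replace each missing value with the mean over the k nearest donor rows.

    Distance is the mean squared difference over features observed in both rows;
    rows with no usable donor fall back to the median.
    """
    fallback = impute_median(rows)
    out = []
    for idx, (x, y) in enumerate(rows):
        filled = list(x)
        for j, v in enumerate(x):
            if v is not None:
                continue
            donors = []
            for d_idx, (d, _) in enumerate(rows):
                if d_idx == idx or d[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(x, d)
                          if a is not None and b is not None]
                if shared:
                    donors.append((sum(shared) / len(shared), d[j]))
            donors.sort(key=lambda t: t[0])
            nearest = [val for _, val in donors[:k]]
            filled[j] = statistics.mean(nearest) if nearest else fallback[idx][0][j]
        out.append((filled, y))
    return out


def centroid_accuracy(rows, n_train=150):
    """Quality metric of a stand-in data mining model: nearest-centroid test accuracy."""
    train, test = rows[:n_train], rows[n_train:]
    centroids = {}
    for label in {y for _, y in train}:
        members = [x for x, y in train if y == label]
        centroids[label] = [statistics.mean(col) for col in zip(*members)]

    def predict(x):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    return sum(predict(x) == y for x, y in test) / len(test)


complete = make_dataset()
incomplete = amputate_mcar(complete)
baseline = centroid_accuracy(complete)

# Rank imputation methods by how closely the model quality on the imputed
# data matches the model quality on the original complete data.
methods = {"median": impute_median, "k-NN": impute_knn}
gaps = {name: abs(centroid_accuracy(fn(incomplete)) - baseline)
        for name, fn in methods.items()}
for name, gap in sorted(gaps.items(), key=lambda t: t[1]):
    print(f"{name}: |accuracy gap vs. complete data| = {gap:.3f}")
```

A smaller gap means the model built on the imputed data behaves more like the model built on the complete data, which is the sense in which the abstract compares quality metrics rather than imputation error. The actual metrics proposed in the article may be defined differently.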
Files in this item
Name: Use of Data Mining for Intelligent Evaluation of Imputation Methods.pdf
Size: 772.9 KB
Format: application/pdf





