Efficient and Robust Model Benchmarks with Item Response Theory and Adaptive Testing

Authors:

Song, Hao; Flach, Peter

Date:

03/2021

Keywords:

item response theory; adaptive testing; model evaluation; benchmark; IJIMAI

Journal / publisher:

International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI)

Item type:

article

Abstract:

Progress in predictive machine learning is typically measured on the basis of performance comparisons on benchmark datasets. Traditionally these kinds of empirical evaluation are carried out on large numbers of datasets, but this is becoming increasingly hard due to computational requirements and the often large number of alternative methods to compare against. In this paper we investigate adaptive approaches to achieve better efficiency on model benchmarking. For a large collection of datasets, rather than training and testing a given approach on every individual dataset, we seek methods that allow us to pick only a few representative datasets to quantify the model’s goodness, from which to extrapolate to performance on other datasets. To this end, we adapt existing approaches from psychometrics: specifically, Item Response Theory and Adaptive Testing. Both are well-founded frameworks designed for educational tests. We propose certain modifications following the requirements of machine learning experiments, and present experimental results to validate the approach.
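To make the idea in the abstract concrete, the following is a minimal sketch of how Item Response Theory and adaptive testing could be used to choose which dataset to evaluate next. It is not the authors' method: it assumes the standard two-parameter logistic (2PL) model with binary pass/fail responses, whereas the paper proposes its own modifications for machine learning experiments. The function names (`p_success`, `fisher_information`, `pick_next_dataset`, `estimate_ability`) and the item parameters are hypothetical and chosen only for illustration.

```python
import math

def p_success(theta, a, b):
    """2PL model: probability that a model with ability theta 'passes'
    a dataset with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_success(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_next_dataset(theta, items, used):
    """Adaptive testing step: choose the unused dataset that is most
    informative at the current ability estimate."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: fisher_information(theta, *items[i]))

def estimate_ability(responses, items):
    """Grid-search maximum-likelihood ability estimate from binary
    responses, given as {dataset_index: 0 or 1}."""
    grid = [g / 10.0 for g in range(-40, 41)]  # theta in [-4.0, 4.0]

    def log_lik(theta):
        total = 0.0
        for i, y in responses.items():
            p = p_success(theta, *items[i])
            total += math.log(p) if y else math.log(1.0 - p)
        return total

    return max(grid, key=log_lik)

# Hypothetical (discrimination, difficulty) pairs, one per dataset.
items = [(1.0, -2.0), (1.5, 0.0), (1.0, 2.0)]
first = pick_next_dataset(0.0, items, used=set())
```

In this sketch a benchmark run would alternate between `pick_next_dataset` and `estimate_ability`, stopping once the ability estimate is stable enough, so that only a few datasets need to be trained and tested on.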

Files in this item

Name: ijimai_6_5_11_0.pdf

Size: 1.481Mb

Format: application/pdf

This item appears in the following collection(s)

vol. 6, no. 5, March 2021

Year     Views    Downloads
2012     0        0
2013     0        0
2014     0        0
2015     0        0
2016     0        0
2017     0        0
2018     0        0
2019     0        0
2020     0        0
2021     0        0
2022     31       18
2023     63       35
2024     119      56
2025     39       25
Total    252      134