• Efficient and Robust Model Benchmarks with Item Response Theory and Adaptive Testing 

      Song, Hao; Flach, Peter (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 03/2021)
      Progress in predictive machine learning is typically measured by comparing performance on benchmark datasets. Traditionally, such empirical evaluations are carried out on large numbers of datasets, ...