DOI: 10.1145/3817099.3817107 ISSN: 1551-9031
SIGEcom Exchanges Annotated Reading List: Multiclass Calibration
Rabanus Derr, Jessie Finocchiaro
ML model evaluation often takes one of two main approaches: risk minimization, associated with "high accuracy", or calibration, meaning that predictions are "trustworthy" and can be interpreted through a probabilistic lens. An extensive line of work has studied the relationship between risk minimization and calibration, mostly focusing on the binary outcome setting. Even in the binary setting, there are a variety of proposed calibration metrics that interact non-trivially. In the multiclass label setting, the choices to be made are even more complex; in particular, different notions of calibration carry different semantics. Here, we briefly present an annotated reading list reviewing some of the proposed definitions and their relationships.