Listwise Definition
-
In ranking, the input is a set of objects, the output is a permutation of those objects, and the model is a ranking function that maps a given input to an output.
-
In learning, the training data $S = \{(x^{(i)}, y^{(i)})\}_{i=1}^m$ is drawn i.i.d. from an unknown but fixed joint probability distribution $P(x, y)$ over inputs and outputs.
For the pointwise approach: $x$ is a single object and $y$ is its relevance degree.
For the pairwise approach: $x$ is a pair of objects and $y$ is the preference between them.
For the listwise approach: $x$ is the whole set of objects and $y$ is a permutation of them (the ground-truth ranking).
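The three data forms can be illustrated on a toy query. The document names and relevance grades below are made up for illustration only:

```python
# Hypothetical example: one query with three documents, showing what the
# training pair (x, y) looks like under each approach.
docs = ["d1", "d2", "d3"]
relevance = {"d1": 2, "d2": 0, "d3": 1}  # graded relevance judgments

# Pointwise: x is one object, y is its relevance degree.
pointwise = [(d, relevance[d]) for d in docs]

# Pairwise: x is an ordered pair of objects, y is the preference between them.
pairwise = [((a, b), relevance[a] > relevance[b])
            for a in docs for b in docs if a != b]

# Listwise: x is the whole list, y is the ground-truth permutation
# (objects sorted by decreasing relevance).
listwise = (docs, sorted(docs, key=lambda d: -relevance[d]))

print(listwise[1])  # ['d1', 'd3', 'd2']
```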
-
Ideally we would minimize the expected 0−1 loss defined on the predicted list and the ground-truth list.
In practice we instead minimize an empirical surrogate loss on the training data.
Three surrogate losses:
-
Cross Entropy Loss (ListNet, ICML 2007)
The listwise loss function is defined as the cross entropy between two parameterized probability distributions over permutations: one obtained from the predicted result and the other from the ground truth.
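The full permutation distribution has $n!$ terms, which is why the table below lists ListNet's cost as O(n·n!); in practice ListNet approximates it with top-one probabilities (a softmax over scores). A minimal sketch of that practical form:

```python
import math

def top_one_probs(scores):
    """Top-one probability of each object: P(j ranked first) ∝ exp(s_j)."""
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def listnet_loss(true_scores, pred_scores):
    """Cross entropy between the two top-one distributions."""
    p = top_one_probs(true_scores)   # distribution from the ground truth
    q = top_one_probs(pred_scores)   # distribution from the predicted scores
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the predicted scores induce the same top-one distribution as the ground truth, e.g. `listnet_loss([3, 1, 0], [3, 1, 0])` is smaller than `listnet_loss([3, 1, 0], [0, 1, 3])`.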
-
Cosine Loss (RankCosine, IPM 2007)
The listwise loss function is defined on the basis of cosine similarity between the score vectors of the predicted result and the ground truth.
(Cosine similarity between vectors $a$ and $b$: $\cos\theta = \frac{a \cdot b}{\|a\|\,\|b\|}$.)
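A minimal sketch of the cosine-based loss, written here as half of one minus the cosine similarity so that it lies in [0, 1]:

```python
import math

def rankcosine_loss(true_scores, pred_scores):
    """Half of (1 - cosine similarity) between the two score vectors:
    0 when the vectors point the same way, 1 when they are opposite."""
    dot = sum(a * b for a, b in zip(true_scores, pred_scores))
    norm_t = math.sqrt(sum(a * a for a in true_scores))
    norm_p = math.sqrt(sum(b * b for b in pred_scores))
    return 0.5 * (1.0 - dot / (norm_t * norm_p))
```

Because the loss only depends on sums and norms over the $n$ scores, a single evaluation is O(n), as reflected in the comparison table.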
Likelihood Loss (ListMLE, this paper)
The listwise loss function is defined as the negative log likelihood of the ground-truth permutation, where the permutation probability is given by a Plackett–Luce model on the predicted scores.
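A sketch of the likelihood loss under the Plackett–Luce model: the object ranked next is drawn from the remaining objects with probability proportional to the exponential of its score. The straightforward loop below is O(n²); precomputing suffix sums of the exponentiated scores from the end of the list yields the O(n) cost shown in the table.

```python
import math

def listmle_loss(pred_scores, true_order):
    """Negative log likelihood of the ground-truth permutation.
    true_order lists object indices from best to worst."""
    loss = 0.0
    remaining = list(true_order)
    for i in true_order:
        # -log P(i picked first among remaining) = log Z - s_i
        log_z = math.log(sum(math.exp(pred_scores[j]) for j in remaining))
        loss += log_z - pred_scores[i]
        remaining.remove(i)
    return loss
```

Scores that agree with the ground-truth order give a smaller loss: `listmle_loss([2, 1, 0], [0, 1, 2])` is smaller than `listmle_loss([2, 1, 0], [2, 1, 0])`.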
Evaluating the Surrogate Loss Function
-
consistency
the learned ranking function converges to the optimal one as the training sample size goes to infinity.
soundness
the loss represents the targeted learning problem well. That is, an incorrect prediction should receive a larger penalty than a correct one, and the penalty should reflect the confidence of the prediction.
continuity, differentiability, and convexity
-
computational efficiency
| Loss | Continuity | Differentiability | Convexity | Efficiency |
|---|---|---|---|---|
| Cosine Loss (RankCosine) | √ | √ | ✗ | O(n) |
| Cross-entropy Loss (ListNet) | √ | √ | √ | O(n·n!) |
| Likelihood Loss (ListMLE) | √ | √ | √ | O(n) |
-