Learning to rank models are widely used in information retrieval and collaborative filtering. The setting is not quite the same as the contestant-ranking problem discussed in an earlier post.
The problem is usually stated as: given a query $q$ and a set of documents $d_i$, rank the $d_i$ by their relevance to $q$.
Under a bag-of-words model, relevance can be measured by the query likelihood
$$\mathrm{sim}(q, d) \triangleq p(q \mid d) = \prod_{i=1}^{n} p(q_i \mid d),$$
where $q_i$ is the $i$-th word of the query and $p(q_i \mid d)$ is a multinoulli distribution estimated from the document. With Dirichlet-prior smoothing (Zhai and Lafferty 2004),
$$p(q_i \mid d) = \frac{c(q_i, d) + \mu\, p(q_i \mid \mathcal{C})}{|d| + \mu},$$
where $c(q_i, d)$ is the count of $q_i$ in $d$, $|d|$ is the document length, $p(q_i \mid \mathcal{C})$ is the collection (background) language model, and $\mu$ is the smoothing parameter.
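As a minimal sketch of the smoothed query likelihood above (function names and the default $\mu$ are my own choices, not from any particular library):

```python
import math
from collections import Counter

def dirichlet_score(query_terms, doc_terms, collection_counts, collection_len, mu=2000.0):
    """Log query likelihood log p(q|d) with Dirichlet-prior smoothing:
    p(q_i|d) = (c(q_i, d) + mu * p(q_i|C)) / (|d| + mu)."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for t in query_terms:
        p_background = collection_counts.get(t, 0) / collection_len   # p(t|C)
        p_t = (doc_counts.get(t, 0) + mu * p_background) / (doc_len + mu)
        if p_t <= 0:
            return float("-inf")   # term unseen anywhere in the collection
        score += math.log(p_t)
    return score
```

Documents can then be ranked for a query by sorting on this score.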
Pointwise approach
e.g. Pranking, McRank, etc.
Procedure:
- For each document $d_j$, construct a feature vector $x(q, d_j)$.
- Construct the target $y_j$ from historical click-through data (see Optimizing Search Engines using Clickthrough Data) or other relevance signals.
- $y$ can be binary (a standard classification problem), modeling $p(y=1 \mid x(q,d))$, or ordinal, modeling $p(y=r \mid x(q,d))$.
- After predicting a score for each document, sort the documents by score (see the sketch below).
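A toy pointwise sketch with a logistic-regression scorer; the feature matrices here are random placeholders standing in for real $x(q, d)$ features such as BM25 or query-likelihood scores:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder training data: one row per (query, document) pair.
X_train = np.random.rand(1000, 4)                     # x(q, d_j)
y_train = (np.random.rand(1000) > 0.7).astype(int)    # binary relevance, e.g. from clicks

model = LogisticRegression()
model.fit(X_train, y_train)

# At query time: score every candidate document and sort by p(y=1 | x(q, d)).
X_candidates = np.random.rand(20, 4)
scores = model.predict_proba(X_candidates)[:, 1]
ranking = np.argsort(-scores)                         # document indices, best first
```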
Drawback: the analysis ignores where a document sits relative to the others in the ranked list, so it is myopic.
Pairwise approach
e.g. Ranking SVM, RankBoost, RankNet, etc.
Unlike the pointwise approach, the pairwise approach compares which of two documents is more relevant:
$$y_{jk} = \begin{cases} 1 & \text{if } \mathrm{rel}(d_j, q) > \mathrm{rel}(d_k, q) \\ 0 & \text{otherwise} \end{cases}$$
If we model this probability as
$$p(y_{jk} = 1 \mid x(q, d_j), x(q, d_k)) = \sigma\big(f(x(q, d_j)) - f(x(q, d_k))\big),$$
where $\sigma$ is the sigmoid and $f$ is a neural network, this is RankNet.
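A sketch of the resulting cross-entropy gradient for a linear scorer $f(x) = w^T x$ (the neural-network case only changes how $f$ and its gradient are computed); all names here are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ranknet_grad(w, x_j, x_k, y_jk):
    """Gradient of the pairwise cross-entropy loss w.r.t. w (numpy arrays).

    p(y_jk = 1) = sigmoid(f(x_j) - f(x_k)); the loss is the cross entropy
    between this probability and the observed preference y_jk.
    """
    p = sigmoid(w @ x_j - w @ x_k)
    # d(loss)/dw = (p - y_jk) * (x_j - x_k)
    return (p - y_jk) * (x_j - x_k)

# Toy SGD step:  w -= lr * ranknet_grad(w, x_j, x_k, 1.0)
```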
Drawback: only pairwise comparisons are made, so the view remains local.
Listwise approach
e.g.
- Direct optimization of IR measure: AdaRank, SVM-MAP, SoftRank, LambdaRank, etc.
- Listwise loss minimization: RankCosine, ListNet, etc.
First consider the Plackett-Luce distribution over permutations $\pi$ of the $m$ documents:
$$p(\pi \mid \mathbf{s}) = \prod_{j=1}^{m} \frac{s_j}{\sum_{u=j}^{m} s_u},$$
where $s_j = s(\pi^{-1}(j))$ is the score of the document ranked in position $j$. In ListNet, the score is modeled as $s(d) = f(x(q, d)) = w^T x(q, d)$.
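A sketch of the ListNet-style top-one loss for a single query, assuming a linear scorer and using the softmax of the labels as the target distribution (variable names are mine):

```python
import numpy as np

def listnet_top1_loss(w, X, y):
    """X: feature matrix x(q, d) for one query's documents, shape (m, d).
    y: ground-truth relevance scores, shape (m,).
    Both the model scores s = Xw and the labels y are mapped to top-one
    probabilities with a softmax; the loss is their cross entropy."""
    def softmax(v):
        e = np.exp(v - v.max())
        return e / e.sum()

    p_true = softmax(np.asarray(y, dtype=float))
    p_model = softmax(X @ w)
    return -np.sum(p_true * np.log(p_model + 1e-12))
```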
Evaluation metrics for L2R models
Two main types: binary (a document's label is binary: relevant / not relevant) and multilevel (graded relevance).
binary
- Mean reciprocal rank (MRR): let $r(q)$ be the rank of the most relevant document for query $q$; then $\mathrm{RR}(q) = 1/r(q)$, and MRR averages this over queries.
- Mean average precision (MAP): define precision at position $k$ of a ranking $\pi$ as
$$P@k(\pi) \triangleq \frac{\text{number of relevant documents in the top } k \text{ positions of } \pi}{k}.$$
The average precision is
$$\mathrm{AP}(\pi) \triangleq \frac{\sum_k P@k(\pi)\, I_k}{\sum_k I_k},$$
where $I_k = 1$ iff the document at position $k$ is relevant; MAP averages AP over queries (both are sketched in code after this list).
- Winner Takes All (WTA): if the top-ranked document is relevant, the loss is 0; otherwise it is 1.
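A small sketch of these binary metrics, taking a list of 0/1 relevance labels in ranked order (helper names are mine):

```python
def reciprocal_rank(relevances):
    """RR = 1 / rank of the first relevant document (0 if none is relevant)."""
    for i, rel in enumerate(relevances, start=1):
        if rel:
            return 1.0 / i
    return 0.0

def average_precision(relevances):
    """AP = sum_k P@k * I_k / (number of relevant documents)."""
    hits, ap = 0, 0.0
    for k, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            ap += hits / k          # P@k at each relevant position
    return ap / hits if hits else 0.0

# MRR / MAP are the means of these quantities over all queries.
```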
multilevel
- Pair-wise Correct = (number of correctly ordered pairs) / (number of possible pairs). bpref is based on Pair-wise Correct, but gives more weight to documents ranked near the top.
- Normalized discounted cumulative gain (NDCG, sketched below):
$$\mathrm{DCG}@k(\mathbf{r}) = r_1 + \sum_{i=2}^{k} \frac{r_i}{\log_2 i}, \qquad \mathrm{NDCG}@k = \frac{\mathrm{DCG}@k}{\text{ideal } \mathrm{DCG}@k},$$
where $r_i$ is the relevance grade of the document at position $i$.
- Rank correlation: directly measures the correlation between the predicted ranking and the true ranking; various statistical measures apply, e.g. Kendall's $\tau$.
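A sketch of NDCG@k following the DCG definition above (other DCG variants exist; this follows the formula used here):

```python
import math

def dcg_at_k(rels, k):
    """DCG@k = r_1 + sum_{i=2..k} r_i / log2(i), with r_i the graded relevance."""
    rels = rels[:k]
    return sum(r if i == 1 else r / math.log2(i)
               for i, r in enumerate(rels, start=1))

def ndcg_at_k(rels, k):
    """Normalize by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0
```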
Recommended papers
- Joachims, T. (2002). Optimizing Search Engines using Clickthrough Data, 1–10.
- Zhai, C. and J. Lafferty (2004). A study of smoothing methods for language models applied to information retrieval. ACM Trans. on Information Systems 22(2), 179–214.
- Burges, C. J., T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender (2005). Learning to rank using gradient descent. In Intl. Conf. on Machine Learning, pp. 89–96.
- Burges, C. (2007). Learning to Rank with Nonsmooth Cost Functions, 1–8.
- Cao, Z., T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li (2007). Learning to rank: From pairwise approach to listwise approach. In Intl. Conf. on Machine Learning, pp. 129–136.
- Xia, F. (2008). Listwise Approach to Learning to Rank - Theory and Algorithm, 1–8.
- Carterette, B., P. Bennett, D. Chickering, and S. Dumais (2008). Here or There: Preference Judgments for Relevance. In Proc. ECIR.
- Liu, T.-Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval 3(3), 225–331.
- Usunier, N., D. Buffoni, and P. Gallinari (2009). Ranking with ordered weighted pairwise classification.
- Zhang, X., T. Graepel, and R. Herbrich (2010). Bayesian Online Learning for Multi-label and Multi-variate Performance Measures. In AI/Statistics.
- Zheng, Z. (2016). A Regression Framework for Learning Ranking Functions Using Relative Relevance Judgments, 1–8.
- http://www.bing.com/community/site_blogs/b/search/archive/2009/06/01/user-needs-features-and-the-science-behind-bing.aspx
References
- MLAPP 9.7
- Tie-Yan Liu, Listwise Approach to Learning to Rank - Theory and Algorithm (ICML 2008 video lecture, http://videolectures.net/icml08_liu_lalr/)