We introduce a new technique for analyzing combination models. The technique allows us to draw qualitative conclusions about which IR systems should be combined. We achieve this by using a linear regression to accurately (r² = 0.98) predict the performance of the combined system from quantitative measurements of the individual component systems, taken from TREC 5. When applied to a linear model (a weighted sum of relevance scores), the technique supports several previously suggested hypotheses: one should maximize both the individual systems' performances and the overlap of relevant documents between systems, while minimizing the overlap of nonrelevant documents. It also suggests new conclusions: both systems should distribute scores similarly, but should not rank relevant documents similarly. It furthermore suggests that the linear model can exploit only a fraction of the benefit possible from combination. The technique is general in nature and capable of pointing out the strengths and weaknesses of any given combination approach.
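As an illustration of the linear model mentioned above, the following is a minimal sketch of combining two systems by a weighted sum of relevance scores, together with one possible way to measure relevant and nonrelevant overlap. The function names, the single `weight_a` parameter, the treatment of unretrieved documents as score 0, and the overlap formula (twice the shared count divided by the sum of the two counts) are all assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple


def combine_linear(scores_a: Dict[str, float],
                   scores_b: Dict[str, float],
                   weight_a: float = 0.5) -> List[str]:
    """Rank documents by weight_a * score_a + (1 - weight_a) * score_b.

    Assumption: a document missing from one system gets score 0.0 there.
    """
    docs = set(scores_a) | set(scores_b)
    combined = {
        d: weight_a * scores_a.get(d, 0.0)
           + (1.0 - weight_a) * scores_b.get(d, 0.0)
        for d in docs
    }
    # Sort by descending combined score; break ties by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


def _dice(x: Set[str], y: Set[str]) -> float:
    """Shared fraction of two sets: 2 * |x & y| / (|x| + |y|)."""
    total = len(x) + len(y)
    return 2.0 * len(x & y) / total if total else 0.0


def overlaps(retrieved_a: Set[str],
             retrieved_b: Set[str],
             relevant: Set[str]) -> Tuple[float, float]:
    """Return (relevant overlap, nonrelevant overlap) for two systems.

    Assumption: overlap is measured with the formula in _dice; the
    source does not state which definition it uses.
    """
    rel = _dice(retrieved_a & relevant, retrieved_b & relevant)
    nonrel = _dice(retrieved_a - relevant, retrieved_b - relevant)
    return rel, nonrel


if __name__ == "__main__":
    # Hypothetical toy scores for two systems over four documents.
    a = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
    b = {"d2": 0.8, "d3": 0.7, "d4": 0.2}
    print(combine_linear(a, b, weight_a=0.5))
    print(overlaps(set(a), set(b), relevant={"d1", "d2"}))
```

In this sketch a pair of systems would look promising for combination when the first overlap value is high and the second is low, matching the hypotheses listed in the abstract.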
Vogt et al. studied this question.