Preference and Diversity-based Ranking in Network-Centric Information Management Systems

Related Work

Today, most user queries are exploratory in nature: users want to retrieve pieces of information that cover many aspects of their information need. For this reason, result diversification has recently attracted considerable attention as a means of increasing the heterogeneity of the results presented to users. Consider, for example, a user who wants to buy a car and submits a related web search query. A diverse result, i.e., one containing various brands and models with different horsepower and other technical characteristics, is intuitively more informative than a homogeneous result containing only cars with similar features.

Diversity has been defined in several ways, based on (i) content (or similarity), i.e., selecting objects that are dissimilar to each other, (ii) novelty, i.e., selecting objects that contain new information compared to what was previously presented, and (iii) semantic coverage, i.e., selecting objects that belong to different categories.

Next, you can find a collection of papers related to the problem of result diversification.


The following papers survey the related literature in the field of result diversification.


Content-based definitions interpret diversity as an instance of the p-dispersion problem. The objective of the p-dispersion problem is to choose p out of n given points, so that the minimum distance between any pair of chosen points is maximized.
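The p-dispersion objective described above can be illustrated with a minimal sketch. It assumes the points are numeric tuples compared with Euclidean distance and that p is at least 2. The function names are hypothetical and not taken from any of the surveyed papers. The brute-force variant enumerates every subset and is only practical for small inputs; the greedy variant is a common farthest-first heuristic and is not guaranteed to find the optimum.

```python
from itertools import combinations
import math


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two of the given points."""
    return min(math.dist(a, b) for a, b in combinations(points, 2))


def p_dispersion_exact(points, p):
    """Try every p-subset and keep the one whose closest pair is farthest apart.

    Exponential in the number of points; shown only to make the objective concrete.
    """
    return max(combinations(points, p), key=min_pairwise_distance)


def p_dispersion_greedy(points, p):
    """Farthest-first heuristic (an assumption of this sketch, not an exact method)."""
    # Seed the selection with the two points that are farthest apart.
    selected = list(max(combinations(points, 2), key=lambda pair: math.dist(*pair)))
    remaining = [pt for pt in points if pt not in selected]
    while len(selected) < p and remaining:
        # Add the point whose nearest already-selected point is as far away as possible.
        best = max(remaining, key=lambda pt: min(math.dist(pt, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, calling `p_dispersion_greedy([(0, 0), (1, 0), (0, 1), (10, 10), (10, 0)], 3)` skips the two points clustered near the origin, because each of them lies within distance 1 of a point that is already selected.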


Novelty is closely related to diversity: items that differ from all items seen in the past are likely to contain novel information, i.e., information not seen before.
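One way to make this notion concrete is sketched below. The sketch assumes that each item is represented as a set of terms and that dissimilarity is measured with Jaccard distance; both choices, as well as the function names and the threshold parameter, are illustrative assumptions rather than a definition taken from the literature.

```python
def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two term sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def novelty(item, seen):
    """Distance from the item to its closest previously seen item.

    An item is treated as fully novel (score 1.0) when nothing has been seen yet.
    """
    if not seen:
        return 1.0
    return min(jaccard_distance(item, s) for s in seen)


def filter_novel(stream, threshold):
    """Keep only the items whose novelty score reaches the given threshold."""
    seen, kept = [], []
    for item in stream:
        if novelty(item, seen) >= threshold:
            kept.append(item)
        # Every item counts as "seen" afterwards, whether or not it was kept.
        seen.append(item)
    return kept
```

Taking the minimum over the seen items reflects the requirement that an item differ from all past items, not just from most of them.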


Some works view diversity differently, as the selection of items that together cover many different interpretations of the user's information need.
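A coverage-oriented selection can be sketched as a greedy procedure. The sketch assumes that each item is already annotated with the set of interpretations (categories) it addresses. The data layout and the function name are hypothetical, and the greedy rule is a standard heuristic for coverage problems, not the method of any specific paper.

```python
def greedy_coverage(items, k):
    """Pick up to k items that together cover as many interpretations as possible.

    items: dict mapping an item id to the set of interpretations it covers.
    Returns the chosen ids (in selection order) and the set of covered interpretations.
    """
    covered = set()
    chosen = []
    candidates = dict(items)  # shallow copy so the caller's dict is left untouched
    for _ in range(min(k, len(candidates))):
        # Choose the item that adds the most interpretations not yet covered.
        best = max(candidates, key=lambda i: len(candidates[i] - covered))
        if not candidates[best] - covered:
            break  # no remaining item adds anything new
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered
```

With the hypothetical input `{"d1": {"animal", "car"}, "d2": {"car"}, "d3": {"os"}, "d4": {"animal"}}` and `k = 2`, the procedure first selects `d1` and then `d3`, since `d2` and `d4` add no interpretation that `d1` does not already cover.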