To handle the overwhelming amount of information currently available, personalization systems allow users to specify the information that interests them through preferences. Most often, user preferences vary depending on the circumstances, that is, preferences are context-dependent. Context is a general term used to capture any information that can be used to characterize the situations of an entity. Common types of context include the computing context (e.g., network connectivity, resources), the user context (e.g., profile, location), the physical context (e.g., noise levels, temperature) and time .
Example: consider a database that maintains information about points of interest. Points of interest may be museums, monuments, archaeological places or zoos. We consider three context parameters as relevant: location, weather and accompanying people. Depending on context, users prefer points of interest that have specific attribute values. Such preferences are expressed by providing a numeric score between 0 and 1. For instance, the interest score of a preference that is related to the type of the visiting place depends on the accompanying people that might be friends, family, or alone. For example, a zoo may be a preferred place to visit than a brewery in the context of family.
In following, we present our recent work on defining and managing contextual preferences. This is a joint work among Evaggelia Pitoura, Panos Vassiliadis and Kostas Stefanidis, that are members of the D.M.O.D. Laboratory of the Computer Science Department of the University of Ioannina.
In our previous work [2, 3, 4], we model context as a finite set of special-purpose attributes, called context parameters. Examples of context parameters are location, weather, accompanying people, time period, the type of computing device in use, etc. A context state is an assignment of values to context parameters. To allow more flexibility in defining preferences, we model context parameters as multidimensional attributes. In particular, we assume that each context parameter participates in an associated hierarchy of levels of aggregated data, i.e., it can be viewed from different levels of detail. Regarding our example, levels of location are Region (L1), City (L2), Country (L3) and ALL (L4). For example, a value of City is Athens and two Regions in Athens are Plaka and Kifisia. For weather, there are three levels: the detailed level Conditions (L1) whose domain includes freezing, cold, mild, warm and hot, the level Weather Characterization (L2) which just refers to whether the weather is good (grouping mild, warm and hot) or bad (grouping freezing and cold) and the level ALL (L3). Accompanying people has the lower level Relationship (L1) which consists of the values friends, family, alone and the level ALL (L2). Figure 1 depicts these hierarchies.
Figure 1. Example hierarchies
Users express their preferences on specific database instances based on a single context parameter. Such basic preferences, i.e., preferences associating database relations with a single context attribute, are combined to compute aggregate preferences that include more than one context parameter. We store basic preferences in data cubes, following the OLAP paradigm. An advantage of using cubes and OLAP techniques is that they provide the capability of using hierarchies to introduce different levels of abstraction for the captured context data. For instance, this allows us to aggregate data along the location context parameter, by grouping preferences for all cities of a specific country. We show how context-aware preference queries are processed and the role of OLAP techniques in their manipulation. Aggregate preferences are not explicitly stored. To improve performance, we propose storing aggregate preferences computed as results of previous queries using an auxiliary data structure called context tree. A path in the context tree corresponds to an assignment of values to context parameters, that is, to a context state, for which the aggregate score has been previously computed.
The context tree provides an efficient way to retrieve the top-k results that are relevant to a preference query. When a query is submitted to the system, we first check if there exists a context state that matches it in the context tree. If so, we retrieve the top-k results from the associated leaf node. Otherwise, we compute the answer and insert the new context state, i.e., the new path and the associated top-k results, in the tree. Furthermore, the results stored in the context tree can be re-used to speed-up query processing. This happens for a similar query, i.e., for a query whose context state has similar values with a previous one. In this case, the results of the new query are approximate. We also show how search in the context tree can be improved by using a variation of a Bloom-based filter for testing membership in the tree.
To generalize our contextual preference model, we define context descriptors that express conditions on context parameters [5, 6]. Users employ context descriptors to express their preferences on specific database instances for a variety of context states expressed with varying levels of detail. Therefore, in this extended model, we allow contextual preferences to involve more than one context parameter. Context descriptors are used also to define contextual queries, that is, queries enhanced with context.
To store multidimensional contextual preferences, we use a data structure called profile tree. The general idea is to store in the profile tree all context states of all preferences of a profile. Each context state corresponds to a single root-to-leaf path in the tree. In each leaf of the tree, we store the predicates and interest scores applicable to the context state corresponding to the path leading to this leaf. For example, assume the following preferences:
Figure 2 depicts the profile tree for the above preferences.
Figure 2. An instance of a profile tree
Given now a contextual query, the context resolution problem is the problem of identifying those preferences whose context states are the most relevant to the context state of the query. The problem can be divided into two steps: (a) the identification of all the candidate context states that encompass the query state and (b) the selection of the most appropriate state among these candidates. The first sub-problem is resolved through the notion of the covers partial order between states that relates context states expressed at different levels of abstraction. For instance, the notion of coverage allows relating a context state in which location is expressed at the level of a city and a context state in which location is expressed at the level of a country. To resolve the second sub-problem, we consider two distance metrics that capture similarity between context states.
Often the context of a query is too specific to match any of the available preferences. To handle this issue, we consider hierarchical relaxation as the approach of replacing the value of one context parameter by a corresponding value at a different level of abstraction . Various related context states may be produced by relaxing one or more of the context parameters. Such relaxations are ranked according to their similarity to the original query state. Similarity is defined based on the number of parameters that are relaxed and the associated depth of such relaxations. To identify the preferences whose context states match best the relaxed context of a query, we use a prefix-based representation, called profile tree, for the set of context states of the preferences. A similar data structure is used for the query context states.
To speed up context resolution, we propose data structures and algorithms that exploit the hierarchical nature of context parameters . In the same work, we present a scheme for caching results of previous queries executed at a specific context, so that these results are re-used by subsequent queries. In particular, we extend the profile tree so that at each leaf in addition to the related predicates and scores, we cache the results of previous queries computed using the preferences whose context correspond to the path leading to this leaf. When a contextual query is submitted, we search for each of its context states, to see if the related path maintains already cached results. If this is the case, the cached results are used. Otherwise, we use the predicates and scores in the related leaf to compute the results. These results are then cached.
In our current work [9, 10], we address the problem of scoring database tuples based on contextual preferences. In particular, given a set of contextual preferences, a database instance and a query, we are interested in providing the user with the most preferable tuples for the current context. Assuming that the database is large and only a few tuples are of interest at any given context, sorting the whole database for each query and context will result in both wasting resources and slow query responses. Thus, we propose pre-processing steps that can be used to reduce the online time for processing each query.
At one extreme, we could compute all different scores for each tuple for all potential context states. Since, only a few tuples may be of interest at each context state, we propose computing scores only for relevant tuples (i.e., tuples for which there is sufficient interest). However, the number of potential scores may be still large, since the number of context states grows rapidly with the number of context attributes. For many applications, context includes a large number of attributes with domains of varying sizes. Thus, we propose computing scores only for representative context states. Our method for identifying context representatives exploits the hierarchical nature of context attributes and can be applied to both quantitative and qualitative preferences.
The idea of finding context representatives is based on the premise that preferences for similar context states produce similar scores. We run a related experiment to explore this. Since there are no real large profile data sets defined using our contextual preferences, we used a real dataset of movie ratings that includes 1000 users, 4000 movies and 150000 ratings . Ratings are of the form (user-id, movie-id, rating-value), with rating-value in the range [1, 5]. For users, there is information available of the form (user-id, sex, age, occupation) that we use as our context environment. We constructed simple predicates that involve the genre of the movies by averaging the rates assigned by each user to movies of each genre. We consider five values for the genre attribute namely, comedy, action, thriller, horror and drama. Using the profile such constructed, we show how preferences vary with context (i.e., user attributes). We compute the distance between two users as the distance of their relative context states with attributes: user-id, sex, age, and occupation. The ratings of each user are represented with a 5 × 5 matrix, where there is one row for each rating (1 to 5) and one column for each movie genre. We compute the distance between two ratings as the distance of their matrixes. As shown in Figure 3, the distance between ratings increases as the distance between users increases.
Figure 3. Distance of rankings as a function of distance between users
We also consider a complementary method for grouping preferences based on identifying those preferences that result in similar scores for all database tuples. This method takes advantages of the quantitative nature of preferences and groups together contextual preferences that have similar predicates and scores. The method is based on a novel representation of preferences through a predicate bitmap table whose size depends on the desired precision of the resulting scoring.
 E. Pitoura, K. Stefanidis and A. B. Zaslavsky. Context in Databases. Technical Report TR 2004-19, Department of Computer Science, University of Ioannina, Greece. [Local copy]
 K. Stefanidis, E. Pitoura and P. Vassiliadis. On Supporting Context-Aware Preferences in Relational Database Systems. In Proc. of the first International Workshop on Managing Context Information in Mobile and Pervasive Environments (MCMP 2005), in conjunction with MDM 2005, May 9, 2005, Ayia Napa, Cyprus. [Local copy]
 K. Stefanidis, E. Pitoura and P. Vassiliadis. Modeling and Storing Context-Aware Preferences. In Proc. of the tenth East-European Conference on Advances in Databases and Information Systems (ADBIS 2006), 3-7 September, 2006, Thessaloniki, Greece. [Local copy]
 K. Stefanidis, E. Pitoura and P. Vassiliadis. A Context-Aware Preference Database System. International Journal of Pervasive Computing and Communications. Accepted, December 2005. To appear. [Local copy]
 K. Stefanidis, E. Pitoura and P. Vassiliadis. Preference Queries in Context: An Overview. Hellenic Data Management Symposium (HDMS), 7-8 September, 2006, Thessaloniki, Greece. Abstract (in Greek).
 K. Stefanidis, E. Pitoura and P. Vassiliadis. Adding Context to Preferences. In Proc. of the 23rd International Conference on Data Engineering (ICDE 2007), 15-20 April, 2007, Istanbul, Turkey. [Local copy]
 K. Stefanidis, E. Pitoura and P. Vassiliadis. On Relaxing Contextual Preference Queries. In Proc. of the second International Workshop on Managing Context Information and Semantics in Mobile Environments (MCISME 2007), in conjunction with MDM 2007, May 7, 2007, Mannheim, Germany. [Local copy]
 K. Stefanidis, E. Pitoura and P. Vassiliadis. Multidimensional Context for Database Preferences. Submitted.
 K. Stefanidis and E. Pitoura. Approximate Contextual Preference Scoring in Digital Libraries. In Proc. of the tenth DELOS Thematic Workshop on Personalized Access, Profile Management and Context Awareness in Digital Libraries (PersDL 2007), in conjunction with UM 2007 Conference, June 29-30, 2007, Corfu, Greece. [Local copy]
 K. Stefanidis and E. Pitoura. Fast Contextual Preference Scoring of Database Tuples. EDBT 2008, accepted. [Local copy]
 MovieLens 2003. Available at: www.grouplens.org/data.