Seminar of the CUSO third cycle

Information Retrieval: New Challenges, new Perspectives

Thursday, March 18th, 2010
Geneva, Switzerland

In Geneva, starting at 9:15 AM on Thursday, March 18th, 2010

Just before SIGIR 2010, 19-23 July 2010 in Geneva

Registration for this seminar is free of charge but mandatory.
Contact Stephane Marchand-Maillet to register and to receive the required material.

The seminar will be held at UniGE at Battelle (Room 319, second floor, Building A).


Prof. Efthimis N. Efthimiadis
Information School
University of Washington (USA)

Prof. Hsin-Hsi Chen
Natural Languages Processing Laboratory
National Taiwan University

Prof. Fabio Crestani
Faculty of Informatics
University of Lugano (USI)

Stephane Marchand-Maillet
Computer Vision and Multimedia Lab
University of Geneva

Prof. Jacques Savoy
Information Retrieval Group
University of Neuchatel


09:15 - 10:30  Cross-lingual IR (J. Savoy, L. Dolamic, slides)
10:30 - 10:50  Coffee break
10:50 - 11:30  Opinion Mining and Sentiment Analysis (H.-H. Chen, slides)
11:30 - 12:00  Opinion Detection and Categorization (O. Zubaryeva, J. Savoy, slides)
12:00 - 13:30  Lunch
13:30 - 15:00  Exploiting Multiple Information Sources for Multimedia IR (S. Marchand-Maillet, slides)
15:00 - 15:30  Visual Search Engine Evaluation (E. Efthimiadis, slides)
15:30 - 15:50  Coffee break
15:50 - 16:15  Analyzing Query Reformulation and Search Abandonment in Web Search (E. Efthimiadis, slides)
16:15 - 17:30  IR in Context (F. Crestani, slides)


Visual Search Engine Evaluation

Prof. Efthimis N. Efthimiadis

The results from two visual search engine evaluation studies will be presented. Visual search engines are engines that present the results in some visual or graphical representation.

The first study investigates how people search when using an engine with a visual user interface (UI) compared to the traditional text-based UI. The three search engines used in the evaluation are one visual, one hybrid, and one text-based (Google).

The second study investigates how people search when using three engines with visual interfaces: KartOO, SearchMe and ViewZi.

The goals of each study were to investigate visual search by comparing how people search using the three types of UIs; to identify user satisfaction levels with each system following a search; to establish how the visualizations help or hinder users with the search and navigation process; and to study query formulation and reformulation patterns.

Two sets of three queries each were used for searching the six engines. These queries are of the "informational" type (per Broder's classification of queries), which corresponds to typical "subject" searches. To simulate realistic user needs, we developed scenarios that give enough background information and reasons for the search, as well as the type of information required to satisfy the information need. To account for learning effects, both systems and queries were randomly allocated using a Latin square design. The study participants belong to two groups of users: a group of iSchool students and a group of non-iSchool searchers.
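
As an illustration of the Latin square allocation mentioned above, here is a minimal sketch. It assumes a simple cyclic construction and hypothetical system and query labels; the abstract does not specify how the study actually built or randomized its squares.

```python
from typing import List, Tuple


def latin_square(n: int) -> List[List[int]]:
    """Cyclic n x n Latin square: every value occurs once per row and per column."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def allocate(participant: int, systems: List[str], queries: List[str]) -> List[Tuple[str, str]]:
    """Return the (system, query) pairs for one participant, in presentation order.

    Assumption for illustration: systems rotate with every participant,
    queries rotate after each block of n participants.
    """
    n = len(systems)
    square = latin_square(n)
    system_order = square[participant % n]
    query_order = square[(participant // n) % n]
    return [(systems[s], queries[q]) for s, q in zip(system_order, query_order)]


# Hypothetical labels, not the names used in the study.
SYSTEMS = ["visual", "hybrid", "text"]
QUERIES = ["Q1", "Q2", "Q3"]

if __name__ == "__main__":
    for p in range(3):
        print(p, allocate(p, SYSTEMS, QUERIES))
```

Within each block of three participants, every system is paired with every query exactly once and appears once in each position, which is the property that counterbalances order and learning effects.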

After each search task, a questionnaire elicited information on performing the task and satisfaction with the search process. At the end of the three search tasks, further questions asked participants to compare the three search engines. During the search, participants were asked to follow the "think aloud" process, and the search was logged using a modified version of the DejaClick plug-in for the Firefox browser. The information logged includes date, time, URL, search terms, and whether searchers used one or multiple windows to open and view the results/pages. The logs can be used to replay the entire session as well as to export the data for further analysis. Eye-tracking technology was used to capture participants' eye movements and investigate how they interact with the search results.

Analyzing Query Reformulation & Search Abandonment in Web Search

Prof. Efthimis N. Efthimiadis

Users frequently modify a previous search query in the hope of retrieving better results. These modifications are called query reformulations or query refinements. Existing research has studied how web search engines can propose reformulations, but has given less attention to how people perform query reformulations.

Search abandonment occurs when users leave the search results page without taking any action after a series of queries. This can be positive, if an answer is found in the result set and there is no need to click a result, or negative, if there is no relevant answer in the displayed results. In this talk I will discuss the development and evaluation of a rule-based classifier for query reformulation, an SVM classifier for query reformulation and search abandonment, and a user-based study investigating search abandonment.
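
To give a flavour of what a rule-based classifier for query reformulation can look like, here is a minimal sketch. The rules and category names below are assumptions made for illustration (simple term-set comparisons between two consecutive queries); they are not the rule set presented in the talk.

```python
def classify_reformulation(previous: str, current: str) -> str:
    """Label how `current` relates to `previous` using term-set rules.

    The category names are hypothetical and chosen for this sketch only.
    """
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())

    if prev_terms == curr_terms:
        return "repeat"            # same terms, possibly reordered
    if prev_terms < curr_terms:
        return "add_terms"         # user narrowed the query
    if curr_terms < prev_terms:
        return "remove_terms"      # user broadened the query
    if prev_terms & curr_terms:
        return "substitute_terms"  # some terms kept, others replaced
    return "new_query"             # no overlap at all


if __name__ == "__main__":
    session = ["jaguar", "jaguar car", "jaguar car price", "jaguar animal"]
    for prev, curr in zip(session, session[1:]):
        print(f"{prev!r} -> {curr!r}: {classify_reformulation(prev, curr)}")
```

A real classifier would likely also need to handle spelling corrections, stemming variants, and abbreviations, which pure term-set rules like these cannot detect.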

Opinion Mining and Sentiment Analysis

Prof. Hsin-Hsi Chen

In this lecture, I will present an opinion analysis system, which extracts from the Web opinions about specific targets, summarizes the polarity and strength of these opinions, and tracks opinion variations over time. Then, I will discuss how to discover relationships among objects based on their opinion tracking plots and collocations. Finally, I will talk about sentiment analysis in weblogs using machine learning techniques such as support vector machines and conditional random fields.

Information Retrieval in Context

Prof. Fabio Crestani

Despite the field's many success stories, it is now a well-recognised fact in Information Retrieval that the query alone does not provide sufficient information for an IR system to interpret the information need of the user and correctly identify relevant documents to retrieve. So, it is becoming more and more important to use any other form of information that is available in relation to the task, the user and the specific interaction (e.g. click-through data, location information, previous history of searches, explicit user preferences, etc.). This information is collectively known as context. In this talk I will present the research directions and the state of the art of context-aware IR. I will also present some examples of context-aware IR that have already been implemented and experimented with. The talk will end with an open discussion of what context is and how context can be captured and used to improve the IR process.

Cross-Language Information Retrieval

Prof. Jacques Savoy

A lot of early work in Information Retrieval was exclusively focused on retrieval of English text documents. This limitation started to be addressed in earnest in the early 80s, with the advent of evaluation campaigns such as TREC in the 90s being a major force behind this development. In this lecture, we will show how to systematically extend basic monolingual indexing and matching, and adapt them for working with other languages. We will cover issues pertaining to the indexing process such as tokenization/segmentation (including Asian languages), word normalization, and stemming/decompounding. We will discuss the effect of these measures on retrieval effectiveness.
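
The indexing steps listed above (tokenization, word normalization, stemming) can be sketched as a small pipeline. This is only an illustration under simplifying assumptions: the `light_stem` rule is a toy English plural stripper invented for this sketch, not a stemmer discussed in the lecture, and the regular-expression tokenizer assumes words are delimited by spaces or punctuation.

```python
import re
import unicodedata
from typing import List


def normalize(text: str) -> str:
    """Lowercase the text and strip diacritics (e.g. 'é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def light_stem(token: str) -> str:
    """Toy plural stripper for English (illustrative assumption only)."""
    if len(token) > 4 and token.endswith("ies"):
        return token[:-3] + "y"
    if len(token) > 3 and token.endswith("s") and not token.endswith(("ss", "us")):
        return token[:-1]
    return token


def index_terms(text: str) -> List[str]:
    """Normalize, tokenize on word characters, then stem each token."""
    tokens = re.findall(r"\w+", normalize(text))
    return [light_stem(t) for t in tokens]


if __name__ == "__main__":
    print(index_terms("Queries about cafés"))
```

Each stage is language dependent: a tokenizer based on word boundaries does not segment languages written without spaces, and suffix rules tuned for English do not transfer to other languages, which is why the pipeline has to be adapted language by language.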

Understanding how to adapt IR systems successfully for many languages is a necessary prerequisite for tackling the problem of Cross-Language Information Retrieval (CLIR), i.e. the retrieval of documents written in a language different from the language of the user's request.

Exploiting Multiple Information Sources for Multimedia Information Retrieval

Stephane Marchand-Maillet

Multimedia is, by definition, composed of multiple parallel information sources (media), on which several views may be taken via feature extraction. Metadata and data associated with usage and users also form relevant sources of information for retrieving multimedia material. In this lecture, we look at learning-based mechanisms to jointly use all this information to extract explicit or implicit knowledge about the core data. We present several example applications of this framework.
