Visual Search Engine Evaluation
Prof. Efthimis N. Efthimiadis
The results from two visual search engine evaluation studies will be presented.
Visual search engines are search engines that present their results in some visual or graphical representation.
The first study investigates how people search when using an engine with a visual user interface (UI)
compared to the traditional text-based UI. The three search engines used in the evaluation are one visual
engine (www.kartoo.com), one hybrid engine (www.quintura.com),
and one text-based engine (Google).
The second study investigates how people search when using three engines with visual interfaces:
KartOO, SearchMe and ViewZi.
The goals of each study were to investigate visual search by comparing how people search using the
three types of UIs; identifying user satisfaction levels with each system following a search; establishing
how the visualizations help or hinder users in the search and navigation process; and studying query formulation
and reformulation patterns.
Two sets of three queries each were used for searching the six engines. These queries are of the
"informational" type (per Broder's classification of queries), which corresponds to the typical "subject" search.
To simulate realistic user needs, we developed scenarios for the searchers that provide enough background
information and reasons for the search, as well as the type of information required to satisfy
the information need. To account for learning effects, both systems and queries were randomly allocated
using a Latin square design. The study participants belong to two groups of users: a group of iSchool
students and a group of non-iSchool searchers.
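As a minimal sketch of the counterbalancing a Latin square design provides, the following Python fragment generates a cyclic Latin square over the three engines of the first study; the assignment of rows to participant groups is illustrative, not the study's actual allocation.

    # Illustrative Latin-square allocation of engines to participant groups,
    # used to counterbalance learning effects. Names are from study 1.
    ENGINES = ["KartOO", "Quintura", "Google"]

    def latin_square(items):
        """Return an n x n Latin square: each item occurs once per row and column."""
        n = len(items)
        return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

    for group, ordering in enumerate(latin_square(ENGINES), start=1):
        print(f"Participant group {group} searches the engines in order: {ordering}")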
After each search task, a questionnaire elicited information on performing the task and satisfaction
with the search process. At the end of the three search tasks, participants answered questions comparing
the three search engines. During the search, participants were asked to follow the "think aloud"
protocol, and the search was logged using a modified version of the DejaClick plug-in for the Firefox browser.
The information logged includes date, time, URL, search terms, and whether searchers used one or multiple windows
to open and view the results/pages. The logs can be used to replay the entire session as well as to export the
data for further analysis.
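As an illustration of the kind of post-hoc analysis such exported logs support, the sketch below counts how often each query was issued in a session; the CSV column names are assumptions for the example, not the actual DejaClick export schema.

    import csv
    from collections import Counter

    def load_log(path):
        # Assumed export format: one row per logged event with
        # date, time, url, query and window_id columns (hypothetical schema).
        with open(path, newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))

    def query_counts(records):
        """Count how often each distinct query string was issued."""
        return Counter(r["query"] for r in records if r.get("query"))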
Eye-tracking technology was used to capture participants' eye movements and investigate how they interact with
the search results.
Analyzing Query Reformulation & Search Abandonment in Web Search
Prof. Efthimis N. Efthimiadis
Users frequently modify a previous search query in the hope of retrieving better results. These modifications are called
query reformulations or query refinements. Existing research has studied how web search engines can propose
reformulations, but has given less attention to how people perform query reformulations.
Search abandonment occurs when users leave the search results page without taking any action after a series
of queries. This can be positive, if an answer is found in the result set and there is no need to click a
result, or negative, if there is no relevant answer in the displayed results.
In this talk I will discuss the development and evaluation of a rule-based classifier for query reformulation,
an SVM classifier for query reformulation and search abandonment, and a user-based study investigating
how people perform query reformulations.
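As a rough illustration of what a rule-based reformulation classifier can look like, the sketch below compares consecutive queries term by term; the category names and rules are illustrative assumptions, not the taxonomy developed in the talk.

    # Illustrative rule-based classifier for query reformulation types.
    # The categories and rules are assumptions for this sketch.
    def classify_reformulation(prev_query, next_query):
        prev_terms = set(prev_query.lower().split())
        next_terms = set(next_query.lower().split())
        if prev_terms == next_terms:
            return "repeat"
        if prev_terms < next_terms:
            return "specialization"      # terms were only added
        if next_terms < prev_terms:
            return "generalization"      # terms were only removed
        if prev_terms & next_terms:
            return "word substitution"   # partial overlap, partial change
        return "new query"               # no terms in common

    # e.g. classify_reformulation("jaguar", "jaguar car") -> "specialization"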
Opinion Mining and Sentiment Analysis
Prof. Hsin-Hsi Chen
In this lecture, I will present an opinion analysis system, which
extracts opinions about specific targets from the Web, summarizes the
polarity and strength of these opinions, and tracks opinion variations over
time. Then, I will discuss how to discover relationships among objects
based on their opinion tracking plots and collocations. Finally, I will
talk about sentiment analysis in weblogs using machine learning
techniques such as support vector machines and conditional random fields.
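As a minimal sketch of the machine-learning approach mentioned above, the following fragment trains a linear support vector machine for polarity classification with scikit-learn; the toy training posts are invented for illustration only.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Toy training data, invented purely for illustration.
    train_posts = [
        "great phone, love the camera",
        "terrible battery, very disappointed",
        "works perfectly, highly recommended",
        "awful service, would not buy again",
    ]
    train_labels = ["positive", "negative", "positive", "negative"]

    # TF-IDF features over unigrams and bigrams feeding a linear SVM.
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    model.fit(train_posts, train_labels)

    print(model.predict(["the camera is great, highly recommended"]))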
Information Retrieval in Context
Prof. Fabio Crestani
Despite its many success stories, it is now well recognised in Information Retrieval that the
query alone does not provide sufficient information for an IR system to interpret the information need of the
user and correctly identify relevant documents to retrieve. It is therefore becoming increasingly important to
use any other form of information that is available in relation to the task, the user, and the specific
interaction (e.g. click-through data, location information, previous history of searches, explicit user
preferences, etc.). This information is collectively known as context. In this talk I will present the
research directions and the state of the art of context-aware IR. I will also present some examples of
context-aware IR that have already been implemented and experimented with. The talk will end with an open
discussion of what context is and how context can be captured and used to improve the IR process.
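As a simple sketch of how contextual signals might be folded into ranking, the fragment below boosts results that match the user's click history and location; the result fields and weights are assumptions for illustration, not a method from the talk.

    def rerank(results, clicked_domains, user_city, w_click=0.3, w_geo=0.2):
        """Re-rank results by adding context bonuses to the base relevance score.

        results: list of dicts with hypothetical 'url', 'score' and 'city' keys.
        """
        def contextual_score(doc):
            score = doc["score"]
            if any(domain in doc["url"] for domain in clicked_domains):
                score += w_click  # boost sources the user has clicked before
            if doc.get("city") == user_city:
                score += w_geo    # boost geographically local results
            return score

        return sorted(results, key=contextual_score, reverse=True)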
Cross-Language Information Retrieval
Prof. Jacques Savoy
Much early work in Information Retrieval focused exclusively on the retrieval of English text documents.
This limitation began to be addressed in earnest in the early 1980s, with the advent of evaluation
campaigns such as TREC in the 1990s being a major force behind this development. In this lecture, we will
show how to systematically extend basic monolingual indexing and matching, and adapt them for working
with other languages. We will cover issues pertaining to the indexing process, such as
tokenization/segmentation (including for Asian languages), word normalization, and stemming/decompounding.
We will discuss the effect of these measures on retrieval effectiveness.
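As a minimal sketch of these indexing steps, the following Python fragment tokenizes, naively decompounds against a toy vocabulary, and stems with NLTK's Snowball stemmer; real decompounders and normalizers are considerably more involved.

    import re
    from nltk.stem.snowball import SnowballStemmer

    def index_terms(text, language="german", vocabulary=frozenset()):
        """Tokenize, naively decompound, and stem the input text."""
        stemmer = SnowballStemmer(language)
        terms = []
        for token in re.findall(r"\w+", text.lower()):
            parts = [token]
            # Toy greedy decompounding: split if both halves are known words.
            for i in range(3, len(token) - 2):
                if token[:i] in vocabulary and token[i:] in vocabulary:
                    parts = [token[:i], token[i:]]
                    break
            terms.extend(stemmer.stem(p) for p in parts)
        return terms

    # e.g. index_terms("Bundesbank", vocabulary=frozenset({"bundes", "bank"}))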
Understanding how to successfully adapt IR systems for many languages is a necessary prerequisite
to tackling the problem of Cross-Language Information Retrieval (CLIR), i.e. the retrieval of documents written
in a language different from the language of the user's request.
Exploiting Multiple Information Sources for Multimedia Information Retrieval
Multimedia is, by definition, composed of multiple parallel information sources (media),
on which several views may be taken via feature extraction. Metadata, as well as data associated with
usage and users, also form relevant sources of information for retrieving multimedia material.
In this lecture, we look at learning-based mechanisms to jointly use all this information to
extract explicit or implicit knowledge about the core data. We present several example
applications of this framework.
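As an illustrative sketch of jointly using such parallel sources, the fragment below performs a simple weighted late fusion of per-modality relevance scores; the modality names and weights are assumptions for the example, not the learning-based mechanisms presented in the lecture.

    def fuse_scores(per_modality_scores, weights):
        """Weighted late fusion of per-modality scores into a single ranking.

        per_modality_scores: {doc_id: {modality_name: score}} (illustrative).
        """
        fused = {}
        for doc_id, scores in per_modality_scores.items():
            fused[doc_id] = sum(weights[m] * s for m, s in scores.items())
        return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

    ranking = fuse_scores(
        {"img1": {"visual": 0.9, "text": 0.2, "usage": 0.5},
         "img2": {"visual": 0.4, "text": 0.8, "usage": 0.6}},
        weights={"visual": 0.5, "text": 0.3, "usage": 0.2},
    )
    print(ranking)  # img1 ranks first under these illustrative weights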
Registration for this seminar is free of charge but mandatory.
Contact Stephane Marchand-Maillet (to receive the necessary material).
The seminar will be held at Battelle (Room 319).