Tutorial on Information Retrieval & Text Analytics
Seminar of the CUSO Ph.D. program in Computer Science
In Neuchatel, Tuesday, April 2nd, 2013, 9:15 AM
Just before CORIA 2013, 3 - 5 April 2013 in Neuchatel
The seminar will be held at UniNE at Uni Mail (Room F200, second floor, Building F).
A map to reach the Computer Science Department
Speakers
Topics
A Tutorial on Text Analytics
Prof. Jamie Callan
Many organizations need to analyze large amounts of text to discover
useful information. This tutorial provides students with an
understanding of common and emerging methods of summarizing and
analyzing material in large collections of unstructured and
lightly-structured text ('text analytics'). The tutorial begins by
covering basic and advanced text representation techniques and
similarity metrics used in full-text search engines and text mining
software. It then explores the use of these core techniques to
accomplish different types of analysis tasks, for example, frequency
and co-occurrence analysis, sentiment analysis, and expert finding.
This tutorial assumes a typical Computer Science background, and a
very basic knowledge of linear algebra, probability, and statistics.
Information Retrieval: Evaluation
Donna Harman
Evaluation has always been a critical component of experimentation in all areas of research, and the information retrieval community has been particularly fortunate to have had excellent methodologies for evaluation right from its beginning. These methodologies, often referred to as the Cranfield paradigm, first led to a solid foundation of research, followed by continuous improvements since then. The tutorial will start with an introduction to the Cranfield paradigm, with an emphasis on the reasoning behind its development. This will be followed by an extensive examination of how this paradigm has been adapted to current research in information access, using the TREC evaluations as a case study.
The second part of the tutorial will look at evaluation techniques that involve the interaction between users and information access technologies. This includes the types of user studies done in commercial search engines and those done in academic settings, with discussion of both usability studies and user studies designed for more generalized testing of new information access techniques.
The final part of the tutorial will be a short summary of some of the evaluation methodologies used in other related fields of human language technology, including summarization, question answering, speech recognition, video retrieval and machine translation. The goal here is to compare and contrast the differing evaluation philosophies in these areas.
Timetable
Room: F200 Aula Louis-Guillaume
9h15 - 10h45: Text Analytics, Part I (text representation, retrieval models, clustering) (Jamie Callan)
10h45 - 11h05: Coffee break
11h05 - 12h30: Information Retrieval Evaluation, Part I (Cranfield and TREC paradigm) (Donna Harman)
12h30 - 13h30: Lunch
13h30 - 14h45: Information Retrieval Evaluation, Part II (user-centered evaluation) (Donna Harman)
14h45 - 15h05: Coffee break
15h05 - 16h30: Text Analytics, Part II (named entities, frequency, sentiment analysis, expert finding) (Jamie Callan)
Registration for this seminar is free of charge but mandatory.
Contact Mitra Akasereh to receive the required material.