I00041 (I00041)
Information Retrieval*
2006/2007, 05-02-2007 through 01-07-2007
Computer Science - Master's variant C (2003): Thematic specialization Artificial Intelligence (6 ec); Information Systems (6 ec); Computer Science elective (6 ec)
Computer Science - Master's variant E (2003): Computer Science elective (6 ec)
Computer Science - Master's variant MT (2003): Thematic specialization Artificial Intelligence (6 ec); Information Systems (6 ec); Information Systems (6 ec); Artificial Intelligence (6 ec); Computer Science elective (6 ec)
Computer Science - Master's variant O (2003): Thematic specialization Artificial Intelligence (6 ec); Information Systems (6 ec); Computer Science elective (6 ec)
Computer Science - Master's variant O (2005): Thematic specialization Artificial Intelligence (6 ec); Information Systems (6 ec); Computer Science elective (6 ec)
Computer Science - Master's after HBO, Artificial Intelligence, variant MT (2004): Thematic specialization Artificial Intelligence (6 ec); Computer Science elective (6 ec)
Computer Science - Master's after HBO, Artificial Intelligence, variant O (2004): Thematic specialization Artificial Intelligence (6 ec); Computer Science elective (6 ec)
Computer Science - Master's after HBO, Computer Security, variant MT (2003): Computer Science elective (6 ec)
Computer Science - Master's after HBO, Computer Security, variant O (2004): Computer Science elective (6 ec)
Computer Science - Master's after HBO, Embedded Systems, variant MT (2003): Computer Science elective (6 ec)
Computer Science - Master's after HBO, Embedded Systems, variant O (2004): Computer Science elective (6 ec)
Computer Science - Master's after HBO, Information Systems, variant MT (2003): Thematic specialization (6 ec)
Computer Science - Master's after HBO, Information Systems, variant O (2004): Thematic specialization (6 ec); Computer Science elective (6 ec)
Computer Science - Master's after HBO, Software Construction, variant MT (2003): Computer Science elective (6 ec)
Computer Science - Master's after HBO, Software Construction, variant O (2004): Computer Science elective (6 ec)
Information Science - Master's (2004): elective space (6 ec)
Information Science after HBO (2003): elective space (6 ec)
scope
6 ec (168 hours): 60 hours plenary lectures, 0 hours group lectures, 0 hours computer lab, 68 hours 'dry' practical work, 0 hours consultation with the lecturer, 0 hours consultation with fellow students (working groups, project work, etc.), 40 hours self-study
investment
6 ec * 28 h/ec + #students * (1 + 6 ec * 0.15 h/student/ec)
tentative staffing

examiner: prof. dr. ir. Theo van der Weide
department: das
time spent: 265 hours

dedicated website
http://blackboard.ru.nl/bin/common/course.pl?course_id=_15923_1

 

IR (A constructive approach to Information Retrieval) covers the foundations of Information Retrieval:

  1. How do people search for information, and how can this be formalized?
  2. How do people describe what they mean, and how can we formalize meaning?
  3. How can these points be combined?
An important application area is the Internet.

Learning objectives

The goal of the course IR (A constructive approach to Information Retrieval) is that its participants

  1. are familiar with the base models that are used for Information Retrieval.
  2. have knowledge of query languages, both syntactically and semantically.
  3. are familiar with information extraction from documents, inter-document relations and their appreciation.
  4. have insight into and proficiency in the design and construction of search engines.
  5. have insight into interaction techniques that support searchers in their quest for information.
  6. have some experience with scientific literature in this field.

Topics

The course consists of three main parts:

  1. Fundamentals
    1. The problem areas of Information Retrieval.
    2. Evaluation methods for Information Retrieval.
    3. The Boolean model, together with techniques related to inverted-list document representation.
    4. The vector model, the most widely used model. As a method for knowledge extraction, the singular value decomposition (related to principal component analysis) is discussed.
    5. The probabilistic model applies Bayesian learning techniques to Information Retrieval.
    The models are discussed both from a cognitive and a computational point of view.
  2. Knowledge extraction and information processing
    1. Query languages in relation to cognitive aspects of information searching.
    2. Autonomous query improvement techniques (global context analysis). Guided query improvement techniques (feedback).
    3. Pseudo relevance feedback (local context analysis).
    4. Clustering techniques for knowledge extraction.
  3. Document relations on the Web
    1. Web retrieval.
    2. Exploring the reference structure between documents (for example, PageRank).
    3. Exploring document ratings (collaborative techniques).
    4. Special topics contributed by the participants.
During the course, guest speakers are invited to discuss state-of-the-art topics.

Background

Finding relevant documents no longer seems to be the major challenge for state-of-the-art search engines. Whereas recall and precision were the major concerns in the early days of search engines, the main concern nowadays is to convey information rather than just data. Offering a long list of documents ordered by relevance score is known to be too simple an interface.

In order to improve on this, solid knowledge of the information retrieval problem and its main techniques is imperative.

As there are still many questions about the essentials, a strong relation with ongoing research activities is indispensable.

Teaching methods

  1. The course is divided into three parts; each part concludes with a test.
  2. Each week there are 4 contact hours, in which the new material is presented and practised.
  3. The participants have to make a contribution to the course (see below).
Student contribution
Participants have to choose a topic from the most recent TREC conference. These contributions will be centered around special themes in Information Retrieval. The themes will vary from year to year. The actual themes will be announced during the lectures.

The students write an extended summary of the chosen topic and present it during a lecture. The contributions are peer reviewed by the participants of the course.

Prerequisites

Participants in IR (A constructive approach to Information Retrieval) should have the basic qualifications provided by the Bachelor's programme in Computer Science (Informatica) or Information Science (Informatiekunde).

Examination

The exam for IR consists of 4 exercises, and leads to a mark (e).

The first three exercises correspond to the three parts of the course. If the corresponding test resulted in a mark ≥ 6, the participant may choose to skip that exercise of the exam. In that case the mark for the test is the mark for that exercise. If the participant chooses to make the exam exercise anyway, the mark for the test is cancelled.

The 4th exercise is associated with the personal student contribution. If this contribution has a mark ≥ 6, then that mark will be the score of exercise 4. Otherwise, the student will have to make exercise 4.

During the course, homework exercises will be handed out. Each exercise is graded with a score of -1, 0 or +1. Together these scores result in a bonus score (b).

The final result is obtained as follows:

if e ≥ 6
then e + b/10
else e
This special arrangement is not valid during the re-exam.
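
The marking rules above can be sketched in Python. This is only an illustration: the function names `exercise_mark` and `final_result` are hypothetical, and how the four exercise marks combine into e, or how the homework scores combine into b, is not specified here, so both are simply taken as inputs.

```python
from typing import Optional


def exercise_mark(test_mark: float, exam_mark: Optional[float]) -> float:
    """Mark for one of the first three exam exercises.

    test_mark is the mark of the corresponding test; exam_mark is the mark
    for the exam exercise, or None if the participant skipped it.
    (Hypothetical helper, written for illustration only.)
    """
    if exam_mark is not None:
        # Making the exam exercise cancels the test mark.
        return exam_mark
    if test_mark >= 6:
        # A test mark of at least 6 may stand in for the exam exercise.
        return test_mark
    raise ValueError("a test mark below 6 cannot replace the exam exercise")


def final_result(e: float, b: float) -> float:
    """Final result from exam mark e and bonus score b.

    The bonus only counts when the exam mark is at least 6.
    """
    return e + b / 10 if e >= 6 else e
```

For example, an exam mark e = 7 with a bonus score b = 3 gives a final result of 7.3, while e = 5.5 stays 5.5 regardless of the bonus.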

Possible combinations

This course is a part of the Da Vinci series of courses.

Literature

Lecture notes will be made available via Blackboard.


Evaluation: student surveys; no lecturer evaluation known. Success rate: 31 started, actually participated, passed at first attempt, passed in total