I00068 (I00068)
Opslaan en terugvinden*
< 2006/2007 > 05-02-2007 t/m 01-07-2007 () L
Informatiekunde - Bachelor (2003) Subject van verandering en bestendiging (6 ec)
omvang
6 ec (168 uur) : 60 uur plenair college, 0 uur groepsgewijs college, 0 uur computerpracticum, 68 uur 'droog' practicum, 0 uur gesprekken met de docent, 0 uur onderling overleg met medestudenten (werkgroepen, projectwerk e.d.), 40 uur zelfstudie
investering
6 ec * 28 u/ec + #std * (1 + 6ec * 0.75 u/student/ec)
inzet tentatief

examinator
afdeling
tijdbesteding

prof. dr. ir. Theo van der Weide
das
220u.

speciale web-site
/is/edu/OnT/

 

In this course we start with the discussion of the most important hardware characteristics, and show how to build on the storage devices efficient and reliable file management systems.

Relational algebra is introduced as an intermediate between SQL and file management systems. The translation of SQL into relational algebra is discussed, and it is shown how relational algebra is used as a mechanism for query optimization. We also indicate how relational algebra is used to do lineage tracing in the context of data warehouses.

We then proceed with the mapping of the relational table structure and its elementary operators onto the file management system. We discuss the important file structuring mechanisms like ISAM, B+ tree and hashing.

We then discuss more advanced mechanism to retrieve information, that are based on the user point of view. We also relate this to structured files, especially in the context of XML.

Finally we focus on some special issues as transaction management and security.

Leerdoelen

After this course the student

  1. can describe the various storage techniques and can describe their performance for elementary access patterns.
  2. has knowledge of the execution of SQL queries in terms of relational algebra and has an elementary understanding of query optimization.
  3. has knowledge of the elementary principles of information retrieval.
  4. has knowledge of structured documents and is aware of their representation in terms of XML.

Onderwerpen

The course is organized in 3 blocks:

  1. Elementary organization storage media
    • Properties and characteristics of storage media. In detail are discussed: hard disc, optical disc and tape storage media.
    • Classic file management systems. The general allocation principles are discussed, and related to the FAT file system and the UNIX file system. Typical problems like defragmentation are treated.
    • Memory management. The importance of memory management techniques are explained, the most important replacement techniques are acquired.
    • The design of fault tolerant systems. Concepts like MTBF are introduced and related to probability theory. As an example the RAID technology is discussed in depth.
    • Advanced file management systems. The requirements of modern file management are discussed and exemplified with the Windows file system NTFS.
  2. File organization techniques.
    • Relational Algebra. The relational data model is introduced in depth, and the relational algebra as the basic operators to manipulate relational tables. The translation of SQL into relational algebra is given, and the base techniques of query optimization are discussed, leading to the pushdown normalform. It will also be indicated how relational algebra is used as a formal basis for tracing the lineage of data as used in the context of data warehouses to proof the validity of overview results.
    • Records. Elementary data storage techniques are discussed, and the various strategies to organize data into records.
    • List structure: the various variants of list structures and their representation in file structures.
    • The ISAM file organization method.
    • The B+ tree. This is introduced as a recursive extension of ISAM.
    • Hashing. Another way to efficiently organize data is the hashing technique. The main attention is on the application of hashing in a dynamic environment, such that growth and shrinkage of the data set are smoothly handled.
  3. Retrieval and structured documents: the storage and retrieval of semi-structured data, and relation with Information Retrieval to search large volumes of data.
    • Information Retrieval. The how and why of retrieval is discussed from the point of view of the searcher. The intention is to go beyond the limitations of formatted boolean retrieval as offered for example by SQL.
    • XML. Modern techniques to organize storage and retrieval of semi-structured data. Typical subjects are: the object model, path expressions, retrieval algebra. The main approaches will be discussed, the students will get hands-on in the basic application of these techniques.
    • Concurrency In a multi-user context, special design issues are important. This will be beased on the SQL trasaction mechanism. Some motivating examples are given after which the concept of ACID transactions is discussed. Some main results are presented, and the students will learn how to scheduling problems can be detected and avoided.
    • Security. SQL provides some mechanisms to protect data against unauthorizes usage. These will be discussed from a broader insight into security problems in general.

Toelichting

Special attention is paid to the design of efficient and stable realizations of conceptual structures. Formal background is not avoided, but formalism itself is not a teaching goal of this course. The intention is that students get better insight in backgrounds, supported by hands-on in the use of formal methods, which enables them to apply their knowledge in new situations.

Werkvormen

  1. The course is divided in three parts, each part is concluded with a test.
  2. Failing a test is no obstacle to attend the next part. In fact, the 2nd and 3th block provide the student an opportunity to make a fresh restart.
  3. Each week there are 4 contact hours, in which the new material is presented and exercised.
  4. The participants have to make a small personal contribution to the course. Details will be aanounced during the course.

Vereiste voorkennis

First semester courses Informatiekunde.

Tentaminering

The exam for this course consists of 4 exercises, which lead to a mark (e).

The first three exercises correspond to the three parts of the course. If the corresponding test resulted in a mark ≥ 6, then the participant may choose to skip this of the exam. In that case the mark for the test is the mark for that exercise. Would the participant choose to make the exam exercise, then the mark for the test is assumed to be cancelled.

The 4th exercise is associated with the personal student contribution. If this contribution has a mark ≥ 6, then will be the score of exercise 4. In the other case, the student will have to make exercise 4.

During the course homework exercises will be handed out. Each exercise is reviewed with a score -1, 0 or +1. This results in a bonus/malus score (b).

The final result is obtained by rounding from the result of:

if e ≥ 6
then e + b/10
else e
This special arrangement is not valid during the re-exam.

Combinatiemogelijkheden

This course is a part of the DaVinci series of courses.

Literatuur

Lecture notes, distributed via Blackboard.

Suggested literature:

  1. Ramez Elmasri, Sham Navathe, Fundamentals of Database Systems, Addison Wesley
  2. Jeffrey D. Ullman, Jenniger Widom, A first course in database systems, Pearson Education International


Evaluatie: studentenquêtes ; geen docentevaluatie bekend Rendement: 17 begonnen, echt meegedaan, geslaagd met 1e kans, geslaagd totaal
Q: