Stream Processing

Objectives

Learn the fundamentals, languages and systems for building applications that process streams of data, ranging from general-purpose distributed real-time stream processing systems to structured data models for dealing with streams.

General characterization

Code

11562

Credits

6.0

Responsible teacher

Nuno Manuel Ribeiro Preguiça

Hours

Weekly - 4

Total - Available soon

Teaching language

Portuguese

Prerequisites

Available soon

Bibliography

Opher Etzion and Peter Niblett. Event Processing in Action. Manning Publications, 2010.

Lukasz Golab and Tamer Özsu. Data Stream Management. Morgan and Claypool, 2010.

Several papers will be provided for further reading.

Teaching method

Available soon

Evaluation method

General evaluation method

Two components contribute to the evaluation of the course: theoretical/problems and project.

Both components are evaluated on an integer scale from 0 to 20.

The final evaluation is the weighted average of the two components: the project component is worth 1/3 and the theoretical/problems component the remaining 2/3.

To pass, the students must have an evaluation of at least 10 in both the theoretical/problems component and in the final evaluation.

There are no further requirements for approval.

Theoretical/problems component

The evaluation of this component is obtained from 2 written tests during the semester. The first test will take place by the end of the 5th week of classes, and the 2nd close to the end of the lecturing period (dates to be confirmed by the pedagogical committee).

Alternatively, this component can also be evaluated by a written exam.

In the first case, the final evaluation of the component is the arithmetic average of the two tests, rounded to the closest integer. In the second case, the final evaluation of the component is the exam grade.

Project component

This component is evaluated through two projects, together with their public presentation and discussion. The 1st project should be delivered by the end of the 7th week of classes, and the 2nd close to the end of the lecturing period (dates to be confirmed by the pedagogical committee).

Each project is done in groups of exactly 2 students.

Though the projects are done in groups, the evaluation of this component is individual. Besides the project and report prepared by the group, the individual performance in the presentation and discussion of a project from fellow students is also considered. The final grade of this component is the arithmetic average of the individual grades in the two projects.
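As an informal illustration of the grading rules above, the following Python sketch computes the theoretical/problems grade from two tests, the weighted final evaluation, and the pass condition. It is not an official calculator. The function names are hypothetical. Two points are assumptions, since the rules do not state them: averages ending in .5 are rounded up, and the final evaluation is compared against 10 without rounding.

```python
from fractions import Fraction
import math


def round_half_up(value):
    """Round to the closest integer; ties go up (assumption, not stated in the rules)."""
    return math.floor(value + Fraction(1, 2))


def theory_grade(test1, test2):
    """Arithmetic average of the two integer test grades, rounded to the closest integer."""
    return round_half_up(Fraction(test1 + test2, 2))


def final_grade(theory, project):
    """Weighted average: theoretical/problems is worth 2/3, project is worth 1/3."""
    return Fraction(2, 3) * theory + Fraction(1, 3) * project


def passes(theory, project):
    """At least 10 in the theoretical/problems component and in the final evaluation.

    The final evaluation is left unrounded here (assumption).
    """
    return theory >= 10 and final_grade(theory, project) >= 10


# Tests of 9 and 12 average 10.5, which rounds to 11 under the tie rule assumed here.
theory = theory_grade(9, 12)
print(theory, float(final_grade(theory, 8)), passes(theory, 8))
```

Exact fractions are used instead of floating point so that a final evaluation of exactly 10 is never missed because of rounding error in the 1/3 and 2/3 weights.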

Subject matter

Distributed Stream Processing Systems.
System models for stream processing: streams as sequences of mini-batches (e.g. Spark Streaming); continuous processing (e.g. Storm). General-purpose programming models. The problem of cyclic computations.
System aspects: distribution, scalability and fault-tolerance.

Data Stream Management Systems (DSMS).
Structured Data Models for Streams. Algebraic operators on streams and relations (continuous queries, aggregates and blocking, time windows). Continuous query languages. Languages and systems that extend SQL and database management systems to deal with data streams.

Complex Event Processing.
Streams as sequences of events. Production rules, reactive rules, and event-driven computing. Event processing networks, agents and channels. Complex and derived events. Detection of event patterns. Event-processing languages and systems.