Stream Processing

Objectives

Learn the fundamentals, languages and systems for building applications that process streams of data, ranging from general-purpose distributed real-time stream processing systems to structured data models for dealing with streams.

General characterization

Code

11562

Credits

6.0

Responsible teacher

João Carlos Gomes Moura Pires, Nuno Manuel Ribeiro Preguiça

Hours

Weekly - 4

Total - Available soon

Teaching language

Portuguese

Prerequisites

Available soon

Bibliography

Opher Etzion and Peter Niblett. Event Processing in Action. Manning Publications, 2010.

Lukasz Golab and Tamer Özsu. Data Stream Management. Morgan and Claypool, 2010.

Several papers will be provided for further reading.

Teaching method

Available soon

Evaluation method

The evaluation consists of a theoretical-practical component (2 individual tests, T1 and T2, each worth 25% of the classification) and a project component (2 implementation works, P1 and P2, each worth 25% of the classification).

The tests are taken in person whenever possible and are closed-book, except for any material provided by the teachers.

Practical assignments must be carried out in groups of two students, preferably with the same group composition for both assignments. An oral presentation may be requested.

The final classification is 0.25 * (T1 + T2 + P1 + P2), where T1 and T2 are each rounded to one decimal place and P1 and P2 are each rounded to the nearest integer.

To pass, the average of T1 and T2 must be greater than 8.0 and the average of P1 and P2 must be greater than 10.0.

The mean of T1 and T2 can be replaced by an individual exam.
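
The classification rules above can be sketched in a few lines of Python. This is only an illustration, not an official calculator: the function names are hypothetical, the rounding mode (half up) is an assumption because the rules do not specify one, "greater than" is read literally as a strict comparison, and the replacement of the test average by an individual exam is not modelled.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, quantum):
    """Round value to the given quantum ("0.1" or "1"); half-up is an assumption."""
    return Decimal(str(value)).quantize(Decimal(quantum), rounding=ROUND_HALF_UP)


def final_classification(t1, t2, p1, p2):
    """Return (final classification, passed) following the stated rules."""
    # Tests are rounded to one decimal place, projects to the nearest integer.
    t1, t2 = round_half_up(t1, "0.1"), round_half_up(t2, "0.1")
    p1, p2 = round_half_up(p1, "1"), round_half_up(p2, "1")

    # Minimum requirements, read as strict inequalities (assumption).
    tests_average = (t1 + t2) / 2
    projects_average = (p1 + p2) / 2
    passed = tests_average > Decimal("8.0") and projects_average > Decimal("10.0")

    # Each of the four components is worth 25%.
    final = Decimal("0.25") * (t1 + t2 + p1 + p2)
    return final, passed


final, passed = final_classification(9.24, 8.96, 12.5, 14.4)
print(final, passed)
```

For the sample values, T1 and T2 round to 9.2 and 9.0, and P1 and P2 round to 13 and 14, so the final classification is 0.25 * 45.2 = 11.3 and both minimum requirements are met.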

Subject matter

Distributed Stream Processing Systems.
System models for stream processing: streams as sequences of mini-batches (e.g., Spark Streaming); continuous processing (e.g., Storm). General-purpose programming models. The problem of cyclic computations.
System aspects: distribution, scalability and fault-tolerance.

Data Stream Management Systems (DSMS).
Structured Data Models for Streams. Algebraic operators on streams and relations (continuous queries, aggregates and blocking, time windows). Continuous query languages. Languages and systems that extend SQL and database management systems to deal with data streams.

Complex Event Processing.
Streams as sequences of events. Production rules, reactive rules, and event-driven computing. Event processing networks, agents and channels. Complex and derived events. Detection of event patterns. Event-processing languages and systems.