Algorithms and Distributed Systems

Objectives

The core goal of this course is to consolidate the knowledge of the students in the area of distributed systems, by gaining a better understanding of the problems and associated algorithms that are used in the design of systems widely in use now-a-days. This tackled by studying the fundamental ideas associated with key algorithms that are underlying the conception of current systems but also that are expected to play a key role in the conception of future ones. 

The course has a strong algorithmic component, which is entwined with a set of programming projects, which put in practice the fundamental concepts learned in lectures, exposing student to key subtleties associated with the implementation of many of these algorithms.

Knowledge:

• Basic concepts to the analysis and synthesis of distributed algorithms.

• Fundamental abstractions for building distributed systems and the algorithms that are used to realize them.

• Techniques for improving the reliability and scalability of distributed systems.

Application:

• Designing distributed algorithms and applying them to build distributed systems.

• Analysing distributed algorithms.

• Programming distributed systems.

General characterization

Code

10644

Credits

6.0

Responsible teacher

João Carlos Antunes Leitão, Nuno Manuel Ribeiro Preguiça

Hours

Weekly - 4

Total - 56

Teaching language

Inglês

Prerequisites

Fundamental knowledge of distributed systems and Java programming.

Bibliography

Main course reading:

• N. Lynch. Distributed Algorithms Morgan Kauffman, 1996.

• C. Cachin, R. Guerraoui, L. Rodrigues "Introduction to Reliable and Secure Distributed Programming", 2nd ed, Springer, 2011.

Selected research papers.

 

Complementary reading:

• H. Attiya and J. Welch. Distributed Computing: Fundamentals, Simulations, and Advanced Topics (2nd Ed.), Wiley 2004.

• S. Mullender (editor) Distributed Systems, Second Edition, ACM Press, Addison-­Wesley, MA, 1994.

• A.S. Tanenbaum and M. van Steen. Distributed Systems. Principles and Paradigms. (2nd Ed.) Prentice Hall, 2007.

• Rodrigo Rodrigues, Peter Druschel. Peer-­to-­Peer Systems. Communications of the ACM. Vol. 53 N10, pp 72-­82.

Teaching method

Lectures presenting fundamental concepts and collaborative design of solutions for problem/algoirhtms.

Laboratory classes focused on case studies and mecanisms/techiniques to implement and develop distributed algorithms and systems, based on a group project.

Individual homeworks.

Evaluation method

2 midterm tests (or allternatively an exame that replaces the tests): 45%

  - each midterm has the same weight for this component.

  - midterms will be made in person if conditions allow for it.

  - if midtems can be in person, each student can take up to two A4 pages, manuscripted with whatever information they want. These are delivered at the end of the midterm and must be identified with name and number.

project (in two phases, and conducted in groups of 3 students): 45%

  - first phase has a weight of 40% in this component, and the second phase a weight of 60%.

Individual homeworks (provided in lectures, and delivered before the start of the following lecture): 10%

  - each homework has the same weight for this component.

Components resuts are rounded up to the 2 decimal place, and final grade rounded up to units.

The course has mandatory frequency. To get approval at frequency the student must have a final grade >= 8.5 at the project.

Subject matter

1. Distributed computation models:

    1.1. Modelling processes, faults, crypto primitives and communication.

    1.2. Synchronous, asynchronous and partially synchronous models.

    1.3. Communication Primitives (Best-effort, exactly once, Broadcast).

2. Peer-to-Peer Systems:

    2.1. Unstructured Overlay networks.

    2.2. Gossip Protocols.

    2.3. Structured Overlay networks.

    2.4. Consistent Hashing and Routing.

    2.5. Case studies.

3. Consensus:

    3.1. Consensus in Synchronous Systems.

    3.2. Consensus in Asynchronous Systems & FLP.

    3.3. Paxos and some variants.

4. Replication and Fault Tolerance:

    4.1. Specifying replicated systems.

    4.2. Active Replication and Passive Replication.

    4.3. Strong Consistency: State Machine Replication.

    4.4. Weak Consistency: CAP theorem, eventual consistency, causal consistency.

    4.5. Case studies.

5. Distributed transactions:

    5.1. Two-phase commit.

    5.2. Three-phase commit.

Programs

Programs where the course is taught: