Dependable Distributed Systems
Objectives
The main goal of this course is to deepen students' knowledge in the area of Dependable Distributed Systems. Students gain a better understanding of the foundations and recent research proposals in dependable computing systems, and of the advanced techniques, algorithms and mechanisms involved in the design of large-scale, complex distributed systems with fault-tolerance, security, privacy and intrusion-tolerance services.
The course covers the foundations and formalisms of the algorithms, mechanisms and services used in the design of dependable distributed systems for critical applications, in which the above properties must be combined to meet the identified requirements. This knowledge is strongly supported by mastery of practical implementation tools and techniques, experimental evaluation criteria, and critical analysis based on design foundations and experimental observation of practical dependable distributed systems.
Skills as objectives:
Knowledge
- Concepts, principles and paradigms for the analysis and synthesis of dependable distributed systems, namely their mechanisms and services for design goals and operation support;
- Foundations and abstractions for the design and construction of mechanisms and services for dependable distributed systems;
- Techniques to combine security, privacy, reliability, fault-tolerance and intrusion tolerance for dependable distributed systems and their software components;
- Mastery of techniques and solutions for trusted execution environments and related hardware-level support.
Application
- Designing mechanisms and services, including their components and algorithms to build critical distributed systems;
- Analysis and experimental assessment of dependability properties in a dependable distributed system;
- Programming and development of dependable distributed systems to support critical applications and services, involving blockchain platforms, dependable services for cloud computing and cloud-storage platforms, trusted mobile computing and trustworthy solutions for IoT platforms and applications.
General characterization
Code
11555
Credits
6.0
Responsible teacher
João Carlos Antunes Leitão, Nuno Manuel Ribeiro Preguiça
Hours
Weekly - 4
Total - 52
Teaching language
Portuguese
Prerequisites
The MIEI programme (Mestrado em Engenharia Informática) does not have a formal precedence regime with mandatory requirements, beyond the normal sequence and adequacy of the knowledge and practical skills addressed by previous related courses in the study plan. However, students interested in taking the course should consider the following as relevant background knowledge for achieving the proposed objectives.
- Completion of the Distributed Systems course (as a consolidation course). Recommended skills in Operating Systems Foundations and Computer and Network Systems Security. Background in Distributed Systems Algorithms and Distributed Systems Programming can be very useful for the CSD course.
- Strong knowledge of Computer Networks and the TCP/IP protocol stack (including HTTP, DNS, TCP, UDP, IP, IEEE 802.1/802.11), as well as programming skills for applications using the TCP/IP stack (Sockets and REST/HTTP in Java, C# or C++).
- Solid knowledge of the principles and practice of distributed systems programming tools and paradigms (e.g., Sockets, Web services, REST). Some practice in web-programming environments or programming with cloud platforms is also useful, as is initial practice in the design, implementation and debugging of distributed systems algorithms.
- A background in applied cryptography and in programming with cryptographic methods and algorithms is very important (e.g., Java/JCE and CryptoProviders, programming with TLS channels using Java JSSE and REST/HTTPS).
- Strong programming skills in the Java language, as well as practice with programming environments and tools (e.g., Eclipse IDE) and related tools for project management with source repositories (e.g., GitHub, Git plugins in an IDE or the git command line).
- Previous knowledge of and practical experience with Operating Systems Foundations is strongly recommended, together with practical UNIX skills (e.g., Linux distributions or Mac OS X), practice in using shell environments and command-line consoles, and in using virtualized OS or application-support environments (e.g., VMware, VirtualBox, or initial practice with Docker-based containerization: Docker and Docker deployment with Docker Compose).
Bibliography
- R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, Wiley, 2020
- W. Zhao, Building Dependable Distributed Systems, Wiley, 2014
- C. Cachin, R. Guerraoui, L. Rodrigues, Introduction to Reliable and Secure Distributed Programming (2nd Ed.), Springer, 2011
- W. Stallings, L. Brown, Computer Security - Principles and Practice, Prentice Hall, 2014
Additional References
- W. Stallings, Information Privacy Engineering and Privacy by Design, Pearson, 2020
- W. Stallings, Cryptography and Network Security (8th Ed.), Pearson, 2020
- M. Correia, P. Sousa, Segurança no Software, FCA Ed., 2017
Note: Suggested readings from the bibliography and selected research papers will be indicated in the lectures. Additional materials and guidelines for practical/lab activities and work assignments will be available as lab materials.
Teaching method
The classes (lectures and laboratories) and the provided materials are prepared for the course to be taught in English.
The course is organized in lectures for presenting and discussing the foundations, concepts, principles, paradigms, techniques and algorithms covered by the course program, as well as for specific discussions, analysis and clarifications of the suggested readings.
Labs are organized around programming exercises on mechanisms, techniques and algorithms that use software and hardware components and cloud-based resources. Some labs also include demonstrations of techniques or related components, including demonstrations supporting tutorial explanations on the use of tools. Some labs are planned to support students in the development of practical work assignments and mini-projects, with discussion and clarification of requirements and design criteria, and guidance on implementation options.
Evaluation of Students:
Frequency tests:
- 2 midterm frequency tests (T1, T2)
- Frequency tests cover the program topics presented and initially discussed in lectures.
- Tests will be conducted without a presence requirement (due to Covid-19 constraints), using an evaluation platform for remotely assigned tests. Each test is divided into two parts: part I is a quiz with time-limited answers; part II is a set of open questions drawn from a test bank as variations on the tested topics, so that the same questions do not appear in different personalized tests.
Practical evaluation:
- Practical evaluation is composed of five work assignments and one final project. A set of different development tools is used: IDEs, GitHub shared repositories and the Jupyter/Notes platform, provided runtime environments for experimental testbench platforms, and benchmarking tools. The course provides access to dedicated cloud-based instances and tools, used by the students for the development of the final project.
- 5 lab work group assignments, which are practical/programming exercises (E) developed in groups of at most 3 students. Development, demonstration, correctness proof and evaluation are coordinated and take place in videoconference sessions. These work assignments will be conducted and evaluated during the activities in six labs.
- 1 Final project (P)
Grade and Evaluation Criteria
Frequency assessment and grade conditions, as well as all the related evaluation criteria, are presented in detail in the section "Métodos de Avaliação" (Evaluation Methods).
Evaluation method
Assessment components
- 2 midterm frequency tests (T1, T2)
- Individual evaluation and tests "on-site" (face-to-face)
- Each test covers part of the program topics from defined lectures, materials from lectures and related bibliographic references
- Tests with two parts: closed-book part and open-book part
- Closed-book part: no materials can be consulted
- Open-book part: students may use only printed, personal materials (no shared documentation, no computers or electronic and communication devices)
- Practical evaluation: (TP1, TP2) developed as group mini-projects (2 students per group as reference)
- Assignments TP1 and TP2 are mini-projects developed in groups of 2 students
- Evaluation criteria cover the development process and the defined delivery dates (with possible penalties in case of non-compliance)
- There is an individual evaluation part (25% of each work) requiring an individual practical test, with a related quiz taken individually as proof of mastery of the implementation and experimental evaluation
Frequency conditions (F)
- F = 40% TP1 + 60% TP2
- Individual frequency condition: F greater than or equal to 9.5/20
Grade with Frequency Evaluation (AF)
- Frequency condition verified (F)
- Rule for passing the course with frequency evaluation (AF):
- AF = 20% T1 + 30% T2 + 50% F
- Rule for passing:
- AF greater than or equal to 9.5/20, with both T1 and T2 greater than or equal to 7.5/20
Grade with Appeal Exam (AER)
- Frequency condition rule (F) verified
- ER: Appeal exam (on site, face-to-face)
- Rule for passing the course
- AER = 50% F + 50% ER
- ER greater than or equal to 7.5/20 and AER greater than or equal to 9.5/20
- Frequency already obtained in 2019/2020 is valid with the respective evaluation rules applied
- All evaluations will be graded and published on the 0-20 point scale (one decimal)
- All tests (T1, T2) and exams (ER) are face-to-face evaluations (on physical site)
- Scheduled discussions and demonstrations will be done on site (as face-to-face activities)
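The grade rules above can be checked with a short calculation. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes that the thresholds are compared against unrounded values and that rounding to one decimal is applied only when a grade is reported. Neither detail is stated in the rules.

```python
def frequency(tp1: float, tp2: float) -> float:
    """F = 40% TP1 + 60% TP2, on the 0-20 scale."""
    return 0.4 * tp1 + 0.6 * tp2


def grade_with_tests(t1: float, t2: float, f: float) -> tuple[float, bool]:
    """AF = 20% T1 + 30% T2 + 50% F, plus the pass/fail decision."""
    af = 0.2 * t1 + 0.3 * t2 + 0.5 * f
    # Assumption: all thresholds use the unrounded values.
    passed = f >= 9.5 and af >= 9.5 and t1 >= 7.5 and t2 >= 7.5
    return round(af, 1), passed


def grade_with_appeal(er: float, f: float) -> tuple[float, bool]:
    """AER = 50% F + 50% ER, plus the pass/fail decision."""
    aer = 0.5 * f + 0.5 * er
    passed = f >= 9.5 and er >= 7.5 and aer >= 9.5
    return round(aer, 1), passed


if __name__ == "__main__":
    f = frequency(tp1=14.0, tp2=12.0)
    af, ok = grade_with_tests(t1=10.0, t2=11.0, f=f)
    print(f"F = {f:.1f}, AF = {af:.1f}, passed = {ok}")
```

For example, with TP1 = 14 and TP2 = 12 the frequency grade is F = 12.8; with T1 = 10 and T2 = 11 this gives AF = 11.7, which satisfies every passing condition. A student with T1 below 7.5 would fail the frequency evaluation regardless of AF and would need the appeal exam.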
Subject matter
- Introduction
- Models, Techniques, Mechanisms and Services for Dependable Distributed Systems
- Reliable and secure communication channels
- Intrusion prevention, detection, recovery and tolerance
- Byzantine Fault-Tolerance (BFT) and Intrusion Tolerance
- Blockchain Platforms: Service planes, mechanisms and case-studies
- Privacy-preservation
- Trusted and confidential computing
- Introduction
- Principles and concepts for dependable distributed systems: models, properties, design and implementation techniques
- Dependability properties, attributes and metrics
- Failure models and adversary model definition
- Mechanisms for dependable distributed systems
- Reliable and secure communication channels
- Point-to-Point vs. End-to-End channels
- Reliable group-oriented communication and secure group-oriented communication
- Group communication and message ordering guarantees
- Communication channels and modeling for Dependable Distributed Systems
- Techniques, Mechanisms and Tools for Dependable Distributed Systems
- Logging and Checkpointing
- State recovery with rollback and rollforward techniques
- Replication models
- Quorums
- State-Machine Replication (SMR)
- Consensus with byzantine fault tolerance and intrusion tolerance guarantees
- Byzantine quorum systems
- Consensus, FLP impossibility and FLP circumvention techniques
- Consensus with Byzantine Fault Tolerance
- Crash-tolerant and Byzantine SMR Protocols: Paxos, Multi-Paxos, other Paxos Variants, and PBFT
- Probabilistic consensus solutions
- Enhanced solutions: Randomization and diversity
- Intrusion prevention, detection, auditing, and recovery
- Perimeter defenses
- Intrusion prevention systems
- Intrusion detection systems (HIDS, NIDS, Honeypots and Honeynets)
- Intrusion recovery: reactive recovery and pro-active recovery
- Blockchain platforms
- Origins, Blockchains typology and applications
- Service planes in Blockchain platforms and their architectures
- Blockchain programming and programming with smart contracts
- From byzantine consensus to Blockchain-enabled consensus solutions
- Consensus plane solutions and models: PoW, PoS, PoET and other solutions
- Challenges, solutions and issues in Large-Scale Permissionless Blockchains
- Scale
- Performance and consistency
- Security
- Fairness and sustainability
- Privacy and anonymity preservation
- Full trust-decentralization
- Composition of blockchains: Multichains
- Independence between applications and blockchain service planes
- Case studies
- Privacy-preservation
- Advanced techniques for privacy-enhanced data management and computation
- Operations with encrypted data: security-at-rest techniques and homomorphic encryption
- Searchable Encryption
- Other techniques: data anonymization, data obfuscation and differential privacy
- Case studies
- Trusted Computing and Confidential Computing
- Techniques and mechanisms
- HSMs and TPMs
- Trusted computing with software attestation
- Trusted execution environments (TEE) and Hardware-based platforms
- Case studies: Intel SGX, TrustZone and Virtualized Trusted Computing
- Programming with TEE platforms
- Virtualization with Hardware-Backed Isolation
- Confidential computing and Privacy-Preserved Computations