Dependable Distributed Systems
Objectives
The main goal of this course is to deepen students' knowledge of Dependable Distributed Systems. Students gain a better understanding of the foundations and recent research proposals in dependable computing systems, and of the advanced techniques, algorithms and mechanisms involved in the design of large-scale, complex distributed systems with fault-tolerance, security, privacy and intrusion-tolerance services.
The course addresses the study of foundations and formalisms on algorithms, mechanisms and services used in the design of dependable distributed systems for critical applications, in which the above properties must be combined to meet the identified requirements. This knowledge is strongly supported by mastery of practical implementation tools and techniques, experimental evaluation criteria, and critical analysis based on design foundations and experimental observation of practical dependable distributed systems.
Knowledge goals
- Concepts, principles and paradigms for the analysis and synthesis of dependable distributed systems, namely their mechanisms and services for design goals and operation support.
- Foundations and abstractions for the design and construction of mechanisms and services for dependable distributed systems.
- Principles, design issues, foundations, paradigms, models and dependability properties of blockchain platforms and their various service planes and design options.
- Techniques to combine security, privacy, reliability, fault-tolerance and intrusion tolerance for dependable distributed systems and their software components.
- Foundations of methods, algorithms, tools, and cryptographic constructions for privacy-preserving engineering for distributed and multiparty data processing and computations.
- Mastery of techniques and solutions for trusted execution environments and related support for isolation and containment with hardware-level trust-enabled solutions.
Practical objectives
- Designing mechanisms and services, including their components and algorithms to build critical distributed systems.
- Analysis and experimental assessment of dependability properties in a dependable distributed system.
- Practical knowledge of implementation principles and programming support for blockchain platforms.
- Practical use of cryptographic constructions, protocols and solutions for privacy-preserving distributed data management and computations.
- Programming and development of dependable distributed systems to support critical applications and services, involving blockchain platforms, dependable services for cloud computing and cloud-storage platforms, trusted mobile computing and trustworthy solutions for IoT platforms and applications, or IoT-Edge-Cloud Continuum Systems and their computation and storage pipelines.
General characterization
Code
11555
Credits
6.0
Responsible teacher
Henrique João Lopes Domingos
Hours
Weekly - 4
Total - 52
Teaching language
English
Prerequisites
The MIEI programme (Mestrado em Engenharia Informática) does not have a formal precedence regime with mandatory requirements, beyond the normal sequence and adequacy of knowledge and practical skills addressed by previous related courses in the MIEI study plan. However, students interested in the course should consider the following as relevant background for achieving the proposed objectives.
- Completion of the Distributed Systems course (as a consolidation course). Skills in Operating Systems Foundations and Computer and Networks System Security are recommended. Background in Distributed Systems Algorithms and Distributed Systems Programming can be very useful for the CSD course.
- Strong knowledge of Computer Networks and TCP/IP stack protocols (including HTTP, DNS, TCP, UDP, IP, IEEE 802.1/802.11), as well as programming skills for applications using the TCP/IP stack (Sockets and REST/HTTP in Java, C# or C++).
- A solid knowledge of the principles and practice of distributed systems programming tools and paradigms (e.g., Sockets, Web services, REST). Some practice in web-programming environments or programming with cloud platforms can also be useful, as well as initial practice in the design, implementation and debugging of distributed systems algorithms.
- A background in applied cryptography and programming with cryptographic methods and algorithms is very important (e.g., Java/JCE and CryptoProviders, programming with TLS channels using Java JSSE and REST/HTTPS).
- Strong skills in programming with the Java language, as well as practice with programming environments and tools (e.g., Eclipse IDE) and related tools for project management with repositories (e.g., GitHub, Git plugins in the IDE or the git command line).
- Previous knowledge and practical experience of Operating Systems Foundations is strongly recommended, together with practical skills in UNIX (e.g., Linux distributions or Mac OS X), practice in using shell environments and command-line consoles, and in using virtualized OS or application-support environments (e.g., VMware, VirtualBox, or initial practice with Docker-based containerization: Docker and Docker deployment with Docker Compose).
Bibliography
- W. Zhao, Building Dependable Distributed Systems, Wiley, 2014
- C. Cachin, R. Guerraoui, L. Rodrigues, Introduction to Reliable and Secure Distributed Programming, 2nd Ed - Springer, 2011
- Michel Raynal. Fault-Tolerant Message-Passing Distributed Systems: An Algorithmic Approach. Springer. 2018.
Other references
- W. Stallings, Information Privacy Engineering and Privacy by Design, Pearson, 2020
- W. Stallings, Cryptography and Network Security 8th Ed. Pearson, 2020
- M. Correia, P. Sousa, Segurança no Software, FCA Ed. 2017
- R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, 3rd Ed., Wiley, 2020
Note: Suggested readings from selected research papers will be proposed in lectures for case studies. Additional materials and guidelines for practical/lab activities and work assignments will be available as lab materials.
Teaching method
Depending on the number of students and groups, classes (lectures and laboratories) can be taught in English or Portuguese. The materials and bibliographic references are available in English.
The course is organized in lectures for presenting and discussing foundations, concepts, principles, paradigms, techniques and algorithms covering the course program topics, as well as for conducting specific discussions, analysis and clarifications on suggested readings.
Labs are organized around programming exercises on mechanisms, techniques and algorithms involving software and hardware components and cloud-based resources. Some labs also involve the demonstration of techniques or related components, including demonstrations supporting tutorial explanations on the use of tools.
Some labs are planned to support students in the development of practical work assignments and mini-projects, discussion and clarification of requirements and design criteria, and orientation on implementation options.
Evaluation of Students:
- 2 midterm frequency tests, with individual evaluation (T1, T2)
- Tests cover the program topics from lectures and reference readings for study.
- Tests are conducted with mandatory presence in the announced FCT physical site/room.
- Tests are performed individually and can contain questions in two parts: a closed-book part and an open-book part with possible use of individual materials. Only printed individual materials can be used, and the use of any electronic device is not allowed.
- Practical evaluation is composed of the following components:
- Individual optional evaluation (of participation and progress developments in classes)
- Evaluation of two projects developed as group-work (groups of 2 students), with part of evaluation individualized.
Grade and Evaluation Criteria
Frequency assessment and evaluation rules, as well as grade conditions, are defined in the section "Evaluation method".
Evaluation method
Components
- T1, T2: 2 midterm frequency tests with two parts: closed book part (PSC) and open book part (PCC)
- Only printed and individual materials can be used for the open book part
- The PCC part can address practical questions covering the project implementations
- P1, P2: 2 projects with mandatory submission in defined deadline dates
- Projects developed as group work (ref. 2 students/group)
- E: Exam (Appeal Exam)
- All evaluation elements will be classified on a scale of 0-20 points, rounded to one decimal place
Evaluation rules for tests (T1 and T2):
- Evaluation of tests (NT)
- NT = max (45% PSC + 55% PCC; 55% PSC + 45% PCC)
Evaluation rules for projects (P1 and P2):
- Evaluation of each project (P):
- P = 75% G + 25% I
- G: evaluation of project, as group work
- I: individualized evaluation component from demonstration and discussion and demonstrated individual knowledge of the developed work (10%), individual contribution for the group work (10%) and internal group evaluation (5%)
Frequency evaluation (F):
- Frequency evaluation (F) is obtained in the following way:
- F = 50% P1 + 50% P2
- To obtain frequency the following two conditions are necessary:
- F must be greater than or equal to 7.5/20
- The evaluation of each project (P1 and P2) must be greater than or equal to 7.5/20
Grade with frequency evaluation (AF)
- Evaluation rule with frequency elements (AF):
- AF = 45% F + 25% T1 + 30% T2
- To obtain grade with frequency elements the following conditions must be satisfied:
- AF greater than or equal to 9.5/20
- T1 greater than or equal to 7.5/20
- T2 greater than or equal to 7.5/20
Grade with final Exam (EF)
- Evaluation with the final exam (AEF):
- AEF = 45% AF + 55% EF
- To obtain grade all the following conditions must be satisfied:
- EF greater than or equal to 7.5/20
- F greater than or equal to 7.5/20
- AEF greater than or equal to 9.5/20
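As an illustration only, the grading rules above can be worked through in a short Python sketch. The function names are hypothetical and not part of the official rules. The formulas and thresholds are taken as written above, all inputs are assumed to be on the 0-20 scale, and rounding to one decimal place is not applied.

```python
# Sketch of the grading rules as stated in this section.
# All function names are illustrative assumptions; grades are on a 0-20 scale.

def midterm_grade(psc, pcc):
    """NT = max(45% PSC + 55% PCC, 55% PSC + 45% PCC)."""
    return max(0.45 * psc + 0.55 * pcc, 0.55 * psc + 0.45 * pcc)

def project_grade(g, i):
    """P = 75% G (group work) + 25% I (individualized component)."""
    return 0.75 * g + 0.25 * i

def frequency(p1, p2):
    """F = 50% P1 + 50% P2; requires F, P1 and P2 all >= 7.5."""
    f = 0.5 * p1 + 0.5 * p2
    obtained = f >= 7.5 and p1 >= 7.5 and p2 >= 7.5
    return f, obtained

def grade_with_frequency(f, t1, t2):
    """AF = 45% F + 25% T1 + 30% T2; requires AF >= 9.5, T1 and T2 >= 7.5."""
    af = 0.45 * f + 0.25 * t1 + 0.30 * t2
    passed = af >= 9.5 and t1 >= 7.5 and t2 >= 7.5
    return af, passed

def grade_with_exam(af, f, ef):
    """AEF = 45% AF + 55% EF; requires EF >= 7.5, F >= 7.5 and AEF >= 9.5."""
    aef = 0.45 * af + 0.55 * ef
    passed = ef >= 7.5 and f >= 7.5 and aef >= 9.5
    return aef, passed

if __name__ == "__main__":
    # Hypothetical student: projects 15 and 13, tests 12 and 10.
    f, has_frequency = frequency(15.0, 13.0)
    af, passed = grade_with_frequency(f, 12.0, 10.0)
    print(f"F={f:.1f} frequency={has_frequency} AF={af:.1f} passed={passed}")
```

The maximum in the test rule means the stronger of the two parts always receives the 55% weight.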
Subject matter
Program topics:
- Introduction
- Reliable and secure communication channels
- Techniques, mechanisms and services for dependable distributed systems
- Byzantine Fault-Tolerance (BFT) and Intrusion Tolerance
- Intrusion prevention, detection and recovery
- Blockchain Platforms
- Privacy-preservation
- Trusted execution environments (TEE) and confidential computing
Program (topics in detail):
1. Introduction
- Concepts, properties, attributes and metrics for dependable systems
- Failure models and adversary model definition
- Modeling and representation of dependable distributed systems
- Trust decentralization
- Software stacks, mechanisms and services for dependable systems
- Decentralized systems, Web3 and decentralization with blockchains
2. Reliable and secure communication channels
- Unicast (PtP), multicast and broadcast communication channels
- Cryptographic techniques and tools
- Relevant security standards and leveraging protocols
- Tunneled end-to-end secure communication
- Reliable and secure broadcast channels
- Primitives and abstractions for reliable communication channels
- Case-study: protocol stacks and algorithms for reliable and secure communication services
3. Techniques and mechanisms for dependable distributed systems
- Logging and checkpointing
- State recovery using rollback and rollforward techniques
- Read/Write Registers
- Quorums
- State-machine replication
- Consistency and Durability
- Solutions with diversity
- Randomization and trust decentralization
- Isolation or confinement and trusted computing environments
4. Byzantine fault tolerance and solutions for intrusion tolerance
- Quorums and Byzantine Quorums
- Consensus, FLP impossibility and FLP circumvention techniques
- Synchronous consensus
- Asynchronous consensus
- Consensus with byzantine fault tolerance
- Sybil attacks and countermeasures
- Design and study of practical implementations
- RAFT and BFT-RAFT; MiniSec, BFT-SMaRt
- Probabilistic and randomized consensus
- Diversity-enhanced consensus solutions
- Case studies
5. Intrusion prevention and intrusion detection
- Perimeter defenses
- Intrusion prevention systems
- Intrusion detection systems
- HIDS, NIDS, Honeypots and Honeynets
- SIEM Platforms
- Intrusion recovery: reactive recovery and pro-active recovery
- Diversity and examples of diversity solutions
- Runtime stacks with diversity
- N-Version Programming
- Cloud-of-Clouds
- Denial of service attacks (DoS, DDoS, Botnets) and countermeasures
6. Blockchain platforms and technology
- Origins, Blockchains typology and applications
- Case studies: Bitcoin, Ethereum, HLF and CORDA
- Bitcoin scripts and programming with smart contracts
- Service planes in Blockchain platforms and their architectures
- Blockchain programming and programming with smart contracts
- From byzantine consensus to Blockchain-enabled consensus solutions
- Performance and operation metrics
- Finality and finality latency
- Consensus models and mechanisms: PoW, PoS, PBFT, PoET, PoH, PoB and other solutions
- Limitations, drawbacks and alternative solutions to address scale, security, sustainability and performance
- Case studies
7. Privacy Preservation
- Advanced techniques for privacy-enhanced data management and computation
- Operations with encrypted data: security-at-rest techniques and homomorphic encryption
- Searchable Encryption
- Functional encryption
- Property-preserving encryption and attribute-based encryption
- Other techniques:
- Data anonymization and obfuscation
- Differential privacy
- Secret sharing
- Threshold cryptography
- Privacy-preserving multiparty computations
- ORAM, Oblivious transfers and oblivious storage
- Erasure coding
- Techniques for privacy-enhanced and anonymized communication:
- Case studies:
- Tor network and Mixnets
- Database privacy models
- Privacy-preserving blockchains
8. Trusted Platforms and Confidential Computing
- Cryptographic HW, HSMs and TPMs
- Trusted computing with software attestation
- Trusted execution environments (TEEs)
- Reference technologies: Intel SGX, TrustZone and AMD SEV
- Virtualization with Trusted Computing Platforms
- Programming environments for TEE platforms
- Confidential computing