Dependable Distributed Systems
Objectives
The primary objective of this course is to deepen students' expertise in Dependable Distributed Systems. This will be achieved by enhancing their understanding of the foundational concepts and the latest research developments in dependable computing systems. The course covers advanced techniques, algorithms, and mechanisms essential for designing large-scale and complex distributed systems. Special emphasis will be placed on ensuring these systems are resilient, secure, and capable of tolerating faults, maintaining privacy, and withstanding intrusions.
The course focuses on the study of foundational theories and formalisms related to algorithms, mechanisms, and services essential for designing distributed dependable systems, particularly for critical applications where reliability, security, privacy, and fault tolerance are paramount. These properties are integrated into the system's identified requirements. The course places strong emphasis on practical implementation tools and techniques, experimental evaluation criteria, and the critical analysis of design principles. Students will also engage in experimental observation and assessment of practical dependable distributed systems to reinforce their theoretical knowledge.
Knowledge goals
- Concepts, principles, and paradigms essential for the analysis and synthesis of dependable distributed systems. This includes an in-depth examination of the mechanisms and services that support the design goals and operational reliability of these systems. Through this exploration, students will learn how to effectively design, implement, and manage distributed systems that meet stringent dependability requirements.
- Foundations and abstractions necessary for designing and constructing mechanisms and services that ensure the dependability of distributed systems. These foundational concepts provide the theoretical and practical basis for developing robust systems capable of maintaining reliability, security, and fault tolerance in distributed environments.
- Comprehensive knowledge of the principles, design issues, foundations, paradigms, models, and dependability properties of Blockchain platforms. This covers the various service layers and design options available within these platforms, focusing on how they ensure security, reliability, and fault tolerance in decentralized environments. It includes an analysis of the unique challenges and considerations involved in the design and development of Blockchain-based systems and applications.
- Foundations, principles, techniques, and paradigms for developing solutions and protocols that support data storage, computation services, and information dissemination in dependable distributed systems. Particular emphasis is placed on approaches inspired by emerging Web3 paradigms, exploring how these new models can enhance the reliability, security, and efficiency of distributed systems in decentralized environments.
- Deep understanding of the foundations of methods, algorithms, tools, and cryptographic constructions essential for privacy-preserving engineering in distributed and multiparty data processing and computations. It focuses on how these techniques can be used to protect sensitive data while enabling secure and efficient collaboration across distributed systems. Students will explore the design and implementation of privacy-preserving protocols that ensure confidentiality and integrity in multiparty environments.
- Knowledge of solutions for trusted computing platforms, tamper-proof devices (e.g., TPMs) and trusted execution environments (TEEs).
- Study of related mechanisms for isolation and containment, leveraging hardware-level trust-enabled solutions. Students will gain expertise in the design and deployment of applications with hardware-backed TEE isolation, focusing on how these environments can ensure secure and trusted execution of code and protection of sensitive data by isolating it from potentially untrusted parts of the system.
Practical objectives
- To design and implement mechanisms and services, including their components and algorithms, essential for building critical distributed systems. Emphasis is placed on creating systems that are reliable, secure, and capable of meeting stringent operational requirements. Students will learn how to architect and develop these components to ensure the overall dependability of distributed systems in real-world, high-stakes environments.
- Analysis and experimental assessment of dependability properties in a dependable distributed system.
- Understanding how dependability properties are implemented in dependable distributed systems, including reliability, availability, security, fault tolerance and privacy criteria. Additionally, the course offers practical knowledge of implementation principles and programming supported by Blockchain platforms. This includes hands-on experience in developing and deploying Blockchain-based solutions, ensuring that students can apply these technologies to create robust and secure distributed applications.
- Practical application of novel and emerging cryptographic constructions, protocols, and solutions designed for privacy-preserving distributed data management and computations. Students will gain hands-on experience with implementing cryptographic techniques that protect data privacy while enabling secure, efficient processing in distributed environments. This includes working with encryption methods, secure multi-party computation, and other privacy-preserving technologies.
- Knowing how to address the development of dependable distributed systems tailored for critical applications and services in IoT-Edge-Cloud continuum environments.
General characterization
Code
11555
Credits
6.0
Responsible teacher
Henrique João Lopes Domingos
Hours
Weekly - 4
Total - 52
Teaching language
Portuguese
Prerequisites
Students interested in taking the course should have the following base knowledge in order to achieve the proposed objectives.
- Completion of the Distributed Systems course (as a consolidation course). Skills in Operating Systems Foundations and Computer and Network Systems Security are recommended. Background in Distributed Systems Algorithms and Distributed Systems Programming can be very useful for the CSD course.
- Strong knowledge of computer networks and the TCP/IP protocol stack (including HTTP, DNS, TCP, UDP, IP, IEEE 802.1/802.11), as well as programming skills for applications using the TCP/IP stack (Sockets and REST/HTTP in Java, C# or C++).
- Solid knowledge of the principles and practice of distributed systems programming tools and paradigms (e.g., Sockets, Web services, REST). Some practice in web programming environments or programming with cloud platforms can also be useful, as well as initial practice in the design, implementation and debugging of distributed systems' algorithms.
- It is very important to have a background in applied cryptography and in programming with cryptographic methods and algorithms (e.g., Java/JCE and CryptoProviders, programming with TLS channels using Java JSSE and REST/HTTPS).
- Strong skills in programming with the Java language, as well as practice with programming environments and tools (e.g., Eclipse IDE) and related tools for project management with maintenance repositories (e.g., GitHub, Git plugins in an IDE or the git command line).
- Previous knowledge of and practical experience with Operating Systems Foundations is strongly recommended, along with practical UNIX skills (e.g., Linux distributions or Mac OS X), practice in using shell environments and command-line consoles, and in using virtualized OS or application-support environments (e.g., VMware, VirtualBox, or initial practice with Docker-based containerization: Docker and Docker deployment with Docker Compose).
Bibliography
- W. Zhao, Building Dependable Distributed Systems, Wiley, 2014
- C. Cachin, R. Guerraoui, L. Rodrigues, Introduction to Reliable and Secure Distributed Programming, 2nd Ed., Springer, 2011
- M. Raynal, Fault-Tolerant Message-Passing Distributed Systems: An Algorithmic Approach, Springer, 2018
Other references
- W. Stallings, Information Privacy Engineering and Privacy by Design, Pearson, 2020
- W. Stallings, Cryptography and Network Security, 8th Ed., Pearson, 2020
- M. Correia, P. Sousa, Segurança no Software, FCA Ed., 2017
- R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, 3rd Ed., Wiley, 2020
Note: Suggested readings from selected research papers will be proposed in lectures as case studies. Additional materials and guidelines for practical/lab activities and work assignments will be available as lab materials.
Teaching method
Depending on the number of students and groups, classes (lectures and laboratories) can be taught in English or Portuguese. The materials and bibliographic references are available in English.
The course is organized in lectures that present and discuss the foundations, concepts, principles, paradigms, techniques and algorithms covering the course program topics, and that host specific discussions, analysis and clarifications of the suggested readings.
Labs are organized around programming exercises on mechanisms, techniques and algorithms that involve software and hardware components and cloud-based resources. Some labs also include demonstrations of techniques or related components, including demonstrations that support tutorial explanations on the use of tools.
Some labs are planned to support students in the development of practical work assignments and mini-projects, discussion and clarification of requirements and design criteria, and orientation on implementation options.
Evaluation of Students:
- 2 midterm frequency tests, with individual evaluation (T1, T2)
- Tests cover the program topics from lectures and the reference readings for study.
- Tests will be conducted with mandatory presence in the announced FCT room.
- Tests are performed individually and can contain questions in two parts: a closed-book part and an open-book part with possible use of individual materials. Only printed individual materials can be used, and the use of any electronic device is not allowed.
- Practical evaluation is composed of the following components:
- Optional individual evaluation (of participation and progress in classes)
- Evaluation of two projects developed as group work (groups of 2 students), with part of the evaluation individualized.
Grade and Evaluation Criteria
Frequency assessment and evaluation rules, as well as grade conditions, are defined in the section "Evaluation method" (Métodos de Avaliação).
Evaluation method
Evaluation components:
- T1, T2: 2 midterm frequency tests with two parts:
- PSC: closed book part
- PCC: open book part
- The use of communication or computing devices is not allowed during the tests
- P1, P2: 2 projects with mandatory submission by the defined deadlines
- Projects developed as group work (ref. 2 students/group)
- CPQ: Class participation (class presence and quizzes)
- E: Exam (Appeal Exam)
- All evaluation elements are graded on a scale of 0-20 points, rounded to one decimal place
Evaluation rules for tests (T1 and T2):
- Evaluation of tests (NT)
- NT = max (45% PSC + 55% PCC; 55% PSC + 45% PCC)
Evaluation rules for projects (P1 and P2):
- Evaluation of each project (P):
- P = 75% G + 25% I
- G: evaluation of project, as group work
- I: individual evaluation component, based on the demonstration and discussion of the developed work and the individual knowledge shown of it (10%), the individual contribution to the group work (10%), and the internal group evaluation (5%)
Conditions for obtaining frequency (F):
- Frequency evaluation (F):
- F = 85% (50% P1 + 50% P2) + 15% CPQ
- To obtain frequency the following conditions are necessary:
- F must be greater than or equal to 7.5/20
- P1 must be greater than or equal to 7.5/20
- P2 must be greater than or equal to 7.5/20
Grade rules with frequency evaluation (AF)
- Grade requires the following conditions:
- Compliance with conditions for obtaining frequency
- Evaluation rule with frequency elements (AF):
- AF = 50% F + 25% T1 + 25% T2
- To obtain grade with frequency elements the following conditions must be satisfied:
- AF greater than or equal to 9.5/20
- T1 greater than or equal to 7.5/20
- T2 greater than or equal to 7.5/20
Grade with final Exam (EF)
- Evaluation with the final exam (AEF):
- AEF = 50% AF + 50% EF
- To obtain grade all the following conditions must be satisfied:
- EF greater than or equal to 7.5/20
- F greater than or equal to 7.5/20
- AEF greater than or equal to 9.5/20
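The grading rules above can be checked with a short calculation. The following Python sketch is only an illustration of the stated formulas and thresholds, not an official grading tool. The function names are hypothetical, every component is assumed to be a grade on the 0-20 scale (including the individual component I), and rounding of intermediate or final values is not modeled.

```python
# Illustrative sketch of the grading formulas; all function names are hypothetical.
# Assumption: every input is a grade on the 0-20 scale; no rounding is applied.

def midterm_grade(psc: float, pcc: float) -> float:
    """NT = max(45% PSC + 55% PCC, 55% PSC + 45% PCC)."""
    return max(0.45 * psc + 0.55 * pcc, 0.55 * psc + 0.45 * pcc)

def project_grade(g: float, i: float) -> float:
    """P = 75% G + 25% I."""
    return 0.75 * g + 0.25 * i

def frequency_grade(p1: float, p2: float, cpq: float) -> float:
    """F = 85% (50% P1 + 50% P2) + 15% CPQ."""
    return 0.85 * (0.5 * p1 + 0.5 * p2) + 0.15 * cpq

def has_frequency(p1: float, p2: float, cpq: float) -> bool:
    """Frequency requires F, P1 and P2 to each be at least 7.5."""
    return frequency_grade(p1, p2, cpq) >= 7.5 and p1 >= 7.5 and p2 >= 7.5

def af_grade(f: float, t1: float, t2: float) -> float:
    """AF = 50% F + 25% T1 + 25% T2."""
    return 0.5 * f + 0.25 * t1 + 0.25 * t2

def passes_with_frequency(p1, p2, cpq, t1, t2) -> bool:
    """Requires frequency, AF >= 9.5, and T1, T2 >= 7.5."""
    if not has_frequency(p1, p2, cpq):
        return False
    af = af_grade(frequency_grade(p1, p2, cpq), t1, t2)
    return af >= 9.5 and t1 >= 7.5 and t2 >= 7.5

def aef_grade(af: float, ef: float) -> float:
    """AEF = 50% AF + 50% EF."""
    return 0.5 * af + 0.5 * ef

def passes_with_exam(f: float, af: float, ef: float) -> bool:
    """Requires EF >= 7.5, F >= 7.5 and AEF >= 9.5."""
    return ef >= 7.5 and f >= 7.5 and aef_grade(af, ef) >= 9.5

# Example with made-up grades for a hypothetical student.
p1 = project_grade(g=16, i=14)
p2 = project_grade(g=14, i=12)
t1 = midterm_grade(psc=10, pcc=14)
t2 = midterm_grade(psc=12, pcc=8)
f = frequency_grade(p1, p2, cpq=18)
print(f"F = {f:.2f}, AF = {af_grade(f, t1, t2):.2f}")
print("Passes with frequency:", passes_with_frequency(p1, p2, 18, t1, t2))
```

Note that the max in the test rule always gives the 55% weight to whichever part (closed book or open book) has the higher grade.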
=============================
IMPORTANT NOTICE:
Following a decision by the Department of Informatics (FCT/UNL), from now on, the following rule will apply to all Course Units under the responsibility of the Department of Informatics:
- During an evaluation exam, a student may not have in their possession any computation, communication or electronic devices capable of accessing the internet or connecting via Bluetooth (e.g. smartphones, smartwatches, smartglasses, tablets, laptops, etc.), even if they are turned off.
- Violation of this rule will result in immediate failure in the course due to exclusion and will be reported to the Scientific Committee of the respective Degree Program.
- It is the responsibility of each student to ensure that they do not have any of these devices with them, either by not bringing them into the exam room or by leaving them, turned off, away from their seat in the room where the test or exam is taking place — for example, near the board in the classrooms where these evaluations are held.
=================================
Subject matter
Program topics:
- Introduction
- Reliable and secure communication channels
- Techniques, mechanisms and services for dependable distributed systems
- Byzantine Fault-Tolerance (BFT) and Intrusion Tolerance
- Intrusion prevention, detection and recovery
- Blockchain Platforms
- Privacy-preservation
- Trusted execution environments (TEE) and confidential computing
Program (topics in detail):
1. Introduction
- Concepts, properties, attributes and metrics for dependable systems
- Failure models and adversary model definition
- Modeling and representation of dependable distributed systems
- Trust decentralization
- Software stacks, mechanisms and services for dependable systems
- Decentralized systems, Web3 and decentralization with blockchains
2. Reliable and secure communication channels
- Unicast (PtP), multicast and broadcast communication channels
- Cryptographic techniques and tools
- Relevant security standards and leveraging protocols
- Tunneled end-to-end secure communication
- Reliable and secure broadcast channels
- Primitives and abstractions for reliable communication channels
- Case-study: protocol stacks and algorithms for reliable and secure communication services
3. Techniques and mechanisms for dependable distributed systems
- Logging and checkpointing
- State recovery using rollback and rollforward techniques
- Read/Write Registers
- Quorums
- State-machine replication
- Consistency and Durability
- Solutions with diversity
- Randomization and trust decentralization
- Isolation or confinement and trusted computing environments
4. Byzantine fault tolerance and solutions for intrusion tolerance
- Quorums and Byzantine Quorums
- Consensus, FLP impossibility and FLP circumvention techniques
- Synchronous consensus
- Asynchronous consensus
- Consensus with byzantine fault tolerance
- Sybil attacks and countermeasures
- Design and study of practical implementations
- RAFT and BFT-RAFT; MiniSec, BFT-SMaRt
- Probabilistic and randomized consensus
- Diversity-enhanced consensus solutions
- Case studies
5. Intrusion prevention and intrusion detection
- Perimeter defenses
- Intrusion prevention systems
- Intrusion detection systems
- HIDS, NIDS, Honeypots and Honeynets
- SIEM Platforms
- Intrusion recovery: reactive recovery and pro-active recovery
- Diversity and examples of diversity solutions
- Runtime stacks with diversity
- N-Version Programming
- Cloud-of-Clouds
- Denial of service attacks (DoS, DDoS, Botnets) and countermeasures
6. Blockchain platforms and technology
- Origins, blockchain typology and applications
- Case studies: Bitcoin, Ethereum, HLF and CORDA
- Bitcoin scripts and programming with smart contracts
- Service planes in Blockchain platforms and their architectures
- Blockchain programming and programming with smart contracts
- From byzantine consensus to Blockchain-enabled consensus solutions
- Performance and operation metrics
- Finality and finality latency
- Consensus models and mechanisms: PoW, PoS, PBFT, PoET, PoH, PoB and other solutions
- Limitations, drawbacks and alternative solutions to address scale, security, sustainability and performance
- Case studies
7. Privacy Preservation
- Advanced techniques for privacy-enhanced data management and computation
- Operations with encrypted data: security-at-rest techniques and homomorphic encryption
- Searchable Encryption
- Functional encryption
- Property-preserving encryption and attribute-based encryption
- Other techniques:
- Data anonymization and obfuscation
- Differential privacy
- Secret sharing
- Threshold cryptography
- Privacy-preserving multiparty computations
- ORAM, Oblivious transfers and oblivious storage
- Erasure coding
- Techniques for privacy-enhanced and anonymized communication:
- Case studies:
- Tor network and Mixnets
- Database privacy models
- Privacy-preserving blockchains
8. Trusted Platforms and Confidential Computing
- Cryptographic HW, HSMs and TPMs
- Trusted computing with software attestation
- Trusted execution environments (TEEs)
- Reference technologies: Intel SGX, TrustZone and AMD SEV
- Virtualization with Trusted Computing Platforms
- Programming environments for TEE platforms
- Confidential computing