Algorithms and Data Structures for Bioinformatics
Objectives
Understand:
- Bioinformatics as the study of information flow in biological systems.
- The most used algorithms, their use and limitations, for the analysis of DNA and protein sequences.
- Modern techniques for pattern recognition and their application to the analysis of biological sequences
Be able to:
- Select appropriate solutions for the bioinformatic analysis of sequences.
- Reason about bioinformatics problems, decomposing them in their basic elements
- Chain existing programs to solve problems requiring several processing steps
- Critically evaluate the results obtained.
- Use modern libraries for sequence analysis and pattern recognition.
Know:
- Most used algorithms for sequence analysis bioinformatics problems
- Modern methods for prediction and pattern recognition applied to biological sequences
General characterization
Code
12424
Credits
6.0
Responsible teacher
Ludwig Krippahl, Pedro Manuel Corrêa Calvente Barahona
Hours
Weekly - 4
Total - 48
Teaching language
Português
Prerequisites
-
Bibliography
Mandatory reading:
Haubold, Bernhard, and Thomas Wiehe. Introduction to computational biology: an evolutionary
approach. Springer Science & Business Media, 2006.
Complementary reading:
Marketa J. Zvelebil, Marketa Zvelebil, Jeremy O. Baum, Understanding Bioinformatics, Garland
Science, 2008
Miguel Rocha, Pedro G. Ferreira, Bioinformatics Algorithms: Design and Implementation in Python,
Academic Press, 2018
Bishop, Pattern Recognition and Machine Learning, Springer, 2006
Teaching method
Lectures and tutorials, introductory exercises and applications of the tools and services covered in class.
Evaluation consists on tests and essays, with a theoretical essay covering a specific technique or tool and a practical assignment in which the student must improve and apply their skills to a practical problem. Each student can choose the theme and subject of their assignments, provided they are relevant and approved by the teacher.
Evaluation method
The assessment consists of two components:
1 Project component, consisting of two practical assignments.
1 theoretical component, consisting of two tests.
The grade of the project component is the simple average of the grades of the two practical assignments, rounded up to one decimal.
The grade of the theoretical component is the simple average of the grades of the two tests, rounded up to one decimal.
The final grade is the simple average of the two components of the assessment, rounded to the nearest integer.
The grade of the recourse exam replaces the grade of the theoretical-practical component.
Subject matter
1-
Dynamic programming for local and global alignment
2-
Substitution matrices
3-
Sequence search algorithms
4-
Hierarchical and Iterative multiple sequence alignments
5-
Phylogenetics: distance, parsimony and likelihood methods; bootstrapping
6-
Supervised learning algorithms for pattern recognition and prediction in biological sequences.
7-
Unsupervised learning algorithms for clustering and modeling of biological sequences.
8-
Genome assembly and analysis algorithms
9-
Introduction to structural biology
Programs
Programs where the course is taught: