Machine Learning
Objectives
Knowledge
- Understand the paradigms and challenges of Machine Learning, distinguishing Supervised and Unsupervised learning.
- Learn the fundamental methods and their applications in data-oriented knowledge discovery. Understand data features, the selection of models and their complexity.
- Understand the advantages and disadvantages of the different methods.
Applications
- Implement and adapt Machine Learning algorithms.
- Model real data experimentally.
- Interpret and evaluate experimental results.
- Validate and compare different Machine Learning algorithms.
Soft Skills
- Evaluate the suitability of each method to concrete applications and data sets.
- Critically evaluate the results.
- Develop autonomy and self-reliance in applying and further studying Machine Learning.
General characterization
Code
11157
Credits
6.0
Responsible teacher
Claudia Alexandra Magalhães Soares, João Alexandre Carvalho Pinheiro Leite
Hours
Weekly - 4
Total - 48
Teaching language
Portuguese
Prerequisites
Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms; one recitation session will be held to review basic concepts.
- Before starting this course, you need to have significant experience programming in a general-purpose programming language. Specifically, you need to have written programs of several hundred lines of code from scratch. For undergraduate students, this will be satisfied, for example, by having passed 15-122 (Principles of Imperative Computation) with a grade of ‘C’ or higher, or comparable courses or experience elsewhere.
Note: For each programming assignment, you will be required to use Python. You will be expected to know, or be able to quickly pick up, that programming language.
- Before starting this course, you need to have basic familiarity with probability and statistics, as can be achieved at NOVA by having passed the Probability and Statistics course, or comparable courses elsewhere, with a grade of ‘C’ or higher.
- Before starting this course, you need to have college-level maturity in discrete and continuous mathematics, as can be achieved by having passed mathematical analysis 1 and 2, linear algebra and analytic geometry, and discrete mathematics, or comparable courses elsewhere, with a grade of ‘C’ or higher.
You must strictly adhere to these prerequisites! Even if the registration system does not prevent you from registering for this course, it is still your responsibility to make sure you have all of these prerequisites before you register.
Bibliography
- T. Mitchell. Machine Learning, McGraw-Hill, 1997.
- C. M. Bishop. Pattern Recognition and Machine Learning, Springer, 2006.
- E. Alpaydin. Introduction to Machine Learning, Second Edition, MIT Press, 2010.
- S. Marsland. Machine Learning: An Algorithmic Perspective, CRC Press, 2011.
Teaching method
Available soon
Evaluation method
The evaluation of this course consists of two components: theoretical/problems (T) and project (P). Each component contributes 50% to the final grade, and both are graded on a scale from 0 to 20.
To pass, the student must obtain a grade of at least 9.5 in the theoretical/problems component and at least 9.5 in the project component. The final grade is then calculated as the weighted average of the two evaluation components (0.5×T + 0.5×P), expressed on an integer scale from 0 to 20 points. If the final grade is higher than 17 points, an oral examination is held to defend the grade obtained. After the oral examination, the grade will not be lower than 17 points. A student whose final grade is higher than 17 and who does not attend the oral examination receives a final grade of 17 points.
Theoretical/problems component (T)
This component is evaluated through two written tests, with the grade being the average of the two tests. Alternatively, this component can be evaluated through a two-part written exam. If the student does better on the exam, the exam grade replaces the average of the two tests. Additionally, an extra point may be awarded for submitting a report that transcribes and deepens the contents of one theoretical class, or for correctly answering questions posted by other students on the Discord discussion board.
Project component (P)
This component is evaluated through two mini-projects. Some tutorial classes will be allocated to the mini-projects, which are completed in student groups but evaluated individually. The grading of the different evaluation components is rounded to the first decimal place, and the final grade is rounded to the closest integer value.
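The grading rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the official grading procedure: the function names, the `oral_result` parameter, and returning `None` for a failed minimum are hypothetical, the extra point is left out, and two details the rules do not state are assumed, namely that ties round upwards (so 9.5 becomes 10) and that the "higher than 17" check applies to the already rounded integer grade.

```python
from decimal import Decimal, ROUND_HALF_UP

MIN_COMPONENT = Decimal("9.5")  # minimum grade required in each component
ORAL_THRESHOLD = 17             # final grades above this require an oral examination


def round_half_up(value, step):
    """Round to the given step ("0.1" or "1"); ties go up (an assumption)."""
    return Decimal(str(value)).quantize(Decimal(step), rounding=ROUND_HALF_UP)


def theory_grade(test1, test2, exam=None):
    """T: average of the two tests, replaced by the exam grade when it is higher."""
    average = (Decimal(str(test1)) + Decimal(str(test2))) / 2
    best = average if exam is None else max(average, Decimal(str(exam)))
    return round_half_up(best, "0.1")


def final_grade(t, p, oral_result=None):
    """Integer final grade, or None when a component is below the minimum.

    oral_result is hypothetical: the grade defended in the oral examination,
    or None when the student did not attend it.
    """
    # Component grades are rounded to the first decimal place.
    t, p = round_half_up(t, "0.1"), round_half_up(p, "0.1")
    if t < MIN_COMPONENT or p < MIN_COMPONENT:
        return None
    # Weighted average 0.5*T + 0.5*P, rounded to the nearest integer.
    grade = int(round_half_up(Decimal("0.5") * t + Decimal("0.5") * p, "1"))
    if grade <= ORAL_THRESHOLD:
        return grade
    if oral_result is None:
        return ORAL_THRESHOLD  # oral examination not attended: set to 17
    return max(ORAL_THRESHOLD, int(oral_result))  # never below 17 after the oral
```

For example, under these assumptions `final_grade(theory_grade(12, 13), 16)` gives 14, while `final_grade(18, 19)` gives 17 because the oral examination was not attended. Python's built-in `round` resolves ties to the nearest even number, which is why the sketch uses `decimal` with an explicit rounding mode instead.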
😢 Plagiarism
Plagiarism is any attempt to present yourself as the author of text, code, or any other work that is not yours. Plagiarism breaks the trust necessary for fair evaluation, and any student submitting plagiarized work will fail the course immediately. If you rely on any material other than what was provided in this course, you must identify it and credit your sources. Copying course material verbatim is also not permitted without proper credit. Also, note that you will be graded on the work you actually did, so even if you avoid plagiarism by crediting your sources, you still need to do your own implementation and write text in your own words.
🤖 Language Model Policy
👉 You can use language models like ChatGPT for spell-checking and to obtain generic code structures. However, relying solely on language models will not be sufficient for completing assignments, and definitely not for tests and exams. Take the model’s own warning seriously: model responses can be inaccurate or misleading.
Subject matter
1. Introduction to Machine Learning
Machine Learning paradigms: Supervised Learning, Unsupervised Learning and Reinforcement Learning.
2. Data
2.1 Types of Data
2.2 Measures of similarity and dissimilarity
2.3 Data normalization and visualization
2.4 Dimensionality reduction by Principal Component Analysis
3. Supervised Learning
3.1 Regression
3.2 Decision Trees
3.3 Artificial Neural Networks
3.4 Support Vector Machines
3.5 Graphical models
3.6 K-nearest neighbour classifier
3.7 Methods for classifier evaluation and comparison
3.8 Ensembles
4. Unsupervised Learning
4.1 Partitional clustering
4.2 Probabilistic clustering
4.3 Partitional Fuzzy clustering
4.4 Hierarchical clustering
4.5 Markov chain
4.6 Clustering evaluation methods
4.7 Other unsupervised learning topics