Programming for Data Science
The Programming for Data Science curricular unit is aimed at students without prior programming experience. In this unit, students will learn the fundamentals of Python programming needed for a successful career in data science. Starting from the very basics of programming, the unit progresses rapidly towards the advanced computing techniques and concepts needed to develop a data science project. Along the way, students will gain experience with the core stack of libraries (Pandas, NumPy, SciPy, Seaborn, Statsmodels, NetworkX) that make Python the language of choice among data scientists.
At the end of the curricular unit, students are expected to be able to develop a data science project independently and to feel comfortable with the programming activities in other curricular units. The curricular unit has a strong active-learning component, so students are expected to participate during classes and to read the recommended weekly materials.
Flávio Luís Portas Pinheiro
Weekly - Available soon
Total - Available soon
Portuguese. If there are Erasmus students, classes will be taught in English
The curricular unit does not have technical enrollment requirements.
Classes will be taught in English, so students are expected to have a good level of comprehension and communication in English.
- Lubanovic, Bill. Introducing Python: Modern Computing in Simple Packages. O'Reilly Media, 2014.
- VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, 2016.
- McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, 2012.
- Grus, Joel. Data Science from Scratch: First Principles with Python. O'Reilly Media, 2015.
- Additional reading materials, including documentation and book chapters, will be shared with all students in Moodle.
The curricular unit combines theoretical and practical lessons with a strong active-learning component. In each session, students are introduced to new concepts and methodologies, case studies, and worked examples. Active learning activities (debates, quizzes, mud cards, compare and contrast) place students at the center of the classroom, promoting peer teaching and encouraging constructive discussion. Computer activities take place weekly during the practical lessons.
EE1 - Participation in classroom activities (10%)
EE2 - Homework Assignments (50%)
EE3 - Practical Exam (40%).
To successfully complete this curricular unit, students need to score a minimum of 9.5 points. Grading is divided into two seasons. The second season is optional for students who passed the curricular unit in the first season, and they can use it to improve their grade.
The first grading season is dedicated to continuous evaluation, which includes the following components:
- Quizzes (10%): A set of multiple-choice questions at the start of each lecture. Quizzes will be run on Socrative. Students can answer the quiz on a smartphone or laptop, as long as they have an internet connection and a web browser. Login details will be shared in Moodle during the first week of classes. Students are encouraged to discuss the questions with their colleagues during the quiz;
- Homework Assignments (50%): Three sets of problems to be solved during the seven weeks of the curricular unit. Each assignment is individual. Homework assignments will be released during weeks 3, 5, and 6. Students will have 72 hours to deliver the solution through Moodle. A penalty will be applied to late deliveries (1 point per day);
- Practical Exam (40%): The final practical challenge is a 48-hour assignment in which students implement the steps of a proposed data science project. The exam will be released in the final week of the curricular unit.
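As a worked illustration of how the three weights above combine into a first-season grade, here is a minimal Python sketch. It assumes that every component is graded on the same scale (the syllabus does not state the scale explicitly) and it ignores the late-delivery penalty. The names `first_season_grade` and `has_passed` are hypothetical and are not part of any course material.

```python
# Weights taken from the syllabus, as percentages of the final grade.
WEIGHTS = {"quizzes": 10, "homework": 50, "exam": 40}

# Minimum final grade needed to pass, as stated in the syllabus.
PASS_MARK = 9.5


def first_season_grade(quizzes, homework, exam):
    """Return the weighted grade.

    Assumption: all three component grades use the same scale.
    """
    weighted_sum = (
        WEIGHTS["quizzes"] * quizzes
        + WEIGHTS["homework"] * homework
        + WEIGHTS["exam"] * exam
    )
    # Divide once at the end so the percentages stay exact integers.
    return weighted_sum / 100


def has_passed(grade):
    """Return True when the grade reaches the pass mark."""
    return grade >= PASS_MARK


if __name__ == "__main__":
    grade = first_season_grade(quizzes=16, homework=12, exam=10)
    print(grade, has_passed(grade))
```

Because homework carries half of the weight, a weak homework result is difficult to offset with the other two components.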
The second grading season will take place in January and consists of a multiple-choice exam with 40 questions. Each correct answer is worth 0.5 points, and each incorrect answer deducts 0.2 points.
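The scoring rule for the second-season exam can be sketched in Python as follows. This is only an illustration under two assumptions that the syllabus does not confirm: unanswered questions score zero, and a negative total is not floored at zero. The function name `second_season_score` is hypothetical.

```python
# Exam parameters taken from the syllabus.
QUESTIONS = 40
TENTHS_PER_CORRECT = 5    # 0.5 points gained per correct answer
TENTHS_PER_INCORRECT = 2  # 0.2 points deducted per incorrect answer


def second_season_score(correct, incorrect):
    """Return the exam score in points.

    Assumptions: unanswered questions are worth zero, and the total
    is not floored at zero.
    """
    if correct < 0 or incorrect < 0:
        raise ValueError("answer counts cannot be negative")
    if correct + incorrect > QUESTIONS:
        raise ValueError("more answers than questions in the exam")
    # Work in tenths of a point to avoid floating-point rounding.
    tenths = TENTHS_PER_CORRECT * correct - TENTHS_PER_INCORRECT * incorrect
    return tenths / 10


if __name__ == "__main__":
    print(second_season_score(correct=30, incorrect=10))
```

With 40 questions at 0.5 points each, the maximum score is 20 points. In this sketch, 30 correct and 10 incorrect answers yield 13.0 points.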
The curricular unit is organized in three Learning Units (LU):
LU0. Introduction to programming fundamentals using Python.
LU1. Exploration of the most relevant libraries in the Python data science stack.
LU2. Use of the entire stack and its different parts to develop a data science project.