Programming for Data Science
Objectives
The Programming for Data Science curricular unit is aimed at students without prior programming experience. In this unit, students will learn the fundamentals of programming in Python necessary for a successful career in data science. Starting from the very basics of programming, we will progress rapidly towards the advanced computing techniques and concepts needed to develop a data science project. Throughout the unit, students will gain experience with the core stack of libraries (Pandas, NumPy, SciPy, Seaborn, Statsmodels, NetworkX) that make Python the language of choice among data scientists.
At the end of the curricular unit, students are expected to have the capacity to use programming to develop a data science project independently and to feel comfortable with the programming activities in other curricular units. The curricular unit has a strong, active learning component, and, as such, students are expected to participate during classes and read the recommended weekly materials.
General characterization
Code
200211
Credits
3.5
Responsible teacher
Flávio Luís Portas Pinheiro
Hours
Weekly - Available soon
Total - Available soon
Teaching language
Portuguese. If there are Erasmus students, classes will be taught in English
Prerequisites
The curricular unit does not have technical enrollment requirements.
Classes will be taught in English, and as such students are expected to have a good level of comprehension and communication in English.
Bibliography
- Lubanovic, Bill. Introducing Python: Modern Computing in Simple Packages. O'Reilly Media, Inc., 2014.
- VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, Inc., 2016.
- McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012.
- Grus, Joel. Data Science from Scratch: First Principles with Python. O'Reilly Media, Inc., 2015.
- Additional reading materials, including documentation and book chapters, will be shared with all students in Moodle.
Teaching method
The curricular unit is based on a mix of theoretical and practical lessons with a strong active learning component. During each session, students are exposed to new concepts and methodologies, case studies, and worked examples. Active learning activities (debates, quizzes, mud cards, compare-and-contrast) will place students at the center of the classroom, promoting peer teaching and encouraging constructive discussion. Computer activities will take place weekly during the practical lessons.
Evaluation Elements:
EE1 - Participation in classroom activities (10%)
EE2 - Homework Assignments (50%)
EE3 - Group Assignment (40%).
Evaluation method
To successfully finish this curricular unit, students need to score a minimum of 9.5 points. The grading is divided into two seasons. Attendance in the second season is optional for students who passed the curricular unit in the first season, and it can be used to improve their grade.
First Season
The first grading season is dedicated to continuous evaluation, which includes the following components:
- Quizzes (10%): a classroom activity. A set of multiple-choice questions is given at the start of each lecture. Quizzes are run on Socrative, and students can answer using their smartphone or laptop. Login details will be shared in Moodle during the first week of classes. Where possible, students are encouraged to discuss the questions with their colleagues during the quiz;
- Homework Assignments (50%): an individual assignment. Homework assignments will be released at the end of weeks 3, 5, and 6. Students will have 72 hours (3 days) to submit their solution through Moodle. Assignments consist of simple problem sets designed to challenge students and encourage them to practice programming. A penalty of 1 point per day will be applied to late deliveries;
- Group Exam (40%): a group activity. During the 8th week of the semester (Monday at noon), we will release a list of steps for the execution of a short data science project. Students will have to implement the proposed steps in Python and will be graded on how successfully they implement each step and on the clarity and quality of the solution. Students will have 72 hours (3 days) to solve the challenge and deliver it through Moodle. A penalty of 1 point per day will be applied to late deliveries.
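As an illustration only, the weighting of the three components can be sketched in Python. The sketch assumes that each component is graded on a 0-20 scale, that the late penalty is subtracted from the grade of the late component, and that a grade cannot fall below zero. None of these details is stated above, and the function names and student grades are hypothetical.

```python
# Component weights as listed in the first-season evaluation.
WEIGHTS = {"quizzes": 0.10, "homework": 0.50, "group": 0.40}
PASS_MARK = 9.5  # minimum score needed to pass the curricular unit


def apply_late_penalty(grade, days_late, penalty_per_day=1.0):
    """Subtract 1 point per day late; the floor at zero is an assumption."""
    return max(0.0, grade - penalty_per_day * days_late)


def first_season_grade(quizzes, homework, group):
    """Weighted sum of the three components (each assumed to be on a 0-20 scale)."""
    return (
        WEIGHTS["quizzes"] * quizzes
        + WEIGHTS["homework"] * homework
        + WEIGHTS["group"] * group
    )


# Hypothetical student whose homework was delivered one day late.
homework = apply_late_penalty(15.0, days_late=1)
final = first_season_grade(quizzes=16.0, homework=homework, group=12.0)
print(round(final, 2), final >= PASS_MARK)
```

Because the weights sum to 1, the result stays on the same scale as the individual component grades.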
Second Season
The second grading season will take place in January and consists of a multiple-choice exam with 40 questions. Each correct answer is worth 0.5 points, and each incorrect answer deducts 0.2 points.
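The scoring rule for the exam can be written as a short Python function. This is a minimal sketch: it assumes that unanswered questions neither add nor deduct points, which is not stated above, and the function name and example numbers are hypothetical.

```python
def exam_score(correct, incorrect, total_questions=40):
    """Score a multiple-choice exam: +0.5 per correct answer, -0.2 per incorrect one.

    Unanswered questions are assumed to score zero.
    """
    if correct < 0 or incorrect < 0 or correct + incorrect > total_questions:
        raise ValueError("answer counts must be non-negative and fit within the exam")
    return 0.5 * correct - 0.2 * incorrect


# Hypothetical exam: 28 correct, 8 incorrect, 4 left blank.
score = exam_score(correct=28, incorrect=8)
print(round(score, 1))
```

With all 40 questions answered correctly the function returns 20 points, so a student needs at least 19 correct answers and no incorrect ones to reach the 9.5-point minimum.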
Subject matter
The curricular unit is organized into three Learning Units (LU):
LU0. Introduction to programming fundamentals using Python.
LU1. Exploration of the most relevant libraries in the Python data science stack.
LU2. Use of the entire stack and its different parts to develop a data science project.