Programming for Data Science

Objectives

None

General characterization

Code

400090

Credits

7.5

Responsible teacher

Flávio Luís Portas Pinheiro

Hours

Weekly - Available soon

Total - Available soon

Teaching language

Portuguese. If Erasmus students are enrolled, classes will be taught in English.

Prerequisites

Week

Instructor

Content

1

February 12th

FLP

  • Course Overview
  • What is Python and Why are we learning it?
  • Setting up Python (Anaconda and Jupyter Notebooks)
  • "Hello World", your first program in Python
  • Variables & Data Structures (Arrays, Lists, Dictionaries, Tuples)

Chapters B1, A0 and C

2

February 19th

FLP

  • Reading and Writing to Files (I/O operations)
  • Flow Control in Python
  • Loops (For, While, iterators)
  • If Statements

Chapters C2 and C6

3

February 26th

FLP

  • Functions (def and lambda)
  • Python Standard Library
  • Implementing recursion with functions

Chapters A5 and C

4

March 12th

FLP

  • Object-Oriented Programming
  • Objects and Classes
  • Modules, Lists, and Dictionaries

Chapter A6

5

March 19th

FLP

  • Import Libraries
  • Introduction to NumPy
  • NumPy Data Types
  • Basics of NumPy Arrays
  • Aggregations and Sorting Arrays
  • Introduction to SciPy

Chapters B2, C4, and C14a

6

March 26th

FLP

  • Introduction to Pandas
  • Pandas data structures: Series and DataFrames
  • Data exploration with Pandas
  • Reading Data

Chapters B3, C5, and C6

7

April 2nd

FLP

  • Pandas Advanced Concepts
  • Analyze data with Pandas

Chapters C12 and C10

8

April 9th

FLP

  • Introduction to Statsmodels
  • Some notes on Statistics
  • Performing simple statistical analysis in Python

Chapter C13

9

April 23rd

JA

  • Introduction to Matplotlib and Seaborn
  • Using visualization to drive your data exploration

Chapters B9 and C4

10

April 30th

JA

  • Reporting your Findings

Chapters B9 and C4

11

May 7th

JA

  • Case Studies
  • Example of a full-stack project using Python
  • Worked-out exercise

Chapter C14

12

May 14th

JA

  • Final Project support

13

May 21st

JA

  • Project Oral Presentations

14

May 28th

JA

  • Practical Exam

Bibliography

Lubanovic, Bill. Introducing Python: Modern Computing in Simple Packages. O'Reilly Media, Inc., 2014.

VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, Inc., 2016.

McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012.

Grus, Joel. Data Science from Scratch: First Principles with Python. O'Reilly Media, Inc., 2015.

Additionally, students will find extensive online documentation for each of the libraries covered during the course, and suggested readings will be shared on the Moodle page.

Teaching method

  1. Practical exam (40%): an exercise to be solved during the last class. Students will have two hours to analyze, on their own computers, a dataset provided by the instructors and to answer a few analytical questions.
  2. Final project (60%): a report that details the process of acquiring, transforming, and analyzing a dataset. The project is to be developed in groups of three or four students. More details about the project will be shared on the Moodle page during the first couple of weeks.

Evaluation method

English

Subject matter

Theoretical and practical classes

Programs

Programs where the course is taught: