Data Science I
Objectives
The world's most valuable resource is data, not oil. It is therefore not surprising that organizations increasingly seek data science experts who can identify the right data sources, raise relevant business-oriented questions, and extract useful knowledge and insights for the organization. In that sense, the Data Science for Hospitality & Tourism course unit explores different techniques for performing a descriptive analysis of data using Python, and gives an understanding of how data science benefits from Big Data and the different storage technologies.
Classes will involve a mix of lectures and practical exercises. The course also has a strong active learning component: students are expected to participate actively in class and to read the recommended materials before each class. A short introduction to Python will be delivered in the first weeks of the course so that students can explore and practice, on their own, many of the theoretical concepts taught in class.
Intended Learning Outcomes:
- Understand the different kinds of data and their sources;
- Explain why Python is the preferred programming language for Data Scientists;
- Perform the extraction, exploration, transformation, and analysis of data using Python;
- Perform descriptive analytics that help you understand your data;
- Report your analysis using meaningful visualizations and simple models;
- Understand the role of Big Data Technologies in Data Science;
- Understand the different technologies and architectures for storing data for both Transactional and Analytical operations.
General characterization
Code
400112
Credits
7.5
Responsible teacher
Flávio Luís Portas Pinheiro
Hours
Weekly - Available soon
Total - Available soon
Teaching language
Portuguese. If Erasmus students are enrolled, classes will be taught in English.
Prerequisites
None
Bibliography
[A] VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, Inc., 2016.
[B] McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012.
[C] Grus, Joel. Data Science from Scratch: First Principles with Python. O'Reilly Media, Inc., 2015.
[D] Additionally, students will find rich online documentation for each of the libraries covered during the course, and suggested readings will be shared on the Moodle page.
Teaching method
Theoretical and practical classes
Evaluation method
To pass this course, students need a combined score of at least 9.5 points across the following components:
1) Practical Exam (40%): an analysis of a dataset provided by the teaching staff, to be completed within the two hours of one class;
2) Final Project (60%): a report that details the process of acquisition, transformation, and analysis of a dataset. The project is to be developed in groups of up to two students. More details about the project will be shared on the Moodle page during the first couple of weeks.
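As an illustration only, the weighting above can be sketched in a few lines of Python. The function names, the example scores, and the 0-20 grading scale are assumptions made for this sketch and are not stated in the syllabus. The sketch also assumes that only the combined score has a minimum, not each component separately.

```python
# Weights and pass mark taken from the evaluation components listed above.
EXAM_WEIGHT = 0.40
PROJECT_WEIGHT = 0.60
PASS_MARK = 9.5


def final_grade(exam: float, project: float) -> float:
    """Weighted combination of both components (same scale assumed for each)."""
    return EXAM_WEIGHT * exam + PROJECT_WEIGHT * project


def passes(exam: float, project: float) -> bool:
    """True when the combined score reaches the minimum of 9.5 points."""
    return final_grade(exam, project) >= PASS_MARK


# Hypothetical scores on an assumed 0-20 scale.
grade = final_grade(exam=8.0, project=11.0)
print(round(grade, 2), passes(8.0, 11.0))
```

Under these assumptions, a weaker exam result can be offset by a stronger project, because the project carries the larger weight.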
Subject matter
Week | Instructor | Content
1 | Flávio | Chapters C1, A0 and D
2 | Flávio | Chapters A1 and A2
3 | Flávio | Chapters A3, B6, and B7
4 | Flávio | Chapters C5, C6, and C7
5 | Flávio | Chapters B7, C9, and C10
6 | Flávio | Chapters C12, C15, and C19
7 | Flávio | Chapters C21 and C22
8 | Francisco | Chapter C23
9 | Francisco | Chapter C24
10 | Francisco | Chapters B9 and C4
11 | Francisco |
12 | Francisco |
13 | Francisco |
14 | Francisco |
Programs
Programs where the course is taught: