Big Data and Data Science for Business Analytics
Objectives
A. Knowledge and Understanding:
•Master the concepts of Big Data, Data Science and Data-Driven Decision Making
•Gain hands-on experience using data analytics to support management decisions
B. Subject-Specific Skills:
•Build predictive models for classification and regression
•Evaluate model fit and generalizability
•Learn the concepts of business experimentation and causality
C. General Skills:
•Be able to think systematically about how and when data can improve decision making in management contexts
•Be able to understand and discuss topics of data analysis for business intelligence. In particular, know the basic principles
and algorithms of data mining well enough to interact with data analytics professionals.
General characterization
Code
67935
Credits
2
Responsible teacher
Miguel Godinho de Matos
Hours
Weekly - Available soon
Total - Available soon
Teaching language
English
Prerequisites
N/A
Bibliography
The mandatory textbook for the class is:
Data Science for Business: Fundamental Principles of Data Mining and Data-Analytic Thinking, by Provost and Fawcett (2013).
We will complement the book with discussions of applications, cases, and demonstrations.
Whenever relevant, we will hand out lecture notes.
Teaching method
The course consists of theory lectures in which we cover the fundamental principles and uses of data analytics and
data mining, illustrated with examples. This is not a data mining algorithms course, but we will discuss the mechanics
of how these methods work.
Class meetings will be a combination of lectures on fundamental material, case discussions and student exercises.
There will be three homework assignments that will count towards the final grade. Homework comprises questions to be
answered and/or hands-on tasks. The hands-on tasks will be based on data that we will provide. In particular, students will
mine the data to get hands-on experience in formulating business analytics problems and using the various techniques
discussed in class.
For the hands-on assignments, we will use the R statistical language (http://cran.r-project.org/).
Evaluation method
Assessment: Participation - 10%; Homework - 40%; Final Exam - 50%.
Subject matter
•Introduction to data mining and business analytics: Data Analytics Thinking; From Big Data 1.0 to Big Data 2.0; From
Business Problems to Data Mining; Supervised vs. Unsupervised Data Analysis; The Process of Data Mining
•Introduction to predictive modeling: Finding informative attributes; Tree induction; Probability estimation
•Model fit and model overfit: Finding “optimal” model parameters based on data; Choosing the goal for data mining;
Objective functions; Loss functions; Generalization; Fitting and overfitting; Complexity control
•Model quality and performance evaluation: Evaluating classifiers; Expected value as key evaluation framework;
Visualizing model performance (ROC, Lift curve, Cumulative response, Profit curve)
•Introduction to the paradigm of causal inference: Limits of data mining; Correlation versus causation