Big Data Analysis

Objectives

This course equips students with the tools to make informed decisions in data-driven business areas. It provides a good understanding of how big data and data analytics work and covers some of the methods in detail, including their advantages and disadvantages. By the end of the course, students should be able to conduct analyses and make decisions in data-driven environments.

 

General characterization

Code

2440

Credits

3.5

Responsible teacher

Arash Laghaie

Hours

Weekly - Available soon

Total - Available soon

Teaching language

English

Prerequisites

n/a 



Bibliography

Provost and Fawcett (2013) - Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking

Géron (2017) - Hands-On Machine Learning with Scikit-Learn and TensorFlow

Taddy (2019) - Business Data Science


Teaching method

The course starts with an introduction to the software RapidMiner for data analysis. Students will be helped to set up their own system (usually a laptop) to analyze datasets. The course then introduces a number of data analytics procedures and methods applied to specific problems. In addition, the course includes a simulation in which teams manage a racing team. By analyzing the data produced by their driver, the teams decide on the parameters used to fine-tune their car and develop the racing strategy that will lead them to victory.

Evaluation method

Participation (15%)

Simulation Performance (35%)

Final Exam (50%)


Subject matter

1. Introduction to RapidMiner in the context of data analytics

2. Identification, Prediction and Causality

3. Supervised vs. Unsupervised Data Analysis

4. Finding informative attributes 

5. Forecast and Probability estimation

6. Model fit

7. Finding "optimal" model parameters based on data

8. Choosing the goal for data mining

9. Objective functions / Loss functions

10. Data Reduction

11. Model quality and performance evaluation