Big Data Analysis
Objectives
This course equips students with the tools to make informed decisions in data-driven business areas. It provides a solid understanding of how big data and data analytics work, and covers some of the methods in detail, including their advantages and disadvantages. By the end of the course, students should be able to conduct analyses and make decisions in data-driven environments.
General characterization
Code
2440
Credits
3.5
Responsible teacher
Arash Laghaie
Hours
Weekly - Available soon
Total - Available soon
Teaching language
English
Prerequisites
n/a
Bibliography
Provost and Fawcett (2013) - Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
Géron (2017) - Hands-On Machine Learning with Scikit-Learn and TensorFlow
Taddy (2019) - Business Data Science
Teaching method
The course starts with an introduction to the RapidMiner software for data analysis. Students will be helped to get their own system (usually a laptop) ready to analyze datasets. The course will then introduce a number of data analytics procedures and methods applied to specific problems.
In addition, the course includes a simulation in which teams manage a racing team. By analyzing the data produced by their driver, the teams decide on the parameters used to fine-tune their car and develop the racing strategy that will lead them to victory.
Evaluation method
Participation (15%)
Simulation Performance (35%)
Final Exam (50%)
Subject matter
1. Introduction to RapidMiner in the context of data analytics
2. Identification, Prediction and Causality
3. Supervised vs. Unsupervised Data Analysis
4. Finding informative attributes
5. Forecasting and probability estimation
6. Model fit
7. Finding "optimal" model parameters based on data
8. Choosing the goal for data mining
9. Objective functions / Loss functions
10. Data Reduction
11. Model quality and performance evaluation
Programs
Programs where the course is taught: