Big Data Analysis

Objectives

Big Data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and obtain insights from large datasets. In this course we will discuss the challenges created by Big Data and the state-of-the-art approaches to dealing with them, with a focus on the practical technologies that make this possible.

During lectures we will survey the complex and heterogeneous Big Data ecosystem. Particular emphasis will be placed on understanding the components that make up the popular Hadoop ecosystem (Hadoop, Hive, Kafka, Sqoop, and Spark) as well as the latest approaches to storing and processing Big Data (NoSQL databases). During the labs, students will gain hands-on experience with Spark in the Databricks notebook environment.

General characterization

Code

100172

Credits

6.0

Responsible teacher

Ian James Scott

Hours

Weekly - Available soon

Total - Available soon

Teaching language

Portuguese. If Erasmus students are enrolled, classes will be taught in English.

Prerequisites

Available soon

Bibliography

Teaching method

Evaluation method

Subject matter

Programs

Programs where the course is taught: