Descriptive Methods of Data Mining

Objectives

Data Mining draws on techniques from several disciplines, including statistics, data visualization, database systems, and machine learning, to identify original, useful, and understandable patterns in data.
This course familiarizes students with Data Mining applications and the lifecycle of Data Mining projects. Students will learn techniques for understanding and preparing data before building descriptive models such as clustering or association rules (e.g., market basket analysis).

General characterization

Code

200165

Credits

7.5

Responsible teacher

Roberto André Pereira Henriques

Hours

Weekly - To be made available soon

Total - To be made available soon

Teaching language

Portuguese. If there are Erasmus students, classes will be taught in English.

Prerequisites

No prior familiarity with the main theme of the course is required. However, it is highly recommended that students have knowledge of Inferential Statistics and good general computer skills.

Bibliography

Keller, G. and Gaciu, N. (2020). Statistics for Management and Economics (2nd edition), Cengage Learning

Han, J., Kamber, M., Pei, J. (2012). Data Mining - Concepts and Techniques (Third edition), Morgan Kaufmann

Jain, A.K., Murty, M.N., Flynn, P.J. (1999). Data Clustering: A Review, ACM Computing Surveys, 31(3), 264-323

Linoff, G. S. and Berry, M. J. A. (2011). Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management (Third edition). Wiley Publishing, Inc.

SAS (2014). Course Notes, Enterprise Miner™: Applying Data Mining Techniques. Available from https://documents.pub/document/sas-notes-sas-enterprise-miner-software-applying-data-mining-techniques.html

Teaching method

The course combines theoretical and practical classes. Teaching strategies include slide presentations, step-by-step walkthroughs of practical examples, and question-and-answer sessions. The practical component focuses on exploring the tools introduced to students (Microsoft Excel and SAS Enterprise Miner) and on developing the project.

Evaluation method

1st Season: Exam (60%), Project (40%)

2nd Season: Exam (60%), Project (40%)

Rules:

  • The minimum grade in both the exam and the term project is 8.0 (out of 20)
  • Projects not submitted on Moodle by the deadline will be rejected
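
As a rough illustration of how the weights and the minimum-grade rule interact, the sketch below assumes the two components are combined as a plain weighted average on the 0-20 scale. The function names and the rounding to two decimals are illustrative assumptions, not part of the official rules, and no overall passing threshold is assumed because none is stated here.

```python
# Weights and minimum taken from the evaluation rules above.
EXAM_WEIGHT = 0.60
PROJECT_WEIGHT = 0.40
MIN_COMPONENT_GRADE = 8.0  # out of 20, applies to exam and project alike


def weighted_grade(exam: float, project: float) -> float:
    """Weighted average of exam and project (assumed combination rule)."""
    return round(EXAM_WEIGHT * exam + PROJECT_WEIGHT * project, 2)


def meets_minimums(exam: float, project: float) -> bool:
    """True when both components reach the 8.0 minimum."""
    return exam >= MIN_COMPONENT_GRADE and project >= MIN_COMPONENT_GRADE


if __name__ == "__main__":
    exam, project = 12.0, 15.0  # hypothetical grades
    print("Weighted grade:", weighted_grade(exam, project))
    print("Minimums met:", meets_minimums(exam, project))
```

Under this reading, a strong project cannot compensate for an exam grade below 8.0, since the minimum applies to each component separately.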

Content

LU1. Introduction to Data Mining

LU2. Methodological aspects (KDD, SEMMA, CRISP-DM)

LU3. Data visualization

LU4. Data understanding

LU5. Data preparation

LU6. Clustering

LU7. Self-Organizing Maps

LU8. RFM model

LU9. Association rules and the Apriori algorithm

LU10. Data similarity and dissimilarity measures