Knowledge Discovery


At the end of this course unit the student is expected to have acquired knowledge, skills and competences that will allow him / her to:

Identify knowledge discovery problems

Specify algorithms
Specify, develop and implement data mining solutions

Written and oral communication skills
Demonstration skills
Producing reports on the analysis, design and implementation of a solution
Work management, time management and meeting delivery deadlines
Teamwork and team participation

General characterization





Responsible teacher

João Paulo Branquinho Pimentão, Pedro Alexandre da Costa Sousa


Weekly - 4

Total - 56

Teaching language



Not Defined.


Data Mining, by Eibe Frank, Christopher Pal, Mark Hall and Ian H. Witten, ISBN: 9780128042915, Elsevier Science & Technology

Handbook of Data Mining and Knowledge Discovery, 1st Edition, Willi Klösgen and Jan M. Zytkow (Editors), ISBN-13: 978-0195118315

Big Data Analytics: Systems, Algorithms, Applications, 1st ed. 2019 Edition, by C.S.R. Prabhu et al., ISBN-13: 978-9811500930

Information Theory, Inference, and Learning Algorithms - MacKay, David, Cambridge University Press, ISBN: 978-0521642989

Principles of Data Mining - Hand, David; Smyth, Padhraic; Mannila, Heikki, MIT Press, ISBN: 978-0262082907

Pattern Recognition and Machine Learning - Bishop, Christopher M., Springer, ISBN: 978-0387310732

Visualize This: The FlowingData Guide to Design, Visualization, and Statistics - Yau, Nathan, John Wiley & Sons, ISBN: 978-0470944882

Teaching method

The course is divided into theoretical-practical classes and practical classes.
In the theoretical-practical classes the subjects are introduced and practical problems are formulated, which the students then solve in the corresponding practical classes.
In the practical classes the students implement their solutions to these problems.
All the practical work contributes to a larger integrated project, which the students must deliver by a set deadline together with a report covering its analysis, design and implementation.

Evaluation method

Theoretical-Practical component (weight of 34%) - NTP:
Assessed through one test or an exam.
A minimum grade (exam grade or test average) of 9.5 points is required.

Practical component (weight of 66%) - NP:
1st project: 26%; 2nd project: 40%. Projects are delivered through Moodle and evaluated on the functionality implemented.
A minimum average grade of 9.5 points is required.

NOTE: Passing grades obtained in the previous year may be carried over to this semester.

Calculation of final grade - NF:
NF = 34% * NTP + 66% * NP
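
As an illustration only, the grading rules above can be sketched in Python. The sketch takes NTP as the theoretical-practical grade and NP as the practical grade, as the component headings define them. It assumes that NP is the average of the two projects weighted by their shares (26 and 40 out of 66) and that failing either 9.5 minimum means no final grade is awarded; neither detail is stated explicitly in the rules, and the function names are hypothetical.

```python
from typing import Optional

MIN_GRADE = 9.5  # minimum required in each component, per the rules above


def practical_grade(p1: float, p2: float) -> float:
    """NP as a weighted average of the two projects.

    Assumption: the projects are weighted 26 and 40 out of the 66%
    practical share; the syllabus only says "average grade".
    """
    return (26 * p1 + 40 * p2) / 66


def final_grade(ntp: float, p1: float, p2: float) -> Optional[float]:
    """Return NF, or None when a component minimum is not met (assumed)."""
    np_grade = practical_grade(p1, p2)
    if ntp < MIN_GRADE or np_grade < MIN_GRADE:
        return None
    return 0.34 * ntp + 0.66 * np_grade


if __name__ == "__main__":
    # Test/exam grade 12, projects graded 14 and 16
    print(final_grade(12, 14, 16))
```

Under these assumptions, a student with 12 in the theoretical-practical component and 14 and 16 in the projects gets 0.34 * 12 + 0.26 * 14 + 0.40 * 16 = 14.12.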

Subject matter


  • Intelligent systems

  • Data warehouse

  • Knowledge discovery


Managing Knowledge discovery projects



Data Warehouse and OLAP 

  • Data Warehouse and DBMS 

  • Multidimensional data model 

  • OLAP


Data preprocessing 

  • Data cleaning 

  • Data transformation 

  • Data reduction 

  • Concept hierarchies

  • Data Quality


Data mining knowledge representation 

  • Interestingness measures 

  • Input data

  • Models 

  • Visualization techniques 



  • Classification/regression

  • Segmentation

  • Instance-based methods (nearest neighbor) 

  • Association

  • Clustering


Evaluating what's been learned

  • Training and testing 

  • Estimating classifier accuracy (holdout, cross-validation, leave-one-out) 

  • Combining multiple models 


Mining real data 


Dealing with Big Data

  • What makes data Big Data

  • Scalable Data Analytics Framework

  • Large-scale Data Analysis Models

  • Distributed Storage Architecture

  • NoSQL Databases

  • Data Flow Management

Ethics and privacy