Data Modelling
Objectives
Knowledge:
- NoSQL models
- Graph modelling and query languages
- Linked Open Data principles and Semantic Web concepts
- Languages for representing, reasoning and querying in the Semantic Web
- Concepts, architectures and models of a Data Warehouse
- Multidimensional data modelling for OLAP querying.
Application:
- Identify applications requiring graph modelling
- Model a graph database and query it (e.g. Neo4j with Cypher queries)
- Use a triple store and inference engine (e.g. Apache Jena) to query Semantic Web data with SPARQL
- Analyse, design and query multidimensional models.
- Use a temporal database
Soft skills:
- To explore the recent literature on a topic autonomously
- To develop critical reasoning regarding recent technology
- To work in a team
- To orally present a survey on a recent topic
- To review a scientific work
General characterization
Code
11559
Credits
6.0
Responsible teacher
Carlos Augusto Isaac Piló Viegas Damásio, João Carlos Gomes Moura Pires
Hours
Weekly - 4
Total - 50
Teaching language
Portuguese
Prerequisites
To take this curricular unit you must first have passed Database Systems.
Bibliography
• Ian Robinson, Jim Webber, and Emil Eifrem. Graph Databases. O'Reilly Media, Inc., 2013.
• Grigoris Antoniou, Paul Groth, Frank van Harmelen, and Rinke Hoekstra. A Semantic Web Primer, 3rd Edition. MIT Press, August 2012.
• Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter Patel-Schneider (eds.). The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, June 2010.
• Ralph Kimball and Margy Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 3rd Edition. Wiley, 2013.
• Guy Harrison. Next Generation Databases: NoSQL, NewSQL and Big Data. Apress, 2015.
ISBN: 978-1484213308.
• Dan Sullivan. NoSQL for Mere Mortals. Addison-Wesley, 2015.
ISBN: 978-0134023212
ISBN: 978-1634621090
Teaching method
The unit combines recitation lectures (2T), in which the main subjects are presented, with laboratory classes (2P). Students are expected to explore these subjects autonomously. Written lecture notes and handout slides closely follow the presentation in the recitation lectures.
In the laboratory classes, students experiment with the tools and use the query languages to solve predefined problems and tasks.
Evaluation consists of two individual midterms (each worth 25% of the final grade), a final team project (35%), and an oral presentation and discussion of a colleague's project (15%).
Each midterm has a minimum grade of 8/20 and the average of the midterms must be at least 10/20, after rounding.
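As a worked illustration of the rule above, the following Python sketch computes a final grade from the weights and minimums just stated. The function names are hypothetical, and two points are assumptions rather than stated policy: that "after rounding" means rounding the midterm average half up to the nearest integer, and that missing a minimum simply yields no final grade.

```python
import math
from typing import Optional


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with .5 going up (assumed rounding rule)."""
    return math.floor(value + 0.5)


def final_grade(midterm1: float, midterm2: float,
                project: float, presentation: float) -> Optional[float]:
    """Return the weighted grade on a 0-20 scale, or None if a minimum is missed.

    Weights and minimums are taken from the text: 25% per midterm,
    35% team project, 15% oral presentation and discussion.
    Returning None on a missed minimum is an assumption for illustration.
    """
    # Each midterm has a minimum grade of 8/20.
    if midterm1 < 8 or midterm2 < 8:
        return None
    # The midterm average must be at least 10/20 after rounding.
    if round_half_up((midterm1 + midterm2) / 2) < 10:
        return None
    return (0.25 * midterm1 + 0.25 * midterm2
            + 0.35 * project + 0.15 * presentation)
```

For example, midterms of 9 and 10 average 9.5, which rounds to 10 under the assumed rule, so the weighted grade is computed; midterms of 7 and 15 fail the per-midterm minimum even though their average is 11.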
Evaluation method
Evaluation consists of two individual midterms (each worth 25% of the final grade) and a final team project (50%), graded individually after an oral presentation and discussion. The average of the midterms must be at least 10/20, after rounding. Any student may be called to an oral discussion to replace the tests / exam component.
The tests and exam are held in person, and students may consult only a single help sheet provided by the teachers.
Subject matter
1. NoSQL data models
Alternative models for storing big volumes of data. Column, document and graph models. Relational, semi-structured and graph data. Data modelling with graphs. Querying graph models. Graph databases. Relationship to NoSQL movement and key-value stores.
2. Semantic Web
Motivation. Linked Open Data. Language and semantics of the Resource Description Framework (RDF) and SPARQL query language. Ontologies in the Semantic Web: RDF Schema and Web Ontology Language (OWL).
3. Online Analytical Processing (OLAP)
Data Warehouses. (Conceptual) multidimensional data models. Typical OLAP operations and OLAP query languages. Metadata. Spatial and temporal dimensions. Interaction in the data analysis process.
4. Exercises and final project
Use of tools (graph databases, temporal databases, RDF and OWL APIs, OLAP and multidimensional tools).