Data Modelling

Objectives

Knowledge:

  • NoSQL models
  • Graph modelling and query languages
  • Linked Open Data principles and Semantic Web concepts
  • Languages for representing, reasoning and querying in the Semantic Web
  • Concepts, architectures and models of a Data Warehouse
  • Multidimensional data modelling for OLAP querying

Application:

  • Identify applications requiring graph modelling
  • Model a graph database and query it (e.g. Neo4j with Cypher queries)
  • Use a triple store and inference engine (e.g. Apache Jena) to query Semantic Web data with SPARQL
  • Analyse, design and query multidimensional models
  • Use a temporal database

Soft skills:

  • Explore the recent literature on a topic autonomously
  • Develop critical reasoning about recent technology
  • Work in a team
  • Present a survey on a recent topic orally
  • Review a scientific work

General characterization

Code

11559

Credits

6.0

Responsible teacher

Carlos Augusto Isaac Piló Viegas Damásio, João Carlos Gomes Moura Pires

Hours

Weekly - 4

Total - 50

Teaching language

Portuguese

Prerequisites

To take this curricular unit you must first have passed Database Systems.

Bibliography

• Ian Robinson, Jim Webber, and Emil Eifrem. Graph Databases. O'Reilly Media, Inc., 2013.

• Grigoris Antoniou, Paul Groth, Frank van Harmelen, and Rinke Hoekstra. A Semantic Web Primer, 3rd Edition. MIT Press, August 2012.

• Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter Patel-Schneider (eds.). The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, June 2010.

• Ralph Kimball and Margy Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 3rd Edition. Wiley, 2013.

• Guy Harrison. Next Generation Databases: NoSQL, NewSQL and Big Data. Apress, 2015.
ISBN: 978-1484213308.

• Dan Sullivan. NoSQL for Mere Mortals. Addison-Wesley, 2015.
ISBN: 978-0134023212

• Ted Hills. NoSQL and SQL data modeling. Technics Publications, 2016.
ISBN: 978-1634621090

Teaching method

The course combines recitation lectures (2T) and laboratory classes (2P). The recitation lectures present the main subjects, which students are then expected to explore autonomously. Written lecture notes and handout slides closely follow the presentation in the recitation lectures.

In the laboratory classes students experiment with the tools and use the query languages to solve predefined problems and tasks.

Evaluation consists of 2 individual midterms (each worth 25% of the final grade), a final team project (35%), and an oral presentation and discussion of a colleague’s project (15% of the grade).

Each midterm has a minimum grade of 8/20 and the average of the midterms must be at least 10/20, after rounding.

Evaluation method

Evaluation consists of 2 individual midterms (each worth 25% of the final grade) and a final team project (50%), which receives an individual grade after an oral presentation and discussion; this grade must be at least 10/20. The average of the midterms must be at least 10/20, after rounding, and each test must be at least 8/20. Any student may be called to an oral discussion to replace the tests / exam component.

The tests and exam will be in person, with the possibility of consulting only one help sheet provided by the teachers.
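
The rules in this section can be checked with a short calculation. The following Python sketch is only an illustration of the weights and minimum grades stated above, not an official grading tool. The function names are hypothetical, and rounding half up to the nearest integer (for both the midterm average and the final grade) is an assumption, since the text does not say how rounding is done.

```python
import math


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with .5 going up (assumed convention)."""
    return math.floor(value + 0.5)


def final_grade(midterm1: float, midterm2: float, project: float) -> tuple[int, bool]:
    """Return (final grade out of 20, whether all minimum grades are met).

    Weights follow the text above: 25% per midterm and 50% for the
    individual project grade. Rounding the final grade is an assumption.
    """
    midterm_average = round_half_up((midterm1 + midterm2) / 2)
    grade = round_half_up(0.25 * midterm1 + 0.25 * midterm2 + 0.50 * project)
    minimums_met = (
        min(midterm1, midterm2) >= 8   # each test at least 8/20
        and midterm_average >= 10      # rounded midterm average at least 10/20
        and project >= 10              # individual project grade at least 10/20
    )
    return grade, minimums_met


if __name__ == "__main__":
    print(final_grade(9, 11, 14))  # midterm average of exactly 10
    print(final_grade(9, 10, 10))  # average 9.5 reaches 10 only after rounding
    print(final_grade(7, 15, 16))  # one test below the 8/20 minimum
```

Under these assumptions, a student with midterms of 9 and 10 still meets the midterm requirement, because the average of 9.5 rounds up to 10.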

Subject matter

1. NoSQL data models

Alternative models for storing large volumes of data. Column, document and graph models. Relational, semi-structured and graph data. Data modelling with graphs. Querying graph models. Graph databases. Relationship to the NoSQL movement and key-value stores.

2. Semantic Web

Motivation. Linked Open Data. Language and semantics of the Resource Description Framework (RDF) and the SPARQL query language. Ontologies in the Semantic Web: RDF Schema and Web Ontology Language (OWL).

3. Online Analytical Processing (OLAP)

Data Warehouses. (Conceptual) multidimensional data models. Typical OLAP operations and OLAP query languages. Metadata. Spatial and temporal dimensions. Interaction in the data analysis process.

4. Exercises and final project

Use of tools (graph databases, temporal databases, RDF and OWL APIs, OLAP and multidimensional tools).

Programs

Programs where the course is taught: