Text Analytics
Objectives
Natural language processing (NLP) is a subfield of computer science, information engineering and artificial intelligence that is focused on the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of unstructured textual data.
During the text analytics course, we will learn the basic concepts, main formalisms, techniques and algorithms used in natural language processing. The course is aimed at students with little or no programming experience; nevertheless, it is highly practical and has a strong programming component. By the end of the course, students will be able to apply these concepts to real-world applications (e.g. chatbots and translation).
Classes will involve a mix of lectures and practical exercises. The course also has a strong active learning component: students are expected to participate actively in class and to read the recommended materials before each class. A short introduction to Python will be delivered in the first weeks of the course so that students can explore and practice many of the theoretical concepts taught in class.
Intended Learning Outcomes
- Explain why natural language processing and text analytics are key to human interaction with computers;
- Understand the basic approaches to building systems that work with textual data;
- Use the most suitable libraries for your needs;
- Perform the extraction, manipulation, analysis, and modeling of textual data;
- Feel comfortable using Python as a tool for your NLP projects!
General characterization
Code
200168
Credits
7.5
Responsible teacher
Flávio Luís Portas Pinheiro
Hours
Weekly - Available soon
Total - Available soon
Teaching language
Portuguese. If there are Erasmus students, classes will be taught in English.
Prerequisites
None
Bibliography
[A] Sarkar, Dipanjan. "Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data". Apress; 1st edition (December 1, 2016)
[B] Jurafsky, Daniel and Martin, James H. "Speech and Language Processing". Prentice Hall; 2nd edition (May 16, 2008)
Teaching method
Theoretical and Practical classes
Evaluation method
To successfully finish this course, students need a combined score of at least 9.5 points across the following components:
- Theoretical Tests (25%): two mini-tests to be completed during classes. Students will have one hour to answer a few theoretical questions;
- Continuous Evaluation (15%): simple quizzes to be done during classes;
- Final Project (60%): a report that details the process of transformation, manipulation, analysis and application of the learned techniques for a specific NLP task. The project is to be developed in groups of up to two or three students. More details about the project will be shared on the Moodle page during the first couple of weeks.
Subject matter
Week | Instructor | Content |
1 | FLP & RR | |
2 | RR | |
3 | RR | Bag-of-words models |
4 | RR | |
5 | RR | |
6 | RR | Information Retrieval |
7 | RR | First Test & Presentation about the project |
8 | RR | |
9 | | |
10 (May 2nd) | RR | |
11 (May 9th) | RR | Word Embeddings; Sentiment Analysis |
12 | RR | |
13 | RR | Sequence modelling (Deep Learning overview) |
14 | RR | Second Test |
Programs
Programs where the course is taught: