
Machine Learning and Knowledge Extraction
Code
9229
Academic unit
Faculdade de Ciências e Tecnologia
Department
Departamento de Informática
Credits
6.0
Teacher in charge
Nuno Miguel Cavalheiro Marques, Pedro Manuel Corrêa Calvente Barahona
Weekly hours
2
Teaching language
Portuguese
Objectives
The course covers research topics including neural networks, namely (ubi)SOMs and deep learning; knowledge mining from text; multilingual access to information; detection of relevant parts of text and its applications; clustering and classification of documents; and pattern recognition in text.
Real-world applications of these methods are the primary concern, with emphasis on those arising from research carried out at UNL. Ongoing applications of these methods in industry will also be discussed.
Prerequisites
Prior knowledge of machine learning, data and text mining techniques, and statistical and computational methods for data management and text processing.
Introductory materials on these topics can be provided to interested students upon request.
Subject matter
A. Neural Networks
- Different types of Neural Networks: MLPs and SOMs
- Learning in Neural Networks
- Search-based methods and their implications
- Neuro-Symbolic methods for improving learning in Neural Networks
- Deep Learning and Learning Optimization
- Learning in Self-Organizing Maps
B. Knowledge Extraction from Text and Time Series
- Pattern Extraction
- Document Classification by Language or Topic
- Cause-effect relations from Time Series
- Modeling Complex Systems and Chaos
C. Sense Disambiguation and Machine Translation
- Word and Term Sense Disambiguation: Building Thesauri and Ontologies.
- Alignment; Extraction of translation equivalents.
D. Applications in Text and Data Mining
- Applications of Neuro-Symbolic Neural Networks for Text and Data Mining (example applications in Finance and NLP/Syntactic Disambiguation). Word2Vec Auto-encoder.
- Machine Translation
- Applications of Pattern Extraction and Text Classification
E. Open Topics in Text and Data Mining
Bibliography
Robert Dale, Hermann Moisl and Harold Somers (eds.). 2000. "Handbook of Natural Language Processing", Marcel Dekker, Inc., New York.
Simon Haykin (2008). Neural Networks: A Comprehensive Foundation (3rd Edition). Prentice Hall.
Christopher Manning and Hinrich Schütze. "Foundations of Statistical Natural Language Processing". MIT Press, 1999.
M. Nielsen. "Neural Networks and Deep Learning". http://neuralnetworksanddeeplearning.com/ . September 2017.
Michael W. Berry (ed.). "Survey of Text Mining: Clustering, Classification and Retrieval". Springer, 2008.
More specific articles, surveying the state of the art or supporting the presentation of each problem, will also be used.
Evaluation method
Students will be evaluated by means of:
a) A written test, in which students must show their knowledge and understanding of the main concepts (50%);
b) An individual final project, in which students demonstrate their mastery of state-of-the-art techniques applied to a theoretical and/or practical problem of reasonable size, including a final report, an oral presentation, and participation in the course workshop (50%).