IE seminar - Tutorials
Institute of Computer Science, Brandenburgische Technische Universität Cottbus-Senftenberg
Juan-Francisco Reyes
pacoreyes@protonmail.com
Information extraction (IE) is the task of automatically extracting information from natural language text. One common use case of IE is knowledge base (KB) population, an approach to building artificial intelligence systems. Knowledge bases allow people and computers to process interlinked representations of real-world concepts, entities, relationships, and events efficiently and unambiguously. This hands-on seminar investigates concepts and techniques for extracting specific information from political discourse text crawled from the web. During the seminar, participants will explore the fundamental NLP challenges of extracting knowledge from raw text, such as text classification, passage extraction, named-entity recognition, and relation extraction. Participants will gain experience using spaCy, BERT, and SetFit models, creating datasets, fine-tuning large language models (LLMs), and populating a knowledge graph using Neo4j AuraDB, a popular cloud-hosted graph database.