Group research
Human-Centric AI
Topic tags
Annotation platform, Datasets, Explainable AI, Information extraction, Knowledge graphs, Multilingual systems, Qualitative evaluation
At a Glance
- Method: Fact Extractor
- Research Field: Machine learning, natural language processing
- Focus: Free up domain experts' time and highlight previously unknown relationships within extracted text
- Use Cases: Drug development, carbon emission reports, materials informatics
- Conferences: AAAI 2021, ACL 2022, EMNLP 2023
- Related Methods:
AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark (Friedrich et al.), ACL 2022
BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation (Gashteovski et al.), ACL 2022
milIE: Modular & Iterative Multilingual Open Information Extraction (Kotnis et al.), ACL 2022
Fact-Linking: Linking Surface Facts to Large-Scale Knowledge Graphs (Radevski et al.), EMNLP 2023
Gradient Rollback: Explaining Neural Matrix Factorization with Gradient Rollback (Lawrence et al.), AAAI 2021
Challenge
Information is being generated faster than humans can process it. This impedes our ability to acquire knowledge, which is slowing down scientific discovery and the advancement of technology. How can humans absorb information at the pace it is being created?
Hypothesis
Develop a scientific information extraction framework that enables computers to read text and arrange its content as structured knowledge. This framework can then be applied to achieve cross-document understanding, which can, for example, enhance scientific discovery.
Methodology
To enable cross-document understanding, three main goals must be achieved:
- Goal 1: Extract facts from text
- Goal 2: Connect facts from different sources in a meaningful way
- Goal 3: Generate new insights from the connected facts
Extracting facts from text
NEC Laboratories Europe has completed the first goal on the way to cross-document understanding by developing NEC Fact Extractor, which extracts facts from text in any language (Goal 1).
For each sentence, Fact Extractor extracts facts in the form of triples. Each triple consists of two entities (often subject and object), a relation (often a verb phrase), and, optionally, an argument (often location or time). This information is then used to identify each fact and connect it with other facts (Goal 2).
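The triple structure described above can be pictured with a small data structure. The following is a minimal sketch, not NEC Fact Extractor's actual output format or API: the names `Fact`, `index_by_entity`, and the `source` field are hypothetical, the example facts are hand-written, and matching entities by lower-cased string is a deliberately naive stand-in for real fact linking.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One extracted fact: two entities, a relation, and an optional argument."""
    subject: str                    # first entity (often the grammatical subject)
    relation: str                   # often a verb phrase
    obj: str                        # second entity (often the grammatical object)
    argument: Optional[str] = None  # optional slot, often a location or time
    source: Optional[str] = None    # hypothetical field: document the fact came from


def index_by_entity(facts):
    """Map each lower-cased entity string to the facts that mention it."""
    index = defaultdict(list)
    for fact in facts:
        # A set avoids listing a fact twice when subject and object coincide.
        for entity in {fact.subject.lower(), fact.obj.lower()}:
            index[entity].append(fact)
    return dict(index)


# Hand-written example facts (illustrative, not real Fact Extractor output).
facts = [
    Fact("Marie Curie", "discovered", "polonium", argument="in 1898", source="doc_a"),
    Fact("Polonium", "is", "a chemical element", source="doc_b"),
]

index = index_by_entity(facts)
# Entities mentioned by more than one fact are candidate connection points.
shared = sorted(entity for entity, hits in index.items() if len(hits) > 1)
print(shared)  # prints ['polonium']
```

Here the two facts come from different (hypothetical) documents and meet at the shared entity "polonium", which illustrates in the simplest possible way how slots of a triple can serve as connection points between sources.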
Explicit extractions (all slots of the triple are extracted from the sentence):
NEC Laboratories Europe recently published the paper Fact-Linking: Linking Surface Facts to Large-Scale Knowledge Graphs (Gorjan Radevski et al.), which describes how extracted facts can be linked in a meaningful way and how new text-based insights can be generated from them.
Once a knowledge graph is created, we use Gradient Rollback to derive explainable new insights (Goal 3). You can learn more about this method in the paper, Explaining Neural Matrix Factorization with Gradient Rollback (Carolin Lawrence et al.).