Legal NERC with ontologies, Wikipedia and curriculum learning

Cristian Cardellino; Milagro Teruel; Laura Alonso Alemany; Serena Villata

doi:10.18653/v1/E17-2041

Conference paper, Year: 2017


Cristian Cardellino

Role: Author
PersonId: 1323423
IdHAL: ccardellino
ORCID: 0009-0000-1129-8330

Universidad Nacional de Córdoba [Argentina]

Milagro Teruel

Role: Author

Universidad Nacional de Córdoba [Argentina]

Laura Alonso Alemany

Role: Author

Universidad Nacional de Córdoba [Argentina]

Serena Villata

Role: Author
PersonId: 9409
IdHAL: serena-villata
ORCID: 0000-0003-3495-493X
IdRef: 200242911

Web-Instrumented Man-Machine Interactions, Communities and Semantics

Abstract

In this paper, we present a Wikipedia-based approach to developing resources for the legal domain. We establish a mapping between a legal domain ontology, LKIF (Hoekstra et al., 2007), and a Wikipedia-based ontology, YAGO (Suchanek et al., 2007), and use that mapping to populate LKIF. Moreover, we use the mentions of those entities in Wikipedia text to train a specific Named Entity Recognizer and Classifier. We find that this classifier works well on Wikipedia but, as could be expected, performance decreases on a corpus of judgments of the European Court of Human Rights. However, this tool will be used as a preprocessing step for human annotation. We resort to a technique called curriculum learning, which aims to overcome problems of overfitting by learning increasingly complex concepts. However, we find that in this particular setting, the method works best by learning from the most specific to the most general concepts, not the other way round.
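The ordering question raised at the end of the abstract can be illustrated with a minimal sketch. It assumes that every training mention carries one label per granularity level, listed from the most general to the most specific. The function name `curriculum_stages`, the data format, and all label names are hypothetical and are not taken from the paper. The sketch only builds the sequence of relabelled training sets; training a model on each stage in turn is left out.

```python
from typing import List, Sequence, Tuple

# Hypothetical data format: a mention paired with its labels, ordered from the
# most general level to the most specific one. All label names are made up.
LabelledMention = Tuple[str, Tuple[str, ...]]


def curriculum_stages(
    mentions: Sequence[LabelledMention], specific_first: bool = True
) -> List[List[Tuple[str, str]]]:
    """Return one relabelled copy of the data per granularity level.

    Stages run from the most specific level to the most general one when
    specific_first is True, and in the opposite order otherwise.
    """
    if not mentions:
        return []
    depth = len(mentions[0][1])
    levels = range(depth - 1, -1, -1) if specific_first else range(depth)
    return [[(text, labels[level]) for text, labels in mentions] for level in levels]


# Illustrative mentions only; these are not examples from the paper's corpus.
mentions = [
    ("European Court of Human Rights", ("entity", "organization", "court")),
    ("Article 6", ("entity", "document", "legal_text")),
]

stages = curriculum_stages(mentions, specific_first=True)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

Passing `specific_first=False` yields the general-to-specific order, so the two curricula can be compared on the same data.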

Domains

Computer Science

Main file

EACL2017.pdf (115.65 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01572444

Submitted on: Monday, 7 August 2017, 13:53:46

Last modified on: Monday, 26 February 2024, 11:22:08

Dates and versions

hal-01572444, version 1 (07-08-2017)

Identifiers

HAL Id: hal-01572444, version 1
DOI: 10.18653/v1/E17-2041

Cite

Cristian Cardellino, Milagro Teruel, Laura Alonso Alemany, Serena Villata. Legal NERC with ontologies, Wikipedia and curriculum learning. 15th European Chapter of the Association for Computational Linguistics (EACL 2017), 2017, Valencia, Spain. pp. 254-259, ⟨10.18653/v1/E17-2041⟩. ⟨hal-01572444⟩


Collections

CNRS INRIA I3S WIMMICS INRIA2 UNIV-COTEDAZUR

