A corpus-based approach to discovering semantic relationships between named entities
Keywords:
corpus linguistics, discourse analysis, named entities, linguistic corpus, computational linguistics
Abstract
Introduction: This study analyzes a news text about cultural identity, drawn from a labeled linguistic corpus,
in order to annotate the syntactic and semantic relationships between the named entities in the text.
Materials and methods: Using grammatical tagging and syntactic analysis, the study presents a classification of the
semantic relationships between the named entities and shows how they are represented in a labeled XML format. In total,
20 named entities, 13 grammatical relationships, and 36 semantic relationships were tagged. Results: The proposal
proves useful for developing and evaluating new open information extraction systems in Spanish. Discussion:
Linguistic corpora, corpus linguistics, and computational linguistics are valuable tools in machine learning
for natural language understanding. Analyzing the syntactic and semantic relationships between named entities in
a news text is crucial for extracting relevant information and identifying linguistic patterns. Conclusion: This study
highlights the relevance of labeled linguistic corpora and corpus linguistics to the analysis of natural language and to the
development of natural language processing systems capable of understanding and analyzing human language
in different contexts.
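The abstract mentions that named entities and their relationships are recorded in a labeled XML format but does not reproduce the schema. The sketch below is only an illustration of how such an annotation might be read into (entity, relation, entity) triples. The tag names (`sentence`, `entity`, `relation`), the attribute names, the relation label, and the sample sentence are all hypothetical and are not taken from the study's corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation. Tag names, attributes, and content are invented
# for illustration; they are NOT the schema or data used in the study.
SAMPLE = """
<sentence id="s1">
  <entity id="e1" type="PERSON">Ana Ruiz</entity>
  <entity id="e2" type="ORGANIZATION">Museo Ejemplo</entity>
  <relation type="works_at" source="e1" target="e2"/>
</sentence>
"""


def extract_relations(xml_text):
    """Return (source text, relation type, target text) triples."""
    root = ET.fromstring(xml_text.strip())
    # Map each entity id to its surface text so relations can be resolved.
    entities = {e.get("id"): e.text for e in root.iter("entity")}
    triples = []
    for rel in root.iter("relation"):
        source = entities[rel.get("source")]
        target = entities[rel.get("target")]
        triples.append((source, rel.get("type"), target))
    return triples


if __name__ == "__main__":
    for triple in extract_relations(SAMPLE):
        print(triple)
```

Storing relations as separate elements that point to entity ids, rather than nesting them inside entities, is one possible design choice that lets a single entity take part in several relations.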
License
Copyright (c) 2023 Reynier Ávila Peña, Celia María Pérez Marqués
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This journal provides immediate open access to its content, on the principle that giving the public free access to research supports a greater global exchange of knowledge. Each author is responsible for the content of their own articles.