An Algorithmic Scheme for Statistical Thesaurus Construction in a Morphologically Rich Language

Liebeskind, Chaya and Dagan, Ido and Schler, Jonathan (2019) An Algorithmic Scheme for Statistical Thesaurus Construction in a Morphologically Rich Language. Applied Artificial Intelligence, 33 (6). pp. 483-496. ISSN 0883-9514

An Algorithmic Scheme for Statistical Thesaurus Construction in a Morphologically Rich Language.pdf - Published Version


Abstract

Corpus-based automatic thesaurus construction typically relies on linguistic tools, such as part-of-speech taggers and parsers, which often perform poorly on morphologically rich languages (MRLs). In this paper, we therefore focused on the complex task of adapting corpus-based thesaurus construction methods to MRLs. We investigated two statistical approaches to thesaurus construction: (a) a first-order, co-occurrence-based approach and (b) a second-order, distributional approach. We explored alternative levels of morphological term representation, complemented by grouping of morphological variants. We then introduced and adopted a generic algorithmic scheme for thesaurus construction in MRLs that covers both the first-order and second-order approaches. The scheme supports alternative representation levels and alternative configurations. We demonstrated the empirical benefits of our methodology for the construction of a diachronic Hebrew thesaurus. We used morphological analysis tools, defined and applied a new annotation scheme, and showed that the scheme's optimal configuration outperforms the baseline for both first-order and second-order corpus-based thesaurus construction approaches.
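
The distinction between the two approaches can be illustrated with a minimal sketch. The abstract does not say which similarity measures the paper uses, so the Dice coefficient (first-order) and cosine similarity over co-occurrence vectors (second-order) below are stand-ins chosen for illustration. The toy corpus, the LEMMAS map (a hypothetical substitute for a real morphological analyzer), and all function names are likewise assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Toy corpus: each inner list is one tokenized sentence (hypothetical data).
CORPUS = [
    ["kings", "rule", "lands"],
    ["king", "rules", "land"],
    ["queen", "rules", "land"],
    ["queens", "rule", "lands"],
]

# Hypothetical surface-form -> lemma map standing in for a morphological analyzer.
LEMMAS = {"kings": "king", "queens": "queen", "rules": "rule", "lands": "land"}


def normalize(sentence, use_lemmas):
    """Return tokens at the chosen representation level (surface form or lemma)."""
    return [LEMMAS.get(tok, tok) if use_lemmas else tok for tok in sentence]


def cooccurrence_counts(corpus, use_lemmas):
    """Count sentence-level term frequencies and unordered pair co-occurrences."""
    term_freq, pair_freq = Counter(), Counter()
    for sentence in corpus:
        terms = sorted(set(normalize(sentence, use_lemmas)))
        term_freq.update(terms)
        pair_freq.update(combinations(terms, 2))
    return term_freq, pair_freq


def first_order(a, b, term_freq, pair_freq):
    """Dice coefficient: how often a and b appear in the same sentence."""
    pair = tuple(sorted((a, b)))
    denom = term_freq[a] + term_freq[b]
    return 2 * pair_freq[pair] / denom if denom else 0.0


def context_vector(term, pair_freq):
    """Collect the co-occurrence counts of a term with every other term."""
    vec = Counter()
    for (x, y), n in pair_freq.items():
        if x == term:
            vec[y] += n
        elif y == term:
            vec[x] += n
    return vec


def second_order(a, b, pair_freq):
    """Cosine similarity between the context vectors of a and b."""
    va, vb = context_vector(a, pair_freq), context_vector(b, pair_freq)
    dot = sum(va[k] * vb[k] for k in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    tf, pf = cooccurrence_counts(CORPUS, use_lemmas=True)
    print("first-order  king/queen:", first_order("king", "queen", tf, pf))
    print("second-order king/queen:", second_order("king", "queen", pf))
```

In this toy data, "king" and "queen" never share a sentence, so a first-order measure gives them no association, while their matching contexts make them similar under the second-order measure. Calling cooccurrence_counts with use_lemmas=False keeps "kings" and "king" as separate terms, which shows in miniature why the choice of representation level and the grouping of morphological variants matter in an MRL.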

Item Type: Article
Subjects: STM Academic > Computer Science
Depositing User: Unnamed user with email support@stmacademic.com
Date Deposited: 21 Jun 2023 10:28
Last Modified: 18 Nov 2023 05:49
URI: http://article.researchpromo.com/id/eprint/1114
