Abstract
We tackle the problem of identifying metaphors in text, which we treat as a sequence
tagging task. The pre-trained word embeddings GloVe, ELMo and BERT have
individually shown good performance on sequential metaphor identification.
These embeddings are produced by different model architectures, training objectives
and corpora, and thus encode different semantic and syntactic information. We show
that combining GloVe, ELMo and feature-based BERT via a multi-channel CNN
and a Bidirectional LSTM model significantly outperforms any single word
embedding as well as any combination of two of these embeddings. Incorporating
linguistic features into our model further improves its performance,
yielding state-of-the-art results on three public metaphor datasets. We
also provide an in-depth analysis of the effectiveness of leveraging multiple word
embeddings, examining the spatial distribution of each embedding
method for metaphorical and literal words, and showing how well the embeddings
complement each other across genres and parts of speech.