Abstract
We tackle the problem of identifying metaphors in text, which we treat as a sequence
tagging task. The pre-trained word embeddings GloVe, ELMo and BERT have
individually shown good performance on sequential metaphor identification.
These embeddings are produced by different model architectures, training objectives
and corpora, and thus encode different semantic and syntactic information. We show
that combining GloVe, ELMo and feature-based BERT via a multi-channel CNN
and a Bidirectional LSTM model significantly outperforms any single word
embedding as well as any combination of two of these embeddings. Incorporating
linguistic features into our model further improves its performance,
yielding state-of-the-art results on three public metaphor datasets. We
also provide an in-depth analysis of the effectiveness of leveraging multiple word
embeddings, examining the spatial distribution of each embedding
method for metaphorical and literal words, and showing how well the embeddings
complement each other across genres and parts of speech.