APSIPA Transactions on Signal and Information Processing > Vol 8 > Issue 1

Evaluating word embedding models: methods and experimental results

Bin Wang, University of Southern California, USA, bwang28c@gmail.com; Angela Wang, University of California, USA; Fenxiao Chen, University of Southern California, USA; Yuncheng Wang, University of Southern California, USA; C.-C. Jay Kuo, University of Southern California, USA
 
Suggested Citation
Bin Wang, Angela Wang, Fenxiao Chen, Yuncheng Wang and C.-C. Jay Kuo (2019), "Evaluating word embedding models: methods and experimental results", APSIPA Transactions on Signal and Information Processing: Vol. 8: No. 1, e19. http://dx.doi.org/10.1017/ATSIP.2019.12

Publication Date: 08 Jul 2019
© 2019 Bin Wang, Angela Wang, Fenxiao Chen, Yuncheng Wang and C.-C. Jay Kuo
 
Keywords
Natural language processing, Word embedding, Word embedding evaluation
 

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:
I. INTRODUCTION 
II. WORD EMBEDDING MODELS 
III. DESIRED PROPERTIES OF EMBEDDING MODELS AND EVALUATORS 
IV. INTRINSIC EVALUATORS 
V. EXPERIMENTAL RESULTS OF INTRINSIC EVALUATORS 
VI. EXTRINSIC EVALUATORS 
VII. EXPERIMENTAL RESULTS OF EXTRINSIC EVALUATORS 
VIII. CONSISTENCY STUDY VIA CORRELATION ANALYSIS 
IX. CONCLUSION AND FUTURE WORK 

Abstract

This work presents an extensive evaluation of a large number of word embedding models for language processing applications. First, we introduce popular word embedding models and discuss desired properties of word models and evaluation methods (or evaluators). Then, we categorize evaluators into two types: intrinsic and extrinsic. Intrinsic evaluators test the quality of a representation independently of specific natural language processing tasks, while extrinsic evaluators use word embeddings as input features to a downstream task and measure changes in performance metrics specific to that task. We report experimental results of intrinsic and extrinsic evaluators on six word embedding models. The results show that different evaluators focus on different aspects of word models, and that some are more strongly correlated with natural language processing tasks than others. Finally, we adopt correlation analysis to study the performance consistency of extrinsic and intrinsic evaluators.
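
To make the idea of an intrinsic evaluator concrete, the following is a minimal sketch of one common kind, a word similarity test. It assumes that model similarity is the cosine between two word vectors and that agreement with human judgments is measured by Spearman rank correlation. The toy vectors, the human scores, and all function names are hypothetical and chosen only for illustration; they are not taken from the article and do not reproduce its experimental setup.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def word_similarity_score(embeddings, scored_pairs):
    """Correlate model similarities with human similarity scores."""
    model = [cosine(embeddings[a], embeddings[b]) for a, b, _ in scored_pairs]
    human = [score for _, _, score in scored_pairs]
    return spearman(model, human)


# Hypothetical 3-dimensional embeddings (illustration only).
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.0, 0.8, 0.3],
}

# Hypothetical human similarity judgments on a 0-10 scale.
toy_human_scores = [
    ("cat", "dog", 8.5),
    ("car", "truck", 8.0),
    ("cat", "car", 2.0),
    ("dog", "truck", 2.5),
]

if __name__ == "__main__":
    score = word_similarity_score(toy_embeddings, toy_human_scores)
    print(f"Spearman correlation with human scores: {score:.3f}")
```

The same `spearman` helper could, under the same assumptions, be applied to two lists of per-model scores (one from an intrinsic evaluator, one from a downstream task) to sketch the kind of consistency analysis the abstract mentions.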

DOI:10.1017/ATSIP.2019.12