
An Overview of Language Models: Recent Developments and Outlook

Chengwei Wei, University of Southern California, USA (chengwei@usc.edu); Yun-Cheng Wang, University of Southern California, USA; Bin Wang, National University of Singapore, Singapore; C.-C. Jay Kuo, University of Southern California, USA
Suggested Citation
Chengwei Wei, Yun-Cheng Wang, Bin Wang and C.-C. Jay Kuo (2024), "An Overview of Language Models: Recent Developments and Outlook", APSIPA Transactions on Signal and Information Processing: Vol. 13: No. 2, e101. http://dx.doi.org/10.1561/116.00000010

Publication Date: 12 Feb 2024
© 2024 C. Wei, Y.-C. Wang, B. Wang and C.-C. J. Kuo
 
Keywords
Language model; Natural language processing; Pre-trained language model; Conventional language model


Open Access

This article is published under the terms of CC BY-NC.


In this article:
Introduction 
Types of Language Models 
Linguistic Units 
Architecture of Language Models 
Pre-trained Language Models 
Model Evaluation 
Language Models in Text Generation 
Efficient Models 
Future Research Directions 
Conclusion 
References 

Abstract

Language modeling studies probability distributions over strings of text. It is one of the most fundamental tasks in natural language processing (NLP) and has been widely used in text generation, speech recognition, machine translation, and other applications. Conventional language models (CLMs) predict the probability of a linguistic sequence in a causal manner, while pre-trained language models (PLMs) cover broader concepts and can be used both for causal sequential modeling and for fine-tuning on downstream applications. PLMs have their own training paradigms (usually self-supervised) and serve as foundation models in modern NLP systems. This overview paper introduces both CLMs and PLMs from five aspects: linguistic units, architectures, training methods, evaluation methods, and applications. Furthermore, we discuss the relationship between CLMs and PLMs and shed light on future directions for language modeling in the pre-trained era.
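As a concrete illustration of the causal formulation mentioned above, a CLM factorizes the probability of a token sequence with the chain rule. The sketch below writes this in standard notation (the symbols w_1, ..., w_n for tokens are our own shorthand, not taken from the paper):

% Chain-rule factorization estimated by a causal language model:
% the probability of a token sequence is decomposed into per-token
% conditional probabilities given the preceding context.
\[
  P(w_1, w_2, \ldots, w_n) \;=\; \prod_{i=1}^{n} P\bigl(w_i \mid w_1, \ldots, w_{i-1}\bigr)
\]

For example, P("the cat sat") = P("the") · P("cat" | "the") · P("sat" | "the cat"); each conditional factor is what a CLM is trained to estimate.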

DOI: 10.1561/116.00000010

Companion

APSIPA Transactions on Signal and Information Processing Special Issue - Pre-trained Large Language Models for Information Processing