By Lin Zhou, Beihang University, China, lzhou@buaa.edu.cn | Alfred O. Hero, University of Michigan, USA, hero@eecs.umich.edu
Twenty Questions originated as a parlor game between two players. The game starts from a player named an oracle, who privately thinks of a secret. The other player, called the questioner, tries to guess the secret by querying the oracle with at most twenty questions having Yes/No answers. Early versions of the game can be traced to ancient Greece and ancient Rome. Motivated by the Hungarian version of this game, in the middle of the twentieth century, Rényi formulated the game as a mathematical problem of guessing an integer from a finite set, where the oracle could lie either randomly to each question or lie to a finite number of questions. The mathematical study of Twenty Questions is motivated by current applications in many domains: communications; machine learning; and computer vision.
The game with an oracle who is allowed a fixed number of lies was also studied by Ulam and Berlekamp and is known as the Rényi-Ulam-Berlekamp game. In contrast, the setting where the oracle lies randomly is less understood. In this monograph, we summarize recent advances in the information theoretical analysis of Twenty Questions with random error. In particular, focusing on the practical application of sensor network target localization, we study a query-dependent channel to model oracle’s noisy response behavior, such as providing a wrong answer or declining to answer a question. We concentrate on non-adaptive query procedures where all questions are designed prior to posing questions. We cover settings relevant to estimating a single target, a single moving target, and multiple targets over the unit cube of a finite dimension. We also consider adaptive querying for a single target to illustrate the benefit of adaptivity. In adaptive querying, each question is designed sequentially using responses to all previous questions. All of our theoretical results are illustrated using numerical examples.
Finally, we discuss future research directions. These include geometry constraints for query sets, low-complexity query procedures, connections to group testing, and practical applications in machine learning and communications.
This monograph provides a self-contained review of information theoretical benchmarks for a statistical learning problem named Twenty Questions with random error. The problem finds diverse applications across domains of communications, signal processing and computer science, e.g., beam alignment in mmWave multiple antenna communication, target localization using sensor networks, and face localization in images and videos. The problem can also find potential applications in other domains where parameter estimation is required in a query-and-answer manner.
The problem of Twenty Questions with random error originated from a parlor game between two players. The game starts from a player named an oracle, who privately thinks of a secret. The other player, called the questioner, tries to guess the secret by querying the oracle with at most twenty questions having Yes/No answers. The mathematical formulation of the problem was pioneered by Alfred Rényi as a parameter estimation problem, who assumed that oracle could lie randomly to each question and the number of questions can be more than 20. This monograph concentrates on non-adaptive query procedures where all questions are designed prior to posing questions and covers settings relevant to estimating a single target, a single moving target, and multiple targets over the unit cube of a finite dimension. For each case, theoretical benchmarks of optimal query procedures are presented and illustrated via numerical examples. This monograph also considers adaptive querying for a single target to illustrate the benefit of adaptivity. In adaptive querying, each question is designed sequentially using responses to all previous questions. One of the particular features of this monograph is the presentation of second-order asymptotic techniques that provide tighter convergence guarantees for the Twenty Questions searching and tracking problems considered here. These guarantees provide insights into the factors affecting the convergence rates depending on the problem setting and model parameters.
This monograph is suitable for researchers and graduate students who are interested in statistical learning, information theory, communications, signal processing and computer science. In particular, the ideas covered in this monograph demonstrate the application of information theory to a statistical learning problem, with applications in communications, signal processing and computer science. Therefore, people interested in statistical learning and information theory could benefit from knowing how statistical learning problems can be solved via information theoretical tools, and people interested in communications, signal processing and computer science can learn about potential algorithms for practical applications.