By Matthew Aldridge, University of Leeds, UK, m.aldridge@leeds.ac.uk | Oliver Johnson, University of Bristol, UK, O.Johnson@bristol.ac.uk | Jonathan Scarlett, National University of Singapore, Singapore, scarlett@comp.nus.edu.sg
The group testing problem concerns discovering a small number of defective items within a large population by performing tests on pools of items. A test is positive if the pool contains at least one defective, and negative if it contains no defectives. This is a sparse inference problem with a combinatorial flavour, and it has applications in medical testing, biology, telecommunications, information technology, data science, and more. In this monograph, we survey recent developments in the group testing problem from an information-theoretic perspective. We cover several related developments: efficient algorithms with practical storage and computation requirements, achievability bounds for optimal decoding methods, and algorithm-independent converse bounds. We assess the theoretical guarantees not only in terms of scaling laws, but also in terms of the constant factors, leading to the notion of the rate of group testing, which indicates the amount of information learned per test. Considering both noiseless and noisy settings, we identify several regimes where existing algorithms are provably optimal or near-optimal, as well as regimes where there remains greater potential for improvement. In addition, we survey results concerning a number of variations on the standard group testing problem, including partial recovery criteria, adaptive algorithms with a limited number of stages, constrained test designs, and sublinear-time algorithms.
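As a concrete illustration of the test model described above, the following minimal Python sketch simulates a single noiseless pooled test, and computes a rate in the sense of bits of information learned per test. The problem sizes are hypothetical, and the counting formula log2(C(n, k))/T is our illustrative reading of "information learned per test", not a formula quoted from this page.

```python
from math import comb, log2

def test_pool(pool, defective_set):
    """Noiseless model: a test is positive iff the pool contains a defective."""
    return any(item in defective_set for item in pool)

# Hypothetical example: items 0..9, with items 3 and 7 defective.
defectives = {3, 7}
print(test_pool([0, 1, 2], defectives))  # False: pool contains no defectives
print(test_pool([2, 3, 4], defectives))  # True: item 3 is defective

# One natural measure of rate: with n items, k defectives, and T tests,
# the number of bits needed to specify the defective set is log2(C(n, k)),
# so the information learned per test is at most log2(C(n, k)) / T.
n, k, T = 500, 10, 200  # hypothetical problem sizes
print(log2(comb(n, k)) / T)  # bits per test
```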
Group testing emerged as a research area from the US Government's need to screen recruits for syphilis during the Second World War. Rather than testing each recruit individually, a more efficient method requiring as few tests as possible was sought. The central problem of group testing is thus: given a number of items and a number of defectives, how many tests are required to accurately identify the defective items, and how can this be achieved? Group testing has since found applications in medical testing, biology, telecommunications, information technology, data science, and more.
The focus of this survey is on the non-adaptive setting, in which the test pools are designed in advance, enabling the tests to be implemented in parallel. The survey gives a comprehensive and thorough treatment of the subject from an information-theoretic perspective, developing the themes summarised above: efficient algorithms with practical storage and computation requirements, achievability and algorithm-independent converse bounds, the rate of group testing, and the noiseless and noisy regimes in which existing algorithms are provably optimal, near-optimal, or open to further improvement.
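To make the non-adaptive setting concrete, here is a minimal sketch under assumptions of our own choosing: a Bernoulli random test design, in which each item joins each pool independently with some probability, decoded by the simple COMP rule, which declares non-defective every item that appears in a negative test. The design, decoder, and problem sizes are illustrative choices, not prescriptions from the survey.

```python
import random

def bernoulli_design(num_tests, num_items, p):
    """Each item joins each test pool independently with probability p."""
    return [[random.random() < p for _ in range(num_items)]
            for _ in range(num_tests)]

def run_tests(design, defective_set):
    """Noiseless model: a test is positive iff its pool contains a defective."""
    return [any(row[i] for i in defective_set) for row in design]

def comp_decode(design, outcomes):
    """COMP: any item appearing in a negative test is certainly non-defective."""
    num_items = len(design[0])
    candidates = set(range(num_items))
    for row, positive in zip(design, outcomes):
        if not positive:
            candidates -= {i for i in range(num_items) if row[i]}
    return candidates

# Hypothetical example: 100 items, 3 defectives, 30 parallel tests.
random.seed(0)
defectives = {4, 42, 77}
design = bernoulli_design(30, 100, p=0.23)
estimate = comp_decode(design, run_tests(design, defectives))
print(estimate >= defectives)  # True: COMP output always contains the defectives
```

In the noiseless model COMP never discards a true defective, so its output is always a superset of the defective set; adding more tests shrinks the number of false positives.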
This monograph is an accessible treatment of an important topic for researchers and students in Information Theory.