By Raghuraman Gopalan, AT&T Labs-Research, USA, raghuram@research.att.com | Ruonan Li, Harvard University, USA, ruonanli@seas.harvard.edu | Vishal M. Patel, University of Maryland, College Park, USA, pvishalm@umd.edu | Rama Chellappa, University of Maryland, College Park, USA, rama@umiacs.umd.edu
Domain adaptation is an active, emerging research area that attempts to address the changes in data distribution across training and testing datasets. With the availability of a multitude of image acquisition sensors, and variations due to illumination and viewpoint, among others, computer vision applications present a very natural test bed for evaluating domain adaptation methods. In this monograph, we provide a comprehensive overview of domain adaptation solutions for visual recognition problems. Starting with the problem description and illustrations, we discuss three adaptation scenarios: (i) unsupervised adaptation, where the “source domain” training data is partially labeled and the “target domain” test data is unlabeled; (ii) semi-supervised adaptation, where the target domain also has partial labels; and (iii) multi-domain, heterogeneous adaptation, which studies the previous two settings with the source and/or target having more than one domain, and accounts for cases where the features used to represent the data in each domain differ. For all of these scenarios, we discuss existing adaptation techniques in the literature, motivated by the principles of max-margin discriminative learning, manifold learning, sparse coding, and low-rank representations. These techniques have shown improved performance on a variety of applications such as object recognition, face recognition, activity analysis, concept classification, and person detection. We conclude by analyzing the challenges posed by the realm of “big visual data”, in terms of the generalization ability of adaptation algorithms to unconstrained data acquisition as well as issues related to their computational tractability, and draw parallels with efforts from the vision community on image transformation models and invariant descriptors, so as to facilitate improved understanding of vision problems under uncertainty.
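As a minimal illustration of the unsupervised setting described above (a labeled source domain and an unlabeled target domain), the sketch below aligns second-order feature statistics of the source to the target in the spirit of correlation alignment. It is not drawn from the monograph itself; the function names and toy data are assumptions made purely for illustration.

```python
import numpy as np

def coral_align(Xs, Xt, eps=1e-6):
    """Map source features Xs so their covariance matches that of the target Xt
    (whitening followed by re-coloring); rows are samples."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])

    def sym_power(C, p):
        # power of a symmetric positive-definite matrix via eigendecomposition
        w, V = np.linalg.eigh(C)
        return (V * np.clip(w, eps, None) ** p) @ V.T

    # whiten with the source covariance, then re-color with the target covariance
    return Xs @ sym_power(Cs, -0.5) @ sym_power(Ct, 0.5)

# Toy data: labeled source, unlabeled target with a shifted distribution.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(200, 10))               # source features
ys = (Xs[:, 0] > 0).astype(int)               # source labels
Xt = 2.0 * rng.normal(size=(150, 10)) + 1.0   # target features, no labels

Xs_aligned = coral_align(Xs, Xt)
# Any standard classifier trained on (Xs_aligned, ys) is then applied to Xt.
```

The semi-supervised scenario differs only in that a few labeled target samples would additionally be available when training the classifier.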