APSIPA Transactions on Signal and Information Processing, Vol. 13, Issue 1

Unsupervised Green Object Tracker (GOT) without Offline Pre-training

Zhiruo Zhou, University of Southern California, USA, zhiruozh@usc.edu; Suya You, DEVCOM Army Research Laboratory, USA; C.-C. Jay Kuo, University of Southern California, USA
 
Suggested Citation
Zhiruo Zhou, Suya You and C.-C. Jay Kuo (2024), "Unsupervised Green Object Tracker (GOT) without Offline Pre-training", APSIPA Transactions on Signal and Information Processing: Vol. 13: No. 1, e20. http://dx.doi.org/10.1561/116.20240022

Publication Date: 16 Sep 2024
© 2024 Z. Zhou, S. You and C.-C. J. Kuo
 
Subjects
Tracking,  Motion estimation and registration,  Segmentation and grouping,  Object and scene recognition,  Classification and prediction,  Online learning
 
Keywords
Object tracking, online tracking, single object tracking, unsupervised tracking
 

Open Access

This is published under the terms of CC BY-NC.

In this article:
Introduction 
Related Work 
Green Object Tracker (GOT) 
Experiments 
Discussion on GOT’s Limitations 
Conclusion and Future Work 
References 
Model Size and Complexity Analysis of GOT 
Long-term Tracking Capability of GOT 

Abstract

Supervised trackers trained on labeled data dominate the single object tracking field because of their superior tracking accuracy. However, their labeling cost and high computational complexity hinder their application on edge devices. Unsupervised learning methods have been investigated to reduce the labeling cost, but their complexity remains high, and both families of methods require large-scale offline training. Aiming at lightweight, high-performance tracking that is feasible without offline pre-training and is algorithmically transparent, we propose a new single object tracking method called the green object tracker (GOT). GOT uses an ensemble of three prediction branches for robust box tracking: 1) a global object-based correlator that predicts the object location roughly, 2) a local patch-based correlator that builds temporal correlations of small spatial units, and 3) a superpixel-based segmentator that exploits the spatial information of the target frame. GOT offers tracking accuracy competitive with that of state-of-the-art unsupervised trackers, which demand heavy offline pre-training, at a lower computation cost. GOT has a tiny model size (<3k parameters) and low inference complexity (around 58M FLOPs per frame), which is between 0.1% and 10% of the inference complexity of deep learning (DL) trackers.
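As a rough reading of the complexity claim, the short sketch below works out the per-frame cost of the DL trackers that the stated percentages imply. It assumes the 0.1% to 10% range is a ratio of per-frame inference FLOPs; the function name `implied_dl_flops` is hypothetical and is not taken from the paper.

```python
def implied_dl_flops(got_flops: float, ratio: float) -> float:
    """Per-frame FLOPs of a tracker for which `got_flops` is `ratio` of its cost."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return got_flops / ratio


# About 58M FLOPs per frame, as stated in the abstract.
GOT_FLOPS = 58e6

# Assumption: the percentages compare per-frame inference FLOPs.
low = implied_dl_flops(GOT_FLOPS, 0.10)    # GOT costs 10% of the DL tracker
high = implied_dl_flops(GOT_FLOPS, 0.001)  # GOT costs 0.1% of the DL tracker

print(f"Implied DL tracker cost: {low / 1e9:.2f} to {high / 1e9:.2f} GFLOPs per frame")
```

Under that assumption, the DL trackers used for comparison would need roughly 0.58 to 58 GFLOPs per frame.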

DOI:10.1561/116.20240022