The sense of hearing is fundamental to human beings because it allows them to perceive their surroundings. Yet recognizing different sounds in complex environments, a task humans perform with apparent ease, remains a challenge for machines. Sound event detection (SED) is a field that aims to automate the detection and recognition of sound events, along with their onset and offset points, that the human auditory system performs naturally. Training an SED system typically requires a large labeled dataset, but producing such labels is costly and depends on the subjective judgments of annotators. Significant efforts have therefore been made in this area, including the DCASE challenge series, which brings researchers together annually to address this issue. The DCASE challenge began in 2013 and has since seen significant breakthroughs in SED. In this study, we review the methods proposed by participants in the DCASE challenge series, with a thorough discussion of feature extraction, machine learning techniques, and post-processing methods. We also examine the results of the top teams in each edition of the challenge to highlight the best-performing SED systems and explore potential future research directions.