University of Surrey

Athlete pose estimation from single-view TV broadcast footage.

Fastovets, Mykyta (2017) Athlete pose estimation from single-view TV broadcast footage. Doctoral thesis, University of Surrey.

thesis.pdf - Version of Record
Available under License Creative Commons Attribution Non-commercial Share Alike.


This thesis presents work on athlete pose estimation in single-view broadcast videos. Human pose estimation is an important problem in computer vision and has received much interest in the research community due to its wide range of applications. This thesis presents a novel framework for the semi-automatic estimation of human pose in television-quality sports footage. The focus is on achieving accurate pose estimation results on sports video sequences, with the assistance of a human operator in a broadcast studio setting, that can be used to drive post-action analysis and graphical overlays. A method for extracting and tracking off-the-shelf scale-invariant features on athletes is tested. Evaluation shows that such features are ill-suited for tracking articulated motion due to drift, data association problems, and a general lack of stable features to track. A keyframe-driven approach, inspired by the Pictorial Structures model, is developed for estimating the 2D pose of athletes in sports sequences. This approach models the human body as a tree of loosely linked parts and introduces a temporal smoothness term aimed at ensuring temporal consistency of pose throughout the sequence. The evaluation demonstrates that such an approach is able to extract human pose in such videos, but requires a significant amount of manual interaction to do so with the accuracy required for broadcast settings. A novel non-sequential method for maximising the benefit from manually annotated keyframe poses using minimum spanning trees is developed. The developed algorithm serves two purposes: keyframe selection and keyframe information propagation. Optimal keyframes are automatically selected and suggested to the operator for labelling. Once labelled, information from these keyframes is propagated throughout the sequence, and automatically generated keyframes are created in visually similar frames. 
Qualitative and quantitative evaluation demonstrates an increase in accuracy and a decrease in the number of required keyframes. Finally, a geometric method for converting 2D poses into 3D is developed. The algorithm assumes a weak perspective projection for the video sequence and known relative limb lengths for the athlete, and is able to recover the relative scale given at least three labelled keyframes by solving a continuous optimisation problem. Evaluation against a baseline geometric method shows improved stability and lower residual error.
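
As an illustration of the weak perspective assumption mentioned above, the following is a minimal sketch of the classical geometric relation between a limb's projected length, its known length, and the relative depth of its two joints. It is not the optimisation developed in the thesis; the function names, the joint and limb data layout, and the clamping of noisy measurements are assumptions made for this example. Under weak perspective, image coordinates are the 3D coordinates multiplied by a single scale s, so for a limb of length l the depth difference satisfies dZ^2 = l^2 - (du^2 + dv^2) / s^2.

```python
import math


def relative_depth(p_a, p_b, limb_length, scale):
    """Magnitude of the depth difference between two joints of one limb.

    p_a, p_b    : (u, v) image coordinates of the two joints (hypothetical layout)
    limb_length : known limb length, in the same units as the 3D pose
    scale       : weak perspective scale s, with u = s * X and v = s * Y
    The sign of the depth difference is not recoverable from this relation.
    """
    du = (p_a[0] - p_b[0]) / scale
    dv = (p_a[1] - p_b[1]) / scale
    squared = limb_length ** 2 - (du ** 2 + dv ** 2)
    # Annotation noise can make the projection longer than the limb itself;
    # clamping to zero is an assumption of this sketch, not a thesis detail.
    return math.sqrt(max(squared, 0.0))


def minimum_scale(joints_2d, limbs):
    """Smallest scale at which every limb's projection fits within its length.

    joints_2d : dict mapping joint id -> (u, v)
    limbs     : iterable of (joint_a, joint_b, limb_length) tuples
    """
    return max(
        math.dist(joints_2d[a], joints_2d[b]) / length
        for a, b, length in limbs
    )


if __name__ == "__main__":
    joints = {"shoulder": (0.0, 0.0), "elbow": (3.0, 0.0)}
    limbs = [("shoulder", "elbow", 5.0)]
    print(minimum_scale(joints, limbs))
    print(relative_depth(joints["shoulder"], joints["elbow"], 5.0, 1.0))
```

The relation leaves the sign of each depth difference and the scale itself undetermined for a single frame, which is why additional constraints, such as several labelled keyframes, are needed to fix them.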

Item Type: Thesis (Doctoral)
Subjects : computer vision, human pose estimation, optimisation, feature tracking
Divisions : Theses
Authors :
Fastovets, Mykyta
Date : 28 February 2017
Funders : BBC R&D, University of Surrey
Contributors :
A., J-Y.
Depositing User : Mykyta Fastovets
Date Deposited : 09 Mar 2017 11:25
Last Modified : 09 Nov 2018 16:40

© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800