OBJECT TRACKING IN VIDEO SEQUENCE USING MODIFIED KALMAN FILTER WITH
A SHRINKING ACTIVE CONTOUR AS A MEASURING TOOL
A Thesis
By
PRAVINKUMAR G. KANDHARE
Submitted to the Office of Graduate Studies
of Texas A&M University-Commerce
in partial fulfillment of the requirements
for the degree of
MASTER OF COMPUTER SCIENCE
December 2013
OBJECT TRACKING IN VIDEO SEQUENCE USING MODIFIED KALMAN FILTER WITH
A SHRINKING ACTIVE CONTOUR AS A MEASURING TOOL
A Thesis
By
PRAVINKUMAR G. KANDHARE
Approved by:
Advisor: Nikolay Metodiev Sirakov
Committee: Nikolay Metodiev Sirakov
Unal Sakoglu
Mutlu Mete
Head of Department: Sang C. Suh
Dean of College: Grady Blount
Dean of Graduate Studies: Arlene Horne
Copyright © 2013
PRAVINKUMAR G. KANDHARE
ABSTRACT
OBJECT TRACKING IN VIDEO SEQUENCE USING MODIFIED KALMAN FILTER WITH
A SHRINKING ACTIVE CONTOUR AS A MEASURING TOOL
Pravinkumar G. Kandhare, MS
Texas A&M University-Commerce, 2013
Advisor: Nikolay Metodiev Sirakov, PhD
Object tracking in video is an important ongoing research area with applications in the field
of human object interaction recognition in video sequences and other multimedia. In the
present study, we employ the Kalman Filter, which provides a method capable of predicting
and estimating the mass center position of the object in the next frame. Further, we have
elaborated and modified the Kalman Filter so as to apply a parametric shrinking active
contour (S-ACES) with a large capture range as a segmenting tool to determine the location
of the object in the present frame. This location is used by the Modified Kalman algorithm to
predict the object's position in the next consecutive frame of a video sequence. The
Predicted position may contain some error. To minimize the error in prediction, we
introduce an Estimated position that provides a more accurate location of the object in the
next consecutive frame. A number of experiments are performed with video clips to validate
the theory and to assess the accuracy of tracking.
ACKNOWLEDGEMENTS
Foremost, I would like to express my sincere gratitude to my advisor, Dr. Nikolay
Sirakov, for his continuous support of my Master's studies and research, and for his patience,
motivation, enthusiasm, and immense knowledge. His guidance helped me throughout the
research and writing of this thesis. I could not have imagined having a better advisor and
mentor for my Master's study.
I would also like to thank the rest of the thesis committee, Dr. Mutlu Mete and Dr.
Unal Sakoglu, for their support, insightful comments, and feedback.
Last but not least, I would like to thank my loved ones, who have supported me
throughout the entire process with their constant encouragement.
TABLE OF CONTENTS
LIST OF TABLES ..................................................................................................................... v
LIST OF FIGURES .................................................................................................................. vi
Chapter 1 ................................................................................................................................... 1
INTRODUCTION ......................................................................................................... 1
Survey of the Recent Methods ........................................................................... 2
Problem Statement ............................................................................................. 4
Chapter 2 ................................................................................................................................... 5
BACKGROUND ........................................................................................................... 5
Kalman Filter ..................................................................................................... 5
S-ACES Active Contour Model ......................................................................... 7
Chapter 3 ................................................................................................................................. 10
THE MODIFIED KALMAN METHOD ...................................................................... 10
S-ACES stage................................................................................................... 11
Prediction stage ................................................................................................ 13
Estimation stage ............................................................................................... 14
Chapter 4 ................................................................................................................................. 16
EXPERIMENTAL RESULTS..................................................................................... 16
Chapter 5 ................................................................................................................................. 21
CONCLUSION AND FUTURE WORK .................................................................... 21
REFERENCES ....................................................................................................................... 22
LIST OF TABLES
TABLE
1. Experimental data derived using S-ACES tool ..................................................................18
2. Experimental data derived for average Error in position and average Error in
Measurement using S-ACES tool ......................................................................................19
LIST OF FIGURES
FIGURE
1. Two consecutive frames and their parameters ............................................................5
2. The Aircraft boundary extraction using S-ACES tool ................................................9
3. Tracking results obtained using the Modified Kalman Filter ......................................10
4. Three stage tracking process ......................................................................................11
5. Randomized increase in the object’s shape during motion .........................................12
6. The tracking results from five video clips using S-ACES and the
Modified Kalman Filter. ..............................................................................................17
7. The overlapping tracking result using S-ACES and the Modified
Kalman Filter. ..............................................................................................................20
Chapter 1
INTRODUCTION
Object tracking in video sequences is useful in a variety of real-time applications:
computer vision, multimedia, animation, traffic monitoring, human object interaction
recognition, etc. Automated camera applications play a very important role in daily
life for security purposes. Object tracking algorithms may become popular on the market due
to the increasing need for video analysis. The following are some fields where object-tracking
algorithms are most used (Yilmaz, Omar, & Shah, 2006):
• Motion based recognition: automatic object detection and human identification
based on a particular way or manner of walking, also known as gait detection.
• Human computer interaction recognition: communication with computerized
equipment without the need for physical contact. Sign
language can be used to control various tasks in human-machine interaction
(Department of Numerical Analysis and Computing Science, 2001).
• Automated video surveillance: this system is designed to report the movement of
an object and any suspicious situation in a particular area.
• Traffic monitoring: a surveillance system is designed to track vehicles that
break traffic rules or perform illegal acts (Watve, 2004).
• Robot Vision: a real time object-tracking steering system is designed to
identify non-stationary and stationary obstacles in the path of the robot to
avoid collision.
A video clip is composed of a sequence of images; each single image in the video
sequence is referred to as a frame. In other words, a video file is a sequence of frames. Frames
are displayed fast enough (approximately 30 frames per second) for human eyes
to perceive them as a video clip. The contents of two consecutive frames in a video are closely related
to each other, which helps with continuous perception of the motion. Due to this, all object
tracking algorithms or methods can be applied to individual frames. In general, the tracking
process estimates the trajectory of an object in the image plane as it changes position
in a video sequence. Tracking methods can be divided into two main classes: top-down
and bottom-up approaches. In a bottom-up approach, the image is segmented into objects
before tracking. In contrast, a top-down approach produces object hypotheses and tries to
verify them using the image (Nummiaro, Esther, & Luc, 2002). Object tracking is a
challenging task due to the loss of information caused by the projection of the 3D world onto a 2D image,
noise in images, scene illumination changes, complex object motion, the non-rigid or articulated nature
of objects, partial and full object occlusions, complex shapes, and real-time processing
requirements (Yilmaz, Omar, & Shah, 2006).
Survey of the Recent Methods
Various types of techniques are available in the literature related to object
tracking, including background subtraction, shadow removal, moving object detection, object
tracking, etc. (Tao, Guo, & Zhang, 2011). There are some common filtering algorithms used
in this field, which incorporate prior information about an object or scene and deal with the
object dynamics. One of them is the Kalman method, an optimal recursive Bayesian
filter in which all measurements and noise distributions are assumed to be Gaussian. Kalman (1960)
published the Kalman Filter technique in 1960. In fact, Peter Swerling had
published a very similar algorithm in 1958 in a less prestigious journal. The Kalman
Filter is sometimes known as the Kalman-Bucy filter, after Richard Bucy, who
joined Kalman in the early stages of its development (Funk, 2003). With advances in
digital computing, the traditional Kalman Filter method has been extended to many
research areas and applications. The extended Kalman Filter (EKF) handles non-linear systems
by linearizing about an estimate of the current mean and covariance.
Another type of filter used to deal with multidimensional tracking is known as the
Particle Filter. This model is suitable for non-linear and non-Gaussian systems. To represent
the state space, this filter randomly chooses system states, known as particles (Gordon, Salmond,
& Smith, 1993). The randomly chosen particles are associated with non-uniform weights. These
weights depend on the distance between the incoming observation and their projection onto
the observation space (Luo, Hoteit, Duan, & Wang, 2011). In the Mean Shift approach, a weighted
histogram of the object is computed from circular (ellipsoidal window) regions to represent the object
instead of a brute-force search for locating the object. The similarity of the target object and the
candidate object is maximized iteratively by comparing the weighted histogram of the object with that of the
window around the hypothesized object location, and the Bhattacharyya distance is used to
measure similarity (Zhanwu, 2009). In every iteration cycle, the hypothesized object is
relocated based on the Mean Shift vector, so that the similarity between both objects increases.
This is a kernel-tracking algorithm, introduced by Fukunaga and Hostetler (1975)
(Miguel, 2006). The main advantage of this algorithm is that its computational cost is lower
than other methods. However, the Mean Shift algorithm fails to track the target object when the target
moves at high speed, because it does not use speed information or motion direction. The
Mean Shift algorithm also assumes a static model of the underlying system, in which the object's
appearance and shape remain unchanged during motion, but this is a wrong assumption in a
real-time environment (Zhanwu, 2009). The Kalman Filter and the Mean Shift tracking
methods have been combined to predict the next state of the object (Dorin & Visvanathan,
2000).
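As an illustration of the histogram comparison mentioned above, the Bhattacharyya distance between two normalized histograms can be sketched in a few lines. This is a generic sketch in Python (the thesis software itself is written in Java), and the function names are illustrative only:

```python
import math

def bhattacharyya_coefficient(p, q):
    """Overlap between two normalized histograms p and q (lists of bin weights)."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def bhattacharyya_distance(p, q):
    """Distance used to compare target and candidate histograms:
    0 for identical histograms, 1 for non-overlapping ones."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))

# Identical histograms give coefficient 1 and distance 0.
h = [0.25, 0.25, 0.25, 0.25]
print(bhattacharyya_distance(h, h))  # → 0.0
```

Minimizing this distance over candidate windows is what drives the Mean Shift iterations toward the hypothesized object location.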
The main problem under consideration in this thesis is to modify and elaborate on the
existing Kalman Filter method and to make it work with a shrinking active contour as a
measuring tool. Another objective is to develop a new initialization that simplifies the
calculation complexity. Software is developed on the basis of the existing
S-ACES (Sirakov, Mete, & Chakrader, 2011) to implement the new algorithm, which is
designed to track a single object in a video sequence.
Problem Statement
In single object tracking, we need to locate the position of the target object through a
sequence of video frames. In other words, we predict the target object's position in the
next consecutive frame using the current frame's positioning parameters. For each video frame,
we assume that there is a set of measurements, predictions, and correction observations, or
parameters describing the object, which are obtained in our case by using the shrinking active
contour and the Modified Kalman method. The shrinking active contour finds the object's
location in each processed frame, which helps to predict and estimate (correct the prediction
of) the object's location more accurately in the next consecutive frame using the Modified
Kalman Filter, making the tracking process efficient and reliable.
Hereafter, two assumptions are made for a smooth tracking process. First, the
object location provided by the user in the first frame must be correct; otherwise, the
object cannot be detected during the tracking process because of the faulty input position.
Second, we predict the object's location in the next frame of a video sequence based on the
assumption that the target object does not change its location and shape very quickly. A
single object can be tracked efficiently among multiple objects as long as the moving target
object does not overlap with other objects in the video.
Chapter 2
BACKGROUND
Kalman Filter
The main assumption of the Kalman Filter is that the underlying system is a linear
dynamic system and that all measurements and error terms have a Gaussian distribution (Yilmaz,
Omar, & Shah, 2006). The Kalman Filter is an optimal recursive data processing algorithm
that uses all measurement data, plus prior knowledge about the system and the measurement device,
to generate an estimate of the desired variables in such a way that the error is minimized
dynamically (Kalman, 1960). The conventional Kalman Filter approach is divided into two
stages: prediction and correction.
A video is composed of multiple images with a specific time interval (about 30
milliseconds) between every two images. Each image is individually referred to as a frame. In
Figure 1, two frames are represented with the initial parameters of the Kalman Filter. T
denotes the time interval between the two consecutive frames. The Kalman Filter algorithm
predicts the position of the object in the next frame (prediction stage) and corrects
the Predicted position by minimizing the error (correction stage). The Predicted position
obtained in the prediction stage may contain some error; in order to minimize this error, the
second stage, correction, is applied. Object tracking in a video sequence is graphically
represented in Figure 1.
Figure 1: Two consecutive frames and their parameters
In frame F_t, the present position of the object is denoted by Z^m_t. Z^m_t is the Measured
position of the object, calculated using a measurement tool. X^pred_t and X^est_t are the
Predicted and Estimated positions, respectively, in frame F_t.

V_t = |Z^m_t - Z^m_{t-1}| / T

V_t is the velocity of the moving target object, calculated as the displacement of the
object between F_{t-1} and F_t divided by the fixed time interval between two frames (T).

a_t = (V_t - V_{t-1}) / T

The acceleration a_t is the rate of change between the velocities V_t and V_{t-1} divided by time.
X^pred_{t+1} is the Predicted position of the object, calculated using its previous position plus
the displacement and the Error in position of the object (Equation [2.1.1]).

X^pred_{t+1} = X^est_t + V_t T + (1/2) a_t T^2 + EX_t    (2.1.1)

Z^m_{t+1} = X^pred_{t+1} + EZ_{t+1}    (2.1.2)
EX_t is the Error in position at F_t and EZ_t denotes the Error in measurement. The Kalman
Filter combines the prediction of the system's state with the new measurement using a
weighted average; these weights are determined using covariances. In the actual calculation of
the state estimate and the covariances, matrix representations are used to deal with the multiple
dimensions of a single (state) set of calculations. Consider a linear system, where the noise
distribution is Gaussian, with two variables X and Y. The position state matrix (with two
variables) is denoted as

X^m_t = [X_t, Y_t]^T

The Error in position covariance matrix is represented using the notation Q:

Q = E{EX · EX^T} = [Q_xx 0; 0 Q_yy]    (2.1.3)
Here, we assume the Q_xy and Q_yx coefficients are zero.

Q_xx = E((X_t - m_x)(X_t - m_x))
Q_yy = E((Y_t - m_y)(Y_t - m_y))

Similarly, the Error in measurement covariance matrix is represented using the notation R:

R = E{EZ · EZ^T} = [R_xx 0; 0 R_yy]    (2.1.4)

Here, we assume the R_xy and R_yx coefficients are zero.
The Estimated position X^est_{t+1} is calculated to minimize the error in the Predicted
position:

X^est_{t+1} = X^pred_{t+1} + K_{t+1}(Z^m_{t+1} - X^pred_{t+1})    (2.1.5)

Here, K_{t+1} is called the "Kalman gain". The above notations refer to the mass
center of the object. To explain the calculation of the Error in position and the Error in
measurement, we used the state covariance matrix representations shown above.
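The conventional predict/correct cycle described above can be sketched per coordinate as follows. This is a generic textbook sketch in Python, not the thesis implementation (which is in Java); the variable names and the scalar-per-axis simplification (diagonal Q and R, as in Equations (2.1.3) and (2.1.4)) are our own assumptions:

```python
def kalman_step(x_est, p_cov, z_meas, motion_term, q, r):
    """One predict/correct cycle of a conventional Kalman Filter
    for a single coordinate of the mass center.

    x_est:       previous estimated position
    p_cov:       previous error covariance (scalar, one axis)
    z_meas:      new measured position from the measuring tool
    motion_term: displacement predicted from motion, V*T + 0.5*a*T^2
    q, r:        position-error and measurement-error variances
    """
    # Prediction stage (analog of Equation 2.1.1, per axis)
    x_pred = x_est + motion_term
    p_pred = p_cov + q

    # Correction stage: the Kalman gain weights prediction vs. measurement
    k_gain = p_pred / (p_pred + r)              # gain as in Equation (2.1.5)
    x_new = x_pred + k_gain * (z_meas - x_pred)
    p_new = (1.0 - k_gain) * p_pred
    return x_new, p_new
```

With r close to zero the gain approaches 1 and the filter trusts the measurement; with q close to zero it trusts the motion model instead.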
S-ACES Active Contour Model
In the tracking algorithm, our main goal is to find the path of the target object through a
video sequence. But the very first step of tracking is to locate the object in a frame; in
other words, to segment the target object from the background of the image. Different kinds
of models are available for boundary extraction, but we select the active contour published in
(Sirakov, 2005) because this model has a large capture range and does not need a stability
convergence condition. This active contour is used as a measurement tool for our modified
tracking method. In this section, we discuss the basics of this active contour.
The general form of the parabolic differential equation is (Sirakov, 2005)

∂ψ(t,p)/∂t = α^2 ∂^2ψ(t,p)/∂p^2    (2.2.1)

Here, t is the time parameter, t ∈ [0, ∞). Consider C, a closed, smooth, and convex curve
parameterized by C(t,p) = r(t,p) = x(t,p) i + y(t,p) j in the domain [-1, 1] × [-1, 1]. The
parameter p is the space parameter, p ∈ [0, 2π] (Sirakov, 2005).
The geometric heat equation is a parabolic equation of the following form:

∂ψ/∂t = α^2 ∂^2ψ/∂s^2 = α^2 ∂/∂s(∂ψ/∂s) = α^2 ∂T/∂s    (2.2.2)

By the Frenet-Serret equations, the product of the curvature and the normal vector is equal to the
derivative of the tangent vector. Using this result, Equation (2.2.2) can be represented
as (Sirakov, 2005)

∂ψ/∂t = α^2 kN,

where ∂r/∂s = T and ∂T/∂s = kN. The parameter k is the curvature; T and N represent the tangent to
the curve and the normal vector, respectively. In the research paper (Sirakov & Ushkala,
2009), the exact solution of the active convex hull model, given by the equation below, is used
to develop S-ACES.

∂ψ/∂t = P kN - |ds| T    (2.2.3)

The tangent term -|ds| T is used to prevent the convex hull from intruding into the
concavities of an image until a straight line of its shape is reached. ψ = ψ(x(s,t), y(s,t))
represents the parametric curve, k denotes the curvature, and P is a penalty function
(Sirakov & Ushkala, 2009):

P = 0 if ∂f(x,y)/∂s > ε, and P = 1 otherwise

f(x,y) denotes the image function; ε is a user-selected threshold.
From Equation (2.2.3), the following equation is derived to evolve the active contour curve
toward the object (Sirakov & Ushkala, 2009):

r(s,t) = x i + y j = exp(-3 (|ds|/2)^2 c^2 t) [C_1 cos(√3 c (|ds|/2) s) i + C_2 sin(√3 c (|ds|/2) s) j]    (2.2.4)
Initially, the contour inscribes the whole image and evolves toward the boundary. The
following equation initializes the active contour circumscribing the image:

r(s,t) |_{t = 0.001, |ds|/2 = a, C_1 = C_2 = R, c = 1000} =
R exp(-3 a^2 c^2 (0.001)) [cos(1000√3 a s) i + sin(1000√3 a s) j]    (2.2.5)

In the above equation, R = (1/2)(nr^2 + nc^2)^{1/2}, where nr and nc represent the number of rows and
the number of columns in the image, respectively. The parameter c is set to 1000 to keep the
evolving curve closed.
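The initialization radius R is simply half of the image diagonal, so that the initial contour circumscribes the whole frame. A one-line Python sketch (illustrative only, not from the S-ACES code):

```python
import math

def initial_radius(nr, nc):
    """R = (1/2) * (nr^2 + nc^2)^(1/2): half the image diagonal,
    so the initial contour circumscribes the whole image of
    nr rows and nc columns."""
    return 0.5 * math.sqrt(nr ** 2 + nc ** 2)

print(initial_radius(240, 320))  # 320x240 frame → 200.0
```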
Figure 2: Aircraft boundary extraction results obtained using the S-ACES software tool developed by Ushkala
K. (Sirakov & Ushkala, 2009) and improved by Chakrader N.S. (Chakrader, 2011). (a) The object boundary
along with the curve. (b) The active contour, initialized with Equation (2.2.5), evolved toward the object. (c) The
extracted boundary of an aircraft.
The following boundary condition is formulated to halt the active contour on the
object boundary in order to extract it (Sirakov & Ushkala, 2009):

r(s*, t* + ∂t) = r(s*, t*) if ∂f(s*, t)/∂s > ε for s = s* and t* > 0.001    (2.2.6)

The boundary points calculated by the shrinking active contour are used to calculate the mass
center of the object. This calculated mass center is our Measured position of the object,
denoted by X^meas_t for frame F_t.
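The mass center computation from the extracted boundary points reduces to the mean of the point coordinates. This Python sketch is illustrative only; the thesis tool itself is implemented in Java:

```python
def mass_center(boundary_points):
    """Measured position: the mean of the boundary points
    extracted by the shrinking active contour."""
    n = len(boundary_points)
    cx = sum(x for x, _ in boundary_points) / n
    cy = sum(y for _, y in boundary_points) / n
    return cx, cy

# Four corners of a square -> its middle
print(mass_center([(0, 0), (2, 0), (2, 2), (0, 2)]))  # → (1.0, 1.0)
```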
Chapter 3
THE MODIFIED KALMAN METHOD
In the present chapter, we describe the functionality of the Modified Kalman Filter
method for tracking a single object (or multiple objects) in a video sequence using the S-ACES tool
developed by (Sirakov & Ushkala, 2009) and improved by (Chakrader, 2011). The Modified
Kalman Filter tracking method is illustrated in the following figure.
Figure 3: Tracking results obtained using the Modified Kalman Filter. In each frame, the red circle
represents the Measured position, the blue circle represents the Predicted position, and the green circle
represents the Estimated position.
To perform the experiments, the video clip is sliced into multiple
frames/images. A batch processing technique is used to perform the tracking calculation
(extraction of the boundary of the target object, Measured position/mass center calculation
for the present frame, and the Predicted and Estimated positions of the target object for the
next frame) automatically on all frames at once rather than through manual execution of each
individual frame. This shows the robustness of our algorithm, because the same set of parameters
(Measured position, Predicted position, and Estimated position) is used for all frames. In the
tracking algorithm, each individual frame is taken into account to predict the mass center
position of the object in the next consecutive frame. In other words, each frame is
processed through the Modified Kalman Filter algorithm to predict the object's mass center
position in the next frame and keep track of the moving target object. The algorithm's
functionality is divided into the following three stages:
1. S-ACES stage: object boundary extraction and object location measurement.
2. Prediction stage: calculating the Predicted position of the target object in the next
consecutive frame.
3. Estimation stage: minimizing the error in the Predicted position (from stage 2).
Figure 4 shows the tracking process using the three stages. Each individual frame passes through the
three stages to calculate the Measured position, Predicted position, and Estimated position.
The Predicted position calculation involves the positioning parameters of frames F_{t-1} and F_t
to predict the future location of the object in the next frame F_{t+1}.
Figure 4: Three-stage tracking process.
The details of each stage are discussed in the following sections.
S-ACES stage
In the initial stage of the algorithm, the shrinking active contour explained in Chapter 2
(Equations 2.2.4-2.2.6) is applied to each individual frame to extract the boundary points of
the object subject to tracking. These boundary points are used to calculate the mass center of
the object, which represents the actual position of the object in a frame. Here, the actual
position obtained using S-ACES is referred to as the Measured position. In the first frame, the shrinking
active contour is applied using a radius R and a center C of the tracked object, and
these values are provided by the user. By using this center C and radius R of the object in
frame F_{t-1}, we determine the Measured position of the object in frame F_t. In the next
step of the algorithm, we calculate the Predicted and Estimated positions of the object for the
next consecutive frame F_{t+1}. For each frame, we apply these algorithm steps recursively, with a
few important changes.
In the next consecutive frame F_{t+1}, we need to modify the center of the shrinking
active contour, because the Estimated position calculated in frame F_t is the new location of the
moving target object in frame F_{t+1}, and we establish the active contour at that Estimated
position. A moving object can change its shape due to camera zoom-in and zoom-out
functionality or because of its motion. In other words, we need to modify the radius of the
boundary-extracting shrinking active contour. Since the contents (object shape) of two
consecutive frames are closely related to each other, we modify the radius for frame F_{t+1} using the
maximum distance between the center of the object and the farthest boundary point in the
previous frame F_t. Sometimes, the modified radius is not sufficient to evolve the active
contour around the target object because of a randomized increase in the shape of the object, as
shown in Figure 5. To handle this issue, we increase the length of the radius by a few units
provided by the user. The radius is updated in each frame using the above methodology.
Figure 5: Randomized increase in the object’s shape during motion.
Flight_video_068.jpg_test.png Flight_video_069.jpg_test.png
From the above two images, we can tell that the target object changes its shape
during motion. In other words, the shape of the target object has grown in the second frame
(Flight_video_069.jpg_test.png), and in that image we are unable to extract the complete
boundary of the object. To grab the complete boundary of the object, we add a few length
units to the radius at the first frame of the video (Frame #1). These additional length units are
then added for all frames of the video sequence to handle the above issue.
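The radius update described above (the distance to the farthest boundary point from the mass center, plus a user-chosen safety margin) can be sketched as follows. The Python code and the name `update_radius` are illustrative assumptions, not the S-ACES implementation:

```python
import math

def update_radius(center, boundary_points, margin):
    """Radius for the next frame: distance from the mass center to the
    farthest boundary point of the current frame, plus a user-provided
    margin that absorbs sudden growth of the object's shape."""
    cx, cy = center
    farthest = max(math.hypot(x - cx, y - cy) for x, y in boundary_points)
    return farthest + margin

# Farthest point is (3, 4), at distance 5 from the origin; margin 2 -> 7.0
print(update_radius((0.0, 0.0), [(3.0, 4.0), (1.0, 0.0)], 2.0))
```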
Prediction stage
In the second stage, we predict the position of the target object in the next consecutive
frame. To start the algorithm, we initialize the Measured position, Position Error, velocity,
and acceleration. All these parameters are related to the mass center of the object. The
Measured position of the object is obtained by applying S-ACES on the first frame/image.
The Predicted position is calculated using the following formula:

X^pred_{t+1} = X^est_t + V_t T + (1/2) a_t T^2 + EX_{t+1}    (3.1.1)

Here, X^pred_{t+1} represents the future location of the object in the (t+1)-th frame, known as
the "Predicted position". X^est_t is the Estimated position of the tracked object in the t-th frame,
calculated using Equation [3.1.3].
V_t and a_t denote the velocity and acceleration of the tracked object, respectively.
T is the time between two sliced images. The velocity and acceleration are calculated using the
following formulas:

V_t = (X^meas_t - X^meas_{t-1}) / T    and    a_t = (V_t - V_{t-1}) / T

X^meas_t is the Measured position of the object in the t-th frame (F_t) and X^meas_{t-1} is the
Measured position in the (t-1)-th frame (F_{t-1}). V_t is calculated using the displacement between
the two Measured positions (X^meas_t and X^meas_{t-1}) divided by the time (T). The acceleration a_t is
the rate of change between the velocities (V_t and V_{t-1}) per unit time.
EX_{t+1} is the Position Error, determined as the difference between the Estimated and
Predicted positions:

EX_{t+1} = X^est_t - X^pred_t    (3.1.2)

X^est_t is the Estimated position in the t-th frame (F_t), which is calculated to minimize
the error of the Predicted position. The derivation of the Estimated position X^est_{t+1} is discussed in
the next section.
In the Kalman Filter method, EX_t is calculated using a covariance matrix (Equation
[2.1.3]), whereas our Position Error calculation is much simpler and less
time consuming, as shown in Equation [3.1.2]. In the Modified Kalman method, we do not use
the Kalman gain coefficient (Equation [2.1.5]) or the Measured position calculation
(Equation [2.1.2]) for the object in the next consecutive frame, which helps to keep our algorithm
simpler and faster. Since we use fewer calculation steps for prediction and estimation,
the Modified Kalman Filter method has a faster execution run-time.
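Equations (3.1.1) and (3.1.2), together with the velocity and acceleration formulas, can be sketched per coordinate as follows. This Python sketch is illustrative (the thesis software is in Java), and the function and argument names are our own:

```python
def predict_position(x_est_t, x_pred_t, x_meas_t, x_meas_prev, v_prev, T):
    """Prediction stage of the Modified Kalman method, per coordinate.
    Returns the Predicted position for frame t+1 and the current velocity."""
    v_t = (x_meas_t - x_meas_prev) / T   # velocity from measured displacement
    a_t = (v_t - v_prev) / T             # acceleration from velocity change
    ex_next = x_est_t - x_pred_t         # Position Error (Equation 3.1.2)
    x_pred_next = x_est_t + v_t * T + 0.5 * a_t * T ** 2 + ex_next  # (3.1.1)
    return x_pred_next, v_t
```

For an object measured at 8 and then at 10 with T = 1, constant velocity, and no previous error, the sketch predicts position 12 for the next frame.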
Estimation stage
The Predicted position may contain some error. To minimize this error, the
Estimated position term is introduced. X^est_{t+1} is calculated using the formula

X^est_{t+1} = X^pred_{t+1} + EZ_t    (3.1.3)

X^est_{t+1} is the Estimated position in the next, (t+1)-th frame (F_{t+1}), which is calculated
to minimize the error of the Predicted position using the current frame's positioning parameters. To
minimize the error in the Predicted position, we use the Measurement Error (EZ_t) term,
determined as the difference between the Measured position and the Predicted position:

EZ_t = X^meas_t - X^pred_t    (3.1.4)

In the Kalman Filter method, EZ_t is calculated using a covariance matrix (Equation
[2.1.4]), whereas our Measurement Error calculation is much simpler and less time
consuming, as shown in Equation [3.1.4].
In the initial steps of the algorithm, we do not have sufficient information about the
predicted position X^pred_t, the estimated position X^est_t, or the motion of the object (V_t) for
the t-th frame (F_t). In the first frame, we assume that

X^pred_t = X^meas_t = X^est_t

To proceed with the algorithm, we use the above initialization in the first frame. As
more frames are processed, we gain information about the object's motion (velocity and
acceleration) and positioning parameters (Predicted position and Estimated position), which
helps to obtain more reliable and faster tracking results. In each individual frame, the shrinking
active contour prevents losing the tracked object by using the information from the
Modified Kalman Filter.
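The estimation stage (Equations (3.1.3) and (3.1.4)) and the first-frame initialization can be sketched per coordinate as follows; the names are illustrative, not taken from the S-ACES code:

```python
def estimate_position(x_pred_next, x_meas_t, x_pred_t):
    """Estimation stage: correct the Predicted position for frame t+1
    with the Measurement Error of frame t, per coordinate."""
    ez_t = x_meas_t - x_pred_t    # Measurement Error (Equation 3.1.4)
    return x_pred_next + ez_t     # Estimated position (Equation 3.1.3)

# First frame: no motion history yet, so Predicted = Measured = Estimated.
x_meas_1 = 100.0
x_pred_1 = x_est_1 = x_meas_1

# Later frame: prediction 12.0 for t+1; at t the measurement was 10.5
# against a prediction of 10.0, so the estimate becomes 12.5.
print(estimate_position(12.0, 10.5, 10.0))  # → 12.5
```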
Chapter 4
EXPERIMENTAL RESULTS
In this chapter we present tracking results from five different video clips. To analyze
the tracking performance, we chose video clips that have different frame sizes and
frame rates. The tracking software tool processes all frames at once, using a batch processing
technique, rather than through manual execution of each individual frame. A graphical
user interface is provided to select an object for tracking (to provide the location
information of the object) and to input the capture range of the shrinking active contour. In
Figure 6, the first two video clips have a single moving object and the other three video clips
have multiple moving objects. For each video, four frames are used to show the tracking
capabilities of our method, as shown in Figure 6. These tracking results were obtained using S-ACES,
developed for tracking in the Java language using the NetBeans 7.0.1 Integrated Development
Environment, on the Mac OS X (version 10.7.5) operating system with a 2.3 GHz Intel Core
i5 processor.
Once the video clip is decomposed into a sequence of frames, the Modified Kalman
Filter algorithm is applied to each individual frame in order to predict and estimate the location
of the tracked object in the next consecutive frame. To show that our approach is reliable, we
picked random frames from the entire video sequence and show them in Figure 6.
Flight _002.jpg_test.png Flight _011.jpg_test.png Flight _024.jpg_test.png Flight _086.jpg_test.png
A) Single Flight tracking
F_1.png_test.png F_88.png_test.png F_150.png_test.png F_233.png_test.png
B) Single Chlamydomona tracking
FN_1.png_test.png FN_38.png_test.png FN_74.png_test.png FN_111.png_test.png
C) Single Chlamydomona tracking among multiple
FG_num_1 FG_num_40 FG_num_70 FG_num_121
D) Basket ball tracking among other multiple balls
FB_1.png_test.png FB_40.png_test.png FB_61.png_test.png FB_91.png_test.png
E) Blue ball tracking among multiple balls
Figure 6: The tracking results from five video clips using S-ACES and the Modified Kalman Filter.
Numerical data derived from the above experiment are shown in Table 1. Analyzing these data, one may see that the maximum processing time for a single frame is 108 milliseconds and the minimum is 96 milliseconds. This difference comes from the nature of the shrinking active contour, whose computational complexity is O(M·N), where M and N denote the dimensions of the rectangle circumscribing S-ACES.
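The O(M·N) term reflects a pass over every pixel of the rectangle circumscribing the contour. A representative example of such a pass is the mass-center computation that supplies the Kalman measurement; the sketch below is a simplified illustration that assumes the object pixels have already been labeled in a boolean mask.

```java
// Mass center of object pixels inside an M x N rectangle: one O(M*N) pass.
public class MassCenter {
    // mask[i][j] is true where pixel (row i, column j) belongs to the object
    static double[] compute(boolean[][] mask) {
        long sx = 0, sy = 0, count = 0;
        for (int i = 0; i < mask.length; i++) {          // M rows
            for (int j = 0; j < mask[i].length; j++) {   // N columns
                if (mask[i][j]) { sx += j; sy += i; count++; }
            }
        }
        if (count == 0) return null;                     // no object pixels found
        return new double[] { (double) sx / count, (double) sy / count };
    }
}
```

Shrinking the capture range therefore directly reduces M and N, which is why per-frame time varies with the size of the tracked object.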
Table 1
Experimental data derived for execution time using the S-ACES tool.

Name of initial frame      Image size   Initial position   Frames processed   Execution time (ms)
FG_num_1                   480*480      186,180            122                11821
F_1.png_test.png           340*240      160,116            281                38910
FN_1.png_test.png          640*360      164,125            133                18728
Flight_002.jpg_test.png    320*240      160,150            126                9257
FB_1.png_test.png          320*240      122,174            149                13696
The numerical data in Table 2 give the average error in position and the average error in measurement. Here, EX_x and EX_y denote the average error in position at the x- and y-pixel coordinates respectively; similarly, EZ_x and EZ_y denote the average error in measurement at the x- and y-pixel coordinates respectively. We used four different video clips to compute these averages.
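How these averages are formed can be written down explicitly. The sketch below assumes each per-frame error is the absolute difference between two coordinate series (for example, estimated versus measured x-coordinates) and that the reported value is the mean over all processed frames; the exact pairing of series is defined by the Modified Kalman equations in the earlier chapters.

```java
// Mean absolute error between two per-frame coordinate series.
public class AvgError {
    static double average(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0)
            throw new IllegalArgumentException("series must match and be non-empty");
        double sum = 0;
        for (int k = 0; k < a.length; k++) {
            sum += Math.abs(a[k] - b[k]);   // per-frame absolute error
        }
        return sum / a.length;              // e.g., EX_x when a, b hold x-coordinates
    }
}
```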
Table 2
Experimental data derived for average error in position and average error in measurement using the S-ACES tool.

Name of video clip                                 EX_x     EX_y     EZ_x     EZ_y
A) Single flight tracking                          1.2840   0.3520   6.0560   13.184
C) Single Chlamydomonas tracking among multiple    0.6240   0.5639   0.9548   1.2857
D) Basketball tracking among multiple balls        0.2231   0        0.4793   0.0413
E) Blue ball tracking among multiple balls         0.4161   0.2818   1.0536   0.7986

Note: EX denotes average error in position and EZ average error in measurement; subscripts x and y denote the pixel coordinate.
Figure 6 shows that the Modified Kalman Filter algorithm successfully tracks a single target object in all five video clips. All of these clips fall into the non-overlapping category, where the moving target does not closely interact or overlap with other stationary or non-stationary objects. In the overlapping category, the target object overlaps or closely interacts with other objects. In Figure 7, other objects enter the target object's capture range/radius; as a result, the active contour grabs the neighboring objects' boundaries and treats multiple objects as a single one. If the target moves close to neighboring object(s), the parametric shrinking active contour expands its radius and loses the original target's location, as shown in Frame_53.png_test.png. Figure 7 thus demonstrates that overlap and close interaction with the target object are challenging for the Modified Kalman Filter: once the target is overlapped by another object, re-identifying it is difficult.
Figure 7: The overlapping tracking result using S-ACES and the Modified Kalman Filter
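One simple way to detect the failure mode just described is to watch the area of the rectangle circumscribing the contour from frame to frame: a sudden jump suggests the contour has absorbed a neighboring object. The check below is an illustrative heuristic only, not part of the thesis method, and the 50% growth threshold is an assumption.

```java
// Flags frames where the rectangle circumscribing the contour grows abruptly,
// hinting that the active contour has absorbed a neighboring object (heuristic).
public class OverlapFlag {
    static final double GROWTH_LIMIT = 1.5;   // assumed threshold: 50% area jump

    static boolean suspicious(int prevW, int prevH, int curW, int curH) {
        double prevArea = (double) prevW * prevH;
        double curArea  = (double) curW * curH;
        return prevArea > 0 && curArea / prevArea > GROWTH_LIMIT;
    }
}
```

Such a flag could, for instance, trigger a re-initialization of the contour around the last reliable predicted position.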
Chapter 5
CONCLUSION AND FUTURE WORK
This thesis develops a single-object tracking method that uses a parametric shrinking active contour (S-ACES) to calculate the location of a target object in a video sequence. The active contour is applied as a measuring tool in the Kalman method, which is modified for this purpose. Incorporating the noise terms in the Modified Kalman Filter helped to improve the accuracy of tracking. The proposed method also works efficiently for tracking a single object among multiple moving objects. Our experimental results show precise tracking of moving objects with or without changing shape attributes. The algorithm may lose the tracked object due to occlusion or if the object leaves the frame.
The first contribution of this work is the application of S-ACES to measure the position of the target object. The second is the modification of the Kalman Filter to use this measurement for prediction and estimation of the target's position in the next frame, which makes it possible to correct the larger errors that prediction alone may generate. Another contribution is the simplification of the Kalman Filter, which increases the speed of tracking.
In future work, we will address the overlapping problem by integrating the Modified Kalman algorithm with the shell approach provided by S-ACES, distinguishing the target object from overlapping objects in order to make tracking smooth and efficient. We will also extend the algorithm to track multiple objects simultaneously in a video sequence.