Detailed Description

Version:: 0.3

Author:: cyril.roussillon@laas.fr, martial.sanfourche@laas.fr

This is Birchfield's implementation of the well-known Kanade-Lucas-Tomasi (KLT) tracker, wrapped in C++ and extended with an additional layer that efficiently tracks an object in the image.

The original source code of the library can be found at http://www.ces.clemson.edu/~stb/klt/. We have tried to modify the original source code as little as possible, but some features required changes. They are detailed in the KLT update page, together with a procedure for applying these enhancements to future versions of the original library (Birchfield occasionally publishes new releases to fix bugs).

Remarks concerning the C++ wrapper

The wrapper consists of a Klt class that groups all the functions needed to use the original lib. All methods have the same names as the original functions, but without the "KLT" prefix. Documentation of these methods is given by the author of the original lib at http://www.ces.clemson.edu/~stb/klt/.

The wrapper also uses the image::JfrImage type in place of raw pointers to data. If you can, use image dimensions that are multiples of 8: this avoids repeated memory copies when translating JfrImage data into the classical RGB aligned data used by the original library.

Remarks concerning the use of the KLT lib

The original KLT documentation can be found at http://www.ces.clemson.edu/~stb/klt/. The KLT theory page describes how the KLT lib works and where complementary information can be found.

Birchfield provides two functions, ChangeTCPyramid and UpdateTCBorder, to update certain parameters according to other ones. They must be used with caution, however, because they do not allow every adjustment.

First, ChangeTCPyramid can be used to adjust the subsampling and nPyramidLevels parameters of the KLT_TrackingContext structure. However, we found that for large movements between frames it is necessary to set subsampling to 2 and nPyramidLevels to a value from 3 to 5, for example, a combination that ChangeTCPyramid never produces.

Second, UpdateTCBorder can compute large borders when the pyramid has many levels, which considerably reduces the usable area. It is advisable to respect these borders; tracking can work with smaller ones (by not calling the UpdateTCBorder function), but there are no guarantees. The solution chosen for the additional layer is to consider that everything outside the image is black and to always respect the borders, filling with black where necessary.
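The "black outside the image" convention can be illustrated with a small standalone sketch. It is not the module's code: the function name extractZone and the flat row-major grey buffer are assumptions made for the illustration. It cuts a rectangular zone out of an image and leaves every pixel that falls outside the image at 0 (black).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the klt module.
// Copies the zone (x0, y0, zw x zh) out of a w x h grey image stored
// row by row. Zone pixels lying outside the image stay black (0).
std::vector<std::uint8_t> extractZone(const std::vector<std::uint8_t>& src,
                                      int w, int h,
                                      int x0, int y0, int zw, int zh)
{
    assert(w >= 0 && h >= 0 && zw >= 0 && zh >= 0);
    assert(src.size() == static_cast<std::size_t>(w) * h);

    // Start from an all-black zone, then overwrite the part inside the image.
    std::vector<std::uint8_t> zone(static_cast<std::size_t>(zw) * zh, 0);
    for (int y = 0; y < zh; ++y) {
        const int sy = y0 + y;
        if (sy < 0 || sy >= h) continue;      // whole row is outside
        for (int x = 0; x < zw; ++x) {
            const int sx = x0 + x;
            if (sx < 0 || sx >= w) continue;  // pixel is outside
            zone[static_cast<std::size_t>(y) * zw + x] =
                src[static_cast<std::size_t>(sy) * w + sx];
        }
    }
    return zone;
}
```

With this approach the zone handed to the tracker always has the requested size, so the borders can be respected even when the object is close to the image edge.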

The additional layer

This section presents what the additional layer does and how to use it. The KLT theory page gives some information about how it is implemented.

While the original lib works on the whole image, the additional layer lets you give an area in the first image to designate an object, and then tracks it. To that end:

only the area around the object is processed (which considerably speeds up the tracking),
the position of this area is estimated by a Kalman filter,
lost features are replaced around the object,
a heuristic eliminates features that are added outside the object (in the landscape). This heuristic is based on the displacements and relative positions of features, so that an object is ultimately characterized by a pattern or an area that moves in a homogeneous way.

The object is defined by the features that are tracked (you can specify their count). Its position is modeled by the barycenter of the features, and its size by the "scope", which is the largest distance between a feature and the barycenter.
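The barycenter and scope definitions can be written down directly. The following is a minimal sketch, assuming features are plain 2-D points; the Point and ObjectModel types and the describeObject function are hypothetical names for this illustration and do not come from the module.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical types, not part of the klt module.
struct Point { double x; double y; };
struct ObjectModel { Point barycenter; double scope; };

// Position = mean of the feature positions (barycenter).
// Size     = largest distance between a feature and the barycenter (scope).
ObjectModel describeObject(const std::vector<Point>& features)
{
    assert(!features.empty());

    Point c{0.0, 0.0};
    for (const Point& f : features) {
        c.x += f.x;
        c.y += f.y;
    }
    const double n = static_cast<double>(features.size());
    c.x /= n;
    c.y /= n;

    double scope = 0.0;
    for (const Point& f : features) {
        const double d = std::hypot(f.x - c.x, f.y - c.y);
        if (d > scope) scope = d;
    }
    return ObjectModel{c, scope};
}
```

For four features placed at the corners of a 4 x 4 square, the barycenter is the center of the square and the scope is half its diagonal.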

Some information can be drawn on images with the writeFeaturesToImg and writeBordersToImg functions:

yellow points : added features (initial selection or replacement)
yellow frame : area for replaced features (corresponds to the size of the object)
red points : tracked features
red frame : area for tracked features (corresponds to the processed area without borders)
dashed red points : features which were successfully tracked but have been suppressed because of the landscape criterion.
purple point : barycenter of features (position of the object)
blue point : Kalman-estimated position of the object, center of the processed area
blue frame : processed area

Important parameters

The number of features used to track the object is an important parameter. With too few features the object is easily lost; with too many, the object risks spreading without reason. That is why, when you increase the number of features, you have to decrease the mindist parameter (minimum distance between features during selection) so that more features fit on the object.

Another important tracking parameter is the window size: if it is too small, the risk of bad tracking by KLT increases.

Last but not least, the number of pyramid levels and the subsampling value are important. Subsampling should be left at 2, but the number of pyramid levels can have a significant effect on tracker performance. It is typically between 3 and 5: higher values handle larger displacements (which relates to the precision of the Kalman model) but also increase the number of badly tracked features.
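To get a feel for why more pyramid levels help with larger displacements, the sketch below computes how much the coarsest level is reduced and how large a displacement appears there. It assumes that level 0 is the full-resolution image and that each further level divides the resolution by subsampling, as the nPyramidLevels and subsampling parameter descriptions suggest. The function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helpers, not part of the klt module.

// Reduction factor of the coarsest pyramid level relative to the original
// image, assuming level 0 is full resolution: subsampling^(nPyramidLevels-1).
int coarsestScale(int subsampling, int nPyramidLevels)
{
    assert(subsampling >= 1 && nPyramidLevels >= 1);
    int scale = 1;
    for (int level = 1; level < nPyramidLevels; ++level)
        scale *= subsampling;
    return scale;
}

// Size (in pixels of the coarsest level) of a displacement given in pixels
// of the original image.
double displacementAtCoarsestLevel(double displacement,
                                   int subsampling, int nPyramidLevels)
{
    return displacement / coarsestScale(subsampling, nPyramidLevels);
}
```

Under these assumptions, with a subsampling of 2 a 60-pixel displacement shrinks to 15 pixels at the coarsest of 3 levels, and the reduction factor grows from 4 to 16 when going from 3 to 5 levels.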

Kalman representation

featK$i.tiff represents some information about the Kalman filter:

red point : measurement of the position (barycenter of features)
blue point : position estimated by Kalman (from the previous frames)
green point : Kalman filter position (after update)
green line : Kalman filter speed (after update)
blue frame : confidence zone of Kalman, from the covariance matrix
green line in the upper left corner : Kalman filter speed with a scale factor of 2
blue line in the upper left corner : speed estimated from the previous frames, with a scale factor of 2
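The quantities listed above (estimated position, updated position and speed, confidence zone) are those of a constant-speed Kalman filter. The following is a minimal one-axis sketch of such a filter, not the module's implementation: the Kalman1D type, its member names and the choice of one filter per image axis are assumptions. The accelSigma and measureSigma arguments play roles comparable to the maxAccel and measureError parameters, although the exact way the module uses those values is not shown here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1-D constant-speed Kalman filter, not part of the klt module.
// State: position p and speed v. The time step is one frame.
struct Kalman1D {
    double p;    // position estimate (pixels)
    double v;    // speed estimate (pixels / frame)
    double ppp;  // variance of p
    double ppv;  // covariance between p and v
    double pvv;  // variance of v

    // Prediction for the next frame. accelSigma models the unknown
    // acceleration (pixels / frame^2) that perturbs the constant-speed model.
    void predict(double accelSigma)
    {
        const double q = accelSigma * accelSigma;
        p += v;  // p' = p + v * dt with dt = 1 frame, v is unchanged
        // P' = F P F^T + Q, with F = [1 1; 0 1] and Q = q * [1/4 1/2; 1/2 1]
        const double nPpp = ppp + 2.0 * ppv + pvv + 0.25 * q;
        const double nPpv = ppv + pvv + 0.5 * q;
        const double nPvv = pvv + q;
        ppp = nPpp; ppv = nPpv; pvv = nPvv;
    }

    // Correction with a measured position z (for example one coordinate of
    // the barycenter) whose standard deviation is measureSigma.
    void update(double z, double measureSigma)
    {
        const double s = ppp + measureSigma * measureSigma;  // innovation variance
        assert(s > 0.0);
        const double kp = ppp / s;  // gain applied to the position
        const double kv = ppv / s;  // gain applied to the speed
        const double innovation = z - p;
        p += kp * innovation;
        v += kv * innovation;
        // P = (I - K H) P with H = [1 0]
        const double nPpp = (1.0 - kp) * ppp;
        const double nPpv = (1.0 - kp) * ppv;
        const double nPvv = pvv - kv * ppv;
        ppp = nPpp; ppv = nPpv; pvv = nPvv;
    }

    // Standard deviation of the position, usable to size a confidence zone.
    double positionSigma() const { return std::sqrt(ppp); }
};
```

In this sketch, the value of p after predict corresponds to the blue point, p and v after update to the green point and green line, and positionSigma to the extent of the confidence zone along one axis.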

Example

A video example, demo.mpg, showing the tracking of a hand (thanks, Mathias, for the video) is available in the modules/klt/doc folder to illustrate what this additional layer can do. With mplayer you can use the '[' and ']' keys to slow down or speed up the video, in order to see the details of features that are tracked, replaced or suppressed. The hand is tracked even when it passes in front of the head or the drawing on the t-shirt, thanks to the elimination of landscape features. Tracking one frame takes less than 15 ms, and it can be sped up by reducing the processed area.

History

0.3 (2006-12-01) - Update needed to use the new Image module (sup 1.0)
0.2 (2006-08-10) - Additional layer to track an object
0.1 (2006-07-10) - Initial version

Requirements

Interface with the JfrImage class from the image Jafar module. You can also still use the original functions, with the "KLT" prefix in place of the "Klt::" scope.

Macro

No macro is provided, but the kltTest function enables precise tests.

Tcl interface (generated by swig)

The interface of the module is generated from the following files:

klt.i defines the wrapped classes and functions,
kltException.i defines the try { } catch block for this module.

Classes
class	jafar::klt::Klt
	This class is a Jafar wrapper (using Image) to the KLT tracker.
class	jafar::klt::KltException
	Base class for all exceptions defined in the module klt.
Functions
unsigned char *	jafar::klt::compatImage (jafar::image::Image &image_, CompatTreatment &treatment, int zone_x=-1, int zone_y=-1, int zone_w=-1, int zone_h=-1)
	converts a JfrImage into aligned data compatible with the original KLT lib's functions; returns a pointer to aligned grey data, doing as few conversion operations as possible.
void	jafar::klt::destroyCompatImage (unsigned char *image_ptr, CompatTreatment &treatment)
	destroys a compatImage. Once you no longer need the data, this function frees what must be freed, which it knows from the treatment structure filled in during creation.
void	jafar::klt::kltTest (const jafar::datareader::ImageReader &imaRead, int firstFrame, int frameStep, int lastFrame, std::string path2FeatureFile, int nFeatures=100, int border_x1=-1, int border_y1=-1, int border_x2=-1, int border_y2=-1, bool replace=false, int affineConsistencyCheck=-1, int window_size=11, int nPyramidLevels=3, int subsampling=2, int mindist=10, int maxDisplacement=60, int maxAccel=50, int measureError=10, int margin=10, double max_residue=24, int max_iterations=10, double min_determinant=0.01)
	complete test function for the KLT lib, and also a nice example of how to use the module
void	jafar::klt::oldkltTest (const char *path, int firstFrame=0, int nFrames=1, int nFeatures=100, int border_x1=-1, int border_y1=-1, int border_x2=-1, int border_y2=-1, bool replace=false, int affineConsistencyCheck=-1, int window_size=11, int nPyramidLevels=3, int subsampling=2, int mindist=10, int maxDisplacement=60, int maxAccel=50, int measureError=10, int margin=10, double max_residue=24, int max_iterations=10, double min_determinant=0.01)
	complete test function for the KLT lib, and also a nice example of how to use the module

Function Documentation

unsigned char* jafar::klt::compatImage	(	jafar::image::Image &	image_,
		CompatTreatment &	treatment,
		int	zone_x = `-1`,
		int	zone_y = `-1`,
		int	zone_w = `-1`,
		int	zone_h = `-1`
	)

converts a JfrImage into aligned data compatible with the original KLT lib's functions; returns a pointer to aligned grey data, doing as few conversion operations as possible.

So if you give it a u8 grey image whose dimensions are multiples of 8, no copy is made.

Parameters:

image_	the JfrImage to convert
treatment	a structure that records which allocations have been made; pass it to the destroyCompatImage function so that it knows what to free

void jafar::klt::destroyCompatImage	(	unsigned char *	image_ptr,
		CompatTreatment &	treatment
	)

destroys a compatImage. Once you no longer need the data, this function frees what must be freed, which it knows from the treatment structure filled in during creation.

Parameters:

image_ptr	the aligned data to free
treatment	the same structure given to compatImage
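The pairing of a conversion function with a "treatment" record can be illustrated with a standalone sketch. Everything in it is hypothetical: the Treatment type stands in for CompatTreatment, makeContiguous and release stand in for compatImage and destroyCompatImage, and the copy condition (rows padded beyond the image width) is only one plausible reason for a copy, not something confirmed by this page. The point is the ownership pattern: the record remembers whether memory was allocated, so the release function frees only what the conversion created.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for CompatTreatment: remembers what was allocated.
struct Treatment { bool allocated = false; };

// Hypothetical stand-in for compatImage. Returns a buffer whose rows are
// exactly `width` bytes long. If the input rows already have that length
// (step == width), the input pointer is returned and nothing is copied.
unsigned char* makeContiguous(unsigned char* data, int width, int height,
                              int step, Treatment& treatment)
{
    assert(data != nullptr && width > 0 && height > 0 && step >= width);
    if (step == width) {
        treatment.allocated = false;  // caller's buffer is used as is
        return data;
    }
    unsigned char* out = new unsigned char[static_cast<std::size_t>(width) * height];
    for (int y = 0; y < height; ++y)
        std::memcpy(out + static_cast<std::size_t>(y) * width,
                    data + static_cast<std::size_t>(y) * step,
                    static_cast<std::size_t>(width));
    treatment.allocated = true;       // release() must free this copy
    return out;
}

// Hypothetical stand-in for destroyCompatImage: frees only what
// makeContiguous allocated, as recorded in the treatment structure.
void release(unsigned char* ptr, Treatment& treatment)
{
    if (treatment.allocated) {
        delete[] ptr;
        treatment.allocated = false;
    }
}
```

The caller always calls the release function with the same record, whether or not a copy was made, which keeps the calling code identical in both cases.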

void jafar::klt::kltTest	(	const jafar::datareader::ImageReader &	imaRead,
		int	firstFrame,
		int	frameStep,
		int	lastFrame,
		std::string	path2FeatureFile,
		int	nFeatures = `100`,
		int	border_x1 = `-1`,
		int	border_y1 = `-1`,
		int	border_x2 = `-1`,
		int	border_y2 = `-1`,
		bool	replace = `false`,
		int	affineConsistencyCheck = `-1`,
		int	window_size = `11`,
		int	nPyramidLevels = `3`,
		int	subsampling = `2`,
		int	mindist = `10`,
		int	maxDisplacement = `60`,
		int	maxAccel = `50`,
		int	measureError = `10`,
		int	margin = `10`,
		double	max_residue = `24`,
		int	max_iterations = `10`,
		double	min_determinant = `0.01`
	)

complete test function for the KLT lib, and also a nice example of how to use the module

This test function is based on example 3 of the KLT lib, but adds many dynamic parameters and allows fine tuning as in example 5 of the KLT lib. It should let you run many tests in a Tcl shell without recompiling or writing a long Tcl macro. Apart from imaRead, firstFrame, frameStep, lastFrame and path2FeatureFile, all parameters have default values if you omit them, and you can always pass -1 for a parameter to use its default value.

Parameters:

imaRead	ImageReader defining the path and filename template of the image sequence
firstFrame	index of the first frame to use
frameStep	step used to go through the sequence (can be negative)
lastFrame	index of the last frame to use
path2FeatureFile	path to the features file containing the tracks extracted by KLT
nFeatures	number of features to select in the first image
border_x1	the x of top left corner of the area where you want to select features in the first image
border_y1	the y of top left corner of the area where you want to select features in the first image
border_x2	the x of bottom right corner of the area where you want to select features in the first image
border_y2	the y of bottom right corner of the area where you want to select features in the first image
replace	if 1, lost features are replaced by new ones, so that nFeatures features are always kept
affineConsistencyCheck	if 1, the initial pattern of features is compared with its current pattern to verify consistency (please refer to the KLT lib documentation for more details)
window_size	corresponds to window_width = window_height of the KLT lib (please refer to the KLT lib documentation)
nPyramidLevels	the number of levels of subsampling in the pyramid (please refer to the KLT lib documentation)
subsampling	the rate of the subsampling pyramid (resolution is divided by `subsampling` at each level of the pyramid) (please refer to the KLT lib documentation)
mindist	the minimum distance between features for selection (or replacement) (please refer to the KLT lib documentation)
maxDisplacement	the maximum displacement in pixels from one frame to the next, determining the size of the processing area
maxAccel	the maximum acceleration of the object in pixels per frame^2, determining the error of the Kalman model (which assumes constant speed)
measureError	the confidence in the measure (barycenter of features representing the object)
margin	a security margin to increase the size of the processing area
max_residue	maximum residue, averaged by pixel, before declaring tracking failure (please refer to the KLT lib documentation)
max_iterations	maximum number of iterations of the minimum search before declaring tracking failure (please refer to the KLT lib documentation)
min_determinant	min determinant of the pattern window for declaring tracking failure (please refer to the KLT lib documentation)

void jafar::klt::oldkltTest	(	const char *	path,
		int	firstFrame = `0`,
		int	nFrames = `1`,
		int	nFeatures = `100`,
		int	border_x1 = `-1`,
		int	border_y1 = `-1`,
		int	border_x2 = `-1`,
		int	border_y2 = `-1`,
		bool	replace = `false`,
		int	affineConsistencyCheck = `-1`,
		int	window_size = `11`,
		int	nPyramidLevels = `3`,
		int	subsampling = `2`,
		int	mindist = `10`,
		int	maxDisplacement = `60`,
		int	maxAccel = `50`,
		int	measureError = `10`,
		int	margin = `10`,
		double	max_residue = `24`,
		int	max_iterations = `10`,
		double	min_determinant = `0.01`
	)

complete test function for the KLT lib, and also a nice example of how to use the module

This test function is based on example 3 of the KLT lib, but adds many dynamic parameters and allows fine tuning as in example 5 of the KLT lib. It should let you run many tests in a Tcl shell without recompiling or writing a long Tcl macro. Apart from the path name, all parameters have default values if you omit them, and you can always pass -1 for a parameter to use its default value.

Parameters:

path	the path where source images are stored and where destination images will be written. Please do not forget the / character at the end of the path. If you have many tests to run, you could put your images on a ramfs device and use its path.
firstFrame	index of the first frame to use
nFrames	number of frames over which to track
nFeatures	number of features to select in the first image
border_x1	the x of top left corner of the area where you want to select features in the first image
border_y1	the y of top left corner of the area where you want to select features in the first image
border_x2	the x of bottom right corner of the area where you want to select features in the first image
border_y2	the y of bottom right corner of the area where you want to select features in the first image
replace	if 1, lost features are replaced by new ones, so that nFeatures features are always kept
affineConsistencyCheck	if 1, the initial pattern of features is compared with its current pattern to verify consistency (please refer to the KLT lib documentation for more details)
window_size	corresponds to window_width = window_height of the KLT lib (please refer to the KLT lib documentation)
nPyramidLevels	the number of levels of subsampling in the pyramid (please refer to the KLT lib documentation)
subsampling	the rate of the subsampling pyramid (resolution is divided by `subsampling` at each level of the pyramid) (please refer to the KLT lib documentation)
mindist	the minimum distance between features for selection (or replacement) (please refer to the KLT lib documentation)
maxDisplacement	the maximum displacement in pixels from one frame to the next, determining the size of the processing area
maxAccel	the maximum acceleration of the object in pixels per frame^2, determining the error of the Kalman model (which assumes constant speed)
measureError	the confidence in the measure (barycenter of features representing the object)
margin	a security margin to increase the size of the processing area
max_residue	maximum residue, averaged by pixel, before declaring tracking failure (please refer to the KLT lib documentation)
max_iterations	maximum number of iterations of the minimum search before declaring tracking failure (please refer to the KLT lib documentation)
min_determinant	min determinant of the pattern window for declaring tracking failure (please refer to the KLT lib documentation)

Note:: Image file names must follow a fixed syntax. Source images must have a 'pgm' extension and be named img100.pgm, img101.pgm ... Destination images will be feat100.tiff, feat101.tiff ..., and the representation of the Kalman characteristics will be in featK100.tiff ...

You can easily rename all your source image files using your shell:

       for ((i=1450;i<1470;i++)) ; do convert image.d.$i.gz img$(($i-1450+100)).pgm ; done

Example :

       using namespace jafar::klt;
       oldkltTest("/mnt/ram/mat2/", 0, 110, 12,  315, 405, 355, 455,  1, -1,  11, 3, 2, 8,  150, 15, 10, 10);

will process /mnt/ram/mat2/img100.pgm, /mnt/ram/mat2/img101.pgm, ..., /mnt/ram/mat2/img209.pgm and create the corresponding /mnt/ram/mat2/feat$i.tiff files, selecting 12 features in the area from (315, 405) to (355, 455) and replacing lost features, with an 11-pixel correlation window, 3 pyramid levels, a subsampling of 2, a mindist of 8, a maxDisplacement of 150, a maxAccel of 15, a measureError of 10 and a margin of 10, the remaining parameters being left at their default values.
