Component reference

Component descriptions

This section provides a short overview of each documented component, and lists their requirements and the results that they provide for other components to use. Some components have intentionally been left undocumented because they are not useful or are still work in progress.

AdaptiveSkinDetector

Overview

Uses the OpenCV Adaptive Skin Detector to detect skin. A binary skin mask is constructed as a result.

Options

None.

Note: There are some common options shared by all skin detectors. These are, however, currently somewhat inaccessible. This will be fixed in the future.

Command-line options

None.

Requirements

None.

Entries provided

AsmTracker

Overview

Tracks the silhouettes of the hands and the head by means of Active Shape Models [3].

Options

Note: The current implementation may be somewhat broken, and some of the options may go unused.

Name Type Description
nlandmarks integer The number of landmark points to place evenly around the object when creating the point distribution model. Default: 60.
alignthreshold double When aligning vectors during Point Distribution Model creation, use this threshold for determining convergence. Default: 0.
sobelaperturesize string Either one of the following integers: 1, 3, 5, or 7, or the string scharr. This option will set the aperture size for the Sobel operator when computing intensity gradients of the image. Default: scharr.
maxlookdistance integer The maximum distance in pixels for target landmarks in relation to the mean shape. Default: 30.
convergencethreshold double Sets a threshold for deciding if the ASM has converged. Convergence is decided by the sum of absolute values of elements in the vector dbt such that the shape of the PDM x is approximately mean(x) + Pt×(bt + dbt), where Pt is a 2N×t PCA matrix. Default value: 1.0.
intensityfactor double The new target gradient value must exceed the value at the old landmark by this factor or the target is not considered clear enough. Default value: 5.0.
minintensity string Either an integer or the string "mean". The absolute value of the gradient at a new target landmark point must exceed this figure to be set. The "mean" setting corresponds to the mean absolute value computed over the whole gradient map. Default value: mean.
pcacomponentcount integer The number of PCA components to use in the fitting. Default value: 3.
maxiterations integer Maximum number of iterations allowed when performing ASM fitting. Default value: 20.
scaleapproximationfactor string A pair of numbers a:b which indicate the amount by which the body part may exceed the dimensions of its blobs in its approximate initial pose, i.e. setting this to 1.2:1.4 would mean that the body part may exceed the width of the blob by a factor of 1.2 and the height by 1.4. Setting either of these to zero will disable scaling approximation altogether. Default value: 1.0:1.0.
maxshapeparameterdeviation double The shape of the body part is determined by a t-vector so that the shape is determined as a sum of the mean shape and the product of the transformation matrix and the shape vector. Each component in the shape vector is associated with an eigenvalue, computed from the covariance matrix of the original sample PDMs. This parameter sets the upper and lower bounds for these components. More precisely, the absolute values of the components in the particular shape vector may not exceed this parameter × the square root of their associated eigenvalues. E.g. if the parameter were set to 3.0, and a component had an eigenvalue, say, 16, the corresponding component in the shape vector could vary between -12 and 12. Default value: 3.0.
shapehistorysize unsigned integer When constructing the mean PDM, use this many previous shapes as reference. Setting this to 0 will use as many shapes as possible. Default value: 0.
blackout boolean If enabled, black out parts of the image deemed unimportant when performing ASM fitting. Default value: true.
initialposeestimationmethod string Sets the method used for selecting the initial pose. Possible values are listed below:
  • probeonly: Always probe for the pose
  • posehistory: Use pose values from previous frame (given that their goodness exceeds the set threshold)
  • poseandshapehistory: Like above, but also set the initial value of the particular shape vector to its previous value.
If shapehistorysize > 0, setting this option to anything other than probeonly may result in unexpected behaviour. Default value: poseandshapehistory.
initialposeestimationgoodnessthreshold double If posehistory or poseandshapehistory is enabled above, their goodness is checked against a binary mask for the body part being fitted. Goodness is defined as the fraction (count of pixels in the intersection of the pixels enclosed within a polygon limited by landmarks of the old shape in its old pose and the pixels in the reference mask) / max(number of pixels in the polygon, number of pixels in the reference mask). The denominator has the max there to prevent huge predicted shapes from overcoming smaller reference shapes (the intersection could be perfect, yet the shape would cover a lot more area than it should). Default value: 0.7.
equalisehistogram boolean Whether to perform histogram equalisation on the grayscale image before computing intensity gradients. Default value: false.
targettype string When choosing new target landmarks, choose those pixels whose intensity gradients:
  • absolute: have the greatest absolute value
  • negative: have the most negative value (ignoring any positive pixels)
  • positive: have the most positive value
Default value: absolute.
targetmethod string Selects the method used for choosing new target landmarks. Possible values include:
  • gradient: use the magnitude of the intensity gradient
  • directional: computes intensity gradients separately for x and y axes and projects the gradient vector on the normal. Then, it is assumed that intensity decreases as we move away from the body part, so pick the pixel with highest intensity (as decided above)
  • canny: uses the Canny edge detector
Default value: directional.
gradientmethod string Sets the method used for approximating the intensity gradient. Possible values include:
  • sobel: The Sobel operator (default), also has a special option to set the aperture size (see above). Can be used always.
  • laplacian: Uses the OpenCV laplacian operator (which uses the Sobel operator) with the aperture size set according to the Sobel operator aperture size option. Laplacian operators cannot be used with directional targets.
  • laplacian4: A 4-neighbourhood Laplacian
  • laplacian8: An 8-neighbourhood Laplacian
  • roberts: Uses the Roberts operator
  • prewitt: The Prewitt operator
  • robinson: The Robinson operator
  • kirsch: The Kirsch operator
Default value: sobel.
cannythreshold1 double Canny hysteresis parameter 1. Default: 150.
cannythreshold2 double Canny hysteresis parameter 2. Default: 100.
alignmaxiterations unsigned integer Maximum number of iterations allowed when aligning vectors during Point Distribution Model creation. Default: 10.
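The maxshapeparameterdeviation bound described above can be illustrated with a small sketch. This is a hypothetical helper, not the actual SLMotion code: each shape vector component is clamped to ± deviation × √eigenvalue.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative sketch: clamp each shape vector component b[i] to
// +/- deviation * sqrt(eigenvalues[i]), as described for the
// maxshapeparameterdeviation option. With deviation = 3.0 and an
// eigenvalue of 16, the component may vary between -12 and 12.
std::vector<double> clampShapeVector(std::vector<double> b,
                                     const std::vector<double>& eigenvalues,
                                     double deviation) {
    for (std::size_t i = 0; i < b.size(); ++i) {
        const double bound = deviation * std::sqrt(eigenvalues[i]);
        if (b[i] > bound)  b[i] = bound;
        if (b[i] < -bound) b[i] = -bound;
    }
    return b;
}
```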

Command-line options

None.

Requirements

Entries provided

BlackBoardDumpReader

Overview

This is a special component that reads a black board dump from a file and stores all the data onto the blackboard.

Options

None.

Command-line options

Name Type Description
--in-dump-file string A comma-separated list of filenames that will be read. If the files have been gzipped, they will be decompressed automatically.

Requirements

None.

Entries provided

Depends entirely on the content of the dump. The component does not produce anything on its own.

BlackBoardDumpWriter

Overview

This is a special component that dumps all data on the black board to a file.

Options

None.

Command-line options

Name Type Description
--out-dump-file string The file to write the dump into. If the file name ends in .gz, the dump is gzipped.

Requirements

None. Everything must go.

Entries provided

None.

BlackBoardExplorer

Overview

This component is exceptional. It presents the user with an interactive (although limited and cumbersome) shell on the command line that may be used to explore the content of the black board. The Graphical User Interface has a fancier, graphical version available, which should be preferred to this one.

Options

None.

Command-line options

None.

Requirements

None.

Entries provided

None.

BlackBoardStdOutWriter

Overview

Dumps the requested properties from the black board to stdout.

Options

None.

Command-line options

Name Type Description
--stdout-dump-select string Comma-separated names of the black board data properties that need to be dumped.

Requirements

None natively. The properties listed in the dump selection should exist on the blackboard, however.

Entries provided

None.

BlobExtractor

Overview

Extracts 4-connected pixel regions (blobs) from the skinmask.

Options

Name Type Description
blobremoval boolean If enabled, removes blobs smaller than the minimum size. Default value: true.
blobs integer The number of largest blobs to preserve, i.e. never return more than this number of blobs. Default value: 3.
minblobsize integer Minimum number of pixels in a blob. If the blob is smaller than this threshold and blobremoval is set, the blob is wiped out. Default value: 3300.

Command-line options

None.

Requirements

Entries provided

BlobTracker

Overview

Tracks separate skin blobs.

Options

None.

Command-line options

None.

Requirements

Entries provided

BodyPartCollector

Overview

This component takes in the blobs extracted earlier, assigns them identities as one or more of the following body parts: the head, the left hand, the right hand. These are then stored as a list.

Options

None.

Command-line options

None.

Requirements

Entries provided

BodyPartsFromBlobTracks

Overview

This component uses blobtracks to generate a list of body parts.

Options

None.

Command-line options

None.

Requirements

Entries provided

ColourSpaceConverter

Overview

This component takes in input frames and converts them to the desired colour space.

Options

Name Type Description
colourspace string Target colour space. The following values are supported:
  • gray Grayscale (0.299×R + 0.587×G + 0.114×B)
  • bgr BGR. This is actually a no-op because the input is already BGR.
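The grayscale conversion weights quoted above can be written out directly. This is only an illustration of the formula; the real component delegates the conversion to OpenCV:

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch of the grayscale formula quoted above:
// gray = 0.299*R + 0.587*G + 0.114*B. The weights sum to 1, so a
// uniform pixel keeps its value.
double toGray(double r, double g, double b) {
    return 0.299 * r + 0.587 * g + 0.114 * b;
}
```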

Command-line options

None.

Requirements

None.

Entries provided

CsvImporter

Overview

The component reads in data from CSV files and stores it on the blackboard as floating point vectors.

Options

Name Type Description
filenames string A comma-separated list of filenames to read.
fields string A semicolon-separated list of field specifications. Each specification corresponds to its respective CSV file (i.e. the number of specifications and input CSV files must match). Each specification consists of a comma-separated list of key-and-list-of-indices pairs; the list of indices is separated from the key by a colon, and the indices are likewise separated from one another by colons. The indices are 0-based column indices of the values on CSV rows. An entry may be either a single number or a dash-separated range of numbers. "framenr" is a special key: it is interpreted as holding the 0-based frame number of the corresponding row and must always be a single index. If framenr is not specified, each row is assumed to have originated from the frame with the same 0-based row number, unless the whole file contains only one row, in which case the values are assumed to be global. An example of a valid string of specifications: framenr:0,head_pos:1-3,lefthand_pos:4:5:6,righthand_pos:7-8:9; global_x:0,global_y:1,global_z:2
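The expansion of a single specification can be sketched as follows. This is a simplified, illustrative parser, not the SLMotion implementation; it only demonstrates how keys, colon-separated indices, and dash ranges expand into 0-based column index lists (the framenr semantics are omitted):

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Parse a single field specification such as "head_pos:1-3,lefthand_pos:4:5:6"
// into a map from key to expanded column indices.
std::map<std::string, std::vector<int>> parseSpec(const std::string& spec) {
    std::map<std::string, std::vector<int>> result;
    std::istringstream pairs(spec);
    std::string pair;
    while (std::getline(pairs, pair, ',')) {        // pairs are comma-separated
        std::istringstream parts(pair);
        std::string key, token;
        std::getline(parts, key, ':');              // text before the first colon is the key
        std::vector<int> indices;
        while (std::getline(parts, token, ':')) {   // remaining tokens are indices
            std::size_t dash = token.find('-');
            if (dash == std::string::npos) {
                indices.push_back(std::stoi(token));
            } else {                                // dash-separated inclusive range
                int lo = std::stoi(token.substr(0, dash));
                int hi = std::stoi(token.substr(dash + 1));
                for (int i = lo; i <= hi; ++i) indices.push_back(i);
            }
        }
        result[key] = indices;
    }
    return result;
}
```

With this sketch, "head_pos:1-3" and "lefthand_pos:4:5:6" both expand to three column indices, matching the equivalence of ranges and explicit lists in the example above.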

Command-line options

Name Type Description
--in-csv-file string A comma-separated list of filenames to read.
--csv-fields string A semicolon-separated list of field specifications. Each specification corresponds to its respective CSV file (i.e. the number of specifications and input CSV files must match). Each specification consists of a comma-separated list of key-and-list-of-indices pairs; the list of indices is separated from the key by a colon, and the indices are likewise separated from one another by colons. The indices are 0-based column indices of the values on CSV rows. An entry may be either a single number or a dash-separated range of numbers. "framenr" is a special key: it is interpreted as holding the 0-based frame number of the corresponding row and must always be a single index. If framenr is not specified, each row is assumed to have originated from the frame with the same 0-based row number, unless the whole file contains only one row, in which case the values are assumed to be global. An example of a valid string of specifications: framenr:0,head_pos:1-3,lefthand_pos:4:5:6,righthand_pos:7-8:9; global_x:0,global_y:1,global_z:2

Requirements

None.

Entries provided

Key Type Description
Specification-dependent: the key is the same as the name given in the field specification. std::vector<float> Values from each row. The length of the vector is the same as the number of column indices in the specification.

ElmSkinDetector

Overview

A skin detector based on Extreme Learning Machines [7].

Options

Name Type Description
pthreshold double Sets the threshold for the value of estimated skin probability that a pixel should have to be interpreted as a skin pixel. Pixels whose value exceeds this threshold will be classified as skin pixels. Default: 0.75.
blurmagnitude double Sets the amount of Gaussian blur applied to the skin probability distribution before thresholding. Default: 0.

Note: There are some common options shared by all skin detectors. These are, however, currently somewhat inaccessible. This will be fixed in the future.

Command-line options

None.

Requirements

None.

Entries provided

FaceDetector

Overview

This class implements a Viola-Jones [1] style cascade classifier based face detector using Haar features. This particular implementation assumes only one relevant face in the input frame, and requires at least two successive detections spatially nearby in consecutive frames before the detection is accepted. Missing detections are filled in with linear interpolation.

Options

Name Type Description
cascade string Trained OpenCV cascade classifier filename. Default: haarcascade_frontalface_alt.xml
scale string Either a single value or two colon-separated values for scaling the resulting rectangle. The two value version applies the respective coefficients for scaling horizontally and vertically, respectively. The default value is 1.0:1.0.
translate string Two colon-separated integers for fixed translation of the resulting rectangle horizontally and vertically, respectively. The default value is 0:0.
maxmove integer The maximum number of pixels that the face rectangle may move between two frames before the detection is considered invalid. The default value is 25.
maxareachange double The maximal proportional change in the face rectangle area that may occur between two frames before the detection is considered invalid. The default value is 0.5.
minneighbours integer The minimum number of overlapping neighbouring detections required for a detection to be considered valid. Larger number means more accurate detections (in terms of false positives), but at the cost of possibly missing some valid detections. The default value is 2.
scalefactor double This value is passed directly to the OpenCV detection function. It represents how large a jump there is between different scales. Accuracy can be increased by moving this value closer to 1.0, but doing so will increase computation time.

FaceDetector2

Overview

This class implements a Viola-Jones [1] style cascade classifier based face detector using Haar features. This class works almost like FaceDetector, except that instead of linear interpolation, naïve low-pass filtering is used to make up for the missing detections.

Options

Name Type Description
cascade string Trained OpenCV cascade classifier filename. Default: haarcascade_frontalface_alt.xml
scale string Either a single value or two colon-separated values for scaling the resulting rectangle. The two value version applies the respective coefficients for scaling horizontally and vertically, respectively. The default value is 1.0:1.0.
translate string Two colon-separated integers for fixed translation of the resulting rectangle horizontally and vertically, respectively. The default value is 0:0.
maxmove integer The maximum number of pixels that the face rectangle may move between two frames before the detection is considered invalid. The default value is 25.
maxareachange double The maximal proportional change in the face rectangle area that may occur between two frames before the detection is considered invalid. The default value is 0.5.
minneighbours integer The minimum number of overlapping neighbouring detections required for a detection to be considered valid. Larger number means more accurate detections (in terms of false positives), but at the cost of possibly missing some valid detections. The default value is 2.
scalefactor double This value is passed directly to the OpenCV detection function. It represents how large a jump there is between different scales. Accuracy can be increased by moving this value closer to 1.0, but doing so will increase computation time.

Command-line options

None.

Requirements

None.

Entries provided

FaceOcclusionDetector

Overview

This component attempts to detect which pixels in the face area are occluded by the hands.

Options

None.

Command-line options

None.

Requirements

Entries provided

FaceSuviSegmentator

Overview

Takes in face detection results, skin detection results, and facial landmarks, and maps the detected skin areas into places of articulation used in Suvi. The resulting map is a simple matrix where each pixel is given one of the following values:

Options

None.

Command-line options

None.

Requirements

Entries provided

FacialLandmarkDetector

Overview

Extracts facial landmarks from the face area as described in [8].

Options

Name Type Description
enableconfidence boolean If enabled, computes a confidence value for the landmarks (not recommended; it does not do much good). Default: false.
confidencethreshold double If confidence values are enabled and the value is below this figure, the landmarks will not be stored. Default: 0.65.
enablelowpassfilter boolean If enabled, the landmarks will be lowpass filtered. Default: false.
lowpasscutoff double If low pass filtering is enabled, this option will determine the cutoff frequency in terms of N/(value). Default: 8.0.
pruneandinterpolate boolean If enabled, the detection will be post-processed by rejecting detections in frames where their positions differ too much from typical positions. The rejected values are replaced by interpolation. Default: false.

Command-line options

None.

Requirements

Entries provided

FeatureCreator

Overview

In a chain containing KLT and ASM tracking results, constructs the features outlined in [2].

Options

None.

Command-line options

None.

Requirements

Entries provided

GaussianSkinDetector

Overview

Performs Skin Detection using a very simple Gaussian model.

Options

Name Type Description
logdensitythreshold double Sets the threshold for the value of the logarithmic probability density function that a pixel should have to be interpreted as a skin pixel. Pixels whose value exceeds this threshold will be classified as skin pixels. Default value: -14.
kcolours integer Sets the number K of clusters that should be considered when performing K-means. Default value: 2.
rcolours integer Sets the number R ≤ K of clusters that should be considered by the filter, i.e. the R most common clusters will be used for computing R probability distributions. Default value: 1.
trainimage string Train the detector using an image file rather than the face detector. Default value: "".
weights boolean If true, a weighted linear combination of distribution values is used. Otherwise, only maximum value is used. Default value: true.

Note: There are some common options shared by all skin detectors. These are, however, currently somewhat inaccessible. This will be fixed in the future.

Command-line options

None.

Requirements

Entries provided

KinectExample

Overview

This is a very simple example component that can be used to see how multitrack processing is done with components. It simply takes in a frame, weights each pixel by its corresponding depth, multiplies by a constant, and saves the result on the black board.

Options

Name Type Description
constant double A constant to multiply the pixels by.

Command-line options

None.

Requirements

None.

Entries provided

KLTTracker

Overview

This component implements a KLT (Kanade-Lucas-Tomasi) tracker of Shi-Tomasi interest points [5], [6].

Options

Name Type Description
removenonmaskedfeatures boolean If enabled, interest points that have been tracked to a location that is outside the skin mask boundaries are removed. Default value: true.
maxframeerror double Maximal acceptable tracking error between two consequent frames. Default value: 1000.
gfquality double The quality level for finding good features. Default value: 0.01.
gfmindistance double Minimum distance between interest points. Default value: 3.
maxpoints integer Maximum number of points to track. Default value: 1000.
maxmove integer The maximum number of pixels that an interest point may move between two consecutive frames before it is deemed lost. Default value: 30.

Command-line options

None.

Requirements

Entries provided

MultiFacialLandmarkDetector

Overview

Performs the same facial landmark [8] detection as FacialLandmarkDetector, but does so for multiple faces at a time. The resulting number of face landmark vectors may vary from frame to frame.

Options

None.

Command-line options

None.

Requirements

Entries provided

POAAnnotationAid

Overview

A component for testing ideas on template match based head point tracking that doesn't rely on skin detection results.

Options

None.

Command-line options

None.

Requirements

None.

Entries provided

None.

PythonComponent

Overview

The PythonComponent is a metacomponent that can be used to instantiate other components that have been written in Python.

Options

Name Type Description
class string Name of the Python class. The class must be present in the global namespace, it must be instantiable and conform to the Python interface.

Note: Other options are not checked in the Python environment. Instead, they are passed to the constructor of the object that is to be instantiated.

Requirements

Class-dependent, none natively.

Entries provided

Class-dependent, none natively.

RawFaceDetector

Overview

This class implements a Viola-Jones [1] style cascade classifier based face detector using Haar features. This particular implementation simply performs the detection and returns a vector of rectangles which correspond to all detected faces. Therefore, the number of detected faces may vary from frame to frame even if there is only one true face (i.e. the number of false positives may go up) and at the same time missing detections are not compensated for in any way.

Options

Name Type Description
cascade string Trained OpenCV cascade classifier filename. Default: haarcascade_frontalface_alt.xml
minneighbours integer The minimum number of overlapping neighbouring detections required for a detection to be considered valid. Larger number means more accurate detections (in terms of false positives), but at the cost of possibly missing some valid detections. The default value is 2.
scalefactor double This value is passed directly to the OpenCV detection function. It represents how large a jump there is between different scales. Accuracy can be increased by moving this value closer to 1.0, but doing so will increase computation time.

Command-line options

None.

Requirements

None.

Entries provided

RuleSkinDetector

Overview

This is a very simple skin detector based on the rules presented in [4].

Options

Name Type Description
ruleset string Either "a" or "b". "a" corresponds to the following rules:
pixel x = (R,G,B) is skin iff R > 95 AND G > 40 AND B > 20 AND max{R,G,B} - min{R,G,B} > 15 AND |R-G| > 15 AND R > G AND R > B
"b" corresponds to the following rules:
pixel x = (R,G,B) is skin iff R > 220 AND G > 210 AND B > 170 AND |R-G| ≤ 15 AND R > B AND G > B
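The rulesets above translate directly into per-pixel predicates. The following sketch is illustrative, not the component's actual code; the real detector applies the same test to every pixel of the frame:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Ruleset "a" as quoted above: aimed at skin under normal lighting.
bool isSkinRuleA(int r, int g, int b) {
    int mx = std::max({r, g, b});
    int mn = std::min({r, g, b});
    return r > 95 && g > 40 && b > 20 &&
           (mx - mn) > 15 && std::abs(r - g) > 15 && r > g && r > b;
}

// Ruleset "b" as quoted above: aimed at brighter, washed-out tones.
bool isSkinRuleB(int r, int g, int b) {
    return r > 220 && g > 210 && b > 170 &&
           std::abs(r - g) <= 15 && r > b && g > b;
}
```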

Note: There are some common options shared by all skin detectors. These are, however, currently somewhat inaccessible. This will be fixed in the future.

Command-line options

None.

Requirements

None.

Entries provided

SuviOcclusionFeatureCreator

Overview

Given frames with Suvi place of articulation segmentation matrices, and occlusion matrices, computes a number of numerical features e.g. total number of occluded pixels and POA-specific occluded pixel counts.

Options

None.

Command-line options

None.

Requirements

Entries provided

VisualSilenceDetector

Overview

Attempts to detect moments of non-motion. Technically, this is done as follows: for grayscale frame f(t) at moment t, compute the absolute-valued difference images d1 = |f(t)-f(t-1)| and d2 = |f(t+1)-f(t)|. Then form an elementwise maximal union of the two, i.e. a difference image d where each element d(i,j) = max{d1(i,j),d2(i,j)}. Apply binary thresholding, i.e. form a new image d' where d'(i,j) = 255 if d(i,j) > ε and 0 otherwise. Reduce the amount of noise by applying morphological erosion. Finally, count the number of non-zero pixels and store the count as a double-precision floating point value.
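The difference-image step can be sketched on flattened frames as follows. This is an illustration of the description above, not the actual component; the erosion step is omitted for brevity and frames are plain integer arrays rather than OpenCV matrices:

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

using Frame = std::vector<int>;  // flattened grayscale frame, one value per pixel

// For each pixel, take the larger of the two consecutive-frame differences
// and compare it against the threshold epsilon; return the number of
// pixels that exceed it (the "moving" pixels).
int countMovingPixels(const Frame& prev, const Frame& cur,
                      const Frame& next, int epsilon) {
    int count = 0;
    for (std::size_t i = 0; i < cur.size(); ++i) {
        int d1 = std::abs(cur[i] - prev[i]);
        int d2 = std::abs(next[i] - cur[i]);
        if (std::max(d1, d2) > epsilon) ++count;
    }
    return count;
}
```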

Options

Name Type Description
threshold double The threshold ε described above. Default value: 10.

Command-line options

None.

Requirements

None.

Entries provided

Known Black Board entry keys and types

Key Type Description
blobtracks std::map<int, std::vector<cv::Rect>> Key to the map is the blobtracker id, while the vector lists all the rectangles belonging to the track.
bodyparts std::list<BodyPart> Detected body parts from a frame (i.e. left hand, right hand, head) as BodyPart objects, representing interconnected 2D pixel regions (blobs).
facedetection cv::Rect Detected face location.
faceocclusionmask cv::Mat A binary matrix of type CV_8UC1 where each non-zero pixel corresponds to a pixel where occlusion is presumed to take place.
facesuvisegments cv::Mat Segmentation matrix of the face of type CV_8UC1 where each element value corresponds to a Suvi place of articulation, or is zero for non-face pixels.
faciallandmarks std::vector<cv::Point2f> Detected facial landmarks as described in [8]
faciallandmarkconfidence double Facial landmark confidence value.
featurevector FeatureVector The global feature vector; the result of all analysis.
gsimg cv::Mat Greyscale image, typically of type CV_8UC1.
headanchorguess cv::Point The presumed location of the bottommost point of the head ASM instance.
headasm Asm The trained ASM model for the head.
headasminstance Asm::Instance Specific instance of the ASM model for the head.
KinectEntry cv::Mat The example entry stored by the KinectExample component, i.e. the input frame weighed with the depth and multiplied by a constant.
klttrackedpoints std::set<TrackedPoint> Points tracked in a frame by the KLT Tracker.
lefthandanchorguess cv::Point The presumed location of the bottommost point of the left hand ASM instance.
lefthandasm Asm The trained ASM model for the left hand.
lefthandasminstance Asm::Instance Specific instance of the ASM model for the left hand.
multifaciallandmarksvector std::vector<std::vector<cv::Point2f>> Detected facial landmarks as described in [8]
rawface cv::Rect "Raw" detected face location, i.e. no interpolation etc. has been applied.
rawfacevector std::vector<cv::Rect> "Raw" detected face locations, i.e. no interpolation etc. has been applied.
righthandanchorguess cv::Point The presumed location of the bottommost point of the right hand ASM instance.
righthandasm Asm The trained ASM model for the right hand.
righthandasminstance Asm::Instance Specific instance of the ASM model for the right hand.
skinblobs std::vector<Blob> Interconnected pixel regions (blobs) extracted from skinmasks.
skinmask cv::Mat Binary matrix representing the skin detection results. Should be CV_8UC1 where non-zero elements correspond to skin pixels.
visualsilence double A value in range [0,1.0] where the lower the value, the less there is movement in the frame.

Writing your own components

Writing C++ components

Overview

Writing C++ components is quite easy. Essentially, what needs to be done is to define a class that derives from the Component class, implement all the necessary virtual functions, and then register the class. Finally, the code must of course be compiled and linked into the program (library). Unfortunately, the current version does not support loading C++ components from external libraries at run time, so the component must be linked in at build time. Luckily, the CMake build process makes this easy.

Implementing the component

The Component base class is abstract, and most of the functions that need to be implemented are pure virtual. The most important function is virtual void process(frame_number_t frameNumber). It defines the actual logic for processing one frame. If the component does not process frames one at a time, a bogus implementation can be made, and -- instead -- virtual bool processRangeImplementation(frame_number_t first, frame_number_t last, UiCallback* uiCallback) may be defined. The default implementation of this function calls process() for each frame in the range from first (inclusive) to last (exclusive). The uiCallback parameter is important; it is a pointer to an object whose operator() can be used to pass information about the progress of the component to the user interface. This function should be called periodically to prevent the user interface from freezing, and to obtain information about whether the process should be terminated (via the return value). The processRangeImplementation() function should return a boolean that tells whether the process was completed successfully. A false value should be returned if something went wrong or the user requested termination.
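The default range-processing behaviour can be illustrated with a toy model. This sketch uses stand-in types (the real base class lives in SLMotion and also invokes the UI callback, which is omitted here):

```cpp
#include <vector>

using frame_number_t = long;  // stand-in for the real typedef

// Toy stand-in illustrating the default processRangeImplementation():
// process() is called once per frame, from `first` (inclusive) to
// `last` (exclusive).
struct ToyComponent {
    std::vector<frame_number_t> processed;  // records which frames were seen

    void process(frame_number_t frameNumber) {
        processed.push_back(frameNumber);
    }

    bool processRangeImplementation(frame_number_t first, frame_number_t last) {
        for (frame_number_t f = first; f < last; ++f)
            process(f);
        return true;  // completed successfully
    }
};
```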

virtual Component* createComponentImpl(const boost::program_options::variables_map& configuration, BlackBoard* blackBoard, FrameSource* frameSource) const is used to actually construct the component instances. The parts of the program that construct components do so by having a map of options and a name for the component, which are then passed to this function (or, more precisely, to another function which then calls this function). The blackBoard and frameSource pointers are simply to be passed to the parent Component(BlackBoard* blackBoard, FrameSource* frameSource) constructor. That is, the constructor that is used to construct the component instance should include a call to this constructor as well. The recommended course of action for this function is to have a simple constructor for the component, then apply whatever options the component may have, and finally return the pointer via the new operator as a copy (i.e. return new [classname]([configuredInstance]);).

The variables_map class is an extension of std::map<std::string,boost::program_options::variable_value>. The values are stored as key-value pairs with string keys. The keys are prefixed with the component short name, followed by a full stop (.). Values are wrappers around boost::any objects, and can be easily accessed through the following call: vm[key].as<TYPE>(). It should be noted that no conversions are made, so the type must be explicitly known at compile time.
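The key-prefixing and typed-access conventions can be mimicked with standard C++ types. The following sketch uses std::any as a rough stand-in for boost::any; it is an analogy, not the Boost API itself:

```cpp
#include <any>
#include <map>
#include <string>

// Rough stand-in for the variables_map lookup described above: keys are
// "<shortname>.<option>", and the stored value must be extracted with its
// exact compile-time type, mirroring vm[key].as<TYPE>() -- no conversions
// are performed, and asking for the wrong type throws.
using OptionMap = std::map<std::string, std::any>;

int getIntOption(const OptionMap& vm, const std::string& key) {
    return std::any_cast<int>(vm.at(key));
}
```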

The options above are specified by two functions: virtual boost::program_options::options_description getConfigurationFileOptionsDescription() const and virtual boost::program_options::options_description getCommandLineOptionsDescription() const. The former specifies the most commonly used options, which can easily be passed either from a configuration file or from a Python script. Each of these options should be prefixed with the component short name and a full stop, as suggested above. The latter should only be used in special cases (for a practical example, see BlackBoardDumpReader.cpp); its options should not be prefixed, and the words should be separated by commas. The command line options will then be available as ordinary UNIX tool options with two preceding dashes in the main executable.

The components are also aware of what items should be on the black board and what items they are going to provide for other components to work on. These are specified as simple sets of strings by virtual property_set_t getRequirements() const and virtual property_set_t getProvided() const.

The remaining functions that need to be implemented are rather trivial and self-explanatory. They mainly provide ways to extract user-readable information about the component. These are virtual std::string getLongDescription() const, virtual std::string getShortDescription() const, virtual std::string getShortName() const, and virtual std::string getComponentName() const. Note that the short name should be unique and should not contain whitespace; it is used to identify the particular component among other components.

Please see KinectExampleComponent.hpp and KinectExampleComponent.cpp for a practical example.

Registering the component

Before the component is available for use, it must be registered. This is done via a special constructor that takes a single bool argument. Typically, the derived class has a dedicated constructor whose only task is to call the Component(bool) constructor; the value passed as the argument is of no importance. The constructor adds a pointer to the instance to a global linked list of component pointers, which is used to look up the construction functions of components by their short names. Therefore, one would typically place a static instance of the class in the component's translation unit, so that the component is registered before main() is called, e.g. static MyDerivedComponent myDerivedComponentRegisterer(true);. The member functions implemented above will then take care of the rest.

Linking the component to the rest of the program

The current CMake script used in the build process automatically adds all .cpp files (excluding the executables) to be built and linked into the main program. Therefore, the only thing that needs to be done is to place the source file of the component amongst the other component source files and rerun CMake.

Adding new supported dumpable types

If you use your own, arbitrary types, chances are you may want to use the built-in black board dumping facilities to store some analysis stages for later use. If that is the case, you will need to add dump functions to your types. Luckily, this should be very easy.

POD types

If your type is a POD type, all you need to do is call the static function BlackBoardDumpWriter::registerAnyWriter() for your type. Everything else is done automatically, unless your data structure contains pointers, in which case you should treat the type as non-POD. Calling this function will instantiate the proper dumping functions. In fact, this will even work if your type is wrapped in STL containers: the function call will recursively instantiate all the helper functions necessary to dump the entire container.

Pointers can be problematic. The default action for POD types is to simply compute the POD structure size and store it verbatim, i.e. pointers are stored as integers. This is probably not what you want because the pointers will be invalid after they have been loaded, so please see the next section for creating an advanced writer function if your type contains pointers.

As an example, consider the following type:

struct Foo {
  int bar1;
  double bar2;
};

If you want to store a vector full of these structures, all you need to do is place the function call BlackBoardDumpWriter::registerAnyWriter<std::vector<Foo>>() somewhere where it is run before a dump is attempted. A good place could be the special boolean constructor of the associated component.

Non-POD types

If your type is non-POD, then you will also need to specify a "dumb writer" for your type. A dumb writer is simply a function that reads the data structure, possibly recursively, and stores it as a sequence of bytes into an output stream.

In the simple case where no additional information is required, you may simply pass a function pointer of type void(*)(std::ostream&, const T&), where T is your type, to the BlackBoardDumpWriter::registerDumbWriter() function. Since we are talking about higher-order functions, lambdas may come in handy.

For example, cv::Rect is a simple non-POD type defined as a four-tuple of integers (x, y, w, h), representing the x and y coordinates of the upper left corner, and the width and height of a rectangle, respectively. Supposing that we want to dump a Foo-->Rect map, we would first need to call BlackBoardDumpWriter::registerAnyWriter<std::map<Foo,cv::Rect>>(), as in the POD case. In addition, we could define the dump function for cv::Rect as follows:

BlackBoardDumpWriter::registerDumbWriter<cv::Rect>([](std::ostream& ofs, const cv::Rect& data) {
    int32_t r[4] = { data.x, data.y, data.width, data.height };
    ofs.write(reinterpret_cast<const char*>(r), sizeof(r));
  });

However, this is not sufficient. We also need to specify a size computation function: because our serialisation was arbitrary, we cannot rely on a simple sizeof expression for non-POD types. This can be achieved as follows:

BlackBoardDumpWriter::registerSizeComputer<cv::Rect>([](const cv::Rect&) { 
    return 4*sizeof(int32_t); 
  });

Recursively dumping complex types

In a case where a complex type is created from other complex types, recursive dumping is the way to go. For instance, let us take a type called Pdm, which consists of a number of matrices. Let us consider the following file format:

// [uint64_t][Mat][Mat][Mat][Point2d]
//     |       |    |    |      |
//     |       |    |    |      +---mean anchor
//     |       |    |    +---eigenvalues
//     |       |    +---eigenvectors
//     |       +---mean shape    
//     +---number n of landmarks   

Serialising all this manually would be tedious. Instead, let us adopt a more sophisticated approach: we can use an alternate function signature which also takes a pointer to the BlackBoardDumpWriter object. We then gain access to its built-in functions that make recursive writing easy. This can be achieved as follows:

    BlackBoardDumpWriter::registerDumbWriter<Pdm>([](BlackBoardDumpWriter* const w, 
                                                     std::ostream& ofs, 
                                                     const Pdm& data) {
        cv::Mat meanShape = data.generateShape(cv::Mat(0,0,CV_64FC1));
        uint64_t nLandmarks = meanShape.rows/2;
        w->dumbWrite(ofs, nLandmarks);
        w->dumbWrite(ofs, meanShape);
        w->dumbWrite(ofs, data.getEigenVectors());
        w->dumbWrite(ofs, data.getEigenValues());
        w->dumbWrite(ofs, data.getMeanAnchor());
      });

The same approach can also be applied to size computation:

    BlackBoardDumpWriter::registerSizeComputer<Pdm>([](const BlackBoardDumpWriter* const w, 
                                                       const Pdm& pdm) {
        return sizeof(uint64_t) + 
          w->getSize(pdm.getMeanAnchor()) + 
          w->getSize(pdm.generateShape(cv::Mat(0,0,CV_64FC1))) +
          w->getSize(pdm.getEigenVectors()) +
          w->getSize(pdm.getEigenValues());
      });

Optional dumping and pointers

Sometimes, a case may arise where several objects hold a pointer to a common object that may not be on the black board on its own. In that case, we may make use of the fact that the BlackBoardDumpWriter can keep a tally of the pointers it has dumped and can dump the pointer identity to the file. The references to a common object can then be retroactively restored by reversing the process and replacing the pointers with pointers to the deserialised instance.

Consider the following case:

the size varies depending if the PDM needs to be stored. The PDM is 
stored ONLY with the first time it is referenced by an ASM object. That
is, the format is as follows:
[uint64_t][uint8_t][?PDM][Mat][PoseParameter][Mat]
     |         |       |    |        |          |
     |         |       |    |        |          +---landmarks
     |         |       |    |        +---pose parameters
     |         |       |    +---shape parameter
     |         |       +---PDM if necessary
     |         +---Bool, true if PDM should be stored
     +---PDM pointer

So, here several Asm::Instance objects point to the same Pdm object. We adopt an approach where the Pdm is stored along one of these instances, but only one (more specifically, the first one encountered). This can be achieved as follows:

    BlackBoardDumpWriter::registerDumbWriter<Asm::Instance>([](BlackBoardDumpWriter* const w, 
                                                               std::ostream& ofs, 
                                                               const Asm::Instance& data) {
        const Pdm& pdm = data.getPdm();
        const void* pdmPointer = &pdm;
        uint64_t pdmPointer64 = reinterpret_cast<uintptr_t>(pdmPointer);
        w->dumbWrite(ofs, pdmPointer64);
        uint8_t shouldStorePdm = !w->hasStoredPointer(pdmPointer);
        w->dumbWrite(ofs, shouldStorePdm);
        if (shouldStorePdm)
          w->dumbWrite(ofs, &pdm);
        w->dumbWrite(ofs, data.getBt());
        w->dumbWrite(ofs, data.getPose());
        w->dumbWrite(ofs, data.getLandmarks());
      });

    
    BlackBoardDumpWriter::registerSizeComputer<Asm::Instance>([](const BlackBoardDumpWriter* const w, 
                                                                 const Asm::Instance& instance) {
        const Pdm& pdm = instance.getPdm();
        const void* pdmPointer = &pdm;
        uint8_t needPdm = !w->hasStoredPointer(pdmPointer);
        return sizeof(uint64_t) + sizeof(uint8_t) + 
          w->getSize(instance.getBt()) + w->getSize(instance.getPose()) +
          w->getSize(instance.getLandmarks()) +
          (needPdm ? w->getSize(&pdm) : 0);
      });

Notice that the pointed objects are stored by passing the pointer to dumbWrite. Perhaps counterintuitively, this does not store just the pointer as such: it also stores the pointed object, and "remembers" the pointer address, so that it can later be queried whether the pointer was stored. In other words, if p is a pointer, then dumbWrite(os, p); is equivalent to dumbWrite(os, (uint64_t)p); dumbWrite(os, *p);.

We still need to be able to tell when pointers should be stored and when not. If the answer is "always", we need not care. However, in a conditional case (where we want to avoid storing the same object multiple times), we need to provide yet another function that reports which pointers should be written, aside from those found on the black board itself. Typically this would be done like this:

BlackBoardDumpWriter::registerConditionalStoredPointersFunction<Asm::Instance>([](BlackBoardDumpWriter const * const w, 
                                                                                  const Asm::Instance& i) {
  if (!w->hasStoredPointer(&i.getPdm()))
    return std::set<const void*>{ &i.getPdm() };
  else
    return std::set<const void*>();
});

Undumping

Performing the reverse operation is somewhat simpler. We need not care about size computation functions because the size information is stored in the dump file headers. Also, if the type is POD and its behaviour is trivial, then the call to registerAnyWriter() already does everything that needs to be done. For types that need to be handled in a particular way, a registration function exists analogously to the writer case. Here is an example:

    BlackBoardDumpReader::registerUnDumper<cv::Rect>([](const BlackBoardDumpReader* const w, 
                                                        std::istream& ifs)->cv::Rect {
      int32_t i[4];
      w->dumbRead(ifs, i);
      return cv::Rect(i[0], i[1], i[2], i[3]);
    });

Pointer handling is also very simple. Consider the reverse of the Asm::Instance example above. There are two specific things that the undump function should do. First, it should check whether it should expect the PDM to be found; this boolean information was stored as an 8-bit value, so it is read back as one. If the PDM was stored, we should restore it and also record the new pointer address to maintain the correspondence. This is done automatically via the undumpSharedPointer function. It is also possible to check whether a given pointer has been restored before (in this case, it should not have been) with the hasNewSharedPtr function. Finally, to create a new shared_ptr instance to the new pointer, we use the getNewSharedPtr function. The data in this example can then be undumped as follows:

    BlackBoardDumpReader::registerUnDumper<Asm::Instance>([](BlackBoardDumpReader* const w, 
                                                             std::istream& ifs) {
        uint64_t pdmPointer;
        w->dumbRead(ifs, pdmPointer);
        uint8_t shouldPdmHaveBeenStored;
        w->dumbRead(ifs, shouldPdmHaveBeenStored);
        if (shouldPdmHaveBeenStored == w->hasNewSharedPtr(pdmPointer))
          throw IOException("Corrupt dump file: A PDM that should not exist "
                            "does indeed exist, or should exist but does not");
        if (shouldPdmHaveBeenStored)
          w->undumpSharedPointer<Pdm>(ifs, pdmPointer);
        std::shared_ptr<Pdm> pdm = w->getNewSharedPtr<Pdm>(pdmPointer);
        cv::Mat bt, landmarks;
        w->dumbRead(ifs, bt);
        Pdm::PoseParameter pose;
        w->dumbRead(ifs, pose);
        w->dumbRead(ifs, landmarks);
        return Asm::Instance(pdm, bt, pose, landmarks);
      });

Writing Python components

Overview

Writing Python components is somewhat simpler. Basic information extraction functions need to be overridden, just as in the C++ case; they are documented in the Python API reference. As noted there, overriding the process() function is enough if the component works on a per-frame basis. If this is not the case, processRange() may be overridden. However, when doing so, callback() should be called frequently to notify the UI of progress and prevent it from freezing, and to react to the possibility that the user may want to terminate the process.

Option processing is handled differently than in C++. This is done by declaring a special constructor __init__(self, opts) (the opts parameter can of course be renamed) where opts is a dictionary whose keys are strings and values typically strings, ints, floats, or bools. Unlike in the C++ case, the keys are not prefixed. This constructor should also call the PythonComponentBase.__init__(self) constructor which creates the corresponding C++ instance of the component. For a practical example, please see PythonExampleComponent.py.

Adding Python components to the component chain

The component chain only accepts C++ components natively, so in order to inject Python components into the chain, a special meta-component called PythonComponent must be used. The component specifies one special option: class. This option contains the name of the Python class that is to be instantiated. All other options are passed to the constructor of that instance in a similar dictionary. E.g. the example component could be created as follows:

setComponents([Component('PythonComponent', {'class': 'PythonExampleComponent',
                                             'multiplier': 0.5,
                                             'addend': 0.5})])

References

[1] Viola, P. & Jones, M. 2001. Rapid Object Detection Using a Boosted Cascade of Simple Features. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '01), pp. 511-518.

[2] Karppa, M. & Jantunen, T. & Koskela, M. & Laaksonen, J. & Viitaniemi, V. 2011. Method for visualisation and analysis of hand and head movements in sign language video. In C. Kirchhof, Z. Malisz, and P. Wagner, editors, Proceedings of the 2nd Gesture and Speech in Interaction conference (GESPIN 2011), Bielefeld, Germany, 2011. Available online at http://coral2.spectrum.uni-bielefeld.de/gespin2011/final/Jantunen.pdf.

[3] Cootes, T. & Cooper, D & Taylor, C. & Graham, J. 1995. Active Shape Models - Their training and application. Computer Vision and Image Understanding, 61(1):38–59.

[4] Kovač, J. & Peer, P. & Solina, F. 2003. Human Skin Colour Clustering for Face Detection. In EUROCON 2003. Computer as a Tool. pp. 144-148.

[5] Lucas, B. & Kanade, T. 1981. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674-679.

[6] Shi, J. & Tomasi, C. 1994. Good features to track. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '94), pp. 593-600.

[7] Huang, G. & Zhu, Q. & Siew, C. 2006. Extreme learning machine: Theory and applications. Neurocomputing, 70(1-3), pp.489-501.

[8] Uřičář, M. & Franc, V. & Hlaváč, V. 2012. Detector of Facial Landmarks Learned by the Structured Output SVM. In VISAPP '12: Proceedings of the 7th International Conference on Computer Vision Theory and Applications.