Temporal task segmentation
To analyze the sensory data recorded over an entire task, we first divide the
sequence into constituent parts that can then be analyzed individually. The
basic phases that compose a grasping task are the pregrasp, grasp, and
manipulation phases.
The sensory data that are available from our observation module are the human
finger joint angles and the 3D pose (position and orientation) of the hand
relative to a global coordinate system. Two questions then arise: What
features can be used as discriminants in segmenting the task? And how can
these features be used to segment it?
To answer these questions, we refer to the literature on human hand motion,
primarily from psychology. Many of these studies concentrate on the reaching
action (i.e., the pregrasp phase) of the human hand and on the effect of
differing object sizes and visual conditions (occluded or unoccluded view).
The two most frequently used features are the hand speed and the grip aperture
(which is the distance between the tips of the thumb and the index finger).
Typical profiles of these features are shown below.
As can be seen, both the speed and grip aperture profiles have the
characteristic inverted bell shape. This is not too surprising, since the
hand undergoes an acceleration phase and a deceleration phase in reaching for
an object; in addition, the fingers open in anticipation of the grasp, to a
maximum aperture that should be greater than the width of the object at the
intended grasp positions.
It is interesting to note that the peak of the grip aperture profile normally
occurs after that of the speed profile.
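As a rough illustration of these two features, the following is a minimal sketch rather than the implementation used in the studies cited. It assumes that the hand position and the thumb and index fingertip positions are already available as 3D points (fingertip positions would have to be derived from the joint angles and hand pose), that samples are taken at a fixed interval `dt`, and that speed is approximated by a finite difference. The function names are hypothetical.

```python
import math

def hand_speed(positions, dt):
    """Approximate hand speed from consecutive 3D positions sampled every dt."""
    # Finite difference: distance travelled between samples divided by dt.
    return [math.dist(p, q) / dt for p, q in zip(positions, positions[1:])]

def grip_aperture(thumb_tip, index_tip):
    """Distance between the tip of the thumb and the tip of the index finger."""
    return math.dist(thumb_tip, index_tip)

# Example with made-up sample points (assumed units: metres and seconds).
positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
speeds = hand_speed(positions, dt=0.5)
aperture = grip_aperture((0.0, 0.0, 0.0), (0.0, 3.0, 4.0))
```

Evaluating both functions at every sample of a reaching motion gives the two profiles discussed above.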
Features for task segmentation
In light of the evidence offered by studies on human hand motion, we chose the
following features:
- Fingertip polygon area
  - The fingertip polygon is the polygon whose vertices are the fingertips.
  - Its area is an indication of the width of the grip aperture.
- Speed of hand motion
- Volume sweep rate
  - The volume sweep rate is the product of the first two features.
    It measures the rate of change in both the fingertip polygon area and the
    speed in 3D space. It turns out to be more effective in segmenting the
    task than the first two features. The physical interpretation of the
    volume sweep rate is shown below.
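The features above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how the polygon area is computed when the fingertips are not coplanar, so the sketch assumes the fingertips are given in order around the polygon and sums the areas of a triangle fan anchored at the first fingertip. The function names are hypothetical.

```python
import math

def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def fingertip_polygon_area(tips):
    """Area of the polygon whose vertices are the fingertips, taken in order.

    Assumption: the polygon is split into a triangle fan at tips[0].
    """
    area = 0.0
    for i in range(1, len(tips) - 1):
        n = _cross(_sub(tips[i], tips[0]), _sub(tips[i + 1], tips[0]))
        area += 0.5 * math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    return area

def volume_sweep_rate(area, speed):
    """Product of the fingertip polygon area and the hand speed."""
    return area * speed

# Example: four fingertips at the corners of a unit square, hand speed 2.0.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
area = fingertip_polygon_area(square)
rate = volume_sweep_rate(area, 2.0)
```

With area in square metres and speed in metres per second, the product has units of volume per unit time, which is consistent with the feature's name.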
The segmentation algorithm
The assumptions made are:
- The pregrasp and manipulation phases interleave.
  - This means that reaching for an object is a purposeful action, i.e., the
    hand does not suddenly change its motion midway to reach for another
    object before the initially targeted object is grasped.
- There are no rapid or jerking motions.
  - There are several reasons for this. First, the acquisition or
    observation module would not be able to sample such motion fast enough.
    Second, the characteristic inverted bell-shaped profiles may be violated.
    Third, the type of grasp (which is a dynamic one) may not be recognized
    by the taxonomy; such a grasp is beyond the scope of our present study.
- Profiles within pregrasp phases resemble parabolas.
  - This assumption is reasonable in light of the empirical results reported
    in various human hand motion studies.
The segmentation algorithm is relatively simple. It comprises the following
steps:
- Hypothesize the task breakpoints separating the phases.
  - The task breakpoints are initially hypothesized from the local minima of
    the hand speed profile.
- Calculate the RMS error of fitting parabolas to hypothesized pregrasp
  curves (using the volume sweep rate profile).
  - For each set of breakpoints, the mean of this RMS fit error is
    calculated.
- Find the combination of the breakpoints which yields the minimum
  mean RMS fit error.
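The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation, and several details are assumptions that the text does not state: the number of breakpoints `n_breaks` is supplied by the caller, the task starts with a pregrasp phase and the phases then alternate, the grasp phase is not modelled separately, and every segment must contain at least `min_len` samples (otherwise a three-sample segment would fit a parabola exactly). All function and parameter names are hypothetical.

```python
import math
from itertools import combinations

def local_minima(profile):
    """Indices of interior samples that are local minima of the profile."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i - 1] > profile[i] <= profile[i + 1]]

def parabola_rms(y):
    """RMS residual of the least-squares parabola through the samples y."""
    n = len(y)
    x = [i - (n - 1) / 2 for i in range(n)]   # centred, so odd moments vanish
    s2 = sum(v ** 2 for v in x)
    s4 = sum(v ** 4 for v in x)
    t0 = sum(y)
    t1 = sum(v * w for v, w in zip(x, y))
    t2 = sum(v * v * w for v, w in zip(x, y))
    det = n * s4 - s2 * s2                    # non-zero when n >= 3
    a = (s4 * t0 - s2 * t2) / det             # constant term
    b = t1 / s2                               # linear term
    c = (n * t2 - s2 * t0) / det              # quadratic term
    sq = sum((w - (a + b * v + c * v * v)) ** 2 for v, w in zip(x, y))
    return math.sqrt(sq / n)

def segment_task(speed, sweep, n_breaks, min_len=5):
    """Return (mean RMS error, breakpoints) of the best combination, or None."""
    min_len = max(min_len, 3)                 # a parabola needs three samples
    best = None
    for combo in combinations(local_minima(speed), n_breaks):
        bounds = [0, *combo, len(sweep) - 1]
        spans = list(zip(bounds, bounds[1:]))
        if any(end - start + 1 < min_len for start, end in spans):
            continue
        pregrasp = spans[0::2]                # assumed: phases alternate
        err = sum(parabola_rms(sweep[s:e + 1]) for s, e in pregrasp)
        err /= len(pregrasp)
        if best is None or err < best[0]:
            best = (err, combo)
    return best
```

The search is exhaustive over combinations of candidate minima, which is adequate for a handful of candidates but grows quickly with their number.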
To illustrate, consider the two sets of hypothesized breakpoints below. The
first one yields a good fit, as can be seen (the e's are the RMS errors of
parabolic fit to the volume sweep rate profile during the hypothesized
pregrasp phase). The result of a bad choice of breakpoints can also be seen in
the second example.
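One way to write the quantities being compared is given below. Only the name e for the RMS error comes from the text; the remaining symbols are assumptions introduced for illustration. Here v_{j,i} is the i-th volume sweep rate sample in the j-th hypothesized pregrasp phase, t_{j,i} is its sample time, N_j is the number of samples in that phase, (a_j, b_j, c_j) are the least-squares parabola coefficients, and M is the number of hypothesized pregrasp phases.

```latex
e_j = \sqrt{\frac{1}{N_j} \sum_{i=1}^{N_j}
      \bigl( v_{j,i} - (a_j t_{j,i}^2 + b_j t_{j,i} + c_j) \bigr)^2 },
\qquad
\bar{e} = \frac{1}{M} \sum_{j=1}^{M} e_j
```

Under this notation, a good choice of breakpoints gives a small mean error \bar{e}, and the selected combination is the one that minimizes it.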
- S.B. Kang and K. Ikeuchi, "Determination of motion breakpoints in a task
sequence from human hand motion," to appear in Proc. IEEE Int'l Conf. on
Robotics and Automation, San Diego, CA, May 1994.
- S.B. Kang and K. Ikeuchi, Temporal segmentation of tasks from human
Tech. Rep. CMU-CS-93-150, Carnegie Mellon University, April 1993.