We describe our four-camera multibaseline stereo system in a convergent configuration and our implementation of a parallel depth recovery scheme for this system. Our system is capable of image capture at video rate, which is critical in applications that require three-dimensional tracking. We obtain dense stereo depth data by projecting a light pattern of frequency-modulated, sinusoidally varying intensity onto the scene, thus increasing the local discriminability at each pixel and facilitating matches. In addition, we make the most of the camera view areas by converging them on a volume of interest. Results indicate that we are able to extract stereo depth data that are, on average, less than 1 mm in error at distances between 1.5 and 3.5 m from the cameras.
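
As an illustration of the projected pattern mentioned above, the following is a minimal sketch of one row of a frequency-modulated sinusoidal intensity profile. The function name `fm_sinusoid_pattern`, the parameters `carrier_cycles`, `mod_cycles`, and `mod_index`, and the specific phase formula are assumptions chosen for illustration; they are not taken from the system described here, whose exact pattern parameters are not given in this abstract.

```python
import math


def fm_sinusoid_pattern(width, carrier_cycles=20.0, mod_cycles=2.0, mod_index=3.0):
    """Return one row of intensities in [0, 1] (hypothetical parameters).

    The phase is a linear carrier term plus a sinusoidal modulation term,
    so the local spatial frequency varies across the row. Because of this
    variation, neighbouring periods of the pattern differ from one another,
    which is the property that helps disambiguate matches.
    """
    row = []
    for x in range(width):
        u = x / width  # normalised horizontal position in [0, 1)
        phase = (2.0 * math.pi * carrier_cycles * u
                 + mod_index * math.sin(2.0 * math.pi * mod_cycles * u))
        row.append(0.5 + 0.5 * math.sin(phase))  # map [-1, 1] to [0, 1]
    return row


if __name__ == "__main__":
    row = fm_sinusoid_pattern(512)
    print(len(row), min(row), max(row))
```

Repeating such a row down the image would give a vertical-stripe pattern; a plain sinusoid of fixed frequency would instead repeat exactly every period and leave matches ambiguous along the stripe direction.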