For segmentation problems (type=pixelmask), the input is an image, and the target output is an image of the same size in which each pixel is assigned to a category. In the image below, taken from the KITTI dataset, each pixel is assigned to an object category (sidewalk, road, car, etc.):


The manifest file contains paths to the input image, as well as the target image:

@FILE       FILE
/image_dir/img1.jpg /mask_dir/mask1.png
/image_dir/img2.jpg /mask_dir/mask2.png
/image_dir/img3.jpg /mask_dir/mask3.png
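
A manifest like the one above can be generated with a short script. The following is a minimal sketch, not part of the library: the helper name `write_manifest` is hypothetical, and it assumes the two columns are separated by a tab, which the listing above does not make explicit.

```python
import csv
import io


def write_manifest(pairs, stream, delimiter="\t"):
    """Write a header line followed by one image/mask path pair per line.

    The tab delimiter is an assumption; adjust it if your manifest
    format uses a different separator.
    """
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["@FILE", "FILE"])
    for image_path, mask_path in pairs:
        writer.writerow([image_path, mask_path])


# Example paths mirroring the manifest shown above.
pairs = [(f"/image_dir/img{i}.jpg", f"/mask_dir/mask{i}.png") for i in range(1, 4)]

buffer = io.StringIO()
write_manifest(pairs, buffer)
manifest_text = buffer.getvalue()
print(manifest_text)
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `stream`.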

Note that the target image should have a single channel. If it has multiple channels, only the first channel is used. The image parameters are the same as above, and the pixelmask itself has no additional configuration options. Photometric and lighting transformations are applied to the input image only, not to the pixel mask. The same cropping, flipping, and rotation settings are applied to both the image and the mask.
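
The rules above can be illustrated with a small NumPy sketch. This is not the library's implementation: `first_channel` and `augment_pair` are hypothetical helpers, a horizontal flip stands in for the geometric transformations, and a simple brightness scaling stands in for the photometric ones.

```python
import numpy as np


def first_channel(mask):
    """Return a single-channel (H, W) mask, keeping only channel 0 if needed."""
    return mask[..., 0] if mask.ndim == 3 else mask


def augment_pair(image, mask, flip, brightness):
    """Apply a shared geometric transform and an image-only photometric one."""
    mask = first_channel(mask)
    if flip:
        # Geometric transform: applied identically to image and mask
        # so that pixels and labels stay aligned.
        image = image[:, ::-1]
        mask = mask[:, ::-1]
    # Photometric transform: applied to the image only, because scaling
    # the mask would corrupt the category labels.
    image = np.clip(image.astype(np.float32) * brightness, 0, 255).astype(np.uint8)
    return image, mask


# A 2x3 RGB image and a 2x3 two-channel mask whose first channel holds labels.
image = np.full((2, 3, 3), 100, dtype=np.uint8)
mask = np.zeros((2, 3, 2), dtype=np.uint8)
mask[..., 0] = [[0, 1, 2], [0, 1, 2]]
mask[..., 1] = 9  # ignored: only the first channel is used

aug_image, aug_mask = augment_pair(image, mask, flip=True, brightness=1.5)
```

After the call, `aug_mask` contains the flipped labels unchanged in value, while `aug_image` is both flipped and brightened.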

Name                  Default    Description
height (uint)         Required   Height of provisioned image (pixels)
width (uint)          Required   Width of provisioned image (pixels)
name (string)         ""         Name prepended to the output buffer name
channels (uint)       3          Number of channels in input image
output_type (string)  "uint8_t"  Output data type
seed (int)            0          Random seed
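
As a rough sketch of how these parameters might be assembled, the helper below fills in the defaults from the table and rejects missing or unknown entries. The function `make_pixelmask_config`, the dictionary layout, and the placement of a `"type"` key are assumptions made for illustration; this section does not show how the configuration is actually passed to the loader.

```python
import json

# Defaults copied from the parameter table; None marks a required parameter.
PIXELMASK_DEFAULTS = {
    "height": None,
    "width": None,
    "name": "",
    "channels": 3,
    "output_type": "uint8_t",
    "seed": 0,
}


def make_pixelmask_config(**params):
    """Merge user parameters with the table defaults and validate them."""
    unknown = set(params) - set(PIXELMASK_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    config = {**PIXELMASK_DEFAULTS, **params}
    missing = [key for key, value in config.items() if value is None]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    # Assumption: the provider type is stored alongside the parameters.
    return {"type": "pixelmask", **config}


config = make_pixelmask_config(height=256, width=512)
config_json = json.dumps(config, indent=2)
print(config_json)
```

Only `height` and `width` need to be supplied; every other entry falls back to the default listed in the table.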

The buffers provisioned to the model are:

Buffer Name  Shape      Description
pixelmask    (N, H, W)  Target pixel image.
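
To make the buffer shape concrete, the sketch below stacks N single-channel masks of size H x W into one (N, H, W) array. It uses plain NumPy and a hypothetical helper `stack_masks`; it only illustrates the layout and is not how the loader itself builds the buffer.

```python
import numpy as np


def stack_masks(masks, dtype=np.uint8):
    """Stack equally sized (H, W) masks into a single (N, H, W) batch."""
    return np.stack([np.asarray(m, dtype=dtype) for m in masks], axis=0)


# Three 4x6 masks, each filled with a different category index.
masks = [np.full((4, 6), category) for category in range(3)]
batch = stack_masks(masks)
print(batch.shape, batch.dtype)
```

Indexing `batch[i]` returns the (H, W) target mask for the i-th image in the batch.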