Image Segmentation API

Bases: VisionAPI

VisionSegmentationAPI extends VisionAPI to provide image segmentation functionality, including panoptic, instance, and semantic segmentation. The segmentation task performed depends on the loaded model's capabilities and the subtask specified in the request.
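
Requests carry the image and the desired subtask in a JSON body. An illustrative payload is shown below (field names follow the CURL examples further down; the placeholder values are not real data, and the subtask is one of "panoptic", "instance", or "semantic"):

{"image_base64": "<base64-encoded-image>", "subtask": "panoptic"}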

Methods

segment_image(self): Processes an image for segmentation and returns the segmentation masks along with labels.

Example CLI Usage:

genius VisionSegmentationAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    listen \
        --args \
            model_name="facebook/mask2former-swin-large-mapillary-vistas-semantic" \
            model_class="Mask2FormerForUniversalSegmentation" \
            processor_class="AutoImageProcessor" \
            device_map="cuda:0" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            max_memory=None \
            torchscript=False \
            compile=False \
            flash_attention=False \
            better_transformers=False \
            endpoint="*" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

__init__(input, output, state, **kwargs)

Initializes the VisionSegmentationAPI with configurations for input, output, and state management, along with any model-specific parameters for segmentation tasks.

Parameters:

    input (BatchInput): Configuration for the input data. Required.
    output (BatchOutput): Configuration for the output data. Required.
    state (State): State management for the API. Required.
    **kwargs: Additional keyword arguments for extended functionality. Defaults to {}.
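
For programmatic use rather than the CLI, the constructor can be called directly. The following is a minimal sketch only: the import paths and the constructor arguments of BatchInput, BatchOutput, and State are assumptions and may differ across geniusrise versions; the keyword arguments passed to listen mirror the --args in the CLI example above.

# Hypothetical programmatic setup mirroring the CLI example above.
# NOTE: import paths and the BatchInput/BatchOutput/State constructor signatures
# are assumptions; verify them against your installed geniusrise version.
from geniusrise import BatchInput, BatchOutput, State
from geniusrise_vision import VisionSegmentationAPI

api = VisionSegmentationAPI(
    input=BatchInput("./input", "", ""),      # local input folder; bucket/prefix unused here (assumed signature)
    output=BatchOutput("./output", "", ""),   # local output folder (assumed signature)
    state=State(),                            # assumed in-memory state, the "none" choice in the CLI
)

# Keyword arguments correspond to the --args of the CLI "listen" invocation.
api.listen(
    model_name="facebook/mask2former-swin-large-mapillary-vistas-semantic",
    model_class="Mask2FormerForUniversalSegmentation",
    processor_class="AutoImageProcessor",
    device_map="cuda:0",
    use_cuda=True,
    precision="float",
    endpoint="*",
    port=3000,
    cors_domain="http://localhost:3000",
    username="user",
    password="password",
)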

segment_image()

Endpoint for segmenting an image according to the specified subtask (panoptic, instance, or semantic segmentation). It decodes the base64-encoded image, runs it through the model, and returns the segmentation masks, base64-encoded, along with their labels and scores (where applicable).

The method supports dynamic task inputs for models requiring specific task descriptions and applies different post-processing techniques based on the subtask.

Returns:

    List[Dict[str, Any]]: A list of dictionaries, each containing a 'label', a 'score' (if applicable),
        and a 'mask' (base64-encoded image of the segmentation mask).
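
Illustratively, a response body has the following shape (the label and score values below are placeholders; actual labels depend on the loaded model, and 'score' is only present where the subtask produces one):

[
  {"label": "<class name>", "score": 0.97, "mask": "<base64-encoded mask image>"},
  {"label": "<class name>", "score": 0.95, "mask": "<base64-encoded mask image>"}
]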

Raises:

    Exception: If an error occurs during image processing or segmentation.

Example CURL Request:

curl -X POST localhost:3000/api/v1/segment_image \
    -H "Content-Type: application/json" \
    -u user:password \
    -d '{"image_base64": "<base64-encoded-image>", "subtask": "panoptic"}'

or to save all masks:

(base64 -w 0 guy.jpg | awk '{print "{\"image_base64\": \""$0"\", \"subtask\": \"semantic\"}"}' > /tmp/image_payload.json)
curl -X POST http://localhost:3000/api/v1/segment_image \
    -H "Content-Type: application/json" \
    -u user:password \
    -d @/tmp/image_payload.json \
    | jq -r '.[] | .mask + " " + .label' \
    | while read mask label; do echo $mask | base64 --decode > "${label}.jpg"; done
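
The same request can be issued from Python. This is a minimal sketch assuming the server from the CLI example above is running on localhost:3000 with the configured basic-auth credentials and that the requests package is installed:

import base64
import requests

# Base64-encode the input image (the same guy.jpg used in the shell example).
with open("guy.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Endpoint, payload fields, and credentials mirror the CURL examples above.
response = requests.post(
    "http://localhost:3000/api/v1/segment_image",
    json={"image_base64": image_b64, "subtask": "semantic"},
    auth=("user", "password"),
)
response.raise_for_status()

# Each result carries a 'label', an optional 'score', and a base64-encoded 'mask'.
for result in response.json():
    with open(f"{result['label']}.jpg", "wb") as f:
        f.write(base64.b64decode(result["mask"]))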