Image Segmentation API

Bases: VisionAPI

VisionSegmentationAPI extends VisionAPI to provide panoptic, instance, and semantic image segmentation. Which of these tasks is available depends on the capabilities of the loaded model and on the subtask specified in the request.

Methods

segment_image(self): Processes an image for segmentation and returns the segmentation masks along with labels.

Example CLI Usage:

genius VisionSegmentationAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    listen \
        --args \
            model_name="facebook/mask2former-swin-large-mapillary-vistas-semantic" \
            model_class="Mask2FormerForUniversalSegmentation" \
            processor_class="AutoImageProcessor" \
            device_map="cuda:0" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            max_memory=None \
            torchscript=False \
            compile=False \
            flash_attention=False \
            better_transformers=False \
            endpoint="*" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

`__init__(input, output, state, **kwargs)`

Initializes the VisionSegmentationAPI with configurations for input, output, and state management, along with any model-specific parameters for segmentation tasks.

Parameters:

Name	Type	Description	Default
`input`	`BatchInput`	Configuration for the input data.	required
`output`	`BatchOutput`	Configuration for the output data.	required
`state`	`State`	State management for the API.	required
`**kwargs`		Additional keyword arguments for extended functionality.	`{}`

`segment_image()`

Endpoint for segmenting an image according to the specified subtask (panoptic, instance, or semantic segmentation). It decodes the base64-encoded image, processes it through the model, and returns the segmentation masks along with labels and scores (if applicable) in base64 format.

The method supports dynamic task inputs for models requiring specific task descriptions and applies different post-processing techniques based on the subtask.
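As a rough illustration of the request this endpoint expects, the sketch below builds the JSON payload in Python. The helper name `build_segment_request` is hypothetical. The URL, the field names `image_base64` and `subtask`, and the basic-auth credentials are taken from the CLI and curl examples on this page, and the list of accepted subtasks is assumed to be exactly the three named above.

```python
import base64
import json
import urllib.request

# Subtasks named in this page's description; assumed to be the full set.
VALID_SUBTASKS = {"panoptic", "instance", "semantic"}


def build_segment_request(
    image_bytes,
    subtask="panoptic",
    base_url="http://localhost:3000",  # matches port=3000 in the CLI example
    username="user",                   # matches username="user" in the CLI example
    password="password",               # matches password="password" in the CLI example
):
    """Build (but do not send) a POST request for /api/v1/segment_image."""
    if subtask not in VALID_SUBTASKS:
        raise ValueError(f"unsupported subtask: {subtask!r}")

    # The endpoint expects the raw image bytes as a base64 string.
    payload = {
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "subtask": subtask,
    }

    # HTTP basic auth header, equivalent to `curl -u user:password`.
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8"))

    return urllib.request.Request(
        url=f"{base_url}/api/v1/segment_image",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + credentials.decode("ascii"),
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the response body can then be parsed with `json.load`.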

Returns:

Type	Description
`List[Dict[str, Any]]`	A list of dictionaries, each containing a 'label', a 'score' (if applicable), and a 'mask' (base64-encoded image of the segmentation mask).
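To show how a client might consume this return value, here is a minimal sketch that decodes each mask back into raw bytes. The helper name `decode_masks` and the generated filenames are hypothetical. The `.jpg` extension simply mirrors the curl example on this page, since the actual image encoding of the mask is not stated here.

```python
import base64


def decode_masks(results):
    """Turn the endpoint's list of dicts into decoded mask records.

    `results` is assumed to have the shape described in the Returns table:
    each item has a 'label', an optional 'score', and a base64 'mask'.
    """
    decoded = []
    for index, item in enumerate(results):
        label = str(item.get("label", "mask"))
        # Replace characters that are awkward in filenames (spaces, slashes, ...).
        safe_label = "".join(
            ch if ch.isalnum() or ch in "-_" else "_" for ch in label
        )
        decoded.append(
            {
                # The index prefix keeps names unique when labels repeat.
                "filename": f"{index:03d}_{safe_label}.jpg",
                "label": label,
                "score": item.get("score"),  # None when the subtask returns no score
                "data": base64.b64decode(item["mask"]),
            }
        )
    return decoded
```

Each record's `data` can then be written to disk, for example with `pathlib.Path(record["filename"]).write_bytes(record["data"])`.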

Raises:

Type	Description
`Exception`	If an error occurs during image processing or segmentation.

Example CURL Request:

curl -X POST localhost:3000/api/v1/segment_image \
    -H "Content-Type: application/json" \
    -d '{"image_base64": "<base64-encoded-image>", "subtask": "panoptic"}'

Or, to save every returned mask to a file named after its label:

(base64 -w 0 guy.jpg | awk '{print "{\"image_base64\": \""$0"\", \"subtask\": \"semantic\"}"}' > /tmp/image_payload.json)
curl -X POST http://localhost:3000/api/v1/segment_image \
    -H "Content-Type: application/json" \
    -u user:password \
    -d @/tmp/image_payload.json \
    | jq -r '.[] | .mask + " " + .label' \
    | while read mask label; do echo "$mask" | base64 --decode > "${label}.jpg"; done
