Image Classification API

Bases: VisionAPI

ImageClassificationAPI extends VisionAPI for image classification tasks. It classifies images into the categories defined by the loaded model and supports both single-label and multi-label classification.

Methods

classify_image(self): Endpoint to classify an uploaded image and return the classification scores.

sigmoid(self, _outputs): Applies the sigmoid function to the model's outputs.

softmax(self, _outputs): Applies the softmax function to the model's outputs.

Example CLI Usage:

genius ImageClassificationAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    listen \
        --args \
            model_name="Kaludi/food-category-classification-v2.0" \
            model_class="AutoModelForImageClassification" \
            processor_class="AutoImageProcessor" \
            device_map="cuda:0" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            max_memory=None \
            torchscript=False \
            compile=False \
            flash_attention=False \
            better_transformers=False \
            endpoint="*" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

__init__(input, output, state, **kwargs)

Initializes the ImageClassificationAPI with the necessary configurations for input, output, and state management, along with model-specific parameters.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | BatchInput | Configuration for the input data. | required |
| output | BatchOutput | Configuration for the output data. | required |
| state | State | State management for the API. | required |
| **kwargs | | Additional keyword arguments for extended functionality, such as model configuration. | {} |

classify_image()

Endpoint for classifying an image. It accepts a base64-encoded image, decodes it, preprocesses it, and runs it through the classification model. It supports both single-label and multi-label classification by applying the appropriate post-processing function to the model outputs.
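The final step described above, turning per-class scores into labeled predictions, might look roughly like the following sketch. It assumes the scores have already been passed through sigmoid or softmax. The function name `label_scores`, the response keys `top_prediction` and `all_predictions`, and the example labels are hypothetical and are not taken from the API's actual response format.

```python
from typing import Any, Dict, List


def label_scores(scores: List[float], id2label: Dict[int, str]) -> Dict[str, Any]:
    """Pair each score with its label and rank the predictions.

    Hypothetical helper: the key names below are illustrative only.
    """
    predictions = [
        {"label": id2label[i], "score": float(score)}
        for i, score in enumerate(scores)
    ]
    # Highest score first, so the first entry is the top prediction.
    predictions.sort(key=lambda p: p["score"], reverse=True)
    return {"top_prediction": predictions[0], "all_predictions": predictions}


# Made-up scores and labels for a three-class model.
result = label_scores([0.1, 0.7, 0.2], {0: "bread", 1: "dessert", 2: "soup"})
print(result["top_prediction"])
```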

Returns:

| Type | Description |
| --- | --- |
| Dict[str, Any] | A dictionary containing the predictions with the highest scores and all prediction scores. Each prediction includes the label and its corresponding score. |

Raises:

| Type | Description |
| --- | --- |
| Exception | If an error occurs during image processing or classification. |

Example CURL Request:

curl -X POST localhost:3000/api/v1/classify_image \
    -H "Content-Type: application/json" \
    -d '{"image_base64": "<base64-encoded-image>"}'

Or, to send an image file from disk:

base64 -w 0 cat.jpg | awk '{print "{\"image_base64\": \"" $0 "\"}"}' > /tmp/image_payload.json
curl -X POST http://localhost:3000/api/v1/classify_image \
    -H "Content-Type: application/json" \
    -u user:password \
    -d @/tmp/image_payload.json | jq
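The same request body can be built in Python. This is a minimal sketch using only the standard library; `build_payload` is a hypothetical helper name, and the three bytes used below are a stand-in for real image data read from a file.

```python
import base64
import json


def build_payload(image_bytes: bytes) -> str:
    """Return a JSON body with the image under the "image_base64" key."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image_base64": encoded})


# Placeholder bytes; in practice, read them with open("cat.jpg", "rb").
payload = build_payload(b"\xff\xd8\xff")
print(payload)
```

The resulting string can be sent as the POST body with any HTTP client, together with the `Content-Type: application/json` header and the basic-auth credentials configured at startup.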

sigmoid(_outputs)

Applies the sigmoid function to the model's outputs for binary classification or multi-label classification tasks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| _outputs | np.ndarray | The raw outputs from the model. | required |

Returns:

| Type | Description |
| --- | --- |
| np.ndarray | The outputs after applying the sigmoid function. |
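For reference, the sigmoid maps each raw output x independently to 1 / (1 + exp(-x)), so every score lands in (0, 1) and the scores need not sum to 1. The sketch below is a dependency-free illustration on plain Python lists, not the API's own implementation (which operates on np.ndarray); the branch on the sign of x is an added numerical-stability measure.

```python
import math
from typing import List


def sigmoid(outputs: List[float]) -> List[float]:
    """Apply the logistic function to each raw output independently."""
    result = []
    for x in outputs:
        if x >= 0:
            result.append(1.0 / (1.0 + math.exp(-x)))
        else:
            # Algebraically equivalent form; avoids overflow in exp(-x)
            # when x is a large negative number.
            e = math.exp(x)
            result.append(e / (1.0 + e))
    return result


probs = sigmoid([0.0, 2.0, -2.0])
print(probs)
```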

softmax(_outputs)

Applies the softmax function to the model's outputs for single-label classification tasks, ensuring the output scores sum to 1 across classes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| _outputs | np.ndarray | The raw outputs from the model. | required |

Returns:

| Type | Description |
| --- | --- |
| np.ndarray | The outputs after applying the softmax function. |
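As an illustration, softmax exponentiates each raw output and divides by the sum of the exponentials, which is why the scores sum to 1 across classes. The sketch below is a dependency-free version on plain Python lists, not the API's own implementation (which operates on np.ndarray); subtracting the maximum before exponentiating is a common stability trick and is an assumption here.

```python
import math
from typing import List


def softmax(outputs: List[float]) -> List[float]:
    """Normalize raw outputs into scores that sum to 1."""
    # Shifting by the maximum leaves the result unchanged but keeps
    # exp() from overflowing on large inputs.
    shift = max(outputs)
    exps = [math.exp(x - shift) for x in outputs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
```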