Image Classification API
Bases: VisionAPI
ImageClassificationAPI extends VisionAPI for image classification tasks. It classifies images into the categories defined by the trained model it loads, and it supports both single-label and multi-label classification problems.
Methods
- classify_image(self): Endpoint to classify an uploaded image and return the classification scores.
- sigmoid(self, _outputs): Applies the sigmoid function to the model's outputs.
- softmax(self, _outputs): Applies the softmax function to the model's outputs.
Example CLI Usage:
genius ImageClassificationAPI rise \
batch \
--input_folder ./input \
batch \
--output_folder ./output \
none \
listen \
--args \
model_name="Kaludi/food-category-classification-v2.0" \
model_class="AutoModelForImageClassification" \
processor_class="AutoImageProcessor" \
device_map="cuda:0" \
use_cuda=True \
precision="float" \
quantization=0 \
max_memory=None \
torchscript=False \
compile=False \
flash_attention=False \
better_transformers=False \
endpoint="*" \
port=3000 \
cors_domain="http://localhost:3000" \
username="user" \
password="password"
__init__(input, output, state, **kwargs)
Initializes the ImageClassificationAPI with the necessary configurations for input, output, and state management, along with model-specific parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | BatchInput | Configuration for the input data. | required |
output | BatchOutput | Configuration for the output data. | required |
state | State | State management for the API. | required |
**kwargs | | Additional keyword arguments for extended functionality, such as model configuration. | {} |
classify_image()
Endpoint for classifying an image. It accepts a base64-encoded image, decodes it, preprocesses it, and runs it through the classification model. It supports both single-label and multi-label classification by applying the appropriate post-processing function to the model outputs (softmax for single-label tasks, sigmoid for binary or multi-label tasks).
Returns:
Type | Description |
---|---|
Dict[str, Any] | A dictionary containing the predictions with the highest scores and all prediction scores. Each prediction includes the label and its corresponding score. |
Raises:
Type | Description |
---|---|
Exception | If an error occurs during image processing or classification. |
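The return value described above can be pictured with a small sketch that pairs each label with its score and picks the highest-scoring entries. This is an illustration only: the function name `build_response`, the `top_k` parameter, the key names `top_predictions` and `all_predictions`, and the sample labels are all hypothetical and are not taken from the library, whose actual response keys are not specified in this section.

```python
from typing import Any, Dict, List, Sequence


def build_response(
    scores: Sequence[float], labels: Sequence[str], top_k: int = 3
) -> Dict[str, Any]:
    """Pair each label with its score and select the top_k highest-scoring ones.

    The key names used in the returned dictionary are placeholders, not the
    library's documented response format.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # One entry per class, in the model's original class order.
    predictions: List[Dict[str, Any]] = [
        {"label": label, "score": float(score)}
        for label, score in zip(labels, scores)
    ]
    # Highest score first; sorted() returns a new list, so `predictions` keeps its order.
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return {"top_predictions": ranked[:top_k], "all_predictions": predictions}


# Hypothetical scores and labels, for illustration only.
response = build_response([0.1, 0.7, 0.2], ["bread", "dessert", "soup"], top_k=1)
```

Keeping the full list alongside the top entries lets a client apply its own threshold, which matters for multi-label models where several labels can be valid at once.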
Example CURL Request:
curl -X POST localhost:3000/api/v1/classify_image -H "Content-Type: application/json" -d '{"image_base64": "<base64-encoded-image>"}'
or to feed an image:
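One way to build such a request from a local image file is sketched below in Python, using only the standard library. The helper names `build_payload` and `classify_file` are hypothetical; the URL and the `image_base64` field are taken from the curl example above. If the server was started with a username and password, as in the CLI example, the request would presumably also need credentials, which this sketch omits.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes: bytes) -> bytes:
    """Return the JSON request body with the image base64-encoded."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image_base64": encoded}).encode("utf-8")


def classify_file(
    path: str, url: str = "http://localhost:3000/api/v1/classify_image"
) -> dict:
    """POST the image at `path` to the endpoint and return the parsed JSON reply.

    Assumes a server is already listening at `url` and needs no authentication.
    """
    with open(path, "rb") as f:
        body = build_payload(f.read())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With a server running, `classify_file("cat.jpg")` would send the file's contents, where `cat.jpg` stands in for any local image path.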
sigmoid(_outputs)
Applies the sigmoid function to the model's outputs for binary classification or multi-label classification tasks.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
_outputs | np.ndarray | The raw outputs from the model. | required |
Returns:
Type | Description |
---|---|
np.ndarray | The outputs after applying the sigmoid function. |
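As a minimal sketch of what the sigmoid step computes, the standalone function below applies the logistic function 1 / (1 + exp(-x)) element-wise with NumPy. It is written as a free function for illustration; whether the library's method uses exactly this formulation is an assumption.

```python
import numpy as np


def sigmoid(outputs: np.ndarray) -> np.ndarray:
    """Map each raw output independently to a score in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-outputs))


# Each score is computed on its own, so the scores need not sum to 1.
scores = sigmoid(np.array([-2.0, 0.0, 2.0]))
```

Because each label is scored independently, several labels can score high at the same time, which is what multi-label classification requires.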
softmax(_outputs)
Applies the softmax function to the model's outputs for single-label classification tasks, ensuring the output scores sum to 1 across classes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
_outputs |
np.ndarray
|
The raw outputs from the model. |
required |
Returns:
Type | Description |
---|---|
np.ndarray | The outputs after applying the softmax function. |
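The softmax step can be sketched in the same way. The standalone function below is an illustration, not the library's code: it subtracts the per-row maximum before exponentiating, a common trick for numerical stability that the library may or may not use, and it normalizes along the last axis on the assumption that classes are laid out there.

```python
import numpy as np


def softmax(outputs: np.ndarray) -> np.ndarray:
    """Convert raw outputs to scores that sum to 1 along the last axis."""
    # Shifting by the maximum leaves the result unchanged but avoids overflow in exp().
    shifted = outputs - np.max(outputs, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


# Two rows of raw outputs; the second would overflow exp() without the shift.
probs = softmax(np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]]))
```

Since the scores in each row compete for a total of 1, this suits single-label tasks where exactly one class is expected per image.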