Base API

Bases: AudioBulk

A class representing a Hugging Face API for generating text using a pre-trained language model.

Attributes:

Name Type Description
model Any

The pre-trained language model.

processor Any

The processor used to preprocess input text.

model_name str

The name of the pre-trained language model.

model_revision Optional[str]

The revision of the pre-trained language model.

processor_name str

The name of the processor used to preprocess input text.

processor_revision Optional[str]

The revision of the processor used to preprocess input text.

model_class str

The name of the class of the pre-trained language model.

processor_class str

The name of the class of the processor used to preprocess input text.

use_cuda bool

Whether to use a GPU for inference.

quantization int

The level of quantization to use for the pre-trained language model.

precision str

The precision to use for the pre-trained language model.

device_map str | Dict | None

The mapping of devices to use for inference.

max_memory Dict[int, str]

The maximum memory to use for inference on each device, keyed by device index.

torchscript bool

Whether to use a TorchScript-optimized version of the pre-trained language model.

model_args Any

Additional arguments to pass to the pre-trained language model.

Methods

text(**kwargs: Any) -> Dict[str, Any]: Generates text based on the given prompt and decoding strategy.

listen(model_name: str, model_class: str = "AutoModel", processor_class: str = "AutoProcessor", use_cuda: bool = False, precision: str = "float16", quantization: int = 0, device_map: str | Dict | None = "auto", max_memory: Dict[int, str] = {0: "24GB"}, torchscript: bool = False, compile: bool = False, concurrent_queries: bool = False, use_whisper_cpp: bool = False, use_faster_whisper: bool = False, endpoint: str = "*", port: int = 3000, cors_domain: str = "http://localhost:3000", username: Optional[str] = None, password: Optional[str] = None, **model_args: Any) -> None: Starts a CherryPy server to listen for requests to generate text.

__init__(input, output, state, **kwargs)

Initializes a new instance of the class.

Parameters:

Name Type Description Default
input BatchInput

The input data to process.

required
output BatchOutput

The output data to process.

required
state State

The state of the API.

required

listen(model_name, model_class='AutoModel', processor_class='AutoProcessor', use_cuda=False, precision='float16', quantization=0, device_map='auto', max_memory={0: '24GB'}, torchscript=False, compile=False, concurrent_queries=False, use_whisper_cpp=False, use_faster_whisper=False, endpoint='*', port=3000, cors_domain='http://localhost:3000', username=None, password=None, **model_args)

Starts a CherryPy server to listen for requests to generate text.

Parameters:

Name Type Description Default
model_name str

The name of the pre-trained language model.

required
model_class str

The name of the class of the pre-trained language model. Defaults to "AutoModel".

'AutoModel'
processor_class str

The name of the class of the processor used to preprocess input text. Defaults to "AutoProcessor".

'AutoProcessor'
use_cuda bool

Whether to use a GPU for inference. Defaults to False.

False
precision str

The precision to use for the pre-trained language model. Defaults to "float16".

'float16'
quantization int

The level of quantization to use for the pre-trained language model. Defaults to 0.

0
device_map str | Dict | None

The mapping of devices to use for inference. Defaults to "auto".

'auto'
max_memory Dict[int, str]

The maximum memory to use for inference on each device, keyed by device index. Defaults to {0: "24GB"}.

{0: '24GB'}
torchscript bool

Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.

False
compile bool

Whether to enable Torch JIT compilation. Defaults to False.

False
concurrent_queries bool

Whether the API supports concurrent API calls (usually False). Defaults to False.

False
use_whisper_cpp bool

Whether to use whisper.cpp to load the model. Defaults to False. Note: this works only for the models listed at https://github.com/aarnphm/whispercpp/blob/524dd6f34e9d18137085fb92a42f1c31c9c6bc29/src/whispercpp/utils.py#L32

False
use_faster_whisper bool

Whether to use faster-whisper. Defaults to False.

False
endpoint str

The endpoint to listen on. Defaults to "*".

'*'
port int

The port to listen on. Defaults to 3000.

3000
cors_domain str

The domain to allow CORS requests from. Defaults to "http://localhost:3000".

'http://localhost:3000'
username Optional[str]

The username to use for authentication. Defaults to None.

None
password Optional[str]

The password to use for authentication. Defaults to None.

None
**model_args Any

Additional arguments to pass to the pre-trained language model.

{}
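
The split between the named parameters and the trailing **model_args can be illustrated with a small standalone sketch. It does not import the library. The `LISTEN_DEFAULTS` table is copied from the signature above, while the helper `split_listen_kwargs`, the model name, and the extra `low_cpu_mem_usage` keyword are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Tuple

# Defaults copied from the documented listen() signature.
LISTEN_DEFAULTS: Dict[str, Any] = {
    "model_class": "AutoModel",
    "processor_class": "AutoProcessor",
    "use_cuda": False,
    "precision": "float16",
    "quantization": 0,
    "device_map": "auto",
    "max_memory": {0: "24GB"},
    "torchscript": False,
    "compile": False,
    "concurrent_queries": False,
    "use_whisper_cpp": False,
    "use_faster_whisper": False,
    "endpoint": "*",
    "port": 3000,
    "cors_domain": "http://localhost:3000",
    "username": None,
    "password": None,
}


def split_listen_kwargs(
    model_name: str, **kwargs: Any
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate documented options from extra model_args."""
    named: Dict[str, Any] = {"model_name": model_name, **LISTEN_DEFAULTS}
    model_args: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key in LISTEN_DEFAULTS:
            named[key] = value       # overrides a documented default
        else:
            model_args[key] = value  # would be collected by **model_args
    return named, model_args


named, extra = split_listen_kwargs(
    "openai/whisper-small",  # illustrative model name (assumption)
    use_cuda=True,
    port=8080,
    low_cpu_mem_usage=True,  # not a documented parameter (assumption)
)
print(named["port"], named["torchscript"], extra)
```

Any keyword that is not one of the named parameters ends up in the second dictionary, which mirrors how Python collects unmatched keywords into **model_args.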

validate_password(realm, username, password)

Validate the username and password against expected values.

Parameters:

Name Type Description Default
realm str

The authentication realm.

required
username str

The provided username.

required
password str

The provided password.

required

Returns:

Name Type Description
bool

True if credentials are valid, False otherwise.
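
A minimal sketch of such a check, using only the standard library, might look like the following. The three-argument shape matches what CherryPy's basic-auth tool expects of a `checkpassword` callable. Everything else is an assumption rather than the library's actual code: the factory `make_validate_password`, the constant-time comparison, the rejection of requests when no credentials are configured, and the sample realm and credentials.

```python
import hmac
from typing import Callable, Optional


def make_validate_password(
    expected_username: Optional[str], expected_password: Optional[str]
) -> Callable[[str, str, str], bool]:
    """Hypothetical factory returning a (realm, username, password) checker."""

    def validate_password(realm: str, username: str, password: str) -> bool:
        # Assumption: with no credentials configured, nothing validates.
        if expected_username is None or expected_password is None:
            return False
        # Compare as bytes in constant time to avoid leaking timing information.
        user_ok = hmac.compare_digest(
            username.encode("utf-8"), expected_username.encode("utf-8")
        )
        pass_ok = hmac.compare_digest(
            password.encode("utf-8"), expected_password.encode("utf-8")
        )
        return user_ok and pass_ok

    return validate_password


check = make_validate_password("admin", "s3cret")  # sample credentials
print(check("example-realm", "admin", "s3cret"))
print(check("example-realm", "admin", "wrong"))
```

The realm argument is accepted but unused in this sketch, since only the username and password are compared against the expected values.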