Question Answering¶

Bases: TextAPI

`tokenizer: AutoTokenizer` `instance-attribute` ¶

A class for serving different types of question answering (QA) models: traditional QA over plain text, TAPAS (table-based QA), and TAPEX. It uses the Hugging Face transformers library to answer questions over both plain text and tabular data.

Attributes:

Name	Type	Description
`model`	`AutoModelForQuestionAnswering | AutoModelForTableQuestionAnswering`	The pre-trained QA model (traditional, TAPAS, or TAPEX).
`tokenizer`	`AutoTokenizer`	The tokenizer used to preprocess input text.

Methods

answer(self, **kwargs: Any) -> Dict[str, Any]: Answers questions based on the provided context (text or table).

CLI Usage Example:

genius QAAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    --id distilbert-base-uncased-distilled-squad-lol \
    listen \
        --args \
            model_name="distilbert-base-uncased-distilled-squad" \
            model_class="AutoModelForQuestionAnswering" \
            tokenizer_class="AutoTokenizer" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            device_map="cuda:0" \
            max_memory=None \
            torchscript=False \
            endpoint="*" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

genius QAAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    --id google/tapas-base-finetuned-wtq-lol \
    listen \
        --args \
            model_name="google/tapas-base-finetuned-wtq" \
            model_class="AutoModelForTableQuestionAnswering" \
            tokenizer_class="AutoTokenizer" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            device_map="cuda:0" \
            max_memory=None \
            torchscript=False \
            endpoint="*" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

genius QAAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    --id microsoft/tapex-large-finetuned-wtq-lol \
    listen \
        --args \
            model_name="microsoft/tapex-large-finetuned-wtq" \
            model_class="AutoModelForSeq2SeqLM" \
            tokenizer_class="AutoTokenizer" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            device_map="cuda:0" \
            max_memory=None \
            torchscript=False \
            endpoint="*" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

`__init__(input, output, state, **kwargs)` ¶

Initializes the QAAPI with configurations for input, output, and state management.

Parameters:

Name	Type	Description	Default
`input`	`BatchInput`	Configuration for the input data.	required
`output`	`BatchOutput`	Configuration for the output data.	required
`state`	`State`	State management for the API.	required
`**kwargs`	`Any`	Additional keyword arguments for extended functionality.	`{}`

`answer(**kwargs)` ¶

Answers questions based on the provided context (text or table). It adapts to the model type (traditional, TAPAS, TAPEX) and provides answers accordingly.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Arbitrary keyword arguments, typically containing the 'question' and 'data' (context or table).	`{}`

Returns:

Type	Description
`Dict[str, Any]`	A dictionary containing the question, context/table, and answer(s).

Example CURL Request for Text-based QA:

curl -X POST localhost:3000/api/v1/answer \
    -H "Content-Type: application/json" \
    -d '{"question": "What is the capital of France?", "data": "France is a country in Europe. Its capital is Paris."}'
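The same request can be prepared from Python. The sketch below only builds the request object with the standard library; the host, port, and route come from the curl example above, while the helper name `build_answer_request` and the optional basic-auth handling (suggested by the `username` and `password` CLI arguments, but not confirmed here) are assumptions.

```python
import base64
import json
import urllib.request
from typing import Any, Optional


def build_answer_request(
    question: str,
    data: Any,
    base_url: str = "http://localhost:3000",
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request for /api/v1/answer (hypothetical helper)."""
    body = json.dumps({"question": question, "data": data}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    # Assumption: if the server enforces the CLI's username/password,
    # it does so via HTTP basic auth.
    if username is not None and password is not None:
        credentials = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        headers["Authorization"] = f"Basic {token}"
    return urllib.request.Request(
        f"{base_url}/api/v1/answer", data=body, headers=headers, method="POST"
    )


request = build_answer_request(
    "What is the capital of France?",
    "France is a country in Europe. Its capital is Paris.",
)
# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Keeping request construction separate from sending makes it possible to inspect the payload without a live server.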

Example CURL Requests for a longer text context and for table-based QA:

/usr/bin/curl -X POST localhost:3000/api/v1/answer \
    -H "Content-Type: application/json" \
    -d '{
        "data": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.",
        "question": "What is the common wisdom about RNNs?"

    }' | jq

/usr/bin/curl -X POST localhost:3000/api/v1/answer \
    -H "Content-Type: application/json" \
    -d '{
    "data": [
        {"Name": "Alice", "Age": "30"},
        {"Name": "Bob", "Age": "25"}
    ],
    "question": "what is their total age?"
}
' | jq

/usr/bin/curl -X POST localhost:3000/api/v1/answer \
    -H "Content-Type: application/json" \
    -d '{
    "data": {"Actors": ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"], "Number of movies": ["87", "53", "69"]},
    "question": "how many movies does Leonardo Di Caprio have?"
}
' | jq
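The two table requests above use different layouts for `data`: a list of row records and a dictionary of columns. Whether the server normalizes both layouts internally is not stated here, so as a minimal sketch a client could convert between them itself. The helper name `records_to_columns` is hypothetical.

```python
from typing import Dict, List


def records_to_columns(records: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Convert row records into a column-oriented table (hypothetical helper)."""
    columns: Dict[str, List[str]] = {}
    for row_index, row in enumerate(records):
        for key in row:
            # Pad a column that first appears in a later row.
            columns.setdefault(key, [""] * row_index)
        for key, values in columns.items():
            # Cells are kept as strings, as in the requests above.
            values.append(str(row.get(key, "")))
    return columns


rows = [{"Name": "Alice", "Age": "30"}, {"Name": "Bob", "Age": "25"}]
table = records_to_columns(rows)
print(table)
```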

`answer_pipeline(**kwargs)` ¶

Answers questions using the Hugging Face pipeline based on the provided context.

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Arbitrary keyword arguments, typically containing 'question' and 'data'.	`{}`

Returns:

Type	Description
`Dict[str, Any]`	A dictionary containing the question, context, and the answer.

Example CURL Request for QA:

curl -X POST localhost:3000/api/v1/answer_pipeline \
    -H "Content-Type: application/json" \
    -d '{"question": "Who is the CEO of Tesla?", "data": "Elon Musk is the CEO of Tesla."}'
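As a rough illustration of what a pipeline-backed handler might do, the sketch below wraps any callable that follows the Hugging Face `question-answering` pipeline calling convention (`question=` and `context=` keyword arguments, returning a dictionary with an `answer` key). The function name `answer_with_pipeline` and the keys of the returned dictionary are assumptions, not QAAPI's actual code; a stub stands in for the real pipeline so the example runs without downloading a model.

```python
from typing import Any, Callable, Dict


def answer_with_pipeline(
    qa_pipeline: Callable[..., Dict[str, Any]], question: str, data: str
) -> Dict[str, Any]:
    """Run a QA pipeline and package the result (hypothetical helper)."""
    result = qa_pipeline(question=question, context=data)
    return {"question": question, "data": data, "answer": result}


def stub_pipeline(question: str, context: str) -> Dict[str, Any]:
    """Stand-in for a real pipeline: returns the first two words of the context."""
    answer = " ".join(context.split()[:2])
    return {"answer": answer, "start": 0, "end": len(answer), "score": 1.0}


response = answer_with_pipeline(
    stub_pipeline, "Who is the CEO of Tesla?", "Elon Musk is the CEO of Tesla."
)
print(response["answer"]["answer"])

# With transformers installed, a real pipeline could be created like this:
# from transformers import pipeline
# qa = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad")
# response = answer_with_pipeline(qa, "Who is the CEO of Tesla?", "Elon Musk is the CEO of Tesla.")
```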

`answer_table_question(data, question, model_type)` ¶

Answers a question based on the provided table.

Parameters:

Name	Type	Description	Default
`data`	`Dict[str, Any]`	The table data and other parameters.	required
`question`	`str`	The question to be answered.	required
`model_type`	`str`	The type of the model ('tapas' or 'tapex').	required

Returns:

Type	Description
`dict`	The answer derived from the table.

`initialize_pipeline()` ¶

Lazy initialization of the QA Hugging Face pipeline.
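Lazy initialization defers an expensive construction step, such as loading model weights, until the first call that needs it, and then reuses the result. The sketch below shows the general pattern only; the class `LazyPipeline` and its `factory` argument are hypothetical and do not reflect how QAAPI stores its pipeline.

```python
from typing import Any, Callable, Optional


class LazyPipeline:
    """Builds the wrapped object on first use and caches it (illustrative only)."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._instance: Optional[Any] = None
        self.build_count = 0  # exposed only to make the behaviour observable

    def get(self) -> Any:
        # Construct once; later calls return the cached object.
        if self._instance is None:
            self._instance = self._factory()
            self.build_count += 1
        return self._instance


# A cheap stand-in for an expensive loader such as transformers.pipeline(...).
lazy = LazyPipeline(lambda: {"task": "question-answering"})
first = lazy.get()
second = lazy.get()
```

The trade-off is that the first request pays the loading cost, while start-up stays fast and no memory is used if the pipeline route is never called.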
