Named Entity Recognition

Bases: TextAPI

NamedEntityRecognitionAPI serves a Named Entity Recognition (NER) model using the Hugging Face transformers library. It recognizes named entities in text and classifies them into predefined categories such as persons, organizations, locations, time expressions, quantities, monetary values, and percentages.

Attributes:

Name       Type  Description
model      Any   The loaded NER model, typically a Hugging Face transformer model specialized for token classification.
tokenizer  Any   The tokenizer for preprocessing text compatible with the loaded model.
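
A token-classification model of the kind described above usually emits one label per token, and those labels then have to be merged into whole entities. The sketch below illustrates that merging step under stated assumptions: the labels follow the BIO scheme (B-PER, I-PER, O, and so on), the tagged tokens are hand-written rather than real model output, and `group_bio_tags` is a hypothetical helper, not part of this API. In practice the Hugging Face pipeline can do comparable grouping itself through its `aggregation_strategy` option.

```python
from typing import Dict, List, Tuple


def group_bio_tags(tagged_tokens: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Merge per-token BIO labels into whole entities (hypothetical helper)."""
    entities: List[Dict[str, str]] = []
    current_words: List[str] = []
    current_type = ""

    def flush() -> None:
        # Close the entity that is currently open, if any.
        nonlocal current_words, current_type
        if current_words:
            entities.append({"text": " ".join(current_words), "type": current_type})
        current_words, current_type = [], ""

    for word, label in tagged_tokens:
        if label.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_words, current_type = [word], label[2:]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- tag of the same type continues the open entity.
            current_words.append(word)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close whatever is open and skip this token.
            flush()
    flush()
    return entities


# Hand-written labels for illustration only (not real model output).
tagged = [
    ("John", "B-PER"), ("Doe", "I-PER"), ("works", "O"), ("at", "O"),
    ("OpenAI", "B-ORG"), ("in", "O"), ("San", "B-LOC"), ("Francisco", "I-LOC"),
]
entities = group_bio_tags(tagged)
print(entities)
```

With the sample labels above, the helper yields three entities: "John Doe" (PER), "OpenAI" (ORG), and "San Francisco" (LOC).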

Example CLI Usage:

genius NamedEntityRecognitionAPI rise \
    batch \
        --input_folder ./input \
    batch \
        --output_folder ./output \
    none \
    --id dslim/bert-large-NER-lol \
    listen \
        --args \
            model_name="dslim/bert-large-NER" \
            model_class="AutoModelForTokenClassification" \
            tokenizer_class="AutoTokenizer" \
            use_cuda=True \
            precision="float" \
            quantization=0 \
            device_map="cuda:0" \
            max_memory=None \
            torchscript=False \
            endpoint="0.0.0.0" \
            port=3000 \
            cors_domain="http://localhost:3000" \
            username="user" \
            password="password"

__init__(input, output, state, **kwargs)

Initializes the NamedEntityRecognitionAPI class.

Parameters:

Name      Type         Description                    Default
input     BatchInput   The input data.                required
output    BatchOutput  The output data.               required
state     State        The state data.                required
**kwargs  Any          Additional keyword arguments.  {}

initialize_pipeline()

Lazily initializes the Hugging Face NER pipeline.
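
Lazy initialization means the expensive object is built only when it is first needed, then reused. The following is a minimal sketch of that pattern, not this class's actual implementation: `LazyPipeline`, `fake_factory`, and `build_count` are hypothetical names, and the stand-in "pipeline" just picks capitalized words so the example runs without downloading a model. In a real setup the factory might instead wrap a call such as `transformers.pipeline("ner", ...)`.

```python
from typing import Callable, List, Optional

Pipeline = Callable[[str], List[str]]


class LazyPipeline:
    """Hypothetical holder that builds an expensive pipeline on first use."""

    def __init__(self, factory: Callable[[], Pipeline]) -> None:
        self._factory = factory
        self._pipeline: Optional[Pipeline] = None
        self.build_count = 0  # Tracks how often the factory actually ran.

    def initialize_pipeline(self) -> None:
        # Build only if nothing has been built yet.
        if self._pipeline is None:
            self._pipeline = self._factory()
            self.build_count += 1

    def __call__(self, text: str) -> List[str]:
        self.initialize_pipeline()
        assert self._pipeline is not None
        return self._pipeline(text)


def fake_factory() -> Pipeline:
    # Stand-in for loading a real model: returns capitalized words.
    return lambda text: [word for word in text.split() if word[:1].isupper()]


lazy = LazyPipeline(fake_factory)
assert lazy.build_count == 0  # Nothing is built at construction time.
first = lazy("John Doe works at OpenAI")
second = lazy("Alice visits Paris")
print(lazy.build_count, first, second)
```

The factory runs once, on the first call, and the second call reuses the cached object.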

ner_pipeline(**kwargs)

Recognizes named entities in the input text using the Hugging Face pipeline.

This method uses a pre-trained NER model to identify entities in text and classify them into categories such as persons, organizations, and locations.

Parameters:

Name      Type  Description                                                                   Default
**kwargs  Any   Arbitrary keyword arguments, typically containing 'text' for the input text.  {}

Returns:

Type            Description
Dict[str, Any]  A dictionary containing the original input text and a list of recognized entities.

Example CURL Request for NER:

curl -X POST localhost:3000/api/v1/ner_pipeline \
    -H "Content-Type: application/json" \
    -d '{"text": "John Doe works at OpenAI in San Francisco."}' | jq
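
The same request can be built from Python with only the standard library. This is a sketch rather than an official client: `build_ner_request` is a hypothetical helper, the base URL is taken from the port used in the CLI example, and the optional credentials assume the server accepts HTTP Basic auth, which the curl example does not show. The request is constructed but not sent, so the snippet runs without a server.

```python
import base64
import json
import urllib.request
from typing import Optional


def build_ner_request(
    text: str,
    base_url: str = "http://localhost:3000",
    endpoint: str = "ner_pipeline",
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request like the curl example."""
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if username is not None and password is not None:
        # Assumption: the server expects HTTP Basic auth.
        raw = f"{username}:{password}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/api/v1/{endpoint}", data=body, headers=headers, method="POST"
    )


request = build_ner_request("John Doe works at OpenAI in San Francisco.")
print(request.get_method(), request.full_url)

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Passing `endpoint="recognize_entities"` targets the other endpoint documented on this page.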

recognize_entities(**kwargs)

Endpoint for recognizing named entities in the input text using the loaded NER model.

Parameters:

Name      Type  Description                                                                   Default
**kwargs  Any   Arbitrary keyword arguments, typically containing 'text' for the input text.  {}

Returns:

Type            Description
Dict[str, Any]  A dictionary containing the original input text and a list of recognized entities with their respective types.

Example CURL Requests:

curl -X POST localhost:3000/api/v1/recognize_entities \
    -H "Content-Type: application/json" \
    -d '{"text": "John Doe works at OpenAI in San Francisco."}' | jq

curl -X POST localhost:3000/api/v1/recognize_entities \
    -H "Content-Type: application/json" \
    -d '{"text": "Alice is going to visit the Eiffel Tower in Paris next summer."}' | jq
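
Once a response arrives, a common next step is grouping the recognized entities by type. The sketch below shows one way to do that. The exact response schema is not specified on this page, so the key names `input`, `entities`, `token`, and `class` are assumptions, the sample body is hand-written rather than real server output, and `entities_by_type` is a hypothetical helper. Adjust the keys to match what the endpoint actually returns.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List

# Hand-written sample body; key names and labels are assumptions.
sample_body = """
{
  "input": "Alice is going to visit the Eiffel Tower in Paris next summer.",
  "entities": [
    {"token": "Alice", "class": "PER"},
    {"token": "Eiffel Tower", "class": "LOC"},
    {"token": "Paris", "class": "LOC"}
  ]
}
"""


def entities_by_type(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group entity texts by their type label (hypothetical helper)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in response.get("entities", []):
        grouped[entity["class"]].append(entity["token"])
    return dict(grouped)


grouped = entities_by_type(json.loads(sample_body))
print(grouped)
```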