OCR using TrOCR

Bases: Bolt

`__init__(input, output, state, **kwargs)`

The `TROCRImageOCR` class performs optical character recognition (OCR) on images using Microsoft's TrOCR model. It expects `input.input_folder` to contain the images to process and saves the OCR results as JSON files in `output.output_folder`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input` | `BatchInput` | Instance of BatchInput for reading data. | required |
| `output` | `BatchOutput` | Instance of BatchOutput for saving data. | required |
| `state` | `State` | Instance of State for maintaining state. | required |
| `**kwargs` | | Additional keyword arguments. | `{}` |

Command Line Invocation with geniusrise

genius TROCRImageOCR rise \
    batch \
        --bucket my_bucket \
        --s3_folder s3/input \
    batch \
        --bucket my_bucket \
        --s3_folder s3/output \
    none \
    process

YAML Configuration with geniusrise

version: "1"
bolts:
    ocr_processing:
        name: "TROCRImageOCR"
        method: "process"
        input:
            type: "batch"
            args:
                bucket: "my_bucket"
                s3_folder: "s3/input"
                use_cuda: true
        output:
            type: "batch"
            args:
                bucket: "my_bucket"
                s3_folder: "s3/output"
                use_cuda: true

`process(kind='printed', use_cuda=True)`

📖 Perform OCR on images in the input folder and save the OCR results as JSON files in the output folder.

The method iterates over every image file in `input.input_folder`, runs the TrOCR model on each one, and writes that image's result as a JSON file to `output.output_folder`. Per the signature, `kind` defaults to `'printed'` and `use_cuda` defaults to `True`.
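The folder-to-JSON loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the geniusrise implementation: the function names `ocr_folder` and `make_trocr_recognizer`, the JSON keys `image_name` and `ocr_text`, the list of image suffixes, and the mapping from `kind` to a `microsoft/trocr-base-*` checkpoint are all hypothetical. The recognizer relies on the Hugging Face `transformers`, `torch`, and `Pillow` packages, which are imported lazily so the loop itself can be exercised without them.

```python
import json
from pathlib import Path
from typing import Callable, List

# Assumption: which file suffixes count as images is not stated in the docs.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}


def make_trocr_recognizer(kind: str = "printed", use_cuda: bool = True) -> Callable[[Path], str]:
    """Build a function that maps an image path to recognized text."""
    # Heavy dependencies are imported here, not at module level.
    import torch
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # Assumption: `kind` selects one of Microsoft's public checkpoints,
    # e.g. "microsoft/trocr-base-printed".
    checkpoint = f"microsoft/trocr-base-{kind}"
    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # Fall back to CPU when CUDA is requested but not available.
    device = "cuda" if use_cuda and torch.cuda.is_available() else "cpu"
    model.to(device)

    def recognize(path: Path) -> str:
        image = Image.open(path).convert("RGB")
        pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)
        generated_ids = model.generate(pixel_values)
        return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return recognize


def ocr_folder(input_folder: str, output_folder: str,
               recognize: Callable[[Path], str]) -> List[Path]:
    """Run `recognize` on each image in input_folder; write one JSON file per image."""
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in sorted(Path(input_folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that does not look like an image
        # Hypothetical result layout; the real JSON schema may differ.
        result = {"image_name": image_path.name, "ocr_text": recognize(image_path)}
        json_path = out_dir / f"{image_path.stem}.json"
        json_path.write_text(json.dumps(result), encoding="utf-8")
        written.append(json_path)
    return written
```

With the dependencies installed, a call such as `ocr_folder("images", "results", make_trocr_recognizer(kind="printed", use_cuda=False))` would process a local folder. Passing the recognizer in as a callable keeps the file handling separate from the model, so the loop can be checked with a stub. TrOCR checkpoints are trained on single text-line images, so full-page scans generally need to be split into lines first.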