OCR using TrOCR
Bases: Bolt
__init__(input, output, state, **kwargs)
The TROCRImageOCR class performs OCR (Optical Character Recognition) on images using Microsoft's TrOCR model. It expects input.input_folder to contain the images for OCR and saves the OCR results as JSON files in output.output_folder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | BatchInput | Instance of BatchInput for reading data. | required |
output | BatchOutput | Instance of BatchOutput for saving data. | required |
state | State | Instance of State for maintaining state. | required |
**kwargs | | Additional keyword arguments. | {} |
Command Line Invocation with geniusrise
genius TROCRImageOCR rise \
batch \
--bucket my_bucket \
--s3_folder s3/input \
batch \
--bucket my_bucket \
--s3_folder s3/output \
none \
process
YAML Configuration with geniusrise
process(kind='printed', use_cuda=True)
📖 Perform OCR on images in the input folder and save the OCR results as JSON files in the output folder.
This method iterates through each image file in input.input_folder, performs OCR using the TrOCR model, and saves the OCR results as JSON files in output.output_folder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kind | str | The kind of TROCR model to use. Default is "printed". Options are "printed" or "handwritten". | 'printed' |
use_cuda | bool | Whether to use CUDA for model inference. Default is True. | True |
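The loop that process describes (iterate over images, run OCR, write one JSON file per image) can be sketched roughly as below. This is a minimal illustration and not the TROCRImageOCR implementation. The helper names (trocr_model_name, load_trocr, ocr_folder), the accepted image suffixes, the JSON field names, and the choice of the base-size Hugging Face checkpoints are all assumptions made for the example. The recognizer is passed in as a callable so the folder loop can be tried without downloading a model.

```python
import json
from pathlib import Path
from typing import Callable, List

# Assumed set of image extensions; the real class may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}


def trocr_model_name(kind: str = "printed") -> str:
    """Map `kind` to a Hugging Face checkpoint (assumption: base-size models)."""
    if kind not in ("printed", "handwritten"):
        raise ValueError(f"kind must be 'printed' or 'handwritten', got {kind!r}")
    return f"microsoft/trocr-base-{kind}"


def load_trocr(kind: str = "printed", use_cuda: bool = True) -> Callable[[Path], str]:
    """Load TrOCR and return a function mapping an image path to its text."""
    # Heavy dependencies are imported lazily so the folder loop below
    # can be exercised without them.
    import torch
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    name = trocr_model_name(kind)
    processor = TrOCRProcessor.from_pretrained(name)
    model = VisionEncoderDecoderModel.from_pretrained(name)
    # Fall back to CPU when CUDA is requested but not available.
    device = "cuda" if use_cuda and torch.cuda.is_available() else "cpu"
    model.to(device)

    def recognize(path: Path) -> str:
        image = Image.open(path).convert("RGB")
        pixel_values = processor(images=image, return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values.to(device))
        return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return recognize


def ocr_folder(
    input_folder: Path, output_folder: Path, recognize: Callable[[Path], str]
) -> List[Path]:
    """Run `recognize` on each image and write one JSON file per image."""
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in sorted(Path(input_folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        # Hypothetical JSON layout; the real output schema is not documented here.
        record = {"image_name": image_path.name, "ocr_text": recognize(image_path)}
        out_path = out_dir / f"{image_path.stem}.json"
        out_path.write_text(json.dumps(record))
        written.append(out_path)
    return written
```

With transformers, torch, and Pillow installed, something like ocr_folder(Path("input"), Path("output"), load_trocr(kind="handwritten", use_cuda=False)) would mirror the kind and use_cuda parameters above, again under the assumptions stated.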