HBase
Bases: Spout
__init__(output, state, **kwargs)
Initialize the HBase class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`output` | `BatchOutput` | An instance of the BatchOutput class for saving the data. | required |
`state` | `State` | An instance of the State class for maintaining the state. | required |
`**kwargs` | | Additional keyword arguments. | `{}` |
Using geniusrise to invoke via command line
genius HBase rise \
batch \
--output_s3_bucket my_bucket \
--output_s3_folder s3/folder \
none \
fetch \
--args host=localhost table=my_table row_start=start row_stop=stop batch_size=100
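The `--args` values above are `key=value` pairs whose names line up with the parameters of `fetch`. As a rough illustration of that mapping only (this is not geniusrise's actual argument parser, and `parse_kv_args` is a hypothetical helper), such pairs could be turned into keyword arguments like this:

```python
from typing import Dict, List, Union


def parse_kv_args(pairs: List[str]) -> Dict[str, Union[str, int]]:
    """Turn ["key=value", ...] into a dict of keyword arguments.

    Hypothetical helper for illustration; values made only of digits
    are converted to int, everything else stays a string.
    """
    kwargs: Dict[str, Union[str, int]] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        kwargs[key] = int(value) if value.isdigit() else value
    return kwargs


# The same pairs as in the command above.
fetch_kwargs = parse_kv_args(
    ["host=localhost", "table=my_table", "row_start=start", "row_stop=stop", "batch_size=100"]
)
print(fetch_kwargs)
```

The resulting dictionary has one entry per `fetch` parameter, with `batch_size` as an integer and the rest as strings.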
Using geniusrise to invoke via YAML file
fetch(host, table, row_start, row_stop, batch_size=100)
📖 Fetch rows from an HBase table, scanning from `row_start` to `row_stop`, and save them in batches.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`host` | `str` | The HBase host. | required |
`table` | `str` | The HBase table name. | required |
`row_start` | `str` | The row key to start scanning from. | required |
`row_stop` | `str` | The row key to stop scanning at. | required |
`batch_size` | `int` | The number of rows to fetch per batch. Defaults to 100. | `100` |
Raises:
Type | Description |
---|---|
`Exception` | If unable to connect to the HBase server or execute the scan. |
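The effect of `batch_size` can be sketched as follows. This is a minimal illustration under stated assumptions, not the spout's own implementation: `batched_rows` is a hypothetical helper, and the in-memory `rows` list stands in for whatever iterator of `(row_key, data)` pairs an HBase client scan would return.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

# One scanned row: (row key, {column: value}). This shape is an assumption.
Row = Tuple[bytes, Dict[bytes, bytes]]


def batched_rows(rows: Iterable[Row], batch_size: int = 100) -> Iterator[List[Row]]:
    """Yield lists of at most `batch_size` rows from a scan-like iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(rows)
    while True:
        # Pull the next chunk lazily so the whole scan is never held in memory.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical stand-in for scan results: 250 rows with one column each.
rows = [(f"row-{i:03d}".encode(), {b"cf:value": str(i).encode()}) for i in range(250)]
batch_sizes = [len(batch) for batch in batched_rows(rows, batch_size=100)]
print(batch_sizes)  # [100, 100, 50]
```

In this sketch each yielded batch would be the unit handed to the output for saving. The final batch may be smaller than `batch_size` when the number of rows in the scanned range is not an exact multiple of it.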