Base Fine Tuner

Bases: Bolt

A bolt for fine-tuning Hugging Face models.

This bolt fine-tunes a pre-trained model with the Hugging Face Transformers library, delegating the training loop to the Trainer class.
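A minimal end-to-end sketch of the typical workflow, assuming a concrete subclass (MyFineTuner, sketched under load_dataset below) and geniusrise-style input, output, and state objects. The import paths and constructor arguments shown here are assumptions, not part of this reference:

```python
# A hedged, minimal workflow sketch. MyFineTuner is a hypothetical subclass
# that implements the abstract load_dataset(); the import paths and the
# BatchInput/BatchOutput constructor arguments are assumptions.
from geniusrise import BatchInput, BatchOutput, InMemoryState

batch_input = BatchInput("./input", "my-bucket", "train-data")
batch_output = BatchOutput("./output", "my-bucket", "model-out")
state = InMemoryState()

bolt = MyFineTuner(input=batch_input, output=batch_output, state=state)
bolt.fine_tune(
    model_name="bert-base-uncased",
    tokenizer_name="bert-base-uncased",
    num_train_epochs=3,
    per_device_batch_size=8,
)
```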

__init__(input, output, state, **kwargs)

Initialize the bolt.

Parameters:

input (BatchInput): The batch input data. Required.
output (BatchOutput): The output data. Required.
state (State): The state manager. Required.
evaluate (bool): Whether to evaluate the model. Defaults to False.
**kwargs: Additional keyword arguments.

compute_metrics(eval_pred)

Compute metrics for evaluation. This base implementation performs a simple classification evaluation; task-specific bolts should override it.

Parameters:

eval_pred (EvalPrediction): The evaluation predictions. Required.

Returns:

Optional[Dict[str, float]]: The computed metrics.
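To report task-specific metrics, override this method in a subclass. A hedged sketch for plain classification, assuming the model emits raw logits in eval_pred.predictions (the base-class name here follows this page's title; the exact import path is an assumption):

```python
import numpy as np
from transformers import EvalPrediction

class MyFineTuner(BaseFineTuner):
    def compute_metrics(self, eval_pred: EvalPrediction) -> dict:
        # eval_pred.predictions / .label_ids are the standard Trainer fields.
        logits, labels = eval_pred.predictions, eval_pred.label_ids
        predictions = np.argmax(logits, axis=-1)
        return {"accuracy": float((predictions == labels).mean())}
```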

fine_tune(model_name, tokenizer_name, num_train_epochs, per_device_batch_size, model_class='AutoModel', tokenizer_class='AutoTokenizer', device_map='auto', precision='bfloat16', quantization=None, lora_config=None, use_accelerate=False, use_trl=False, accelerate_no_split_module_classes=[], compile=False, evaluate=False, save_steps=500, save_total_limit=None, load_best_model_at_end=False, metric_for_best_model=None, greater_is_better=None, map_data=None, use_huggingface_dataset=False, huggingface_dataset='', hf_repo_id=None, hf_commit_message=None, hf_token=None, hf_private=True, hf_create_pr=False, notification_email='', learning_rate=1e-05, **kwargs)

Fine-tunes a pre-trained Hugging Face model.

Parameters:

model_name (str): The name of the pre-trained model. Required.
tokenizer_name (str): The name of the pre-trained tokenizer. Required.
num_train_epochs (int): The total number of training epochs to perform. Required.
per_device_batch_size (int): The batch size per device during training. Required.
model_class (str): The model class to use. Defaults to "AutoModel".
tokenizer_class (str): The tokenizer class to use. Defaults to "AutoTokenizer".
device_map (str | dict): The device map for distributed training. Defaults to "auto".
precision (str): The precision to use for training. Defaults to "bfloat16".
quantization (int): The quantization level to use for training. Defaults to None.
lora_config (dict): Configuration for PEFT LoRA optimization. Defaults to None.
use_accelerate (bool): Whether to use Accelerate for distributed training. Defaults to False.
use_trl (bool): Whether to use TRL for training. Defaults to False.
accelerate_no_split_module_classes (List[str]): The module classes not to split during distributed training. Defaults to [].
evaluate (bool): Whether to evaluate the model after training. Defaults to False.
compile (bool): Whether to compile the model before fine-tuning. Defaults to False.
save_steps (int): Number of steps between checkpoints. Defaults to 500.
save_total_limit (Optional[int]): Maximum number of checkpoints to keep; older checkpoints are deleted. Defaults to None.
load_best_model_at_end (bool): Whether to load the best model (according to evaluation) at the end of training. Defaults to False.
metric_for_best_model (Optional[str]): The metric used to compare models. Defaults to None.
greater_is_better (Optional[bool]): Whether a larger value of the metric indicates a better model. Defaults to None.
use_huggingface_dataset (bool): Whether to load the dataset from the Hugging Face Hub. Defaults to False.
huggingface_dataset (str): The Hugging Face dataset to load. Defaults to ''.
map_data (Callable): A function to map data before training. Defaults to None.
hf_repo_id (str): The Hugging Face repo ID. Defaults to None.
hf_commit_message (str): The Hugging Face commit message. Defaults to None.
hf_token (str): The Hugging Face token. Defaults to None.
hf_private (bool): Whether to make the repo private. Defaults to True.
hf_create_pr (bool): Whether to create a pull request. Defaults to False.
notification_email (str): Email address to notify when the job completes. Defaults to ''.
learning_rate (float): Learning rate for backpropagation. Defaults to 1e-05.
**kwargs: Additional keyword arguments to pass to the model.

Returns:

None
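A hedged example of a heavier fine_tune call combining LoRA, 4-bit quantization, and TRL. The lora_config keys mirror peft.LoraConfig arguments, though exactly how this bolt forwards them is an assumption; the model id, repo id, and token are placeholders:

```python
# Hedged sketch: a causal-LM run with LoRA, 4-bit quantization, and TRL.
bolt.fine_tune(
    model_name="mistralai/Mistral-7B-v0.1",
    tokenizer_name="mistralai/Mistral-7B-v0.1",
    model_class="AutoModelForCausalLM",
    tokenizer_class="AutoTokenizer",
    num_train_epochs=1,
    per_device_batch_size=2,
    precision="bfloat16",
    quantization=4,
    lora_config={"r": 16, "lora_alpha": 32, "lora_dropout": 0.05},
    use_trl=True,
    evaluate=True,
    save_steps=500,
    save_total_limit=2,
    learning_rate=2e-5,
    hf_repo_id="my-org/mistral-7b-finetuned",  # hypothetical repo id
    hf_token="hf_...",  # your Hugging Face token
)
```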

load_dataset(dataset_path, **kwargs) (abstract method)

Load a dataset from a file.

Parameters:

dataset_path (str): The path to the dataset file. Required.
split (str): The split to load. Defaults to None.
**kwargs: Additional keyword arguments to pass to the load_dataset method.

Returns:

Union[Dataset, DatasetDict, None]: The loaded dataset.

Raises:

NotImplementedError: This method should be overridden by subclasses.
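Because load_dataset is abstract, every concrete bolt supplies its own loader. A hedged sketch, assuming a single JSONL file on disk; the base-class name follows this page's title and the exact import path is an assumption:

```python
from typing import Optional
from datasets import Dataset, load_dataset as hf_load_dataset

class MyFineTuner(BaseFineTuner):
    def load_dataset(self, dataset_path: str, split: Optional[str] = None, **kwargs) -> Dataset:
        # Assumes one JSONL file; adjust data_files/split for your layout.
        return hf_load_dataset("json", data_files=dataset_path, split=split or "train")
```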

load_models(model_name, tokenizer_name, model_class='AutoModel', tokenizer_class='AutoTokenizer', device_map='auto', precision='bfloat16', quantization=None, lora_config=None, use_accelerate=False, accelerate_no_split_module_classes=[], **kwargs)

Load the model and tokenizer.

Parameters:

model_name (str): The name of the model to be loaded. Required.
tokenizer_name (str): The name of the tokenizer to be loaded. Required.
model_class (str): The class of the model. Defaults to "AutoModel".
tokenizer_class (str): The class of the tokenizer. Defaults to "AutoTokenizer".
device_map (str | dict): The device map to be used. Defaults to "auto".
precision (str): The precision to be used. Choose from 'float32', 'float16', 'bfloat16'. Defaults to "bfloat16".
quantization (Optional[int]): The quantization to be used. Defaults to None.
lora_config (Optional[dict]): The LoRA configuration to be used. Defaults to None.
use_accelerate (bool): Whether to use Accelerate. Defaults to False.
accelerate_no_split_module_classes (List[str]): The list of module classes not to split. Defaults to [].
**kwargs: Additional keyword arguments.

Raises:

ValueError: If an unsupported precision is chosen.

Returns:

None
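A hedged example call. Since load_models returns None, the loaded model and tokenizer are presumably stored on the bolt instance (e.g., as self.model and self.tokenizer; that attribute naming is an assumption):

```python
# Hedged example: load a sequence-classification model in bfloat16.
bolt.load_models(
    model_name="bert-base-uncased",
    tokenizer_name="bert-base-uncased",
    model_class="AutoModelForSequenceClassification",
    tokenizer_class="AutoTokenizer",
    device_map="auto",
    precision="bfloat16",
)
```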

preprocess_data(**kwargs)

Load and preprocess the dataset.
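Per-record preprocessing can also be customized through fine_tune's map_data callable. A hedged sketch; the "text" field is an assumed record schema, not part of this reference:

```python
# Hedged sketch of a map_data callable for fine_tune(map_data=...).
def normalize_text(example: dict) -> dict:
    example["text"] = example["text"].strip().lower()  # assumed schema
    return example

bolt.fine_tune(
    model_name="bert-base-uncased",
    tokenizer_name="bert-base-uncased",
    num_train_epochs=1,
    per_device_batch_size=8,
    map_data=normalize_text,
)
```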

upload_to_hf_hub(hf_repo_id=None, hf_commit_message=None, hf_token=None, hf_private=None, hf_create_pr=None)

Upload the model and tokenizer to Hugging Face Hub.
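A hedged example call; the repo id and token are placeholders:

```python
# Hedged example: push the fine-tuned model and tokenizer to the Hub.
bolt.upload_to_hf_hub(
    hf_repo_id="my-org/my-finetuned-model",  # hypothetical repo id
    hf_commit_message="Fine-tuned on custom data",
    hf_token="hf_...",  # your Hugging Face token
    hf_private=True,
    hf_create_pr=False,
)
```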