Source: texcla/experiment.py#L0
### create_experiment_folder

```python
create_experiment_folder(base_dir, model, lr, batch_size)
```
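The folder-naming scheme is not documented here, so the following is only an illustrative sketch of what a helper with this signature might do: derive a folder name from the hyperparameters and create it under `base_dir` (the name format and the use of a model-name string in place of the `model` object are assumptions).

```python
import os


def create_experiment_folder(base_dir, model, lr, batch_size):
    """Hypothetical sketch: build an experiment folder whose name
    encodes the hyperparameters. The naming scheme is an assumption,
    and `model` is treated here as a plain name string."""
    name = '{}_lr{}_bs{}'.format(model, lr, batch_size)
    exp_path = os.path.join(base_dir, name)
    os.makedirs(exp_path)
    return exp_path
```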
### copy_called_file

```python
copy_called_file(exp_path)
```
### create_callbacks

```python
create_callbacks(exp_path, patience)
```
### train

```python
train(model, word_encoder_model, lr=0.001, batch_size=64, epochs=50,
      patience=10, base_dir="experiments", **fit_args)
```
### load_csv

```python
load_csv(data_path=None, text_col="text", class_col="class", limit=None)
```
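The library's implementation is not shown, but the signature suggests reading a text column and a class column from a CSV file, optionally capped at `limit` rows. A minimal stdlib sketch of that behavior (not the actual implementation):

```python
import csv


def load_csv(data_path=None, text_col='text', class_col='class', limit=None):
    """Illustrative sketch: read the text and label columns from a CSV
    file with a header row, stopping after `limit` rows if given."""
    texts, labels = [], []
    with open(data_path, newline='') as f:
        for i, row in enumerate(csv.DictReader(f)):
            if limit is not None and i >= limit:
                break
            texts.append(row[text_col])
            labels.append(row[class_col])
    return texts, labels
```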
### process_save

```python
process_save(X, y, tokenizer, proc_data_path, max_len=400, save_tokenizer=True)
```

Processes the text and saves it as a Dataset.
### setup_data

```python
setup_data(X, y, tokenizer, proc_data_path, **kwargs)
```

Sets up the data.

Args:
- X: text data
- y: data labels
- tokenizer: a Tokenizer instance
- proc_data_path: path for the processed data
### split_data

```python
split_data(X, y, ratio=(0.8, 0.1, 0.1))
```

Splits data into a training, validation, and test set.

Args:
- X: text data
- y: data labels
- ratio: the ratio for splitting. Default: (0.8, 0.1, 0.1)

Returns:
- the split data: X_train, X_val, X_test, y_train, y_val, y_test
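To make the documented behavior concrete, here is a self-contained sketch of a ratio-based split with the same signature and return order (the shuffling, rounding, and seed handling are assumptions, not the library's exact behavior):

```python
import random


def split_data(X, y, ratio=(0.8, 0.1, 0.1), seed=42):
    """Illustrative sketch: shuffle indices, then carve them into
    train/val/test slices according to `ratio`."""
    assert abs(sum(ratio) - 1.0) < 1e-9, 'ratio must sum to 1'
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_train = int(len(X) * ratio[0])
    n_val = int(len(X) * ratio[1])
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]

    def pick(data, ids):
        return [data[i] for i in ids]

    return (pick(X, train), pick(X, val), pick(X, test),
            pick(y, train), pick(y, val), pick(y, test))
```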
### setup_data_split

```python
setup_data_split(X, y, tokenizer, proc_data_dir, **kwargs)
```

Sets up the data while splitting it into a training, validation, and test set.

Args:
- X: text data
- y: data labels
- tokenizer: a Tokenizer instance
- proc_data_dir: directory for the split and processed data
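The overall flow implied by the docstring is: split the data, then persist each subset under `proc_data_dir`. A simplified stdlib sketch of that flow (the file names, the pickle format, and the omission of the tokenizer/processing step are all assumptions):

```python
import os
import pickle


def setup_data_split(X, y, proc_data_dir, ratio=(0.8, 0.1, 0.1)):
    """Sketch: split X/y by ratio (in order, without shuffling, for
    brevity) and pickle each subset to its own file."""
    n_train = int(len(X) * ratio[0])
    n_val = int(len(X) * ratio[1])
    splits = {
        'train': (X[:n_train], y[:n_train]),
        'val': (X[n_train:n_train + n_val], y[n_train:n_train + n_val]),
        'test': (X[n_train + n_val:], y[n_train + n_val:]),
    }
    os.makedirs(proc_data_dir, exist_ok=True)
    for name, subset in splits.items():
        with open(os.path.join(proc_data_dir, name + '.pkl'), 'wb') as f:
            pickle.dump(subset, f)
```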
### load_data_split

```python
load_data_split(proc_data_dir)
```

Loads a split dataset.

Args:
- proc_data_dir: directory with the split and processed data

Returns:
- (Training Data, Validation Data, Test Data)
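A sketch of the inverse operation: read the three subsets back from `proc_data_dir` and return them in the documented order (the assumed on-disk layout, one pickle per subset named `train`/`val`/`test`, is an illustration, not the library's actual format):

```python
import os
import pickle


def load_data_split(proc_data_dir):
    """Sketch: load the pickled train/val/test subsets from a directory
    and return them as a (train, val, test) tuple."""
    subsets = []
    for name in ('train', 'val', 'test'):
        with open(os.path.join(proc_data_dir, name + '.pkl'), 'rb') as f:
            subsets.append(pickle.load(f))
    return tuple(subsets)
```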