4.2. Machine Learning Workflow¶
In machine learning systems, the fundamental design objective of programming models is to offer comprehensive workflow programming support for developers. A typical machine learning task adheres to the workflow depicted in Figure Fig. 4.2.1. This workflow involves loading the training dataset, training, testing, and debugging models. The following APIs are defined to facilitate customization within the workflow (assuming that high-level APIs are provided as Python functions):
- Data Processing API: Users first require a data processing API to read datasets from a disk. Subsequently, they need to preprocess the data to make it suitable for input into machine learning models. Code - ch02/code2.2.1is an example of how PyTorch can be used to load data and create data loaders for both training and testing purposes.
ch02/code2.2.1
import pickle
from torch.utils.data import Dataset, DataLoader
data_path = '/path/to/data'
dataset = pickle.load(open(data_path, 'rb')) # Example for a pkl file
batch_size = ... # You can make it an argument of the script
class CustomDataset(Dataset):
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        sample = self.data[idx]
        label = self.labels[idx]
        return sample, label
training_dataset = CustomDataset(dataset['training_data'], dataset['training_labels'])
testing_dataset = CustomDataset(dataset['testing_data'], dataset['testing_labels'])
training_dataloader = DataLoader(training_dataset, batch_size=batch_size, shuffle=True)  # Create a training dataloader
testing_dataloader = DataLoader(testing_dataset, batch_size=batch_size, shuffle=False) # Create a testing dataloader
- Model Definition API: Once the data is preprocessed, users need a model definition API to define machine learning models. These models include model parameters and can perform inference based on given data. Code - ch02/code2.2.2is an example of how to create a custom model in Pytorch:
ch02/code2.2.2
import torch.nn as nn
class CustomModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(CustomModel, self).__init__()
        self.linear = nn.Linear(input_size, output_size)  # A single linear layer
    def forward(self, x):
        return self.linear(x)
- Optimizer Definition API: The outputs of models need to be compared with user labels, and their difference is evaluated using a loss function. The optimizer definition API enables users to define their own loss functions and import or define optimization algorithms based on the actual loss. These algorithms calculate gradients and update model parameters. Code - ch02/code2.2.3is an example of an optimizer definition in Pytorch:
ch02/code2.2.3
import torch.optim as optim
import torch.nn
model = CustomModel(...)
# Optimizer definition (Adam, SGD, etc.)
optimizer = optim.Adam(model.parameters(), lr=1e-4, momentum=0.9)
loss = nn.CrossEntropyLoss() # Loss function definition
- Training API: Given a dataset, model, loss function, and optimizer, users require a training API to define a loop that reads data from datasets in a mini-batch mode. In this process, gradients are computed repeatedly, and model parameters are updated accordingly. This iterative update process is known as training. Code - ch02/code2.2.4is an example of how to train a model in Pytorch:
ch02/code2.2.4
device = "cuda:0" if torch.cuda.is_available() else "cpu" # Select your training device
model.to(device) # Move the model to the training device
model.train() # Set the model to train mode
epochs = ... # You can make it an argument of the script
for epoch in range(epochs):
    for batch_idx, (data, target) in enumerate(training_dataloader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad() # zero the parameter gradients
        output = model(data) # Forward pass
        loss_value = loss(output, target) # Compute the loss
        loss_value.backward() # Backpropagation
        optimizer.step()
- Testing and Debugging APIs: Throughout the training process, users need a testing API to evaluate the accuracy of the model (training concludes when the accuracy exceeds the set goal). Additionally, a debugging API is necessary to verify the performance and correctness of the model. Code - ch02/code2.2.5is an example of model evaluation in Pytorch:
ch02/code2.2.5
model.eval() # Set the model to evaluation mode
overall_accuracy = []
for batch_idx, (data, target) in enumerate(testing_dataloader):
    data, target = data.to(device), target.to(device)
    output = model(data) # Forward pass
    accuracy = your_metrics(output, target) # Compute the accuracy
    overall_accuracy.append(accuracy) # Print the accuracy
# For debugging, you can print logs inside the training or evaluation loop, or use python debugger.
Fig. 4.2.1 Workflow within a machine learningsystem¶
