How to use K-fold Cross Validation with PyTorch?

Machine learning models must be evaluated with a test set after they have been trained. We do this to ensure that models have not overfit and to ensure that they work with real-life datasets, which may have slightly deviating distributions compared to the training set.

But in order to make your model really robust, simply evaluating with a train/test split may not be enough.

For example, suppose you have a dataset composed of samples from two classes. Most of the samples in the first 80% of your dataset belong to class A, whereas most of the samples in the remaining 20% belong to class B. If you took a simple 80/20 hold-out split without shuffling, your training and testing sets would have vastly different distributions, and evaluation might lead to wrong conclusions.
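To make this concrete, here is a minimal sketch in plain Python. The label list is a hypothetical toy dataset constructed for illustration (it is not data from this article), ordered so that class A dominates the first 80% and class B the last 20%.

```python
from collections import Counter

# Hypothetical toy dataset: 100 labels, ordered so that class A dominates
# the first 80 samples and class B dominates the last 20.
labels = ["A"] * 72 + ["B"] * 8 + ["A"] * 2 + ["B"] * 18


def class_shares(part):
    """Return the fraction of each class within a list of labels."""
    counts = Counter(part)
    return {cls: counts[cls] / len(part) for cls in sorted(counts)}


# Simple 80/20 hold-out split that keeps the original ordering.
split_at = int(0.8 * len(labels))
train_labels, test_labels = labels[:split_at], labels[split_at:]

print("train:", class_shares(train_labels))  # class A makes up 90% here
print("test: ", class_shares(test_labels))   # class A makes up only 10% here
```

A model trained on the first part would mostly see class A and then be judged mostly on class B, so its test score would say little about how well it really works.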

That's something you want to avoid. In this article, you'll therefore learn about another technique that can be applied: K-fold Cross Validation. By generating train/test splits across multiple folds, you can perform multiple training and testing sessions with different splits. You'll also see how you can use K-fold Cross Validation with PyTorch, one of the leading libraries for neural networks these days.

After reading this tutorial, you will...

Understand why K-fold Cross Validation can improve your confidence in model evaluation results.
Have an idea about how K-fold Cross Validation works.
Know how to implement K-fold Cross Validation with PyTorch.

Update 29/Mar/2021: fixed possible issue with weight leaks.

Update 15/Feb/2021: fixed small textual error.

Summary and code example: K-fold Cross Validation with PyTorch

Model evaluation is often performed with a hold-out split, typically 80/20, where 80% of your dataset is used for training the model and 20% for evaluating it. While this is a simple approach, it is also very naïve, since it assumes that the data is representative across the splits, that it's not a time series dataset and that there are no redundant samples within the datasets.

K-fold Cross Validation is a more robust evaluation technique. It splits the dataset into \(k\) parts of roughly equal size. In each of the \(k\) folds, or situations, \(k-1\) parts serve as training batches and the remaining part serves as the testing batch. Using the training batches, you can then train your model, and subsequently evaluate it with the testing batch. This allows you to train the model multiple times with different dataset configurations. Even better, it allows you to be more confident in your model evaluation results.
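The splitting mechanics can be sketched with Scikit-learn's `KFold` on a handful of sample indices. The ten-sample array, the variable names and the `random_state` value are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical toy dataset: 10 samples, identified by their index.
samples = np.arange(10)
k = 5

kfold = KFold(n_splits=k, shuffle=True, random_state=42)

test_counts = np.zeros(len(samples), dtype=int)
fold_sizes = []
for fold, (train_ids, test_ids) in enumerate(kfold.split(samples)):
    # In every fold, k-1 parts are used for training and 1 part for testing.
    print(f"Fold {fold}: train={train_ids.tolist()} test={test_ids.tolist()}")
    test_counts[test_ids] += 1
    fold_sizes.append((len(train_ids), len(test_ids)))

# Across the k folds, each sample ends up in the testing part exactly once.
print(test_counts.tolist())
```

With 10 samples and \(k = 5\), every fold trains on 8 samples and tests on 2, and no sample is ever in the training and testing part of the same fold.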

Below, you will see a full example of using K-fold Cross Validation with PyTorch, using Scikit-learn's KFold functionality. You can use it right away. If you want to understand things in more detail, however, it's best to continue reading the rest of the tutorial as well! 🚀

```
import os
import torch
from torch import nn
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader, ConcatDataset
from torchvision import transforms
from sklearn.model_selection import KFold


def reset_weights(m):
    '''
    Try resetting model weights to avoid weight leakage.
    '''
    for layer in m.children():
        if hasattr(layer, 'reset_parameters'):
            print(f'Reset trainable parameters of layer = {layer}')
            layer.reset_parameters()


class SimpleConvNet(nn.Module):
    '''
    Simple Convolutional Neural Network
    '''
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 10, kernel_size=3),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(26 * 26 * 10, 50),
            nn.ReLU(),
            nn.Linear(50, 20),
            nn.ReLU(),
            nn.Linear(20, 10)
        )

    def forward(self, x):
        '''Forward pass'''
        return self.layers(x)


if __name__ == '__main__':

    # Configuration options
    k_folds = 5
    num_epochs = 1
    loss_function = nn.CrossEntropyLoss()

    # For fold results
    results = {}

    # Set fixed random number seed
    torch.manual_seed(42)

    # Prepare MNIST dataset by concatenating Train/Test part; we split later.
    dataset_train_part = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor(), train=True)
    dataset_test_part = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor(), train=False)
    dataset = ConcatDataset([dataset_train_part, dataset_test_part])

    # Define the K-fold Cross Validator
    kfold = KFold(n_splits=k_folds, shuffle=True)

    # Start print (the listing is cut off here)
```
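The listing stops right after the cross validator is defined. What usually comes next is a loop over the folds, in which a model is trained on the training part and evaluated on the testing part. The sketch below is not the remainder of the listing above. It is a minimal, self-contained illustration under stated assumptions: it uses hypothetical synthetic data instead of MNIST, a small fully connected model instead of `SimpleConvNet`, and a hypothetical `make_model` helper that builds a fresh model per fold instead of calling `reset_weights`.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset
from sklearn.model_selection import KFold

torch.manual_seed(42)

# Hypothetical synthetic data: 200 samples, 8 features, 2 classes.
features = torch.randn(200, 8)
targets = (features[:, 0] > 0).long()
dataset = TensorDataset(features, targets)


def make_model():
    """Create a fresh, small classifier so that no weights leak between folds."""
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))


loss_function = nn.CrossEntropyLoss()
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
results = {}

for fold, (train_ids, test_ids) in enumerate(kfold.split(np.arange(len(dataset)))):
    # Samplers restrict each loader to the indices of the current fold.
    train_loader = DataLoader(
        dataset, batch_size=20, sampler=SubsetRandomSampler(train_ids.tolist()))
    test_loader = DataLoader(
        dataset, batch_size=20, sampler=SubsetRandomSampler(test_ids.tolist()))

    model = make_model()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

    # Train on the k-1 training parts.
    model.train()
    for epoch in range(5):
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_function(model(inputs), labels)
            loss.backward()
            optimizer.step()

    # Evaluate on the held-out testing part.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in test_loader:
            predicted = model(inputs).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    results[fold] = 100.0 * correct / total
    print(f"Accuracy for fold {fold}: {results[fold]:.1f} %")

print(f"Average over {len(results)} folds: {sum(results.values()) / len(results):.1f} %")
```

Reporting the per-fold scores next to their average shows how much the result depends on the particular split, which is exactly the information a single hold-out split cannot give you.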

February 2, 2021 by Chris
