If you're training Convolutional Neural Networks with Keras, it may be that you don't want the size of your feature maps to be smaller than the size of your inputs - for example, because you're using a Conv layer in an autoencoder, where your goal is to generate a full-size feature map rather than to reduce the size of the output.
Fortunately, this is possible with padding, which essentially puts your feature map inside a frame that, combined, has the same size as your input data. Unfortunately, the Keras framework for deep learning only supports Zero Padding by design. This is especially unfortunate because there are types of padding - such as Reflection Padding and Replication Padding - which may interfere less with the distribution of your data during training.
Now, there's no point in giving up :) That's why we got inspired by an answer on StackOverflow and got to work (StackOverflow, n.d.). Consequently, this blog post presents implementations of Constant Padding, Reflection Padding and Replication Padding to be used with TensorFlow 2.0 based Keras. The implementations are available for 1D and 2D data. Besides the implementation, it will also show you how to use them in an actual Keras model.
Are you ready? Let's go!
Update 05/Nov/2020: added 'TensorFlow' to the title in order to reflect the deep integration between TensorFlow and Keras in TensorFlow 2.x.
Suppose that you are training a convolutional neural network, which is a type of neural network where so-called "convolutional layers" serve as feature extractors:
In the drawing above, some input data (which is likely an RGB image) of height \(H\) and width \(W\) is fed to a convolutional layer. This layer, which slides (or "convolves") \(N\) kernels of size 3x3x3 over the input, produces \(N\) so-called "feature maps" as output. Through the weights of the kernels, which have been optimized based on the training dataset, the neural network learns to recognize features in the input image.
Note that often, a convolutional neural network consists of quite a few convolutional layers stacked on top of each other. In this case, the feature map that is the output of the first layer, is used as the input of the second, and so on.
Now, due to the way such layers work, the size of one feature map (e.g. \(H_{fm}\) and \(W_{fm}\) in the image above) is smaller than the size of the input to the layer (\(H\) and \(W\)). However, sometimes, you don't want this to happen. Rather, you wish that the size of the feature map is equal to - or perhaps larger than - the size of your input data.
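To make this concrete: assuming a square kernel of size \(K \times K\) that slides over the input with stride 1 and without padding (the Conv layer default), the feature map dimensions are \(H_{fm} = H - K + 1\) and \(W_{fm} = W - K + 1\). A 3x3 kernel, for example, loses two pixels in each spatial direction.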
Padding can be used to achieve this. By wrapping the outcome in some "frame", you can ensure that the size of the output is equal to that of the input. However, what does this frame look like? In our article about padding, we saw that zeros are often used for this. However, we also saw that this might result in worse performance due to the fact that zero padding is claimed to interfere with the distribution of your dataset. Reflection padding and replication padding are introduced as possible fixes for this issue, together with constant padding.
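To get a feel for what these frames look like, here's a minimal sketch using TensorFlow's pad function - which also powers the custom layers below - applied to a tiny made-up tensor. The numbers are purely illustrative:

import tensorflow as tf

# One sample with four values; tf.pad expects one (before, after) pair per axis.
x = tf.constant([[1., 2., 3., 4.]])
paddings = [[0, 0], [1, 2]]  # pad 1 value on the left, 2 on the right

print(tf.pad(x, paddings, mode='CONSTANT'))                      # zero padding:        [[0. 1. 2. 3. 4. 0. 0.]]
print(tf.pad(x, paddings, mode='CONSTANT', constant_values=23))  # constant padding:    [[23. 1. 2. 3. 4. 23. 23.]]
print(tf.pad(x, paddings, mode='REFLECT'))                       # reflection padding:  [[2. 1. 2. 3. 4. 3. 2.]]
print(tf.pad(x, paddings, mode='SYMMETRIC'))                     # replication padding: [[1. 1. 2. 3. 4. 4. 3.]]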
Unfortunately, Keras does not support this, as it only supports zero padding. That's why the rest of this blog will introduce constant padding, reflection padding and replication padding to Keras. The code below is compatible with TensorFlow 2.0 based Keras and hence should still work for quite some time from now. If not, feel free to leave a message in the comments box, and I'll try to fix it for you :)
Let's take a look at the first type: using constant padding with Keras.
The first type of padding that we'll make available for Keras: constant padding
Let's take a look at what constant padding does by means of this schematic drawing:
As you can see, the feature maps that are the output of the Conv2D layer that is applied to the input data are smaller than the input data itself. This is perfectly normal, and normally, one would apply zero padding. However, can't we pad with a constant value \(c\) instead of zeros?
Yes!
This is what constant padding does: the "frame" around the feature maps, which ensures that their size equals the size of the input data, is filled with the specified \(c\). Let's now take a look at Keras implementations for 1D and 2D data :)
First, constant padding for 1D data - a.k.a. ConstantPadding1D:
from tensorflow import pad
from tensorflow.keras.layers import Layer

'''
  1D Constant Padding
  Attributes:
    - padding: (padding_left, padding_right) tuple
    - constant: int (default = 0)
'''
class ConstantPadding1D(Layer):
    def __init__(self, padding=(1, 1), constant=0, **kwargs):
        # Lift the padding configuration and the constant into class scope.
        self.padding = tuple(padding)
        self.constant = constant
        super(ConstantPadding1D, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        # Input shape is (batch, steps, channels); only the steps axis grows.
        return (input_shape[0], input_shape[1] + self.padding[0] + self.padding[1], input_shape[2])

    def call(self, input_tensor, mask=None):
        padding_left, padding_right = self.padding
        # Pad the steps axis (axis 1) with the constant value; batch and channel axes stay untouched.
        return pad(input_tensor, [[0, 0], [padding_left, padding_right], [0, 0]], mode='CONSTANT', constant_values=self.constant)
The code above effectively defines a new layer type for Keras, which we call ConstantPadding1D. It's defined as a class and hence can be initialized multiple times. It is composed of three definitions:

- __init__, which is the class constructor, and serves to lift the variables passed on creation (padding and constant, respectively) into class scope, which means that every definition can use them.
- compute_output_shape, which does what it suggests: it computes the output shape for the layer. In our case, that's the new shape of our Conv1D output data, after padding is applied as well.
- call, which is where the data (the input_tensor) flows through.

Now, let's take a look at whether it works. If we applied ConstantPadding1D with constant = 0 and padding = (5, 4) after a Conv1D layer with kernel_size = 10, we should expect to see Zero Padding applied to 1D data:
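For completeness, here's a minimal sketch of how such a model could be put together; the input length of 100 and the single channel are assumptions of ours, not necessarily the configuration used to generate the plots:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D

# Conv1D with kernel_size=10 shrinks 100 steps to 91; padding (5, 4) brings it back to 100.
model = Sequential()
model.add(Conv1D(1, kernel_size=10, activation='linear', input_shape=(100, 1)))
model.add(ConstantPadding1D(padding=(5, 4), constant=0))
model.summary()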
Indeed, the left and the right of the padded feature map clearly show the zeroes being padded successfully. This is supported even more by the fact that, if we change to constant = 23, the padding changes color, as expected:
In both padding cases, note that the "left side" of the input is very dark, and that this darkness is also visible in the feature map. This gives us some confidence that what we visualize is indeed the actual feature map :)
Here's ConstantPadding2D:
from tensorflow import pad
from tensorflow.keras.layers import Layer

'''
  2D Constant Padding
  Attributes:
    - padding: (padding_width, padding_height) tuple
    - constant: int (default = 0)
'''
class ConstantPadding2D(Layer):
    def __init__(self, padding=(1, 1), constant=0, **kwargs):
        self.padding = tuple(padding)
        self.constant = constant
        super(ConstantPadding2D, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        # Input shape is (batch, height, width, channels); height grows by 2 * padding_height, width by 2 * padding_width.
        return (input_shape[0], input_shape[1] + 2 * self.padding[1], input_shape[2] + 2 * self.padding[0], input_shape[3])

    def call(self, input_tensor, mask=None):
        padding_width, padding_height = self.padding
        # Pad the height axis (axis 1) and width axis (axis 2) with the constant value.
        return pad(input_tensor, [[0, 0], [padding_height, padding_height], [padding_width, padding_width], [0, 0]], mode='CONSTANT', constant_values=self.constant)
The code is pretty similar to the one of ConstantPadding1D:

- It has the same __init__, compute_output_shape and call definitions.
- compute_output_shape differs from the 1D version, for the simple reason that both produce a different shape :)
- The paddings attribute that is applied to the pad function is also different, and suitable for 2D padding.

Now, time for results :) Applying ConstantPadding2D with constant = 0 equals Zero Padding:
However, the strength of ConstantPadding2D over Keras' built-in ZeroPadding2D is that you can use any constant, as with ConstantPadding1D. For example, with constant = 23, this is what you get:
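If you want to verify this numerically rather than visually, a quick sanity check on a dummy tensor (shapes chosen arbitrarily here) shows both the grown output shape and the constant in the frame:

import tensorflow as tf

x = tf.zeros((1, 6, 6, 3))                                  # (batch, height, width, channels)
padded = ConstantPadding2D(padding=(2, 1), constant=23)(x)  # width grows by 2*2, height by 2*1
print(padded.shape)                                         # (1, 8, 10, 3)
print(float(padded[0, 0, 0, 0]))                            # 23.0: the frame is filled with the constant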
Great! :D
The second type of padding that we'll make available for Keras: reflection padding
In order to understand reflection padding, it's important that we first take a look at this schematic drawing of \((1, 2)\) padding which, by coincidence ;-), we call "reflection padding":
Let's now take a look at the first row of our unpadded input, i.e. the yellow box: it's \([3, 5, 1]\). Reflection padding essentially uses the contents of this row to fill the padding directly next to it. For example, move to the right along the row, i.e. from 3 via 5 to 1. Then, move one additional box to the right, into the padding - you'll find a 5. Hey, that's the middle value of our row! Move once more and you'll find a 3 - that's the first value! And so on. You see the same happening on the left, and on top.
Reflection padding thus "reflects" the row into the padding. This is useful because it ensures that your outputs will transition "smoothly" into the padding. Possibly, this improves the performance of your model, because padded inputs will still look like the original ones in terms of data distribution (Liu et al., 2018).
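You can reproduce the first row of the drawing directly with tf.pad - assuming the same \((1, 2)\) padding, the reflected values appear exactly as described above:

import tensorflow as tf

row = tf.constant([[3., 5., 1.]])
# (1, 2) padding on the row axis: one value on the left, two on the right.
print(tf.pad(row, [[0, 0], [1, 2]], mode='REFLECT'))  # [[5. 3. 5. 1. 5. 3.]]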
Here's the implementation for 1D data, i.e. ReflectionPadding1D:
from tensorflow import pad
from tensorflow.keras.layers import Layer

'''
  1D Reflection Padding
  Attributes:
    - padding: (padding_left, padding_right) tuple
'''
class ReflectionPadding1D(Layer):
    def __init__(self, padding=(1, 1), **kwargs):
        self.padding = tuple(padding)
        super(ReflectionPadding1D, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        # Input shape is (batch, steps, channels); only the steps axis grows.
        return (input_shape[0], input_shape[1] + self.padding[0] + self.padding[1], input_shape[2])

    def call(self, input_tensor, mask=None):
        padding_left, padding_right = self.padding
        # REFLECT mirrors the values next to the edge, without repeating the edge value itself.
        return pad(input_tensor, [[0, 0], [padding_left, padding_right], [0, 0]], mode='REFLECT')
Once again, the class is similar to the paddings we've already seen. However, the contents of the call operation are different. Particularly, the mode has changed from CONSTANT into REFLECT. What's more, the constant_values attribute was removed, and so was the self.constant assignment, simply because we don't need them here.
Now, when we apply this padding to a 1D input, we can see how it works. Firstly, the kernel reduces the input into a feature map - as you can see, the "dark area" on the left of your input has moved to approximately position 5 in the feature map. It's also 5 times smaller, which makes sense, given our large kernel size.
However, what you'll also see, is that from the "max value" in your feature map at around 5, exact symmetry is visible on the left and on the right. This is indeed reflection padding in action!
Now, let's take a look at 2D Reflection Padding, or ReflectionPadding2D:
from tensorflow import pad
from tensorflow.keras.layers import Layer

'''
  2D Reflection Padding
  Attributes:
    - padding: (padding_width, padding_height) tuple
'''
class ReflectionPadding2D(Layer):
    def __init__(self, padding=(1, 1), **kwargs):
        self.padding = tuple(padding)
        super(ReflectionPadding2D, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        # Input shape is (batch, height, width, channels).
        return (input_shape[0], input_shape[1] + 2 * self.padding[1], input_shape[2] + 2 * self.padding[0], input_shape[3])

    def call(self, input_tensor, mask=None):
        padding_width, padding_height = self.padding
        # Mirror the rows and columns next to the edges into the padding.
        return pad(input_tensor, [[0, 0], [padding_height, padding_height], [padding_width, padding_width], [0, 0]], mode='REFLECT')
The compute_output_shape definition is equal to the one in ConstantPadding2D. So is the call operation, except for CONSTANT -> REFLECT and the removal of self.constant. Nothing too exciting :)
...except for the results, perhaps :) Using 2D data, the effect of reflection padding is even more visible. As you can see, with a relatively large kernel, the input is reduced to a more abstract feature map. However, the feature map has the same size as the input data, and shows perfect symmetry around the edges. Reflection padding in action! :)
The third type of padding that we'll make available for Keras: replication padding
Replication padding is pretty similar to reflection padding, actually, and attempts to achieve the same outcome: that the distribution of your data is disturbed as little as possible (Liu et al., 2018).
However, it does so in a slightly different way:
Instead of a pure reflection, like Reflection Padding, replication padding makes a copy of the input, reverses it, and then applies it. Take a look at the first row again: \([3, 5, 1]\) -> \([1, 5, 3]\), after which it's applied. In the results, you should thus see a broader "transition zone" from input to padding.
In TensorFlow, replication padding is not known by the name "replication padding". Instead, in TF, it's called "symmetric padding". Hence, we'll use SYMMETRIC as our padding mode throughout the 1D and 2D examples that will follow next.
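To see the difference with reflection padding on the same example row, you can compare the two modes directly - SYMMETRIC repeats the edge values, REFLECT does not:

import tensorflow as tf

row = tf.constant([[3., 5., 1.]])
print(tf.pad(row, [[0, 0], [1, 2]], mode='REFLECT'))    # [[5. 3. 5. 1. 5. 3.]]  edge values not repeated
print(tf.pad(row, [[0, 0], [1, 2]], mode='SYMMETRIC'))  # [[3. 3. 5. 1. 1. 5.]]  edge values copied first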
Here's the code for ReplicationPadding1D:
from tensorflow import pad
from tensorflow.keras.layers import Layer

'''
  1D Replication Padding
  Attributes:
    - padding: (padding_left, padding_right) tuple
'''
class ReplicationPadding1D(Layer):
    def __init__(self, padding=(1, 1), **kwargs):
        self.padding = tuple(padding)
        super(ReplicationPadding1D, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        # Input shape is (batch, steps, channels); only the steps axis grows.
        return (input_shape[0], input_shape[1] + self.padding[0] + self.padding[1], input_shape[2])

    def call(self, input_tensor, mask=None):
        padding_left, padding_right = self.padding
        # SYMMETRIC mirrors the values including the edge value itself, i.e. it replicates the border.
        return pad(input_tensor, [[0, 0], [padding_left, padding_right], [0, 0]], mode='SYMMETRIC')
Compared to reflection padding, not much was changed: REFLECT -> SYMMETRIC.
Now, let's take a look at the results :)
Indeed, it's clear that the "transition zone" between input and padding is broader. Compared to reflection padding, the gray zone around position 5 is broader. This, obviously, is caused by the fact that replication padding makes a copy instead of a reflection:
Now, the code for 2D Replication Padding a.k.a. ReplicationPadding2D:
from tensorflow import pad
from tensorflow.keras.layers import Layer

'''
  2D Replication Padding
  Attributes:
    - padding: (padding_width, padding_height) tuple
'''
class ReplicationPadding2D(Layer):
    def __init__(self, padding=(1, 1), **kwargs):
        self.padding = tuple(padding)
        super(ReplicationPadding2D, self).__init__(**kwargs)

    def compute_output_shape(self, input_shape):
        # Input shape is (batch, height, width, channels).
        return (input_shape[0], input_shape[1] + 2 * self.padding[1], input_shape[2] + 2 * self.padding[0], input_shape[3])

    def call(self, input_tensor, mask=None):
        padding_width, padding_height = self.padding
        # Copy the border rows and columns outward into the padding.
        return pad(input_tensor, [[0, 0], [padding_height, padding_height], [padding_width, padding_width], [0, 0]], mode='SYMMETRIC')
Here, the difference between replication and reflection padding is even more clearly visible. The feature map is generated - i.e., it's more abstract than the input data - and it is padded smoothly. However, the "transition zone" is broader than with reflection padding - and this is clearly visible on the bottom right of the feature map:
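A small numerical sketch - on a made-up 4x4 grid - also makes this broader transition zone visible: replication duplicates the border rows and columns, while reflection mirrors the values next to them:

import tensorflow as tf

# Hypothetical 4x4 single-channel grid, purely for illustration.
x = tf.reshape(tf.range(16, dtype=tf.float32), (1, 4, 4, 1))
print(ReflectionPadding2D(padding=(2, 2))(x)[0, :, :, 0].numpy())   # mirrored around the edges
print(ReplicationPadding2D(padding=(2, 2))(x)[0, :, :, 0].numpy())  # edge rows/columns copied outward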
Okay, time to show you how to use these paddings :)
We've made ConstantPadding, ReflectionPadding and ReplicationPadding available for Conv1D and Conv2D layers in Keras, but we still don't know how to use them.

Here's an example for 2D padding, where we create a Sequential model, apply a Conv2D layer and subsequently apply both replication and constant padding. Obviously, this will produce a padded feature map that is larger than our original input. However, we just wanted to show how to apply replication/reflection padding as well as constant padding, since the latter requires an additional parameter :)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.initializers import Ones

model = Sequential()
model.add(Conv2D(img_num_channels, kernel_size=(5, 5), activation='linear', input_shape=input_shape, kernel_initializer=Ones(), bias_initializer=Ones()))
model.add(ReplicationPadding2D(padding=(3, 3)))
model.add(ConstantPadding2D(padding=(5, 4), constant=23))
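In the snippet above, img_num_channels and input_shape come from earlier in your own script. To sanity check the shape arithmetic, here's the same stack on a concrete, assumed 28x28x3 input - the comments show how each layer changes the spatial size:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(3, kernel_size=(5, 5), activation='linear', input_shape=(28, 28, 3)))  # -> (24, 24, 3)
model.add(ReplicationPadding2D(padding=(3, 3)))                                          # -> (30, 30, 3)
model.add(ConstantPadding2D(padding=(5, 4), constant=23))                                # -> (38, 40, 3)
model.summary()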
In this blog post, you have seen how to use Constant Padding, Reflection Padding and Replication Padding with Keras using TensorFlow. The blog started with a recap on padding, showing that you might need it if you want your Conv-generated feature maps to be of equal size to your input data. This was followed by discussing how Keras only supports zero padding, while more advanced paddings are available.
We subsequently provided Python based implementations of these paddings, and gave an example of how to apply them into your Keras models.
I hope you've learnt something from this blog or that it was useful! :) If it was, feel free to leave a comment in the comments section. Please do the same if you think I made mistakes, or when you have questions or remarks.
Thank you for reading MachineCurve today and happy engineering!
StackOverflow. (n.d.). Reflection padding Conv2D. Retrieved from https://stackoverflow.com/questions/50677544/reflection-padding-conv2d
MachineCurve. (2020, February 9). What is padding in a neural network? Retrieved from https://www.machinecurve.com/index.php/2020/02/07/what-is-padding-in-a-neural-network/
Liu, G., Shih, K. J., Wang, T. C., Reda, F. A., Sapra, K., Yu, Z., ... & Catanzaro, B. (2018). Partial convolution based padding. arXiv preprint arXiv:1811.11718.
TensorFlow. (n.d.). tf.pad. Retrieved from https://www.tensorflow.org/api_docs/python/tf/pad