This post is a comprehensive review of data augmentation techniques for deep learning, specific to images. Data augmentation is a regularization technique. It consists of generating new training instances from existing ones, artificially boosting the size of the training set.
This reduces overfitting. The trick is to generate realistic training instances: ideally, a human should not be able to tell which instances were generated and which were not. Many of these tricks are used when training convolutional neural networks. Instances should be generated on the fly during training, which is more efficient than wasting storage space and network bandwidth on pre-generated copies. TensorFlow offers several image manipulation operations such as transposing (swapping height and width), rotating, resizing, flipping and cropping, as well as adjusting the brightness, contrast, saturation and hue. This makes it easy to implement data augmentation for image datasets.
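To make the "on the fly" idea concrete, here is a minimal sketch assuming TensorFlow 2.x and its tf.data API. The `augment` function, the synthetic images and the dummy labels are all hypothetical stand-ins, not part of the original post.

```python
import tensorflow as tf

def augment(image, label):
    """Apply random augmentations to a single (image, label) pair."""
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    return image, label

# Synthetic stand-ins for a real dataset: 8 RGB images with dummy labels.
images = tf.random.uniform([8, 187, 269, 3])
labels = tf.zeros([8], dtype=tf.int32)

# map() applies augment lazily while the dataset is iterated,
# so augmented copies are never written to disk.
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).map(augment).batch(4)

first_images, first_labels = next(iter(dataset))
print(first_images.shape, first_labels.shape)
```

Because the augmentation runs each time the dataset is iterated, every epoch can see a different variant of each image.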
To illustrate the different augmentation techniques we need an image, and ours is a cute corgi puppy picture!
The image is RGB with height 187 and width 269.
Flip
You can flip images horizontally and vertically. Some frameworks do not provide a function for vertical flips. However, a vertical flip is equivalent to rotating an image by 180 degrees and then performing a horizontal flip.
Vertical Flip
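A minimal sketch of a vertical flip, assuming TensorFlow 2.x with eager execution. The corgi picture is not available here, so a random tensor of the same shape stands in for it. The sketch also checks the equivalence mentioned above (rotate by 180 degrees, then flip horizontally).

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# Direct vertical flip (upside down).
flipped_v = tf.image.flip_up_down(image)

# Same result via a 180-degree rotation followed by a horizontal flip.
via_rotation = tf.image.flip_left_right(tf.image.rot90(image, k=2))

same = bool(tf.reduce_all(tf.equal(flipped_v, via_rotation)).numpy())
print(same)
```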
Horizontal Flip
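A matching sketch for the horizontal flip, again assuming TensorFlow 2.x and a random stand-in image rather than the original picture.

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# Mirror the image left to right; the shape is unchanged.
flipped_h = tf.image.flip_left_right(image)
print(flipped_h.shape)
```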
Random Flipping
Randomly flips an image vertically (upside down) with a 1 in 2 chance; otherwise outputs the image as-is.
Randomly flips an image horizontally (left to right) with a 1 in 2 chance; otherwise outputs the image as-is.
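The two random flips described above could look like this in TensorFlow 2.x. This is a sketch on a random stand-in image; the variable names are illustrative.

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# Each call flips with probability 0.5, otherwise returns the input unchanged.
maybe_flipped_v = tf.image.random_flip_up_down(image)
maybe_flipped_h = tf.image.random_flip_left_right(image)
```

Either result is identical to the input or to its deterministic flip, so calling these inside a training loop yields a mix of both.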
Alternatively, you can use tf.reverse for the same purpose. tf.reverse accepts an additional argument, axis, which defines whether the image is flipped along the x axis or the y axis.
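As a sketch of the tf.reverse alternative, assuming TensorFlow 2.x and a single 3-D image laid out as (height, width, channels): reversing axis 0 flips the rows, and reversing axis 1 flips the columns.

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# axis 0 is height: reversing it turns the image upside down.
reversed_rows = tf.reverse(image, axis=[0])

# axis 1 is width: reversing it mirrors the image left to right.
reversed_cols = tf.reverse(image, axis=[1])
```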
Rotation
NOTE: k denotes the number of times the image is rotated by 90 degrees anti-clockwise.
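A short sketch of 90-degree rotations with tf.image.rot90, assuming TensorFlow 2.x and a random stand-in image. Note that an odd k swaps height and width.

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# k=1: one 90-degree anti-clockwise turn, so height and width swap.
rotated_once = tf.image.rot90(image, k=1)

# k=2: a 180-degree turn, so the shape is unchanged.
rotated_twice = tf.image.rot90(image, k=2)

print(rotated_once.shape, rotated_twice.shape)
```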
To rotate by an arbitrary angle, we use TensorFlow's tf.contrib.image.rotate() function. In the example below, angles is in radians, computed as angles = degrees * math.pi / 180. Let's do a 135-degree anticlockwise rotation:
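The degree-to-radian conversion can be sketched with the standard library alone. The rotate call itself appears only as a comment, because tf.contrib exists only in TensorFlow 1.x and was removed in TensorFlow 2.x.

```python
import math

degrees = 135

# Convert degrees to radians, as required by the angles argument.
angles = degrees * math.pi / 180
print(angles)

# In TensorFlow 1.x, this value would then be passed along the lines of:
#   rotated = tf.contrib.image.rotate(image, angles=angles)
```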
Brightness
Changes the brightness of an image
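A minimal sketch with tf.image.adjust_brightness, assuming TensorFlow 2.x and a float image in [0, 1). The image is a random stand-in and the delta value is chosen only for illustration.

```python
import tensorflow as tf

# Stand-in for the corgi image, with float pixel values in [0, 1).
image = tf.random.uniform([187, 269, 3])

# Add 0.2 to every pixel value to brighten the image.
brighter = tf.image.adjust_brightness(image, delta=0.2)
```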
Here, delta is the amount added to each pixel value. The larger the value of delta, the brighter the image will be. If delta is negative, the image becomes darker. To apply random brightness, where delta is randomly picked in the interval [-max_delta, max_delta), you can use the function below:
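A sketch of the random variant using tf.image.random_brightness, assuming TensorFlow 2.x. The max_delta value and the stand-in image are illustrative.

```python
import tensorflow as tf

# Stand-in for the corgi image, with float pixel values in [0, 1).
image = tf.random.uniform([187, 269, 3])

# A single delta is drawn from [-0.3, 0.3) and added to every pixel.
randomly_adjusted = tf.image.random_brightness(image, max_delta=0.3)

# Recover the delta that was applied by comparing one pixel.
applied_delta = float((randomly_adjusted[0, 0, 0] - image[0, 0, 0]).numpy())
print(applied_delta)
```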
Crop
Central Crop
Crops the central region of the image(s): the outer parts are removed and the central region is retained along each dimension. With central_fraction = 0.5, the function returns the middle region covering roughly half of the height and half of the width.
This function works on either a single image (image is a 3-D Tensor), or a batch of images (image is a 4-D Tensor).
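A sketch of tf.image.central_crop on both a single image and a batch, assuming TensorFlow 2.x. The tensors are random stand-ins.

```python
import tensorflow as tf

# A single 3-D image and a 4-D batch of four images.
image = tf.random.uniform([187, 269, 3])
batch = tf.random.uniform([4, 187, 269, 3])

# Keep the central 50% along height and width.
cropped = tf.image.central_crop(image, central_fraction=0.5)
cropped_batch = tf.image.central_crop(batch, central_fraction=0.5)

print(cropped.shape, cropped_batch.shape)
```

With odd dimensions such as 187 and 269, the output size is rounded, so it is close to, but not exactly, half of each side.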
crop_to_bounding_box
Crops an image to a specified bounding box.
This op cuts a rectangular part out of image. The top-left corner of the returned image is at (offset_height, offset_width) in image, and its lower-right corner is at (offset_height + target_height, offset_width + target_width).
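A sketch of tf.image.crop_to_bounding_box, assuming TensorFlow 2.x. The offsets and target sizes are arbitrary illustrative values, and the result matches plain tensor slicing.

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# Cut a 100 x 120 box whose top-left corner sits at row 40, column 60.
box = tf.image.crop_to_bounding_box(
    image,
    offset_height=40,
    offset_width=60,
    target_height=100,
    target_width=120,
)

# The same region expressed as a slice: rows 40..139, columns 60..179.
sliced = image[40:140, 60:180, :]
print(box.shape)
```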
Random Crop
Unlike scaling, we just randomly sample a section from the original image. We then resize this section to the original image size. This method is popularly known as random cropping.
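A minimal sketch of random cropping followed by resizing, assuming TensorFlow 2.x. The crop size and seed are illustrative values, and the image is a random stand-in.

```python
import tensorflow as tf

# Stand-in for the corgi image: height 187, width 269, 3 channels.
image = tf.random.uniform([187, 269, 3])

# Sample a random 100 x 150 window (all 3 channels are kept).
crop = tf.image.random_crop(image, size=[100, 150, 3], seed=42)

# Scale the window back up to the original height and width.
resized = tf.image.resize(crop, size=[187, 269])

print(crop.shape, resized.shape)
```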
Since we set the seed, it will always crop the same area. However, if seed=None, a different part of the image will be cropped and resized to the original size every time you run this cell.
Gaussian Noise
Adding a small amount of zero-mean Gaussian noise to the input can help the model generalize, while too much noise drowns out the image content.
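One way to add Gaussian noise is sketched below, assuming TensorFlow 2.x and a float image in [0, 1]. The standard deviation of 0.05 is an arbitrary illustrative choice, and the clipping step is an assumption to keep pixel values valid.

```python
import tensorflow as tf

# Stand-in for the corgi image, with float pixel values in [0, 1).
image = tf.random.uniform([187, 269, 3])

# Zero-mean Gaussian noise with the same shape as the image.
noise = tf.random.normal(shape=tf.shape(image), mean=0.0, stddev=0.05)

# Add the noise and clip back into the valid pixel range.
noisy = tf.clip_by_value(image + noise, 0.0, 1.0)
```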
Color augmentations
Color augmentations are applicable to almost every image learning task. In TensorFlow there are three color augmentations readily available: hue, saturation and contrast. These functions only require a range and will result in a unique augmentation for each image.
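The three color augmentations could be sketched as follows, assuming TensorFlow 2.x and an RGB float image. The ranges are illustrative values, not recommendations from the original post.

```python
import tensorflow as tf

# Stand-in for the corgi image: an RGB float image in [0, 1).
image = tf.random.uniform([187, 269, 3])

# Shift the hue by a random offset in [-0.1, 0.1].
hue_shifted = tf.image.random_hue(image, max_delta=0.1)

# Scale saturation by a random factor in [0.6, 1.4).
saturated = tf.image.random_saturation(image, lower=0.6, upper=1.4)

# Scale contrast by a random factor in [0.7, 1.3).
contrasted = tf.image.random_contrast(image, lower=0.7, upper=1.3)
```

Each call draws its own random factor, so repeated calls on the same image give different results.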
Other Methods
Apart from the methods above, similar methods can be used for image augmentation, as listed below: