In the previous blog, How Computer Vision Can Learn Everything That We See, we went over what type of data an image is and how we can interpret it. We also talked about how to label images and how those labels are used to train a model. But what if there is little to no data available? Is there a way to generate new data from existing images? And when do we have enough? Thankfully, we can tackle this problem with data augmentation, a technique that alters existing images to create new training data. Examples include scaling, rotating, and stretching, along with many more advanced techniques. In this blog we will go over a few of these techniques and explain how they work and when to use them.
How does data augmentation work?
When we train a first model, there is often not enough data available. The cameras may have only just been installed, or the data may not contain enough variation (e.g. only daytime images and no night images). To handle this limitation, we generate extra data so that we have enough training data to create a model that is robust and accurate in the varying environments that occur in practice.
Now let’s talk about numbers! If we use 7 techniques (explained below) and apply each technique just once, we can generate 7 new images for each existing image. However, we can also apply each technique several times with different settings. For example, when using rotation we can turn the image 90 degrees clockwise, but also 45 degrees, or 180, and so on. As you can see, we can generate a lot of new images from a single image. The question is: how many images do we need to generate from each image? And when have we generated too many from the original? This all depends on the application, because not all data augmentation techniques work for all applications. For example, we may not want to rotate the image if we are confident that the object is always oriented in the same way.
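To make the counting concrete, here is a minimal sketch in plain Python. It assumes a tiny grayscale image stored as a list of rows, and the function names and settings (quarter-turn rotations, three brightness factors) are chosen purely for illustration. In a real project you would more likely use an image library, but the idea of combining techniques and settings is the same.

```python
from itertools import product

def rotate_90_clockwise(img):
    """Rotate the image 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(row) for row in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Scale every pixel by `factor` and clamp the result to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def augment(img, quarter_turns=(0, 1, 2, 3), brightness_factors=(0.5, 1.0, 1.5)):
    """Return one new image per (rotation, brightness) combination.

    The combination that leaves the image unchanged is skipped,
    so only genuinely new images are returned.
    """
    variants = []
    for turns, factor in product(quarter_turns, brightness_factors):
        if turns == 0 and factor == 1.0:
            continue  # this would just be the original image
        out = img
        for _ in range(turns):
            out = rotate_90_clockwise(out)
        variants.append(adjust_brightness(out, factor))
    return variants

# A 2x3 toy image with pixel intensities between 0 and 255.
original = [[10, 20, 30],
            [40, 50, 60]]

variants = augment(original)
# 4 rotations x 3 brightness levels = 12 combinations, minus the unchanged one.
print(len(variants))  # 11
```

Just two techniques with a handful of settings already give 11 new images from one original. Dropping rotation for a use case where the object is always oriented the same way is as simple as passing `quarter_turns=(0,)`.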
Is data augmentation limited to computer vision?
The answer is no. Data augmentation can also be used to generate new data in other machine learning applications. At Mediaan we always use data augmentation when (re)training a model. In the case of computer vision, the more the better: the more varied images we can feed the model, the better it will be trained and the better it will handle new, unseen situations. Between the initial training, where we use limited data augmentation, and the final model, where we have optimized the data augmentation for the use case, we usually see a significant increase in accuracy.
In a concrete case with farmer Piet, we used rescaling, rotation, color, and exposure (brightness) augmentations. Using these techniques, we went from 200 images to 2000 images. Without data augmentation, the accuracy was only 59%. However, when we used data augmentation, the accuracy went up to 83%! An increase of 24 percentage points for the same number of original images! Great, right?
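The numbers from this case can be worked out in a few lines of Python. This is only arithmetic on the figures quoted above, under the assumption that the 2000 images include the 200 originals. The variable names are ours.

```python
original_images = 200
total_images = 2000       # assumed to include the 200 originals
accuracy_before = 0.59    # without data augmentation
accuracy_after = 0.83     # with data augmentation

# How much larger the training set became, and how many new images per original.
growth_factor = total_images / original_images
new_per_original = (total_images - original_images) / original_images

# Absolute gain in percentage points versus gain relative to the starting accuracy.
point_gain = (accuracy_after - accuracy_before) * 100
relative_gain = (accuracy_after - accuracy_before) / accuracy_before * 100

print(f"Training set grew {growth_factor:.0f}x ({new_per_original:.0f} new images per original)")
print(f"Accuracy gain: {point_gain:.0f} percentage points ({relative_gain:.1f}% relative)")
```

So the accuracy rose by 24 percentage points, which is roughly a 41% improvement relative to the original 59%.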
When it comes to applying data augmentation, there is no one-size-fits-all. Which techniques can be used depends heavily on the type of application and camera setup. At Mediaan, we always start with a base set of data augmentations that we know will work. We optimize the generation of new training images through experiments to find the best solution for the given use case.
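One simple way to structure such experiments is a greedy search: start from the base set, try adding one candidate augmentation at a time, and keep it only if the validation score improves. The sketch below illustrates that idea and is not a description of Mediaan's actual procedure. The `evaluate` callback stands in for "train a model with these augmentations and measure validation accuracy", and `toy_evaluate` with its numbers is an arbitrary placeholder, not measured results.

```python
def select_augmentations(base, candidates, evaluate):
    """Greedily extend `base` with candidates that improve the score.

    `evaluate` takes a frozenset of augmentation names and returns a
    validation score (higher is better).
    """
    chosen = set(base)
    best_score = evaluate(frozenset(chosen))
    for name in candidates:
        trial = frozenset(chosen | {name})
        score = evaluate(trial)
        if score > best_score:  # keep the candidate only if it helps
            chosen, best_score = set(trial), score
    return chosen, best_score

def toy_evaluate(augmentations):
    """Hypothetical stand-in for a real train-and-validate run.

    The per-technique effects below are made-up numbers for illustration.
    """
    effect = {"rescale": 0.05, "rotate": 0.04, "brightness": 0.06, "vertical_flip": -0.03}
    return round(0.60 + sum(effect.get(a, 0.0) for a in augmentations), 2)

chosen, score = select_augmentations(
    base=["rescale"],
    candidates=["rotate", "brightness", "vertical_flip"],
    evaluate=toy_evaluate,
)
print(sorted(chosen), score)
```

In this toy run the vertical flip is rejected because it lowers the score, which mirrors the earlier point that not every technique suits every application. A greedy search is cheap but can miss combinations that only help together, so treat it as a starting point.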
Ready to give computer vision a shot and get the most out of your security camera? We are always ready to help and find a solution YOU want! Still not completely convinced? Take a look at our 4 Easy & Simple Steps To Start Using Computer Vision blog and find out how we can help you gain extra insights for your business by using computer vision.
This blog is written by Niels Munters – Data Scientist at Mediaan.