Background Blurring with Semantic Image Segmentation using DeepLabv3+

A few days ago Microsoft released a version of Teams that uses AI to blur your background during video calls.

Here’s a video on YouTube

This gave me the idea to try building it myself. A few months ago Google open sourced DeepLab, its state-of-the-art research model for semantic image segmentation.

From the DeepLab GitHub repository:

DeepLab: Deep Labelling for Semantic Image Segmentation

DeepLab is a state-of-art deep learning model for semantic image segmentation, where the goal is to assign semantic labels (e.g., person, dog, cat and so on) to every pixel in the input image.

So I used a Keras implementation of DeepLabv3+ to blur my background when I use my webcam

In this post I’ll show you how to build this

image

After installing DeepLabv3+, let's import it along with the required libraries.

image

Then we create an instance of the DeepLabv3+ model, initialize our webcam stream, create an OpenCV window where the result will be displayed, and set the amount of background blur we want.

image

In the above code we start a loop and read every frame streaming from our webcam. We then resize the frame to be 512px wide.
In the following lines padding is added to each frame to bring it to the 512×512 pixels required by the DeepLabv3+ model. Note that this size can be changed when initializing the model by specifying your own input size.
Another dimension (the batch dimension) is added to the input frame using NumPy's expand_dims before calling the model's predict function.

After removing the single-dimensional entries from the output, we create a labels array containing, for every pixel, the index of the class with the maximum value.
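To make the padding and label steps concrete, here is a minimal NumPy-only sketch. It is not the code from the repository: the model call is replaced by random scores, the 21-class output is an assumption (the PASCAL VOC label set), and names such as pad_to_square and labels_from_output are made up for illustration.

```python
import numpy as np

INPUT_SIZE = 512  # side length the model expects, as described in the post


def pad_to_square(frame, size=INPUT_SIZE):
    """Zero-pad an H x W x 3 frame (both sides <= size) to size x size."""
    h, w = frame.shape[:2]
    return np.pad(frame, ((0, size - h), (0, size - w), (0, 0)), mode="constant")


def labels_from_output(output):
    """Turn a (1, H, W, n_classes) score array into an (H, W) label map."""
    return np.argmax(output.squeeze(), axis=-1)


# A 288 x 512 frame, e.g. a 16:9 webcam frame already resized to 512 px wide.
frame = np.ones((288, 512, 3), dtype=np.float32)
padded = pad_to_square(frame)
batch = np.expand_dims(padded, axis=0)  # shape (1, 512, 512, 3)

# Stand-in for model.predict(batch): random scores for 21 assumed classes.
fake_output = np.random.rand(1, INPUT_SIZE, INPUT_SIZE, 21)
labels = labels_from_output(fake_output)  # shape (512, 512)
```

With a real model, fake_output would be replaced by the result of calling predict on batch.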

image

We then clean up the labels array by removing the padding from the bottom of the image. We create a mask over all the black (background) pixels in the labels array. Then we resize the input frame from our webcam to match the size of the labels so that the mask we just created fits it.
Then we blur this resized frame. In the original frame, I replace the value of every masked pixel with the value from the blurred frame and show the result. Pressing ‘q’ exits the while loop, after which we release the video stream and clean up the OpenCV objects.
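The masking step can be sketched with NumPy alone. This is an illustration under assumptions rather than the repository code: the naive box_blur below stands in for an OpenCV blur call, label 0 is assumed to be the background class, and label 15 is assumed to be "person" (as in PASCAL VOC).

```python
import numpy as np


def box_blur(image, k=5):
    """Naive k x k box blur (k odd); a stand-in for an OpenCV blur call."""
    pad = k // 2
    padded = np.pad(image.astype(np.float32),
                    ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    out = np.zeros(image.shape, dtype=np.float32)
    # Sum the k * k shifted copies of the image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return (out / (k * k)).astype(image.dtype)


def blur_background(frame, labels, k=5):
    """Replace pixels labelled 0 (background) with their blurred version."""
    background = labels == 0
    blurred = box_blur(frame, k)
    result = frame.copy()
    result[background] = blurred[background]
    return result


rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
labels = np.zeros((8, 8), dtype=int)
labels[2:6, 2:6] = 15  # pretend a person occupies the centre of the frame
result = blur_background(frame, labels)
```

Only the pixels where labels is 0 change; the foreground pixels are copied through untouched.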

You can grab the code from my GitHub, and if you like it don’t forget to give it a star 🙂

Review of Deep Learning for Computer Vision with Python – Practitioner Bundle

Deep Learning for Computer Vision - Practitioner Bundle Review

Some of you know that I have been reading Adrian Rosebrock’s book Deep Learning for Computer Vision with Python (“DL4CV”). I reviewed the Starter Bundle a few months back. I recently finished the Practitioner Bundle, so here’s my review of it.

The Practitioner Bundle picks up where the Starter Bundle left off. The Starter Bundle gives you the necessary introduction to computer vision and image processing and is geared towards beginners who are just entering the field of deep learning for computer vision, whereas the Practitioner Bundle is suited to more real-life use cases. In this book, Adrian walks you through practical tips and tricks for writing your own convolutional neural networks, training and evaluating them on real datasets, and so on.

The first topic covered is data augmentation of your training data, a recommended pre-processing step that helps your neural network generalize better at inference time. Adrian shows you all the steps needed to apply data augmentation before training any network.

One of the core techniques for improving your neural network and getting accurate predictions is transfer learning: you take an existing neural network trained on a similar dataset and modify it to suit your own training data. In the book Adrian shows you how to do that, along with fine-tuning a neural network.

Adrian also explains what rank-1 and rank-5 accuracy are and when to use each, how to use an ensemble of neural networks to get even higher accuracy, and what the different optimization methods such as Adagrad, RMSProp and Adam are, with examples of when to use each of them.
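For readers who have not met rank-k accuracy before, here is a small sketch of the idea: a prediction counts as correct if the true label is among the k highest-scoring classes. This is my own illustration, not code from the book, and the probabilities are invented.

```python
import numpy as np


def rank_k_accuracy(probs, labels, k):
    """Fraction of rows whose true label is among the k highest scores."""
    top_k = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    hits = [label in row for row, label in zip(top_k, labels)]
    return float(np.mean(hits))


probs = np.array([
    [0.6, 0.3, 0.1],  # true class 0: the top prediction is right
    [0.2, 0.5, 0.3],  # true class 2: only found in the top 2
    [0.1, 0.2, 0.7],  # true class 0: not in the top 2
])
labels = np.array([0, 2, 0])

rank1 = rank_k_accuracy(probs, labels, k=1)  # 1 of 3 correct
rank2 = rank_k_accuracy(probs, labels, k=2)  # 2 of 3 correct
```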

Like the Starter Bundle, the Practitioner Bundle is structured so that every chapter builds on the previous one. After the chapters on image recognition and classification, Adrian takes the natural next step, object detection, along with more recent research in convolutional neural networks that resulted in neural style transfer, image super-resolution and Generative Adversarial Networks (GANs).

As explained in the Code section of my Starter Bundle blog post, the code only gets better in this bundle: every chapter builds on the code from previous chapters as well as its own, which makes it easier to understand how all of the concepts work together. The code is structured so that it is easy to read, with descriptive comments, and it is even ready to be used in your own projects without modifications.

Personally, this book has helped me better understand architectures like ResNet and GoogLeNet and how to implement them, among other things. Adrian has done a really good job of explaining every detail of these architectures, including their implementation in Keras. This helped me explain the ResNet architecture to professionals recently at a computer vision meetup that I organized in Karachi.

In the beginning I had to read several blog posts to learn each concept, for example data augmentation, transfer learning or working with large datasets, but I still couldn’t piece them together, not to mention that the code examples varied in quality. DL4CV and its accompanying code have all of that information and more in one place, and they have given me a solid understanding that I have used in my own work.

I moderate the largest Deep Learning group on Facebook and see a lot of people who are new to the field of computer vision struggle with some of these concepts, so I often recommend this book to them.

I totally enjoyed reading this book. It’s one you’ll either keep on your desk or just a click away in your favorite PDF reader when you’re implementing or training your own networks.

Happy reading!

First Computer Vision community meetup in Karachi and next one on ResNet

I helped organize our first community meetup on Computer Vision and its applications in Karachi. Two talks were presented by our members.

Ali Shoaib presented and discussed the paper “Dermatologist-level classification of skin cancer with deep neural networks”, and Umair Arif presented the basics of image processing and its importance in computer vision. He also talked about some of the research his Computer Vision Research lab is currently doing on activity recognition, automatic number plate recognition and more.

It was great to get to know people from our Computer Vision community and learn about their work.

In the next meetup I will discuss a state-of-the-art computer vision architecture, ResNet. I will also show its implementation in Keras, how to train ResNet on the CIFAR-10 and Tiny ImageNet datasets, and how to get maximum accuracy using techniques like babysitting the model during training and learning rate decay to train for the maximum number of epochs. I’m excited about that.

If you would like to join our next meetup, sign up now.

Building a Dual-GPU supported Deep Learning Rig for $1500

After some thought and going from cloud to cloud I decided to build my own deep learning rig

Picking the GPU, mobo and processor

The price of a 1080 Ti is so high at the moment that I decided to settle for an AORUS 1060 Rev 2 GPU with 6GB of memory. I was initially going for the 3GB variant of this GPU, but after reading this post I decided not to. Note that there were two other variants of the 1060 6GB GPU with slightly lower clock speeds, so I decided to get the higher clock speed model.

Dual GPU support was crucial to me for when the GTX 1080 Ti becomes reasonably priced again or Nvidia releases newer GPUs, so I decided to get a motherboard and processor within my budget that support at least two GPUs. I went with the latest Intel Z370 chipset, choosing the Asus ROG Strix Z370-H and an 8th Generation Core i5-8400 processor to go with it. This processor has a maximum of 16 PCIe lanes, because Intel doesn’t support more PCIe lanes even in 8th Gen, which means dual GPUs would run in x8/x8 mode instead of dual x16 mode. According to some, that isn’t a big deal.

My mobo supports SLI for dual GPUs if I ever wish to use them together, but I prefer to run multiple experiments and train models on each GPU separately. I spent a lot of time on PCPartPicker checking compatibility and with my local vendors checking parts availability.

Storage, RAM, OS…and everything else

For the boot OS and storage respectively, I chose an Intel M.2 240GB SSD and a Seagate BarraCuda 1TB hard drive for those large datasets; I have a NAS with 4TB of storage for everything else. I chose to use Windows for my work because it also offers great support for deep learning with a GPU. Since my mobo supports higher-speed memory, I got a Corsair LPX 1×16GB @ 3000MHz. I also chose a Corsair 850W Gold-rated power supply unit, which is more than enough power to support dual GPUs.

I have connected this rig with an HDMI cable to a 27” screen and extended my laptop’s screen to it with a VGA cable. It’s easy to switch between the two with the touch of a button on the screen, but I’m thinking of adding another monitor. All of this is put together in a nice Antec GX 330 case.

For backup power I installed a 1 KVA Easy Tech UPS that I already had, to support my new rig in the event of a power failure.

Here’s my full configuration https://pcpartpicker.com/list/MyfWD2

CPU: Intel – Core i5-8400 2.8GHz 6-Core Processor
Motherboard: Asus – ROG Strix Z370-H Gaming ATX LGA1151 Motherboard
Memory: Corsair – Vengeance LPX 16GB (1 x 16GB) DDR4-3000 Memory
Storage: Intel – 540s 240GB M.2-2280 Solid State Drive
Storage: Seagate – BarraCuda 1TB 3.5″ 7200RPM Internal Hard Drive
Video Card: Gigabyte – GeForce GTX 1060 6GB 6GB AORUS Video Card
Power Supply: Corsair – RMx 850W 80+ Gold Certified Fully-Modular ATX Power Supply

Benchmarks

I trained a variant of GoogLeNet called DeeperGoogLeNet from my friend Adrian’s awesome book. Five epochs take 20 minutes on my 1060 GPU, which is comparable to the results from Adrian’s mighty Titan X GPU, so I’m quite happy with the results.

Finally here are some shots of my unit and my not-so-optimal setup 🙂

Review of Deep Learning for Computer Vision with Python – Starter Bundle

Deep Learning for Computer Vision with Python book

Last month I started reading Adrian Rosebrock’s latest book, Deep Learning for Computer Vision with Python. The book is divided into three bundles: Starter, Practitioner and ImageNet.

Each bundle is targeted at a different audience. The Starter Bundle is for those who are familiar with Python and machine learning and are looking to get started with deep learning for computer vision. Data scientists looking to apply image recognition to their own problems can go for the Practitioner Bundle, and researchers would be more interested in the ImageNet Bundle.

Starter Bundle

I completed reading Starter Bundle recently so I decided to share my review in this post.

As opposed to some other books, which assume prior knowledge of the basics of deep learning with convolutional neural networks and image processing, Adrian starts from the early days and history of deep learning, explaining why it didn’t work then and why it works now. He then goes on to show the fundamentals of image processing and how images are constructed. This gives a solid foundation for the rest of the book, especially for newcomers to image processing and computer vision.

Dividing the book into three bundles allows him to expand on every important detail. To give you an idea, the Starter Bundle alone consists of 23 chapters, going from the basics to case studies where learners can apply their knowledge with practical code examples. Some other books, trying to cover everything while keeping the page count low, generally avoid going into those details in depth.

Prior to reading this book I had learned a lot of this from various courses and blog posts, so seeing those topics covered in such detail in one place, with each chapter building on the previous one, was very refreshing.

Adrian spends a good amount of time implementing a neural network from scratch and guides you along the way. This is different from other sources, which jump straight into TensorFlow or Keras without building the reader’s intuition. In my opinion, if you are new to deep learning, implementing a neural network yourself is the key to understanding the inner workings before diving into frameworks that hide some of the low-level details from you.

Code

This book comes with code for each chapter that is not only detailed and easy to understand but, if you are an experienced developer, also ready to be used in your own implementations. The code is covered with very descriptive comments that help you understand what’s going on in every block, which is very helpful when using it.

Adrian also highlights some subtle details, such as the Keras configuration file, that in my opinion most books would just skip over to go straight to coding. I find this helpful: if it wasn’t for this book I would not have looked at this file, what its values are and how to change them when required, for a long time. For example, Theano uses channels-first ordering whereas TensorFlow uses channels-last ordering when processing images, so depending on which underlying framework you use for Keras you need to change this setting in the config file. All the code examples in the book are also easy to run, even in the cloud, where I have tested them myself.
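To show what the two orderings mean in practice, here is a small NumPy sketch. The image size is arbitrary and the snippet does not touch Keras itself. In the Keras config file (keras.json), the relevant key is image_data_format, with the value channels_last or channels_first.

```python
import numpy as np

# One 32 x 32 RGB image in channels-last layout: (height, width, channels).
image_channels_last = np.zeros((32, 32, 3))

# The same image in channels-first layout: (channels, height, width).
image_channels_first = np.transpose(image_channels_last, (2, 0, 1))

# Converting back restores the original layout.
round_trip = np.transpose(image_channels_first, (1, 2, 0))
```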

I have definitely learned new techniques such as how to schedule your learning rate, techniques to spot underfitting and overfitting by babysitting your deep learning models and so on.

As I mentioned earlier, the Starter Bundle ends by showing you practical case studies, including obtaining and labelling datasets from scratch, training your models and making predictions on live camera or video.

Summary

I totally enjoyed reading this book and I can’t wait to start reading the Practitioner Bundle and even apply some of what I learned in my own work. If you’re new to deep learning and looking to get started, I recommend that you read this book.

Happy learning!

Training Deep Learning models on Google Cloud Platform

Recently I have been training my deep learning models on Google Cloud Platform (GCP). GCP makes it easier to use cloud infrastructure to train your deep learning models.

However, some steps need to be taken before your code can be trained in the cloud. For example, it requires additional files like __init__.py and setup.py with the necessary instructions.

In this GitHub repository I have included the details necessary for executing and training your code on Google Cloud, given that you have made the required changes in the Google Cloud console.

Access the Github repo here https://github.com/zubairahmed-ai/Training-Deep-Learning-Models-on-Google-Cloud

Multi-arm bandit algorithm vs classical A/B Testing

How do you know which ad to serve to a customer and get more conversions, or which version of the website to show?

A multi-armed bandit (“MAB”) works by assigning weights to multiple experiments (“arms”) using an algorithm known as epsilon-greedy, and uses an explore vs. exploit strategy to choose which arm to show.
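Here is a minimal sketch of epsilon-greedy, to make explore vs. exploit concrete. The conversion rates, the epsilon of 0.1 and the function name are all made up for illustration; real systems differ in how they estimate and update the weights.

```python
import random


def epsilon_greedy(true_rates, rounds=10000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy over arms with the given conversion rates."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms  # how many times each arm was shown
    wins = [0] * n_arms   # how many conversions each arm produced
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best observed conversion rate.
            estimates = [w / p if p else 0.0 for w, p in zip(wins, pulls)]
            arm = max(range(n_arms), key=lambda i: estimates[i])
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return pulls, wins


# Three hypothetical ads with 2%, 5% and 10% conversion rates.
pulls, wins = epsilon_greedy([0.02, 0.05, 0.10])
```

The pulls list shows how traffic ended up being split between the arms, and dividing wins by pulls gives the observed conversion rate of each arm.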

In ‘classical’ A/B testing you conclude that your experiment B is significantly better if the confidence is more than 95%.
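One common way to put a number on that confidence is a pooled two-proportion z-test. The sketch below assumes that this is what "confidence" means here, which may not match the testing tool you use, and the visitor and conversion counts are invented.

```python
import math


def ab_confidence(conv_a, n_a, conv_b, n_b):
    """One-sided confidence that B converts better than A (pooled z-test)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF evaluated at z.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical test: A converts 100 of 1000 visitors, B converts 150 of 1000.
confidence = ab_confidence(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
b_wins = confidence > 0.95
```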

MAB is especially useful when you have more than two experiments to run to see which gives better conversions, as in our case. This is where it truly shines, since classical A/B testing lacks support for this type of testing.

Here’s an article by Google that describes different tests and their results https://support.google.com/analytics/answer/2844870?hl=en

This article challenges the argument that ‘MAB is better than A/B testing’ with tests that compare the two and get similar results https://vwo.com/blog/multi-armed-bandit-algorithm

Watch this quick intro to learn about the multi-armed bandit https://www.youtube.com/watch?v=qAvY2tkMHHA

Next, read about the ‘contextual bandit’, a strategy that Netflix uses to show personalized artwork for their shows to get maximum views.

Thoughts on Course 1 of Prof. Andrew Ng’s DeepLearning.ai specialization

I recently passed the first course of the DeepLearning.ai specialization on Coursera, and I decided to share some thoughts as someone new to Deep Learning.

Courses

 

Course 1 is structured very nicely. Week 1 starts from the very basics of neural networks, though as the course prerequisites say, a learner needs some experience with machine learning, ideally using Python, to grasp the concepts presented in the lectures.

The following week explains the basics of binary classification and logistic regression, including the cost function, gradient descent and derivatives, as well as the basics of vectorization in Python, showing with examples why vectorization is so important in Deep Learning.
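As a reminder of what vectorization buys you, here is a small sketch of a logistic regression forward pass written twice, once with explicit loops and once vectorized. It is my own illustration rather than course material; the shapes follow the convention of one column per training example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict_loop(w, b, X):
    """X has shape (n_features, m): one column per training example."""
    n, m = X.shape
    out = np.zeros(m)
    for i in range(m):
        z = b
        for j in range(n):
            z += w[j] * X[j, i]
        out[i] = sigmoid(z)
    return out


def predict_vectorized(w, b, X):
    """Same computation as predict_loop, as one matrix product."""
    return sigmoid(w @ X + b)


rng = np.random.default_rng(0)
w = rng.normal(size=4)
X = rng.normal(size=(4, 100))
b = 0.5
same = np.allclose(predict_loop(w, b, X), predict_vectorized(w, b, X))
```

Both functions return the same predictions; the vectorized one hands the loops to NumPy, which is where the speed-up on large datasets comes from.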

I feel that there cannot be a simpler way to show this concept than this. Each lecture video in the course builds on the lecture before it which makes it easier to digest information in chunks or go over a particular concept multiple times to fully understand it.

In week 3 you implement a shallow neural network with the knowledge you have gained from previous lectures. You learn that a shallow neural network is a neural network with one hidden layer. In this week you work with activation functions, vectorization, cost computation, gradient descent and more.

Finally, in week 4 you learn to implement a fully connected deep neural network, including forward and backward propagation. This week also explains the difference between parameters and hyperparameters and why they are important, a topic that is covered in Course 2.

Assignments

The quizzes and assignments are nicely prepared to help you gauge your own understanding. They cover all the important concepts delivered throughout the lectures, and I recommend spending some time on them to fully understand the lectures before jumping to the following week.

I would highly recommend taking this course to anyone with experience in Machine Learning looking to start with Deep Learning.

How to approach an ML problem, a beginner’s perspective

How do you approach a Machine Learning problem? In my limited experience, after participating in Kaggle competitions and learning from other people’s kernels, below is a set of steps that I have come to follow, though not necessarily in this order.

Imputing null values

Removing instances with null values is not a good approach. A lot of people impute null values with the mean, or in some cases the median when there are too many outliers and the mean is not a good representation. Fortunately scikit-learn makes it really easy to impute null values, so this is probably the first step.
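As a sketch of what median imputation does, here is a NumPy-only version. In scikit-learn the same thing is done by SimpleImputer with strategy="median"; the helper below and its toy data are just for illustration.

```python
import numpy as np


def impute_median(X):
    """Replace NaNs in each column with that column's median."""
    X = np.asarray(X, dtype=float)
    medians = np.nanmedian(X, axis=0)  # per-column median, ignoring NaNs
    filled = X.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = medians[cols]
    return filled


X = np.array([
    [1.0, 10.0],
    [np.nan, 20.0],
    [3.0, np.nan],
    [100.0, 30.0],  # 100 is an outlier, which is why the median is used here
])
filled = impute_median(X)
```

The missing value in the first column becomes 3 (the median of 1, 3 and 100) rather than the mean of about 34.7, which the outlier would have pulled upwards.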

Before imputing null values it is important to understand why these values were missing in the first place. Was it the result of human error? Or is it an industrial system where periodic missing values are common? Understanding this will help you replace missing values with appropriate ones.

But if you really need to keep the null values, then use a decision tree algorithm.

Removing Multicollinearity

When you’re dealing with many features, it is important to identify inter-correlated features: they don’t necessarily add value to your model and can negatively affect it. One way that I have found to check for them in Python is the VIF, or Variance Inflation Factor. Setting a moderate threshold for VIF is important for getting rid of unwanted features in your dataset.
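For the curious, here is a sketch of how VIF can be computed with NumPy alone: regress each feature on all the others and take 1 / (1 - R²). The statsmodels library ships a variance_inflation_factor function for real use; the helper and the synthetic data below are only an illustration.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of every column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scores = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Least-squares fit of column j on the other columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residuals = y - A @ coef
        r2 = 1.0 - residuals.var() / y.var()
        scores[j] = 1.0 / (1.0 - r2)
    return scores


rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)  # almost a copy of a
scores = vif(np.column_stack([a, b, c]))
```

Here a and c get a large VIF because each can be predicted from the other, while b stays close to 1. A common rule of thumb is to look closely at features with a VIF above 5 or 10.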

Outliers and normal distribution

Another very important step is detecting outliers in your dataset and visualizing them using box plots. I have seen approaches where people set upper and lower limits using np.percentile(array, 99) and np.percentile(array, 5) (np.percentile() expects percentiles on a 0 to 100 scale), and set everything above or below this range to the corresponding limit returned by np.percentile().
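The clipping approach described above can be sketched like this. The 5th and 99th percentiles are just the values mentioned in the text, and the helper name and data are made up.

```python
import numpy as np


def clip_outliers(values, lower_pct=5, upper_pct=99):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    low, high = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)


# 0..99 plus one extreme outlier.
values = np.concatenate([np.arange(100.0), [10000.0]])
clipped = clip_outliers(values)
```

After clipping, the outlier of 10000 is pulled down to the 99th percentile and the smallest values are raised to the 5th percentile, while everything in between is left unchanged.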

Also, if the data is right or left skewed it can hurt your model’s performance, so it is important to fix the skewness, for example by taking the log to get closer to a normal distribution. But if you have zero values in your data, a plain log transformation cannot work because log(0) is undefined.
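A common workaround for zeros, which I am adding here as a suggestion rather than something from the approaches above, is to take log(1 + x) instead. NumPy provides this as np.log1p, with np.expm1 as its inverse.

```python
import numpy as np

skewed = np.array([0.0, 1.0, 10.0, 100.0, 1000.0])

# log(1 + x) is defined at zero, unlike log(x).
transformed = np.log1p(skewed)

# expm1 computes exp(x) - 1, which undoes the transformation.
restored = np.expm1(transformed)
```

This only helps with non-negative data; values at or below -1 still cannot be transformed this way.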

Label encoding for categorical variables

If the data contains categorical variables, such as the model of a car (Honda, Toyota, Nissan), it is imperative to convert them to numbers using LabelEncoder, one-hot encoding or get_dummies in pandas where appropriate.
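To show what the two encodings produce, here is a NumPy-only sketch. In practice you would reach for LabelEncoder, OneHotEncoder or get_dummies as mentioned above; this just makes the output visible.

```python
import numpy as np

models = np.array(["Honda", "Toyota", "Nissan", "Honda"])

# Label encoding: each category becomes an integer code.
# np.unique sorts the categories, so Honda=0, Nissan=1, Toyota=2.
categories, codes = np.unique(models, return_inverse=True)

# One-hot encoding: one column per category, a single 1 in every row.
one_hot = np.eye(len(categories), dtype=int)[codes]
```

Label encoding implies an order between categories (Toyota > Nissan > Honda here), which some models will pick up on; one-hot encoding avoids that at the cost of more columns.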

Correlation check with output variable or finding the most important features

It is important to check the correlation of each feature with your target variable and recognize your most important features. Scikit-learn makes this really easy, and you can also generate a heat map or scatter plot to visualize any correlation and choose your important features.

We can also use Random Forest or XGBoost, which give a ranking of feature importance for your dataset, and start by discarding the least important features to get a better model. We must also use cross-validation to test our model on held-out training data.

Feature Engineering

One important bit that is true of any winning Kaggle entry is building your intuition for the data and engineering features. This cannot be emphasized enough, and it really takes creativity and experience to bring new features into your dataset that will make your model more robust.

After the above, I have seen users build their first model, probably using XGBClassifier or XGBRegressor, but not before reducing dimensions with PCA or TruncatedSVD and using FeatureUnion to stack features horizontally if faced with many dimensions. More on this in a later blog post.
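As a simple illustration of the correlation check, the sketch below computes the Pearson correlation of every feature with the target using NumPy and ranks the features by absolute correlation. The data is synthetic and the helper name is made up.

```python
import numpy as np


def correlation_with_target(X, y):
    """Pearson correlation of each column of X with y."""
    return np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])


rng = np.random.default_rng(0)
signal = rng.normal(size=500)  # a feature that drives the target
noise = rng.normal(size=500)   # a feature unrelated to the target
X = np.column_stack([signal, noise])
y = 3.0 * signal + rng.normal(scale=0.5, size=500)

corr = correlation_with_target(X, y)
ranking = np.argsort(-np.abs(corr))  # most correlated feature first
```

Keep in mind that Pearson correlation only captures linear relationships, so a feature with low correlation can still be useful to a tree-based model.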

So questions for you dear reader

What do you think of the above, do you find it helpful? What practices do you follow?

Let me know in the comments

Thank you Salahuddin and Irtaza Shah for reviewing this post and sharing your feedback

Q&A: Given the barrage of information around us..How do you all handle and propose to handle information overload given the above?

A member of the Awesome Artificial Intelligence & Deep Learning Facebook group posted this question to which Arthur Chan replied, I am sharing this Q&A here so that we can refer to it later

Question: [summarized] How do you keep up with the barrage of information in Machine Learning and Deep Learning?

Answer: [Arthur Chan] I would go with one basic tutorial first – depends on my need, keep on absorbing new material. e.g. I first start from Ng’s class, since I need to learn more about DL on both CV and NLP, I listen to Karpathy’s and Socher’s. Sooner/later you would feel those classes are not as in-depth, that’s when I took the Hinton’s class. For interest, I audit Silver’s class as well.
But narrow it down, one thing at a time. Choose quality material first rather than following sensational hyped news. As you learn more, your judgement would improve on a topic. Then you can start to come up with a list of resources you want to go through.

For a detailed discussion and answers from other members refer to the original post

I highly recommend you join this Facebook group