Developing a deep learning model can be easier than developing conventional software of the same kind. To build a model, we need to figure out three components: data, architecture, and loss function. You can develop complex applications using these three components. During development you will mainly face four types of problems, which play the role that bugs play in software development.


A deep learning or machine learning model aims to generalize well beyond the data-set it was trained on. Let me explain the term generalization through an example. Say you want to develop a model that distinguishes between dogs and cats: whenever you show it an image of a dog or a cat, the model should recognize which one it is. To develop the model we have a data-set of dog and cat images, and we start training. During training we find that the model achieves an accuracy of 98%. But when we evaluate the model on a validation data-set, it achieves only 60% accuracy. In practice, then, our model is only 60% accurate. So the question arises: why did the training accuracy reach 98%?
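To make the training versus validation comparison concrete, here is a minimal sketch in plain Python of a hold-out split and an accuracy measure. The helper names `split_dataset` and `accuracy`, the 80/20 split, and the toy data are assumptions made for this illustration; they do not come from any particular library.

```python
import random

def split_dataset(samples, val_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (train, validation) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical data: 100 (image_id, label) pairs.
data = [(i, "dog" if i % 2 == 0 else "cat") for i in range(100)]
train_set, val_set = split_dataset(data)
```

The model is trained only on `train_set`; computing `accuracy` on `val_set` then tells you how it behaves on images it has never seen.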

The answer is that our model is over-fitted to the training data-set: we trained it for too long, so it is over-trained. The model becomes good at recognizing only the training data, and only images that appear in the training data-set are recognized. After getting this poor validation score, you train the model again with different hyper-parameters and reach a training accuracy of around 95%. You check the model on the validation data-set again, and now it achieves 92% accuracy, which is quite good. This model is said to generalize well.


A model that learns more patterns than are required to understand or recognize an image is said to over-fit. A model that learns only the patterns that are necessary for recognition or understanding is said to generalize.


If, for fear of over-fitting, you take precautions too early and train the model with hyper-parameters that are not sufficient for it to learn, the result is called under-fitting.
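The three situations described so far can be told apart from the training and validation accuracy alone. The following is a rough sketch, not a definitive rule: the function name `diagnose_fit` and both thresholds (`good_enough` and `max_gap`) are assumptions chosen for illustration, and sensible values depend on your task.

```python
def diagnose_fit(train_acc, val_acc, good_enough=0.85, max_gap=0.10):
    """Classify a training run from its train and validation accuracy.

    Both thresholds are illustrative assumptions, not universal rules.
    """
    if train_acc < good_enough:
        # The model has not even learned the training data.
        return "under-fit"
    if train_acc - val_acc > max_gap:
        # Good on training data, much worse on unseen data.
        return "over-fit"
    return "generalizes well"

print(diagnose_fit(0.98, 0.60))  # over-fit
print(diagnose_fit(0.95, 0.92))  # generalizes well
print(diagnose_fit(0.62, 0.60))  # under-fit
```

The first two calls use the numbers from the dog and cat example: 98% versus 60% is flagged as over-fit, while 95% versus 92% passes.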

The main problems faced while developing a deep learning model are OVER-FIT, UNDER-FIT, DATA PIPELINE ERROR and GRADIENT EXPLOSION. We will discuss the causes and quick remedies for each of these problems.


OVER-FIT:- As discussed above, a model that learns more patterns than required is over-fit. You can generally spot this situation when training accuracy is high and validation accuracy is much lower. Strategies to fight over-fit are to enlarge the training set, add dropout layers, and apply L1 or L2 regularization. Over-fit can also sometimes be caused by biased data.
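To show what dropout and L1/L2 regularization actually compute, here is a small sketch in plain Python. It uses the common "inverted dropout" formulation; the function names and the example values are assumptions for illustration, and in practice you would use the equivalent layers and options of your deep learning framework.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: during training, zero each value with probability
    `rate` and scale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)  # dropout is disabled at inference time
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def l1_penalty(weights, strength):
    """L1 regularization term: strength * sum of absolute weights."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """L2 regularization term: strength * sum of squared weights."""
    return strength * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    """Add the penalties to the loss so that large weights are discouraged."""
    return data_loss + l1_penalty(weights, l1) + l2_penalty(weights, l2)
```

Dropout forces the network not to rely on any single activation, and the penalties push weights toward small values; both make it harder for the model to memorize the training set.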


UNDER-FIT:- When a model is not able to learn even the basic patterns needed to recognize something, it is under-fit. This typically shows up as low accuracy on both the training and validation sets; when dropout or regularization is too strong, validation accuracy can even be higher than training accuracy. A quick remedy is to reduce dropout and regularization, if you have used any in your architecture. You can also try using more training data to train the model.

DATA PIPELINE ERROR:- This error is really hard to detect, and it has a high probability of occurring. When your accuracy takes an impossible value (such as a negative number) or your loss goes very high, there is often a bug in the data pipeline. Try to find it and fix it.
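One way to catch pipeline bugs early is to run a few sanity checks on each batch before it reaches the model. The sketch below is only an illustration: the function `check_batch`, its parameters, and the specific checks are assumptions, and a real pipeline will need checks that match its own data format.

```python
import math

def check_batch(features, labels, num_classes, expected_dim):
    """Return a list of problems found in one batch (empty if none)."""
    problems = []
    if len(features) != len(labels):
        problems.append("features and labels differ in length")
    for i, row in enumerate(features):
        if len(row) != expected_dim:
            problems.append(
                f"sample {i}: expected {expected_dim} values, got {len(row)}")
        if any(math.isnan(v) or math.isinf(v) for v in row):
            problems.append(f"sample {i}: contains NaN or infinity")
    for i, label in enumerate(labels):
        if not (isinstance(label, int) and 0 <= label < num_classes):
            problems.append(
                f"label {i}: {label!r} is outside 0..{num_classes - 1}")
    if len(labels) > 1 and len(set(labels)) == 1:
        # A batch with a single class often means the data was not shuffled.
        problems.append("all labels in the batch are identical")
    return problems

good = check_batch([[0.1, 0.2], [0.3, 0.4]], [0, 1], num_classes=2, expected_dim=2)
bad = check_batch([[float("nan"), 0.2], [0.3]], [0, 5], num_classes=2, expected_dim=2)
```

Here `good` is an empty list, while `bad` reports the NaN value, the wrong feature length, and the out-of-range label.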


GRADIENT EXPLOSION:- This situation arises when you feed un-normalized data to the architecture, which lets your gradients grow to very large values. You should check your gradient values periodically. If this happens, introduce batch normalization in your architecture.
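Both pieces of advice can be sketched in a few lines of plain Python. The helper names, the threshold of `1e3`, and the representation of gradients as a list of lists are assumptions for illustration. The `batch_normalize` function shows only the normalization step of batch normalization, without the learnable scale and shift that framework layers add.

```python
import math

def global_grad_norm(gradients):
    """L2 norm over all gradient values (a list of per-layer lists)."""
    return math.sqrt(sum(g * g for layer in gradients for g in layer))

def is_exploding(gradients, threshold=1e3):
    """Flag gradients whose norm is NaN or above an assumed threshold."""
    norm = global_grad_norm(gradients)
    return math.isnan(norm) or norm > threshold

def batch_normalize(batch, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance,
    using statistics computed over the batch."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n
                 for col, m in zip(columns, means)]
    return [
        [(v - m) / math.sqrt(var + eps)
         for v, m, var in zip(row, means, variances)]
        for row in batch
    ]

# Two features on very different scales end up on the same scale.
normalized = batch_normalize([[0.0, 100.0], [2.0, 300.0]])
```

Calling `is_exploding` every few training steps gives an early warning, and normalizing keeps large raw values from producing large gradients in the first place.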


CONCLUSION:- In this post we have seen the concepts of over-fit, under-fit and generalization in detail. We have also seen common problems such as over-fit, under-fit, data pipeline error and gradient explosion.


Taher Ali Badnawarwala

Taher Ali is driven to create something special. He loves swimming, family and AI from the depth of his heart. He loves to write and make videos about AI and its usage.
