BASIC PROBLEMS FACED IN DEVELOPMENT OF DEEP LEARNING MODELS: OVERFIT, UNDERFIT, GRADIENT EXPLOSION

Taher
23 Jun 2023

Developing a deep learning model is easier than developing the same kind of software by hand. To build a model you need to figure out three components: data, architecture and loss function. You can develop complex applications using these three components. During development you are likely to face four main types of problems, which play the same role as bugs in software development.

A deep learning or machine learning model aims to generalize properly around a data-set. I will explain the term generalization through an example. Let's say you want to develop a model to distinguish between dogs and cats: whenever you show it an image of a dog or a cat, the model should recognize which one it is. To develop the model we have a data-set of dog and cat images, and we start training. During training we find that the model reaches 98% accuracy. But when we validate the model on the validation data-set, it reaches only 60% accuracy. In practice, then, our model is only 60% accurate. So the question arises: why does training accuracy reach 98%?

The answer is that our model is over-fitted to the training data-set: we trained it for too long, so it is over-trained. The model becomes good at recognizing only the training data-set; only images that appear in the training data-set are recognized. After getting this poor validation score, you train the model again with different hyper-parameters and reach a training accuracy of around 95%. You check the model on the validation data-set again, and this time it achieves 92% accuracy, which is quite good. This model is said to generalize well.

A model that learns more patterns than are required to understand or recognize an image is said to be over-fitting. A model that learns only the patterns necessary for recognition is said to generalize.

If, for fear of over-fitting, you take precautions too early and train the model with hyper-parameters that are not sufficient for it to learn, that situation is called under-fitting.
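
The three situations described above can be told apart from the training and validation accuracy alone. The following is a minimal sketch, not a standard rule: the function name `diagnose_fit` and the thresholds `min_acc` and `max_gap` are assumptions chosen for illustration, and the right values depend on your task.

```python
def diagnose_fit(train_acc, val_acc, min_acc=0.80, max_gap=0.10):
    """Label a training run from its train and validation accuracy (0.0 to 1.0).

    min_acc and max_gap are illustrative thresholds, not values from the article.
    """
    if train_acc < min_acc:
        # The model has not even learned the training set well.
        return "under-fit"
    if train_acc - val_acc > max_gap:
        # Good on training data, much worse on unseen data.
        return "over-fit"
    return "generalizes well"


# The two dog-vs-cat runs from the example, plus a run that learned very little.
print(diagnose_fit(0.98, 0.60))
print(diagnose_fit(0.95, 0.92))
print(diagnose_fit(0.55, 0.57))
```

Running a check like this after every epoch makes it easier to notice the moment the gap between the two scores starts to grow.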

The main problems faced during development of a deep learning model are OVER-FIT, UNDER-FIT, DATA PIPELINE ERROR and GRADIENT EXPLOSION. We will discuss the cause of each problem and quick remedies to apply when you run into it.

OVER-FIT:-As discussed above, a model that learns more patterns than required is over-fitted. You can generally spot this situation when training accuracy is high and validation accuracy is much lower. Strategies to fight over-fitting are to increase the size of the training set, to use dropout layers, and to use L1 or L2 regularization. Over-fitting can also sometimes be caused by biased data.
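
To make the dropout and L1/L2 remedies concrete, here is a minimal sketch in plain Python. The helper names (`l1_penalty`, `l2_penalty`, `dropout`) and the strength parameter `lam` are assumptions for illustration; a real framework provides its own layers and options for these. The dropout shown is the common "inverted" variant, which rescales the surviving activations during training so that nothing needs to change at validation time.

```python
import random


def l1_penalty(weights, lam):
    """L1 regularization term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout on a list of activations.

    During training each value is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate). Outside training the input is
    returned unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [3.0, -4.0]
data_loss = 0.7  # hypothetical loss value from the training data
total_loss = data_loss + l2_penalty(weights, lam=0.01)
print(total_loss)

rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng, training=False))
```

The penalty is added to the loss so that large weights become expensive, while dropout prevents the model from relying on any single activation.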

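UNDER-FIT:-When a model is not able to learn even the basic patterns needed to recognize something, it is under-fitted. You can spot this situation when accuracy is low on the training set as well as the validation set. With heavy dropout or regularization, validation accuracy can even be higher than training accuracy, because these are applied only during training. The quick remedy is to reduce dropout and regularization, if you have used any in your architecture. You can also try training on more data, although training for longer or using a larger model is usually the more direct fix.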

DATA PIPELINE ERROR:-This error is really hard to detect, and it is very likely to occur. When your accuracy takes an impossible value (for example, a negative one) or your loss climbs extremely high, the cause is usually a bug in the data pipeline. Find the bug and fix it before changing anything else.
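
One way to catch data pipeline bugs early is to run a few sanity checks on every batch before it reaches the model. The sketch below is only an illustration: the function `check_batch` and the particular checks it performs are assumptions, and a real pipeline will need checks specific to its own data.

```python
import math


def check_batch(features, labels, num_classes):
    """Return a list of human-readable problems found in one batch.

    features: list of rows, each a list of numbers.
    labels:   list of integer class ids, expected in 0..num_classes - 1.
    """
    problems = []
    if len(features) != len(labels):
        problems.append("features and labels differ in length")
    for i, row in enumerate(features):
        if any(math.isnan(x) or math.isinf(x) for x in row):
            problems.append(f"row {i} contains NaN or infinity")
    for i, label in enumerate(labels):
        if not (isinstance(label, int) and 0 <= label < num_classes):
            problems.append(f"label {label!r} at index {i} is out of range")
    if len(labels) > 1 and len(set(labels)) == 1:
        problems.append("every label in the batch is the same class")
    return problems


good = check_batch([[0.1, 0.2], [0.3, 0.4]], [0, 1], num_classes=2)
bad = check_batch([[0.1, float("nan")], [0.3, 0.4]], [0, 5], num_classes=2)
print(good)
print(bad)
```

An empty list means the batch passed these particular checks; it does not prove the pipeline is correct.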

GRADIENT EXPLOSION:-This situation arises when you feed un-normalized data to the architecture, which lets the gradients grow to very large values. You have to check your gradient values periodically. If this situation happens, introduce batch normalization in your architecture.
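
Both the periodic gradient check and the normalization step can be sketched in plain Python. The names `gradient_norm`, `is_exploding` and `normalize_batch`, and the `threshold` value, are assumptions for illustration. `normalize_batch` shows only the normalization at the heart of batch normalization; a real batch normalization layer also learns a scale and a shift.

```python
import math


def gradient_norm(gradients):
    """Euclidean (L2) norm of a flat list of gradient values."""
    return math.sqrt(sum(g * g for g in gradients))


def is_exploding(gradients, threshold=1000.0):
    """True if the gradient norm is NaN, infinite, or above `threshold`.

    The threshold is an illustrative value, not one from the article.
    """
    norm = gradient_norm(gradients)
    return math.isnan(norm) or math.isinf(norm) or norm > threshold


def normalize_batch(values, eps=1e-5):
    """Shift and scale a batch of values to roughly zero mean, unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


print(gradient_norm([3.0, 4.0]))
print(is_exploding([0.5, -0.2]))
print(is_exploding([1e6, 3.0]))

# Raw pixel intensities (0..255) are a typical example of un-normalized input.
print(normalize_batch([0.0, 128.0, 255.0]))
```

Logging the gradient norm every few steps is usually enough to see an explosion coming before the loss turns into NaN.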

CONCLUSION:-In this post we have looked at the concepts of over-fit, under-fit and generalization in detail. We have also seen the main problems faced in model development: over-fit, under-fit, data pipeline error and gradient explosion.

Taher Ali Badnawarwala
