Standardization vs Normalization.
What is the difference?
Normalization rescales values into the range [0, 1]: each value x becomes (x - min) / (max - min).
It is also known as Min-Max scaling.
It is useful when you want different datasets to be on the same positive scale.
It also creates a boundary for the values.
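As a quick sketch, min-max scaling is a one-liner in NumPy (the helper name is just for illustration):

```python
import numpy as np

def min_max_scale(x):
    # Rescale to [0, 1]: (x - min) / (max - min)
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

print(min_max_scale([10, 20, 50]))  # [0.   0.25 1.  ]
```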
Consider this:
You analyze the stock market.
One stock might range from $10 to $50 over a year, while another could range from $100 to $500.
Without normalization, the two price scales cannot be compared directly.
If you normalize the prices, their relative performance becomes easy to compare.
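For example, with made-up prices for the two stocks (the values are illustrative, not real market data):

```python
import numpy as np

# Hypothetical closing prices over the year (illustrative only)
stock_a = np.array([10.0, 25.0, 40.0, 50.0])      # ranges $10-$50
stock_b = np.array([100.0, 150.0, 300.0, 500.0])  # ranges $100-$500

def min_max_scale(x):
    return (x - x.min()) / (x.max() - x.min())

# Both series are now on the same [0, 1] scale and directly comparable
print(min_max_scale(stock_a))  # [0.    0.375 0.75  1.   ]
print(min_max_scale(stock_b))  # [0.    0.125 0.5   1.   ]
```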
Standardization rescales data so that it has a mean of 0 and a standard deviation of 1: each value x becomes (x - mean) / (standard deviation).
It is good for algorithms that assume the data is centered around 0 and that the variance is the same for all features.
Unlike normalization, it creates no boundary for the values.
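A minimal sketch of standardization (the z-score), again with an illustrative helper name:

```python
import numpy as np

def standardize(x):
    # Z-score: (x - mean) / std -> mean 0, standard deviation 1
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

z = standardize([10, 20, 50])
print(z.mean(), z.std())  # ~0.0 and 1.0; note z itself is unbounded
```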
Consider this:
A hotel is analyzing guest feedback.
- The scores for cleanliness are low (people are rating it harshly)
- Friendliness is rated well.
If we compare these two raw results, cleanliness looks like something the hotel definitely needs to improve.
Here is the trick:
What if people are generally more critical of this attribute and always give cleanliness lower scores than friendliness?
Then we cannot compare the raw feedback scores.
By standardizing the data, we can compare the results and see which areas are truly underperforming.
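Here is a sketch of that trick with made-up numbers (the industry baselines and scores are purely illustrative):

```python
# Hypothetical industry-wide rating behavior: cleanliness is judged harshly everywhere
industry_mean = {"cleanliness": 6.0, "friendliness": 8.5}
industry_std  = {"cleanliness": 0.8, "friendliness": 0.5}

# Our hotel's raw average scores
our_hotel = {"cleanliness": 6.4, "friendliness": 8.0}

# Z-score each attribute against its own baseline: (x - mean) / std
for attr, score in our_hotel.items():
    z = (score - industry_mean[attr]) / industry_std[attr]
    print(f"{attr}: raw {score}, z-score {z:+.2f}")

# cleanliness:  raw 6.4, z-score +0.50  -> actually above the norm
# friendliness: raw 8.0, z-score -1.00  -> the real underperformer
```

The raw scores pointed at cleanliness, but measured against each attribute's own baseline, friendliness is the one that truly lags.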
Did you like this post?
Hit that follow button for me and pay back with your support.
It literally takes 1 second for you but makes me 10x happier.
Thanks 😉
If you haven't already, join our newsletter DSBoost.
We share:
• Podcast notes
• Learning resources
• Interesting collections of content
dsboost.dev
