Vikas Rajput
Vikas Rajput

@vikasrajputin

11 تغريدة 2 قراءة Sep 12, 2023
No matter if you're learning SQL for Data Science or Software Development.
You should master SQL "Normalization".
Here's a simple guide to get you started:
1. Introduction
Normalization is a process to eliminate redundant data, prevent data anomalies, and ensure data integrity.
It makes your life super easy to manage, store and query data effectively in longer run!
Normalization happens in multiple stages:
1NF, 2NF, 3NF
2. Realworld Example
To understand normalization, let's consider a simple restaurant database example.
We have one table "Orders" with columns: OrderID, CustomerName, Dish, Chef, DishType, Price.
3. First Normal Form (1NF)
To satisfy 1NF, data must be atomic (indivisible), and there must be no duplicate rows.
Since, our table has no repeating groups and each cell contains atomic values, it's in 1NF already.
2. Second Normal Form (2NF)
If a table is in 1NF and all non-key columns are fully dependent on the primary key, then you can say a table is in 2NF.
Notice how our Dish, Chef, DishType, and Price data are all repeated?
So, we can split table into "Orders" and "Dishes" tables:
As shown above, we've eliminated redundant Dish, Chef, DishType, and Price data.
This change makes our tables smaller and our updates more efficient - we only have to update one record in the Dishes table to change the price of a dish!
4. Third Normal Form(3NF)
It requires a table to be in 2NF and all columns are only dependent on the primary key.
This means there are no transitive dependencies.
Notice how in the Dishes table, Chef is dependent on the Dish, but not the DishID?
So, for 3NF, we'll further split "Dishes" into "Dishes" and "Chefs".
The Dishes table will now contain DishID, Dish, DishType, and Price, while the Chefs table will have ChefID and Chef.
We will replace Chef with ChefID in the Dishes table.
Now we're in 3NF!
Normalization helps to reduce data redundancy and improve data integrity.
This is especially important as databases grow, making it easier to manage, faster to query, and less error-prone.
Note: As a beginner, knowing upto 3NF normalization is a good starting point to start practising normalization.
But, there're other stages as well like Boyce-Codd Normal Form, 4NF, 5NF, etc to further normalize a more complex data.
Hope you find it useful!
📣 Announcement
You can now have a conversation with me!
If you're feeling stuck in your tech career and need personalized guidance, you can book a 1:1 call with me.
Visit topmate.io for more information.

جاري تحميل الاقتراحات...