Data Normalization Hello World with MSSQL

Jay (Vijayasimha BR)
3 min readFeb 16, 2025

--

A charming and whimsical stylized painting of two indian women with long hair, dressed in crop tops and shorts, looking over brass pots of varying sizes, placed on a kitchen table. The artwork is created in a mixed media, textured art style, giving it a playful, colorful feel with rich, layered textures.

I came across this term called ‘data normalization’ in my ongoing DP 900 cert preparation.

First up, make sure that you have a MSSQL database ready to roll in Azure Portal, or some other alternative of your choice.

First, we need a ‘non-normalized’ table with some data. Here are some queries that will give you that.

CREATE TABLE StudentsCourses (
StudentID INT,
StudentName VARCHAR(100),
CourseID INT,
CourseName VARCHAR(100),
Instructor VARCHAR(100)
);

Here, notice how everything is combined into a single table. The student details, the course details and also the instructor details. That is a problem. It’s just bad table design, and one of the ways to make it better, is to ‘normalize’ this by designing a better table system.

If we put some data into this, like this…

INSERT INTO StudentsCourses (StudentID, StudentName, CourseID, CourseName, Instructor) VALUES
(1, 'Alice', 101, 'Math', 'Prof. A'),
(1, 'Alice', 102, 'Science', 'Prof. B'),
(2, 'Bob', 101, 'Math', 'Prof. A'),
(2, 'Bob', 103, 'History', 'Prof. C');

and we get something like this.

Right now, we normalize, whilst following the principles laid out for the same. (borrowed from this Microsoft learn chapter)

  1. Separate each entity into its own table.
  2. Separate each discrete attribute into its own column.
  3. Uniquely identify each entity instance (row) using a primary key.
  4. Use foreign key columns to link related entities.

Okay, we see there are three main players.

  1. Students
  2. Courses
  3. Enrollments

So, we should ideally have 3 tables as per the normalization rules (and other rules also)

CREATE TABLE Students (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(100)
);

CREATE TABLE Courses (
CourseID INT PRIMARY KEY,
CourseName VARCHAR(100),
Instructor VARCHAR(100)
);

CREATE TABLE Enrollments (
StudentID INT,
CourseID INT,
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);
-- Students
INSERT INTO Students (StudentID, StudentName) VALUES
(1, 'Alice'),
(2, 'Bob');

-- Courses
INSERT INTO Courses (CourseID, CourseName, Instructor) VALUES
(101, 'Math', 'Prof. A'),
(102, 'Science', 'Prof. B'),
(103, 'History', 'Prof. C');

-- Enrollments
INSERT INTO Enrollments (StudentID, CourseID) VALUES
(1, 101),
(1, 102),
(2, 101),
(2, 103);
-- Retrieve all students and their courses
SELECT s.StudentName, c.CourseName, c.Instructor
FROM Students s
JOIN Enrollments e ON s.StudentID = e.StudentID
JOIN Courses c ON e.CourseID = c.CourseID;

-- Retrieve all courses and their enrolled students
SELECT c.CourseName, s.StudentName
FROM Courses c
JOIN Enrollments e ON c.CourseID = e.CourseID
JOIN Students s ON e.StudentID = s.StudentID;

and see how things look, just, better, normalized.

There you go. a simple example to get me, and hopefully, started.

I work as a coding tutor. You can hire me on Upwork, Fiverr and Codementor. You can also book a session on calendly, and visit my website. Also, video tutorials on my YouTube Channel. My Podcast is here.

--

--

No responses yet