The importance of Database Relations Normalization and its effects on efficiency.
'Schema Busters' is a multipart technical overview of Database Design elements that you can implement when designing and accessing your database implementation. In this first section we will discuss the importance of a concept titled Normalization which aims to eliminate a number of undesirable attributes that a database relation may contain. Initially we will discuss these undesirable attributes and then describe a number of popular methods, called Normalizations, to remove any undesirable attributes and convert an existing database relation into what is known as a Higher Normal Form Relation. In each section we will also provide real world examples where such normalizations improve the quality of relations and query efficiency.
Undesirable #1: Non-atomic field values
Storing atomic (or indivisible) values in a field makes accessing data simpler and reduces redundancy. Atomicity can be obtained by disallowing the direct storage of multiple values in a single attribute field. That is, a single field contains at most a single element of the attribute domain associated with this field. Take for example a CAR relation with a Color attribute. Even though a single car may come in several colors, we would not want to store a list of colors in a single field since it makes accessing a single color element indirect and complicated (shown in figure 1).
CAR
Car_Id | Name | Color |
---|---|---|
001 | Honda Civic | {Black} |
002 | Toyota Camry | {Red,White,Blue} |
To resolve this issue we need to convert our CAR relation into what is called a First Normal Form relation. There are different ways of achieving this, some more efficient than others. Below is my recommended method:
- Remove the Color attribute from the CAR entity and place it into an associative table CAR_COLOR along with the primary key of CAR, Car_Id.
Undesirable #2: Non-Full Functional Dependency
Enforcing full functional dependency more accurately and properly associates relational data together. For example, a table may have a composite primary key with different elements being functionally dependent on pieces of the composite key. Full functional dependency separates data relations into intuitive dependency sets and allows for further database normalization. Take for example the following CAR_SHOP, which refers to a shop that works on cars, relation and functional dependencies (denoted FD):
CAR_SHOP
Car_Id | Shop_Id | Driver | Owner | Price |
---|
- FD1: Car_Id, Shop_Id >> Price
- FD2: Car_Id >> Driver
- FD3: Shop_Id >> Owner
The functional dependencies describe relations between data. FD1 simply states that for a given car at a given shop the price can be evaluated, FD2 states that a given car has a given driver, and FD3 states that a given shop has a given owner. We can break up the CAR_SHOP entity to be simpler and more descriptive by performing the following manipulations:
- Create a CAR entity that associates Car_Id and Driver.
- Create a SHOP entity that associates Shop_Id and Owner.
- Modify the CAR_SHOP entity that associates Car_Id, Shop_Id, and Price.
Undesirable #3: Transitive Dependencies
Transitive dependencies exist when a non-key attribute of an entity is found on both the left and right-hand side of a known functional dependency. This can lead to attributes being functionally dependent to non-key attributes. Take for example a modified version of the CAR entity we described in the previous section:
CAR
Car_Id | VIN | OWNER | DL_NUMBER | SSN |
---|
- Car_Id >> Owner >> DL_Number, SSN
- Modify the CAR entity and associates Car_Id, VIN, and Owner.
- Create an OWNER entity that associates Owner, DL_Number, and SSN.
Conclusion (for now)
While there are more normalizations we can consider, several have limited practicality in real-world environments and can be made unnecessary if a database is designed carefully. I say this not to belittle their importance, but to recognize the matter of pertinence in our discussion. If your projects are requiring you to perform additional normalizations, you may consider picking up Fundamentals of Database Systems by Elmasri & Navanthe or another database schema textbook and doing some independent studies. Hopefully by now you have gained a general understanding of the importance of database relation normalization and how it can improve your database schema. Look for my next entry in this series where we will discuss additional database schema concepts and practices.
.blogtable { width:100%; border:solid 1px #333333; color:White; } .blogtable th { background-color:#555555; color:White; } .blogtable td { background-color:#aaaaaa; color:White; padding:5px; }