All ArticlesSchemaVis logo

Database Normalization Explained: 1NF, 2NF, 3NF

A
Akshat Kotpalliwar
6 min read

Database normalization is a systematic approach to organizing data within a relational database. The primary goals are to reduce data redundancy (storing the same data in multiple places) and to improve data integrity. Normalization involves dividing large tables into smaller, less redundant tables and defining relationships between them.

The process is guided by a series of rules known as Normal Forms (NF). While higher normal forms exist, achieving the Third Normal Form (3NF) is the standard for most transactional databases (OLTP).

First Normal Form (1NF)

To satisfy 1NF, a table must meet the following criteria:

  1. Atomic Values: Each column must hold indivisible (atomic) values. No arrays, lists, or comma-separated values in a single cell.
  2. Unique Column Names: Each column in a table must have a unique name.
  3. No Repeating Groups: You should not have multiple columns representing the same attribute (e.g., Phone1, Phone2, Phone3).

Fixing 1NF Violations: If a user has multiple phone numbers, create a separate 'user_phones' table with a foreign key linking back to the 'users' table, rather than storing them in a single column or multiple numbered columns.

Second Normal Form (2NF)

To achieve 2NF, a table must:

  1. Be in 1NF.
  2. Have no Partial Dependency. This means all non-key attributes must depend on the entire primary key.

Context: This only applies if a table has a composite primary key (a primary key made of two or more columns). If a column depends on only one part of the composite key, it violates 2NF.

Fixing 2NF Violations: Move the partially dependent columns into a new table where the part of the composite key they depend on becomes the sole primary key.

Third Normal Form (3NF)

To achieve 3NF, a table must:

  1. Be in 2NF.
  2. Have no Transitive Dependency. This means a non-key column must not depend on another non-key column. All non-key columns must depend only on the primary key.

Example: Consider a 'orders' table containing 'customer_id', 'customer_name', and 'customer_address'. 'customer_name' and 'customer_address' depend on 'customer_id', not the 'order_id' (the primary key).

Fixing 3NF Violations: Extract the transitively dependent columns into a new table (e.g., a 'customers' table) and leave the foreign key ('customer_id') in the original table.

Denormalization

While normalization is crucial for data integrity and reducing update anomalies, strict adherence can sometimes lead to complex queries requiring numerous JOINs, which can degrade read performance. In analytical databases (OLAP) or high-read systems, engineers may intentionally denormalize data, accepting some redundancy in exchange for faster read times. Understanding normalization is a prerequisite to knowing when and how to safely denormalize.

Related Topics

Database DesignNormalization1NF2NF3NFRelational SchemaData Integrity

Visualize your schemas effortlessly

Stop writing manual DDL or guessing relationships. Drop your SQL file into SchemaVis and instantly generate an interactive diagram.

Try SchemaVis for Free