All ArticlesSchemaVis logo

Common SQL Antipatterns and How to Avoid Them

A
Akshat Kotpalliwar
6 min read

SQL antipatterns are common, recurring mistakes in database schema design and query writing. While they might seem like pragmatic solutions initially, they inevitably lead to poor performance, complex queries, and data integrity issues as the application scales.

Here are some of the most prevalent SQL antipatterns and how to refactor them.

1. The Entity-Attribute-Value (EAV) Pattern

The Antipattern: Storing dynamic attributes in a single table with three columns: Entity ID, Attribute Name, and Attribute Value. This is often done to avoid altering the schema when new attributes are needed.

The Problem: EAV destroys data integrity. You cannot enforce data types (everything is usually stored as a string), you cannot easily make attributes required, and querying becomes a nightmare of endless self-joins to assemble a single complete entity.

The Solution: Use standard relational columns. If the schema is truly highly dynamic, utilize native JSON/JSONB data types supported by modern databases like PostgreSQL and MySQL. JSONB allows for dynamic schemas while still supporting indexing and relatively fast querying.

2. Comma-Separated Lists

The Antipattern: Storing multiple values in a single text column separated by commas (e.g., a 'tags' column containing 'sql,database,performance').

The Problem: This violates First Normal Form. It is impossible to use standard indexes to find rows containing a specific tag. Updates require complex string manipulation. You cannot use foreign keys to ensure the tags actually exist in a master tags table.

The Solution: Create an intersection (junction) table to model a many-to-many relationship. For example, a post_tags table containing post_id and tag_id.

3. Polymorphic Associations

The Antipattern: A table referencing multiple parent tables using two columns: parent_id and parent_type. For example, a 'comments' table that can belong to either a 'post' or an 'image'.

The Problem: You cannot define a Foreign Key constraint. The database cannot enforce referential integrity because a single column (parent_id) cannot point to multiple different tables. Orphaned records are inevitable.

The Solution: Use an exclusive arc (multiple nullable foreign keys where only one can be NOT NULL via a CHECK constraint) or create separate junction tables for each parent type (e.g., post_comments and image_comments).

4. Fetching Data to Perform Logic in the Application

The Antipattern: Writing a simple SELECT * FROM users and then writing a loop in Python, Node.js, or Java to aggregate the data, filter it, or calculate sums.

The Problem: Databases are highly optimized C/C++ engines designed specifically for data manipulation. Transferring thousands of rows across the network only to sum them up in an application consumes massive amounts of memory and network bandwidth.

The Solution: Push the compute to the data. Use SQL's powerful aggregate functions (SUM, AVG, COUNT), GROUP BY clauses, and Window Functions to perform calculations inside the database. Return only the final result to the application.

Related Topics

SQL AntipatternsEAVPolymorphic AssociationsDatabase DesignJSONBRefactoring

Visualize your schemas effortlessly

Stop writing manual DDL or guessing relationships. Drop your SQL file into SchemaVis and instantly generate an interactive diagram.

Try SchemaVis for Free