Don’t overthink it, use a SQL database for your next project
No matter your skill level as a developer, starting a side project should be as simple as possible. That’s why so many developers reach for minimal technical stacks like the JAMstack for their hastily built weekend websites. Frameworks like Gatsby and Next, in concert with platforms like Netlify, make deploying performant websites possible for many more people.
All of these projects have done a great job of smoothing out the learning curve and providing a simpler path to deploying an efficient application to production.
However, it’s difficult to build an interesting side project without considering how you will persist and manage your data. Tools like Firestore and MyJson provide a simple-to-use API and access to a NoSQL database that complements a JavaScript client nicely. These tools can be easy to learn, but they raise questions for beginners who haven’t used a NoSQL database in production before. How should groups of similar objects be treated in a NoSQL database? How much validation do you need on the frontend to keep data from becoming corrupt? How should your application objects relate to the objects you are storing in your database?
Dropping the NoSQL database in favor of a more standard SQL-based RDBMS is an easier path for developers trying to build their first API that requires data persistence. Sure, the initial learning curve can be steeper, and more effort needs to be invested upfront in thinking about your data model, but that investment is worth it. Long term, the SQL database will serve the project well as it grows.
Getting Started
Before starting any kind of project, you should have an outline of what you’re attempting to build. When it comes to database design, that means creating an Entity Relationship Diagram, or ERD. Working on the model before coding out the real thing will save time and help you catch design problems early.
Creating an ERD doesn’t take much. I’ve used a tool called Lucidchart for this article, but I’ve used multiple tools to achieve the same result in the past. Just pick something that can draw boxes, lines, and words, and hopefully something free as well.
Let’s imagine we were building an application that stores information about books being sold at garage sales across the United States. A sample ERD would look something like this:
Choosing your datastore
Compared to the options available in the NoSQL ecosystem, the set of SQL options is more commoditized and battle-hardened. For this article, we will be using Postgres as our DB engine, but these tips can easily translate to other SQL databases like MySQL and SQLite.
For our toy database, we will be using a Postgres image hosted on Docker hub. If you haven’t used Docker before, check out my previous article for a quick crash course.
Running and Connecting to Postgres
We start by running the following command in a terminal to get a Postgres 11 container running on our machine.
$ docker run -d -e POSTGRES_PASSWORD=postgres -p 5432:5432 --name postgres-demo postgres:11
This will run a Postgres container in the background, listening for requests on local port 5432 (make sure this port is free before running the above command). When you run docker ps, you should see the container running.
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a5291652ea29 postgres:11 "docker-entrypoint.s…" 6 seconds ago Up 4 seconds 0.0.0.0:5432->5432/tcp postgres-demo
You will also need the psql client installed on your machine. Check out the directions at https://www.postgresql.org/download/ for installation on your OS.
Once you have everything installed you should be able to run the following command to connect to your new running database.
$ PGPASSWORD=postgres psql -h localhost -U postgres
psql (12.2 (Ubuntu 12.2-1.pgdg18.04+1), server 11.6 (Debian 11.6-1.pgdg90+1))
Type "help" for help.
postgres=#
The password above is being passed insecurely and is not a recommended method for authenticating via psql. Use a ~/.pgpass file if you don’t want to sign in interactively every time you connect to the database.
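For reference, each line of a ~/.pgpass file has the form hostname:port:database:username:password. An entry matching the demo container above would look like the line below (note that the file must be readable only by you, e.g. chmod 0600 ~/.pgpass, or psql will ignore it):

```
localhost:5432:*:postgres:postgres
```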
Creating the database and tables
We will be creating our database and tables via the psql client for this tutorial, but in production, it is recommended to use a migration approach as early as possible.
We’ll start by creating our database.
postgres=# CREATE DATABASE gsb;
CREATE DATABASE
The rest of our commands will be acting against this new database we’ve created. You can see all the databases on the server by running the \l command. You can connect to a database using \c.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
gsb | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)
postgres=# \c gsb
psql (12.2 (Ubuntu 12.2-1.pgdg18.04+1), server 11.6 (Debian 11.6-1.pgdg90+1))
You are now connected to database "gsb" as user "postgres".
gsb=#
Postgres supports transactions, which give you atomic and consistent updates. In other words, you can bundle commands into an all-or-nothing operation, and if the operation fails somewhere in the middle you can easily roll it back. To start a transaction, type the BEGIN; command at the prompt, and to undo the operation, type ROLLBACK;. Once you’re ready to commit the operation, run COMMIT;.
Let’s start by making our first table, garage_sales, by copy-pasting the command below into the psql prompt:
BEGIN;
CREATE TABLE garage_sales (
id SERIAL PRIMARY KEY,
address varchar(256)
);
COMMIT;
Congratulations! You’ve just created your first table in a SQL database. You can insert values into your table like this:
gsb=# BEGIN;
BEGIN
gsb=# INSERT INTO garage_sales (address) VALUES ('1600 Pennsylvania Ave NW, Washington, DC 20500');
INSERT 0 1
gsb=# COMMIT;
COMMIT
And you can select and view your data to make sure it’s there:
gsb=# SELECT * FROM garage_sales;
id | address
----+------------------------------------------------
1 | 1600 Pennsylvania Ave NW, Washington, DC 20500
(1 row)
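If you want to see ROLLBACK in action, you can abandon an insert mid-transaction and confirm the table is left untouched. A quick sketch at the psql prompt, using a made-up address:

```sql
BEGIN;
INSERT INTO garage_sales (address) VALUES ('123 Example St, Springfield');
-- Inside the transaction the new row is visible:
SELECT count(*) FROM garage_sales;  -- 2
ROLLBACK;
-- After ROLLBACK, the insert never happened:
SELECT count(*) FROM garage_sales;  -- 1
```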
Let’s create our authors table, again copy-pasting the command below into the psql prompt.
BEGIN;
CREATE TABLE authors (
id SERIAL PRIMARY KEY,
name varchar(256),
gender char(1)
);
COMMIT;
We now have 2 tables in our gsb database. We can list and inspect them using the \dt and \d commands.
gsb=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------+-------+----------
public | authors | table | postgres
public | garage_sales | table | postgres
(2 rows)
gsb=# \d authors
Table "public.authors"
Column | Type | Collation | Nullable | Default
--------+------------------------+-----------+----------+-------------------------------------
id | integer | | not null | nextval('authors_id_seq'::regclass)
name | character varying(256) | | |
gender | character(1) | | |
Indexes:
"authors_pkey" PRIMARY KEY, btree (id)
Now we’re going to create our first table that has a foreign key constraint. This foreign key creates a relationship between rows in different tables, to which you can attach special meaning and rules.
BEGIN;
CREATE TABLE books (
id SERIAL PRIMARY KEY,
author_id integer REFERENCES authors ON DELETE CASCADE,
authored_date date,
published_date date,
name varchar(256),
genre varchar(64)
);
INSERT INTO authors (name, gender) VALUES ('Benjamin Graham', 'M');
INSERT INTO books (author_id, authored_date, published_date, name, genre) values (1, '1949-01-01', '1949-01-01', 'The Intelligent Investor', 'Investing');
COMMIT;
Let’s break down the foreign key portion of the table above:
The author_id column in the books table is a reference to the id column in the authors table. In the absence of a specific column, the foreign key is based on the primary key of the referenced table.
When an author is deleted from the authors table, that triggers a CASCADE onto the books table and deletes all the rows where that author is referenced.
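You can verify the cascade yourself without losing the sample data by wrapping the delete in a transaction and rolling it back:

```sql
BEGIN;
DELETE FROM authors WHERE id = 1;
-- The cascade removed the dependent book row as well:
SELECT count(*) FROM books;  -- 0
-- Undo the delete so the sample data survives for the next steps:
ROLLBACK;
```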
Finally let’s create our associative table, which lets us create a many-to-many relationship in our data schema.
We’ll start by activating a module that allows us to create universally unique ids.
gsb=# CREATE EXTENSION IF NOT EXISTS "pgcrypto";
We can then use the gen_random_uuid() function this extension provides to generate the primary key for our spottings table.
CREATE TABLE spottings (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
book_id integer REFERENCES books ON DELETE CASCADE,
garage_sale_id integer REFERENCES garage_sales ON DELETE CASCADE
);
And we can now insert our “spotting” which links books to a garage sale.
gsb=# BEGIN;
BEGIN
gsb=# INSERT INTO spottings (book_id, garage_sale_id) VALUES (1,1);
INSERT 0 1
gsb=# COMMIT;
COMMIT
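With all three relationships in place, a single query can stitch the tables back together. For example, listing every book spotted at each sale:

```sql
SELECT b.name AS book, a.name AS author, gs.address
FROM spottings s
JOIN books b ON b.id = s.book_id
JOIN authors a ON a.id = b.author_id
JOIN garage_sales gs ON gs.id = s.garage_sale_id;
-- One row: The Intelligent Investor | Benjamin Graham | 1600 Pennsylvania Ave NW...
```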
In conclusion
You are now done setting up your relational tables and you’re ready to store data for your application. Let’s go over what you’ve learned, and the benefits you’ve gained by following this approach.
You’ve created foreign key relationships between tables that help you keep your data in sync.
You’ve created data types for all of your data, and your database will ensure that you cannot store data in a different format.
You have created a many-to-many relationship table, which means you won’t have to repeat data across multiple collections, and instead, you can just keep references to tables that don’t change often.
You’ve added random key generation in the database layer, making your application logic simpler and more lightweight.
You’ve created a schema that you can review with other contributors on your project, making it simpler to share your data design if/when your project grows.
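As a quick illustration of that type enforcement, try violating one of the column constraints; Postgres rejects the row with an error along these lines:

```sql
INSERT INTO authors (name, gender) VALUES ('Ada Lovelace', 'XY');
-- ERROR:  value too long for type character(1)
```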
I hope this walkthrough provides the incentive to try out a SQL datastore for your next project. You won’t be disappointed in the results.