If you don't already have a database named "library", then make sure to run createdb library
from the terminal before starting this chapter. The library database is needed to work through this chapter's content.
The library database can be restored from a backup file. First download the backup file for this chapter from here.
Then run the following command:
psql -d library < multi_tables.sql
When running the above command, you may see the following message: ERROR: relation "public.users" does not exist
. This can be safely ignored. There are some commands in place at the start of the backup file for altering and removing the users
table. These are included in case that table already exists. If the table doesn't exist, PostgreSQL will ignore the command and move on to executing the rest of the backup file.
In the chapter on altering a table, we ended up changing our users database in various ways. With that in mind, we'll start fresh in this chapter. Follow the directions above in the info box to retrieve starter data for this chapter. We'll still be working with the familiar library
database and users
table from previous chapters, but with a blank users
table to start with. In this chapter we want to add several more pieces of data to our database. We would like to store
We could start adding these pieces to the users
table, resulting in a table with many columns. First we need to add these columns using the ALTER TABLE
command.
library=# ALTER TABLE
users
ADD COLUMN book_title VARCHAR(25),
ADD COLUMN review_content VARCHAR(50),
ADD COLUMN rating int,
ADD COLUMN address VARCHAR(255);
ALTER TABLE
The users
table now has 8 fields.
library=# \d users
Table "public.users"
Column | Type | Modifiers
----------------+-----------------------------+----------------------------------------------------
id | integer | not null default nextval('users_id_seq'::regclass)
username | character varying(25) | not null
enabled | boolean | default true
last_login | timestamp without time zone | not null default now()
book_title | character varying(25) |
review_content | character varying(50) |
rating | integer |
address | character varying(255) |
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
Next, we need to insert some data into the table
INSERT into users(username, enabled, book_title, review_content, rating, address)
VALUES('John Smith', false, 'Chaos', 'What is the butterfly effect?', 5, '2 Market Street');
INSERT into users(username, enabled, book_title, review_content, rating, address)
VALUES('Jane Smith', true, 'Simple Taoism', 'About Living in balance', 5, '3 Market Street');
INSERT into users(username, enabled,book_title, review_content,rating,address)
VALUES('Jane Smith', true, 'The Storyteller', 'Memories of WWII', 4, '3 Market Street');
Let's see what the data looks like
library=# SELECT * FROM users;
id | username | enabled | last_login | book_title | review_content | rating | address
----+------------+---------+----------------------------+-----------------+--------------------------------+--------+-----------------
4 | John Smith | f | 2016-03-26 19:01:52.341599 | Chaos | What is the butterfly effect? | 5 | 2 Market Street
5 | Jane Smith | t | 2016-03-26 19:05:55.439876 | Simple Taoism | About Living in balance | 5 | 3 Market Street
6 | Jane Smith | t | 2016-03-26 19:08:06.039351 | The Storyteller | Memories of WWII | 4 | 3 Market Street
(3 rows)
Not pretty!!
With just 3 records, the data is cluttered with duplicate information. For instance, Jane Smith
has to be inserted for every book and review for that user. Furthermore, we still need to add more information, such as author and checkout_date of the book.
It makes sense to store them in separate tables and provide a link between the tables so we can figure out which review or book belongs to which user.
If we break out the data into separate tables, we need tables for Users
, Addresses
, Books
and Reviews
. We've mentioned relational databases a lot, but haven't talked a lot about what that relational part means. Now that we know which tables we need, we also need to map the relationships between these tables, sometimes known as entities.
The relationships we need to model are
Let's take a look at the design of these four tables together, we will then explain how each of these relationships is modeled individually. To describe the overall design, we are using an entity relationship diagram or ERD, of the tables.
What is an ERD?
An entity relationship diagram is a graphical representation of entities and their relationships to each other. An entity represents a real world object, a piece of data that we want to model within our database.
The arrows between these tables indicate the relationships that are linking them together. The links in the tables are formed with Primary Keys and Foreign Keys. We are leveraging these keys to create relationships. You will notice an additional table users_books
, which we have not mentioned so far. The user and books relationship suggests that, the user has many books and a book has many users. We need to capture the data of this many-to-many relationship in the users_books table.
There are 3 types of table relationships in a relational database. Our library
database modeled in the ER diagram above has all 3 of these, shown in the table below.
Relationship | Notes |
---|---|
One-to-One | A User has ONE address |
One-to-Many | A Book has MANY reviews |
Many-to-Many | A user has MANY books and a book has MANY users |
Let's look at each relationship and how to create them in SQL.
A one-to-one relationship between two entities exists when a particular entity instance exists in one table, and it can have only one associated entity instance in another table.
Example: A user has only one address, and an address belongs to only one user.
In the database world, this sort of relationship is implemented like this: the id that is the PRIMARY KEY of the users
table is used as both the FOREIGN KEY and PRIMARY KEY of the addresses
table.
CREATE TABLE users (
id serial,
username VARCHAR(25) NOT NULL,
enabled boolean DEFAULT TRUE,
last_login timestamp NOT NULL DEFAULT NOW(),
PRIMARY KEY (id)
);
/*
one to one: User has one address
*/
CREATE TABLE addresses (
user_id int NOT NULL,
street VARCHAR(30) NOT NULL,
city VARCHAR(30) NOT NULL,
state VARCHAR(30) NOT NULL,
PRIMARY KEY (user_id),
CONSTRAINT fk_user_id FOREIGN KEY (user_id) REFERENCES users (id)
);
Notice the last two lines in the CREATE
addresses command. These two lines create the constraint that makes the user_id
the Primary Key of the addresses table and also the Foreign Key of the users
table.
A one-to-many relationship exists between two entities if an entity instance in one of the tables can be associated with multiple records (entity instances) in the other table. The opposite relationship does not exist; that is, each entity instance in the second table can only be associated with one entity instance in the first table.
Example: A book has many reviews. A review belongs to only one book.
In the database world, this sort of relationship is implemented by ensuring that the book_id that is the PRIMARY KEY of the books
table is a FOREIGN KEY of the reviews
table
CREATE TABLE books (
id serial,
title VARCHAR(100) NOT NULL,
author VARCHAR(100) NOT NULL,
published_date timestamp NOT NULL,
isbn int,
PRIMARY KEY (id),
UNIQUE (isbn)
);
/*
one to many: Book has many reviews
*/
DROP TABLE IF EXISTS reviews;
CREATE TABLE reviews (
id serial,
book_id int NOT NULL,
user_id int NOT NULL,
review_content VARCHAR(255),
rating int,
published_date timestamp DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id),
FOREIGN KEY (book_id) REFERENCES books(id) ON DELETE CASCADE,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
NOTE: CURRENT_TIMESTAMP is an alias for the NOW() function. They can both be used to set the default of a timestamp as the current date and time. Notice how CURRENT_TIMESTAMP doesn't use parentheses while NOW() does. This has to do with keeping certain functions compatible with SQL standard. You'll almost always find that functions not specific to PostgreSQL use parentheses. PostgreSQL specific functions may have functions with or without parentheses.
Notice the last lines in the CREATE
reviews command. The FOREIGN KEY
constraint maps the book_id
from the reviews table to the Primary Key id
of the books table.
A many-to-many relationship exists between two entities if for one entity instance there may be multiple records in the other table and vice versa.
Example: A user has many books checked out or may have checked them out in the past. A book has many users that have checked a book out.
In the database world, this sort of relationship is implemented by introducing a third cross-reference table, that holds the relationship between the two entities, which is the PRIMARY KEY of the books
table and the PRIMARY KEY of the user
table.
Look at the line PRIMARY KEY (user_id, book_id)
. The primary key is not a single key, but a composite key based on user_id and book_id. Therefore, the pair has to be unique.
CREATE TABLE users_books (
user_id int NOT NULL,
book_id int NOT NULL,
checkout_date timestamp,
return_date timestamp,
PRIMARY KEY (user_id, book_id),
FOREIGN KEY (user_id) REFERENCES users(id) ON UPDATE CASCADE,
FOREIGN KEY (book_id) REFERENCES books(id) ON UPDATE CASCADE
);
We have used the terms Primary Key and Foreign Key throughout this chapter for ids. In the next chapter, we will take a look at a bit of theory on keys and constraints.