Functions

If you don't already have a library database, then make sure to run createdb library from the terminal before starting this chapter. The library database is needed to work through this chapter's content. The data needed to work through this chapter is stored here. If you haven't been following along feel free to download it. Then use the following command to import that data into the library database: psql -d library < advanced_querying.sql

What are Functions?

We've used functions a bit in earlier chapters. In this one we'll go a bit more in depth to learn more about them and what they can be used for. Functions are a set of commands included as part of the RDBMS that perform particular operations on fields or data. Some functions provide data transformations that can be applied before returning results. Others simply return information on operations carried out. We've seen a few functions already: SETVAL() sets the value of a sequence, NEXTVAL() increments the current value of the sequence, and NOW(), returns the current date and time. However, there are many other functions, including MAX and SUM: MAX returns the maximum value of a column in a table, while SUM adds up all values in a column.

Commonly Used Functions

We show below examples of some commonly used functions for dealing with numbers, strings, dates and aggregation.

String

  • TRIM, RTRIM, LTRIM - Removes leading or trailing spaces from a string. These functions are very useful when comparing data that may have been entered by a user and could include spaces at the beginning or end of the value. LTRIM, RTRIM are subsets of TRIM and provide left trim and right trim functionality, respectively.

Follow along by running these SQL statements to observe the results.

First, we will update the books table and change one of the author names to include spaces at both ends of the name

library=# UPDATE books
library-# SET author = '         Rudyard Kipling       '
library-# WHERE author = 'Rudyard Kipling';
UPDATE 0

Now, if we run this query, we won't find an author by that name.

library=# SELECT author FROM books WHERE author='Rudyard Kipling';
 author
--------
(0 rows)

However, if we TRIM the author's name before the comparison, the record is returned.

library=# SELECT author FROM books where TRIM(author)='Rudyard Kipling';
             author
---------------------------------
          Rudyard Kipling
(1 row)
  • LENGTH - Returns the length of the string. This function is useful when working in a database that may have empty data in some of the records.

For example, the following inserts a record with a blank author name:

library=# INSERT INTO books(title, author, published_date, isbn) VALUES ('A book by Anonymous', '', NOW(), 123456789);
INSERT 0 1

If we need to find that entry, we can use this query:

library=# SELECT * FROM books WHERE LENGTH(author) = 0;
 id |        title        | author |       published_date       |   isbn
----+---------------------+--------+----------------------------+-----------
 21 | A book by Anonymous |        | 2016-04-15 18:51:14.651442 | 123456789
(1 row)
  • CONCAT - Is short for concatenation. The dictionary meaning of the word is connecting. In the programming world, and in SQL, it describes the joining of multiple strings.

We saw CONCAT in a previous chapter when we talked about formatting data.

library=# SELECT CONCAT(title, ' by ', author) AS "Books By" FROM books;
                               Books By
-----------------------------------------------------------------------
 My First SQL book by Mary Parker
 My Second SQL book by John Mayer
 My Third SQL book by Alice Munro
 Christmas Stories from French and Spanish writers by Antoinette Ogden
 Michael and His Lost Angel by  Henry Arthur Jones
 Weather Warnings for Watchers by   The Clerk
 Cronaca della rivoluzione di Milano by   Leone Tettoni
 Thomas More by Henriette Roland Holst van der Schalk
 Po├ęsies de Daniel Lesueur by   Daniel Lesueur
 Norman Macleod by   John Wellwood
 Alfried Krupp by   Herman Frobenius
 The Intrusions of Peggy by   Anthony Hope
 Private Spud Tamson by   R. W. Campbell
 The Philosophy of Natural Theology by   William Jackson
 A Treatise on Gunshot Wounds by   Thomas Longmore
 The Sanitary Evolution of London by   Henry Lorenzo Jephson
 The Imprudence of Prue by   Sophie Fisher
 The Forest Schoolmaster by   Peter Rosegger
 Kim by          Rudyard Kipling
 A book by Anonymous by
(20 rows)

You can see the entire list of string functions in in the PostgreSQL documentation.

Aggregation - Count/Max/Min

  • COUNT - A very useful function -- COUNT, as the name implies, lets you count the total number of occurrences of a selection of records.

Count the number of users that have never checked out a book at the library.

library=# SELECT COUNT(*) FROM users LEFT JOIN users_books
library-# ON users.id = users_books.user_id
library-# WHERE users_books.book_id IS NULL;
 count
-------
     1
(1 row)

Since we have a small set of users, if we query to return all the data, we indeed see that Alice Munro has no books associated with her account.

library=# SELECT * FROM users LEFT JOIN users_books ON users.id = users_books.user_id;
 id |  username   | enabled |         last_login         | user_id | book_id |       checkout_date        | return_date
----+-------------+---------+----------------------------+---------+---------+----------------------------+-------------
  1 | John Smith  | t       | 2016-04-06 13:00:16.551272 |       1 |       2 | 2016-04-06 13:00:16.561218 |
  1 | John Smith  | t       | 2016-04-06 13:00:16.551272 |       1 |       1 | 2016-04-06 13:00:16.557111 |
  2 | Jane Smiley | t       | 2016-04-06 13:00:16.562224 |       2 |       2 | 2016-04-06 13:00:16.562979 |
  3 | Alice Munro | t       | 2016-04-06 13:00:16.566877 |         |         |                            |
(4 rows)
  • MAX - Determines the maximum value of a column.
    library=# SELECT MAX(id) FROM books;
     max
    -----
      21
    (1 row)
    

Note, compare this to

library=# SELECT COUNT(id) FROM books;
 count
-------
    20
(1 row)

Do you know why the count of records in the books table is smaller than the max identifier of that table? The difference indicates that some records have been deleted from the database.

  • MIN - Min returns the minimum value of a column.
    library=# SELECT MIN(id) FROM books;
     min
    -----
       1
    (1 row)
    

You can see the entire list of aggregate functions in in the PostgreSQL documentation.

Date/Time

  • CURRENT_DATE - returns the current date in DATE format
  • CURRENT_TIME - returns the current time in TIME format
  • NOW() - returns the full date and time in TIMESTAMP format

Try this command in the PostgreSQL console:

library=# SELECT NOW(),CURRENT_DATE,CURRENT_TIME;
              now              |    date    |       timetz
-------------------------------+------------+--------------------
 2016-04-15 19:23:25.577211-07 | 2016-04-15 | 19:23:25.577211-07
(1 row)

Now, suppose we want to select all books published after April 15th, 2016. Earlier, we added a new record to the books table, which gave it a publication date matching the then current date and time. Therefore, this new record is indeed returned by the query. Note that our example shows the publication date as April 15, 2016 exactly, so you might think it wouldn't be returned by this query. However, not only are the dates compared, but the times are also compared to resolve such discrepancies.

library=# SELECT * FROM books WHERE published_date > '2016-04-15';
 id |        title        | author |       published_date       |   isbn
----+---------------------+--------+----------------------------+-----------
 21 | A book by Anonymous |        | 2016-04-15 18:51:14.651442 | 123456789
(1 row)

We can replace the date in the query with the CURRENT_DATE function. CURRENT_DATE is commonly used in SQL queries to find new records inside a table, especially when troubleshooting or developing new code.

library=# SELECT * FROM books WHERE published_date > CURRENT_DATE;
 id |        title        | author |       published_date       |   isbn
----+---------------------+--------+----------------------------+-----------
 21 | A book by Anonymous |        | 2016-04-15 18:51:14.651442 | 123456789
(1 row)

We can use the time to compare to previous dates starting from that date. For example, CURRENT_DATE - 10, is 10 days ago, April 5th.

library=# SELECT CURRENT_DATE - 10 AS "Ten Days Ago";
 Ten Days Ago
--------------
 2016-04-05
(1 row)

Now, if we want to get a count of all the books published 10 days ago, we can run the following query:

library=# SELECT COUNT(*) FROM books WHERE published_date > CURRENT_DATE - 10;
 count
-------
    20
(1 row)

You can see the entire list of data and time functions in in the PostgreSQL documentation.

Mathematical

Mathematical functions perform various mathematic operations as a part of a SQL statement. You might use similar functions in a programming language such as Ruby or Python after retrieving the raw data from a database. There are times where it is simpler to perform these mathematic operations within a SQL query, especially when the result of the operation is used in a WHERE or HAVING clause. These functions are also commonly used in a query's SELECT clause so the resulting value is returned by the query.

  • ABS - Get the absolute value of a number (removing the sign)

    library=# SELECT ABS(-10);
     abs
    -----
      10
    (1 row)
    
  • ROUND - For types PostgreSQL considers 'exact' values, such as DECIMAL, this simply rounds the way you were likely taught in math class.

    library=# SELECT ROUND(-10.2);
     round
    -------
       -10
    (1 row)
    
  • RANDOM() - Get a random floating point value between 0 and 1

    library=# SELECT RANDOM();
          random
    -------------------
     0.688725820276886
    (1 row)
    

You can see the entire list of mathematical functions in in the PostgreSQL documentation.