- Database & Table Basics
- What is PostgreSQL?
- What is a database?
- What is a table?
- What are the three parts of a Postgres table?
- How to get table size in postgres?
- How to get table tablespace size in postgres?
- How to get table index size in postgres?
- How to get total table size in postgres?
- Why are tables larger on-disk than in memory?
- Conclusion
Database & Table Basics
Before we begin, first we will go over some basics to reaffirm at-least a rudimentary understanding of databases and tables in an object-relational database.
What is PostgreSQL?
PostgresSQL is an object-relational database program along the same vane as MySQL, and is one of its biggest competitors. More commonly referred to as simply “postgres”, it has a proven track record for superior functionality, reliability and performance, dating all the way back to its conception in 1986. An open source alternative to Microsoft SQL, MySQL, MariaDB, and Percona MySQL, which make up the majority of the database software market share used in the IT Industry today. The posgres architecture is an industry titan where ease of use, speed, extensibility, and data security are pivotal to the success of your application.
What is a database?
A database is a software framework used for storing, reading, and modifying the data that an application needs to manipulate. Storing your application dataset in flat text files only takes you so far and will rapidly create a bottleneck that slows performance down to a snail’s crawl. Database software, at its most basic, aims to make your data storage more efficient by avoiding the hindrances of traditional flat file disk storage. A properly optimized database structure leverages indexes to enumerate the most commonly accessed data, keeping queries fast and accurate, while retaining the ability to access any non-indexed data when necessary.
You can think of a database like your local city bus. The seats on the bus would represent table structure while the individual people in those seats would be the data, or tablespace. Each bus stop along its route represents queries which can add/remove people (data) related to the stop in question.
What is a table?
Tables are a skeletal structure within a database used to collate data into rows and columns of information that is easy to reference and updates as needed. Columns make up the individual components of a record, e.g., name, age, address, favorite color, etc., while rows are each individual record contained in the list of stored records.
Continuing our city bus analogy, the tables in your database are the seats. They are a rigid structure in a specific configuration that has been optimized to handle its people (data). Bus stops represent queries which are used to retrieve and deliver the correct people to their intended destination or to stop and pickup, then store in the tablespace (seats) for later use.
What are the three parts of a Postgres table?
Database tables are split up in three individual components:
- Table Structure – This is the framework that outlines what kind of data each column can hold. It is part of the tablespace, but it is not part of the user data. Using the bus analogy, the structure would be how the seats are arranged and the space those seats themselves take up when empty.
- Tablespace – This is the total measure of all the data within a table, including the table structure. Back to our bus analogy, this would be the entire contents of a row of seats, including those seats, and the people on them.
- Table Indexes – Indexes are additional metadata in the form of quick reference tables that enumerate the commonly used columns to speed up processing. Harking back to our bus metaphor, indexes would be the standing hand-holds. Those people standing in anticipation of their up coming stop so they can swiftly disembark is a great example of how indexes speed up databases tables.
These three pieces can be calculated independently, but ultimately together make up the total overall size of a table. This next set of questions will focus on these individual components and how to interact with them via postgres queries and functions. We’ll also go over the discrepancy you can observe between the raw size of a table space in memory versus the amount of disk space that identical data consumes on disk.
How to get table size in postgres?
To retrieve the raw bytes size of a specific table, including its tablespace and structure, we use the pg_table_size()
function as our query like so:
Example Query – How to get table size in postgres | |
SELECT pg_table_size(‘mytablename’); | |
Example Result – How to get table size in postgres | |
pg_table_size ————– 235929600 |
Oftentimes, especially on large datasets, you will want to convert this to a more human friendly format. Unlike MySQL based products, which require constructing the math equation directly into your query, PosgreSQL has a built-in function, i.e., pg_size_pretty()
to convert bytes to the more common human readable terms. We can employ this function along with our previous query like so:
Example Pretty Query – How to get table size in postgres | |
SELECT pg_size_pretty (pg_table_size(‘mytablename’)); | |
Example Pretty Result – How to get table size in postgres | |
pg_size_pretty —————- 225 MB (1 row) |
It’s important to remember that pg_table_size()
only provides the table size excluding indexes, but does factor other associated object data like: TOAST, free space and visibility maps.
How to get table tablespace size in postgres?
In order to get the tablespace size of the dataset in question, Posgres has a function for this as well, i.e. pg_tablespace_size()
and it’s utilized in the exact same manner like so:
Example Query – How to get tablespace size in postgres | |
SELECT pg_tablespace_size(‘mytablename’); | |
Example Result – How to get tablespace size in postgres | |
pg_tablespace_size —————— 54525952 |
|
Example Pretty Query – How to get tablespace size in postgres | |
SELECT pg_size_pretty (pg_tablespace_size(‘mytablename’)); | |
Example Pretty Result – How to get tablespace size in postgres | |
pg_size_pretty —————- 52 MB (1 row) |
How to get table index size in postgres?
Posgres also has a built-in function for querying index sizes separately from pg_total_relation_size()
and this is with pg_indexes_size()
. This function operates the same way as the others and looks like this:
Example Query – How to get table index size in postgres | |
SELECT pg_indexes_size(‘mytablename’); | |
Example Result – How to get table index size in postgres | |
pg_tablespace_size —————— 54525952 |
|
Example Pretty Query – How to get table index size in postgres | |
SELECT pg_size_pretty (pg_indexes_size(‘mytablename’)); | |
Example Pretty Result – How to get table index size in postgres | |
pg_size_pretty —————- 52 MB (1 row) |
How to get total table size in postgres?
Posgres provides a separate function, i.e., pg_total_relation_size()
, for calculating the total overall table size, including structures, indexes, and any other objects associated with the table in question.
Example Query – How to get overall table size in postgres | |
SELECT pg_total_relation_size(‘mytablename’); | |
Example Result – How to get overall table size in postgres | |
pg_total_relation_size ———————- 290455552 |
|
SELECT pg_size_pretty (pg_indexes_size(‘mytablename’)); | |
Example Pretty Result – How to get overall table size in postgres | |
pg_size_pretty —————- 277 MB (1 row) |
Why are tables larger on-disk than in memory?
Due to the mechanics of disk storage and formatting block devices, the actual size of a table, or database, or any file really, is always slightly smaller than the exact same data stored on a disk. This is because disks are divided up into thousands of small bite-size blocks of a fixed size called inodes. These inodes each consume a predetermined amount of data based on how the disk drive was originally formatted. This means that data in a file has a small gap at the end where it did not consume the entire inode. However, since we cannot divide the inode up further, that extra bit of space is lost, resulting in a slightly larger on-disk size versus in memory size of a particular data stream. This is a common truth with disk storage versus memory storage and is an important item to understand when dealing with topics revolving around the sizes of files on the disk compared to the exact same data in memory.
Conclusion
You have been armed with the know-how to query the individual components of table size in postgres databases by using pg_table_size()
, pg_total_relation_size()
, pg_indexes_size()
, pg_tablespace_size()
, and how to format those results into an easy to read human format using pg_size_pretty()
. These built-in tools make postgres stand out as the more user-friendly solution when compared to MySQL based SQL services.
Jason Potter is a Senior Linux Systems Administrator & Technical Writer with more than 20 years experience providing technical support to customers and has a passion for writing competent and thorough technical documentation at all skill levels.
Leave a Reply