PostgreSQL: Query Optimization for Mere Humans | by Eyal Trabelsi | Dec, 2024

We will use a simple query as an example: we want to count the number of customers that have a Twitter handle, that is, rows whose twitter value is not an empty string.

EXPLAIN ANALYZE
SELECT COUNT(*) FROM customers WHERE twitter != '';
The execution plan returned by EXPLAIN ANALYZE

It looks cryptic at first, and it’s even longer than our query. And this is a small example: real-world execution plans can be overwhelming if you don’t focus 😭.

But it does provide useful information. We can see that the query execution took 1.27 seconds, while the query planning took only 0.4 milliseconds (negligible time).
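As a rough illustration of that comparison, the sketch below reads the two timings from a plan in JSON form, as produced by EXPLAIN (ANALYZE, FORMAT JSON). The JSON string is a hypothetical, heavily trimmed stand-in for the real output, with the numbers rounded from the ones quoted above, and timing_summary is a helper name made up for this example.

```python
import json

# Hypothetical, trimmed stand-in for EXPLAIN (ANALYZE, FORMAT JSON) output.
# Both times are in milliseconds, rounded from the figures quoted in the text.
raw = (
    '[{"Plan": {"Node Type": "Aggregate"}, '
    '"Planning Time": 0.4, "Execution Time": 1270.0}]'
)


def timing_summary(explain_json: str) -> dict:
    """Return planning time, execution time and the planning share of the total."""
    top = json.loads(explain_json)[0]  # the JSON output is a one-element list
    planning = top["Planning Time"]
    execution = top["Execution Time"]
    return {
        "planning_ms": planning,
        "execution_ms": execution,
        "planning_share": planning / (planning + execution),
    }


summary = timing_summary(raw)
print(f"planning: {summary['planning_ms']} ms")
print(f"execution: {summary['execution_ms']} ms")
print(f"planning share of total: {summary['planning_share']:.4%}")
```

With numbers like these, planning is a tiny fraction of the total, so the execution side of the plan is where to look.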

We can see the time the query planning and execution took

The execution plan is structured as an inverse tree. In the next figure, you can see that the execution plan is divided into different nodes, each of which represents a different operation, whether it’s an Aggregation or a Scan.


There are many kinds of node operations: Scan related (‘Seq Scan’, ‘Index Only Scan’, etc.), Join related (‘Hash Join’, ‘Nested Loop’, etc.), Aggregation related (‘GroupAggregate’, ‘Aggregate’, etc.) and others (‘Limit’, ‘Sort’, ‘Materialize’, etc.). Fortunately, you don’t need to remember any of this.
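To make the tree structure concrete, here is a minimal sketch that walks a plan the way the JSON format of EXPLAIN lays it out, with each node’s children stored under its "Plans" key. The plan dictionary is a hypothetical two-node example (an Aggregate over a Seq Scan), not the actual plan from the figure, and walk is a helper name invented for this sketch.

```python
# Hypothetical plan shaped like EXPLAIN (FORMAT JSON) output:
# every node has a "Node Type", and its children live under "Plans".
plan = {
    "Node Type": "Aggregate",
    "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "customers"},
    ],
}


def walk(node, depth=0):
    """Yield (depth, node type) for a node and all its descendants, parent first."""
    yield depth, node["Node Type"]
    for child in node.get("Plans", []):  # leaf nodes have no "Plans" key
        yield from walk(child, depth + 1)


# Indent each node by its depth to show the parent/child relationship.
lines = ["  " * depth + node_type for depth, node_type in walk(plan)]
print("\n".join(lines))
```

Each level of indentation is one step down the tree: the parent node consumes the rows that its children produce.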

Pro Tip #3 💃: Focus is key; look only at the nodes that are problematic.

Pro Tip #4 💃: Cheat! For the problematic nodes, look up what they mean in the EXPLAIN glossary.

Now, let’s drill down into how we know which node is the problematic one.

There is a lot of information we can see on each node

Let’s drill down into what these metrics actually mean.

  • Actual Loops: the number of times the node was executed; for the Aggregate node it is 1. To get the total time and rows, the actual time and rows need to be multiplied by the loops value.
  • Actual Rows: the actual number of rows produced by the Aggregate node is 1 (a per-loop average, and here loops is 1).
  • Plan Rows: the estimated number of rows produced by the Aggregate node is 1. The estimate can be off, depending on the statistics.
  • Actual Startup Time: the time it took to return the first row, in milliseconds; for the Aggregate node it is 1271.157 (cumulative, so it includes the previous operations).
  • Startup Cost: arbitrary units that represent the estimated time to return the first row; for the Aggregate node it is 845110 (cumulative, so it includes the previous operations).
  • Actual Total Time: the time it took to return all the rows, in milliseconds; for the Aggregate node it is 1271.158 (a per-loop average with loops equal to 1, and cumulative, so it includes the previous operations).
  • Total Cost: arbitrary units that represent the estimated time to return all the rows; for the Aggregate node it is 845110 (cumulative).
  • Plan Width: the estimated average size of the rows produced by the Aggregate node is 8 bytes.
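The multiplication by loops can be sketched in a few lines. The first dictionary reuses the Aggregate node values quoted above, keyed as in the JSON format of EXPLAIN. The second node is hypothetical and only there to show a case where loops is greater than 1. The helper name totals is made up for this example.

```python
# Metrics of the Aggregate node quoted in the list above.
aggregate = {"Actual Rows": 1, "Actual Total Time": 1271.158, "Actual Loops": 1}

# Hypothetical node that was executed 3 times, for example the inner
# side of a Nested Loop. Its rows and time are per-loop averages.
looped = {"Actual Rows": 5, "Actual Total Time": 2.0, "Actual Loops": 3}


def totals(node):
    """Return (total rows, total time in ms) across all loops of a node."""
    loops = node["Actual Loops"]
    return node["Actual Rows"] * loops, node["Actual Total Time"] * loops


for name, node in [("aggregate", aggregate), ("looped", looped)]:
    rows, time_ms = totals(node)
    print(f"{name}: {rows} rows in {time_ms} ms over {node['Actual Loops']} loop(s)")
```

Because the per-loop figures are averages, the products should be read as approximations rather than exact totals.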

Pro Tip #5 💃: be wary of loops; remember to multiply by loops when you care about Actual Rows and Actual Total Time.
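Since the actual times are cumulative, one rough way to see how much time a node spent on its own work is to subtract its children’s time from its own, after multiplying each by loops. The sketch below is only an approximation under stated assumptions. The Seq Scan child and its 1100.0 ms are hypothetical, only the Aggregate numbers come from the text above, and the helper names are invented. It also ignores parallel plans, where multiplying by loops overstates wall-clock time.

```python
# Aggregate values are from the text; the Seq Scan child is hypothetical.
plan = {
    "Node Type": "Aggregate",
    "Actual Total Time": 1271.158,
    "Actual Loops": 1,
    "Plans": [
        {"Node Type": "Seq Scan", "Actual Total Time": 1100.0, "Actual Loops": 1},
    ],
}


def inclusive_ms(node):
    """Time of a node including its children, summed over all loops."""
    return node["Actual Total Time"] * node["Actual Loops"]


def self_times(node):
    """Yield (node type, time in ms spent in the node itself) for the whole tree."""
    children = node.get("Plans", [])
    own = inclusive_ms(node) - sum(inclusive_ms(child) for child in children)
    yield node["Node Type"], own
    for child in children:
        yield from self_times(child)


# Slowest node first: these are the candidates for a closer look.
ranked = sorted(self_times(plan), key=lambda pair: pair[1], reverse=True)
for node_type, ms in ranked:
    print(f"{node_type}: {ms:.3f} ms")
```

In this made-up plan the scan dominates, so it would be the node to focus on first.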

We will drill into a practical example in the next section.