Plan‑level visibility lets teams pinpoint inefficient execution paths and verify the impact of schema changes, directly improving PostgreSQL performance and reliability.
PostgreSQL administrators have long relied on pg_stat_statements to surface the most costly queries, but that view stops at the statement level. The newly released pg_stat_plans extension fills the gap by aggregating execution metrics for each distinct query plan, exposing plan text, call counts, and total execution time. This granularity is crucial for environments where the planner may switch between nested‑loop, hash, or merge joins, and where subtle plan changes can dramatically affect latency.
Integrating pg_stat_plans with pgwatch is straightforward. By adding a custom metric definition in a YAML file—specifying the extension’s initialization SQL and the complex CTE that joins pg_stat_plans to pg_stat_statements—pgwatch can poll the database at configurable intervals (e.g., every three minutes). The collected rows are written to a sink database, where they become available for downstream analysis. This extensible approach means any SQL‑accessible metric can be incorporated, turning pgwatch into a universal telemetry hub for PostgreSQL clusters.
Once the data resides in the sink, Grafana can render it through a dedicated dashboard panel. Users can filter by database, query ID, or time range to compare plan execution times, call frequencies, and resource consumption across plan versions. The visual feedback makes it easy to verify that new indexes or configuration tweaks trigger the intended plan, and to spot regressions before they impact production workloads. In short, the pg_stat_plans + pgwatch combo equips teams with actionable, plan‑level insights that drive performance tuning and operational confidence.
The PostgreSQL ecosystem just introduced a new pg_stat_plans extension. It's similar to pg_stat_statements but it tracks aggregated statistics for query plans instead of SQL statements.
It exposes a SQL interface to query those statistics via the pg_stat_plans view, and in this post, I will explore how easy it is to integrate such extensions with pgwatch for those who want to improve/customize the metrics gathered from a monitored database.
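To get a first feel for it, you can query the view directly. A minimal example (using the same columns the metric query below relies on):

-- top 5 plans by total execution time, straight from the view
SELECT queryid, planid, calls, total_exec_time, left(plan, 60) AS plan_start
FROM pg_stat_plans
ORDER BY total_exec_time DESC
LIMIT 5;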
Before moving ahead, please make sure that you have installed the extension and loaded it into your PostgreSQL server by following the installation guide in the extension repo.
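If you are setting it up from scratch, the steps usually boil down to something like the following sketch. Whether pg_stat_plans itself has to be preloaded depends on the version you install, so check the repo's README; pg_stat_statements needs it in any case:

# postgresql.conf (requires a server restart)
shared_preload_libraries = 'pg_stat_statements,pg_stat_plans'

-- then, in the database you want to monitor:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pg_stat_plans;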
I have written the query below to join pg_stat_statements with pg_stat_plans and retrieve the query plans for the most resource-intensive queries, so that they can be debugged:
WITH /* pgwatch_generated */ p_data AS (
    SELECT
        max(p.dbid)::int8 AS dbid,
        max(p.queryid)::int8 AS queryid,
        p.planid,
        sum(p.calls)::int8 AS calls,
        round(sum(p.total_exec_time)::numeric, 3)::double precision AS total_plan_exec_time,
        max(p.plan::varchar(15000)) AS plan
    FROM pg_stat_plans p
    WHERE
        dbid = (SELECT oid FROM pg_database WHERE datname = current_database())
    GROUP BY p.planid
), q_data AS (
    SELECT
        max(s.dbid) AS dbid,
        queryid,
        sum(s.calls)::int8 AS calls,
        round(sum(s.total_exec_time)::numeric, 3)::double precision AS total_exec_time,
        sum(shared_blks_read)::int8 AS shared_blks_read,
        sum(shared_blks_written)::int8 AS shared_blks_written
    FROM
        pg_stat_statements s
    WHERE
        calls > 5
        AND total_exec_time > 5
        AND dbid = (SELECT oid FROM pg_database WHERE datname = current_database())
        AND NOT upper(s.query::varchar(50))
            LIKE ANY (ARRAY[
                'DEALLOCATE%',
                'SET %',
                'RESET %',
                'BEGIN%',
                'BEGIN;',
                'COMMIT%',
                'END%',
                'ROLLBACK%',
                'SHOW%'
            ])
    GROUP BY
        queryid
)
SELECT
    (EXTRACT(epoch FROM now()) * 1e9)::int8 AS epoch_ns,
    queryid::int8,
    planid::int8,
    plan,
    calls::int8,
    total_plan_exec_time::int8
FROM
    (
        (
            SELECT
                p.*
            FROM p_data p
            JOIN q_data q ON q.queryid = p.queryid
            ORDER BY q.total_exec_time DESC
            LIMIT 100
        )
        UNION
        (
            SELECT
                p.*
            FROM p_data p
            JOIN q_data q ON q.queryid = p.queryid
            ORDER BY q.calls DESC
            LIMIT 100
        )
        UNION
        (
            SELECT
                p.*
            FROM p_data p
            JOIN q_data q ON q.queryid = p.queryid
            ORDER BY q.shared_blks_read DESC
            LIMIT 100
        )
        UNION
        (
            SELECT
                p.*
            FROM p_data p
            JOIN q_data q ON q.queryid = p.queryid
            ORDER BY q.shared_blks_written DESC
            LIMIT 100
        )
    ) x
You can run it locally, confirm that it works, and perhaps cross-check your monitoring dashboards to verify that the query IDs it returns plans for really are your database's most resource-intensive queries.
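For example, assuming you saved the query to a file called stat_plans.sql (a hypothetical name) and your database is called omdb, as in the source definition below:

$ psql -d omdb -f stat_plans.sql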
Next, all we need to do is define the above query as a metric in the pgwatch configuration, name it stat_plans (or whatever you prefer), specify a fetching interval for it, and start pgwatch. pgwatch will then periodically connect to the database, run this query, and store the results in the configured sinks, which we can connect to data-visualization tools like Grafana to inspect the results at various timestamps.
pgwatch supports two ways to store configurations and metric definitions. One is using yaml files, which is what we are going to do, and the other is using a PostgreSQL database (or any other database that supports the wire protocol). You can learn more about both options from the docs here and here.
We are going to follow the same structure as in the sample.metrics.yaml file available in the pgwatch repo to create the following metric configuration:
# metric.yaml
metrics:
  # other-metrics-in-your-setup
  stat_plans:
    description: >
      This metric collects statistics from the `pg_stat_plans` extension.
      It provides insights about the different plans for the most resource-intensive queries,
      including plan text, number of calls, and execution times.
      This metric is useful for monitoring and debugging query plans.
    init_sql: CREATE EXTENSION IF NOT EXISTS pg_stat_statements; CREATE EXTENSION IF NOT EXISTS pg_stat_plans;
    sqls:
      16: |-
        put-the-above-metric-query-here
presets:
  # you will probably want to add the new metric to a preset and use that
And do the same for sources, following sample.sources.yaml, to create this simple sources file:
# source.yaml
- name: omdb
  conn_str: postgresql://postgres@localhost/omdb
  kind: postgres
  custom_metrics:
    stat_plans: 180
  is_enabled: true
And finally run pgwatch to see what happens:
$ pgwatch --sources=source.yaml --metrics=metric.yaml --sink="postgresql://postgres@localhost/pgwatch_metrics"
2026-01-28 16:47:51.706 INFO sink:postgres db:pgwatch_metrics initialising measurements database...
2026-01-28 16:47:52.771 INFO sink:postgres db:pgwatch_metrics measurements sink is activated
2026-01-28 16:47:52.772 INFO sources:1 sources refreshed
2026-01-28 16:47:54.164 INFO source:omdb metric:stat_plans interval:180 starting gatherer
2026-01-28 16:47:55.225 INFO source:omdb metric:stat_plans rows:43 cache:false measurements fetched
2026-01-28 16:47:57.049 INFO sink:postgres db:pgwatch_metrics rows:43 elapsed:856.597634ms measurements written
Voilà: we got measurements fetched and measurements written with no errors, and connecting to the sink database we can see data in the stat_plans table in pgwatch_metrics:
pgwatch_metrics=# select * from stat_plans limit 1 offset 1;
time | dbname | data
-------------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
2026-01-28 14:47:54.564881+00 | omdb | {"plan": "Function Scan on pg_get_replication_slots l", "calls": 241, "planid": -4170850111752102628, "queryid": 4160689505986365494, "total_plan_exec_time": 4}
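If you prefer to poke at the raw measurements before wiring up Grafana, a quick ad-hoc query against the sink table could look like this (it simply unpacks the jsonb data column shown above):

select time,
       data->>'queryid' as queryid,
       data->>'planid' as planid,
       (data->>'calls')::int8 as calls,
       (data->>'total_plan_exec_time')::float8 as total_plan_exec_time
from stat_plans
where dbname = 'omdb'
order by time desc
limit 10;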
Now we need to create a Grafana dashboard that queries and visualizes this gathered data. I chose to extend pgwatch's Single query details dashboard and add a new table panel that displays the different plans and their statistics for the single query we are currently investigating.
I won't go into much detail about the steps to create the panel, as this can be learned from other online sources or by asking an LLM. You can find the updated dashboard here. Inside the new panel I use the following query: it reads the latest stat_plans measurements from the sink db for a specific query, restricted to the time interval the user selects in Grafana, and since the collected counters are cumulative, it uses lag() to turn them into per-interval deltas before aggregating per plan:
select
    total_plan_exec_time,
    calls,
    (total_plan_exec_time / calls) as avg_plan_exec_time,
    planid,
    plan
from
    (
        select
            sum(tt - tt_lag) as total_plan_exec_time,
            sum(calls - calls_lag) as calls,
            planid,
            case
                when length(plan) >= 15000 then plan::varchar(15000) || '...'
                else plan
            end as plan
        from
            (
                select
                    data ->> 'queryid' as queryid,
                    data ->> 'planid' as planid,
                    data ->> 'plan' as plan,
                    (data ->> 'total_plan_exec_time')::float8 as tt,
                    lag((data ->> 'total_plan_exec_time')::float8) over w as tt_lag,
                    (data ->> 'calls')::int8 as calls,
                    lag((data ->> 'calls')::int8) over w as calls_lag,
                    time
                from
                    stat_plans
                where
                    dbname = '$dbname'
                    and $__timeFilter(time)
                    and (data ->> 'queryid')::int8 = $queryid
                window w as (
                    partition by data ->> 'planid'
                    order by time
                )
            ) y
        group by
            planid,
            plan
        having
            sum(tt - tt_lag) > 0
            and sum(calls - calls_lag) > 0
        order by
            1 desc,
            2 desc nulls last
    ) z
order by
    1 desc,
    2 desc nulls last
Then, after configuring Grafana to connect to my sink db, adding this dashboard to my local setup's dashboards, and restarting, we get the final result:



You can see that there were 2 different plans chosen by the planner for the same query. Investigating both, you can find that the planner used a nested loop join in one and a hash join in the other.
Now you can debug the query plans of your database's most resource-intensive queries, without having to ask the dev team for the query text 🙂 ... compare the different plans considered by the planner, verify that new plans get used after creating an index, and so on.
pgwatch is a very extensible solution: you can integrate any metric into it, as long as that metric can be queried via SQL.
If you want help with your pgwatch setup, or have questions, feature requests, or issues, feel free to open issues and/or discussions in the official pgwatch repo.