Big Data News and Headlines

Tomas Vondra: The Real Cost of Random I/O

Big Data • February 26, 2026 • Planet PostgreSQL (aggregator)

Why It Matters

An inaccurate random_page_cost skews PostgreSQL's query planner, degrading performance for a wide range of workloads on contemporary storage hardware.

Key Takeaways

  • Measured random_page_cost on SSDs is ≈30, not the default 4.
  • The planner picks the wrong plan at 0.2%–2.2% selectivity.
  • Bitmap scans reduce cost errors via prefetching.
  • Prefetching speeds sequential reads, but the cost model ignores it.
  • Adjusting random_page_cost helps when cache hits dominate.

Pulse Analysis

The default random_page_cost of 4.0 was chosen when spinning disks dominated I/O, but today’s flash‑based storage delivers dramatically different latency characteristics. PostgreSQL’s optimizer still treats random page reads as only four times more costly than sequential reads, a ratio that no longer reflects reality. As databases migrate to high‑throughput SSDs and cloud‑based block storage, revisiting this parameter becomes essential for accurate cost estimation and efficient plan selection.
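The ratio described above is exposed directly as two planner settings. A minimal way to inspect and override it (the tablespace name is hypothetical, and 30 is the article's measured SSD figure, not a universal recommendation):

```sql
-- PostgreSQL's cost model charges random_page_cost per random page
-- read and seq_page_cost per sequential page read; the stock ratio is 4:1.
SHOW seq_page_cost;      -- 1.0 by default
SHOW random_page_cost;   -- 4.0 by default

-- The setting can be changed globally (postgresql.conf), per session,
-- or per tablespace; hypothetical SSD-backed tablespace shown here:
ALTER TABLESPACE nvme_data SET (random_page_cost = 30);
```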

Vondra’s experiment isolates I/O costs by disabling caching effects and measuring raw read times for sequential versus index scans. The resulting per‑page timings translate to a random_page_cost near 30 on local SSDs—far above the default. This discrepancy causes the planner to favor index scans for modestly selective predicates, even when a sequential scan would execute faster. The gap between estimated cost and actual duration widens to an order of magnitude, exposing a systemic planning flaw that can affect any workload relying on range predicates or moderate selectivity.
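A rough way to observe the same mismatch on your own hardware is to build a table with a predicate in the problematic selectivity range and compare the planner's estimate against actual runtime. The table, sizes, and names below are illustrative, and a fair comparison needs cold caches between runs, as in Vondra's methodology:

```sql
-- Hypothetical test table; row count and column names are illustrative.
CREATE TABLE readings (id bigint, sensor int, payload text);
INSERT INTO readings
SELECT i, (random() * 100000)::int, repeat('x', 100)
FROM generate_series(1, 10000000) AS i;
CREATE INDEX ON readings (sensor);
VACUUM ANALYZE readings;

-- ~1% selectivity: compare estimated cost with actual time. With the
-- default random_page_cost, the chosen index scan's estimate can be
-- off by roughly an order of magnitude on a cold cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM readings WHERE sensor < 1000;
```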

Practically, DBAs can address the mismatch by raising random_page_cost to reflect measured I/O behavior, especially on systems where prefetching is limited or remote storage adds latency. Bitmap scans naturally bridge the cost gap by converting random accesses into more sequential patterns, leveraging PostgreSQL’s prefetch mechanisms. Nonetheless, the cost model still omits prefetching benefits for index scans, suggesting a future avenue for optimizer enhancements. Until such changes land, careful tuning of random_page_cost, combined with monitoring tools like pg_stat_statements, remains the most reliable strategy for maintaining query performance on modern storage stacks.
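One cautious way to apply that advice is to trial the measured value per session before touching postgresql.conf, then let pg_stat_statements show which workloads actually improved (the extension must be installed and preloaded; 30 is the article's measurement, and your hardware will differ):

```sql
-- Session-only change, so plans can be compared without affecting
-- other connections.
SET random_page_cost = 30;  -- article's measured SSD figure

-- Watch which statements' mean runtimes move after the change
-- (requires the pg_stat_statements extension).
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```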

Source: Tomas Vondra, “The real cost of random I/O”