OSDI '20 - AGAMOTTO: How Persistent Is Your Persistent Memory Application?
Why It Matters
By automating detection of durability and performance bugs without code changes, Agamotto lowers the barrier to reliable persistent‑memory deployment, protecting data integrity and enabling faster, more efficient applications.
Key Takeaways
- Persistent memory needs explicit flush and fence instructions for durability.
- Missing persistence instructions cause correctness bugs; redundant ones cause performance bugs.
- Agamotto uses symbolic execution to detect bugs without source modifications.
- A study found that about 90% of bugs follow two application‑independent patterns.
- Agamotto uncovered 84 new bugs, far more than the three to four found by prior tools.
Summary
The presentation introduced Agamotto, a symbolic‑execution framework designed to automatically uncover persistency bugs in applications that use Intel’s emerging persistent‑memory (PM) technology. By mapping PM directly into a process’s address space, developers can avoid file‑system overhead. Because stores can linger in volatile CPU caches, however, developers must also issue explicit flush and fence instructions to guarantee durability.
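The flush-and-fence requirement can be illustrated with a toy model. The sketch below is not real PM code and is not taken from the talk: `ToyPM` and its methods are hypothetical names, and the model is deliberately pessimistic, treating a write as durable only after it has been both flushed and fenced (real hardware may happen to evict a cache line earlier, but a program cannot rely on that).

```python
class ToyPM:
    """Hypothetical, simplified model of stores reaching persistent memory."""

    def __init__(self):
        self.cache = {}      # stored, but still only in the volatile CPU cache
        self.in_flight = {}  # flushed, but not yet ordered by a fence
        self.durable = {}    # guaranteed to survive a crash

    def store(self, addr, value):
        self.cache[addr] = value

    def flush(self, addr):
        # Start writing the cache line back; completion is not yet guaranteed.
        if addr in self.cache:
            self.in_flight[addr] = self.cache.pop(addr)

    def fence(self):
        # All previously flushed lines are now guaranteed durable.
        self.durable.update(self.in_flight)
        self.in_flight.clear()

    def crash(self):
        # Pessimistic crash: anything not flushed AND fenced is lost.
        self.cache.clear()
        self.in_flight.clear()
        return dict(self.durable)


pm = ToyPM()
pm.store("balance", 100)
pm.flush("balance")
pm.fence()                # "balance" is now durable
pm.store("log", "done")   # never flushed or fenced
survivors = pm.crash()
```

In this model only `balance` survives the crash; the unflushed `log` entry is exactly the kind of missing-durability bug the talk is concerned with.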
A survey of 63 real‑world PM bugs revealed two dominant, application‑independent patterns: missing flush/fence sequences (≈80% of cases) and redundant durability calls (≈11%). Leveraging these patterns, Agamotto augments the KLEE symbolic executor with a three‑state PM model (clean, dirty, flushed) and injects bug oracles that fire on illegal state transitions, eliminating the need for hand‑written test suites or source‑code changes.
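A minimal sketch of how a three-state model with bug oracles might work is shown below. It is an illustration only, not Agamotto's implementation: the `PMTracker` class, its method names, and the exact transition rules (for example, treating a fence with no pending flush as redundant) are assumptions made for this example, and it tracks a single concrete execution rather than symbolic paths.

```python
from enum import Enum


class State(Enum):
    CLEAN = "clean"      # nothing outstanding for this cache line
    DIRTY = "dirty"      # stored to, not yet flushed
    FLUSHED = "flushed"  # flushed, not yet fenced


class PMTracker:
    """Hypothetical per-cache-line state tracker with simple bug oracles."""

    def __init__(self):
        self.state = {}  # cache-line id -> State
        self.bugs = []   # list of (description, cache-line id or None)

    def store(self, line):
        self.state[line] = State.DIRTY

    def flush(self, line):
        if self.state.get(line, State.CLEAN) is State.DIRTY:
            self.state[line] = State.FLUSHED
        else:
            # Oracle: flushing a line with no unflushed data is wasted work.
            self.bugs.append(("redundant flush", line))

    def fence(self):
        pending = [l for l, s in self.state.items() if s is State.FLUSHED]
        if not pending:
            # Oracle (assumed rule): a fence with nothing to order is redundant.
            self.bugs.append(("redundant fence", None))
        for l in pending:
            self.state[l] = State.CLEAN

    def finish(self):
        # Oracle: at the end of execution every line should be clean.
        for line, s in sorted(self.state.items()):
            if s is State.DIRTY:
                self.bugs.append(("missing flush", line))
            elif s is State.FLUSHED:
                self.bugs.append(("missing fence", line))
        return self.bugs


t = PMTracker()
t.store(0); t.flush(0); t.fence()   # correct sequence, no report
t.store(1)                          # never flushed
t.store(2); t.flush(2); t.flush(2)  # second flush is redundant; never fenced
bugs = t.finish()
```

Here the tracker reports a redundant flush on line 2, a missing flush on line 1, and a missing fence on line 2, covering both the correctness and the performance pattern described above.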
The tool demonstrated its efficacy by discovering 84 previously unknown bugs across a range of open‑source PM libraries, dwarfing the three to four bugs found by competing tools such as PMTest and XFDetector. Developers confirmed many of the findings, and a follow‑up study showed that fixing identified performance‑related bugs could boost application throughput by up to 47%.
Agamotto’s high‑coverage, low‑overhead approach promises to accelerate the adoption of persistent memory by providing developers with a turnkey debugging solution that catches both correctness and performance defects, reducing data‑loss risk and unlocking the technology’s low‑latency, non‑volatile benefits.