NDSS 2025 – Secure Data Analytics

Security Boulevard, Jan 24, 2026

Why It Matters

Laputa lets enterprises use shared Spark clusters without exposing sensitive data, addressing a critical barrier to cloud‑based analytics adoption. Its modest overhead and ease of integration make it a practical option for real‑world big‑data pipelines.

Key Takeaways

  • Laputa enforces policies at the Spark physical‑plan level.
  • Uses pattern matching to detect policy violations.
  • Isolates execution via confidential computing compartments.
  • Blocks attacks from malicious users and compromised cloud administrators.
  • Adds moderate overhead while preserving Spark usability.

Pulse Analysis

Cloud‑based Apache Spark has become the de facto platform for large‑scale data analytics, yet its open architecture leaves data owners vulnerable to policy breaches. Traditional security layers focus on the network perimeter or storage encryption, but they rarely inspect the execution plan that drives query processing. Without visibility into the physical plan, malicious actors (whether rogue data scientists or compromised cloud administrators) can craft queries that exfiltrate or corrupt sensitive information, which stalls broader adoption of shared analytics services.

Laputa tackles this gap by embedding a pattern‑matching engine directly into Spark’s optimizer. At the physical‑plan stage, the framework evaluates fine‑grained policies, such as column‑level access controls or usage quotas, and rejects any plan that violates them. It also uses confidential computing enclaves to compartmentalize the analytics pipeline, so that even a compromised host cannot tamper with code or data during execution. Integration is near‑transparent for developers: existing Spark jobs run with only minor configuration changes, preserving productivity while improving security posture.
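
To make the idea of plan‑level enforcement concrete, here is a minimal Python sketch of checking a column‑level policy against a tree of plan operators and rejecting the plan on a match. It is an illustration only: `PlanNode`, `ColumnPolicy`, `find_violations`, and `check_plan` are hypothetical names invented for this example, not Laputa's interface or Spark classes, and real physical plans carry far more structure than this toy tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class PlanNode:
    """Toy stand-in for one operator in a physical plan (hypothetical, not a Spark class)."""
    op: str                                            # e.g. "Scan", "Project", "Filter"
    table: Optional[str] = None                        # set only for Scan nodes
    columns: List[str] = field(default_factory=list)   # columns the operator reads or emits
    children: List["PlanNode"] = field(default_factory=list)


@dataclass
class ColumnPolicy:
    """Hypothetical column-level policy: these columns of `table` must not be read."""
    table: str
    denied: Set[str]


def find_violations(node: PlanNode, policy: ColumnPolicy) -> List[str]:
    """Walk the plan tree and collect denied columns read by any Scan of the protected table."""
    found = []
    if node.op == "Scan" and node.table == policy.table:
        found.extend(c for c in node.columns if c in policy.denied)
    for child in node.children:
        found.extend(find_violations(child, policy))
    return found


def check_plan(plan: PlanNode, policy: ColumnPolicy) -> PlanNode:
    """Raise if the plan matches the forbidden pattern; otherwise return it unchanged."""
    violations = find_violations(plan, policy)
    if violations:
        raise PermissionError(
            f"plan rejected: reads denied columns {sorted(set(violations))}"
        )
    return plan


# Assumed example data: a "patients" table whose "ssn" column is off limits.
policy = ColumnPolicy(table="patients", denied={"ssn"})

allowed_plan = PlanNode("Project", columns=["age"], children=[
    PlanNode("Scan", table="patients", columns=["age"]),
])

blocked_plan = PlanNode("Project", columns=["ssn"], children=[
    PlanNode("Filter", columns=["age"], children=[
        PlanNode("Scan", table="patients", columns=["age", "ssn"]),
    ]),
])
```

Calling `check_plan(allowed_plan, policy)` returns the plan, while `check_plan(blocked_plan, policy)` raises `PermissionError`. Checking the operator tree rather than the query text means the same rule applies however the query was written, which is the motivation the article gives for enforcing policies at the physical‑plan stage.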

Empirical results presented at NDSS demonstrate Laputa’s effectiveness across industry‑standard benchmarks like TPC‑H, diverse big‑data workloads, and real‑world machine‑learning models. The framework consistently blocked malicious query patterns and introduced only modest overhead, typically under 10% relative to vanilla Spark. For enterprises, this offers a viable path to secure multi‑tenant analytics, enabling data sharing across organizational boundaries without sacrificing compliance or performance. As confidential computing hardware matures, solutions like Laputa are poised to become foundational components of next‑generation data platforms.
