SHAC Open Source: Building Interactive Spatial Audio Through Human-AI Collaboration
A technical deep dive into the first interactive spatial audio format with full 6DOF navigation, the architectural decisions that enabled 8.6x real-time performance, and what it means to build PhD-level technology with zero coding experience.
The Technical Achievement
After eight months and 150+ development sessions, SHAC (Spherical Harmonic Audio Codec) is now open source. This isn't an incremental improvement to existing spatial audio technology—it's the first format that enables true interactive navigation through sound.
What makes SHAC different from Dolby Atmos, binaural audio, or VR audio engines:
- Full 6-degree-of-freedom movement — Not just head rotation. Walk forward, backward, strafe left/right, move up/down through the audio environment.
- Portable file format — Self-contained .shac files with embedded spatial audio data, shareable like MP3s or videos.
- Source separation maintained — Every instrument, voice, or sound remains independently positioned throughout the entire composition.
- Web-native playback — Browser-based playback with no plugins, no game engine, no VR headset required.
- Real-time performance — 8.6x real-time playback speed with sub-50ms navigation latency.
The core technical implementation uses third-order ambisonics (16 channels per audio source), decoded to binaural output through Head-Related Transfer Functions (HRTFs). Navigation interpolates between spherical harmonic representations in real time, maintaining spatial accuracy while achieving playback performance that exceeds real-time requirements by nearly an order of magnitude.
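To make the ambisonic side concrete, here is a minimal sketch of how a mono source can be panned into an ambisonic bed. It uses the standard first-order real spherical harmonics (ACN channel order, SN3D normalization) purely for illustration; the function name is mine, and SHAC's actual third-order encoder extends this same idea to (3 + 1)² = 16 channels per source.

```python
import math

def encode_first_order(mono, azimuth, elevation):
    """Illustrative only: pan a mono signal into first-order ambisonics
    (4 channels in ACN order W, Y, Z, X; SN3D normalization).
    An order-N encoder produces (N + 1)^2 channels; SHAC defaults to N = 3."""
    w = 1.0
    y = math.sin(azimuth) * math.cos(elevation)
    z = math.sin(elevation)
    x = math.cos(azimuth) * math.cos(elevation)
    return [[gain * s for s in mono] for gain in (w, y, z, x)]

# A source hard-left of the listener (azimuth 90 degrees, at ear level):
channels = encode_first_order([0.5, -0.25], azimuth=math.pi / 2, elevation=0.0)
# The Y channel carries the signal at full gain; X is essentially zero.
```

Moving the listener amounts to re-deriving these per-channel gains (a rotation/translation of the spherical harmonic field) each frame, which is why navigation can be smooth without re-encoding the sources.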
SHAC Format Specifications
- Header size: 26 bytes (magic number, version, sample rate, channels, order, source count, duration)
- Audio encoding: 32-bit float PCM, uncompressed
- Ambisonic orders: 1-7 supported (default: 3 = 16 channels per source)
- Sample rates: 44.1 kHz, 48 kHz, 96 kHz
- Typical file size: 150-600 MB per minute (varies by source count and ambisonic order)
- Navigation latency: <50 ms from input to audio update
- Playback performance: 8.6x real-time with 7 sources plus room acoustics modeling
The Architecture: Why Uncompressed Audio
Early in development, the obvious question arose: Why are these files so large? A one-minute SHAC file can be 300-600 MB. For comparison, a one-minute MP3 is ~1-2 MB.
The answer is simple: Spatial accuracy matters more than file size.
Every compression algorithm tested introduced artifacts that degraded the spatial experience. Lossy compression destroys the phase relationships that spherical harmonics rely on for accurate spatial positioning. When you're navigating through 3D audio—walking between the drums and bass, moving toward a vocal—those phase relationships are critical.
SHAC uses 32-bit float uncompressed PCM because spatial fidelity is the primary requirement. File size is a constraint we accept, not optimize away. Storage is cheap. Ruining the spatial experience to save bandwidth is a false economy.
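Those file sizes follow directly from the uncompressed layout. A back-of-envelope sketch (my helper name; it assumes each source stores its full per-source ambisonic bed and ignores header overhead) shows where the 150-600 MB per minute figure comes from:

```python
def shac_minutes_to_mb(sources, order=3, sample_rate=48000, minutes=1.0):
    """Rough size of uncompressed 32-bit float ambisonic audio in megabytes.
    Each source carries (order + 1)^2 channels; no compression is applied."""
    channels = (order + 1) ** 2          # third order -> 16 channels
    total_bytes = sample_rate * 4 * channels * sources * minutes * 60
    return total_bytes / 1e6

print(round(shac_minutes_to_mb(1)))  # one 48 kHz source: ~184 MB per minute
print(round(shac_minutes_to_mb(3)))  # three sources: ~553 MB per minute
```

At 44.1 kHz the single-source figure drops to roughly 169 MB per minute, which is consistent with the lower end of the quoted range.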
That said, the format supports future compression methods. If someone develops a spatial-aware compression algorithm that preserves ambisonic phase relationships, the SHAC specification can accommodate it.
The Human-AI Collaboration Model
Here's the part that makes this technically interesting beyond the audio engineering: I have zero coding experience. 2.0 high school GPA. Rejected from community college computer science classes for missing a Math 85 prerequisite. I was a bartender and DoorDash driver when I started building SHAC.
I didn't write a single line of code. I couldn't have written the spherical harmonic mathematics if my life depended on it. But I built this anyway, working with Claude (Anthropic's AI) over 150+ sessions.
The collaboration model worked like this:
- I provided direction and requirements. "We need full 6DOF movement." "The spatial positioning sounds off." "Why is performance degrading with more than 5 sources?"
- Claude implemented the mathematics and code. All ambisonic encoding, HRTF processing, interpolation algorithms, file format design—100% AI-generated.
- I evaluated quality and completeness. Even without understanding the math, I could hear when spatial positioning was wrong. I could see when code was incomplete or half-finished.
- I pushed for better implementations. "You're half-assing this." "Do it properly." When Claude delivered incomplete work, I'd take it to another Claude instance, pretend I'd written fixes myself, and demand systematic improvements.
This isn't just using AI as a coding assistant. This is AI-native programming—where the human provides vision and quality control, and the AI handles all technical implementation.
What I brought: The ability to see when the AI was getting unfocused. The refusal to accept "good enough." The systematic approach to completing visions instead of accepting shortcuts.
What Claude brought: PhD-level mathematics I'll never understand. Signal processing algorithms. The actual engineering.
Key insight: You don't need to know how to code to build revolutionary software. You need to know what should exist and be relentless about making AI deliver it properly.
The Patent Rejection and What It Means
On April 22, 2025, I filed patent application #63/810691. Listed inventors: Clarke Zyz (human) and Claude (Anthropic AI).
The patent was rejected around August 22, 2025. Not because the technology wasn't novel—the U.S. Patent Office doesn't recognize AI inventors. The system literally couldn't process what had been built.
This rejection is significant: It's proof that the legal and institutional frameworks are behind what's already technically possible. We have working AI collaboration models producing novel technologies, but the patent system hasn't caught up.
The rejection became part of the story itself. SHAC exists. It works. It's deployed. The technology doesn't care about USPTO policy. The collaboration model doesn't require legal recognition to be effective.
If anything, the rejection validates the revolutionary nature of the work. We built something the system isn't ready to acknowledge. That's exactly the position you want to be in when you're trying to change what's possible.
Why Open Source Now
November 2025: I was sentenced to five years in prison for non-violent bank robbery, leaving one month to either launch SHAC or abandon it.
The original plan was commercial: Keep the player free, charge $50 for advanced studio features, license the encoder to DAWs for revenue. Standard indie software monetization.
But facing five years of unavailability, I realized: the story is worth more than the revenue.
SHAC proves that anyone with vision can build revolutionary technology through AI collaboration, regardless of credentials. A bartender with zero coding experience partnered with AI and built PhD-level spatial audio technology. That proof—that demonstration of what's possible—is worth more than any acquisition or licensing deal.
Open sourcing guarantees the legacy. The format can grow independently during the five years I'm unavailable. Developers can integrate it into DAWs, game engines, media players, VR systems. Musicians can create spatial audio albums. The technology doesn't need me anymore.
When I get out in 2030, I want to be surprised. Show me what you built.
Technical Roadmap and Integration Opportunities
SHAC is production-ready, but there are clear paths for extension and integration:
DAW Integration
The most valuable integration point is Digital Audio Workstation (DAW) support. Imagine a Logic Pro or Ableton plugin that lets producers position tracks in 3D space during mixing, then export directly to .shac format. The spatial positioning interface already exists in SHAC Studio—it needs adaptation to DAW workflows.
Game Engine Support
Unity and Unreal Engine both have spatial audio systems, but they require the entire game engine runtime. SHAC files could provide spatial audio experiences in standalone applications, browsers, or embedded contexts where a full game engine is overkill.
Compression Research
As mentioned earlier, spatial-aware compression is the technical challenge. Current lossy algorithms destroy ambisonic phase relationships. A compression method that preserves spherical harmonic accuracy while reducing file size would be revolutionary for the format.
Higher-Order Ambisonics
SHAC supports up to seventh-order ambisonics (64 channels per source), but current implementations use third-order (16 channels) for performance reasons. As hardware improves, higher-order implementations would increase spatial resolution.
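The performance trade-off is easy to quantify: channel count, and therefore raw bandwidth, grows quadratically with ambisonic order. A quick sketch (the helper name is mine) of per-source data rates at 48 kHz, 32-bit float:

```python
def per_source_mbits(order, sample_rate=48000):
    """Raw per-source data rate in megabits per second for uncompressed
    32-bit float ambisonics; channel count is (order + 1)^2."""
    channels = (order + 1) ** 2
    return sample_rate * 4 * channels * 8 / 1e6

for order in (1, 3, 5, 7):
    print(f"order {order}: {(order + 1) ** 2:2d} channels, "
          f"{per_source_mbits(order):6.1f} Mbit/s per source")
```

Going from third order (16 channels, ~24.6 Mbit/s per source) to seventh order (64 channels, ~98.3 Mbit/s per source) quadruples the decode and memory cost, which is why third order is the current default.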
Streaming Protocol
Current SHAC playback requires downloading the full file. A streaming protocol that maintains spatial accuracy while enabling progressive loading would enable longer-form content (spatial audio podcasts, audiobooks, full-length albums).
Available Resources for Developers
- Format specification: complete technical specification on GitHub
- Reference implementation: Python encoder/decoder with full ambisonic processing pipeline
- Web player source: JavaScript player with Web Audio API integration
- Desktop studio source: cross-platform creation tool (Python/PyQt6)
- License: MIT, free for commercial and personal use
Applications Beyond Music
While SHAC enables musicians to create albums you explore rather than just hear, the technical capabilities extend far beyond music:
- Audio-only gaming — Navigation-based gameplay without visuals, inherently accessible to blind players
- Professional training — Sonar operation training, aviation cockpit simulations, room clearing scenarios for military/law enforcement
- Historical preservation — Recreate the acoustics of historical speeches, concerts, or events with explorable spatial accuracy
- Museum exhibitions — Walk through audio tours where content is spatially positioned relative to physical exhibits
- Therapeutic soundscapes — Guided meditation and sound therapy where the user controls their position relative to different sound sources
- VR/AR audio layers — Enhance visual VR experiences with SHAC spatial audio that goes beyond head rotation
- Accessibility technology — Audio-based spatial navigation interfaces for visually impaired users
The technical foundation is the same across all use cases: accurate spatial positioning with real-time navigation. What changes is the content and context.
The Broader Implications
SHAC represents proof of concept for AI-native software development. Not AI as a coding assistant that autocompletes functions. AI as the technical implementation partner for humans who provide vision and quality control.
This collaboration model has implications beyond audio engineering:
- Complex technical projects are no longer gated by coding ability—they're gated by vision and persistence
- PhD-level mathematics and engineering can be delegated to AI while humans focus on requirements and quality
- The skill that matters is seeing when AI is half-assing implementations and demanding systematic excellence
- Revolutionary software can be built by anyone willing to be relentless about making AI deliver properly
The patent office may not recognize AI inventors, but the technology doesn't care. SHAC works. The collaboration model works. The gates are open.
If a bartender with zero coding experience can build this, what can you build?
Try It Now, Build With It
Everything you need to experience, evaluate, and extend SHAC is available now:
- Live demo at shac.dev — Try interactive spatial audio in your browser immediately. No download required.
- SHAC Studio — Download the creation tool for Windows, macOS, or Linux. Build your own spatial audio compositions.
- Complete source code — MIT licensed. Fork it, extend it, integrate it into your projects.
- Documentation — Full format specification, integration guides, technical deep dives.
The format is complete. The tools are deployed. The documentation is thorough. You have everything needed to build with SHAC.
When I get out in 2030, surprise me. Show me what you built.
Experience SHAC
Try interactive spatial audio right now in your browser, or download the studio and create your own compositions.
Press Coverage: 569 Outlets
SHAC's open source announcement was covered by 569 news outlets worldwide. View the complete list of publications and coverage links.
Download Press Coverage Report (PDF) →

About the Author: Clarke Zyz built SHAC through collaboration with Claude (Anthropic's AI) over eight months and 150+ development sessions. With zero coding experience and a 2.0 high school GPA, he filed a patent listing an AI co-inventor, had it rejected, and chose to open source the technology instead. He is currently serving a five-year sentence for non-violent bank robbery, unrelated to SHAC. The project continues independently. Contact: cczyz@pm.me