DEV Community

Steven Hans

🎙️ I built a voice chat app because I wanted to talk to my son while we play online

It all started with something simple: my son and I wanted to communicate while playing online together. I tried the existing options (Discord, TeamSpeak, and others), but something always bothered me: ads, data collection, cluttered interfaces, or mediocre audio quality. I just wanted something secure, private, with good audio quality, and free. So I decided to build it myself.

The result is NoamVC: an open source, peer-to-peer voice chat application with end-to-end encryption.

What makes it technically different?
🔊 Direct audio, no middlemen
Voice travels directly between participants using native WebRTC: no server processes or stores audio. The server only coordinates the initial connection (signaling).
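
To make the split between signaling and audio concrete, here is a minimal in-memory sketch of what a signaling relay does. The message shape, the `SignalingRelay` class, and the peer names are hypothetical and not taken from NoamVC's code; the point is only that the relay forwards opaque SDP/ICE strings and never touches audio.

```typescript
// Hypothetical message shape; the real NoamVC signaling protocol is not shown in the post.
type SignalKind = "offer" | "answer" | "ice-candidate";

interface SignalMessage {
  kind: SignalKind;
  from: string;
  to: string;
  payload: string; // SDP or ICE candidate text, opaque to the relay
}

// In-memory stand-in for a signaling server: it only queues and forwards
// connection metadata between peers. No audio ever passes through it.
class SignalingRelay {
  private readonly inboxes = new Map<string, SignalMessage[]>();

  join(peerId: string): void {
    if (!this.inboxes.has(peerId)) this.inboxes.set(peerId, []);
  }

  // Returns true when the message was queued, false when the recipient is unknown.
  send(message: SignalMessage): boolean {
    const inbox = this.inboxes.get(message.to);
    if (inbox === undefined) return false;
    inbox.push(message);
    return true;
  }

  // Hands over and clears everything queued for a peer.
  drain(peerId: string): SignalMessage[] {
    const queued = this.inboxes.get(peerId);
    if (queued === undefined) return [];
    this.inboxes.set(peerId, []);
    return queued;
  }
}
```

Once both peers have exchanged an offer, an answer, and their ICE candidates this way, the media flows over the peer connection itself and the relay is out of the picture.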

🔒 Multi-layered security

  • E2E encryption with the Insertable Streams API + a 256-bit key derived via PBKDF2 (100K iterations, SHA-256)
  • Signaling signed with HMAC-SHA256, with the secret embedded in Rust and never exposed to JavaScript
  • Anti-replay protection with a 5-minute window
  • Storage in an encrypted IOTA Stronghold vault, with zero data in localStorage
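
As a rough illustration of the key derivation and signed-signaling ideas above, here is a sketch using Node's built-in `crypto` module. It is not NoamVC's implementation (the post says the HMAC layer lives in Rust): the salt, the `timestamp.body` signing format, and all function names are assumptions. Note that a timestamp window only rejects stale messages; rejecting duplicates inside the window would also need a nonce cache, which is omitted here.

```typescript
import { createHmac, pbkdf2Sync, timingSafeEqual } from "node:crypto";

const PBKDF2_ITERATIONS = 100_000;       // 100K iterations, as stated in the post
const KEY_LENGTH_BYTES = 32;             // 32 bytes = 256 bits
const REPLAY_WINDOW_MS = 5 * 60 * 1000;  // 5-minute window

// Derives a 256-bit key with PBKDF2-SHA256. The salt value is an assumption.
function deriveKey(secret: string, salt: string): Buffer {
  return pbkdf2Sync(secret, salt, PBKDF2_ITERATIONS, KEY_LENGTH_BYTES, "sha256");
}

// Hypothetical envelope for a signed signaling message.
interface SignedMessage {
  body: string;
  timestamp: number; // milliseconds since the Unix epoch
  signature: string; // hex-encoded HMAC-SHA256 over "timestamp.body"
}

function hmac(body: string, timestamp: number, hmacSecret: string): Buffer {
  return createHmac("sha256", hmacSecret).update(`${timestamp}.${body}`).digest();
}

function sign(body: string, hmacSecret: string, now: number): SignedMessage {
  return { body, timestamp: now, signature: hmac(body, now, hmacSecret).toString("hex") };
}

// Rejects messages outside the time window, then checks the signature
// with a constant-time comparison.
function verify(message: SignedMessage, hmacSecret: string, now: number): boolean {
  if (Math.abs(now - message.timestamp) > REPLAY_WINDOW_MS) return false;
  const expected = hmac(message.body, message.timestamp, hmacSecret);
  const received = Buffer.from(message.signature, "hex");
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Binding the timestamp into the signed string matters: without it, an attacker could keep the old signature and simply rewrite the timestamp to land inside the window.
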
🎧 Optimized Opus codec
32 kbps, mono, with Forward Error Correction (FEC), Discontinuous Transmission (DTX), and 20 ms frames. Minimum bandwidth consumption with maximum clarity.
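
One common way to request settings like these is to edit the Opus `a=fmtp` line in the SDP before it is applied. The sketch below assumes that approach; the post does not say how NoamVC configures the codec, and `tuneOpusFmtp` and `payloadBytesPerFrame` are hypothetical names. The parameter names themselves come from RFC 7587. Frame duration is negotiated separately and is not handled here.

```typescript
// Opus fmtp parameters (RFC 7587) mirroring the settings listed above.
const OPUS_PARAMS: Record<string, string> = {
  maxaveragebitrate: "32000", // 32 kbps
  stereo: "0",                // mono
  useinbandfec: "1",          // Forward Error Correction
  usedtx: "1",                // Discontinuous Transmission
};

// Merges OPUS_PARAMS into the a=fmtp line of the Opus payload type.
// If the SDP has no Opus rtpmap or no matching fmtp line, it is returned unchanged.
function tuneOpusFmtp(sdp: string): string {
  const rtpmap = /^a=rtpmap:(\d+) opus\/48000/im.exec(sdp);
  if (rtpmap === null) return sdp;
  const fmtpPrefix = `a=fmtp:${rtpmap[1]} `;
  return sdp
    .split("\r\n")
    .map((line) => {
      if (!line.startsWith(fmtpPrefix)) return line;
      const params = new Map<string, string>();
      for (const pair of line.slice(fmtpPrefix.length).split(";")) {
        const [name, value = ""] = pair.trim().split("=");
        if (name) params.set(name, value);
      }
      for (const [name, value] of Object.entries(OPUS_PARAMS)) params.set(name, value);
      return fmtpPrefix + Array.from(params, ([name, value]) => `${name}=${value}`).join(";");
    })
    .join("\r\n");
}

// Average Opus payload size per frame, ignoring RTP/UDP/IP headers.
function payloadBytesPerFrame(bitrateBps: number, frameMs: number): number {
  return (bitrateBps * frameMs) / 1000 / 8;
}
```

As a sanity check on the numbers: 32,000 bits/s over a 20 ms frame is 640 bits, so `payloadBytesPerFrame(32000, 20)` comes out to 80 bytes per frame, at 50 frames per second.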

🖥️ Native desktop app
Built with Tauri 2 (Rust) for macOS (Apple Silicon + Intel), Windows, and Linux. It's not a heavy Electron app; it's a real native app, lightweight and signed.

⚡ Modern stack
React 19 · TypeScript 5.9 · Vite 7 · Tailwind CSS 4 · NestJS 11 · Zustand · shadcn/ui

🎨 Polished design
Dark theme with reactive animated backgrounds that respond when someone speaks: SVG waves, equalizer bars, floating particles, and pulse rings. Real-time speech detection with FFT-256 analyzer. Latency indicator with live RTT.
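
The speech detection can be pictured as a simple level check over the analyzer's frequency bins. This is a sketch only: the averaging approach, the threshold value, and the function names are assumptions, not NoamVC's actual logic. The one fixed fact is that a Web Audio `AnalyserNode` with `fftSize = 256` exposes 128 frequency bins, each a byte from 0 to 255.

```typescript
const FFT_SIZE = 256;            // matches the FFT-256 analyzer mentioned above
const BIN_COUNT = FFT_SIZE / 2;  // an AnalyserNode exposes fftSize / 2 bins
const SPEAKING_THRESHOLD = 40;   // hypothetical level on the 0-255 byte scale

// Mean magnitude across all frequency bins.
function averageLevel(bins: Uint8Array): number {
  if (bins.length === 0) return 0;
  return bins.reduce((sum, value) => sum + value, 0) / bins.length;
}

// True when the average level reaches the threshold.
function isSpeaking(bins: Uint8Array, threshold: number = SPEAKING_THRESHOLD): boolean {
  return averageLevel(bins) >= threshold;
}
```

In a browser, the `bins` array would typically be a `Uint8Array` of length `BIN_COUNT` refilled with `analyser.getByteFrequencyData(bins)` on each animation frame, with the result of `isSpeaking` driving the waves, bars, and pulse rings.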

🌐 Bilingual
Interface in Spanish and English with automatic system language detection.
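
A minimal sketch of how the automatic detection could work, assuming the app reads the system's preferred language tags (for example `navigator.languages` in a webview) and falls back to English. The fallback choice and the `pickLocale` name are assumptions.

```typescript
type Locale = "es" | "en";

// Returns the first supported language among the system's preferences.
// Falling back to English when none match is an assumption.
function pickLocale(systemLanguages: readonly string[], fallback: Locale = "en"): Locale {
  for (const tag of systemLanguages) {
    const primary = tag.toLowerCase().split("-")[0];
    if (primary === "es" || primary === "en") return primary;
  }
  return fallback;
}
```

Matching on the primary subtag means regional variants such as `es-MX` or `en-GB` resolve to the right interface language without listing each one.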

This was definitely a project I really enjoyed building, and I learned a ton along the way. I hope you find it useful and that you're encouraged to leave your feedback so I can keep improving it.

🔗 Project: https://noamvc.web.app/

#OpenSource #WebRTC #Tauri #React #Rust #TypeScript #VoiceChat #P2P #Encryption #DevDad #BuildInPublic

Top comments (3)

chovy

Really solid crypto choices here. PBKDF2 with 100K iterations for the key derivation is the right call, and using Insertable Streams for E2EE on WebRTC is still underutilized; most P2P voice apps skip that layer entirely.

One thing I've been thinking about a lot: the classical crypto primitives (ECDH for key exchange, SHA-256, etc.) are going to need post-quantum upgrades eventually. Harvest-now-decrypt-later attacks mean that even voice signaling metadata could be valuable to store and crack later.

I've been building something in the same privacy-first space: qrypt.chat, a text chat focused on quantum-resistant encryption from day one. Different use case (text vs voice) but the same fundamental problem: making strong crypto invisible to users.

Your Tauri + Rust approach is interesting for this too. Rust's memory safety is a genuine security advantage over Electron for anything handling crypto. Nice work shipping this.

Steven Hans

Thanks for the detailed technical analysis! I'm glad the cryptography decisions resonate with someone who clearly understands the space.

On the post-quantum point: totally agree. The "harvest now, decrypt later" scenario is real and something I have on my radar. Today NoamVC uses ECDH + PBKDF2-SHA256 (100K iterations) + Insertable Streams for frame-by-frame E2EE, but the modular architecture allows the key derivation layer to be swapped out without touching the rest of the audio pipeline. When post-quantum primitives (ML-KEM/Kyber) are more mature and stabilized in WebCrypto, migrating the key derivation will be a surgical change.
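
To illustrate what a swappable key derivation layer could look like, here is a sketch using Node's `crypto` module. Everything in it is an assumption made for illustration: the `KeyDeriver` type, the `FrameCipher` class, the salt, the packet layout, and the use of AES-256-GCM for audio frames (the post only names that cipher for text chat). The ECDH step is left out.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

// The seam: anything that turns a room ID into a 32-byte key.
// A post-quantum KEM could later sit behind the same signature.
type KeyDeriver = (roomId: string) => Buffer;

// Today's derivation per the post: PBKDF2-SHA256, 100K iterations. The salt is hypothetical.
const pbkdf2Deriver: KeyDeriver = (roomId) =>
  pbkdf2Sync(roomId, "hypothetical-app-salt", 100_000, 32, "sha256");

// Encrypts each frame on its own; it never needs to know how the key was produced.
class FrameCipher {
  private readonly key: Buffer;

  constructor(roomId: string, derive: KeyDeriver) {
    this.key = derive(roomId);
  }

  // Assumed packet layout: 12-byte IV | 16-byte auth tag | ciphertext.
  encrypt(frame: Buffer): Buffer {
    const iv = randomBytes(12);
    const cipher = createCipheriv("aes-256-gcm", this.key, iv);
    const ciphertext = Buffer.concat([cipher.update(frame), cipher.final()]);
    return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
  }

  // Throws if the key is wrong or the packet was modified.
  decrypt(packet: Buffer): Buffer {
    const iv = packet.subarray(0, 12);
    const tag = packet.subarray(12, 28);
    const decipher = createDecipheriv("aes-256-gcm", this.key, iv);
    decipher.setAuthTag(tag);
    return Buffer.concat([decipher.update(packet.subarray(28)), decipher.final()]);
  }
}
```

Swapping the derivation then means passing a different `KeyDeriver` to the constructor, for example `new FrameCipher(roomId, somePostQuantumDeriver)`, while `encrypt` and `decrypt` stay as they are.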

Since the last post, several important improvements have been implemented:

  • Signaling server rewritten in Rust: migrated from NestJS to Axum + Socketioxide. Same protocol, but now with strict payload validation, sliding-window rate limiting (100 events/10 s), a 5-minute anti-replay window, and all backed by Rust's memory safety guarantees.
  • Room admission system: room creators now approve/reject each participant before they join. WebRTC is not negotiated until the host explicitly admits them.
  • Storage migrated to SQLite: replaced IOTA Stronghold with tauri-plugin-sql (SQLite). More stable, more lightweight, and eliminates a heavy dependency without sacrificing local storage security.
  • Firebase SDK removed from the client: bug reporting now uses the Firestore REST API directly (228 KB → 6 KB bundle), with the API key embedded in the Rust binary via env!() at compile time and never exposed to the frontend.
  • E2EE text chat: messages encrypted with AES-256-GCM via P2P DataChannels, using the same key derived from the room ID.
  • Full CI/CD: GitHub Actions generates signed and notarized builds for macOS (ARM + Intel) and Windows, with automatic in-app updates via Ed25519.
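
For readers unfamiliar with sliding-window rate limiting, here is the idea behind the 100 events per 10 seconds limit from the first bullet, sketched in TypeScript for consistency with the client stack. The real server is written in Rust, and the `SlidingWindowLimiter` class, its method names, and the per-client keying are assumptions.

```typescript
// Allows at most maxEvents per client within any trailing window of windowMs.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxEvents: number;
  private readonly windowMs: number;

  constructor(maxEvents: number = 100, windowMs: number = 10_000) {
    this.maxEvents = maxEvents;
    this.windowMs = windowMs;
  }

  // Records the event and returns true if the client is under the limit;
  // returns false (and records nothing) otherwise.
  allow(clientId: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(clientId) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.maxEvents;
    if (allowed) recent.push(now);
    this.hits.set(clientId, recent);
    return allowed;
  }
}
```

Unlike a fixed window that resets on a schedule, the window here trails each request, so a client cannot send 100 events at the end of one period and another 100 at the start of the next.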

I completely agree that Rust + Tauri is a real advantage over Electron for systems handling cryptography. Not just because of memory safety: the ability to embed secrets in the binary at compile time and keep the entire HMAC-SHA256 signing layer in Rust, inaccessible from JavaScript, is a level of isolation that Electron simply cannot offer.

I'll take a look at qrypt.chat; the quantum-first approach to text is interesting. Different problems, same philosophy: making robust cryptography invisible to the user.

Ankush Banyal

Love this. Building it for your son makes it even more special.

Technically very solid approach with pure P2P WebRTC and E2E encryption. Super clean privacy-first mindset.

If you ever decide to scale beyond small peer groups (larger rooms, recording, moderation, etc.), you might eventually need an SFU-based setup instead of pure mesh. That's where something like Ant Media Server can help handle low-latency real-time audio at scale.
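
A quick back-of-the-envelope sketch shows why mesh stops scaling. It assumes every participant sends one 32 kbps Opus stream to each other participant (the bitrate from the post) and counts payload only, ignoring packet headers and DTX savings. The function names are hypothetical.

```typescript
const OPUS_BITRATE_KBPS = 32; // per-stream bitrate stated in the post

// Total peer connections in a full mesh of n participants: n(n - 1) / 2.
function meshConnections(participants: number): number {
  return (participants * (participants - 1)) / 2;
}

// Upload per participant in a mesh: one stream to each of the other n - 1 peers.
function meshUplinkKbps(participants: number, bitrateKbps: number = OPUS_BITRATE_KBPS): number {
  return Math.max(participants - 1, 0) * bitrateKbps;
}

// Upload per participant with an SFU: a single stream to the server.
function sfuUplinkKbps(participants: number, bitrateKbps: number = OPUS_BITRATE_KBPS): number {
  return participants > 1 ? bitrateKbps : 0;
}
```

For a two-person room the difference is nil (one connection, 32 kbps up either way). At ten participants a mesh needs 45 connections and 288 kbps of upload per person, while an SFU keeps each client at 32 kbps.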

But honestly, for what you built, this is awesome work. Respect.