Game Audio Without Libraries: Building a Sound Engine from Scratch
📅 June 15, 2026✍️ Tom Reeves🏷️ Technical⏱️ 5 min read
All Gerk Games use Web Audio API for sound effects. Not a library, not a framework — just raw oscillator nodes and gain envelopes. Here's what we learned building a full sound engine in 50 lines.
Why Web Audio API Beats Audio Files
Audio files (MP3, OGG, WAV) need to load, decode, and buffer. On a slow connection, this can take 2-5 seconds per sound. Web Audio API generates sounds procedurally — zero network overhead, zero file size, zero loading time. A game with 10 sound effects using procedural audio loads instantly.
Our sound engine uses a single function: an oscillator with a frequency, duration, waveform type (sine/square/sawtooth), and a gain envelope that starts at a volume and fades to zero. That's it. Everything from the Snake Arena eating sound to the Pixel Jumper jump sound is just different parameters to the same function.
Audio files typically use MP3 at roughly 1MB per minute of audio. A game with 10 sound effects at 2 seconds each would need roughly 350KB of audio files. On a 3G connection, that adds 2-3 seconds of loading time. Worse, the browser must decode each audio file before playing it, adding another 200-500ms of latency. Procedural audio using Web Audio API eliminates all of this. Our entire sound engine is about 600 bytes of JavaScript. It loads instantly and plays sounds with zero decoding latency.
The specific technique: an OscillatorNode generates a waveform (sine for musical tones, square for impact sounds, sawtooth for "danger" sounds). A GainNode controls the volume envelope (attack, sustain, decay). By varying these parameters, we can produce a surprising range of sounds. A sine wave at 440Hz with 50ms attack and 200ms decay sounds like a soft button press. A square wave at 200Hz with 5ms attack and 100ms decay sounds like a game over thud. A sawtooth wave at 800Hz with rapid frequency modulation sounds like a power-up activation. All of these come from the same 600-byte function, just called with different parameters.