The Audio Format Problem Nobody Solved

There’s a moment in every project where someone sends you the wrong audio file.

It’s an MP3. You needed a WAV. Or it’s a FLAC and your editor doesn’t recognise it. Or it’s an M4A because someone exported from Voice Memos and didn’t think about it. Why would they? They recorded something. They sent it. The file has a name. It plays when you click it. Job done.

Except it isn’t. Because audio formats aren’t interchangeable, and the differences between them matter in ways that aren’t visible until something goes wrong.

Here’s the version of this explanation that actually sticks.

An uncompressed audio file like WAV or AIFF is the full recording. Every sample captured by the microphone is preserved. Nothing removed, nothing approximated. It’s the equivalent of a RAW photo file: everything the sensor saw, kept. A three-minute WAV at CD quality runs about 30 megabytes. That’s the cost of completeness.
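That 30 megabyte figure is easy to check by hand. The sketch below assumes "CD quality" means 44,100 samples per second, 16 bits per sample, and two channels; the function name is made up for illustration, and the small file header is ignored.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size of the raw audio payload of an uncompressed file (header ignored)."""
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate * bytes_per_sample * channels)


size = pcm_size_bytes(180)  # three minutes
print(f"{size:,} bytes = {size / 1_000_000:.1f} MB")
```

Every second costs the same number of bytes whether it holds an orchestra or silence, which is why uncompressed files grow so predictably.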

A lossless compressed file like FLAC or ALAC is the same audio, packed tighter. Think of it like zipping a folder: the contents are identical, they just take up less space. A FLAC of that same three-minute track might be 18 megabytes. Same quality. Smaller container. The trade-off is compatibility. Not every player and not every editor handles FLAC gracefully, even in 2026.
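The zipping analogy can be tried directly. This is a toy sketch, not how FLAC works: FLAC uses audio-specific prediction rather than general-purpose zip compression, and zlib stands in here only to show the round trip. The test tone is synthetic, so it shrinks far more than a real recording would.

```python
import math
import struct
import zlib

# One cycle of a sine wave, 100 samples long, as 16-bit signed integers.
cycle = [round(16_000 * math.sin(2 * math.pi * i / 100)) for i in range(100)]

# Repeat the cycle to get one second of mono audio at 44,100 samples per second.
pcm = struct.pack("<100h", *cycle) * 441

packed = zlib.compress(pcm)       # smaller container
restored = zlib.decompress(packed)  # unpack it again

print(len(pcm), "bytes before,", len(packed), "bytes after")
print("bit-for-bit identical:", restored == pcm)
```

The point is the last line: whatever the compressed size turns out to be, what comes back out is exactly what went in.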

A lossy compressed file like MP3, AAC, or Ogg Vorbis is smaller still. Around 3 megabytes for that same track at a typical 128kbps. But the size reduction comes from removing audio data that an algorithm decided you probably can’t hear. High frequencies above a certain threshold. Subtle harmonics. Spatial details in the stereo field. For casual listening, the difference is negligible. For editing, it’s not.
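With lossy formats the size comes from the bitrate, not from the recording. A rough sketch, assuming a constant bitrate and ignoring tags and headers; the function name and the three bitrates are illustrative choices, not figures from any particular encoder.

```python
def lossy_size_bytes(seconds, bitrate_kbps):
    """Approximate size of a constant-bitrate file (tags and headers ignored)."""
    return seconds * bitrate_kbps * 1000 // 8


for kbps in (128, 192, 320):
    size = lossy_size_bytes(180, kbps)
    print(f"{kbps} kbps -> {size / 1_000_000:.2f} MB")
```

A three-minute track at 128kbps lands just under 3 megabytes, roughly a tenth of the uncompressed size. Raising the bitrate buys back quality at the cost of size, but never all of it.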

This is where the old advice of “always use WAV” was correct but incomplete. The question was never really WAV versus MP3. The question was: what are you doing with the file?

If you’re editing video, use uncompressed. Your timeline is going to render, compress, and re-encode the audio during export. If you start with a lossy source, you’re compressing already-compressed audio. Each generation loses more. It’s photocopying a photocopy. By the time the final video renders, the audio has been degraded twice. Once by the original MP3 encoding, once by the video export codec. Start with WAV and you get one compression, on your terms, at the end.

If you’re distributing online, lossy is fine. Spotify’s apps stream Ogg Vorbis at up to 320kbps, and its web player uses AAC at up to 256kbps. YouTube re-encodes everything regardless of what you upload. Podcasts are delivered as MP3 because file size affects download speed and storage on phones. For delivery, lossy formats exist for good reason. They’re smaller, they’re universal, and at sensible bitrates the quality loss is inaudible to almost anyone who isn’t listening on studio monitors in a treated room.

If you’re archiving, use FLAC. It’s the insurance policy. Full quality, smaller than WAV, and it’s an open format. No licensing, no proprietary lock-in, no risk that the codec disappears in ten years. Your grandchildren’s computers will play FLAC.

If you’re looping, use WAV. This is a detail that almost nobody explains well. MP3 encoding adds a sliver of silence at the start and end of every file, typically somewhere between 10 and 50 milliseconds depending on the encoder. It’s a byproduct of how the format works: the encoder’s filters introduce a short delay at the front, and the audio is stored in fixed-size frames, so the last frame gets padded out with silence. For a song, you’ll never notice. For a loop that’s supposed to cycle seamlessly (background music on a website, an ambient track in a game, a beat on repeat) that gap creates an audible hiccup at the loop point. A tiny stutter every time the file restarts. WAV doesn’t have this problem because there’s no encoding artefact. The audio starts at sample one and ends at the last sample. Clean edges.
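Here’s a rough way to estimate that gap. An MP3 (MPEG-1 Layer III) stores audio in frames of 1,152 samples, so the end of the file is padded to a whole frame. The 576-sample encoder delay below is an assumption, a figure commonly quoted for the LAME encoder; other encoders differ, decoders add their own delay on top, and some players trim the padding using gapless metadata. Treat the result as a ballpark.

```python
import math

FRAME_SAMPLES = 1152  # samples per MPEG-1 Layer III frame


def mp3_padding(loop_samples, encoder_delay=576, sample_rate=44_100):
    """Estimate the silence an MP3 encoder adds around a loop.

    encoder_delay is an assumed, encoder-dependent value.
    Returns (leading samples, trailing samples, total gap in milliseconds).
    """
    total = loop_samples + encoder_delay
    frames = math.ceil(total / FRAME_SAMPLES)       # whole frames only
    trailing = frames * FRAME_SAMPLES - total       # padding in the last frame
    gap_ms = (encoder_delay + trailing) / sample_rate * 1000
    return encoder_delay, trailing, gap_ms


lead, trail, gap_ms = mp3_padding(2 * 44_100)  # a two-second loop
print(f"{lead} samples before, {trail} after, about {gap_ms:.1f} ms of silence")
```

Under these assumptions a two-second loop picks up a little under 40 milliseconds of silence, more than enough to hear as a stutter on every repeat.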

The reason this confusion persists, thirty years after MP3 was invented, is that audio formats are invisible. You can’t look at a file and see the difference. A WAV and an MP3 of the same song can carry the same title and the same album art, differ only in a file extension that most systems hide by default, and sound identical on laptop speakers. The differences only surface when you edit, loop, compress again, or listen carefully. By then, you’ve already made the wrong choice.
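The format is invisible to the eye, but not to a program. Whatever the file is called, its first few bytes carry a signature for the container. The sketch below checks a handful of well-known signatures; the function name is made up, the list is far from complete, and the MP3 checks are best guesses, since an ID3 tag or an MPEG sync pattern can occasionally turn up in front of other kinds of audio.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess an audio container from the first 12 bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF"
    if header[:4] == b"fLaC":
        return "FLAC"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[4:8] == b"ftyp":
        return "MP4/M4A"
    if header[:3] == b"ID3":
        return "probably MP3 (ID3 tag)"
    # 11 set bits in a row mark the start of an MPEG audio frame.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "probably MP3 (MPEG audio frame)"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file to identify it."""
    with open(path, "rb") as f:
        return sniff_audio_format(f.read(12))
```

Calling sniff_file on a recording that arrived as "final_mix.wav" tells you whether it really is one, or an M4A that somebody renamed.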

The fix isn’t education alone. It’s access to conversion. Most people don’t need to understand psychoacoustic modelling. They need to turn a file from one format into another, quickly, without installing software or uploading the file to a website that may or may not do something useful with it.

Convert locally. Keep the quality. Move on.