Lossy audio file formats explained (data compressed formats)

Friday, May 10, 2013

Barry Gardner

Almost all audio files distributed online use lossy, data-compressed file formats such as MP3 (short for MPEG-1 Audio Layer III), AAC (Advanced Audio Coding, commonly found in .m4a files), .wma (Windows Media Audio) and the slightly lesser-known and oddly named Ogg Vorbis. These formats use complex algorithms to reduce file size, which gives faster upload and download times and allows more tracks to be stored on phones and iPods. Streaming services such as Spotify and SoundCloud, as well as online radio stations, also use compressed audio streams, which reduce the data rate of the music that is heard.

Whilst they are technically very advanced, none of these formats retains the same sonic detail as uncompressed files such as .wav or .aiff.

These data-compressed file formats work by applying a mathematical model of how human hearing works. The ear is nothing short of miraculous, yet it works on known principles. That allows it to be modeled fairly successfully in mathematical terms, and the model in turn allows a high resolution audio file to be data reduced.

An example of one of the perceptual aspects of the human ear is known as masking. The human ear is rather biased towards the perception of particular aspects of sound as opposed to the exact sound itself as created in a natural environment. This is a very human trait and we have evolved this physiological mechanism for survival purposes, prioritizing auditory cues that relate to staying alive.

Masking explained

A fairly easy example to understand is when two sounds occur at, or close to, the same point in time. Depending on the frequency and level relationships between them, one of the two sounds may be more audible than the other, and the quieter one may be partly or wholly hidden. Because such perceptions can be represented mathematically, an algorithm can be written to mimic this behavior and reduce the data accordingly.
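To make the idea concrete, here is a deliberately simplified sketch of how a masking decision might be expressed in code. It is not the model used by MP3, AAC or any other real codec. The Bark conversion is the widely quoted Zwicker and Terhardt approximation, but the slopes, the offset and the function names (`masking_threshold_db`, `is_masked`) are illustrative assumptions chosen only to show the shape of the calculation.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical-band rate in Bark (Zwicker and Terhardt formula)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


# Illustrative values only (assumptions, not taken from any real codec).
SLOPE_BELOW_DB_PER_BARK = 25.0  # threshold falls quickly below the masker frequency
SLOPE_ABOVE_DB_PER_BARK = 10.0  # and more gently above it
MASKER_OFFSET_DB = 10.0         # peak of the threshold sits this far under the masker


def masking_threshold_db(masker_hz: float, masker_db: float, probe_hz: float) -> float:
    """Level below which a second tone is treated as inaudible in this toy model."""
    distance = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = SLOPE_ABOVE_DB_PER_BARK if distance >= 0 else SLOPE_BELOW_DB_PER_BARK
    return masker_db - MASKER_OFFSET_DB - slope * abs(distance)


def is_masked(masker_hz: float, masker_db: float, probe_hz: float, probe_db: float) -> bool:
    """True if the quieter tone falls under the masker's threshold."""
    return probe_db < masking_threshold_db(masker_hz, masker_db, probe_hz)


# A loud 1 kHz tone at 80 dB and a quiet 40 dB tone, first nearby, then far away.
print(is_masked(1000.0, 80.0, 1100.0, 40.0))  # True: close in frequency, so hidden
print(is_masked(1000.0, 80.0, 4000.0, 40.0))  # False: far away, so still audible
```

In this toy model an encoder could spend few or no bits on the 1.1 kHz tone while keeping the 4 kHz one. Real encoders make decisions of this kind across many frequency bands and many short time windows.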

Occurrences that can be masked happen many times per second, which allows a significant size reduction with a relatively low level of quality loss. The masked information is discarded and a new file is created with less information.

Incidentally, if you convert an MP3 or other compressed/lossy file into a high-res .wav or .aiff file, you do not magically get the discarded information back. You are simply storing the data-reduced MP3 audio in a larger file format container. The new file will sound identical to the MP3, subject to variations in the decoding ability of the MP3 software used.
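The point about converting back to .wav can be shown with a toy example. The "encoder" below is only a stand-in: it throws away the lowest 8 bits of each 16 bit sample, which is far cruder than the perceptual coding a real MP3 encoder performs, and the names `lossy_encode` and `decode_to_16bit` are made up for this sketch. The principle is the same, though: once information has been discarded, a larger container cannot restore it.

```python
def lossy_encode(samples_16bit):
    """Toy stand-in for a lossy encoder: keep only the top 8 bits of each sample."""
    return [s >> 8 for s in samples_16bit]


def decode_to_16bit(coded):
    """Store the reduced data back in a 16 bit container (like MP3 to .wav)."""
    return [c << 8 for c in coded]


original = [12345, -20000, 513, 7, -1]      # a few example 16 bit sample values
decoded = decode_to_16bit(lossy_encode(original))

print(original)
print(decoded)

# The decoded samples occupy 16 bits each again, but the discarded detail is gone.
print(decoded == original)

# Encoding and decoding the result a second time changes nothing further in this
# toy model: the larger file holds exactly what the reduced file held.
print(decode_to_16bit(lossy_encode(decoded)) == decoded)
```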

Loss of fidelity in compressed file formats

When a 16 bit 44.1kHz .wav file (the same quality and specification as the audio on a commercial CD) is data compressed to, say, 128kbps (kilobits per second), you should be able to hear a loss of accuracy in both the stereo image (the width of a stereo piece of music) and the reproduction of the high frequencies. The treble often sounds a little cloudy, phasey and even a little swirly, and the stereo image sometimes sounds narrower. 128kbps is quite a large data reduction, and it is more common for files to be encoded at 320kbps, which tends to be an acceptable compromise between fidelity and file size. This allows people to enjoy music very close to how it was meant to sound while still fitting many songs on an MP3 player with limited storage memory.
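To put rough numbers on that data reduction, the short sketch below works out the data rate of CD-quality stereo audio and compares it with 128kbps and 320kbps. The function names and the 4 minute track length are assumptions made for the example, decimal units are used throughout (1 kbps = 1000 bits per second, 1 MB = 1,000,000 bytes), and container overhead is ignored.

```python
def pcm_bitrate_kbps(bit_depth=16, sample_rate_hz=44100, channels=2):
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return bit_depth * sample_rate_hz * channels / 1000


def file_size_mb(bitrate_kbps, duration_s):
    """Approximate audio payload size in megabytes for a given bitrate and length."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000


cd_kbps = pcm_bitrate_kbps()   # 16 bit, 44.1kHz, stereo
track_seconds = 4 * 60         # assumed example: a 4 minute track

print(f"CD quality: {cd_kbps:.1f} kbps, {file_size_mb(cd_kbps, track_seconds):.1f} MB")
for kbps in (128, 320):
    ratio = cd_kbps / kbps
    size = file_size_mb(kbps, track_seconds)
    print(f"{kbps} kbps: about {ratio:.1f}:1 reduction, {size:.1f} MB")
```

CD-quality stereo runs at 1411.2kbps, so 128kbps is roughly an 11:1 reduction and 320kbps roughly 4.4:1. For the assumed 4 minute track that is about 42.3 MB uncompressed, 3.8 MB at 128kbps and 9.6 MB at 320kbps.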

As a mastering engineer I have always mastered music in such a way that it still sounds good to the end listener after encoding to compressed data formats. There are certain technical tweaks that one can apply during mastering to ensure this remains the case. This has recently been formalized by at least one major player in digital music distribution, which has published set guidelines to ensure that its specific data reduction algorithm is able to work with minimal loss of fidelity.

These guidelines tend to produce good results with all audio codecs at 320kbps. Of course, the more a file is compressed (the lower the kbps value), the lower the quality; this stands to reason.

Summary

Compressed file formats look set to be with us for a while yet, so understanding what makes them sound as good as possible before people listen to them makes a lot of sense, both for those involved in music production and for listeners who care about sound quality.

Barry Gardner operates SafeandSound Mastering, a mastering studio that understands the process known as Mastered for iTunes.
