audio – Grey Panthers Savannah

Getting the most out of your audio recording with Audacity

gpanther — Sat, 08 Oct 2011 10:45:00 +0000

This article aims to show you some simple techniques to improve the quality of your voice recording quickly and cheaply (for free actually). But first things first:

The best audio is the one you don’t have to improve. Some simple steps you can perform in advance to maximize quality:

Use quality equipment. Here are some articles about the equipment great-sounding podcasters use. You don’t have to spend a lot of money, but definitely stay away from the built-in laptop microphone
Eliminate ambient noise as much as possible (close windows, draw the blinds, stop other electronic equipment in the room, etc)
Record each person on a separate channel – if possible on a computer local to them (avoid recording through Skype, GoToMeeting or other VoIP solutions)
Try keeping the recording volume for each microphone at the optimal level – not too low, but also avoiding clipping

After you have the audio recording there is still a lot you can do, but it is preferable to start out with the best source material. For the example below I’ll be using the raw recordings from a recent SE Radio podcast:

The situation with this recording is as follows:

There are separate audio tracks for the interviewer and interviewee (good)
There is background noise on the tracks (easily correctable)
Both persons were picked up by both microphones (correctable)
The interviewer has some clipping (partially correctable – luckily it’s not the interviewee who has clipping)
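If you prefer to check for clipping programmatically rather than by eye, here is a minimal Python sketch. It assumes the track is already available as a list of 16-bit integer samples; the function name and the min_run parameter are my own illustration, not something Audacity provides. It reports runs of consecutive samples that sit at full scale, which is the typical signature of clipping.

```python
def find_clipped_runs(samples, full_scale=32767, min_run=3):
    """Return (start, end) index pairs of runs where |sample| stays at full scale.

    `end` is exclusive. Runs shorter than `min_run` are ignored, since a
    single full-scale sample is not necessarily clipping.
    """
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) >= full_scale:
            if start is None:
                start = i  # a new full-scale run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Handle a run that lasts until the very end of the track.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples)))
    return runs


track = [0, 12000, 32767, 32767, 32767, 32767, 9000, -32768, -32768, -32768, 0]
print(find_clipped_runs(track))  # [(2, 6), (7, 10)]
```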

The steps to improve the quality of this recording are as follows:

First, install the Noise Gate plugin for Audacity, since it requires a program restart (under Windows you have to copy the downloaded noisegate.ny to C:\Program Files (x86)\Audacity 1.3 Beta (Unicode)\Plug-Ins or to a similar location; under Linux you have to place it in /usr/share/audacity). After copying the file you have to close and restart Audacity. To verify that the plugin was properly installed, check the Effect menu – you should see an entry titled “Noise gate”.

Now that we have Audacity all set up and the plugin installed, first split the stereo track into mono tracks, since they don’t actually represent left-right channels but rather two speakers which will be mixed together at the end. For this click on the arrow after the filename in the track and select “Split Stereo to Mono”. Sidenote: some people will prefer to mix different speakers in podcasts with different panning (that is to the left or to the right). I would advise against this: it is distracting if you are doing something else while listening to the podcast (like walking / jogging / riding a bike / etc). It can also backfire if for some reason the listening device is missing one of the channels (the “damaged headphone” scenario).

The first thing will be to remove the constant background noise (like AC hum for example). To do this zoom in (Ctrl + 1) and look for low volume zones. Select those zones and go to Effect –> Noise Removal –> Get Noise Profile. Now select a zone where the noise is mixed with speech and test out the settings (Effect –> Noise Removal –> Ok). After the test you can use Undo (Ctrl + Z) to roll back the changes. You should check that the noise is removed but also that the natural sound of the voice is preserved (overly aggressive noise removal can lead to a “robot voice” effect). If you are satisfied, you can go ahead and apply it to the entire track. Also, since the noise source might change during the recording, you should at least do a quick scroll to check for other low-volume zones which can be a sign of noise. If you find noise from other sources, you can use the same steps to remove it.
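The "look for low volume zones" step can also be sketched in code. The following Python snippet is only an illustration of the idea (it is not how Audacity builds its noise profile): it cuts a list of samples into fixed-size windows, computes the RMS level of each and returns the start positions of the quietest ones, which are good candidates for a noise profile. The function names and the window size are assumptions made for the example.

```python
import math


def rms(window):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))


def quietest_windows(samples, window_size, count=3):
    """Return the start indices of the `count` quietest non-overlapping windows."""
    levels = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        levels.append((rms(samples[start:start + window_size]), start))
    levels.sort()  # lowest RMS first
    return [start for _, start in levels[:count]]


samples = ([1000, -1000, 900, -900] + [10, -10, 12, -8]
           + [5000, -5000, 4000, -4000] + [5, -5, 4, -6])
print(quietest_windows(samples, window_size=4, count=2))  # [12, 4]
```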

Now that you have removed the noise, the next step would be to remove the voices from the channels they don’t belong to. This is where we’ll be using the Noise Gate plugin: since there is a considerable level difference between the wanted audio and the unwanted audio on each channel, we can just declare everything below a certain volume “noise” and use the plugin to silence it. A couple of tips:

This needs to be done separately for each channel, since the cutoff volume will be different
You can use the “Analyse Noise Level” function of the plugin to gauge the approximate level of the cutoff volume – this will only give you an estimate and you will have to play around with the settings a little bit to find the optimal volume
Use a “Level reduction” of –100 dB to completely filter out the sound and an “Attack/Decay” of 1000 milliseconds to avoid false positives
As with all the steps, you can experiment on a smaller portion of the audio file (since it is much quicker) to fine tune the settings by repeatedly applying the effect with different parameters and undoing (Ctrl+Z) the result after evaluation. When the parameters seem right, just select the entire track and press Ctrl+R (Repeat last effect)
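To make the idea behind the gate more concrete, here is a deliberately crude Python sketch. It is not the algorithm of the Noise Gate plugin: a real gate fades in and out smoothly, while this one simply keeps a window open if it or one of its neighbours is above the threshold (a rough stand-in for the attack/decay setting). All names and default values are assumptions for illustration.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def noise_gate(samples, threshold_db, reduction_db=-100.0,
               window_size=4, hold_windows=1, full_scale=32767):
    """Attenuate windows whose peak stays below the threshold.

    threshold_db is relative to full scale (for example -20 means 10% of
    full scale). A window is left untouched if it, or any window within
    `hold_windows` of it, reaches the threshold.
    """
    threshold = full_scale * db_to_gain(threshold_db)
    gain = db_to_gain(reduction_db)
    starts = range(0, len(samples), window_size)
    loud = [max(abs(s) for s in samples[i:i + window_size]) >= threshold
            for i in starts]
    out = []
    for w, i in enumerate(starts):
        lo = max(0, w - hold_windows)
        hi = min(len(loud), w + hold_windows + 1)
        factor = 1.0 if any(loud[lo:hi]) else gain
        out.extend(s * factor for s in samples[i:i + window_size])
    return out


speech_then_bleed = [20000, -20000, 100, -100, 50, -50, 80, -80]
gated = noise_gate(speech_then_bleed, threshold_db=-20, window_size=2)
```

In this example the first window is loud, the second is kept because it is adjacent to a loud one, and the last two are reduced by 100 dB (a factor of 0.00001), which is effectively silence.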

After we’ve finished with both tracks, we have a better situation:

Now we will fix the clipping as much as possible (a perfect fix isn’t possible since clipping means that information got lost and all the plugins can do is to “guess” what the information might have looked like). First we reduce the amplification of the second track (the one which contains the clipping) by 10 dB as the Clip Fix plugin suggests (Effect –> Amplify –> –10 dB) after which we use the Clip Fix plugin. Unfortunately this plugin runs very slowly if we were to apply it to the entire track at once. Fortunately we have a reasonable workaround: select portions of the track and apply the plugin to them individually. After the first application you can use the “Repeat last effect” shortcut (Ctrl+R) to speed up the operation. Sidenote: it is a good habit to use the “Find Zero Crossing” function whenever you make a selection (the shortcut is Z – so whenever you select a portion, just press Z afterwards). This eliminates some weird artifacts when cutting / pasting / silencing part of the audio and it might even help when applying different effects. The fixed audio looks like this:

Now that all the cleanup steps have been performed, there is one last step which is as important as the cleanup: maximizing the audible volume without introducing clipping. This is very important because all devices can reduce volume but few of them can increase it (some exceptions being: the Linux audio stack and VLC). The easiest way to do this is by using the Levelator (note: while the Levelator is free – as in beer – and does not restrict what you can do with the output, it is not free as in freedom if this is a consideration for you).
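For reference, the simplest form of "as loud as possible without clipping" is plain peak normalization, sketched below in Python. Note that this is much less than what the Levelator does (it also evens out the level differences between speakers and segments), so take it only as an illustration of the goal. The function name and the headroom parameter are assumptions of mine.

```python
def peak_normalize(samples, full_scale=32767, headroom_db=-1.0):
    """Scale samples so that the loudest one sits `headroom_db` below full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    target = full_scale * 10 ** (headroom_db / 20)
    gain = target / peak
    return [round(s * gain) for s in samples]


quiet = [100, -200, 50]
loud = peak_normalize(quiet, headroom_db=0.0)
print(max(abs(s) for s in loud))  # 32767
```

A single loud spike in the recording limits how much gain this approach can apply, which is why a tool that adjusts the level over time gives better results on real interviews.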

To do this, export the audio to WAV (make sure that all tracks are unmuted during export) and run the Levelator on it. The end result will look like the following:

Of course the Levelator isn’t magic pixie dust either, so here are a couple of things to check after it has been run:

Did it amplify some residual noise which wasn’t audible in the initial audio? (if so, you should remove it using the Noise Removal plugin)
Did it miss segments? (it is rare, but it happens – those segments need to be amplified manually)
It results in “weird” sounding audio if the recording has been preprocessed by a dynamic compressor – for example GoToMeeting has an option to improve sound quality which uses dynamic compression and thus makes the recording unsuitable for use with the Levelator

That’s it for this rather long article. Don’t be discouraged by the length of the article: after going over the steps a couple of times, it shouldn’t take longer than 15 minutes to process a 2 hour interview (excluding the cutting / pasting / moving parts around) and you will gain listeners because of the higher production value.

A final note on the output formats: while during processing you should always use lossless formats, the final output format I recommend is: MP3 at 64 kbps CBR, Joint Stereo, 22050 Hz sampling rate. I found that this is the best balance between quality, file size and compatibility with the most playback devices out there.
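To get a feeling for what 64 kbps CBR means in terms of file size, here is a small Python calculation. It ignores ID3 tags and other container overhead, and the function name is made up for this example.

```python
def cbr_size_megabytes(bitrate_kbps, duration_seconds):
    """Approximate size of a constant-bitrate stream in megabytes (10^6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000


# A two hour interview at 64 kbps:
print(cbr_size_megabytes(64, 2 * 60 * 60))  # 57.6
```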

Audio quality redux

gpanther — Tue, 20 Sep 2011 12:20:00 +0000

Yet another example of how simple steps can improve the audio quality considerably. The clip below is taken from this blogpost (which I originally found through Hacker News). You can find the processed version here, or use the controls below to do a quick A/B comparison of the two. The processing was very simple (1. noise removal and 2. running through the Levelator) and quick.

PS: For people reading the post through an RSS reader: you probably need to click through to the site to see the comparison in action, since most (all?) RSS readers filter out JavaScript for security reasons.

PS: If you are interested in the simple script which was used to interact with the two YouTube players, you can find it in my code repository.

Power Line Hum Removal With Audacity

gpanther — Sun, 18 Sep 2011 13:45:00 +0000

As a response to George Starcher’s Removing Power Line Hum from Audio with GarageBand I would like to post a quick tutorial on how to do the same with Audacity:

Audio quality

gpanther — Sat, 05 Mar 2011 07:39:00 +0000

This is just one of those topics which comes up from time to time in my life (probably because I consume a lot of media). I was recently watching the video “Jim Zemlin interviewed by Jeremy Allison” (Jim Zemlin is the Executive Director of the Linux Foundation) on the Google Open Source YouTube channel and was frustrated by the background noise and low audio volume, since the topic was really interesting to me. So I decided to look into the problem and see if the audio quality could have been easily improved. I covered the topic a couple of years ago, so I won’t go into details, rather just give a 10 000 foot view of the process. Please read the original post for more details, since everything in it still applies.

Step 1: download the YouTube video. VLC natively supports YouTube playback, so exporting the sound to a FLAC file (you should always use lossless codecs during processing!) was just a matter of a couple of clicks and one or two minutes.

Step 2: load up in Audacity and remove the noise. The loading of the FLAC file is a little buggy (the progress bar keeps jumping between 0 and 100% and the time estimation is useless, but it loaded in under a minute). As you can see in the screenshot below, the volume is really low, but there are occasional spikes, so plain normalization wouldn’t help you here. On the upside, there is no clipping, which would result in hard (impossible?) to repair artifacts.

After noise removal and keeping only one channel (no need for stereo here – we would add it back in the last step if we were to publish it, since some devices can’t handle mono and the overhead with joint stereo is almost zero) the file was exported into WAV and fed into the Levelator. Here is the end result:

As you can see, we have much better volume resulting in a much improved experience for the consumer, all this with a couple of minutes of work while browsing Hacker News and with free (and mostly open-source) cross platform tools.

Content publishers of the world: please take a couple of minutes of your time after editing to do a proper post-production! Thank you.

Update: YouTube downloading is broken in the current VLC release but it will be fixed in the next version (1.1.12). Until then you can use the nightly builds.

Basic multi-media (post)processing

gpanther — Fri, 31 Jul 2009 13:21:00 +0000

Having the best audio/video quality available when you publish media is very important. I’ve heard a theory (which sounds logical – although that doesn’t automatically mean that it is true :-)) that if the voice quality is poor, you get more tired of listening to it, since your brain works harder during the interpretation phase (because it has to fill in the blanks).

Of course if you have a good source material, it goes a long way. But even if you don’t, you can do some post-processing to make the quality less-bad. In the following example I will use the recording for the talk Beer Hacking – Real World Examples by Scott Milliken and Erin Shelton as it was published by Irongeek. I don’t want to harp on Irongeek, since he has a great site with lots of useful resources, but this video was unwatchable. What are the problems which we’ll try to fix:

very low audio volume
background noise
interlacing artifacts

The tools we will be using are cross-platform (meaning Windows, Linux and MacOS X), even though the screenshots are from a Windows machine:

Avidemux (which I gave a short review some time ago). On Windows you can use VirtualDub, which is another similar tool (also free and open-source), but it is Windows specific. If you want to go with VirtualDub, you have to have the correct video codecs installed (I would recommend ffdshow)
Audacity for audio editing
The Levelator from The Conversation Network

First, we load up the video in Avidemux. The first thing we observe is that it has interlace artifacts (because it was probably ripped from a DVD which mainly targets interlaced displays – i.e. TV sets – as opposed to non-interlaced displays – i.e. LCD screens). These are quite easy to fix, so we will come back to them later.

The first thing we do is to export the audio, so that we can work on it separately. Go to Audio –> Encoder and select WAVE PCM. Then go to Audio –> Save and save it somewhere (this is uncompressed audio, so expect it to take up some disk space – ~900 MB in this case). Now we have to do two things: remove (or reduce) the noise and raise the volume. Usually I would recommend doing this in the order I just said (noise removal first and then raising the average volume) so that the second process doesn’t also amplify the noise, but in this case (because the initial volume is so low), we will have to do noise removal twice. Below you can see the waveform of the AVI file (notice the low level of the volume):

Sidenote: notice that the sample rate of the sound file is 48000 Hz (samples / sec). This again is most probably an artifact of the fact that it was ripped from a DVD. I usually find 44100 Hz stereo with 16 bits good enough and I am of the opinion that anything above that is not really noticeable. However we won’t change the sample rate in this case to avoid possible issues (like desynchronizing the audio and video).
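The relation between sample rate and the size of uncompressed audio is simple arithmetic, sketched below in Python. The defaults (stereo, 16 bits per sample) are assumptions for the example; the WAV header is ignored and the function names are illustrative.

```python
def pcm_bytes_per_second(sample_rate_hz, channels=2, bits_per_sample=16):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * channels * bits_per_sample // 8


def pcm_megabytes(sample_rate_hz, duration_seconds, channels=2, bits_per_sample=16):
    """Approximate size of uncompressed PCM audio in megabytes (10^6 bytes)."""
    rate = pcm_bytes_per_second(sample_rate_hz, channels, bits_per_sample)
    return rate * duration_seconds / 1_000_000


print(pcm_bytes_per_second(48000))    # 192000
print(pcm_bytes_per_second(44100))    # 176400
print(pcm_megabytes(48000, 60 * 60))  # one hour of audio: 691.2
```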

To remove the noise, select a couple of seconds of silence (where you hear the background noise), go to Effect –> Noise removal and click on the “Get Noise Profile” button. Now press cancel. Then select the whole audio and go to the plugin again and press Ok. Notice that aggressive noise removal can lead to voices sounding “metallic”. Use the preview button to strike a balance between the noise level and voice quality. I found values between 10 dB and 14 dB to be a good choice. Export the resulting file in WAV format.

Now it is time to use The Levelator. What is the Levelator? It is a very easy to use program: it contains absolutely no properties to “tune”. You just start it, drop your WAV file on it and wait for it to finish processing. However, in the background it contains some very nice algorithms which compress and normalize the audio file. It is meant to work with an input containing voices from multiple persons (like a podcast or a panel) and raises them to the same level. We could have done what the Levelator does with Audacity, but it would have been a lot of work.

Here is what the audio looks like after normalization. As you can see the volume is much more uniform (without clipping anywhere!). The Levelator saves the output file in the same directory as the original one (so make sure that you have free space) with the suffix .output. You can do a second pass of noise removal (usually this is not needed, but since we had to amplify the audio very much, we also inevitably amplified some noise).

Now we put it all back together: go to Avidemux and select Audio –> Main track. Choose “External WAV” and specify the wave file you’ve exported from Audacity (after the second denoising). Press ok. Go to Audio –> Encoder and select MP3 (LAME). I would recommend using the “Joint stereo mode” with a constant bitrate (CBR) of 64 kbps. “Joint stereo” means that the commonalities are encoded only once and for the two channels only the differences are kept. This greatly reduces the data volume (if the two channels are very similar – which they are in this case), thus resulting in improved quality at the same bitrate. I wouldn’t recommend going mono, since some players have weird issues playing back mono streams. The same is true for bitrates below 64 kbps and VBR / ABR modes.
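The “commonalities plus differences” idea can be illustrated with mid/side coding, one of the techniques used for joint stereo. The Python sketch below only shows the principle (what an actual MP3 encoder does per frame is more involved, and the function names are mine): when the two channels are nearly identical, the “side” signal is almost all zeros and therefore very cheap to encode.

```python
def to_mid_side(left, right):
    """Split a stereo pair into the common part (mid) and the difference (side)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Rebuild the left and right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [100, 200, -50]
right = [100, 198, -50]
mid, side = to_mid_side(left, right)
print(side)  # [0.0, 1.0, 0.0]
```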

Sidenote: using lossy compression (like MP3, XviD, H.264) repeatedly leads to degraded quality. When possible use lossless formats (like WAV or FLAC) during the processing phase and only render to a lossy format at the end.

For the video part: for the encoder select MPEG-4 AVC (x264). Use the filters button and add a deinterlace filter (I used yadif). Now use the calculator to find out the bitrate you need to set for a particular filesize. For example a bitrate of 800 kbps resulted in a ~548 MB file (you need to consider also the audio part when estimating the final filesize – this is why it is easier to use the calculator).
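What the calculator does is roughly the following arithmetic, sketched here in Python. The numbers in the example are hypothetical (they are not the exact figures of this video), container overhead is ignored and the function name is my own.

```python
def video_bitrate_kbps(target_megabytes, duration_seconds, audio_kbps=64):
    """Video bitrate that fits a target file size once the audio is accounted for."""
    total_kbps = target_megabytes * 1_000_000 * 8 / 1000 / duration_seconds
    return total_kbps - audio_kbps


# Hypothetical example: a 5000 second video that should fit in 540 MB
# together with a 64 kbps audio track.
print(video_bitrate_kbps(540, 5000))  # 800.0
```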

Now configure the encoder. I would recommend the “Two pass – average bitrate” mode. This means that the encoder does two rounds: first it estimates the “compressibility” of each frame, and on the second pass it does the compressing. This mode results in better quality and a better approximation of the desired file size at the expense of doubling the encoding time (even so, it was done in less than an hour on a Core 2 Duo @ 2.39 GHz). When all the parameters are set, go to File –> Save –> Save Video to render the result.

Hopefully this tutorial will help people in producing better quality media. If you have questions, please leave them as comments and I will try to answer as fast and as best as I can. I’m not a multimedia guru, but I play around a lot with these tools.

Update: below is the resulting video. You can compare it to the original.

Two new podcasts

gpanther — Mon, 12 Jan 2009 05:42:00 +0000

Just wanted to announce two new podcasts I’ve started listening to, and maybe they would be of interest to people interested in security:

The IT Security Pubcast – a South African podcast with security professionals who have real, hands-on experience with the physical aspects of security. Being a more electronic-only guy, this is a very interesting source for me. Also, the sound quality is very good. If you are interested in security, be sure to listen to it, it’s well worth your time.
The Reality Check podcast. From the host of the Silver Bullet podcast. It is described as:

The Reality Check Podcast with Gary McGraw will focus on software security practitioners and practical software security. We’ll interview people involved in running large-scale software security initiatives.

A quick personal todo

gpanther — Sun, 11 Jan 2009 08:12:00 +0000

Check out the Sony PS-LX300USB turntable. I’ve known about the one ThinkGeek offers, but this review sounds very good. Also, Amazon seems to offer some nice accessories for music archiving (like the record cleaner brushes / solutions).

pl/lolcode

gpanther — Wed, 07 Jan 2009 14:59:00 +0000

The news (via Joshua Drake’s blog): video / audio / slides available for two more talks on the postgresql conference site. Now for the funny part (this is from the slides of the “Babel of Procedural Languages” by David Fetter):


HAI
    CAN HAS DATABUKKIT?
    I HAS A RESULT
    I HAS A RECORD
    GIMMEH RESULT OUTTA DATABUKKIT "SELECT field FROM mytable"
    IZ RESULT NOOB?
        YARLY
            BYES "SUMWUNZ IN YR PGSQL STEELIN YR DATA"
    KTHX
    IM IN YR LOOP
        GIMMEH RECORD OUTTA RESULT
        VISIBLE RECORD!!FIELD
        IZ RESULT NOOB? KTHXBYE
    IM OUTTA YR LOOP
KTHXBYE

This… is… incredibly… funny!!!

Mixed links

gpanther — Tue, 23 Dec 2008 09:58:00 +0000

A list of rich content if you ~~are bored~~ have free time in the following weeks:

(Some) videos from the Fall 2008 Microsoft Bluehat security conference (from the external SensePost blog).

From PerlBuzz: Higher Order Perl available for free (legal!) download. I started reading it and already found some interesting tidbits. It always felt that Perl is more a functional language than an OO one (see this post from chromatic on the same topic) and this book strengthens this idea.

Staying on the topic of Perl for a moment, from taint.org we have How I learnt to love Perl. A good read. From the same blogpost I got the link to Python Makes Me Nervous. Even though I’m no fan of Python, the article is a little overstated. Python does have exceptions, you just don’t have to catch them. And the majority of the problems can’t be codified in a form that your compiler can check (yes, some languages are better than others – but exactly how many of you program in a language that has extended support for codifying pre- and postconditions again?), they must be codified in unit tests.

From XKCD comes the following comic (this is very true on multiple levels, not just for highschools):

Via Perlbuzz comes a link to a rant about what not to do in your Perl code.

On the Postgresql Conference page we have links to a lot of videos / recordings / slides about PostgreSQL and other related topics. Very useful (please rate the videos if you watch them – many of them don’t have any ratings yet!) Speaking of videos, they are up for the HEAnet conference 2008 (you might know them from the SourceForge mirror list).

From Hosts news comes the following very useful information: Google Diagnostic now reports on entire Networks.

On the Microsoft SDL blog we have MS08-078 and the SDL. It gives a very good description of the source of the bug (basically the array size was changed, but the variable used to store the upper limit for the iteration was not updated).

From RemkoWeijnen.nl the blog (try typing that without copy-paste) we have a New Universal Patch Method if you need your Windows server to accept more than N simultaneous RDP connections (where N is 3 for 2K3 and 2 for 2K8 if I remember correctly). It always ticks me off when a company limits a product not for technological but for sales reasons. So I use Linux :-). But it is nice to have options (although they are probably in the gray zone from a legal standpoint).