Studies of Life

Learning by doing.

Using Audacity Chains to Quickly Adjust Audio for Transcriptions

05 April 2014 by Jim

Transcription work, if done at high volume, takes up a huge amount of time, so it is important to find ways to make the process quicker or easier. I’ve been working in this area for a few years now, and I’m always tweaking my process.

I’ve tried using Dragon Dictate / Mac OS X’s dictation feature to speak the text instead of typing it, and even though this is not ideal due to its lack of accuracy, it can sometimes be nice when my fingers hurt from typing too much.

One trick that really helps me, though, is using Audacity to 1) improve the audio quality and 2) shorten the audio.

I used to do this manually, but it took about 20 clicks per file, along with some waiting in between, so that was not ideal when I received 5 audio files at one time.

So I’ve discovered (after searching for some time on Google) that Audacity has a feature called ‘Chains’ that allows you to define a series of effects to apply to audio files before saving them. This allows me to process a whole series of files without having to supervise the process.

Here are the settings I use:

Audacity Chain settings for audio transcripts

– truncate silence: removes pauses in the audio (this can sometimes shorten the audio file by 5-10% depending on what is being talked about and if there are a lot of reflective pauses)

– leveller: this simply evens out the audio so that the loud and quiet parts are not too far apart

– normalise: Wikipedia says it’s ‘the application of a constant amount of gain to an audio recording to bring the average or peak amplitude to a target level (the norm)’ – in practice this brings the loudest peaks to a set level, so they are less likely to click or pop uncomfortably during listening

– compressor: dynamic range compression reduces the gap between the loud and quiet parts of the audio, making the quiet parts louder relative to the rest

– change tempo: this increases the speed of the audio by 15%, which shaves off roughly another 13% of the length, since the file now plays in 1/1.15 of the time (of course this is only for experienced transcribers who can usually type faster than people speak)

The result is a proper audio file, about 20% shorter than the initial one, with a much more even volume throughout. The 20% shortening may not be hugely important, but it has, if nothing else, a certain psychological effect on me: I look differently at a 45-minute file than at a 58-minute file. And that’s sometimes enough to motivate me.
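As a rough sanity check on those numbers, here is a minimal Python sketch of the arithmetic. It is not part of Audacity; the function names are my own, made up for illustration. It assumes that truncating silence removes a fixed fraction of the file and that a 15% tempo increase divides the remaining length by 1.15.

```python
def shortened_minutes(original_minutes, silence_removed_fraction, tempo_increase_percent):
    """Estimate the length of a file after silence truncation and a tempo change.

    Hypothetical helper for illustration only; it models the two
    length-changing steps of the chain, nothing else.
    """
    # Step 1: truncating silence removes a fraction of the original length.
    after_truncate = original_minutes * (1.0 - silence_removed_fraction)
    # Step 2: playing 15% faster means the audio takes 1/1.15 of the time.
    return after_truncate / (1.0 + tempo_increase_percent / 100.0)


def percent_saved(original_minutes, new_minutes):
    """Return how much shorter the new file is, as a percentage."""
    return 100.0 * (1.0 - new_minutes / original_minutes)


if __name__ == "__main__":
    original = 58.0  # minutes, the example file length from the post
    for fraction in (0.05, 0.10):  # the 5-10% range for truncate silence
        result = shortened_minutes(original, fraction, 15.0)
        saved = percent_saved(original, result)
        print(f"silence removed {fraction:.0%}: {result:.1f} min ({saved:.1f}% shorter)")
```

With 10% of the file removed as silence, a 58-minute file comes out at a little over 45 minutes, which matches the roughly 20% saving described above.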

Categories: Freelancing