soxexam

TriggerTek Logo
abcdefghijklmnopqrstuvwxyz_
SoX(1)								       SoX(1)



NAME
       soxexam - SoX Examples (CHEAT SHEET)

CONVERSIONS
       Introduction

       In  general,  SoX  will attempt to take an input sound file format and
       convert it into a new file format using a similar data type and sample
       rate.   For instance, "sox monkey.au monkey.wav" would try and convert
       the mono 8000Hz u-law sample .au file that comes with SoX to a  8000Hz
       u-law .wav file.

       If  an  output  format doesn’t support the same data type as the input
       file then SoX will generally select a default data type to save it in.
       You can override the default data type selection by using command line
       options.	 This is also useful for producing an output file with higher
       or lower precision data and/or sample rate.

       Most  file  formats that contain headers can automatically be read in.
       When working with header-less file formats then a user  must  manually
       tell SoX the data type and sample rate using command line options.

       When  working  with header-less files (raw files), you may take advan-
       tage of the pseudo-file types of .ub, .uw, .sb, .sw, .ul, and .sl.  By
       using  these extensions on your filenames you will not have to specify
       the corresponding options on the command line.

       Precision

       The following data types and formats can be represented by their total
       uncompressed  bit  precision.   When  converting from one data type to
       another care must be taken to insure it has an equal or greater preci-
       sion.   If  not	then the audio quality will be degraded.  This is not
       always a bad thing when your working with things such as	 voice	audio
       and are concerned about disk space or bandwidth of the audio data.

	       Data Format    Precision
	       ___________    _________
	       unsigned byte	8-bit
	       signed byte	8-bit
	       u-law	       14-bit
	       A-law	       13-bit
	       unsigned word   16-bit
	       signed word     16-bit
	       ADPCM	       16-bit
	       GSM	       16-bit
	       unsigned long   32-bit
	       signed long     32-bit
	       ___________    _________

       Examples

       Use the ’-V’ option on all your command lines.  It makes SoX print out
       its idea of what is going on.  ’-V’ is your friend.

       To convert from unsigned bytes at 8000 Hz to signed words at 8000 Hz:

	 sox -r 8000 -c 1 filename.ub newfile.sw

       To convert from Apple’s AIFF format to Microsoft’s WAV format:

	 sox filename.aiff filename.wav

       To convert from mono raw 8000 Hz 8-bit unsigned	PCM  data  to  a  WAV
       file:

	 sox -r 8000 -u -b -c 1 filename.raw filename.wav

       SoX  may	 even  be  used to convert sample rates.  Downconverting will
       reduce the bandwidth of a sample, but will  reduce  storage  space  on
       your  disk.   All  such	conversions are lossy and will introduce some
       noise.  You should really pass your sample through a low	 pass  filter
       prior  to  downconverting  as  this  will prevent alias signals (which
       would sound like additional noise).  For example	 to  convert  from  a
       sample recorded at 11025 Hz to a u-law file at 8000 Hz sample rate:

	 sox infile.wav -t au -r 8000 -U -b -c 1 outputfile.au

       To  add	a low-pass filter (note use of stdout for output of the first
       stage and stdin for input on the second stage):

	 sox infile.wav -t raw -s -w -c 1 - lowpass 3700  |
	   sox -t raw -r 11025 -s -w -c 1 - -t au -r 8000 -U -b -c 1 ofile.au

       If  you	hear  some clicks and pops when converting to u-law or A-law,
       reduce the output level slightly, for example this will decrease it by
       20%:

	 sox infile.wav -t au -r 8000 -U -b -c 1 -v .8 outputfile.au


       SoX  is great to use along with other command line programs by passing
       data between the programs using pipelines.  The most common example is
       to  use	mpg123	to  convert mp3 files in to wav files.	The following
       command line will do this:

	 mpg123 -b 10000 -s filename.mp3 | sox -t raw -r 44100 -s -w -c	 2  -
       filename.wav

       When working with totally unknown audio data then the "auto" file for-
       mat may be of use.  It attempts to guess what the  file	type  is  and
       then you may save it into a known audio format.

	 sox -V -t auto filename.snd filename.wav

       It  is important to understand how the internals of SoX work with com-
       pressed audio including u-law, A-law, ADPCM, or GSM.   SoX  takes  ALL
       input data types and converts them to uncompressed 32-bit signed data.
       It will then convert this internal version into the  requested  output
       format.	 This  means  additional  noise can be introduced from decom-
       pressing data and then recompressing.  If applying multiple effects to
       audio  data,  it	 is  best  to save the intermediate data as PCM data.
       After the final effect is performed, then you can specify it as a com-
       pressed	output	format.	 This will keep noise introduction to a mini-
       mum.

       The following example applies various effects  to  an  8000  Hz	ADPCM
       input file and then end up with the final file as 44100 Hz ADPCM.

	 sox firstfile.wav -r 44100 -s -w secondfile.wav
	 sox secondfile.wav thirdfile.wav swap
	 sox thirdfile.wav -a -b finalfile.wav mask

       Under  a DOS shell, you can convert several audio files to an new out-
       put format using something similar to the following command line:

	 FOR %X IN (*.RAW) DO sox -r 11025 -w -s -t raw $X $X.wav

EFFECTS
       Special thanks goes to  Juergen	Mueller	 (jmeuller@uia.au.ac.be)  for
       this write up on effects.

       Introduction:

       The  core problem is that you need some experience in using effects in
       order to say "that any old sound file sounds with  effects  absolutely
       hip".  There  isn’t  any	 rule-based system which tell you the correct
       setting of all the parameters for every effect.	But after  some	 time
       you will become an expert in using effects.

       Here  are some examples which can be used with any music sample.	 (For
       a sample where only a single instrument is playing, extreme  parameter
       setting	may  make well-known "typically" or "classical" sounds. Like-
       wise, for drums, vocals or guitars.)

       Single effects will be explained and  some  given  parameter  settings
       that  can  be  used to understand the theory by listening to the sound
       file with the added effect.

       Using multiple effects in parallel or in series can result either in a
       very nice sound or (mostly) in a dramatic overloading in variations of
       sounds such that your ear may follow  the  sound	 but  you  will	 feel
       unsatisfied.  Hence,  for  the first time using effects try to compose
       them as minimally as possible. We  don’t	 regard	 the  composition  of
       effects in the examples because too many combinations are possible and
       you really need a very fast machine and a lot of memory to  play	 them
       in real-time.

       However,	 real-time  playing  of sounds will greatly speed up learning
       and/or tuning the parameter settings for your sounds in order  to  get
       that "perfect" effect.

       Basically,  we will use the "play" front-end of SoX since it is easier
       to listen sounds coming out of the  speaker  or	earphone  instead  of
       looking at cryptic data in sound files.

       For easy listening of file.xxx ("xxx" is any sound format):

	     play file.xxx effect-name effect-parameters

       Or more SoX-like (for "dsp" output on a UNIX/Linux computer):

	     sox file.xxx -t ossdsp -w -s /dev/dsp effect-name effect-parame-
       ters

       or (for "au" output):

	     sox file.xxx -t sunau -w -s /dev/audio effect-name effect-param-
       eters

       And for date freaks:

	     sox file.xxx file.yyy effect-name effect-parameters

       Additional  options  can be used. However, in this case, for real-time
       playing you’ll need a very fast machine.

       Notes:

       I played all examples in real-time on a Pentium 100  with  32  MB  and
       Linux  2.0.30  using  a	self-recorded sample ( 3:15 min long in "wav"
       format with 44.1 kHz sample rate and stereo  16	bit  ).	  The  sample
       should  not  contain  any  of  the  effects.  However, if you take any
       recording of a sound track from radio or tape or	 CD,  and  it  sounds
       like  a	live  concert  or ten people are playing the same rhythm with
       their drums or funky-grooves, then take any other sample.  (Typically,
       less  then four different instruments and no synthesizer in the sample
       is suitable. Likewise, the combination vocal, drums, bass and guitar.)

       Effects:

       Echo

       An echo effect can be naturally found in the mountains, standing some-
       where on a mountain and shouting a single word will result in  one  or
       more repetitions of the word (if not, turn a bit around and try again,
       or climb to the next mountain).

       However, the time difference between shouting  and  repeating  is  the
       delay  (time), its loudness is the decay. Multiple echos can have dif-
       ferent delays and decays.

       It is very popular to use echos to  play	 an  instrument	 with  itself
       together, like some guitar players (Brain May from Queen) or vocalists
       are doing.  For music samples of more than one instrument, echo can be
       used to add a second sample shortly after the original one.

       This will sound as if you are doubling the number of instruments play-
       ing in the same sample:

	     play file.xxx echo 0.8 0.88 60.0 0.4

       If the delay is very short, then it  sound  like	 a  (metallic)	robot
       playing music:

	     play file.xxx echo 0.8 0.88 6.0 0.4

       Longer delay will sound like an open air concert in the mountains:

	     play file.xxx echo 0.8 0.9 1000.0 0.3

       One mountain more, and:

	     play file.xxx echo 0.8 0.9 1000.0 0.3 1800.0 0.25

       Echos

       Like  the  echo	effect, echos stand for "ECHO in Sequel", that is the
       first echos takes the input, the second the input and the first echos,
       the third the input and the first and the second echos, ... and so on.
       Care should be taken using many echos  (see  introduction);  a  single
       echos has the same effect as a single echo.

       The sample will be bounced twice in symmetric echos:

	     play file.xxx echos 0.8 0.7 700.0 0.25 700.0 0.3

       The sample will be bounced twice in asymmetric echos:

	     play file.xxx echos 0.8 0.7 700.0 0.25 900.0 0.3

       The sample will sound as if played in a garage:

	     play file.xxx echos 0.8 0.7 40.0 0.25 63.0 0.3

       Chorus

       The chorus effect has its name because it will often be used to make a
       single vocal sound like a chorus. But  it  can  be  applied  to	other
       instrument samples too.

       It  works like the echo effect with a short delay, but the delay isn’t
       constant.  The delay is varied using a sinusoidal or triangular	modu-
       lation.	The modulation depth defines the range the modulated delay is
       played before or after the delay. Hence the delayed sound  will	sound
       slower  or faster, that is the delayed sound tuned around the original
       one, like in a chorus where some vocals are a bit out of tune.

       The typical delay is around 40ms to 60ms, the speed of the  modulation
       is best near 0.25Hz and the modulation depth around 2ms.

       A single delay will make the sample more overloaded:

	     play file.xxx chorus 0.7 0.9 55.0 0.4 0.25 2.0 -t

       Two delays of the original samples sound like this:

	      play file.xxx chorus 0.6 0.9 50.0 0.4 0.25 2.0 -t 60.0 0.32 0.4
       1.3 -s

       A big chorus of the sample is (three additional samples):

	     play file.xxx chorus 0.5 0.9 50.0 0.4 0.25 2.0 -t 60.0 0.32  0.4
       2.3 -t	       40.0 0.3 0.3 1.3 -s

       Flanger

       The  flanger  effect  is	 like the chorus effect, but the delay varies
       between 0ms and maximal 5ms. It sound  like  wind  blowing,  sometimes
       faster or slower including changes of the speed.

       The  flanger  effect  is widely used in funk and soul music, where the
       guitar sound varies frequently slow or a bit faster.

       The typical delay is around 3ms to 5ms, the speed of the modulation is
       best near 0.5Hz.

       Now, let’s groove the sample:

	     play file.xxx flanger 0.6 0.87 3.0 0.9 0.5 -s

       listen  carefully  between the difference of sinusoidal and triangular
       modulation:

	     play file.xxx flanger 0.6 0.87 3.0 0.9 0.5 -t

       If the decay is a bit lower, than the effect sounds more popular:

	     play file.xxx flanger 0.8 0.88 3.0 0.4 0.5 -t

       The drunken loudspeaker system:

	     play file.xxx flanger 0.9 0.9 4.0 0.23 1.3 -s

       Reverb

       The reverb effect is often used in audience hall which are to small or
       contain	too  many many visitors which disturb (dampen) the reflection
       of sound at the walls.  Reverb will make the sound be perceived as  if
       it  were in a large hall.  You can try the reverb effect in your bath-
       room or garage or sport halls by shouting loud some words. You’ll hear
       the words reflected from the walls.

       The  biggest problem in using the reverb effect is the correct setting
       of the (wall) delays such that the  sound  is  realistic	 and  doesn’t
       sound like music playing in a tin can or has overloaded feedback which
       destroys any illusion of playing in a big hall.	To  help  you  obtain
       realistic  reverb effects, you should decide first how long the reverb
       should take place until it is not loud enough to be registered by your
       ears.  This  is	be  done by varying the reverb time "t".  To simulate
       small  halls,  use  200ms.   To	simulate  large	 halls,	 use  1000ms.
       Clearly,	 the  walls  of	 such  a  hall aren’t far away, so you should
       define its setting be given every wall its delay	 time.	 However,  if
       the  wall  is  to  far  away  for  the reverb time, you won’t hear the
       reverb, so the nearest wall will be best at "t/4" delay and  the	 far-
       thest  at  "t/2".  You  can  try other distances as well, but it won’t
       sound very realistic.  The walls shouldn’t  stand  to  close  to	 each
       other  and not in a multiple integer distance to each other ( so avoid
       wall like: 200.0 and 202.0, or something like 100.0 and 200.0 ).

       Since audience halls do have a lot of walls, we will  start  designing
       one beginning with one wall:

	     play file.xxx reverb 1.0 600.0 180.0

       One wall more:

	     play file.xxx reverb 1.0 600.0 180.0 200.0

       Next two walls:

	     play file.xxx reverb 1.0 600.0 180.0 200.0 220.0 240.0

       Now, why not a futuristic hall with six walls:

	      play  file.xxx  reverb  1.0 600.0 180.0 200.0 220.0 240.0 280.0
       300.0

       If you run out of machine power or memory, then stop as many  applica-
       tions  as  possible  (every  interrupt  will consume a lot of CPU time
       which for bigger halls is absolutely necessary).

       Phaser

       The phaser effect is like the flanger effect, but  it  uses  a  reverb
       instead of an echo and does phase shifting. You’ll hear the difference
       in the examples comparing  both	effects	 (simply  change  the  effect
       name).	The delay modulation can be sinusoidal or triangular, prefer-
       able is the later for  multiple	instruments.  For  single  instrument
       sounds,	the  sinusoidal	 phaser	 effect	 will  give a sharper phasing
       effect.	The decay shouldn’t be to close to 1.0 which will cause	 dra-
       matic feedback.	A good range is about 0.5 to 0.1 for the decay.

       We  will	 take a parameter setting as for the flanger before (gain-out
       is lower since feedback can raise the output dramatically):

	     play file.xxx phaser 0.8 0.74 3.0 0.4 0.5 -t

       The drunken loudspeaker system (now less alcohol):

	     play file.xxx phaser 0.9 0.85 4.0 0.23 1.3 -s

       A popular sound of the sample is as follows:

	     play file.xxx phaser 0.89 0.85 1.0 0.24 2.0 -t

       The sample sounds if ten springs are in your ears:

	     play file.xxx phaser 0.6 0.66 3.0 0.6 2.0 -t

       Compander

       The compander effect allows the dynamic range of a signal to  be	 com-
       pressed	or  expanded.  For most situations, the attack time (response
       to the music getting louder) should be shorter  than  the  decay	 time
       because	our  ears  are	more sensitive to suddenly loud music than to
       suddenly soft music.

       For example, suppose  you  are  listening  to  Strauss’	"Also  Sprach
       Zarathustra" in a noisy environment such as a car.  If you turn up the
       volume enough to hear the soft passages over the road noise, the	 loud
       sections will be too loud.  You could try this:

	     play file.xxx compand 0.3,1 -90,-90,-70,-70,-60,-20,0,0 -5 0 0.2

       The transfer function ("-90,...") says that very soft  sounds  between
       -90  and -70 decibels (-90 is about the limit of 16-bit encoding) will
       remain unchanged.  That keeps the compander from boosting  the  volume
       on  "silent"  passages  such as between movements.  However, sounds in
       the range -60 decibels to 0 decibels (maximum volume) will be  boosted
       so  that	 the  60-dB  dynamic range of the original music will be com-
       pressed 3-to-1 into a 20-dB range, which is wide enough to  enjoy  the
       music  but narrow enough to get around the road noise.  The -5 dB out-
       put gain is needed to avoid clipping (the number is inexact,  and  was
       derived	by  experimentation).  The 0 for the initial volume will work
       fine for a clip that starts with a bit of silence, and  the  delay  of
       0.2  has	 the  effect  of  causing  the	compander to react a bit more
       quickly to sudden volume changes.

       Changing the Rate of Playback

       You can use stretch to change the rate of playback of an audio  sample
       while preserving the pitch.  For example to play at 1/2 the speed:

	     play file.wav stretch 2

       To play a file at twice the speed:

	     play file.wav stretch .5

       Other  related  options	are  "speed" to change the speed of play (and
       changing the pitch accordingly), and pitch, to alter the	 pitch	of  a
       sample.	 For  example  to  speed a sample so it plays in 1/2 the time
       (for those Mickey Mouse voices):

	     play file.wav speed 2

       To raise the pitch of a sample 1 while note (100 cents):

	     play file.wav pitch 100



       Other effects (copy,  rate,  avg,  stat,	 vibro,	 lowp,	highp,	band,
       reverb)

       The  other effects are simple to use. However, an "easy to use manual"
       should be given here.

       More effects (to do !)

       There are a lot of effects around like noise gates, compressors,	 waw-
       waw,  stereo effects and so on. They should be implemented, making SoX
       more useful in sound mixing techniques coming together  with  a	great
       variety of different sound effects.

       Combining  effects  by using them in parallel or serially on different
       channels needs some easy mechanism which is stable for  use  in	real-
       time.

       Really  missing	are  the  the  changing	 of the parameters and start-
       ing/stopping of effects while playing samples in real-time!

       Good luck and have fun with all the effects!

	    Juergen Mueller	     (jmueller@uia.ua.ac.be)


SEE ALSO
       sox(1), play(1), rec(1)

AUTHOR
       Juergen Mueller	   (jmueller@uia.ua.ac.be)

       Updates by Anonymous.



			      December 11, 2001			       SoX(1)