Mixing Guide

How to Mix Vocals Step by Step

Build a professional vocal mixing chain from scratch. Learn the exact order for EQ, compression, de-essing, reverb, and delay to make vocals sit perfectly in your mix.

Vocals are the most important element in nearly every modern mix. They carry the melody, the emotion, and the message. If your vocals sound thin, buried, or harsh, listeners will notice before anything else. Building a reliable vocal mixing chain is one of the most valuable skills you can develop as a producer or engineer.

This guide is part of our mixing fundamentals series. We will walk through every step of a professional vocal chain in the correct order, with specific settings and frequency references you can use as starting points. These are not rules carved in stone. They are informed defaults that work across genres, and you should adjust them to taste once you understand why each step exists.

Step 1: Gain Staging

Before you touch a single plugin, set the level going into your chain so the vocal peaks between -18 dBFS and -12 dBFS. Do this with clip gain or a gain utility plugin at the top of the chain rather than the channel fader, because in most DAWs the fader sits after the inserts and does not change what the plugins receive. This gives every processor in your chain enough headroom to work without clipping, whether the raw vocal was recorded too hot or too quiet.

Gain staging is not glamorous, but skipping it is one of the most common reasons vocal chains fall apart. Compressors behave differently when they receive signals at different levels, because a fixed threshold catches more or less of the performance. An EQ boost that sounds musical at -15 dBFS can sound harsh at -3 dBFS, especially in analog-modeled plugins that saturate as the input level rises, or when the boost pushes peaks past 0 dBFS further down the chain. Start clean and every subsequent step becomes easier.
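To make the arithmetic concrete, here is a minimal sketch of how a trim value could be worked out. It assumes floating-point samples where 1.0 equals 0 dBFS, and it picks -15 dBFS (the middle of the range above) as the target. The function names are illustrative and do not come from any particular DAW or library.

```python
import math

def peak_dbfs(samples):
    """Return the peak level of float samples (full scale = 1.0) in dBFS."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(peak)

def trim_db(samples, target_dbfs=-15.0):
    """Gain in dB to apply at the top of the chain so the peak lands on the target."""
    return target_dbfs - peak_dbfs(samples)

def apply_gain(samples, gain_db):
    """Scale every sample by a gain expressed in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]

# A made-up "hot" recording that peaks close to full scale.
hot_take = [0.0, 0.4, -0.9, 0.7]
gain = trim_db(hot_take)
staged = apply_gain(hot_take, gain)
print(f"peak before: {peak_dbfs(hot_take):.2f} dBFS, trim: {gain:.2f} dB")
print(f"peak after:  {peak_dbfs(staged):.2f} dBFS")
```

In practice you would read the peak from a meter over the loudest section of the performance instead of from a list of samples, but the subtraction is the same.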

Step 2: Subtractive EQ - Remove What Hurts

The first EQ in your chain is surgical. Its job is to remove problems, not add character. Start with a high-pass filter set between 80 Hz and 120 Hz with a steep slope (18 dB/octave or higher). This removes low-frequency rumble, plosive energy, and proximity effect buildup that has no business in a vocal track.
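To see what a steep slope buys you, the sketch below estimates how much a high-pass filter attenuates at a given frequency. It assumes an ideal Butterworth response, where each 6 dB/octave of slope corresponds to one filter pole. Real EQ plugins may use other filter shapes, so treat the numbers as approximate.

```python
import math

def highpass_attenuation_db(freq_hz, cutoff_hz, slope_db_per_oct=18):
    """Gain in dB of an ideal Butterworth high-pass filter at freq_hz."""
    order = slope_db_per_oct // 6  # each pole contributes 6 dB/octave
    ratio = cutoff_hz / freq_hz
    # Butterworth high-pass: |H|^2 = 1 / (1 + (fc / f)^(2 * order))
    return -10.0 * math.log10(1.0 + ratio ** (2 * order))

cutoff = 100.0  # within the 80-120 Hz range suggested above
for freq in (50.0, 100.0, 200.0):
    print(f"{freq:5.0f} Hz: {highpass_attenuation_db(freq, cutoff):6.2f} dB")
```

With the cutoff at 100 Hz, the filter is about 3 dB down at the cutoff itself, roughly 18 dB down an octave below it, and barely touches 200 Hz, which is why the low end of the vocal's body survives.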

Next, sweep through the low-mids between 200 Hz and 400 Hz with a narrow bell (Q of 4-6) boosted by 6-8 dB. Listen for boxy, woody resonances. When you find a frequency that makes the vocal sound like the singer is in a cardboard box, cut it by 2-4 dB with a moderate Q of 2-3. The 250 Hz region is the most common offender.

Check the 500-800 Hz range for nasal honkiness. If the vocal sounds like the singer is pinching their nose, a gentle 1-2 dB cut in this range with a wide Q can help. Be conservative here because this range also carries body and warmth that you do not want to lose entirely.

Pro Tip

Always make subtractive EQ decisions while listening to the vocal in context with the full mix playing. A frequency that sounds fine in solo might be causing masking problems against other instruments.

Step 3: Compression - Control the Dynamics

Compression is what makes a vocal sit consistently in the mix instead of jumping between too loud and too quiet. For most vocal styles, start with a ratio between 2:1 and 4:1. A medium attack time of 10-30 ms lets the initial consonant transients through so the vocal retains its natural articulation, while a release time of 40-80 ms lets the compressor recover before the next phrase.

Set the threshold so the compressor applies 3-6 dB of gain reduction on the loudest phrases. You want to hear the vocal become more even without sounding squashed or lifeless. If the vocal pumps or breathes unnaturally, your release is too fast. If the compressor never lets go between phrases, your release is too slow.
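The relationship between threshold, ratio, and gain reduction can be sketched with the static curve of a hard-knee compressor. This is a simplification: it ignores attack, release, and knee shape, and the function names are only illustrative.

```python
def gain_reduction_db(input_db, threshold_db, ratio):
    """Gain reduction in dB for a steady input level (hard knee, no time constants)."""
    over = max(0.0, input_db - threshold_db)
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return over * (1.0 - 1.0 / ratio)

def threshold_for_reduction(peak_db, ratio, target_reduction_db):
    """Threshold that gives the target reduction on a phrase peaking at peak_db."""
    over = target_reduction_db / (1.0 - 1.0 / ratio)
    return peak_db - over

loudest_phrase_db = -12.0  # assumed peak level of the loudest phrase
for ratio in (2.0, 3.0, 4.0):
    threshold = threshold_for_reduction(loudest_phrase_db, ratio, 4.5)
    print(f"{ratio:.0f}:1 needs a threshold near {threshold:.2f} dBFS for 4.5 dB of reduction")
```

Notice that a gentler ratio needs a lower threshold to reach the same amount of reduction, which means it acts on more of the performance. That is one reason low ratios tend to sound smoother on vocals.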

Many engineers use two compressors in series, each doing 2-3 dB of reduction, rather than one compressor doing 6 dB. This approach sounds more transparent because no single compressor is working hard enough to introduce obvious artifacts. The first compressor tames the peaks, and the second one evens out the overall level.
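The series approach can be illustrated with the same static, hard-knee model. The thresholds and ratios below are assumed values chosen so that each stage does 2.5 dB of reduction on a -10 dBFS peak, with no makeup gain between the stages.

```python
def gain_reduction_db(input_db, threshold_db, ratio):
    """Hard-knee gain reduction in dB for one compressor stage."""
    over = max(0.0, input_db - threshold_db)
    return over * (1.0 - 1.0 / ratio)

def run_series(input_db, stages):
    """Pass a level through (threshold_db, ratio) stages in order.

    Returns the final level and the reduction applied by each stage.
    """
    level = input_db
    reductions = []
    for threshold_db, ratio in stages:
        reduction = gain_reduction_db(level, threshold_db, ratio)
        reductions.append(reduction)
        level -= reduction
    return level, reductions

peak_db = -10.0
two_stage = [(-15.0, 2.0), (-17.5, 2.0)]  # first catches peaks, second evens out
one_stage = [(-20.0, 2.0)]                # one compressor doing all the work

print(run_series(peak_db, two_stage))
print(run_series(peak_db, one_stage))
```

Both setups land the peak at the same level, but in the two-stage version neither compressor ever applies more than 2.5 dB, so each one operates in a gentler part of its range.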

If your mix has a muddy quality that is hard to pinpoint, check whether over-compression on the vocal is bringing up low-mid room tone during quiet passages. Once you add makeup gain, compression raises everything that sat below the threshold, including the parts you do not want.

Step 4: De-essing - Tame the Sibilance

Sibilance, the harsh "S" and "T" sounds that spike in the 5-9 kHz range, becomes more pronounced after compression because the compressor brings up the overall level. A de-esser is a frequency-specific compressor that only clamps down when those sibilant frequencies exceed a threshold.

Set the de-esser to target the 5-9 kHz range. Most male vocals have their sibilance peak around 5-7 kHz, while female vocals tend to sit higher at 6-9 kHz. Aim for 3-6 dB of reduction on the harshest sibilants. If you hear a lisp, you have gone too far. A well-set de-esser should be invisible. You should only notice it when you bypass it and hear the sibilance come rushing back.

Place the de-esser after compression, not before. Compression brings up the sibilance, so de-essing before compression means the compressor will undo your work. Some engineers also place a second lighter de-esser after the additive EQ step if presence boosts have re-introduced harshness.

Step 5: Additive EQ - Shape the Character

Now that you have cleaned up problems and controlled dynamics, this second EQ shapes the vocal's tone. This is where you add presence, air, and body to taste.

A broad boost of 1-3 dB in the 3-5 kHz range adds presence and helps the vocal cut through a dense mix. This is the frequency range where human hearing is most sensitive, so a little goes a long way. For breathy, airy vocals, add a gentle high shelf starting at 10-12 kHz with 2-4 dB of boost. This adds sparkle and openness without making the vocal harsh.
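For readers curious about what a bell boost does under the hood, here is a sketch of a peaking filter using the widely published Audio EQ Cookbook (RBJ) biquad formulas. The 4 kHz center, 2 dB gain, Q of 0.7, and 48 kHz sample rate are assumed example values, and your EQ plugin may implement its bell differently.

```python
import cmath
import math

def peaking_biquad(f0_hz, gain_db, q, sample_rate=48000.0):
    """Return (b, a) coefficients for an RBJ cookbook peaking EQ."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a

def magnitude_db(b, a, freq_hz, sample_rate=48000.0):
    """Evaluate the filter's gain in dB at a single frequency."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

b, a = peaking_biquad(4000.0, 2.0, 0.7)
for freq in (100.0, 1000.0, 4000.0, 12000.0):
    print(f"{freq:7.0f} Hz: {magnitude_db(b, a, freq):+.2f} dB")
```

The low Q spreads the boost across a wide band around 4 kHz while leaving the low end essentially untouched, which is what "broad" means in practice.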

If the vocal needs more weight or body, try a small boost of 1-2 dB around 150-200 Hz. Be careful here because too much low end on a vocal brings back the same boominess you get from proximity effect and fights with bass instruments. Always check your low-end boost with the bass and kick playing.

Step 6: Saturation - Add Warmth and Density

Saturation adds harmonic content that makes vocals sound fuller, warmer, and more present without turning up the volume. Tape-style saturation is the most forgiving for vocals because it clips softly and gradually, adding low-order harmonics (mostly the third) that sound musical and smooth. Tube saturation tends to add more even-order harmonics and takes on a slightly grittier character when driven.

Use saturation subtly. Mix in 10-20% wet signal or drive the input just until you hear the vocal thicken slightly. If you can hear obvious distortion on a clean vocal style, you have pushed too far. For aggressive genres, you can push harder, but for pop and R&B, less is more. Saturation is particularly effective on hip-hop and rap vocals where grit and character are part of the aesthetic.
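As a rough illustration of parallel saturation, the sketch below blends a soft-clipped copy of the signal with the dry signal. The tanh curve is a generic soft clipper used as a stand-in, not a model of tape or tubes, and the drive and mix defaults are assumed example values.

```python
import math

def saturate(sample, drive=2.0, mix=0.15):
    """Blend a tanh soft-clipped copy of the sample with the dry sample.

    drive: how hard the signal is pushed into the curve (assumed example value)
    mix:   wet proportion, 0.0 = fully dry, 1.0 = fully saturated
    """
    # Normalise so a full-scale input still maps to full scale.
    wet = math.tanh(drive * sample) / math.tanh(drive)
    return (1.0 - mix) * sample + mix * wet

for x in (0.1, 0.5, 0.9):
    print(f"in {x:.2f} -> out {saturate(x):.4f}")
```

Quiet and mid-level samples are lifted relative to the peaks, which is the thickening effect described above. Because the tanh curve is symmetrical, this particular shaper generates odd-order harmonics only.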

Step 7: Reverb - Place the Vocal in Space

Reverb gives the vocal a sense of space and depth. Always use reverb on a send/return (auxiliary bus) rather than as an insert, so you can control the wet/dry balance independently and EQ the reverb without affecting the dry vocal.

For most modern mixes, a plate reverb with a decay time of 1.2-2.0 seconds works well. Shorter decays (0.8-1.2 seconds) keep the vocal upfront and intimate. Longer decays (2.0-3.5 seconds) push the vocal back and create a larger-than-life feel. High-pass the reverb return at 200-300 Hz to prevent low-frequency buildup, and consider rolling off the top end above 8-10 kHz to keep the reverb smooth and behind the dry vocal.

Pre-delay is an underused control. Setting a pre-delay of 20-60 ms creates a gap between the dry vocal and the reverb onset, which preserves the vocal's clarity and intelligibility while still giving it a sense of space. Without pre-delay, reverb can smear the vocal and make lyrics harder to understand.

Step 8: Delay - Add Depth and Rhythm

Delay and reverb serve different spatial functions. Reverb creates an ambient environment, while delay creates rhythmic repetitions that add depth and interest. Like reverb, use delay on a send/return bus.

A quarter-note or eighth-note tempo-synced delay with 2-4 audible repeats and moderate feedback (20-35%) fills gaps between phrases without cluttering the mix. Low-pass the delay return somewhere between 3 and 5 kHz so the repeats sit behind the dry vocal rather than competing with it. A subtle stereo slapback delay with a 60-100 ms delay time, kept low in the mix, can add width without drawing attention as a distinct echo.
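If your delay plugin does not sync to tempo, the times are easy to work out by hand. The sketch below also estimates how quickly repeats fade, assuming each repeat is the previous one multiplied by the feedback amount. The 120 BPM tempo and 30% feedback are example values only.

```python
import math

def note_delay_ms(bpm, beats=1.0):
    """Delay time in ms for a note length in beats (1.0 = quarter, 0.5 = eighth)."""
    return 60000.0 / bpm * beats

def repeat_levels_db(feedback, repeats):
    """Level of each repeat relative to the first echo, in dB."""
    step_db = 20.0 * math.log10(feedback)  # loss per trip around the feedback loop
    return [step_db * n for n in range(repeats)]

bpm = 120.0
print(f"quarter note: {note_delay_ms(bpm):.0f} ms")
print(f"eighth note:  {note_delay_ms(bpm, 0.5):.0f} ms")
print([round(level, 1) for level in repeat_levels_db(0.30, 4)])
```

At 30% feedback each repeat drops by roughly 10 dB, so by the fourth echo the delay is more than 30 dB below the first one. That is why this feedback range yields only a few audible repeats.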

The Complete Chain in Order

  1. Gain staging utility (peak at -18 to -12 dBFS)
  2. Subtractive EQ (high-pass, remove mud and boxiness)
  3. Compression (2:1 to 4:1 ratio, 3-6 dB gain reduction)
  4. De-esser (target 5-9 kHz, 3-6 dB reduction on peaks)
  5. Additive EQ (presence at 3-5 kHz, air shelf at 10-12 kHz)
  6. Saturation (tape or tube, subtle blend)
  7. Reverb (send/return, plate 1.2-2.0s, pre-delay 20-60 ms)
  8. Delay (send/return, tempo-synced, high-end rolled off)

This order is not arbitrary. Each step prepares the signal for the next. Subtractive EQ cleans up the signal before compression and makeup gain bring up everything that is left. De-essing follows compression because compression exaggerates sibilance. Additive EQ shapes the compressed signal. Saturation thickens the final tonal shape. Reverb and delay go last because they need to process the fully shaped vocal.
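If you like to keep chain templates as notes or presets, the order above could be written down as plain data. This is only a sketch: the Stage class and its field names are hypothetical and do not match any DAW's preset format. The settings strings repeat the starting points from this guide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    name: str
    routing: str   # "insert" sits on the vocal channel, "send" on an aux bus
    settings: str  # starting points taken from this guide

VOCAL_CHAIN = [
    Stage("Gain staging utility", "insert", "peak at -18 to -12 dBFS"),
    Stage("Subtractive EQ", "insert", "high-pass, remove mud and boxiness"),
    Stage("Compression", "insert", "2:1 to 4:1 ratio, 3-6 dB gain reduction"),
    Stage("De-esser", "insert", "3-6 dB reduction on sibilant peaks"),
    Stage("Additive EQ", "insert", "presence at 3-5 kHz, air shelf at 10-12 kHz"),
    Stage("Saturation", "insert", "tape or tube, subtle blend"),
    Stage("Reverb", "send", "plate 1.2-2.0 s, pre-delay 20-60 ms"),
    Stage("Delay", "send", "tempo-synced, high end rolled off"),
]

def stages_by_routing(chain, routing):
    """Return the stage names that use the given routing, in chain order."""
    return [stage.name for stage in chain if stage.routing == routing]

for position, stage in enumerate(VOCAL_CHAIN, start=1):
    print(f"{position}. {stage.name} [{stage.routing}]: {stage.settings}")
```

Keeping the routing next to each stage makes the split explicit: the first six processors are inserts on the vocal channel, while reverb and delay live on their own return buses.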
