Deep dive: what is the difference between a Proxy Clone and an IR?

Tonevault · April 2026

Having never really used IRs, when Line 6 announced Proxy clone support for the Helix Stadium, I decided it was about time to figure out what these different kinds of captures are and how they work. Hopefully what I found will help you too.

Sound is waveforms

Sound is air moving. When a speaker cone pushes forward, it compresses the air in front of it. When it pulls back, the pressure drops. Those changes travel outward as waves and eventually move your eardrums.

A waveform is a graph of that pressure over time. A pure tone, say an A at 220Hz, looks like a smooth, repeating curve: a sine wave. It completes 220 cycles per second. That repetition rate is what your brain interprets as pitch.

Real sounds are never pure. A guitar string vibrating at 110Hz also vibrates at 220Hz, 330Hz, 440Hz, and higher — all simultaneously, in decreasing amounts. These are harmonics. Their relative strengths are what give different instruments their character. A violin and a guitar playing the same note share the same fundamental frequency but have different harmonic profiles. That difference is timbre.

There's a useful way to visualize this called a frequency spectrum: instead of plotting pressure over time, you plot amplitude per frequency. A pure tone is one spike. A guitar note is a cluster of spikes — one at the fundamental, smaller ones at each harmonic.
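
Here is a minimal sketch of that idea in Python, using only the standard library. The harmonic amplitudes are invented for illustration (not measured from a real guitar), and the sample rate, window length, and the `amplitude_at` helper are assumptions of this sketch. It builds a 110Hz note from four harmonics, then measures how much of each frequency is present.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
N = 800             # 0.1 s window, so every test frequency fits a whole number of cycles

# Invented harmonic recipe for a 110 Hz note: frequency in Hz -> amplitude
recipe = {110: 1.0, 220: 0.5, 330: 0.25, 440: 0.125}

# Time domain: pressure over time
wave = [
    sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f, a in recipe.items())
    for n in range(N)
]

def amplitude_at(samples, freq_hz):
    """Amplitude of the sine component at freq_hz (one bin of a discrete Fourier transform)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

# Frequency domain: amplitude per frequency (150 Hz is included as a frequency that is NOT in the note)
spectrum = {f: amplitude_at(wave, f) for f in (110, 150, 220, 330, 440)}
for f, a in spectrum.items():
    print(f"{f} Hz: {a:.3f}")
```

The spikes land at the frequencies in the recipe, with the amplitudes that went in, and 150Hz comes back as essentially zero.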

What a system does to audio

That harmonic profile isn't fixed once it leaves the string. Every piece of gear in your signal chain takes a waveform in and puts a different waveform out. A preamp does this. So does a cabinet, a reverb, a pedal, and a length of cable. In signal processing, anything with an input and an output is called a system.

The useful question about any system: how does it transform the signal? What is the relationship between what goes in and what comes out?

That question divides the world of systems cleanly into two categories.

Linear vs. nonlinear

A linear system can only do two things to each frequency in the audio: change its level (louder or quieter) and delay it. It cannot create a frequency that wasn't in the input. Feed a linear system a pure sine wave at 440Hz and you get 440Hz back: maybe quieter, maybe shifted slightly in time, but nothing new.

A nonlinear system breaks that rule. Feed it a 440Hz sine at high enough amplitude and you get back 440Hz — plus 880Hz, 1320Hz, 1760Hz. Harmonics that were not in the input. They aren't added on top of the signal. They're generated by the nonlinearity itself, born from the shape of the distortion.
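
A small Python sketch can show the difference. Both "systems" here are invented for illustration: the linear one is just a gain plus a delay, and the nonlinear one is a simple polynomial curve I picked because its harmonics are easy to predict, not because any real amp follows it. The `amplitude_at` helper and the sample rate are assumptions of the sketch.

```python
import math

SAMPLE_RATE = 8800  # assumed, chosen so 440 Hz fits the window exactly
N = 880             # 0.1 s = 44 full cycles of 440 Hz

sine = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(N)]

def linear_system(x, gain=0.5, delay=5):
    """Scale and delay only. The delay wraps around, which is harmless for a periodic test tone."""
    return [gain * x[(n - delay) % len(x)] for n in range(len(x))]

def nonlinear_system(x):
    """Hypothetical curved transfer function: the output also depends on x squared and x cubed."""
    return [v + 0.5 * v**2 + 0.25 * v**3 for v in x]

def amplitude_at(samples, freq_hz):
    """Amplitude of the sine component at freq_hz (one bin of a discrete Fourier transform)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

lin = linear_system(sine)
nl = nonlinear_system(sine)

for name, out in (("linear", lin), ("nonlinear", nl)):
    levels = {f: round(amplitude_at(out, f), 4) for f in (440, 880, 1320)}
    print(name, levels)
```

The linear system returns 440Hz at half level and nothing else. The curved one returns 440Hz plus measurable content at 880Hz and 1320Hz, even though the input contained neither.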

Why amps break up

Your amplifier runs on a DC power supply — a voltage rail around 300–400V for a typical tube amp. The output stage uses this supply to make your guitar signal bigger. That amplified signal swings between the positive rail at the top and the negative rail at the bottom.

When you play softly, the amplified signal fits within those rails. When you play hard, the amplified signal would exceed the rails — but physically can't. So as the signal approaches those limits, it bends. It curves toward the ceiling and floor rather than continuing to grow.

A tube doesn't hit those limits sharply. It curves toward them, gradually running out of headroom. That's soft clipping. A transistor circuit follows the input cleanly until it can't, then cuts off abruptly. That's hard clipping.

A bent waveform is no longer a sine wave. Mathematically, the only way to describe that bent shape is as a combination of the original frequency plus harmonics. The clipping doesn't add frequencies on top of the signal. It transforms the waveform into a shape that requires new frequencies to describe. They're born from the bending itself.

Play with the widget below. At low drive, the transfer curve is nearly a straight line — linear. Increase drive and the curve bends. Watch the output waveform clip against the rails, and watch new spikes appear in the frequency spectrum below.

Those new frequencies weren't in the input. A pure sine wave has exactly one spike in the spectrum. The clipped waveform has a different shape — flattened at the peaks — and a different shape means different frequency content. The extra spikes are the harmonics that distinguish it from a sine wave.

That's distortion. The geometry of the nonlinearity shapes which harmonics appear and how strong they are. The two modes in the widget show the same signal through two different types of clipping; notice how the harmonic content extends further up in frequency with hard clipping, giving it a brighter, edgier character.
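
The same comparison can be sketched in Python. The two curves below are common textbook stand-ins (a `tanh` curve for soft clipping, a flat cutoff for hard clipping) and the drive value is arbitrary, so treat this as an approximation of what the widget shows rather than its actual code.

```python
import math

N = 480       # samples in exactly one cycle of the test sine
DRIVE = 4.0   # arbitrary gain pushing the signal into clipping (assumed)

sine = [math.sin(2 * math.pi * n / N) for n in range(N)]

def soft_clip(x):
    """Curves gradually toward the +/-1 rails."""
    return math.tanh(DRIVE * x)

def hard_clip(x):
    """Follows the input exactly, then flattens abruptly at the +/-1 rails."""
    return max(-1.0, min(1.0, DRIVE * x))

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of a one-cycle waveform."""
    re = sum(s * math.cos(2 * math.pi * k * n / N) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * n / N) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / N

soft = [soft_clip(x) for x in sine]
hard = [hard_clip(x) for x in sine]

# Both curves are symmetric, so only odd harmonics show up.
for k in (1, 3, 5, 7, 9, 21, 31):
    print(f"harmonic {k:2d}: soft {harmonic_amplitude(soft, k):.5f}  hard {harmonic_amplitude(hard, k):.5f}")

# Combined level of the upper harmonics (odd harmonics 21 through 39)
soft_high = sum(harmonic_amplitude(soft, k) for k in range(21, 40, 2))
hard_high = sum(harmonic_amplitude(hard, k) for k in range(21, 40, 2))
print(f"upper harmonics: soft {soft_high:.5f}  hard {hard_high:.5f}")
```

In this sketch the hard clipper leaves far more level in the upper harmonics than the soft one, which matches the brighter, edgier character described above.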

What an IR actually is

With that background, impulse responses are easier to explain.

An IR is a measurement. The way you take one: send a brief test signal into a system and record the output. The test signal is as short and sharp as possible — ideally a single sample, a click. That's the impulse. The recording of the output is the impulse response.

For a speaker cabinet, this captures everything: the resonance of the enclosure, the frequency response of the drivers, the coloration of the microphone, the room. All of it, in one recording. Run any audio through a convolution engine loaded with that IR and you get the cabinet applied to your signal — accurately.

This works because speaker cabinets are, to a close approximation, what signal processing calls LTI systems.
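
For the curious, what a convolution engine does can be sketched in a few lines of Python. The three-sample `ir` is a made-up toy (real cabinet IRs run to hundreds or thousands of samples), and `convolve` is a hypothetical direct implementation; real engines typically use faster FFT-based methods.

```python
def convolve(signal, ir):
    """Direct convolution: every input sample triggers a scaled, delayed copy of the IR."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

ir = [1.0, 0.5, 0.25]            # toy impulse response (invented values)

impulse = [1.0]                  # a single-sample click
measured = convolve(impulse, ir)
print(measured)                  # [1.0, 0.5, 0.25] -- the click comes back as the IR itself

guitar = [1.0, 0.0, -1.0, 0.5]   # stand-in for a few samples of audio
processed = convolve(guitar, ir)
print(processed)                 # [1.0, 0.5, -0.75, 0.0, 0.0, 0.125]
```

Each output sample is a sum of overlapping copies of the IR, one per input sample, each scaled by that sample's value. That only gives the right answer when the system treats loud and quiet input the same way, which is where the LTI assumption comes in.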

The LTI assumption — where IRs break down

LTI stands for Linear, Time-Invariant. Linear you know from above: no new frequencies. Time-invariant means the system responds the same way no matter when you send the signal. Fire an impulse through a cabinet today, fire another tomorrow, and you get the same response. The cabinet's behavior doesn't shift based on what you just played.

At normal playing levels, speaker cabinets behave this way closely enough that IRs capture them accurately.

Amplifiers are not. They have memory.

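When you play a hard chord, the amp's power supply sags slightly under the load and then recovers. The next note is processed by a slightly different amp than the one that played the previous note. The output depends on how hard the amp was just driven, so no single fixed response describes it. From the outside, the amp's response keeps shifting as you play, and that breaks the time-invariant assumption.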
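
To get a feel for sag, here is a deliberately crude Python model. It is not a circuit simulation: the `saggy_amp` function, its constants, and the idea of tracking a single "load" number are all invented for this sketch. The only point is that the same note comes out differently depending on what was played just before it.

```python
import math

def saggy_amp(samples, sag=0.5, smoothing=0.002):
    """Toy amp whose headroom shrinks as its recent output level rises (invented constants)."""
    load = 0.0                                 # running estimate of how hard the supply is working
    out = []
    for x in samples:
        headroom = 1.0 - sag * load            # the rails sag under load
        y = headroom * math.tanh(x / headroom) # clip against the current rails
        out.append(y)
        load += smoothing * (abs(y) - load)    # load builds up and recovers slowly
    return out

PERIOD = 100  # samples per cycle (arbitrary)

def tone(level, cycles):
    return [level * math.sin(2 * math.pi * n / PERIOD) for n in range(cycles * PERIOD)]

note = tone(0.8, 1)          # one cycle of a moderately loud note
hard_chord = tone(5.0, 30)   # a long, heavily driven chord

fresh = saggy_amp(note)                            # the note played into a rested amp
after = saggy_amp(hard_chord + note)[-len(note):]  # the same note right after the chord

peak_fresh = max(abs(v) for v in fresh)
peak_after = max(abs(v) for v in after)
print(f"peak on a rested amp: {peak_fresh:.3f}")
print(f"peak after the chord: {peak_after:.3f}")
```

The input note is identical in both cases, but the output peak is lower after the chord because the toy supply hasn't recovered yet.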

Tube amplifiers also saturate — their gain changes based on the amplitude of the signal right now. That's not linear.

An IR is taken by firing a single, short, clean test signal. That doesn't stress the power supply. It doesn't push tubes into saturation. You're measuring the amp in its most idle, linear state — and then assuming it always behaves that way. For the cabinet portion, that assumption holds. For the amplifier itself, it doesn't.

This is the core of why IRs have limits. It's not a resolution problem or a matter of making a better measurement. The format itself can't represent what's happening inside an amplifier when you're playing.
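
A toy experiment in Python shows the mismatch. Everything here is an assumption made for the sketch: `toy_amp` is an invented model (a `tanh` gain stage plus a two-tap filter, with no sag), and the IR is "measured" by sending it a tiny click. The IR then predicts quiet playing well and loud playing badly.

```python
import math

def toy_amp(samples, drive=3.0):
    """Invented amp model: a tanh gain stage followed by a two-tap smoothing filter."""
    stage = [math.tanh(drive * x) for x in samples]
    return [0.7 * s + 0.3 * (stage[n - 1] if n > 0 else 0.0) for n, s in enumerate(stage)]

def convolve(signal, ir):
    """Direct convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

# Measure the IR with a tiny click, then normalize by the click's size.
CLICK = 0.01
ir = [y / CLICK for y in toy_amp([CLICK, 0.0])]

def peak_error(level):
    """Largest gap between the toy amp's real output and the IR's prediction."""
    test_tone = [level * math.sin(2 * math.pi * n / 50) for n in range(200)]
    real = toy_amp(test_tone)
    predicted = convolve(test_tone, ir)[:len(test_tone)]
    return max(abs(r - p) for r, p in zip(real, predicted))

quiet_error = peak_error(0.01)
loud_error = peak_error(1.0)
print(f"quiet input (0.01): error {quiet_error:.6f}")
print(f"loud input  (1.0):  error {loud_error:.6f}")
```

At the quiet level the IR's prediction is nearly indistinguishable from the toy amp. At full level the IR predicts a clean signal roughly three times larger than what the saturating stage can actually produce.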

Think of it like this: an IR is a photograph. A Clone is a movie.

A spectrum of linearity

It helps to think about different gear on a spectrum from most to least linear. Click through the positions below.

IRs lose accuracy as you move toward the nonlinear end of that spectrum, not because they're badly made, but because the systems they're trying to capture are fundamentally nonlinear.

Pedals make it obvious

A distortion pedal is deeply nonlinear. Roll back your guitar's volume and the character changes — not just quieter, but genuinely different. The pedal behaves differently at every input level, which means there's no single "response" to capture.

You won't find a useful distortion pedal IR, and now you understand exactly why. The format doesn't apply.
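
The level dependence is easy to see in a sketch. The `toy_pedal` curve below is an invented stand-in (a plain `tanh` with an arbitrary gain), not a model of any real pedal, and the three input levels are my rough analogy for rolling the guitar's volume knob.

```python
import math

N = 480  # samples in one cycle of the test sine

def toy_pedal(x, gain=5.0):
    """Invented distortion curve; a stand-in, not a model of any real pedal."""
    return math.tanh(gain * x)

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of a one-cycle waveform."""
    re = sum(s * math.cos(2 * math.pi * k * n / N) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * n / N) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / N

def third_to_first(level):
    """Strength of the 3rd harmonic relative to the fundamental at a given input level."""
    out = [toy_pedal(level * math.sin(2 * math.pi * n / N)) for n in range(N)]
    return harmonic_amplitude(out, 3) / harmonic_amplitude(out, 1)

ratios = {level: third_to_first(level) for level in (0.1, 0.5, 1.0)}
for level, ratio in ratios.items():
    print(f"input level {level}: 3rd harmonic is {ratio:.1%} of the fundamental")
```

The balance of harmonics changes with input level, so the output isn't just a louder or quieter copy of one fixed sound. A single impulse response can only describe one of those states.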

What neural capture actually does

Instead of one test impulse, the cloner sends many test signals — sweeps at different frequencies, tones at different amplitudes — and records the device's output for all of them. That collection of input-output pairs is used to train a neural network.

The network doesn't know what tubes are or how gain stages work. It learns, purely from examples, what input produces what output. After training, it can predict the output for inputs it's never seen, because it's learned the underlying pattern well enough to generalize.
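
Here is the learn-from-examples idea at its smallest, in Python. To be clear about what's assumed: this is not Line 6's training process, and the "network" is a single `tanh` unit with two learnable numbers, whose shape happens to match my invented `real_device`. That makes the job far easier than it is with real gear, but the loop has the same outline: record input-output pairs, then adjust the model until its predictions match.

```python
import math

def real_device(x):
    """Stand-in for the gear being captured. The model below never sees this formula."""
    return math.tanh(3.0 * x)

# Step 1: send test signals at many levels and record what comes back.
inputs = [i / 20 for i in range(-20, 21)]
recorded = [real_device(x) for x in inputs]

# Step 2: a tiny model with two learnable numbers, starting from a poor guess.
a, b = 0.5, 1.0

def model(x):
    return a * math.tanh(b * x)

# Step 3: gradient descent, nudging a and b to shrink the prediction error.
LEARNING_RATE = 0.5  # arbitrary choice for this sketch
for step in range(5000):
    grad_a = grad_b = 0.0
    for x, target in zip(inputs, recorded):
        t = math.tanh(b * x)
        error = a * t - target
        grad_a += 2 * error * t                    # d(error^2)/da
        grad_b += 2 * error * a * x * (1 - t * t)  # d(error^2)/db
    a -= LEARNING_RATE * grad_a / len(inputs)
    b -= LEARNING_RATE * grad_b / len(inputs)

# Step 4: try inputs that were never in the training set.
unseen = (0.33, -0.77)
for x in unseen:
    print(f"input {x}: device {real_device(x):.4f}, model {model(x):.4f}")
max_error = max(abs(real_device(x) - model(x)) for x in unseen)
```

The model was only ever shown recordings, yet it ends up predicting the device's output at levels it never trained on. Real capture models have vastly more parameters and also have to learn behavior that depends on what was played earlier, which this toy ignores.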

For Proxy, the training happens on Line 6's servers. The hardware sends captured recordings up; the servers train; a small, efficient model comes back and runs in real time on the device. Training on servers allows more thorough models than the hardware could train on its own.

Putting it together

IRs and Proxy clones aren't competitors. They answer different questions.

An IR asks: what does this system output when you fire an impulse into it? The answer is a static recording: compact, accurate, and fixed. Your IR library captures real cabinets and rooms, and it does that job well.

A neural capture asks: across all possible inputs, how does this system behave? The answer is a learned model. It responds to dynamics. Push it harder and it pushes back the way the original gear would.

A full Proxy clone captures both the nonlinear amp behavior and the acoustic response of the cabinet in one pass. Your IR library covers the linear side of your rig well. Neural capture fills in what the IR format can't reach.

One more thing — if you're on a Stadium, you can upload and download presets with Clones bundled right in on Tonevault. Try it out and let me know what you think :)