Is it possible to use video to modulate audio?


Apologies if this is totally obvious newbie nonsense, but I’m trying to figure out how (or if) it’s possible to send video into a module and use the varying voltages from the video to create CV that modulates an audio source.

I’d like to send a constantly moving video image, playing on an endlessly repeating loop, into a module and use the video to morph a set of audio stems or samples via Eurorack.

So is that possible, and if so, what modules can I use to do it?




The LZX standard is 1V, so you would just want to boost that signal to a more standard, audio-friendly 5V with something like the LZX Bridge module. If you already have a bunch of Eurorack audio modules, you could just scale up the voltage with your existing gear and see if that works for you.
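
For anyone who wants to see the arithmetic, here is a rough sketch in Python (purely illustrative, not tied to any module). It assumes a 0-1V input range and a 0-5V target, which works out to a gain of 5; the function name and default ranges are made up for this example.

```python
def scale_voltage(v_in, in_min=0.0, in_max=1.0, out_min=0.0, out_max=5.0):
    """Linearly map a voltage from one range to another.

    The default ranges (0-1V in, 0-5V out) are assumptions for illustration.
    Inputs outside the input range are clamped first.
    """
    v = min(max(v_in, in_min), in_max)
    gain = (out_max - out_min) / (in_max - in_min)
    return out_min + (v - in_min) * gain


print(scale_voltage(0.5))  # 2.5
```

In hardware this is just an amplifier or an attenuverter with gain, so any module that can multiply a signal by about 5 should do the same job.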


Ahh great thanks for your reply, that’s very helpful.

I’ll give it a go!



I’ve had some quite nice results upscaling the signal from my LZX setup with a Bridge and then running it through a Manhattan Analog CVP to smooth it out a bit with glide, and then modulating Morphagene and Clouds with it. In those patches the audio has also modulated the video.


Nice one, that sounds cool. I had thought it would be nice with Rainmaker or Magneto and Erbe-Verb, etc. I would be trying to create some weirdly evolving floaty stuff to match some images of water.

What LZX systems do you use to get the video signal into the Bridge module? And does the signal have to come from an old-school analogue TV signal? Is there a way to generate a voltage from a purely digital image?




I recommend experimenting with it! :slight_smile:
I meant that I use whatever video signals are in the patch, and use the Bridge to scale the voltage up so that, being hotter, it has some effect in the audio synth. You can use digital sources, but you have to convert the information from digital to analog first.


Keep in mind the different frequencies used in video and audio. A lot of the video information is above the audible range and thus not necessarily useful for modulating audio without a lot of slew or quantization.
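
To make the slew idea concrete, here is a minimal sketch in Python of a one-pole smoother applied to a sampled signal. It is only a digital model of glide, not how any particular module works, and the sample rate and time constant are assumed values chosen for illustration.

```python
import math


def slew(samples, sample_rate, time_constant_s):
    """Smooth a sampled signal with a one-pole lowpass (a simple 'glide').

    sample_rate (Hz) and time_constant_s (seconds) are illustrative
    parameters; a longer time constant gives a slower, smoother output.
    """
    alpha = 1.0 - math.exp(-1.0 / (sample_rate * time_constant_s))
    y = samples[0] if samples else 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)  # move a small step toward the input
        out.append(y)
    return out


# A signal flipping between 0 and 1 on every sample (far too fast to be
# useful as CV) settles near its average once smoothed.
fast = [0.0, 1.0] * 10000
smoothed = slew(fast, sample_rate=48000, time_constant_s=0.05)
print(round(smoothed[-1], 2))
```

The fast detail is thrown away and only the slow average movement survives, which is the part that tends to be usable as modulation.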

Possibly helpful info: Using audio modules in a video synth system

Probably more interesting/useful to modulate video with audio than the reverse. Or modulate both with the same LFO.


That is true. I mentioned the Morphagene because it reacts to very subtle CV, and it has a trig out for “end of gene” plus an envelope follower output, which makes it quite possible to have the synths talk to each other. I’m compelled by the idea of the visuals affecting the audio and vice versa and will definitely investigate ways to make progress in that department.


I’ve used LZX Sensory Translator with some good success. It boosts the signal (no need for Bridge), and it isolates certain frequency bands for outputs or envelopes – effectively teasing apart your video signal into different bands.


You’ll want either Visual Cortex, Vidiot, or TBC2 to translate video into voltage.
If you’re handy, you can build Cadet III to do the job. A version of Cadet V will scale the voltage into the 5V range.


Hey everybody,
First post, and not sure this is the right place for it, but my end goal is to “sonify” some videos I have so this seems like the best place to post it…

Is it possible to use the Vidiot to “extract” the R,G,B, and/or luminance values from an incoming video signal so I can send them to individual CV outs?

This is the main reason that I got my Vidiot, so I really hope there is some way to make this happen…

Hope you are all holding up okay in these crazy times,


If you scale the 0-1V signal to audio (Eurorack?) levels, you can use it.
But most video signals are high frequency, so you might want to pitch-shift it, or put it through a divider, to make it audible.
To use it as a CV signal, you need even slower rates.
Maybe with the Sensory Translator, as mentioned above?
And there is the sync signal embedded in it, which will be audible too.
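
As a rough illustration of what a divider does, here is a small Python sketch of a flip-flop style divider working on a sampled signal. The function name, the threshold and the `toggle_every` parameter are all assumptions made up for this example; a hardware clock divider does the same thing with logic chips.

```python
def divide_frequency(samples, toggle_every=1, threshold=0.5):
    """Square-wave frequency divider.

    The output flips after every `toggle_every` rising edges of the input,
    so the output frequency is input_frequency / (2 * toggle_every).
    """
    out = []
    state = 0.0
    edges = 0
    prev_high = False
    for x in samples:
        high = x > threshold
        if high and not prev_high:  # rising edge detected
            edges += 1
            if edges == toggle_every:
                state = 1.0 - state
                edges = 0
        prev_high = high
        out.append(state)
    return out


# The input repeats every 2 samples, the output every 4: divided by 2.
print(divide_frequency([0, 1] * 4))
# [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
```

Chaining or widening the division gets you down into a comfortable range: an NTSC horizontal line rate of roughly 15.7 kHz divided by 64 lands near 246 Hz.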

Try it out, see where you can get to?

Most people do it the other way around: use the audio to get the video moving.
There is also something to be said about 1:1 translations from video to audio / audio to video.
While there are a lot of possibilities to explore, they will mostly be in very abstract fields. You could also look at indirect video/audio translations, which can give you more room (less borders) to play with.

Hey, thanks so much for the quick reply…
I guess my problem is I don’t understand where in the Vidiot the RGB/luminance values show up…
In other words, which jack would I need to patch out of in order to access those values…
As you can probably tell, I am quite new to this and I don’t really know what I’m doing yet…
I think I’ve got a handle on scaling out to Eurorack, but I just can’t find how to patch it up to allow the incoming video signal to drive some kind of output…

I know most folks go from audio to video, there’s tons of solutions for that in both hardware and software, unfortunately I’m kinda perverse, and really hoping to find a way to go the other way…

I’m good with abstract, the videos I am hoping to use are ones I generated algorithmically, and I want to find a way to have them create their own soundtracks, reactive to the content of the video itself…

What do you mean by “indirect a/v translations”? That sounds really interesting, and less borders might be just what I’m after…

Sorry for so many questions, I’m a little in over my head, but that’s usually where the fun is… :wink:

Luminance, yes. This is what comes out of the output on the Vidiot front panel with a camera icon over it. RGB, no, because Vidiot discards the chrominance subcarrier from the camera input. This is actually an important reason why composite video uses this signal format: it allowed black & white TVs to still show a signal from a color broadcast, because they didn’t have any circuitry to decode the high-frequency subcarrier, and neither does the human visual system. However, if you patch this signal elsewhere in the Vidiot or otherwise use the Vidiot to colorize the camera input, you can get the RGB channels for the colorized output from the 3.5mm RGB output jacks on the back of the Vidiot.

I think using video signals as an audio source is a really interesting idea and I have played with it some with my Eurorack system. I find mostly that video signals will have some strong low-frequency component (harmonics of the sync signal, I’m guessing) and often some very high-frequency content. Generally vertical variation is more in the audio range and you can work with this more easily, but it is fun to try lots of stuff. Much of the time these signals are useful as drones or percussion, since video-rate oscillators aren’t designed to track V/octave (as far as I know?). Having a lot of LFOs available will help to keep modulating the other stuff in your video patch; in some ways animation-rate modulation can help to glue audio and video parts of your patch together. Camera / external inputs with natural images will of course have rather different characteristics from synthesized images, which can be interesting but harder to tame.

This video demos Visionary modules by letting you see and hear the waveform simultaneously and watching it was perhaps the exact moment where I decided to plunge into video synthesis.

I’m also very interested in this topic. One thing that’s a bit difficult about working with a video signal or an audio signal is that it’s kind of one-dimensional, and if you want to react to it in “real time” to translate one to the other, you basically have a single value to work with at any moment in time, plus whatever memory you have to work with. In the audio regime we have loads of delays and loopers and such to act as memory, but memory is much more difficult / expensive to build and manipulate with high-frequency signals like analog video feeds.

For analog delay media it is generally difficult to achieve very long delay times even for audio, and the higher bandwidth required for video means that for instance tapes need to run at much higher speeds (e.g. VHS or PixelVision), which means less total delay time and more likelihood of breaking tapes, etc. But I think, for instance, recording video to tape and then playing it back at a much lower speed seems like an interesting way to derive an audio signal from video. You can of course run video signals through audio-rate effects and often have interesting things happen, but this will discard most of the horizontal information in an analog signal.

As far as digital delays, there is the Memory Palace! Which is rather special because its memory is frame-level – it captures the continuous video signal into its memory of all the “pixels” in an entire image, and can keep and manipulate 60 past frames. Digitized video information gives you some more dimensions to work with that could be used to resynthesize this information into audio via some software techniques. A full color patchable video signal flow (RGB or HSV) is of course more than 1-dimensional, so this seems like another interesting topic to explore – deriving audio signals from the interactions between multiple analog video signals.
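
On the software side, one very simple way to turn digitized frames into slow control signals is to average each frame down to a handful of numbers. The Python sketch below is only an illustration under stated assumptions: the frame format (rows of 0-255 RGB tuples), the function name and the 0-5V target range are made up for the example, and it has nothing to do with how Memory Palace or Vidiot work internally. Decoding the video file and getting the numbers out of the computer as actual voltages are separate problems not covered here.

```python
def frame_to_cv(frame, v_max=5.0):
    """Reduce one video frame to four control values.

    frame: list of rows, each row a list of (r, g, b) tuples with 0-255
    values (an assumed format for this sketch).
    Returns the mean R, G, B and luma of the frame, each scaled to 0..v_max.
    """
    totals = [0, 0, 0]
    n = 0
    for row in frame:
        for r, g, b in row:
            totals[0] += r
            totals[1] += g
            totals[2] += b
            n += 1
    if n == 0:
        return {"r": 0.0, "g": 0.0, "b": 0.0, "luma": 0.0}
    r, g, b = (t / n / 255.0 for t in totals)
    luma = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma weights
    return {"r": r * v_max, "g": g * v_max, "b": b * v_max, "luma": luma * v_max}


# A tiny 2x2 all-red frame: full red channel, no green or blue.
red_frame = [[(255, 0, 0), (255, 0, 0)], [(255, 0, 0), (255, 0, 0)]]
print(frame_to_cv(red_frame))
```

Calling this once per frame gives four values that update at the frame rate, which is already slow enough to behave like CV. Averaging over regions of the image instead of the whole frame would give more independent signals to play with.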

This sounds extremely cool and I can’t wait to see what you come up with!

Is it possible to use video to modulate audio? Absolutely! The 3.5mm video output jacks on the back of Vidiot (and the one with the camera icon on the front) are sending out voltages that you can send to any CV input on your audio synths/modules. As noted above, the range of this voltage output is only 0V-1V, but that can be scaled, if necessary. Have fun!