Arden Butterfield releases MAIM (MAIM Ain't an Implementation of Mp3), an open source effect plugin that is both a work of art and a nostalgic tribute to the digital vibe of the 1990s.
Sudara (from Melatonin.dev) speaks with them a couple weeks after release.
I was browsing Hacker News and happened to see something about an MP3 plugin — and it was a GitHub link. I was like, "What!? It's gotta be JUCE!" I clicked through and was very impressed and thought "Where did this person come from?" So I'm happy to have this excuse to interview you!
As you know, I've got Pamplejuce at the core of it! That GitHub Action script was really a game changer because I'm trying to get it to build across platforms. I've got 2 ancient MP3 libraries that I'm constantly hacking in, including LAME, which has a really convoluted suite of Makefiles. Yeah, having an automated way of seeing "oh, I changed a thing and it crashed on Windows" — absolute game changer there. So thank you very much for that.
You're welcome! How did you get started with all of this?
I was in my undergraduate, studying computer science and got into glitch art. I was doing a bunch of projects with that. What set me off down that route was this blog post by Aria Beingessner, where essentially she went into the Firefox rendering code and glitched it up, made all sorts of weird rendering-bug-colorful-disaster things.
She described her philosophy about it in a really interesting way, as Right Value, Wrong Place. So thinking about glitch, not as "we're injecting noise into this system" but "we're taking a value that's supposed to be there and sending it to the wrong part of the code." Or we're flipping it or distorting it in some way.
And I love that! It really encapsulates that ethos of glitch and seeing the structures behind code in a new way. So my first instinct was, well, how can we do that with audio? There's really no audio process that's as ubiquitous and complicated and mangling-of-the-sound as MP3 compression.
So I really wanted to do something like that with MP3 compression, but then going into the code and taking some of those right values and putting them to the wrong place.
So, is that how you would describe MAIM?
[Laughs] We're going all out of order! First maybe I should talk about what the hell my plugin even is!
I'll edit it to make it look great, don't worry! [laughs].
Yeah, so my plugin is a distortion effect for digital distortion. It's got a bitrate knob that you can turn down, but it's not a bit crusher.
It uses MP3 encoders at its core to give the sound a really heavy MP3 distortion, but with a bit of a twist. One of the twists is there are two MP3 encoders inside of there. Which is actually really important because the MP3 codec standard leaves some leeway to the makers of the encoders.
How MP3 works, speaking very broadly, is it removes parts of the frequency spectrum that are deemed inaudible or less audible or less necessary to understand what the sound is. And exactly which parts to remove is left up to the designers of the encoder. There are some standards, but over the development of MP3s, the game has ramped up a lot. The really early encoders, like the one I included, the Blade encoder from 1999, were pretty rough.
I keep saying it was pretty rough, but I'm worried that the guy who made the Blade encoder is going to find us and be like "Come on man, I was doing my best!" And he was! And it's very good code.
It sounds great!
I love the way that it sounds. Tord Jansson, if you are out there, I want to talk to you. I tried to find some contact information but I could not, but I really appreciate your work, man!
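As a rough illustration of the "remove the less audible parts of the spectrum" idea described above, here is a toy sketch in Python. It is not how LAME, Blade, or MAIM decide what to discard: the `keep_ratio` threshold is a made-up stand-in for a psychoacoustic model, and the function names are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(n^2)); fine for a tiny demo."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(bins):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(bins)
    return [
        (sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def prune_quiet_bins(samples, keep_ratio=0.1):
    """Zero every frequency bin quieter than keep_ratio times the loudest bin.

    keep_ratio is an assumption standing in for a psychoacoustic model;
    a real encoder chooses what to drop in a far more sophisticated way.
    """
    bins = dft(samples)
    loudest = max(abs(b) for b in bins)
    kept = [b if abs(b) >= keep_ratio * loudest else 0 for b in bins]
    return inverse_dft(kept), sum(1 for b in kept if b != 0)


# A loud 2-cycle sine plus a much quieter 5-cycle sine, over 32 samples.
N = 32
signal = [
    math.sin(2 * math.pi * 2 * t / N) + 0.05 * math.sin(2 * math.pi * 5 * t / N)
    for t in range(N)
]
pruned, bins_kept = prune_quiet_bins(signal)
```

In this example only the two bins belonging to the loud sine survive, so the quiet component disappears from the reconstructed signal.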
So LAME and Blade, they're both real-time safe?
By default, even when you are encoding with a constant bit rate, generally the encoder will still want to do some look ahead. If you've got one frame that has very little information going on, it'll be like, well, let's save some room for the next frames in our MP3. If you allow it to do that, it gives you a lot of latency, because instead of the latency being one frame plus the overlap with the next frame, the encoder is trying to wait and collect all of those frames.
So I had to turn that off — that look ahead, that bit reservoir, as it's called. In that sense, the LAME in my plugin is slightly worse than the LAME that you would use in the wild, because we're not able to use that bit reservoir.
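To make the trade-off concrete, here is a toy model of a bit reservoir in Python. It only sketches the bookkeeping idea (easy frames leave bits behind for harder ones) and is not LAME's actual allocation logic; the function name, the demand numbers, and the cap are all hypothetical, and the model does not simulate latency.

```python
def allocate_bits(demands, frame_budget, reservoir_cap=0):
    """Grant each frame min(demand, frame budget + saved bits).

    demands: bits each frame would ideally like (made-up numbers).
    reservoir_cap: most bits that may be saved up; 0 disables the reservoir.
    """
    saved = 0
    granted = []
    for demand in demands:
        available = frame_budget + saved
        used = min(demand, available)
        granted.append(used)
        # Leftover bits carry forward, up to the reservoir's capacity.
        saved = min(available - used, reservoir_cap)
    return granted


demands = [200, 300, 1500, 400]
no_reservoir = allocate_bits(demands, frame_budget=1000)
with_reservoir = allocate_bits(demands, frame_budget=1000, reservoir_cap=800)
```

Without the reservoir the demanding third frame is capped at the 1000-bit budget, while with it that frame gets the full 1500 bits it asked for. That is the sense in which an encoder with the reservoir switched off is slightly worse.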
So through a combination of calculating the latency, running a lot of tests, and pluginval, it's locked down. It's set and calibrated with the dry/wet. So in terms of "is it real time safe?" — it does have some latency, as any spectral effect will, but it's going to be consistent.
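One common way to keep a dry/wet mix calibrated against a fixed latency is to delay the dry path by the same number of samples as the effect. The Python sketch below shows that general technique only; it is not taken from MAIM's source, and the class name and the 3-sample latency are assumptions for illustration.

```python
from collections import deque


class AlignedDryWet:
    """Delay the dry signal by the effect's latency before mixing."""

    def __init__(self, latency_samples):
        # Pre-fill with silence so the dry path lags by latency_samples.
        self.dry_delay = deque([0.0] * latency_samples)

    def mix(self, dry_sample, wet_sample, wet_amount):
        self.dry_delay.append(dry_sample)
        delayed_dry = self.dry_delay.popleft()
        return (1.0 - wet_amount) * delayed_dry + wet_amount * wet_sample


# Pretend the "effect" is a pure 3-sample delay (a hypothetical stand-in
# for a codec round trip with a known, constant latency).
dry = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
wet = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]
mixer = AlignedDryWet(latency_samples=3)
mixed = [mixer.mix(d, w, wet_amount=0.5) for d, w in zip(dry, wet)]
```

Because both paths are now late by the same amount, a 50/50 mix lines up sample for sample instead of producing a comb-filtered doubling. This only works when the latency is constant, which is why a consistent figure matters.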
Zooming back out, you said you were a comp sci major and glitch art brought you to audio?
I was towards the end of my degree, looking to do a thesis. I took a class with Professor Jon Bellona. Literally a primer on JUCE, making plugins. He's a professor from the audio tech department. So somehow they let me do a thesis with him even though I was a computer science major, which was really exciting. I learned a lot working with him.
For that [course], I didn't make this plugin, I made two other ones, Empy and Fish. Fish was my first attempt at putting an MP3 encoder into a plugin. It was a very silly plugin. There's a little spinning fish and you turn up the knob and it spun faster and the sound got worse. Empy was kind of my attempt at recreating the process of MP3 encoding from scratch, a bit similar to Goodhertz Lossy (although theirs sounded much better).
So MAIM is your third MP3 plugin?
Yeah, MAIM is the third iteration. The child of those two. And definitely the one I feel most proud of. After I graduated I just kept going. I felt like I hadn't gotten to that Right Value, Wrong Place.
So with MAIM, the second secret is that there are all these spots where I would go into the encoder, go through the code and find places where I'm like "ok, what if I flip the sign on this?" or "what if I misalign this data?", or send it down a different routing.
Basically, it was circuit bending, but with code. Instead of having the circuit board in front of me and touching different spots with little wires, I was literally doing the same thing but in my IDE, which is maybe a bit less romantic, but with less solder exposure.
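Here is a small Python sketch of what "Right Value, Wrong Place" bends like these could look like on a list of per-band values. It is an illustration only: the function names and the `band_values` data are hypothetical, and nothing here is taken from the MAIM, LAME, or Blade source.

```python
def flip_signs(values):
    """Right magnitude, wrong sign."""
    return [-v for v in values]


def misalign(values, offset):
    """Rotate values so each lands `offset` slots away from where it belongs."""
    if not values:
        return []
    offset %= len(values)
    return values[-offset:] + values[:-offset]


# Hypothetical per-band values an encoder might hand from one stage to the next.
band_values = [9.0, 4.0, 1.0, 0.5]
flipped = flip_signs(band_values)
shifted = misalign(band_values, 1)
```

Every number in `flipped` and `shifted` is one the encoder really computed, so no noise is injected. The values simply arrive with the wrong sign or in the wrong band.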
What made you post to Hacker News? Why did you think they'd be into it, or what was the strategy there?
You know, I scroll them on my phone when I'm procrastinating getting out of bed in the morning sometimes. I had seen several threads about audio and MP3 stuff there. I'd been surprised that they'd been so well received. It seemed like kind of a niche topic but there was a lot of interest. Which made me excited because, you know, I was working on this thing that I thought nobody would really care about.
I was really blown away by all of the responses. You were on there. I think there was someone from the Goodhertz Lossy team as well.
Yeah, Rob showed up to plug his new pedal!
I felt very bad that they had released this guitar pedal, and then like a week later, I'm like, oh, that $400 guitar pedal? I just released something for $0, you should check it out! It does a suspiciously similar thing. Though of course both of us had been working for a long time separately.
Someone provided that Brian Eno quote... ["Whatever you now find weird, ugly, uncomfortable and nasty about a new medium will surely become its signature."]
Yeah. The classic one. You can't talk about MP3 compression without the Brian Eno quote. It's literally mandatory. It's in the handbook. I think he wrote it when he had appendicitis (not that that's relevant).
Someone brought up another interesting thing. There's that nostalgia that's maybe now coming in with MP3, but there's also a distinction to be made between MP3 distortion and vinyl and tape distortion, in that MP3 is more subtractive whereas the others are more additive. I'm not so sure I agree with that. I think any sort of distortion is subtractive in a way...
Exactly. "Analog Warmth" might be saturation but really it's roll off... All you gotta do is roll off your highs and it sounds more "analog!"
Which is what LAME does!
The secret that LAME figured out was that when you really gotta compress, you just roll off the highs, all the way down to 8k. Maybe even lower, like 4k. If you think about the spectrum, there's so much information in those upper bands that really we don't hear that much, just a few upper harmonics of our vowels and the sibilance and the hi-hats. But that's taking up like half, three quarters of the space.
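The "half, three quarters of the space" estimate can be sanity-checked with a little arithmetic. The Python sketch below assumes a 44.1 kHz sample rate (an assumption, not stated in the interview) and counts bandwidth on a linear frequency axis; the function name is hypothetical.

```python
def fraction_above(cutoff_hz, sample_rate_hz=44100):
    """Share of the 0..Nyquist band, on a linear axis, above cutoff_hz."""
    nyquist = sample_rate_hz / 2
    return max(0.0, (nyquist - cutoff_hz) / nyquist)


share_8k = fraction_above(8000)  # roughly 0.64
share_4k = fraction_above(4000)  # roughly 0.82
```

Under those assumptions, everything above 8 kHz is about 64% of the linear spectrum and everything above 4 kHz is about 82%, which is in the same ballpark as the figure quoted above.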
I just love MP3, the dreamlike removal of parts of the sound and that slithering... It's like nothing else. It's not a sound that you can hear in the world, really. It's only a sound that you can hear digitally.
What was the motivation for releasing it as open source? Was it in part a licensing concern, or just wanting to get your work out there?
It's a combination. It is absolutely the licensing because LAME and Blade are both under GNU licenses. I really appreciate open source software and it's been a really big part of what's allowed me to grow as a programmer and the things that I use. I've learned a lot from looking at this audio code and from using this audio code. I wanted to contribute to that and kind of pay it forward.
And then also, I didn't desperately need the money from people buying it. I wasn't sure if anyone would buy it. And I didn't want to be a sales guy, have to troubleshoot people being like oh it doesn't work, can I get my money back, do you have a free trial, etc...
I gotta say, for a piece of solo open source, you did such a fantastic job on every aspect of it. I went there, downloaded a release, installed it, it just worked. It took me like 20 seconds and I was playing with it. I want to talk about the UI by the way.
I love working with JUCE GUI stuff. Maybe that's surprising. I enjoy setting up the components and that whole tree and parent child stuff makes sense to me.
Would you say that you spent a lot of dev time on the UI? I guess you were deep in MP3 encoding algorithms, so that was the majority?
There were definitely a few months of banging my head against the wall with the MP3 encoding algorithms. The UI... I spent some time with it as well, maybe it was 2/3 in the audio, 1/3 in the GUI.
With designing it, a lot of the skin I lifted over from Empy. I really wanted that clunky Y2K software look. I was looking at lots and lots of MP3 players. It'd be cool to get some of those skeuomorphic space buttons in it but I don't have the digital art skills...
I really like the UI. Those 1990s bezels—
Oh yeah, I needed the bezels.
Did you just have to go back to JUCE's LookAndFeel V1 to get those out of the box?
One other UI question I had for you: I looked at the code. I think the code is nice. It's clean, I could understand what's going on. I saw that you're using JUCE Paths for the visualizers (like for the spectral visualization). Did those just work for you out of the box or did you have to do some tuning to get that happy?
Uh, let me look back at the code really quick and familiarize myself with what I did there...
If you don't remember it being painful, then that's great!
[laughs] Yeah, it worked fine!
[spends a few minutes poking around the codebase] I'm trying to find anything to talk about here, but designing the frontend user interface stuff was really a pleasant experience. I'll look through the code later and have something for you. But off the top of my head, I can't think of anything that was really frustrating.
So what's next? Are you going deeper down the MP3 rabbit hole?
I think I need to go down a different rabbit hole.
There are several plugins where you can draw something and then it in some way processes the sound, whether it's a wavetable or a filter that sweeps across. I feel like there's something more in there. I feel like they're both great, but neither of those approaches to turning a drawing into a sound quite captures it for me.
This all started when I was playing around with Tux Paint (also open source, also ancient, a very Linux drawing program). You know, this is C... I've been turning these old C programs into VSTs... What if! What if! I spent a couple weeks trying to take the Tux Paint GUI and rebuild it in JUCE.
Then got a bit tired of that — what's the concept here? We can make an image, but how are we going to turn it into sound? That's the important part. Lately, I've been rethinking that project. What if it's more of a physical modeling thing where the lines you draw are strings and the shapes that you draw are plates or boxes — could we make something with that? The goal here is for it to be playful and intuitive.
That definitely sounds more interesting than "y-axis is frequency, here's your sketch pad."
Right, I think when we look at an image, we don't scan left to right or right to left. Our eyes move, they move through it, we interact with different points of it.
I really want to find a way of turning an image into sound. Creating sound in a visual way that feels more natural and more expressive. We'll see how this turns out or if it comes to anything.
It's very much still in the brainstorming phase. I'd say "oh, I'm giving too much away" but I'll probably release it under an open source license again — so if anyone wants to steal the idea, lemme know! I can help you steal it from me!
Download MAIM from GitHub.
Have comments on this interview? Ideas for who we should talk to next? Let us know on the JUCE forum.