EM GUIDE special: A human was being used by an AI algorithm to play live music on stage

In this collection of satirical vignettes, Slovak artist Samčo, brat dážďoviek presents a vision of a weird future transformed by generative AI. Through these speculative fragments, Samčo unveils dystopian yet goofy worlds where an algorithmic arms race of censorship and generation births an entirely new musical genre; where real-time prompting replaces the performativity of live music; and where bizarre competitions test musicians’ skill in creating remakes both faithful to and estranged from their originals. In one piece, the boundaries of gender in music dissolve, giving rise to a universal sonic language, while the final vignette serves as a sharp critique of pervasive car-centrism. The text is accompanied by photos documenting the vibrant music scene revolving around Samčo and his friends.

Bleepcore

Censoring bots have run rampant on the internet. It’s impossible to write a simple message to a friend, since almost every word might be flagged as inappropriate. People start self-censoring common, everyday words, because the censoring bots might misinterpret them as referring to sensitive topics. If people don’t self-censor their messages, their reach will likely be reduced, or they will be blocked.

At the same time, another group of bots is learning grammar and vocabulary by reading real discussions on the internet. They quickly learn to mimic this self-censored form of communication. Soon a sort of algorithmic war breaks out between the two kinds of bots – generative bots and censoring bots, the creative versus the restrictive. Generative bots adapt, referring to sensitive topics in ever more cryptic ways. For example, they hide more and more letters of the actual text, replacing them with various placeholder characters. At the same time, unsafe-content detection algorithms become more adept at deciphering these subtler ways of referring to unsafe content.
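
The letter-hiding trick the generative bots converge on might look something like the sketch below – a toy Python illustration of our own, where the function, the ratios and the placeholder set are all invented:

```python
import random

def obfuscate(text: str, hide_ratio: float, placeholders: str = "*#@$") -> str:
    """Hide a fraction of letters behind placeholder characters,
    as the generative bots do to slip past the censoring bots."""
    out = []
    for ch in text:
        if ch.isalpha() and random.random() < hide_ratio:
            out.append(random.choice(placeholders))
        else:
            out.append(ch)
    return "".join(out)

# As the arms race escalates, the hide ratio creeps upward:
print(obfuscate("let us talk about that sensitive topic", 0.3))
print(obfuscate("let us talk about that sensitive topic", 0.7))
```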

The same thing will happen to music. Censoring bots will be able to discern any references to obscenity, drugs or violence, and automatically flag tracks as explicit. Generative music algorithms therefore start to replace such references with bleeping, and it soon becomes a normalised part of their vocabulary. Bleeps will soon outnumber actual music on these tracks, as detection algorithms become more and more effective at finding hidden, cryptic, symbolic references to drug use or gang crime in the collections of various bleeps.

People will soon start to prefer music with a more aesthetic approach to bleeping, and machines will gradually evolve to generate ‘bleepcore’, a genre consisting only of various aesthetic approaches to bleeping sounds. That is, a point of equilibrium between creative machines, restrictive machines and human consumers will have been reached.

Trending in real-time

A live show. The venue is crowded with people thrilled by the expectation of a BIG dance party. It’s a one-man EDM show. This guy went viral on TikTok yesterday, so the bar is set pretty high. The setup consists of one iPad connected to a stereo output.

 

[Image: final_edit_1.jpg]

The house lights dim. The artist walks slowly onto the stage. The audience goes mad. The artist stands next to his iPad and enters the first prompt: “Make some Chicago house music in C sharp mixolydian, 122 BPM, TR-727 kick drum, M1 bassline, Volca Modular lead patches.”

The software analyses the prompt for a moment, and then the music starts playing, true to the prompt. Deep, razor-sharp sounds of the Korg M1 bassline cut through the walls, and a mass of human bodies moves to a 4/4 rhythm.

The artist opens his AI chatbot, the one that can provide reliable sources for all of its statements, and enters a prompt: “I have my house music set being played before an audience of 10,000 people. My original prompt was ‘Chicago house music in C sharp mixolydian, 122 BPM, TR-727 kick drum, M1 bassline, Volca Modular lead patches’. What musical elements can I use to make a solid build-up, so the people go wilder and the energy in the room gradually rises to the maximum?”

The machine processes for a while, then outputs a lengthy answer covering various elements of music: tempo and volume changes, melodic and harmonic tension, low-pass filter sweeps, sampled vocal hooks, drum fills, breakdowns, and much more. The artist is too busy to read such a lengthy text at once, so he puts another prompt into his chatbot: “Could you shorten this advice to one sentence, which I can use as a prompt for my generative AI music model, so it will make the music change over time according to those instructions?” The chatbot quickly comes up with an answer: “Gradually increase tempo, layer percussive elements, add white noise risers and sweeps, build with snare rolls, introduce vocal chops and FX, manipulate the bassline with sweeping EQ, use melodic tension, and create silence or minimalism before a powerful drop.” The artist puts the prompt into his AI music model, and the model starts to behave according to the instructions.

However, the artist didn’t specify the rate at which these musical changes were supposed to take effect, so the changes are rather abrupt. Within a few minutes the tempo increases to 140 BPM, five layers of various percussive sounds are added, synth pads play in tritones to increase harmonic tension, and more and more snare rolls appear, followed by a breakdown. The audience, at first thrilled by heavy drops and tritone chords, now starts feeling confused and exhausted. The artist is a sensitive human being, as all artists are, so he soon becomes aware of this. He promptly calls for help and starts typing: “My audience is overwhelmed by too many changes in the direction of my live music set. How can I make it more consistent, while retaining the attention of the audience?” The chatbot has an answer: keep it all around one motif and develop it slowly.

Presently, the artist arrives at a solution: he makes his chatbot create a series of prompts derived from the original one, but with slow changes in tempo and rhythm, and cleverly placed rolls, fills, chord tensions and huge drops. Then he asks the bot to write a script that feeds those prompts into his AI music model, one new prompt each minute.
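
Such a script could be only a few lines long. Here is a minimal sketch of what the bot might produce – everything in it is an invented illustration, including the `musicgen_live` module and its `MusicModelClient`, which stand in for the story’s fictional AI music model:

```python
import time

# Hypothetical client for the story's AI music model; the module name,
# class and methods are invented for illustration.
from musicgen_live import MusicModelClient

# Prompts derived from the original one, each nudging the set a little further.
prompts = [
    "Chicago house in C sharp mixolydian, 122 BPM, add a closed hi-hat layer",
    "Same groove at 123 BPM, introduce a short snare roll every eight bars",
    "124 BPM, add a white noise riser and a sweeping EQ on the bassline",
    "Hold 124 BPM, build chord tension, then a huge drop back to the main motif",
]

model = MusicModelClient()

for prompt in prompts:
    model.set_prompt(prompt)  # the model transitions to the new instruction
    time.sleep(60)            # hold each prompt for one minute
```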

It works: the energy in the venue starts to rise, the dancefloor fills and the atmosphere peaks. The artist now knows what to do. “Now it’s the peak moment of my show, the atmosphere is hot. I want to give my show a climax, so it will end on a high note. How can I approach that?” The bot replies: “Increase intensity with layered synths, fast drum fills, pitch rises, call-and-response patterns, a brief breakdown, and a final explosive drop, ending with high-energy elements and catchy hooks.”

Well, that didn’t work last time, thinks the artist. It’s time for something truly unexpected, something remarkable that will remain in the audience’s memory for the rest of their lives. He inputs a completely new prompt into his AI music machine: “Swiss folk/dance music with heavy use of schwyzerörgeli accordion and alpenhorns, catchy chorus sung in the dialect of Basel, D flat mixolydian but the chorus modulates into D sharp, 142 BPM, three minutes.” Then he leaves the stage and doesn’t come back for an encore.

The following day the guy goes viral on TikTok all over again.
 

[Image: final_edit_3.jpg]

Imitation nation

Welcome to our talent show. Today we’ll see some aspiring young musicians trying to play classics we all have etched in our memories. The contestants’ goal is to give their own interpretation of popular tunes while staying as close to the original as possible. However, if they get too close to an original, it will be recognised by an automated copyright strike system. The artists therefore need to flavour the originals with their own expression.

Let’s start. The first contestant is Joorut Ĭ from Nuuk, Greenland. He used his self-trained LLM to create an interpretation of The Beatles’ ‘Yesterday’, with the lead vocal by Mick Jagger and the guitar part replaced by a bouzouki. The audience is only mildly thrilled, and he scores 6.5/10. The expert judges give him 7.5, since he didn’t use the original backing track but rebuilt it from scratch, training his model on the sound of 60s tape machines and session string players. Their main concern is Mick Jagger’s laidback intonation: he sounds slightly under the influence, and sets a poor example for young music-makers.

Next up is a brass band from Nebraska called Alabama Swingers. It consists of two cornets, two trumpets, four saxophones, a flugelhorn, three trombones, a euphonium and a tuba. Each instrument is digital and contains a chip capable of generating the instrument’s sound in real time using massive orchestral LLM models, capturing the whole expressive range of the real instrument and making the sound as realistic as possible. The band director controls all the instruments centrally from one computer, and the band has no other human members. This evening, Alabama Swingers play a brass-only arrangement of “Over the Rainbow”, the song made famous by Judy Garland. We can even see the valves of the instruments moving by themselves as they follow the stream of digital instructions. The audience is clearly startled by this novel approach and gives the band an 8.5/10.

However, the professional judges have concerns, as this rendition is far too different from the original song – the automated system detects only 62% similarity. Obviously this is down to the different instrumentation, and the rendition lacks the original’s lead vocal, which has been replaced by a saxophone section. The song thus fails to achieve the goal set by the rules: staying true to the original. The rules state that points will be subtracted for lower similarity to the original song, so Alabama Swingers get only 6/10 from the judges.
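
The show’s scoring rule, as described, is a narrow corridor: the score rises with similarity, but crossing the copyright threshold (98%, as mentioned in the next performance) zeroes everything out. A toy Python model of that rule – entirely our own reconstruction, not anything the show specifies – might look like this:

```python
def automated_score(similarity: float, strike_limit: float = 0.98) -> float:
    """Toy reconstruction of the show's automated scoring:
    similarity at or above the strike limit triggers a copyright
    strike (score 0); below it, the score grows with similarity."""
    if similarity >= strike_limit:
        return 0.0  # automated copyright strike
    return round(10 * similarity / strike_limit, 1)

print(automated_score(0.62))   # Alabama Swingers: ~6.3
print(automated_score(0.976))  # Giuseppe Spumante: ~10.0
print(automated_score(0.99))   # over the limit: 0.0 (copyright strike)
```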

The next contestant, Ilona from Žabljak, Montenegro, has a penchant for 80s Yugoslav synth pop. Her selection is the song Opasne Igre by the Serbian band Beograd. She reconstructs the song using a modular synth that uses hardware switches instead of classic patch cables and can be patched entirely from one computer. She uses deep learning to patch the synth and tweak every parameter until the hardware produces an almost 1:1 replica of the original song, even reproducing the tape noise. The vocal track is simultaneously extracted from the original by voice-separation algorithms. As the algorithms’ accuracy is imperfect and leaves some artefacts, there’s still a small degree of distinction from the original song. The automated system identifies a similarity of 97.3%, which is right under the competition’s limit of 98%. The audience doesn’t seem to know the original tune, and Ilona gets 6.5/10 from them, but the expert judges do appreciate the result, with only minor objections. “That’s a pretty good interpretation, but it sounds more Croatian than Serbian,” observes one of the judges. In the end, she gets 9/10.
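
Because the synth is physical hardware, its parameters can’t be tuned by backpropagation directly; a plausible reading of Ilona’s setup is a black-box search that keeps whichever switch settings bring the hardware’s output closer to the reference recording. A minimal sketch under that assumption – both callbacks are hypothetical stand-ins, not a real synth API:

```python
import random

def patch_search(render, similarity, n_params=64, iters=5000):
    """Black-box hill climbing over synth parameters: mutate the patch,
    keep the mutation if the rendered audio gets closer to the reference."""
    params = [random.random() for _ in range(n_params)]
    best = similarity(render(params))
    for _ in range(iters):
        candidate = [p + random.gauss(0, 0.05) for p in params]
        score = similarity(render(candidate))
        if score > best:
            params, best = candidate, score
    return params, best

# Dummy demo: the "hardware" just returns its parameters, and similarity
# is the negative squared distance to a target patch.
target = [0.5] * 64
params, best = patch_search(lambda p: p,
                            lambda out: -sum((o - t) ** 2 for o, t in zip(out, target)))
```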
 

[Image: final_edit_4.jpg]

The last contestant, Giuseppe Spumante from Trieste, comes with a mega-mix of the greatest Eros Ramazzotti hits. He uses a simple style-transfer model to give it some Italo disco and eurodance flavour. The audience’s response is raw, immediate and emotional: 9.5/10. The judges are also rather impressed. “It makes me feel like riding a Ferrari through the centre of Bibione with the windows down,” claims one of them. The next judge sums it up: “It feels like eating gelato instead of ice cream.” The automated system shows 97.6% similarity, giving it the highest score: 10/10.

The final verdict is clear: Giuseppe Spumante makes it through to the next round. The other three contestants each get a 20% discount voucher for a yearly subscription to the LLM prompt-based music generation model RčgčFJE (all other possible names had already been taken and registered).

Spice Humans

Uzhgorod is a non-binary artist from Lithuania. Their longstanding project is to make androgynous versions of pop classics: they replace all gender references with neutral expressions and modify the voice so it sounds outside the standard male-female spectrum. Uzhgorod considers AI models to be approximations of the human mind that can’t be treated as objects of physical sexuality. It’s therefore possible to make music which conveys relatable sexual expression while preventing the audience from sexualising the author. Removing the gender aspect from popular music would also make the lyrics more universally applicable. To achieve this, Uzhgorod would take boy bands or girl bands, turn them into human bands like “Spice Humans” or “Jonas Siblings”, and produce their own versions of those songs.

The concept achieved a degree of success. These pan-human bands attracted fans, other musicians started to follow the trend, and androgynous pop music was brought back to the forefront. Many fans and artists welcomed the idea of disconnecting a song’s sexual expression from the actual physical artist. However, as the idea became popular and the subculture went mainstream, a demand for live shows arose – and that demanded some performative counterpart to music being generated live on stage by LLM models. That’s when various new performance approaches and sub-genres started to emerge.

Among the first attempts were various mechanical onstage installations controlled directly by the generative software. These installations, while strictly abstract, were often huge and complex, turning the whole stage into an actual physical entity that complemented the music. While this could be quite impressive, and could even evoke strong sexual feelings by mimicking the rhythm of human dancing – adding another layer of sexual expression without it needing to be represented by a human – it remained too artsy and abstract for the general audience.

 

[Image: final_edit_2.jpg]

There was a gradual push towards some physical entity that would represent the music onstage while remaining inherently asexual. The scene split into two big groups, the Neuralists and the Geometrists. The first group focused more on psychological affect, using very abstract but flexible sculptures capable of a wide range of movements that could mimic human dance or sexual moves in uncanny ways – without even implying human origin, but still triggering the right parts of the audience’s brain. As the effect was the result of carefully calculated psychology rather than appearance, the sculpture itself wasn’t subject to sexual objectification.

The second group, the Geometrists, focused on finding entirely asexual symbolic representations of the human body. This was usually done by reconstructing the human figure in some twisted, deliberately simple geometric way – on the assumption that elementary geometry is one of the least fetishizable concepts, while remaining perfectly acceptable and flexible as a form of artistic expression. The figures still retained enough structure to reproduce human movement, and usually served as a canvas for projections; typical visuals included maps, landscape and satellite photography, and natural or urban structures.

The development of the Neuralist approach was driven by advances in neurobotics, which allowed human brains to be connected directly to computers. This tricked human brains with ever crazier hallucinations, making irreversible changes to the psyche. Lots of subculture members went insane, and neurobotics soon became illegal. The Geometrist movement, meanwhile, took a direction of gradual simplification, soon developing into a transcendental aesthetic built on elementary geometric shapes in the vein of Kazimir Malevich. Basic human physical instincts were symbolically represented by cubes, spheres or cones. Given such a high level of abstraction, only a dwindling group of true fans followed the aesthetic.

Soon this visual culture came to its end, and the music was all that remained.

The rise of exhausted noise

Neuroscience had made substantial advances in understanding how emotional responses to music are elicited. Soon listeners could bypass music altogether and directly activate the right centres of the brain to produce the same, but much stronger, emotional response. Instead of relying on various configurations of sound waves, people were now able to induce chills or tears any time they wanted.

Nevertheless, the human brain still worked the same way, and kept forming strong sonic associations with intense emotions – just as it does with music. People who used neural interfaces to stimulate their emotional response mechanisms still heard the actual sounds around them during moments of strong emotional response, and the most common sound an average user of such an interface would hear was city traffic.

Eventually a large portion of the population developed strong emotional associations with the sounds of city traffic. Double-blind tests confirmed the pattern: when exposed to sounds of city traffic, people who regularly used neural interfaces to induce chills or tears showed spontaneous and measurable reactions in their brains, while people not using those interfaces remained indifferent or showed negative emotional reactions.

The sounds of city traffic became the new generally popular approach to music. Whole new genres evolved, such as dance music and pop balladry based on city traffic noises, accenting the ambient and rhythmic qualities of sound over melody.


Written by Samčo, brat dážďoviek

Samčo, brat dážďoviek, is a prominent figure in Czech and Slovak experimental music. His DIY and lo-fi music, videos, lectures, and various texts use a collage-like approach to reinterpret the skewed everyday life, nationalism, and paradoxes of Eastern Europe. Samčo places everyday social events into absurd contexts, creating an idiosyncratic and bizarre commentary on reality.

This article is an EM GUIDE special curated by the editors of the EM GUIDE member magazines and created in response to current trends and issues in the regional and global music industry.

Pictures were modified by Samčo, brat dážďoviek and Peter Gonda using various “inpainting” models on fal.ai.