[Column] Experimental Vocal / Voice Manipulation — History and techniques of vocal experiments

Prologue

Text: mmr | Theme: Experimental vocals that process, generate, and rearrange the voice as a material: the history, techniques, representative artists, and contemporary trends of voice manipulation

In the 21st century, the singing voice has gone from simply conveying melodies to serving as an editable acoustic material. With recording and editing technology, effects, sampling, real-time processing, and generative technology now widely available, the voice has taken on diverse roles: instrument, texture, spatial element, and even data resource. Artists such as Bjork and Imogen Heap, who use their own voices as the main material for their experiments, together with technological breakthroughs such as granular synthesis and AI-based resynthesis, have greatly expanded the possibilities of the voice.


Chapter 1: Historical context of voice manipulation

1-1. Recording technology and tape generation

By the mid-20th century, recording and tape manipulation provided the first major means of manipulating the voice. Techniques such as changing the tape speed, cutting the tape up, and playing it backwards made it possible to deliberately alter the temporal and frequency properties of the voice and derive new musical meanings. In radio works and electronic music experiments, voices were sometimes detached from their linguistic meaning and transformed into sonic textures.
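As a rough digital analogue of these tape techniques, the following minimal Python sketch treats a recording as a list of samples. The function names (`change_speed`, `reverse`), the 8 kHz sample rate, and the sine tone standing in for a voice are all illustrative assumptions, not a reconstruction of any historical tool. As with tape, changing the speed changes pitch and duration together.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate for this sketch

def sine(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Stand-in for a recorded voice: a plain sine tone."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def change_speed(samples, speed):
    """Tape-style varispeed: read the samples `speed` times faster.

    Uses linear interpolation between neighbouring samples.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    out = []
    for i in range(int(len(samples) / speed)):
        pos = i * speed
        i0 = int(pos)
        i1 = min(i0 + 1, len(samples) - 1)
        frac = pos - i0
        out.append((1 - frac) * samples[i0] + frac * samples[i1])
    return out

def reverse(samples):
    """Play the recording backwards."""
    return samples[::-1]

voice = sine(220.0, 0.5)
double_speed = change_speed(voice, 2.0)  # half as long, one octave higher
half_speed = change_speed(voice, 0.5)    # twice as long, one octave lower
backwards = reverse(voice)
```

Because the sketch simply reads through the samples faster or slower, it cannot change pitch without also changing length, which is the same constraint tape imposed.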

1-2. The era of synthesis and analog equipment

From the 1970s onward, vocoders and analog synthesizers became popular, giving rise to crossovers between voices and synthesized sounds. The vocoder applies the spectral envelope of the voice to a separate carrier sound, making it possible to fuse the human voice with electronic sounds. As a result, robotic and mechanical timbres became widely available.
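The principle of imposing the voice's spectral envelope on a carrier can be sketched as a small channel vocoder. This is a simplified illustration under stated assumptions: the four band centers, the filter Q of 4, the smoothing constant, and the test signals are arbitrary choices, and the band-pass filter uses the widely published biquad form rather than any specific hardware design.

```python
import math

def bandpass(samples, center_hz, q, sample_rate):
    """Second-order band-pass filter (biquad with unity gain at the center)."""
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2 * math.cos(w0) / a0, (1 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(samples, smoothing=0.99):
    """Rectify and smooth to follow how loud the band is over time."""
    level = 0.0
    out = []
    for x in samples:
        level = smoothing * level + (1 - smoothing) * abs(x)
        out.append(level)
    return out

def vocode(voice, carrier, sample_rate, bands=(300, 600, 1200, 2400)):
    """Shape each carrier band with the loudness of the same voice band."""
    n = min(len(voice), len(carrier))
    out = [0.0] * n
    for center in bands:
        env = envelope(bandpass(voice[:n], center, 4.0, sample_rate))
        car = bandpass(carrier[:n], center, 4.0, sample_rate)
        for i in range(n):
            out[i] += env[i] * car[i]
    return out

# Hypothetical test signals: a "voice" that is silent, then sounds at 600 Hz,
# and a sawtooth carrier at 110 Hz.
sr = 8000
voice = [0.0] * 2000 + [math.sin(2 * math.pi * 600 * i / sr) for i in range(2000)]
carrier = [2 * ((110 * i / sr) % 1.0) - 1 for i in range(4000)]
robot = vocode(voice, carrier, sr)
```

While the voice is silent the output stays silent, and once the voice sounds, the carrier is heard through the bands where the voice has energy. More bands give a more intelligible result at the cost of more computation.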


Chapter 2: Digitalization and the reorganization of voices

2-1. Popularization of DAW and samplers

The spread of digital audio workstations (DAWs) and hardware and software samplers since the 1990s has made it possible to freely cut and paste voices and to manipulate the time axis, pitch, and tempo independently. Voices were fragmented and rearranged to take on new rhythmic, textural, and melodic functions.

2-2. Formant correction and pitch processing

Advances in techniques for changing pitch while preserving formants (the resonance peaks that characterize vowels) have made it possible to change pitch without sacrificing the naturalness of the voice. This encouraged the spread of effects such as harmonization, chorus generation, and pitch shifting.

2-3. Loops, live sampling, and performance techniques

Performances using loop stations and real-time loopers have popularized a style in which a single singer builds multiple vocal layers on the spot. The result is a new form of vocal expression in which improvisation and arrangement decisions happen live.
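The layering idea behind a looper can be sketched in a few lines: a fixed-length buffer that repeats, onto which each new take is summed. The `Looper` class and its method names are hypothetical and do not mirror any particular loop station.

```python
class Looper:
    """Fixed-length loop buffer that accumulates overdubbed takes."""

    def __init__(self, loop_length):
        self.buffer = [0.0] * loop_length
        self.layers = 0

    def overdub(self, take, start=0):
        """Add a take on top of the existing loop, wrapping at the loop end."""
        n = len(self.buffer)
        for i, sample in enumerate(take):
            self.buffer[(start + i) % n] += sample
        self.layers += 1

    def play(self, num_samples):
        """Read the loop repeatedly for as long as requested."""
        n = len(self.buffer)
        return [self.buffer[i % n] for i in range(num_samples)]

# Toy example with a four-sample loop.
looper = Looper(4)
looper.overdub([1.0, 0.0, 0.0, 0.0])     # first layer
looper.overdub([0.5, 0.5], start=3)      # second layer wraps around the loop end
playback = looper.play(6)                # [1.5, 0.0, 0.0, 0.5, 1.5, 0.0]
```

A performance looper would additionally handle timing, undo, and level control; the sketch shows only the accumulate-and-repeat core.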


Chapter 3: Representative artists and their work practices

3-1. Bjork — Expression that sculpts the voice

Bjork has clearly shown a production approach that treats the voice as the main sound source of a song, not merely as a vehicle for singing. Her works, which combine layered voice samples, non-traditional vocalizations (breathing, whispers, fragmented phrases), and electronic processing, redefine the voice itself as “sound sculpture.”

3-2. Imogen Heap — Real-time control and physicality

Imogen Heap demonstrated that the voice can be played like an instrument through live performance using real-time effect control, gestures, and controllers. By using a harmonizer and self-made signal processing to manipulate the texture and harmonics of the voice during a performance, she made the immediacy and transformability of the voice visible.


Chapter 4: Classification of techniques and acoustic effects

Below, we will summarize the main techniques frequently used in experimental vocals and their acoustic and expressive effects.

4-1. Pitch processing

  • Pitch Shift: Changes only the pitch, creating a different tonality or composite harmony. Extreme changes create a mechanical tone.
  • Pitch Tracking + Harmonizer: Generates multiple harmonies for the input sound, creating a choral effect by one person.
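The arithmetic behind a harmonizer can be illustrated as follows. This sketch assumes equal temperament and assumes the sung pitch has already been detected by a pitch tracker (not implemented here); the function names are illustrative.

```python
def shift_ratio(semitones):
    """Frequency ratio for a shift of the given number of equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

def harmony_frequencies(detected_hz, intervals):
    """Target frequencies for each harmony voice, given intervals in semitones."""
    return [detected_hz * shift_ratio(s) for s in intervals]

# A sung A3 (220 Hz, assumed to come from a pitch tracker) harmonized
# with a major third (+4 semitones) and a perfect fifth (+7 semitones).
voices = harmony_frequencies(220.0, [0, 4, 7])  # about 220.00, 277.18, 329.63 Hz
```

A pitch shifter then has to move each copy of the input by the corresponding ratio; a shift of 12 semitones doubles the frequency, which is why extreme settings quickly sound mechanical.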

4-2. Formant manipulation

Formant manipulation is effective for altering perceived voice qualities such as gender and age, and is used for voice changing and character creation.

4-3. Granular synthesis

Breaking the voice into short sound fragments (grains) and rearranging them over time, or changing their density, stretches or fragments sustained sounds and produces grainy textures. This is particularly useful for turning voices into textures.
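A minimal sketch of granular time-stretching, assuming Hann-windowed grains that are read from the source at one rate and overlap-added to the output at another. The grain size, hop size, and the sine tone standing in for a sustained vowel are arbitrary choices for illustration.

```python
import math

def hann(n):
    """Hann window used to fade each grain in and out."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def granular_stretch(samples, stretch, grain_size=400, hop_out=100):
    """Stretch duration by `stretch` without resampling the grains themselves."""
    window = hann(grain_size)
    out_len = int(len(samples) * stretch)
    out = [0.0] * (out_len + grain_size)
    weight = [0.0] * (out_len + grain_size)
    hop_in = hop_out / stretch  # read position advances more slowly when stretching
    k = 0
    while k * hop_out < out_len:
        read_pos = int(k * hop_in)
        if read_pos + grain_size > len(samples):
            break  # no complete grain left in the source
        write_pos = k * hop_out
        for i in range(grain_size):
            out[write_pos + i] += samples[read_pos + i] * window[i]
            weight[write_pos + i] += window[i]
        k += 1
    # Divide by the summed window weights so overlapping grains keep a steady level.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out[:out_len], weight[:out_len])]

# Stand-in for a sustained vowel: half a second of a 220 Hz tone at 8 kHz.
voice = [math.sin(2 * math.pi * 220 * i / 8000) for i in range(4000)]
stretched = granular_stretch(voice, 2.0)  # twice as long, same pitch region
```

In this simple version the very end of the output stays silent once the source runs out of complete grains. Randomizing the read position or spacing of the grains is one common way to move from plain stretching toward more fragmented textures.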

4-4. Sample & Chop (Slicing)

Cutting out short phrases and consonants and rearranging them creates rhythmic accents and unexpected flows. The technique is used in hip-hop, electronica, and many experimental works.
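Slicing and rearranging can be sketched with plain list operations. For simplicity this example cuts the material into equal-length slices; in practice slices usually follow syllables or consonant onsets and get short fades to avoid clicks. The function names are illustrative.

```python
def chop(samples, num_slices):
    """Cut the material into equal-length slices (leftover samples are dropped)."""
    size = len(samples) // num_slices
    return [samples[i * size:(i + 1) * size] for i in range(num_slices)]

def rearrange(slices, pattern):
    """Build a new phrase by playing slices in the order given by `pattern`."""
    out = []
    for index in pattern:
        out.extend(slices[index])
    return out

# Toy "phrase" of eight samples, cut into four slices and re-ordered.
phrase = [0, 1, 2, 3, 4, 5, 6, 7]
slices = chop(phrase, 4)                      # [[0, 1], [2, 3], [4, 5], [6, 7]]
stutter = rearrange(slices, [2, 0, 0, 3])     # [4, 5, 0, 1, 0, 1, 6, 7]
```

Repeating an index in the pattern, as with slice 0 here, produces the stutter effect typical of chopped vocals.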

4-5. Spatial processing (reverb/delay/convolution)

Adding reverberation, panning, and manipulating perceived distance are important means of creating narrative and emotion, and of placing subtle elements such as whispers and breaths effectively.
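Two of these spatial operations can be sketched briefly: convolution with an impulse response (the basis of convolution reverb) and constant-power panning. The toy data and function names are assumptions for illustration; real impulse responses are recordings of actual spaces and are far longer.

```python
import math

def convolve(dry, impulse_response):
    """Direct convolution: every input sample triggers a scaled copy of the response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, sample in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += sample * h
    return out

def pan(samples, position):
    """Constant-power pan: -1.0 is hard left, 0.0 is center, 1.0 is hard right."""
    angle = (position + 1.0) * math.pi / 4.0
    left_gain, right_gain = math.cos(angle), math.sin(angle)
    return [s * left_gain for s in samples], [s * right_gain for s in samples]

# Hypothetical toy data: a short "whisper" and a response consisting of
# the direct sound plus one echo at half level three samples later.
whisper = [1.0, 0.0, 0.5]
room = [1.0, 0.0, 0.0, 0.5]
wet = convolve(whisper, room)       # [1.0, 0.0, 0.5, 0.5, 0.0, 0.25]
left, right = pan(wet, -0.5)        # place the result left of center
```

Direct convolution is slow for long responses, so practical reverbs usually perform the same operation with FFT-based methods.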

4-6. Noise and nonlinear processing

Distortion, waveshaping, and feedback processing amplify the harshness and aggressiveness of the voice, giving it a physical presence not found in traditional singing voices.
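Waveshaping amounts to passing each sample through a nonlinear curve. The sketch below shows two common curves, a smooth `tanh` saturation and a hard clip; the drive and threshold values, and the sine tone standing in for a voice, are arbitrary assumptions.

```python
import math

def soft_clip(samples, drive=5.0):
    """Smooth saturation: tanh squashes peaks toward +/-1 as drive increases."""
    return [math.tanh(drive * s) for s in samples]

def hard_clip(samples, threshold=0.3):
    """Harsh distortion: everything beyond the threshold is flattened."""
    return [max(-threshold, min(threshold, s)) for s in samples]

# Stand-in for a voice: a 220 Hz tone at 8 kHz with peak level 0.8.
voice = [0.8 * math.sin(2 * math.pi * 220 * i / 8000) for i in range(800)]
saturated = soft_clip(voice)
clipped = hard_clip(voice)
```

Both curves add harmonics that were not present in the input, which is what gives distorted voices their rougher, denser character; the hard clip does so more abruptly.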


Chapter 5: Musical and social meaning

Turning the voice into a material expands expression and at the same time raises new questions about physicality and identity. Manipulating vocal timbre carries social meanings such as expressing gender, building a character, and ensuring anonymity.

5-1. Identity and voice

Voice reflects an individual’s physical characteristics, but it is easily altered by processing techniques. This allows artists to expand their range of self-expression, and prompts listeners to revise their assumptions about voices.

5-2. Performance ethics and generation technology

Reproducing voices and imitating the voices of others using AI raises ethical questions. Issues of consent and attribution, such as whose voice is used and how it is used, need to be discussed in parallel with the development of expressive technology.


6-1. Audio feature extraction and conversion

Machine learning has made it possible to extract the pitch, spectral envelope, timing, and pronunciation characteristics of speech with high precision and transfer them to other voices or synthesized sounds. Style transfer applications are attracting attention as a method of incorporating features of existing singing into new contexts.

6-2. Boundary between synthetic speech and creation

Text-to-speech synthesis (TTS) and singing voice synthesis are redefining the boundaries between composition and vocal performance. While the generated voice becomes a natural part of the song, it complicates questions about the creator's role, copyright, and attribution.


Chapter 7: Production Workflow

Here we outline a basic workflow that can be used in practice to produce experimental vocals. Although equipment and software vary widely, the principles are shared.

  1. Recording — Record various vocalizations (full voice, whisper, breath, voice percussion) in multiple takes.
  2. Pre-production — Select important phrases and determine the direction of processing (harmonic, texture, rhythm).
  3. Edit — Chop samples, adjust length, align timing, and prepare material.
  4. Processing — Apply pitch correction, formant manipulation, granular transformation, EQ, compression, etc.
  5. Spatialization — Create perspective with reverb and delay, and build the stereo field with panning.
  6. Arrangement — Distribute each layer of voice according to the overall context of the song.
  7. Live implementation — Adapt to real-time controls (looper, MIDI controller, gesture interface).

Chapter 8: Educational and research perspectives

Voice manipulation is an area closely connected to acoustics, psychology, and artificial intelligence research. Research on the perception of voice quality, parameterization for voice synthesis, and the physiology of vocalization all contribute to the technical development of experimental vocals.


Voice manipulation technique system

flowchart TD
    A["Voice material"] --> B["Pitch operation"]
    A --> C["Formant operation"]
    A --> D["Granular processing"]
    A --> E["Loop/Layer"]
    B --> F["Harmony generation"]
    C --> G["Voice change"]
    D --> H["Texturing"]

Chronology (simple)

flowchart TD
    A["1930s: Tape operations begin"] --> B["1950s: Expansion of speed-change and editing techniques"]
    B --> C["1970s: Increased use of vocoders"]
    C --> D["1990s: Popularization of DAWs"]
    D --> E["2000s: Comprehensive use of voice as material"]
    E --> F["Bjork's works centered on the voice"]
    E --> G["Imogen Heap's real-time manipulation"]
    H["2010s: Progress in AI voice analysis"] --> I["2020s: Popularization of voice generation models"]

Conclusion

Voice manipulation continues to be an area where technological progress and artistic exploration intersect. While the voice is a symbol of physicality, it is also a material that can be freely transformed by digital technology. The questions of ethics, identity, and creativity that arise along the way will become increasingly important in the future.


Monumental Movement Records
