DOI: 10.1145/3816077 ISSN: 2577-6193

Expanded Voices: Voice Cloning, Latent Spaces, and Embodied Translation

Tomas Andrade Weber, Maria Arnal Dimas, Raquel Barrachina Llobet, Sol Bucalo Mana, Jerónimo Calderón, Fernando Cucchietti, Adrià Espinoza Gómez, Paula Fernández Vergara, David Garcia-Povedano, Alex Gil, Roger Gonzalez March, Othmane Hayoun Mya, Marc Heras Villacampa, Míriam Herrero-Valea, Guillermo Marín, Paula Méndez Gordillo, Thaleia Ntiniakou, Sara Tolosa Alarcón

Expanded Voices is an interactive installation that explores voice cloning through timbre transfer and latent space visualization. Developed through a collaboration between the Data Analytics and Visualization Group at the Barcelona Supercomputing Center and professional singer and artist Maria Arnal, the work allows participants to transform their voices into the voice of a trained vocal model while visually navigating the latent representations underlying the conversion.

While not a translation system in the traditional linguistic sense, the installation reveals translational processes operating across representation and embodiment. Users report that converted voices sound perceptually native across multiple languages, even though the model is trained on a limited set of source languages. Motivated by this observation, we analyze the structure of phonetic latent spaces, showing that speech from different languages organizes into shared patterns anchored in breath and articulation.

By combining artistic practice, data analysis, and visualization, this work positions timbre transfer as a creative form of translation between bodies, languages, and machines, and demonstrates how latent spaces can function as interpretable bridges between human vocal production and computational representation.
