How does Magenta generate new sounds?

One of Magenta's projects, NSynth, uses deep learning models to generate novel sounds by exploring a compressed 'latent space' of audio. This makes it possible to generate sounds that are close to, but not identical to, the original audio.

How are AI-generated art and music evaluated?

Evaluating AI-generated media is a significant challenge. Currently, quality is often assessed through user feedback and observation of how artists integrate and interact with the tools, rather than purely objective metrics.

What is SketchRNN and how is it used?

SketchRNN is a model trained on drawings from the 'Quick, Draw!' game. It can generate new sketches of objects and has been used by artists to sample from, find unusual examples, or explore the raw data.

What has been the impact of LSTMs in AI?

LSTMs (Long Short-Term Memory networks) have been crucial for sequence learning, particularly in speech and language processing. Their success is attributed to their ability to handle long-term dependencies and the availability of faster machines and larger datasets.

Can AI generate complex, long-form creative works like songs or novels?

While current AI can generate short pieces or elements, creating coherent long-form works with nuanced structure and narrative is still a challenge. Magenta aims to develop models capable of handling longer structures in music and art.

What is the future of AI in creative expression?

AI is expected to become an integral part of the creative toolkit, aiding in communication and expression. It will likely automate some tasks, freeing humans to explore more complex and novel creative avenues.

How can someone get involved with Magenta or creative coding?

Interested individuals can visit the Magenta website (g.co/Magenta or magenta.tensorflow.org) to explore open-source code, contribute to issues, and engage in technical and philosophical discussions.

Key Moments

Making Music and Art Through Machine Learning - Doug Eck of Magenta

Y Combinator

Science & Technology | 6 min read | 45 min video

Jul 21, 2017|4,099 views|65|3

YC Y Combinator Magenta Tensorflow Doug Eck Podcast Interview AI Art Music Machine Learning Google

TL;DR

The Magenta project uses machine learning for art and music creation, focusing on building tools for artists and on exploring new creative frontiers.

Key Insights

Magenta aims to empower artists with cutting-edge AI tools, not replace them.

The project explores the 'broken' or unexpected outputs of AI as a source of new artistic expression.

NSynth generates novel sounds by interpolating between existing audio samples in a latent space.

SketchRNN, trained on Quick, Draw! data, allows AI to generate new sketches and aids in artistic exploration.

Evaluating the 'goodness' of AI-generated art and music remains a significant challenge.

Reinforcement learning and GANs are key to moving beyond safe, predictable AI outputs towards more creative results.

The future of AI in art may involve generating complex structures like plotlines or jokes, and enabling new forms of 'creative coding'.

THE PHILOSOPHY OF NEW MEDIUMS

Doug Eck begins by referencing Brian Eno's quote about how perceived flaws in new artistic mediums become their defining characteristics. Applied to Magenta, this suggests embracing and exploring the 'broken,' unexpected, or uncomfortable outputs of machine learning as fertile ground for new art. The goal isn't to create AI artists, but to build tools that enable humans to explore novel forms of creativity, much like early film or guitar distortion. This perspective reframes AI outputs not as failures, but as unique signatures of a new medium.

NSYNTH AND LATENT SPACE EXPLORATION

A core project at Magenta is NSynth, which focuses on generating novel sounds using deep learning. It operates within a 'latent space,' a compressed representation of audio data. By interpolating between points in this space, new sounds can be created that are similar to, but distinct from, the original audio. While currently slow, the ambition is to enable real-time generation and even train models to generate these embeddings, allowing for dynamic exploration of sound possibilities.
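The interpolation idea can be illustrated with a minimal sketch. It assumes two sounds have already been encoded into latent vectors of equal length; the vectors, their dimensionality, and the function name `interpolate` are hypothetical, and the step of decoding a blended vector back into audio (the model's job) is not shown.

```python
from typing import List


def interpolate(z_a: List[float], z_b: List[float], t: float) -> List[float]:
    """Linear blend of two latent vectors; t=0 gives z_a, t=1 gives z_b."""
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


# Hypothetical 4-dimensional embeddings standing in for two encoded sounds.
z_flute = [0.0, 1.0, 2.0, -1.0]
z_bass = [1.0, 0.0, 0.0, 1.0]

# Five evenly spaced points along the path from one sound to the other.
path = [interpolate(z_flute, z_bass, i / 4) for i in range(5)]
```

Each intermediate vector in `path` would correspond to a sound that sits between the two originals. Linear blending is only the simplest choice; other interpolation schemes could be substituted.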

EVOLVING MUSIC SEQUENCE GENERATION

Magenta is also rethinking its music sequence generation capabilities. Moving beyond primitive recurrent neural networks that generate MIDI from MIDI, the project is now focusing on learning from large datasets of performed music. This involves a deeper consideration of expressive timing, dynamics, and polyphony. The aim is to move from simple reference models to generating high-quality, usable musical elements that can genuinely assist human composers and musicians.

THE CHALLENGE OF EVALUATION

A significant hurdle for AI-generated art and music is evaluation: how do we objectively determine what is 'good'? Doug Eck acknowledges this as a central question. Initially, Magenta's outputs weren't deemed good enough for formal evaluation. The ideal scenario involves creating engaging tools or applications that go viral, gathering user feedback to iteratively improve the models. This human feedback loop, similar to collaborative filtering in recommendation systems, is seen as crucial for progress.

ARTISTIC APPLICATIONS: SKETCH RNN AND BEYOND

Beyond music, SketchRNN, a model trained on the Quick, Draw! dataset, demonstrates AI's potential in visual art. It can generate new drawings based on categories learned from user-submitted sketches. Artists are already sampling from this model, using it as a distance measure to find unusual examples, or simply playing with the raw data. While the Quick, Draw! data has limitations because each sketch is drawn in 20 seconds or less, it shows how AI can inspire and be integrated into artistic workflows.

THE ROLE OF MUSICIANS AND CREATIVE CODING

Early adoption has shown that talented musicians and improvisers are getting the most interesting results from Magenta's tools. These artists often engage in a 'call and response' with the AI, using its primitive outputs as a starting point for their own creative endeavors. There's a growing desire for 'creative coding,' where artists can manipulate and extend AI models through code. This involves not just using AI as a black box, but actively coding with and around it to achieve specific artistic goals.

LSTM'S JOURNEY AND DEEP LEARNING'S ASCENSION

The discussion touches upon the history of Long Short-Term Memory (LSTM) networks, a key recurrent neural network architecture. Doug Eck shares his personal experience as one of the few early adopters of LSTMs, highlighting Alex Graves' persistent work in making them practical for sequence learning. The breakthrough for LSTMs and deep learning in general is attributed to increased computational power and memory, allowing these data-absorptive models to become effective with large datasets, particularly in areas like speech and language.

THE QUEST FOR LONGER STRUCTURE AND EXPRESSION

A 'holy grail' for Magenta is the ability to compose long-form pieces of music or art. This requires models capable of understanding and generating complex, nested structures over extended periods, moving beyond short, 20-second segments. Such advancements would not only make AI outputs more engaging but also provide composers with tools to offload tasks like managing expressive timing or complex harmonic progressions, allowing human artists to focus on higher-level creative decisions.

ADVANCEMENTS THROUGH GENERATIVE ADVERSARIAL NETWORKS (GANS)

To overcome the tendency of generative models to produce 'safe' or blurry outputs, Magenta explores techniques like Generative Adversarial Networks (GANs). A GAN pairs a generator with a critic: the critic is trained to tell generated examples from real ones, and the generator is trained to fool the critic, which forces it to produce more convincing outputs. This adversarial process pushes the models beyond merely reproducing data to creating novel and less predictable results, which is essential for true artistic innovation.
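The opposing objectives of the generator and the critic can be sketched as two loss functions. This is a minimal illustration, assuming the critic outputs a probability that an example is real and using the common binary cross-entropy formulation; the function names and the example probabilities are hypothetical, and no actual networks or training loop are shown.

```python
import math
from typing import List


def critic_loss(p_real: List[float], p_fake: List[float]) -> float:
    """Cross-entropy for the critic: real examples should score near 1,
    generated examples near 0."""
    real_term = -sum(math.log(p) for p in p_real) / len(p_real)
    fake_term = -sum(math.log(1.0 - p) for p in p_fake) / len(p_fake)
    return real_term + fake_term


def generator_loss(p_fake: List[float]) -> float:
    """Non-saturating generator loss: low when the critic scores
    generated examples near 1 (that is, when the critic is fooled)."""
    return -sum(math.log(p) for p in p_fake) / len(p_fake)


# Hypothetical critic outputs for two batches of generated examples.
fooled = generator_loss([0.9, 0.8])  # critic mostly fooled
caught = generator_loss([0.1, 0.2])  # critic mostly not fooled
```

The generator's loss falls as the critic is fooled more often, while the critic's loss falls as it separates real from generated examples, so improving one side raises the pressure on the other.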

REINFORCEMENT LEARNING FOR TARGETED CREATIVITY

Reinforcement learning (RL) offers another powerful avenue for directing AI creativity. By defining specific rewards, models can be trained to generate outputs that meet particular criteria, such as adherence to compositional rules or subjective qualities like 'shimmeriness.' This approach allows existing generative models to be 'tilted' towards desired characteristics, enabling artists to guide AI creation without explicitly coding complex rules, thus opening new possibilities for personalized artistic expression.
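One way to picture 'tilting' a model is a reward that adds a rule-based bonus to the log-probability the pretrained model already assigns to a note. The sketch below shows only that reward signal, not a full Deep Q-learning loop; the note distribution, the no-repeat rule, and the names `combined_reward` and `rule_weight` are assumptions made for illustration.

```python
import math
from typing import Dict


def combined_reward(note: int, prev_note: int,
                    model_probs: Dict[int, float],
                    rule_weight: float = 1.0) -> float:
    """Log-probability under the pretrained model plus a weighted rule bonus."""
    log_prob = math.log(model_probs[note])
    # Hypothetical compositional rule: penalize repeating the previous note.
    rule_bonus = -1.0 if note == prev_note else 0.0
    return log_prob + rule_weight * rule_bonus


# Hypothetical next-note distribution from a pretrained melody model (MIDI pitches).
probs = {60: 0.5, 62: 0.3, 64: 0.2}
prev = 60

# With no rule weight, the most probable note wins; with the rule, the choice shifts.
best_plain = max(probs, key=lambda n: combined_reward(n, prev, probs, rule_weight=0.0))
best_tilted = max(probs, key=lambda n: combined_reward(n, prev, probs, rule_weight=1.0))
```

Here `best_plain` is the model's favorite note (60, a repeat), while `best_tilted` moves to 62 once the repetition penalty is applied. Keeping the model's log-probability in the reward is what lets the rule nudge the output without discarding what the model learned from data.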

THE FUTURE OF 'PERFECT' POP AND ARTISTIC EVOLUTION

The conversation considers whether AI could generate the 'perfect' pop song. While acknowledging that predictable music may become easy to generate, the speakers expect human creativity to shift towards new challenges and less predictable forms. Historically, new technologies like the drum machine or distorted guitar didn't eliminate creativity but provided new tools for artists to push boundaries, often by subverting or playing against the technology's inherent characteristics.

THE ROLE OF TOOLS AND CREATIVE CODING ACCESSIBILITY

Magenta aims to be a tool, not a replacement for human artists, and the ease of use for these tools is critical. The project is moving beyond command-line interfaces to more expressive APIs. There's a recognized need for more accessible 'garage band'-like tools that lower the barrier to entry for creative coding, enabling a wider audience to experiment and contribute to the evolving landscape of AI-assisted art and music creation.

ENGAGING WITH THE MAGENTA COMMUNITY

For those interested in learning more or contributing, the primary call to action is to visit the Magenta website (g.co/magenta or magenta.tensorflow.org). The project encourages community involvement through open issues, code installation, and active participation in discussion lists. Both philosophical and technical discussions are welcomed, as Magenta continues its research and works to build a vibrant community around AI and creative expression.

Common Questions

What is Magenta?

Magenta is a Google project that aims to create open-source machine learning tools and models to enhance the creativity of musicians and artists. Its goal is to enable new forms of art and music generation through AI.

Topics

AI & Machine Learning Technology & Innovation Creativity & Media Neural Networks Deep Learning Generative Art Music Generation AI In Art Machine Learning Tools Creative Coding

Mentioned in this video

Software & Apps

Quick, Draw!

A Google game where users have 20 seconds to draw a common object, used as a data source for the SketchRNN model.

Ableton Live

A digital audio workstation (DAW) that Doug Eck's team is considering for integration with Magenta's tools.

DeepDream

A computer vision program created by Google that uses a convolutional neural network to find and enhance patterns in images, used as an analogy for what AI models learn.

SketchRNN

A recurrent neural network model trained on sketches from Quick, Draw! that can generate new drawings.

AI Duet

A web-based application using a simple RNN that allows users to play a melody and have the AI respond, demonstrating a call-and-response interaction.

ImageNet

A large visual database used for visual object recognition research, mentioned as an example of a task where deep neural networks perform well with large datasets.

Hacker News

A social news website focusing on computer science and entrepreneurship, mentioned for its role in discussions about Magenta and its impact.

Generative Adversarial Network

A type of machine learning model that uses two competing neural networks to generate new data, discussed as a way to overcome the limitations of simpler generative models.

Deep Q-Learning

A type of reinforcement learning algorithm used to train LSTMs to follow specific compositional rules, resulting in catchier music.

G.co/Magenta

A short URL for accessing Magenta's website and resources.

LSTM

Long Short-Term Memory, a type of recurrent neural network known for its ability to learn long-term dependencies, discussed in the context of its history and development.

magenta.tensorflow.org

The official website for the Magenta project, providing access to tools, models, and information.

Organizations

Magenta

A Google project focused on creating open-source tools and models to help creative people be more creative using machine learning.

People

Brian Eno

His quote about the characteristics of new media was used to open the discussion on Magenta.

Aphex Twin

An influential electronic musician whose approach to sound design is contrasted with the capabilities of Magenta's tools.

Jürgen Schmidhuber

One of the co-authors of LSTM, who was Doug Eck's advisor.

Alex Graves

A key figure in the development and application of LSTMs, noted for his persistent work in making the technology useful for sequence learning.

Frank Ocean

Mentioned as an artist whose 'poppiest' music is enjoyable, used to illustrate the broad spectrum of pop music and listener preferences.

Ian Goodfellow

Credited with the concept of Generative Adversarial Networks (GANs).

Jesse Engel

A member of the Magenta team who built an Ableton plugin to manipulate note onsets and tails, demonstrating interesting sound design possibilities.

Felix Gers

One of the three individuals in the world initially using LSTM, along with Doug Eck and Alex Graves.

Sepp Hochreiter

Credited as a creator of LSTM (with co-author Jürgen Schmidhuber); his work is discussed in the context of LSTM's development and impact.

Thelonious Monk

A jazz pianist whose unique and expressive timing is used as an example of musicality that sophisticated AI models might one day help composers achieve.

Companies

Y Combinator

The organization that produces the podcast on which Doug Eck is interviewed.

Products

NSynth

A project using deep learning models to generate new sounds, with a focus on exploring a compressed latent space of audio.
