Designing Characters with Deep Learning: Spellbrush (W18) - YC Gaming Tech Talks 2020

Y Combinator
Science & Technology | 5 min read | 11 min video
Dec 7, 2020 | 7,716 views
TL;DR

AI can generate anime characters in under two seconds that are hard to distinguish from human art, but training these models is expensive, costing thousands of dollars per iteration.

Key Insights

1. Spellbrush's AI can generate a character portrait in under two seconds, a task that would take a human illustrator 2 to 15 hours.
2. The company utilizes Generative Adversarial Networks (GANs), comprising a generator and a discriminator, to create art.
3. Publicly available internet images used for training are heavily skewed, with female characters outnumbering males 6:1 and darker skin tones representing less than 3% of illustrations.
4. Spellbrush has invested significant effort to improve the generation of darker skin tones and male characters to enhance representation not reflected in raw internet data.
5. Training a single AI model is expensive, costing $3,000-$4,000 and taking 7-10 days, necessitating a self-built supercomputer for cost efficiency.
6. Spellbrush is building the world's first AI-illustrated game and is actively hiring for various art and AI research roles.

AI generates professional-level anime art in seconds

Spellbrush, a Y Combinator company (W18), is developing deep learning tools for art creation in the gaming industry. Art production is a major cost in game development, often consuming 50-70% of the total budget. Spellbrush uses AI to scale art output without requiring a massive studio expansion. Its AI generates an anime-style character portrait in under two seconds, compared with the 2 to 15 hours a professional illustrator might take, so it can produce hundreds of characters in the time a human needs for one. That speed allows rapid iteration and a breadth of character variety that would be prohibitively slow and expensive with traditional methods. Spellbrush presents the quality as on par with professional human artists: in a quiz, the AI's output was hard to tell apart from the work of popular Twitter artists. The technology could substantially reduce art production costs and timelines for game developers.

How Generative Adversarial Networks (GANs) create art

The core technology behind Spellbrush's character generation is the Generative Adversarial Network (GAN). A GAN consists of two neural networks: a generator and a discriminator. The generator learns to create art, aiming to produce outputs that mimic a given dataset. The discriminator learns to tell real art from the dataset apart from fake art produced by the generator. The two networks are trained in opposition: the generator tries to fool the discriminator, and the discriminator tries to identify fakes accurately. Over millions of training cycles both networks improve: the generator produces increasingly realistic images, while the discriminator gets better at detecting subtle flaws. Crucially, the generator takes random noise as input, a vector drawn from what is called the latent space, and this is what makes its outputs vary. By manipulating this noise, developers can control aspects of the generated image such as character expression, colors, and even artistic style, changes that would normally require significant manual effort from an artist.
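The adversarial loop described above can be illustrated with a deliberately tiny sketch. It is not Spellbrush's architecture: instead of images it works on single numbers, the "dataset" is a made-up Gaussian centred at 4.0, the generator is a linear map `G(z) = a * z + b`, and the discriminator is a one-feature logistic classifier. All names (`train_toy_gan`, `interpolate_latents`) and hyperparameters are assumptions chosen for illustration, and the gradients are written out by hand so the example needs only the standard library.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps=50, batch=64, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a * z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]   # stand-in dataset
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # latent inputs
        fake = [a * z + b for z in noise]

        # Discriminator step: minimise -log D(real) - log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x_r, x_f in zip(real, fake):
            s_r, s_f = sigmoid(w * x_r + c), sigmoid(w * x_f + c)
            grad_w += (s_r - 1.0) * x_r + s_f * x_f
            grad_c += (s_r - 1.0) + s_f
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: minimise -log D(G(z)), i.e. try to fool D.
        grad_a = grad_b = 0.0
        for z in noise:
            s_f = sigmoid(w * (a * z + b) + c)
            grad_a += (s_f - 1.0) * w * z
            grad_b += (s_f - 1.0) * w
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch

    return a, b, w, c


def interpolate_latents(z_start, z_end, n):
    """Evenly spaced latent values from z_start to z_end (requires n >= 2)."""
    return [z_start + (z_end - z_start) * i / (n - 1) for i in range(n)]


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    # Walking through the latent space changes the generator's output smoothly.
    for z in interpolate_latents(-1.0, 1.0, 5):
        print(f"z={z:+.2f} -> G(z)={a * z + b:.3f}")
```

The short run only shows the generator's offset `b` starting to drift toward the real data; GAN training on toy problems like this is known to oscillate, so the sketch makes no claim about convergence. In an image model the latent input is a high-dimensional vector, and moving along particular directions in it is what changes attributes such as expression or color.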

Addressing bias and improving representation in training data

Spellbrush trains its GANs on publicly available images scraped from the internet, initially focusing on the anime aesthetic because of the abundance of available data (around 10 million images). However, the team found significant biases in this data. Female characters outnumber male characters by roughly 6:1, and darker skin tones and people of color make up less than 3% of the illustrations. Because these proportions do not reflect real-world demographics and representation matters, Spellbrush has put considerable effort into mitigating the bias. The team has improved the generation of male characters and made the AI generate darker skin tones more often than they appear in the raw data. This correction aims to offer more diverse and representative character options. It also compensates for a skew at the source: illustrators in some regions may shy away from drawing male characters because they draw lower engagement on social media than female characters.
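The summary does not say how Spellbrush corrects for the skew. One common, generic technique is to reweight the training set so that under-represented groups are sampled more often; the sketch below shows that idea on the 6:1 ratio quoted above. The function name `balanced_weights` and the tiny label list are hypothetical, and this is an assumption about one possible approach, not a description of Spellbrush's method.

```python
import random
from collections import Counter


def balanced_weights(labels):
    """Per-sample weights that give every class the same total sampling mass."""
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[label]) for label in labels]


# Hypothetical miniature dataset with the 6:1 skew quoted in the text.
labels = ["female"] * 6 + ["male"] * 1
weights = balanced_weights(labels)

rng = random.Random(0)
batch = rng.choices(labels, weights=weights, k=1000)  # weighted resampling
male_share = batch.count("male") / len(batch)
print(f"male share after reweighting: {male_share:.2f}")
```

With these weights each class is drawn about half the time regardless of how many examples it has. Heavy oversampling of a very small group repeats the same images often, which is one reason reweighting alone may not be enough in practice.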

The high cost of training AI models

Training deep learning models, especially for complex visual tasks like character generation, is computationally intensive and expensive. Spellbrush found that relying solely on cloud services like AWS could be prohibitively costly. A comparable machine on AWS (p3.16xlarge) costs around $24 per hour on-demand, or about $10 per hour as a spot instance. Since training takes approximately 7 to 10 days, each model training run costs $3,000 to $4,000. This expense led the startup to build its own mini-supercomputer in-house. Housed in a 42U rack, it has over 200 CPU cores, more than 20 high-end GPUs (Titan RTX), 100-gigabit Ethernet, and substantial storage. The total running cost for this cluster is estimated at around 60 cents per hour, a large saving compared with cloud pricing.
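The arithmetic behind these figures can be checked with a few lines. The sketch below assumes a run occupies one machine continuously for the whole 7 to 10 days and uses only the hourly rates quoted above; the function and dictionary names are illustrative.

```python
def training_cost(rate_per_hour, days):
    """Cost of one uninterrupted training run at a flat hourly rate."""
    return rate_per_hour * 24 * days


# Hourly rates in USD as quoted in the text.
RATES = {
    "AWS on-demand": 24.0,
    "AWS spot": 10.0,
    "In-house cluster": 0.60,
}

for name, rate in RATES.items():
    low, high = training_cost(rate, 7), training_cost(rate, 10)
    print(f"{name}: ${low:,.0f} to ${high:,.0f} per run")
```

Under these assumptions a spot-priced run comes to roughly $1,700 to $2,400 and an on-demand run to roughly $4,000 to $5,800, so the quoted $3,000 to $4,000 sits between the two; the summary does not say which pricing mix it reflects. The in-house figure covers running cost only as stated, and the summary does not say whether hardware purchase is included.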

Spellbrush's custom architecture and research directions

To manage their custom-built hardware and streamline AI development, Spellbrush has developed internal tools and workflows. They use a proprietary language called 'NetGen' for quickly describing GAN architectures, which compiles down to low-level TensorFlow operations. These are packaged into Singularity containers and scheduled onto the cluster using Slurm. Standard monitoring tools such as Prometheus, Grafana, and TensorBoard track system performance and model training progress, including loss functions. Beyond character generation, Spellbrush is researching other ways to improve the art pipeline: automated animation, tools that assist with 2D animation workflows (such as Live2D and Spine), and super-resolution techniques for animation. This broad research agenda aims to provide a comprehensive suite of AI-powered tools for game art creation.
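As a rough illustration of the container-plus-scheduler step, the sketch below builds a Slurm batch script that runs a training command inside a Singularity image with GPU access. NetGen is proprietary and is not shown; the image name `gan-train.sif`, the command `python train.py`, the job name, and the resource values are all hypothetical placeholders, and Spellbrush's actual scripts are not described in the source.

```python
def make_sbatch_script(job_name, image_path, train_command,
                       gpus=1, time_limit="10-00:00:00"):
    """Return the text of a Slurm batch script for one containerised run."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --gres=gpu:{gpus}",      # number of GPUs requested per node
        f"#SBATCH --time={time_limit}",    # days-hours:minutes:seconds
        "#SBATCH --output=%x-%j.log",      # %x = job name, %j = job id
        "",
        # --nv exposes the host's NVIDIA driver and GPUs inside the container.
        f"singularity exec --nv {image_path} {train_command}",
    ]
    return "\n".join(lines) + "\n"


script = make_sbatch_script(
    job_name="gan-run",            # hypothetical
    image_path="gan-train.sif",    # hypothetical
    train_command="python train.py",
    gpus=4,
)
print(script)
```

The resulting text would be saved to a file and submitted with `sbatch`. Generating the script from a function keeps every run's resources and container image explicit, which is convenient when many experiments share one cluster.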

Building the future: an AI-illustrated game and hiring needs

Leveraging their advanced AI technology, Spellbrush is currently developing what they claim will be the world's first AI-illustrated game. This ambitious project aims to showcase the full potential of their art generation and manipulation tools within a real-world application. The company is currently a small team of five people but is actively looking to expand. They are seeking to hire a sixth team member, specifically targeting individuals who resonate with their vision. Open roles include a 2D animator and motion designer, a real-time VFX artist, and an AI research intern for the upcoming winter. Interested candidates are encouraged to reach out via email at jobs@spellbrush.com, with the CEO available for further discussion in breakout rooms.

AI Character Design Workflow

Practical takeaways from this episode

Do This

Leverage AI for rapid character generation (under two seconds).
Utilize GANs with a generator and discriminator for training.
Control character output by manipulating latent space noise.
Address dataset bias to improve representation of darker skin tones and male characters.
Build in-house infrastructure for cost-effective model training.
Use custom languages like NetGen and frameworks like TensorFlow to describe GAN architectures.
Monitor training with Prometheus, Grafana, and TensorBoard.

Avoid This

Don't rely solely on human illustration for scaling content creation.
Don't ignore cost implications of cloud-based model training.
Don't overlook representation issues in training datasets.

AI Art Generation Speed Comparison

Data extracted from this episode

Method | Time per Character
AI Tool | Under 2 seconds
Professional Illustrator | 2-15 hours

Cloud vs. In-house GPU Training Costs

Data extracted from this episode

Platform | On-Demand Cost per Hour | Spot Instance Cost per Hour | Training Time per Model | Cost per Model
AWS p3.16xlarge | $24 | $10 (approx.) | 7-10 days | $3,000-$4,000
Spellbrush In-house (DIY Supercomputer) | $0.60 (total running cost) | n/a | 7-10 days | Significantly less than cloud

Common Questions

What is Spellbrush?

Spellbrush is a startup developing deep learning tools specifically for artists. It uses AI, particularly Generative Adversarial Networks (GANs), to create character illustrations rapidly, aiming to help scale art production in the game industry.
