Diffusion Digest
AI VIDEO GAMES, PhotoMaker V2, SD3 UNBANNED | This Week In AI Art 🎨

Cut through the noise, stay informed — new stories every Sunday.

July 28, 2024

…too much (on the visuals)? I personally think it's a bit much, but there's something intriguing about opening a newsletter and being greeted by the Pope wearing Gucci mink. It's unexpected, a bit provocative, and perfectly encapsulates the wild world of AI art we're diving into. I'm curious about your thoughts – leave a comment and let me know if it hits the mark or misses entirely.

Now, let's get into the week's highlights without too much fanfare. We've seen some interesting developments: a practical application of generative AI in gaming with "Horizon: Legend of Clans", Tencent's release of PhotoMaker V2 (your chance to see yourself as a Disney princess, if that's your thing), and for those with an eye on the business side, there's a new fine-tuned LoRA model for SDXL aimed at generating Instagram influencer-type images. It's been a week of creative leaps and practical applications. Let's break it down…

🎮 AI's New Game+: Changing the Rules of Play
🦸‍♂️ Tencent releases PhotoMaker V2
🔀 Stable Video 4D
🔓 SD3 Unbanned from CivitAI
🌐 KLING AI Goes Global
📡 Put This On Your Radar…

🎮 AI's New Game+: Changing the Rules of Play

The intersection of artificial intelligence and video games is a frontier that captivates like no other. The possibilities are mind-boggling: imagine NPCs with true conversational depth, dynamically generated storylines, or even AI-crafted game worlds. Though, let's be honest, I'm not sure I'm ready for an AI-enhanced Radahn who can come up with new ways to mock me after the 5001st defeat.

Take “Horizon: Legend of Clans”, for example. Set for release in summer 2025, it is being developed using AI-generated 2D images and character dialog voiceovers.

*Here’s a link for those interested, but please note that it will take you to YouTube. It’s 2024 and we still can’t embed video in an email newsletter. Crazy.

This innovative approach hints at the future of game development, but it's just the tip of the iceberg. The impact of AI on the gaming industry extends far beyond a single title, reshaping everything from creative processes to workforce dynamics.

The gaming industry is witnessing a significant shift with the integration of AI technologies. According to a recent survey, 49% of game developers reported AI use in their workplace, while 80% expressed ethical concerns about its implementation.

Major players like Activision Blizzard and Electronic Arts (EA) are at the forefront of this AI revolution. Activision has already implemented AI-generated cosmetics in "Call of Duty: Modern Warfare 3," while EA's CEO, Andrew Wilson, has spoken optimistically about AI's potential to create new workforce opportunities.

AI is currently primarily being used for:

First drafts and inspiration for writers
Concept art and textures for artists
Code assistance for programmers (e.g., GitHub Copilot)
Productivity tools for management

These tools are transforming major tasks into minor ones. For instance, creating in-game advertisements or graffiti that previously took days now takes mere hours. However, this efficiency comes at a cost, as evidenced by the ongoing strike of video game voice actors.

The strike, which began on July 26, 2024, involves roughly 2,600 voice actors and motion capture artists, including talents like Troy Baker from The Last of Us and Jennifer Hale from Mass Effect. The actors, represented by SAG-AFTRA, are demanding higher pay, better safety measures, and protections from new generative AI technologies. Their primary concern is securing rights over how their work is used in training AI or creating AI-generated copies.

And unfortunately, the impact on the workforce is being felt beyond just voice actors. Activision Blizzard and Microsoft's Xbox division collectively laid off thousands of employees in early 2024, with 2D artists and concept artists particularly affected. Riot Games, known for "League of Legends," also saw significant layoffs despite earlier assurances about the value of their artists.

The rise of AI in game development has sparked growing interest in unionization among game workers, who seek more say in how AI is implemented. This mirrors successful efforts by Hollywood writers to secure protections from AI use in their contracts. The ongoing voice actors' strike is a clear indication of this trend.

While concerns persist about AI-generated content being "good enough" for cost-cutting measures, some developers argue that AI could enable the creation of larger, more detailed game worlds and potentially lead to more niche games and indie productions as development costs decrease. However, the challenge lies in balancing these possibilities with fair treatment and protection of human talent.

As we stand on the brink of this new era in game development, one thing is clear: AI is reshaping the landscape of video games, bringing both exciting possibilities and significant challenges for the industry and its workforce. The ongoing voice actors' strike serves as a reminder that as we embrace AI's potential, we must also address the concerns of the human talent that has long been the backbone of the gaming industry.

‘Wired’ Source

‘Kotaku’ Source

‘Horizon: Legend of Clans’ Reddit Source

🦸‍♂️ Tencent releases PhotoMaker V2

Ever dreamed of being Iron Man? Or maybe a pirate captain (but with good dental work)? Look no further.

Tencent has released PhotoMaker V2, an updated version of their AI image generation tool with improved ID fidelity (how accurately the AI can maintain the identity of a person when generating new images) while maintaining generation quality and editability.

PhotoMaker V2 offers enhanced control capabilities through its compatibility with plugins and integration scripts for ControlNet, T2I-Adapter, and IP-Adapter. The tool can be accessed via a demo on Hugging Face, with the code available on GitHub.

Since its release, PhotoMaker V2 has garnered attention and feedback from various users…

Generation speed: The tool generates recognizable images without requiring extensive model training.
Result consistency: Effectiveness varies based on input face characteristics. Unique facial features tend to yield more accurate results compared to common ones.
Input type preference: PhotoMaker V2 demonstrates better performance with photographic inputs versus illustrated ones.

Despite these limitations, the community is exploring various applications. One notable use is creating additional images for training custom Stable Diffusion models (LoRAs).

Looking ahead, there's anticipation for improvements such as enhanced ComfyUI support and LoRA training capabilities, which could expand the tool's potential for AI-assisted image generation and manipulation.

Demo Source

Hugging Face Source

Github Source

🔀 Stable Video 4D

Picture this: you're watching your favorite film and suddenly you can peek around corners in the scene. Or maybe you're a doctor examining a medical scan, and you’re able to rotate and explore it from every angle. All from a single image or clip. Sounds like sci-fi, right? Well, we might be closer than you think.

Enter Stable Video 4D.

Stable Video 4D, a new AI model from Stability AI, transforms single-object videos into multiple viewpoints. It generates 5 frames across 8 different angles in about 40 seconds, with full optimization taking 20-25 minutes. Users can specify camera angles for customized output.

Basically, using an input video as a reference, Stable Video 4D allows you to generate different viewpoints of an object. You get to pick which angles you want to see, and the AI creates new video clips from those perspectives, even if they weren't in the original footage.

The model produces detailed, consistent results across frames and views without needing multiple diffusion models. This technology has potential applications in game development, video editing, and virtual reality, where it could enhance realism and immersion.

Possible uses include generating stereoscopic content for VR from 2D inputs and changing camera views for photos after they're taken. Currently in the research phase, Stable Video 4D is available on Hugging Face. Stability AI is working to expand its capabilities to handle a wider range of real-world videos beyond synthetic datasets.

While promising, the technology is still developing. Its full potential and practical applications are yet to be determined as it evolves to work with more diverse and complex real-world inputs.

Project Page Source

Model Page Source

‘Stability.ai’ Source

🔓 SD3 Unbanned from CivitAI

Stable Diffusion 3 (SD3) has been unbanned from CivitAI. The decision comes after Stability AI addressed major concerns with the SD3 license, particularly by no longer considering outputs from SD3 as "Derivative Works." This change allows creators to use SD3 outputs for training or fine-tuning other models without fear of Stability AI claiming rights over those new creations, effectively removing a major barrier to adoption.

However, CivitAI has decided not to purchase an enterprise license for SD3, citing high costs and uncertain demand. This means users won't be able to generate SD3 images directly on the CivitAI platform or use it for on-site fine-tuning.

Is it too late to embrace SD3 despite its limitations and licensing concerns, or should the community pivot towards more open alternatives? Let me know your thoughts in the comments below.

‘Civitai’ Source

🌐 KLING AI Goes Global

Kling AI has finally released its ‘International Version 1.0’. This move marks a significant step towards democratizing access to advanced AI video creation tools, allowing users worldwide to sign up using any email address without the need for mobile verification.

Generation times are what you would expect: reports range from a few minutes to as long as 30 minutes for a 5-second clip. Despite the wait, several users have praised Kling's capabilities, with one noting that it's "leagues beyond" competing tools like Luma.

As the platform continues to evolve, it will be interesting to see how it addresses user feedback and competes in the rapidly advancing field of AI-generated video content (looking at you, SORA 👀).

‘X’ Source.

📡 Put This On Your Radar…

Ultimate Instagram Influencer Pony Lora

This fine-tuned model for SDXL aims to generate Instagram influencer-style images. It's a powerful tool for creating trendy, social media-ready visuals, and it's compatible with popular interfaces like Automatic1111 and ComfyUI.

‘Civitai’ Source

Udio 1.5

Udio's latest version brings crystal-clear 48kHz stereo tracks with improved instrument separation and musicality. New features include key control, stem downloads, and audio-to-audio remixing. Udio 1.5 also boasts better handling of non-English languages, expanding its global reach.

‘Udio’ Source

Intel's AI Playground

AI Playground is an open-source project by Intel that enables AI image creation, stylizing, and chatbot functionality on PCs with Intel Arc GPUs or Core Ultra-H processors. It supports various AI models and provides a user interface for easier interaction. The app is available for download on GitHub, with specific hardware and software requirements. While it offers opportunities for experimentation and community contributions, users should note that it's in beta, has limited hardware compatibility, and requires separate model downloads.

Youtube Source

ComfyUI Video Player

This custom node for ComfyUI introduces video playback functionality to Stable Diffusion workflows. The project provides conversion scripts and example workflows, making it easier for users to get started with this tool. Users should be prepared for potential UI stuttering.

Github Source

IMAGDressing-v1

IMAGDressing-v1 is a virtual dressing tool that works with Stable Diffusion (and is built on top of IDM-VTON) and allows for digital clothes try-ons. It offers rapid customization and flexible plugin compatibility (IP-Adapter, ControlNet, T2I-Adapter, AnimateDiff), including random faces/poses and outfit changing. While it supports various scenarios, some users have reported issues when using personal pictures.

Github Source
