Trump 💗 AI Pics, Procreate Says 'Human Only' | This Week in AI Art 🎭
Cut through the noise, stay informed — new stories every Sunday.
Housekeeping notes - Flux is evolving faster than we can keep up, so I'm giving this newsletter a makeover (again). Major Flux updates get the spotlight, while LoRAs and other goodies move to "Put This On Your Radar."
Next week, I'm interviewing Dori from FineTunersAI about their Flux LoRA training insights (covered below) - comment your questions!
And watch out for email clipping; content might play hide-and-seek.
In this issue:
FLUX UPDATES
Realism Updates
u/Luciferian_lord demonstrated some impressive CCTV-style images generated with Flux dev. Users found that starting prompts with "Low quality cctv camera pov" produces convincing surveillance-style results that capture the characteristic low-resolution, high-contrast look of security camera footage.
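If you want to try the trick outside ComfyUI, here's a minimal sketch using the diffusers FluxPipeline; the sampler settings and the scene after the prefix are my own assumptions, not from the original post.

```python
# A minimal sketch, not the original poster's workflow. Assumes diffusers'
# FluxPipeline and enough VRAM (or CPU offload) to run FLUX.1-dev locally.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use

prompt = (
    "Low quality cctv camera pov, "  # the prefix Reddit users reported
    "man in a hoodie crossing an empty parking garage at night, "
    "harsh overhead lighting, timestamp overlay"  # illustrative scene, not from the post
)
image = pipe(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("cctv_style.png")
```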
The Amateur Photography LoRA for Flux, created by u/Major_Specific_23, enhances the model's ability to produce realistic casual photographs. Recently updated to version 2, the LoRA is available for download here. The update brings an adjusted dataset to reduce ethnic bias, improved attribute tagging, expanded training captions, and better background blur. It works best at weights of 0.5-0.6 for staying close to the base model and 0.8-1.0 for maximum realism. For best results, prompts should begin with "Amateur photography of" and end with "on flickr in 2007, 2005 blog, 2007 blog".
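For reference, here's a rough sketch of applying a LoRA at those weights with diffusers; the file name is a placeholder for the actual version-2 download, and the subject of the prompt is illustrative.

```python
# Sketch only: the LoRA path is a placeholder, and the prompt subject is my own.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
pipe.load_lora_weights("path/to/amateur_photography_v2.safetensors")

prompt = (
    "Amateur photography of friends at a backyard barbecue, "  # recommended opening
    "on flickr in 2007, 2005 blog, 2007 blog"                  # recommended ending
)
# LoRA strength: ~0.5-0.6 stays close to base Flux, 0.8-1.0 leans into realism
# (diffusers also supports pipe.set_adapters() for the same purpose)
image = pipe(prompt, joint_attention_kwargs={"scale": 0.8}).images[0]
image.save("amateur_photo.png")
```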
LoRA Updates
Reddit user u/20yroldentrepreneur shared impressive results from training a Flux LoRA on their own likeness. They used only 15 self-captioned images (10 selfies and 5 full-body shots) and trained for 2,000 steps, which took about 3 hours locally on a 3090 GPU using the AI-TOOLKIT from GitHub. They recommend including full-body shots so the model learns body shape and skin tone. The AI-TOOLKIT is available here, and you can find more details and updates here.
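If you're reproducing this, the sketch below shows the image-plus-caption layout most Flux LoRA trainers (AI-TOOLKIT included) expect, with one .txt file per image; the folder name, trigger word "ohwx", and captions are illustrative, not taken from the post.

```python
# A layout sketch, not the poster's actual dataset: one caption file per image.
from pathlib import Path

dataset = Path("dataset/my_likeness")
dataset.mkdir(parents=True, exist_ok=True)

captions = {
    "selfie_01.jpg": "photo of ohwx man, close-up selfie, natural window light",
    "selfie_02.jpg": "photo of ohwx man, selfie, outdoors, slight smile",
    "fullbody_01.jpg": "photo of ohwx man, full body, standing in a park",
    # ...10 selfies and 5 full-body shots in total, as the post suggests
}

for image_name, caption in captions.items():
    # the caption file shares the image's base name: selfie_01.jpg -> selfie_01.txt
    (dataset / Path(image_name).with_suffix(".txt").name).write_text(
        caption, encoding="utf-8"
    )
```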
CeFurkan reported success training a Flux LoRA on a more modest RTX 3060 with 12GB of VRAM, using Kohya SS GUI at 1024x1024 resolution with LoRA rank 128. The process uses about 9.7GB of VRAM. He notes that training at 512x512 is 2-2.5 times faster but may produce lower quality, and that a lower learning rate of 5e-5 works well with rank 128. For this setup he recommends the bmaltais GUI version with the sd3 flux branch. You can find more details and updates here.
Applied_intelligence shared a quick guide for training a Flux LoRA with only 16GB of VRAM. They got good results from just 10 selfies, downsized to 512px and captioned with Florence base in ComfyUI; training took about an hour for 1,600 steps on a 20GB Nvidia A4500. The guide targets Linux users and includes GitHub links for the necessary scripts and configs, plus a YouTube tutorial in Brazilian Portuguese. It strikes a good balance of quality and accessibility for mid-range GPUs. You can find the full guide and resources here.
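The sketch below is a rough stand-in for that captioning step outside ComfyUI: downsize each image to 512px and caption it with Florence-2 via transformers. The model ID and task prompt follow the public Florence-2 model card; the image path is a placeholder.

```python
# Approximates "Florence base" captioning outside ComfyUI; not the guide's script.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).to(device)

def caption_image(path: str) -> str:
    image = Image.open(path).convert("RGB")
    image.thumbnail((512, 512))  # downsize to 512px, as in the guide
    inputs = processor(text="<CAPTION>", images=image, return_tensors="pt").to(device)
    ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=128,
    )
    raw = processor.batch_decode(ids, skip_special_tokens=False)[0]
    parsed = processor.post_process_generation(
        raw, task="<CAPTION>", image_size=(image.width, image.height)
    )
    return parsed["<CAPTION>"]

print(caption_image("selfie_01.jpg"))  # placeholder path
```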
FinetunersAI, a team specializing in AI model training, has shared valuable insights on training LoRA models for Flux. Their recommendations include:
Dataset quality: Use high-quality images with a minimum resolution of 1024x1024 and a 1:1 aspect ratio.
Dataset size: Beginners should start with 10-20 images.
Training software options:
Local or cloud-based: Kohya SS and AI-Toolkit
Online services: CivitAI, Astria.AI, and Fal.ai
The article also walks through the crucial training parameters, including steps, learning rate, and batch size; for detailed explanations and examples, refer to the full article here. A quick dataset-prep sketch along these lines follows below.
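Here's a minimal sketch of the dataset advice above (center-crop to a 1:1 aspect ratio, resize to 1024x1024) using Pillow; the source and destination folders are placeholders.

```python
# Center-crop every image to a square and resize to 1024x1024 for LoRA training.
from pathlib import Path
from PIL import Image

src, dst = Path("raw_images"), Path("dataset_1024")
dst.mkdir(exist_ok=True)

for path in sorted(list(src.glob("*.jpg")) + list(src.glob("*.png"))):
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left, top = (img.width - side) // 2, (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((1024, 1024), Image.LANCZOS)
    img.save(dst / f"{path.stem}.png")
```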
ControlNet Updates
XLabs has released three new ControlNet models for Flux, specifically for Canny, HED, and Depth. These Version 3 models were trained at 1024x1024 resolution, offer improved quality, and can fit into 12GB of VRAM. The models come with ComfyUI scripts and workflows.
u/eesahe implemented InstantX's union ControlNet for Flux in ComfyUI. This implementation includes modes for canny, depth, pose, and tile. To use it, users need to update ComfyUI to the latest version, download the model file to the controlnet folder, and install ComfyUI-eesahesNodes from ComfyUI manager. The union ControlNet is more robust than XLabs' standalone versions, with 15 transformer layers applied to all 57 Flux layers, compared to XLabs' 2 transformer layers. However, it requires about 7GB of VRAM. Users reported good results, especially with the canny and depth modes. The pose mode had some issues that were being addressed by the developers. The implementation works with Flux fp8 and Q8 models, and some users successfully ran it on 24GB VRAM GPUs. More information can be found here.
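Outside ComfyUI, the same union-ControlNet idea looks roughly like this with diffusers. Treat it as a sketch: the InstantX repo ID, the control_mode index, and the conditioning scale are assumptions, so check the model card for the actual mode mapping.

```python
# Sketch of union ControlNet inference via diffusers, not the ComfyUI nodes above.
import torch
from diffusers import FluxControlNetModel, FluxControlNetPipeline
from diffusers.utils import load_image

controlnet = FluxControlNetModel.from_pretrained(
    "InstantX/FLUX.1-dev-Controlnet-Union", torch_dtype=torch.bfloat16  # assumed repo ID
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

control_image = load_image("canny_edges.png")  # a pre-computed canny edge map
image = pipe(
    prompt="a modern glass cabin in a pine forest at golden hour",
    control_image=control_image,
    control_mode=0,                     # assumed to select canny; varies by model
    controlnet_conditioning_scale=0.6,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("union_canny_result.png")
```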
AI-MAGA: When Deepfakes Meet Deep State
Trump is LOVING AI generated content.
Recent events have thrust AI-generated images into the spotlight of political discourse. Former President Donald Trump shared AI-created images on social media, including one depicting Taylor Swift endorsing him and another showing Vice President Kamala Harris speaking at the Democratic National Convention surrounded by communist imagery. This incident has sparked discussions about the role of AI in politics and its potential impact on public perception.
So, what's the deal with politicians playing around with AI art tools now?
It seems AI image generators have become the latest tool in political messaging. They're being used to create fictional scenarios involving public figures, from fake endorsements to imaginary political rallies. According to The Independent, Trump shared these AI-generated images just ahead of the Democratic National Convention, potentially to undermine Harris. This shift demonstrates how AI tools, typically used for creative purposes, are now being employed in attempts to influence public opinion. As one community member put it, "Trump has literally turned into just another boomer posting terrible GOP AI art."
Lol, Trump posted a collage of AI generated Taylor Swift fans wearing ‘Swifities for Trump’ T-shits, and wrote “I accept!” as if this were real.
I mean…..this is uniquely pathetic, even for Trump.
— Peter Henlein (@SwissWatchGuy)
8:22 PM • Aug 18, 2024
Should we be concerned about the spread of AI-generated misinformation?
There's definitely cause for concern, particularly regarding the ability of some voters to distinguish AI-generated content from reality. The ease of creating and sharing these images has made them a powerful tool for spreading misinformation. Interestingly, this incident occurred shortly after Trump had accused Harris of using AI to "fake" her crowd sizes, calling it election interference.
What does this mean for the future of AI in politics and creative fields?
We might be on the cusp of a significant shift in how AI tools are perceived and regulated. The incident has sparked discussions about potential legal implications of using AI-generated content in political campaigns. As one user suggested, "I actually think spreading false imagery like this knowingly should be illegal/subject to election fraud." For creatives who use these tools, this could mean increased scrutiny and potential regulation of AI-generated content. It also underscores the importance of ethical considerations in AI use, even in creative fields. As these tools evolve, we'll likely see an ongoing battle between those creating AI content and those trying to detect it, potentially reshaping not just political landscapes but also creative industries.
Procreate's Canvas Rebellion: Painting a Future Without AI
Procreate, the popular iPad illustration app, has taken a firm stance against incorporating generative AI into its products. This decision, announced by CEO James Cuda, has been met with widespread praise from digital creatives who are increasingly concerned about the impact of AI on their industry. The move comes at a time when many other creative software companies are rushing to integrate AI tools, often to the dismay of their user base.
What's the scoop on Procreate giving AI the cold shoulder?
Procreate has made a clear commitment to avoid introducing generative AI into their products. CEO James Cuda stated, "We're not going to be introducing any generative AI into our products. I don't like what's happening to the industry, and I don't like what it's doing to artists." This decision stands in stark contrast to competitors like Adobe, who have embraced AI integration. As u/BergaChatting on Reddit pointed out, "a drawing/design app designed and praised for being the best for creatives would have to absolutely hate themselves to incorporate much of that."
Why are digital artists cheering for this decision?
The creative community's positive reaction stems from growing concerns about AI's impact on their profession. Many artists worry that AI models are trained on their work without consent or compensation, and fear that widespread AI adoption could reduce job opportunities. Procreate's stance resonates with these concerns. As stated on their website, "Generative AI is ripping the humanity out of things. Built on a foundation of theft, the technology is steering us toward a barren future." This aligns with the sentiments of many artists, as reflected in u/frotmonkey's comment: "This has been my primary app for years now and I applaud their determination to stay clean of AI."
How might this decision shape the future of creative software?
Procreate's anti-AI pledge could potentially influence the broader creative software industry. It sets a precedent for prioritizing human creativity over AI-generated content, which may pressure other companies to reconsider their AI strategies. As u/rigobueno suggests, Procreate "understand[s] the threat to their brand." This decision might also attract more users who are specifically seeking AI-free tools. However, the long-term implications remain uncertain. As Cuda mentioned, "We don't exactly know where this story's gonna go, or how it ends, but we believe that we're on the right path to supporting human creativity." This approach could lead to a divided market, with some tools embracing AI and others remaining AI-free, giving artists more choice in how they create their work.
Put This On Your Radar
Playbook: Flux ControlNets + 3D scenes web editor
A new integration combines Flux ControlNets with a 3D scene editor, syncing 3D changes to ComfyUI workflows in real time. Cloud rendering is available to bypass VRAM limitations. The workflow uses XLabs AI's Flux ControlNets.
Pony Diffusion V7: Stepping into the Future
PurpleSmartAI has announced plans for Pony Diffusion V7, a significant update to the popular AI image generation model. Planned changes include:
Switching to AuraFlow as the primary base model
Improved captioning using GPT-4-level AI with enhanced character recognition
Updated aesthetic classifier for better style control
Introduction of "Super Artists" for generalized art styles
Expanded dataset now including realistic images
Ongoing work on safety classifiers and character codex
Training is set to begin soon, with the team aiming to enhance both anime and realistic image generation capabilities.
Apple Podcast Interview: From Stable Diffusion to Black Forest Labs
The founders of Black Forest Labs, known for their work on Flux and originally part of the Stable Diffusion team, recently gave an insightful interview on Apple Podcasts.
Robin Rombach, Andreas Blattmann, and Patrick Esser discuss their journey from PhD researchers to Stability AI, and now to launching their own company.
They emphasize the importance of open-weight models in AI development.
The team explains their strategy of releasing Flux before venturing into video models.
Insights are shared about their experiences and learnings from the Stable Diffusion project.
This interview offers a unique perspective on the evolution of AI image generation from some of its key innovators.
Ideogram 2.0: New Contender in AI Image Generation
Ideogram has released version 2.0 of their AI image generation platform.
Claims to outperform Flux Pro and DALL·E 3 in human evaluations
Offers 40 free images per day (public generations, cannot be deleted)
Introduces a new text-to-image API (beta) priced at $0.08 per input
Features upgraded model, new styles, and color palette control
Includes an iOS app for mobile access
Luma AI Upgrades Dream Machine to Version 1.5
Luma AI has released an update to their Dream Machine text-to-video generator.
Dream Machine 1.5 is here 🎉 Now with higher-quality text-to-video, smarter understanding of your prompts, custom text rendering, and improved image-to-video! Level up. lumalabs.ai/dream-machine
#LumaDreamMachine— Luma AI (@LumaLabsAI)
9:04 PM • Aug 19, 2024
Enhanced realism and improved motion tracking
Smarter understanding of prompts, including better handling of non-English inputs
New ability to render text within generated videos (e.g., title sequences, animated logos)
5x speed boost: now generates 5 seconds of high-quality video in 2 minutes
Improved image-to-video capabilities
XLabs-AI Releases Flux Implementation of Deforum Framework
XLabs-AI has released a Flux-based implementation of the Deforum framework.
Enables creation of animations with 3D camera motion control
Supports prompt morphing for smooth transitions between concepts
Provides image-to-image capabilities for iterative animation creation
GitHub repository available for open-source collaboration
Requires careful setup (using virtual environments recommended)
Some users report potential conflicts with existing Stable Diffusion setups
While the release has generated excitement, users are advised to be cautious about integration with existing workflows.
ComfyUI Goes Multiplayer with Nexus Extension
A new extension for ComfyUI, called ComfyUI-Nexus, brings collaborative features to the popular AI image generation interface.
Enables multiple users to work on the same ComfyUI instance simultaneously
Supports up to 15 concurrent users
Includes a spectator mode for observing others' workflows
Utilizes HTTPS and ComfyUI's existing security protocols
Potential applications in education, small businesses, and collaborative projects
Hosted by one user, with others joining remotely
This addition opens up new possibilities for team-based AI image generation and interactive learning experiences.