Diffusion Digest

California AI Bill, Juggernaut XI Launch, FLUX LoRA SHOWCASE | This Week In AI Art 🏛️

Cut through the noise, stay informed. New stories every Sunday.

No, we haven't forgotten about Stable Diffusion. While Flux has been hogging the spotlight, SD is still in the game. Check out the SDXL updates in the 'Put This On Your Radar' section, including the new Juggernaut XI release and Melyn's 3D Render LoRA.

Heads up: your email client may truncate this issue. Scroll to the bottom to catch everything.


FLUX UPDATES

General Updates
  • There is an updated version of Joy Caption, a tool for generating natural language captions for images, including NSFW content. The update, by u/z_3454_pfk, adds batching support and other optimizations, cutting processing time to roughly 2.5s per image on a 3090 GPU at batch size 16. Based on work by u/fpgaminer, Joy Caption uses CLIP for vision and an LLM for language processing. Users can modify the prompt and model in the script, and any LLaMA-based model is compatible. The tool can be integrated into ComfyUI workflows, and some users suggested Florence-2 Large for non-NSFW captions. The updated version is available on Hugging Face; the original implementation is also accessible here for comparison. The corresponding ComfyUI implementation thread can be found here.

  • u/Pyros-SD-Models released an article discussing new insights into training and using the FLUX image generation model titled "FLUX is smarter than you! - and other surprising findings on making the model your own". The post suggests that FLUX, which uses the T5 language model, can understand more complex concepts and semantics than previously thought. Key findings include: using minimal captions (often just a single word) during training can lead to better results; FLUX can interpret abstract prompts semantically; it can learn concepts from very small datasets; and it may be possible to "talk" to FLUX through dataset captions to influence its understanding. The author also notes that FLUX's capabilities blur the lines between text understanding and image creation, potentially allowing for more natural language interfaces in image generation. You can find the article here.
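The Joy Caption batching update above is essentially about amortizing per-call overhead across many images. Here is a minimal sketch of the idea in plain Python; the helper names are hypothetical, not the actual Joy Caption code:

```python
def batched(items, batch_size=16):
    """Yield successive batches of up to `batch_size` items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def caption_all(images, caption_batch, batch_size=16):
    """Run a batched captioning function over all images.

    `caption_batch` stands in for a model call that captions an entire
    batch in one forward pass -- that single pass per 16 images, rather
    than 16 separate passes, is where the reported speedup comes from.
    """
    captions = []
    for batch in batched(images, batch_size):
        captions.extend(caption_batch(batch))
    return captions
```

In the real tool, the batch call would encode all images with CLIP and decode captions with the LLM in one go; the chunking logic itself is this simple.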

Realism Update
  • u/tabula_rasa22 discusses techniques for generating more realistic images using the Flux AI model. The key insight is that Flux tends to produce overly polished images by default, so OP suggests deliberately lowering image quality in prompts to achieve a more natural look. This includes using terms like "potato quality," "compressed," and "low light." Additional techniques include reducing the guidance scale (CFG) to around 2-3 for more natural results. The post also references Flux's advanced semantic understanding, suggesting that minimal captions can lead to better results.
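The two knobs the post recommends, quality-lowering prompt terms and a reduced guidance scale, can be captured in a small helper. This is an illustrative sketch only; the function and constant names are made up, not part of any Flux API:

```python
# Terms the post suggests for a more natural, less polished look.
QUALITY_DEGRADERS = ["potato quality", "compressed", "low light"]

def naturalize_prompt(prompt, terms=QUALITY_DEGRADERS, guidance_scale=2.5):
    """Append quality-lowering terms and clamp guidance into the 2-3 range
    suggested by the post."""
    guidance_scale = min(max(guidance_scale, 2.0), 3.0)
    return ", ".join([prompt] + list(terms)), guidance_scale
```

The returned prompt and guidance value would then be passed to whatever Flux inference pipeline you use.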

LoRA Update
  • u/blankey1337 discusses training LoRAs (Low-Rank Adaptations) of company logos using Flux. OP used flux-dev, a small dataset of less than 15 images including vector illustrations and real-world photos, trained at 1024 resolution for 3 epochs. They followed this Civitai article on flux character LoRA training.

CALIFORNIA'S AI IMAGE BAN

California, long known as a hub of technological innovation, is once again at the forefront of a controversial tech policy debate. A new bill, AB 3211, has been proposed in the California legislature that could dramatically reshape the landscape of AI-generated imagery. This legislation comes at a time when AI image generation tools are becoming increasingly sophisticated and widely available, raising concerns about the potential for misuse and misinformation.

What is this new California AI image bill all about? It sounds pretty strict.

The bill, AB 3211, would require AI image generation systems and services to incorporate robust watermarking technology to identify AI-generated images. It mandates embedding specific, invisible, and hard-to-remove metadata about how and when images were created. The requirements are so stringent that they may effectively ban most existing AI image generation tools in California, including open-source models. However, there are significant concerns about the technical feasibility of this approach. As one user pointed out, "Making an image file (or any digital file for that matter) from which appended or embedded metadata can't be removed is nigh impossible" (u/malakon).
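u/malakon's point can be demonstrated with nothing but Python's standard library. The sketch below (not tied to any real watermarking scheme) builds a minimal PNG carrying a hypothetical provenance `tEXt` chunk, then strips every text chunk while leaving the image data intact:

```python
import struct
import zlib

def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type+data."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

def make_png_with_text(keyword: bytes, text: bytes) -> bytes:
    """Build a valid 1x1 grayscale PNG with one tEXt metadata chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x7f")  # filter byte + one gray pixel
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"tEXt", keyword + b"\x00" + text)
            + png_chunk(b"IDAT", idat)
            + png_chunk(b"IEND", b""))

def strip_text_chunks(png: bytes) -> bytes:
    """Copy a PNG, dropping all textual metadata chunks."""
    out, pos = png[:8], 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        if ctype not in (b"tEXt", b"iTXt", b"zTXt"):
            out += png[pos:pos + 12 + length]
        pos += 12 + length
    return out
```

Anything stored in a standard container chunk can be dropped this easily, which is why truly tamper-proof provenance would have to live in the pixels themselves (and even then, a screenshot re-encodes those).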

How might this impact companies and users?

If passed, the bill could have far-reaching consequences. The legislation could render illegal essentially every existing Stable Diffusion model, fine-tune, and LoRA in California. Popular AI model hosting sites like CivitAI and HuggingFace might be forced to either filter content for California residents or block access entirely. This could severely limit access to AI image generation tools for Californians and potentially drive AI development out of the state. However, potential workarounds exist, such as using VPNs or taking screenshots to remove metadata, which highlights the potential enforcement challenges the bill might face.

Is there any pushback or controversy around this bill?

There's significant controversy, with critics arguing the bill's requirements may be technologically infeasible. Some see it as regulatory overreach that could stifle innovation. Interestingly, major tech companies like Microsoft, OpenAI, and Adobe support the measure, leading some to view it as potential regulatory capture. Critics argue that this support stems from the fact that open-source image generation models and services may struggle to meet the technological requirements, effectively eliminating competition for these larger companies. The bill draws parallels to previous California legislation that mandated non-existent or infeasible technology, suggesting a pattern of the state pushing technological boundaries beyond what's currently possible. The Coalition for Content Provenance and Authenticity (C2PA), an industry effort to address digital misinformation, provides context for why some major tech companies might support such legislation. This initiative aims to develop open standards for certifying the source and history of media content, aligning with the bill's goals of authenticating AI-generated images.

GENERATIVE AI: A QUICK REFRESHER

For those new to generative AI or seeking a quick update, here's an overview of this transformative technology provided by 1440 Media.

Generative AI is a form of artificial intelligence capable of creating original content such as text, images, video, and audio. It can automate complex tasks, analyze large datasets, and boost productivity across various fields.

These systems work on the principle of prediction. Text-based AI, like ChatGPT, uses large language models trained on vast internet data to predict likely word sequences. For visual content, many systems employ generative adversarial networks, where a generator creates content and a discriminator evaluates it, resulting in increasingly realistic outputs.
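The "principle of prediction" can be made concrete with a toy model: a bigram counter that, like an LLM at vastly larger scale, predicts the most likely next word from the text it has seen. (Real LLMs use neural networks over tokens, not raw word counts; this is only an illustration of the idea.)

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word tends to follow each word in a corpus."""
    words = text.lower().split()
    follow = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        follow[cur][nxt] += 1
    return follow

def predict_next(follow, word):
    """Return the most frequently observed next word, or None if unseen."""
    counts = follow.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None
```

Trained on "the cat sat on the mat and the cat ran", the model predicts "cat" after "the", because that pairing occurred most often. Scale the same idea up to billions of parameters and trillions of tokens and you get the fluent continuations of systems like ChatGPT.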

People use generative AI for diverse applications. Text generation tools assist in writing, coding, and summarizing. Image and video creation platforms produce art and realistic visuals from text descriptions. AI-powered web search offers more conversational experiences, while advanced digital assistants handle a wide range of tasks.

However, generative AI also presents risks. These include potential job displacement, the generation of incorrect information ("AI hallucinations"), cybersecurity concerns, easier creation of misinformation, and various ethical issues related to privacy, bias, and intellectual property.

As this field rapidly evolves, ongoing research continues to expand our understanding of generative AI's capabilities and implications.

Seeking impartial news? Meet 1440.

Every day, 3.5 million readers turn to 1440 for their factual news. We sift through 100+ sources to bring you a complete summary of politics, global events, business, and culture, all in a brief 5-minute email. Enjoy an impartial news experience.

Put This On Your Radar

Juggernaut XI: Enhanced SDXL Model Released

RunDiffusion has unveiled Juggernaut XI, the latest iteration of their popular SDXL fine-tune.

  • Better prompt adherence

  • Expanded dataset with ChatGPT-4 captions

  • Improved text generation capabilities

  • Enhanced style control options

Previously available via API, Juggernaut XI is now open for public use. The team is already working on Juggernaut XII and exploring Flux model adaptations.

For details and downloads:

FLUX.1 ai-toolkit UI on Gradio
  • Drag and drop images

  • Caption them (or use AI to caption)

  • No code/yaml needed

Kolors Virtual Try-On App UI on Gradio
New Open-Weights Text-to-Video Model: CogVideoX-5B

THUDM has released CogVideoX-5B, an open-weights text-to-video generation model.

  • Generates 6-second, 720x480 videos at 8 FPS

  • Handles complex prompts up to 226 tokens

  • Offers various inference options (BF16, FP16, INT8)

  • Runs on consumer GPUs with optimizations

  • Includes quantized version for lower VRAM usage

Try it out on Hugging Face or check the GitHub for detailed usage instructions and fine-tuning options. The ComfyUI wrapper can be found here.
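Why the INT8 option matters for consumer GPUs comes down to simple arithmetic. A rough weights-only estimate (assuming exactly 5B parameters; activations and other buffers add more on top):

```python
# Back-of-the-envelope VRAM needed just to hold the weights of a
# 5B-parameter model at each precision CogVideoX-5B offers.
params = 5_000_000_000
bytes_per_param = {"BF16": 2, "FP16": 2, "INT8": 1}
weights_gib = {fmt: params * b / 2**30 for fmt, b in bytes_per_param.items()}
# BF16/FP16 come out around 9.3 GiB; INT8 roughly halves that to about 4.7 GiB,
# which is what brings the model within reach of mid-range consumer cards.
```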

SDXL LoRA Model: Melyn's 3D Render

u/PixarCEO has released their first LoRA model for Stable Diffusion XL, trained on their personal 3D renders created over a decade. The model aims to generate images in the style of detailed 3D renders.

  • Compatible with SDXL

  • Trained on creator's own 3D artwork

  • Creator plans a future Flux Dev version

For prompts and usage tips, visit: https://civitai.com/models/696795/melyns-3d-render

FluxForge v0.1 Update: Search All Flux LoRAs

FluxForge, a tool for searching FLUX LoRA models, has released version 0.1.

  • Searches Civitai and Hugging Face repositories

  • Updates every 2 hours

  • Fast and seamless interface

  • Plans to add platform filtering

Note: Some users reported initial errors, but the developer has been actively addressing issues.

Visit https://fluxforge.app to try it out.

Regional Prompt Support for ComfyUI in Photoshop

A new Photoshop extension called "sd-ppp" brings regional prompt support to ComfyUI, allowing for precise control over AI image generation directly within Photoshop.

  • Custom nodes for Photoshop integration

  • Ability to use text layers for regional prompting

  • Support for both dense diffusion and ComfyUI's built-in masked condition

This tool bridges the gap between Photoshop's powerful editing capabilities and AI image generation, offering a streamlined workflow for artists and designers.

GenWarp: Novel Views from a Single Image

Sony AI researchers have released GenWarp, an AI model that generates plausible new viewpoints of a scene from just one input image.

  • Generates novel views from a single input image

  • Works on both in-domain and out-of-domain images, including illustrations

  • Uses a diffusion model to learn geometric relationships, avoiding explicit pixel warping

  • Can be used to create 3D reconstructions via tools like InstantSplat

A demo is available on Hugging Face Spaces, with code and pre-trained weights on GitHub for those wanting to experiment further.

Flux Latent Detailer Workflow

u/renderartist has shared an experimental ComfyUI workflow that enhances fine details in images by interpolating latents, potentially avoiding the "overcooked" look that some processes cause.

  • Uses latent interpolation to generate finer details

  • Includes option to vary images while maintaining quality and composition

  • Utilizes the dev version of Flux.1 checkpoint and araminta_k_flux_koda.safetensors

  • Reported 98-second generation time on RTX 4090

This workflow uses araminta_k_flux_koda.safetensors, which can be found on Civitai.
