I never imagined AI writing human-like text. No, I’m not talking about text generation, but rather image generation with human handwriting. The Flux models have made it easy to run inference, generate images, and edit them. In this article, we’ll look at one such model, used for generating images with handwritten text. That’s not all: we’ll also build a storytelling application towards the end of the article.
What are FLUX Models?
FLUX models are a family of generative text-to-image models developed by Black Forest Labs, known for producing high-quality images. Like Stable Diffusion, they generate images in a latent space and rely on components such as a variational autoencoder (VAE). We’ll be focusing on one Flux model, namely fofr/flux-handwriting, throughout the article.
Flux Handwriting Model
“fofr/flux-handwriting” is a Flux LoRA fine-tuned to produce handwritten text. Let’s look at various ways to use it to generate images with handwritten text.
Hugging Face
You can go to the model page on Hugging Face and use the diffusers library or the Inference API to generate images.
Note: Remember to include HWRIT handwriting in the prompt to trigger the handwriting generation.
I prompted it to generate: “HWRIT shaky messy handwriting stating "The sun will rise," illegible, dark green ink on old water-damaged paper with visible mold marks.”
The generated image has the text in the same style I described in the prompt.
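Since every prompt needs the trigger word, a small helper can assemble prompts in a consistent way. The sketch below is only an illustration: the function name build_handwriting_prompt and its parameters are hypothetical and not part of the model or any library.

```python
def build_handwriting_prompt(text, style="neat cursive", medium="black ink on white paper"):
    """Assemble a prompt that starts with the HWRIT trigger word.

    The function name and its parameters are illustrative assumptions.
    """
    # Use single quotes around the text if it already contains double quotes.
    quote = "'" if '"' in text else '"'
    return f"HWRIT {style} handwriting saying {quote}{text}{quote}, {medium}"


prompt = build_handwriting_prompt(
    "The sun will rise",
    style="shaky messy",
    medium="dark green ink on old water-damaged paper",
)
print(prompt)
```

Keeping the trigger word in one place makes it harder to forget when prompts are generated in a loop.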
Let’s try the Inference API
Get your Hugging Face access token from here: Hugging Face Tokens
from huggingface_hub import InferenceClient
# Replace "hf_token" with your own Hugging Face access token
client = InferenceClient("fofr/flux-handwriting", token="hf_token")
# The call returns the generated image
image = client.text_to_image('HWRIT scrawling messy handwriting saying "I am Iron Man", illegible, written with a HB pencil on a grainy paper')
Output
This looks pretty good, and the model didn’t mess up any characters either.
Note: Image generation takes a while.
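Hard-coding the token in a notebook is easy to leak. As a hedged alternative, the sketch below reads it from an environment variable and saves the generated image to disk. The helper names get_hf_token and generate_and_save, the variable name HF_TOKEN, and the output file name are choices made for this example.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the Hugging Face token stored in an environment variable."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your Hugging Face token.")
    return token


def generate_and_save(prompt, path="handwriting.png"):
    """Generate one image with the Inference API and save it to disk."""
    # Imported here so get_hf_token works even without huggingface_hub installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient("fofr/flux-handwriting", token=get_hf_token())
    image = client.text_to_image(prompt)  # returns a PIL image
    image.save(path)
    return path
```

Calling generate_and_save('HWRIT messy handwriting saying "I am Iron Man"') would then write handwriting.png to the working directory.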
Replicate
You can also choose to run this model on Replicate, but it’ll cost you roughly $1 for 90 runs, or about $0.011 per run; this can vary.
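A quick check of the arithmetic: $1 for about 90 runs works out to a little over one cent per run. The short sketch below also estimates the cost of a seven-image batch; actual Replicate pricing depends on hardware and run time, so treat the numbers as approximations.

```python
# Rough cost estimate based on "$1 for about 90 runs".
runs_per_dollar = 90
cost_per_run = 1 / runs_per_dollar

story_images = 7  # example batch size
story_cost = story_images * cost_per_run

print(f"Cost per run: ${cost_per_run:.3f}")  # Cost per run: $0.011
print(f"Cost for {story_images} images: ${story_cost:.2f}")  # Cost for 7 images: $0.08
```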
Storytelling Application
Let’s create an LLM application that first writes a story, then breaks it into 7 pieces, and then generates 7 handwritten images to support the storytelling. We’ll then combine these images into a video to complete the application.
We’ll be using Gemini models to generate the story and to make the prompts for flux-handwriting. First, let’s get our hands on a Gemini API key:
Simply click on Create API Key to get a new key for the Gemini models.
Installation
!pip install -q -U google-generativeai
This installs the package needed to use the Gemini models.
Implementation
Configure the API key and choose the model; I’ll be using the ‘gemini-1.5-flash’ model.
import google.generativeai as genai
# Replace "API-Key" with your own Gemini API key
genai.configure(api_key="API-Key")
model = genai.GenerativeModel("gemini-1.5-flash")
Generating the story:
response = model.generate_content("Write a short and clear story in about 80 words, about a day in the life of a man named Cyan turning into a superhero.")
story = response.text
print(story)
Cyan woke to a throbbing headache, a strange symbol burning into his palm.
That day, mundane tasks – grocery shopping, dog walking – felt amplified,
his senses sharper. A speeding car careened towards a child; instinctively,
Cyan reacted. He moved faster than he thought possible, a blur of motion,
saving the child. The symbol glowed. He was not just Cyan. He was
something more.
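Now split the story into 7 parts:
# Split on sentence boundaries and drop empty pieces
sentences = [s.strip() for s in story.split(". ") if s.strip()]
# Keep at most 7 sentences, each ending with exactly one period
prompts = [f"{s.rstrip('.')}." for s in sentences[:7]]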
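Splitting on a period followed by a space misses sentences that end with "!" or "?" or with a line break. A slightly more robust sketch, assuming simple prose without abbreviations such as "Dr.", uses a regular expression. The helper name split_into_parts is hypothetical.

```python
import re


def split_into_parts(story, n_parts=7):
    """Split a story into at most n_parts sentences."""
    # Break after ., ! or ? when followed by whitespace (including newlines).
    pieces = re.split(r"(?<=[.!?])\s+", story.strip())
    sentences = [piece.strip() for piece in pieces if piece.strip()]
    # Fold any extra sentences into the last part so that no text is dropped.
    if len(sentences) > n_parts:
        sentences = sentences[: n_parts - 1] + [" ".join(sentences[n_parts - 1 :])]
    return sentences


sample_story = "The symbol glowed. He was not just Cyan! Was he something more?"
print(split_into_parts(sample_story))
# ['The symbol glowed.', 'He was not just Cyan!', 'Was he something more?']
```

If the story has fewer than n_parts sentences, the function simply returns fewer parts rather than padding with empty strings.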
Turn the 7 parts into prompts for the handwriting model:
handwriting_prompts = [
f"HWRIT handwriting style for the text: '{prompt}' in a neat cursive writing in orange Ink and red paper background"
for prompt in prompts if prompt.strip()
]
Function to generate the handwritten images:
(Get your Hugging Face token and make sure to check all the inference boxes while creating the token)
from huggingface_hub import InferenceClient
import time
# Replace "hf_token" with your own Hugging Face access token
client = InferenceClient("fofr/flux-handwriting", token="hf_token")
def handwriting_text(prompt):
    # Request one image for the given prompt
    image = client.text_to_image(prompt)
    return image
Generating images with handwritten text:
handwritten_images = []
for prompt in handwriting_prompts:
    image = handwriting_text(prompt)
    handwritten_images.append(image)
    time.sleep(120)  # 2-minute delay to stay under the rate limit
Note: The API request might throw an error along the lines of “Max requests total reached on image generation inference (3). Wait up to one minute before being able to process more Diffusion requests.” Hence, we add a 120-second sleep after each request in the for loop.
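A fixed delay after every request is simple but slow. As an alternative sketch, the helper below waits only when a request fails and doubles the wait each time. The name call_with_retry and its default values are assumptions for illustration, and it retries on any exception, which is broader than strictly needed for rate-limit errors.

```python
import time


def call_with_retry(fn, *args, max_attempts=4, base_delay=60, sleep=time.sleep):
    """Call fn(*args), waiting longer after each failed attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args)
        except Exception as error:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            delay = base_delay * 2 ** (attempt - 1)  # 60 s, 120 s, 240 s, ...
            print(f"Attempt {attempt} failed ({error}); retrying in {delay} s")
            sleep(delay)
```

Inside the loop, image = call_with_retry(handwriting_text, prompt) could then take the place of the direct call followed by the fixed sleep.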
Generating the video using OpenCV:
import cv2
def create_video_from_images(image_list, output_video_path, fps=1):
    # Load the first image to get the frame dimensions
    frame = cv2.imread(image_list[0])
    height, width, _ = frame.shape
    # Initialize the video writer
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    video = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))
    # Write each image to the video
    for image_path in image_list:
        frame = cv2.imread(image_path)
        video.write(frame)
    # Release the video writer
    video.release()
# Save the images to disk and create a video
image_file_paths = []
for idx, image in enumerate(handwritten_images):
    file_path = f"handwritten_image_{idx}.png"
    image.save(file_path)
    image_file_paths.append(file_path)
# Combine the images into a video
create_video_from_images(image_file_paths, "handwritten_story.mp4", fps=0.25)
print("Video created: handwritten_story.mp4")
We saved the images and then combined them into a video with a frame rate of 0.25 (1 frame every 4 seconds, for readability).
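Some players and codecs may not handle a fractional frame rate such as 0.25 well. One possible workaround, sketched here under that assumption, is to keep an integer frame rate and write each image several times. The helper name repeat_frames is hypothetical.

```python
def repeat_frames(image_paths, seconds_per_image=4, fps=1):
    """Repeat each image path so it stays on screen for seconds_per_image."""
    copies = max(1, round(seconds_per_image * fps))
    return [path for path in image_paths for _ in range(copies)]


paths = ["handwritten_image_0.png", "handwritten_image_1.png"]
frames = repeat_frames(paths, seconds_per_image=4, fps=1)
print(len(frames))  # 8 frames: 2 images x 4 copies each
```

The repeated list can be passed to create_video_from_images with fps=1, which gives the same 4 seconds per image.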
Output
Hyperlink to the video: handwritten_story.mp4
Note: The model struggles when generating images with more than 4-5 words per image, so we need to restrict the text.
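One simple way to respect that limit, shown as a sketch, is to break each sentence into chunks of at most five words before building the prompts. The helper name chunk_words is hypothetical, and the trade-off is more images and therefore more API requests.

```python
def chunk_words(text, max_words=5):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i : i + max_words]) for i in range(0, len(words), max_words)]


sentence = "Cyan woke to a throbbing headache, a strange symbol burning into his palm."
for chunk in chunk_words(sentence):
    print(chunk)
# Cyan woke to a throbbing
# headache, a strange symbol burning
# into his palm.
```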
One exercise you could try is to have an LLM write the prompts instead of splitting the story and using a fixed template. This enforces the text limit, and lets the LLM tune the style and background of each image to its text.
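A possible starting point for this exercise is sketched below: one function builds the instruction for the LLM, and another keeps only the lines of the reply that start with the trigger word. Both function names and the instruction wording are assumptions, and an LLM may not always follow the requested format.

```python
def build_prompt_request(story, n_images=7, max_words=5):
    """Build an instruction asking an LLM to write the image prompts."""
    return (
        f"Retell the story below in exactly {n_images} captions of at most "
        f"{max_words} words each. For each caption, write one line in the form:\n"
        'HWRIT <handwriting style> handwriting saying "<caption>", <ink and paper>\n'
        "Return only these lines, with no numbering.\n\n"
        f"Story:\n{story}"
    )


def parse_prompt_lines(response_text):
    """Keep only the lines that start with the HWRIT trigger word."""
    lines = [line.strip() for line in response_text.splitlines()]
    return [line for line in lines if line.startswith("HWRIT")]
```

With the Gemini model configured earlier, the reply text from model.generate_content(build_prompt_request(story)) can be passed to parse_prompt_lines to obtain the list of handwriting prompts.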
Conclusion
In conclusion, Flux models such as “fofr/flux-handwriting” open up new opportunities for crafting custom handwritten-style visuals. Whether you are writing standalone prompts or building a full storytelling application, these tools highlight AI’s ability to combine artistic creativity with practical applications. The storytelling application shows how easily AI-generated visuals can be blended into multimedia projects.
Frequently Asked Questions
Ans. The “flux-handwriting” model is a LoRA (Low-Rank Adaptation) fine-tuned version of the FLUX.1-dev model, designed to generate images of handwriting in various styles based on text prompts.
Ans. First, load the base FLUX.1-dev model using the Diffusers library. Then, apply the “flux-handwriting” LoRA weights to the pipeline. Finally, generate images by providing prompts.
Ans. To activate the handwriting generation, include the trigger phrase HWRIT handwriting in your prompt.
Ans. You can use Replicate to run inference with the fofr/flux-handwriting model: Replicate.
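The Diffusers route mentioned in the answers above can be sketched as follows. This is a minimal sketch, not a tested recipe: it assumes a CUDA GPU with enough memory, access to the gated FLUX.1-dev weights, and that the LoRA repository exposes a single weights file. The function names, the step count, and the guidance scale are illustrative choices.

```python
BASE_MODEL = "black-forest-labs/FLUX.1-dev"
LORA_REPO = "fofr/flux-handwriting"


def load_handwriting_pipeline(device="cuda"):
    """Load FLUX.1-dev and apply the flux-handwriting LoRA weights."""
    # Imported lazily: torch and diffusers are large optional dependencies.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
    # If the repository contains several weight files, pass weight_name=... here.
    pipe.load_lora_weights(LORA_REPO)
    return pipe.to(device)


def generate(pipe, text):
    """Generate one image; the HWRIT trigger word is added to the prompt."""
    prompt = f'HWRIT handwriting saying "{text}", black ink on white paper'
    # Step count and guidance scale are assumed values for this sketch.
    return pipe(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
```

A call such as generate(load_handwriting_pipeline(), "The sun will rise") would return a PIL image that can be saved with its save method.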