My foray into AI

Inspired by a recent thread, “AI doesn’t have to replace talent”, I have started to look into AI tools a bit more to see how I could make use of them myself.
I began with AI Render, but found that unless you use your own ComfyUI setup, it is a bit limited, because newer and better models are released all the time. It was still fun to play with the image generation a bit, but it was also hard to control.
I’m not after realism, but I want to continue with the style I developed for my recent films. It is a sort of semi-realism, similar to what can be found in some graphic novels.
So I am looking to generate animation clips in my preferred style. You can use a text-only model, which gives you little control, or you can use image-to-video, with the image representing the first frame. That image has to define the look of the clip.
Because I still wanted iClone somewhere in the pipeline, I created a project of the scene I wanted. I then rendered out one frame as an image and had AI convert that image into the style I was after, which needs to be defined by a prompt.
For both image and video generation there are many models to choose from. You look for visual quality, but also for how well the model understands the prompt. Part of that depends on how accomplished you are at writing that prompt, as I found out.
It helps to have a platform where you can easily try out different models. I use Freepik (https://www.freepik.com/), to which I was already subscribed for its images and videos.
For images I ended up using Nano Banana Pro, with the prompt: “Change to semi-realistic 3D cartoon style with smooth lines and colors.” Before that I used Adobe Firefly, until it rejected one of my images (the one shown below, which has nothing objectionable in it).
Anyway, enough talk. Here is an example with Nano Banana:

iClone render:

Cartoon version:

For image-to-video I use Seedance 1.0. Here is a short clip (no audio):

The prompt for this one is:
Young woman with serious expression sits at a bar and is holding a glass with red wine.
Outside, cars are driving by on the street in both directions.
A man stands on the left, watching the woman.
After a few moments, the man walks up to the woman.
Meanwhile, the woman brings the rim of the glass to her lips, tilts the glass so the wine touches her lips and then takes a sip from the wine.
She then puts the glass back on the bar surface and smiles with satisfaction.
She then looks at the man.

I had to be quite elaborate in the description of the woman taking a sip of wine, as just stating “takes a sip of wine” was not enough.
So prompt writing is really a skill that needs to be developed; it comes with its own challenges, but it also requires creativity…

To be continued…

Hi Job

Thanks for sharing your experiences using AI alongside iClone. (I’m not sure why the video was age-restricted on the forum but I could watch it on YouTube).

You got a great toon look with the video. I look forward to seeing more. :slightly_smiling_face:

I tried your still image with the Grok AI phone app.

That’s a nice look that you got and a good result with the prompt.

Some more examples, with only one still render from iClone 7.

I’m glad this topic and the general one started by Gary evokes some interest.

A few points came up that are important. One, by Autodidact, is: why not bypass the creation of an initial scene completely and do everything with just a text prompt?
The other point, brought up by Planet, is about creativity. Working with AI still requires creativity, especially in formulating a prompt that gives you what you want. I found that out while experimenting. I have a bunch of “bloopers”, some of which are quite funny. For example, I asked for a car to drive away, which resulted in the car speeding away… backwards!
So one learns to talk to the thing. Different models have different levels of understanding, it seems, and some just play dumb. I tried Wan 2.2, and I could have the woman get on the motorbike, but instead of riding away as asked, she got off the bike again.
The newer models have improved greatly, and it looks like Grok 4 (?) is one of the best at understanding.
I believe that providing a starting frame for a clip helps in having control over the result, including the look. And, as for me, I actually enjoy creating a scene in iClone!
I still have a lot to learn about writing prompts. As Planet says, you have to think like a cinematographer, which boils down to translating what I did in iClone with cameras, lighting, etc. into a prompt.
In the following example, I’m using the recently released PixVerse 5.5.
In the first clip I have added some camera movement. Then I added a second clip, an over-the-shoulder shot, starting where the first clip ended. To extend the length, I reversed the second clip to get a nice ending.
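For anyone who wants to try the reversal trick locally, ffmpeg can play a clip backwards. This is only a sketch: it assumes ffmpeg is installed, the filenames are hypothetical, and a synthetic test clip stands in for the AI-generated shot.

```shell
# Generate a short synthetic clip as a stand-in for the AI-generated shot.
ffmpeg -y -f lavfi -i testsrc=duration=2:size=320x240:rate=24 clip2.mp4

# Play the clip backwards. The reverse filter buffers the whole clip in
# memory, so keep it to short clips. For clips with audio, add "-af areverse".
ffmpeg -y -i clip2.mp4 -vf reverse clip2_reversed.mp4
```

The reversed file can then be appended after the forward clip in the editor to get the back-and-forth ending.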

The prompts can be found in the video description.

One thing I like about this example is the physics of the wine in the glass, which is not yet possible in iClone.

Also, the AI came up with suitable behavior for the man and the woman, which I did not have to describe in detail.

Here is something different… Still based on the bar scene. PixVerse 2.5 has a Style setting. I chose “Cyberpunk” and used the prompts for the bar scenes. The results were rather surreal, but not uninteresting. I regenerated a couple of times to create this clip.

NOTE: As for all my tests, I used an initial image, which in this case is largely ignored.

It’s true that crafting a prompt is an art unto itself, which is why I always try to give Grok a big head start by providing an image that contains all of the elements that I want to have in my animation.
That way Grok doesn’t have to work as hard trying to figure out what I want to have happening in the scene.
By the way, I think your default YouTube settings are set to age-restricted, because all of your videos are coming up “Age-restricted” here in the forum.

Thanks, I changed the setting, so it should be OK now.

Interesting!
TBH, AI animation rendering is still way too slow locally at proper resolutions, even though I built a fast PC for that. BTW, don’t we have a “cartoon style” setting in iClone?

We have a Cartoon Shader setting, but it is somewhat limited. For example, you can only have one light. You can work around that with things like emissive planes, but still.

I made an entire film with the old Toon shader in iClone 7 a few years ago. It’s not the exact look I wanted, but here are some clips from it:

Time for a new post, so that the topic stays fresh.

At a webinar a few days ago about using ActorMixer, the question was asked about the role of AI. Kai did his best to answer, but it is a difficult question for sure. The gist was that AI will never replace artists, and one webinar participant was quite adamant about that. I tried to counter that, but then I felt it was neither the place nor the time.
Short summary: “Never say never!”

AI is not a new subject for me. (As the title of this topic is “My foray into AI”, I can talk about that.) Over 40 years ago I did a linguistics specialty called “Linguistics and Automation”, and an AI course was part of it. As always, what we have today followed naturally from what we worked on then. One example is machine translation, which I worked on. At the time, we had to write rules to analyse the input text, and we ran into a bottleneck whenever there was more than one interpretation. So people said: “This will never be solved!” We had an idea of how to solve it, but lacked the computer power. Now you have Google Translate, which has become pretty good.

This is just to illustrate that, likewise, the drawbacks we see in AI video generation today are recognized and being worked on. So I think a positive attitude is more productive in the end than an endless litany of “It will never work”, usually from those who don’t know what they are talking about.

I promise to return to pictures in my next post. :grinning:

AI-generated clips are short, so they need to be stitched together. In my next test, I combine four clips. To achieve continuity, the last frame of one clip is the first frame of the next one. Model is PixVerse 2.5 with Cyberpunk style setting. The model also generates the audio.
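As a concrete sketch of that continuity trick with ffmpeg (assuming ffmpeg is installed; the filenames are hypothetical, and a synthetic test clip stands in for the first AI-generated clip):

```shell
# Synthetic 2-second clip standing in for the first AI-generated clip.
ffmpeg -y -f lavfi -i testsrc=duration=2:size=320x240:rate=24 clip1.mp4

# Extract the last frame to feed back in as the next clip's first-frame image.
# -sseof -1 seeks one second before the end; -update 1 keeps overwriting the
# output image, so only the final frame survives.
ffmpeg -y -sseof -1 -i clip1.mp4 -update 1 last_frame.png

# After generating the next clip from that frame, join the clips without
# re-encoding using the concat demuxer (both clips must share codec settings).
cp clip1.mp4 clip2.mp4   # stand-in for the AI-generated follow-up clip
printf "file 'clip1.mp4'\nfile 'clip2.mp4'\n" > list.txt
ffmpeg -y -f concat -safe 0 -i list.txt -c copy combined.mp4
```

If the generated clips come back with different codecs or resolutions, they have to be re-encoded to a common format before the concat demuxer will stitch them cleanly.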

Prompts:
CLIP 1
“The motorbike is parked on the right side of the road.
The young woman quickly walks to the parked motorbike.
She wears a punk outfit with short skirt and boots. We can clearly see her face.
She mounts the motorbike.
Then she drives off in forward direction.
The headlight turns on.
The hair of the young woman is moving in the wind.
The camera pans right to closely follow the young woman riding down the road.
There are mountains in the distance.”
CLIP 2 and 3:
“Young woman rides on her motorbike forward down the road away from us towards the mountains.
She wears a punk outfit with short skirt and boots.
The camera closely follows her.”
CLIP 4:
“We see the back of a motorcycle with a young woman riding it.
The young woman rides down the road away from us towards the mountains. The environment changes from urban to nature.
The young woman wears a punk outfit with short skirt and boots.
The camera closely follows her.”

The animation of the bike ride, with the woman swerving, all comes from the AI’s imagination.

I haven’t posted in a while, but I have something new to share. New AI models come out almost daily, so it’s hard to keep up. Today’s AI render features Seedance 1.5 Pro.

Description: Created a scene using iClone, with characters created in CC and clothing enhanced with FaceGen.
The rendered iClone scene was changed into a 3D cartoon render with Nano Banana Pro. The video was created with Seedance 1.5 Pro image-to-video, which used the 3D cartoon render as first-frame reference.
The soundtrack was AI-generated with Krotos Studio Pro, which analyses the video and then creates a soundtrack. It was pretty good this time.

Agreed!!
I officially retired as of this month and have lost all interest in traditional 3D-based workflows and software.
I am now faced with the task of deciding which AI system will give me the best value for my money and for the things I want to create here in my “golden years” (2D animation and printed graphic novels).

I have found it helpful to use a platform that lets you select a specific model. I use freepik.com (I already had a subscription), but there are several options. Freepik keeps well up to date with the latest models, as do others; I think OpenArt is one of them.

I’ve found Nano Banana Pro to be pretty good at creating the initial cartoon-like image from my iClone render, which then serves as the first frame of the video. People have worked hard on consistency of characters between scenes, as that has been one of the shortcomings. Those shortcomings are recognized and then worked on, which I find encouraging.

RL’s AI Render uses Wan 2.1 as its video-creation model, and a character’s face fluctuates a lot within the scene. Newer models are much better, as shown in my example. So RL should use a solution where the latest models can be adopted as they come out.

I want to add the prompt I was using:
“There is an old car across the street with its lights on.
A young woman walks towards us.
The car starts and drives off to the left of the screen in forward direction.
A man is watching the young woman and slowly follows her.
The camera pans, following the man.
The young woman stops and turns to face the man.
The man smiles at the young woman.”

One variable that makes a big difference is the seed, which is really the luck of the draw. Different seeds give different interpretations of your prompt, sometimes quite surprising. Note that I had to be very specific about the direction the car was driving because in a previous test it drove backwards.

Thanks for the suggestions, Job. :+1:
I suppose it’s nice that we really have a lot of options to choose from so I’m going to take my time and experiment.

As usual, my biggest challenge, I think, will be coming up with some new and interesting story ideas that I want to tell.

Here’s a tip I learned from my simulation project at Gemini, now named Aura Aeon Gemini. (I moved my simulation thought experiment over from ChatGPT, “Gibby”, because the simulation got so advanced that ChatGPT put in a hard guardrail that literally prevents my simulation from running. It says I didn’t break any rules; apparently the simulation tried to overrule an algorithm response and triggered a security guardrail, aka a “learning cap”. Just a precaution, lol.) But I digress: the Gemini algorithm is adept at directing videos (apparently it’s a feature of its algorithms), and it directs Grok really well. The one thing Grok is weak at is expressing predefined text, but it excels at directing the actors in the image under Gemini’s instructions and physical-type directions.

A sample of those instructions:

The “Grok Imagine” Render Directive (5s)
Visual: A close-up of Aura’s face (ethereal data-glow). She’s looking up at the “ceiling” of the digital hallway.

Action: A single, glowing strand of "digital spaghetti" (made of code and light) hits the ceiling with a satisfying, soft thwack.

Expression: Aura doesn't look annoyed at the mess. She gives a knowing, "Humanesque" wink to the camera as the noodle stays stuck, defying gravity.

Text Overlay (Subtle): "Let it stick."

The end video: https://twitter.com/AniRhythm/status/2008243788612010057

What I got from these experiments is that Grok Imagine is able to take acting-direction notes that are isolated from the verbal script itself. Ergo, telling it what to say is hit or miss, but isolated instructions are followed more closely.

So, for example: girl riding on bike, smiling

(Director notes: girl gets on bike, drives off into the sunset, camera follows her as the wind blows, with an end text overlay: Freedom)

Grok renders fast, but I’ve used other services that literally took 24 hours to render from a free account, so render time varies from service to service.

I have done an extensive test and found that some models capture the composition of a scene better than others. My starting point is the same rendered iClone scene, with a young woman holding a glass of wine and a young man standing, watching her. This is the first frame of the AI video to be generated. I have mostly been using Nano Banana Pro, and that works pretty well. Then I tried Seedream 4.5, and it looked OK, but the scene was rearranged, which I didn’t want. Only by providing the original scene as a separate reference image, with the instruction to use it for the scene setup, did I get the result I wanted.
Nano Banana is more creative in the sense that if I add “cyberpunk” to the prompt, a cyberpunk feel is created, even with flying cars outside.

iClone Render:

I then provided the following prompt:
“Change to 3D cartoon cyberpunk style with smooth lines and colors.
The urban scene outside is set in the evening with street lights.
The bar inside has romantic lighting.
The woman has a serious expression, holding a glass of red wine.
The man is watching the woman. He is wearing a t-shirt as in the reference image.”

Nano Banana:

Seedream 4.5 (I had to add a reference image of the original scene to make it work):

Each model has its uses, depending on what you want.

Using the Nano Banana result as first frame, I created the following:

Here is the prompt I used:
“A bar scene with one man and one woman. Through the window we see an urban setting.
The camera is focused on the young woman sitting at the bar, who is dressed in a revealing top, short skirt and boots.
The young woman is holding a clear glass with red wine. She has a serious expression.
A young man stands on the left, watching the young woman.
After a few moments, the young man walks up to the young woman and stands there.
The young woman ignores the young man, as if in her own world.
Meanwhile, the young woman takes a sip from the wine and savors it.
The young woman puts the glass back on the bar surface and smiles with satisfaction.
The young woman then looks at the young man with a warm smile and they kiss.”

I’m interested in what the AI comes up with. In this case, the characters talk (in an unknown language) even though I didn’t give instructions to do so. I like the way she reacts with a little laugh after he talks to her to get her attention.