Last week Meta announced their entry into the field of artificial intelligence image generation. Their new tool is called Make-a-Scene, and it takes both an image and a line of text as prompts. It's a more collaborative tool than some of the image generators made by other companies. A person can sketch out a rough scene as input, then use text to tell Make-a-Scene how to fill it in.
Make-a-Scene isn't yet open to the public, but as a Meta employee, I was able to get my hands on it early. In this post, I'm going to show you my first experiments with Make-a-Scene, and you can see how it compares to the other image generation tools.
As a lover of California landscapes, and a collector of the painters known as the California Impressionists, I had to start by trying to generate some California landscapes. I drew an image with a sky, a mountain, a river, and a fence. Then I gave it the prompt "an impressionist painting of california". This image shows the input along with four generated images.
I particularly liked the first two images.
As you can see, Make-a-Scene tries to follow the image input as closely as possible, and it's able to interpret the phrase "impressionist painting" in many different ways.
Next, I fed it the much more specific prompt of "Emperor Palpatine training Anakin Skywalker". As you can see from the generated images, it struggled much more to understand both my poor drawing and the very specific text prompt.
You can see how my drawing led the AI astray in the generated images. I included lightsabers in sections of the image labeled as "person", so Make-a-Scene added some funky-looking arms onto the people. Interestingly, it didn't necessarily understand who the fictional characters were, but it knew that they were soldiers.
For my last experiment, I went back to something more generic. I thought about the type of images that marketers might need. I drew a picture of a person-shaped blob holding a spoon-shaped blob, and gave it the prompt "a woman eating breakfast". The results are trippy, but interesting.
The first image it generated looks almost like usable clip art.
Overall, I think Make-a-Scene is interesting and fun. Even in this early state, it has some real potential for generating art in some situations. It seems particularly well suited to creating trippy art for album covers or single artwork on music streaming sites. I also think it could be useful for brainstorming visual ideas about characters, concepts, and even fiction. I hope that Meta opens it up to the public soon.