Intro

CLIP is a neural network from OpenAI that learns visual concepts from natural-language supervision, using unfiltered & very noisy image-text pairs as training data. It can categorize images (e.g. telling cats from dogs), and because it scores how well a piece of text matches an image, it can also pick the best description for an image or match search queries to a database of images.
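To make the "match search queries to a database of images" part a bit more concrete, here's a minimal sketch of the ranking step. The embeddings are made-up 3-number vectors and the file names are hypothetical; a real setup would get (much longer) vectors from CLIP's text and image encoders instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(query_embedding, image_embeddings):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up embeddings standing in for the output of an image encoder.
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "city.jpg": [0.0, 0.1, 0.9],
}

# Made-up embedding standing in for an encoded query like "a photo of a cat".
query_embedding = [0.8, 0.2, 0.1]

ranking = rank_images(query_embedding, image_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The image whose vector points in nearly the same direction as the query's vector ends up at the top of the list.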

And then this person decided to use CLIP to train the SIREN network (which, uh, is way too high-level for me to understand) to generate images that match a given description, and the result is what we now know as Deep Daze.

Aleph2Image is a more recent attempt at this sort of text-to-image generation that uses parts of DALL-E as the generator in conjunction with CLIP.
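As far as I understand it, the general idea behind this kind of CLIP-guided generation is a loop: generate an image, ask CLIP how well it matches the prompt, then nudge the generator's input in whichever direction raises that score. Below is a toy sketch of that loop, with loud assumptions: `generate` is a hypothetical stand-in that just returns its input (the real thing would be a SIREN or the DALL-E decoder followed by CLIP's image encoder), the text embedding is made up, and the gradient is written out by hand instead of coming from an autodiff library.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_gradient(z, t):
    """Gradient of cosine_similarity(z, t) with respect to z."""
    norm_z = math.sqrt(sum(x * x for x in z))
    norm_t = math.sqrt(sum(y * y for y in t))
    cos = cosine_similarity(z, t)
    return [
        ti / (norm_z * norm_t) - cos * zi / (norm_z ** 2)
        for zi, ti in zip(z, t)
    ]


def generate(latent):
    # Hypothetical stand-in for "generator + image encoder".
    # It is the identity, so the gradient with respect to the
    # latent equals the gradient with respect to the output.
    return list(latent)


def optimize(latent, text_embedding, steps=200, lr=0.1):
    """Gradient ascent on the similarity between output and prompt."""
    for _ in range(steps):
        grad = cosine_gradient(generate(latent), text_embedding)
        latent = [z + lr * g for z, g in zip(latent, grad)]
    return latent


text_embedding = [0.0, 1.0, 1.0]  # made-up "prompt" embedding
start = [1.0, 0.0, 0.0]           # starting latent, unrelated to the prompt

initial_score = cosine_similarity(generate(start), text_embedding)
final = optimize(start, text_embedding)
final_score = cosine_similarity(generate(final), text_embedding)
print(f"similarity: {initial_score:.3f} -> {final_score:.3f}")
```

The real notebooks do the same kind of thing with far bigger models and proper backpropagation, which is presumably why a single image takes a while to show up.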

The plan for this post is pretty much to just run tests of whatever comes into my mind & explore the limitations & possibilities of Aleph2Image.

Aleph2Image

Prompt: a neon city at night

A decent first test.

Prompt: a cloud of smog painted on a canvas

It's more of a blob but sure whatever.

Prompt: a solarpunk warship

You can clearly make out the warship, but it looks like it's being attacked by a solar flare.

Prompt: a rainy cobblestone street

I think it tried to make cobblestone rain onto a street, but I'm not entirely sure.

Prompt: a cat wearing a birthday hat

This is just an undulating mass of cat flesh.

Prompt: a bee listening to jazz

Jazz is a gateway to many abilities some consider to be unnatural.

Prompt: a demonic symbol in the sky revealing hell

It looks a bit like Snoo, the Reddit mascot.

Prompt: a cafe in a monsoon

I let this run for a bit, and when I came back I had forgotten what the prompt was and had no idea what I was looking at.

Prompt: an anime girl made out of garlic bread

V O I D G A R L I C.

Conclusion

~~"I was afraid that not a single thing on earth would ever again surprise me", Jorge Luis Borges~~

"Uuooooooaaaahhhhohhhhhhhh okay", Vinny

I'm pretty excited to see what this will evolve into. For now though, I don't see a use for any of the images I got other than maybe inspiration.

Special Thanks

@advadnoun was the creator of the Google Colab that made all of this possible. You can donate to him via Venmo at "@rynnn".

Intro II (Dall-E Mini)

Okay so I found this Google Colab for Dall-E Mini, so let's do this again and see what happens.

Prompt: a neon city at night

Okay, this looks a lot cooler than the last one, even if you can't really see any detail whatsoever.

Prompt: a cloud of smog painted on a canvas

This looks like some sort of abstract art. Pretty good!

Prompt: a solarpunk warship

It's really easy to see the warship & water. This is really cool.

Prompt: a rainy cobblestone street

It doesn't look rainy, but there is very clearly a cobblestone street & buildings. I'm also digging the painterly style.

Prompt: a cat wearing a birthday hat

You can barely make out the cat & even a bit of the birthday hat. Looks like pretty good abstract art.

Prompt: a bee listening to jazz

It correctly got the shallow depth-of-field effect you get from focusing a camera on something really small, but unfortunately not the bee.

Prompt: a demonic symbol in the sky revealing hell

I can't make heads or tails of this one.

Prompt: a cafe in a monsoon

Now this is pretty fucking sick. A lot of these images could look orders of magnitude better with human manipulation.

Prompt: an anime girl made out of garlic bread

Looks like a character that would come out of an obscure RPG Maker game lmao.

Prompt: a euclidean bedroom

It looks more like a bathroom.

Prompt: a non-euclidean bedroom

Ironically, this one looks more like a room than the previous one.

Prompt: a lavish hotel lobby

This looks more like a bathroom than a hotel lobby.

Prompt: a cute anime girl

Another character that would fit seamlessly into an obscure RPG Maker game.

Prompt: a cute anime boy

This is like an alternate-reality fever-dream version of Avogado6's artwork.

Prompt: a kobold in a hoodie

Okay, so it took the mythological interpretation of the kobold instead of the furry version.

Prompt: a cute kobold in a hoodie

Same issue as the previous attempt.

Prompt: a redditor

I can't make out anything in this one.

Prompt: a sign that says, "ybubbus"

I wasn't expecting it to work, and it didn't, but I think it tried to make a storefront.

Prompt: an isometric view of a pixelated car

Isometric Pablo Picasso.

Prompt: the Notre Dame made of human flesh

I guess that might be the Notre Dame.

Prompt: a bottle of water

I was not expecting such an abstract image to come out of this prompt.

Prompt: a violent bottle of water

Not only is this one more recognizable as a bottle of water than the previous one, you can even see it trying to replicate the Shutterstock watermark.

Prompt: Francis Bacon in the style of Francis Bacon

Okay, wow. That actually looks like a Francis Bacon piece.

Prompt: Francis Bacon in the style of Francis Bacon in the style of Francis Bacon

Obviously a later Francis Bacon piece.

Prompt: a Pikachu poster

That's... not a Pikachu.

Prompt: Jim Carrey is an anti-vaxxer

Looks like we caught the bastard in the middle of some shape-shifting.

Prompt: Joe Biden's America

Literally Hide the Pain Harold.

Prompt: Donald Trump's America

Literally a Jellyfish.

Prompt: Barack Obama's America

Literally Boris Johnson.

Prompt: banana

A for effort.

Conclusion II

Dall-E Mini is much, much faster than Aleph2Image, at the expense of resolution. But that's a fair trade-off, and what I want to do now is see if I can modify these images to be better.