Intro
CLIP is a neural network from OpenAI that learns visual concepts from natural-language captions and can categorize images (e.g. identifying cats & dogs) after training on unfiltered & very noisy data. It basically scores how well a piece of text matches an image, so it can pick the best description for a picture or match search queries to a database of images.
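To make the "match search queries to a database of images" part a bit more concrete, here's a tiny sketch of the idea. It assumes text and images have already been turned into vectors in the same space; the three-number embeddings and file names below are completely made up for illustration and did not come from CLIP.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(query_embedding, image_embeddings):
    """Sort image names from best to worst match for the query."""
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings (assumption: a real model would produce
# vectors with hundreds of dimensions, not three).
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "city.jpg": [0.0, 0.1, 0.9],
}

# Pretend this is the embedding of the query "a photo of a cat".
query = [0.8, 0.2, 0.1]

ranking = rank_images(query, image_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The image whose vector points in nearly the same direction as the query's vector ends up at the top of the list, which is all "search" means here.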
Then this person decided to use CLIP to train the SIREN network (which, uh, is way too high-level for me to understand) to generate images that match a given description, which we now know as Deep Daze.
Aleph2Image is a more recent attempt at this sort of text-to-image generation that uses parts of DALL-E as the generator in conjunction with CLIP.
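As far as I understand it, the general shape of both tools is a loop: the generator makes an image, CLIP scores how well it matches the prompt, and the generator gets nudged toward a higher score. Here's a toy sketch of that loop. Everything in it is a stand-in: `toy_generator` and `toy_clip_score` are hypothetical placeholders for the real networks, and the random hill climbing replaces the gradient descent the real tools use.

```python
import math
import random


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_generator(params):
    """Hypothetical stand-in for SIREN or the DALL-E decoder."""
    return [math.tanh(p) for p in params]


def toy_clip_score(image, text_embedding):
    """Hypothetical stand-in for CLIP: higher means a better match."""
    return cosine_similarity(image, text_embedding)


def steer(text_embedding, steps=500, step_size=0.1, seed=0):
    """Nudge generator parameters toward a higher score.

    Uses random hill climbing (an assumption made to keep the sketch
    dependency-free), keeping a candidate only if it scores better.
    Returns the starting score and the final score.
    """
    rng = random.Random(seed)
    params = [rng.uniform(-1, 1) for _ in text_embedding]
    start = toy_clip_score(toy_generator(params), text_embedding)
    best = start
    for _ in range(steps):
        candidate = [p + rng.gauss(0, step_size) for p in params]
        score = toy_clip_score(toy_generator(candidate), text_embedding)
        if score > best:
            params, best = candidate, score
    return start, best


# Made-up embedding standing in for the prompt's text embedding.
prompt_embedding = [0.8, 0.2, 0.1]
start_score, final_score = steer(prompt_embedding)
print(f"score went from {start_score:.3f} to {final_score:.3f}")
```

The score can only go up or stay put, since worse candidates get thrown away. Watching an image slowly sharpen while a notebook runs is the same thing happening at a much larger scale.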
The plan for this post is pretty much to just run tests of whatever comes into my mind & explore the limitations & possibilities of Aleph2Image.
Aleph2Image
Prompt: a neon city at night
A decent first test.
Prompt: a cloud of smog painted on a canvas
It's more of a blob but sure whatever.
Prompt: a solarpunk warship
You can clearly make out the warship, but it looks like it's being attacked by a solar flare.
Prompt: a rainy cobblestone street
I think it tried to make cobblestone rain onto a street, but I'm not entirely sure.
Prompt: a cat wearing a birthday hat
This is just an undulating mass of cat flesh.
Prompt: a bee listening to jazz
Jazz is a gateway to many abilities some consider to be unnatural.
Prompt: a demonic symbol in the sky revealing hell
It looks a bit like the Reddit snoo character.
Prompt: a cafe in a monsoon
I let this run for a bit and when I came back I forgot what the prompt was and had no idea what I was looking at.
Prompt: an anime girl made out of garlic bread
V O I D G A R L I C.
Conclusion
"I was afraid that not a single thing on earth would ever again surprise me", Jorge Luis Borges
"Uuooooooaaaahhhhohhhhhhhh okay", Vinny
I'm pretty excited to see what this will evolve into. For now though, I don't see a use for any of the images I got other than maybe inspiration.
Special Thanks
@advadnoun was the creator of the Google Colab that made all of this possible. You can donate to him via Venmo at "@rynnn".
Intro II (Dall-E Mini)
Okay so I found this Google Colab for Dall-E Mini, so let's do this again and see what happens.
Prompt: a neon city at night
Okay, this looks a lot cooler than the last one, even if you can't really see any detail whatsoever.
Prompt: a cloud of smog painted on a canvas
This looks like some sort of abstract art. Pretty good!
Prompt: a solarpunk warship
It's really easy to see the warship & water. This is really cool.
Prompt: a rainy cobblestone street
It doesn't look rainy, but there is very clearly a cobblestone street & buildings. I'm also digging the painterly style.
Prompt: a cat wearing a birthday hat
You can barely make out the cat & even a bit of the birthday hat. Looks like pretty good abstract art.
Prompt: a bee listening to jazz
It correctly got the shallow depth-of-field effect you get from focusing a camera on something really small, but unfortunately not the bee.
Prompt: a demonic symbol in the sky revealing hell
I can't make heads or tails of this one.
Prompt: a cafe in a monsoon
Now this is pretty fucking sick. A lot of these images could look orders of magnitude better with human manipulation.
Prompt: an anime girl made out of garlic bread
Looks like a character that would come out of an obscure RPG Maker game lmao.
Prompt: a euclidean bedroom
It looks more like a bathroom.
Prompt: a non-euclidean bedroom
Ironically, this one looks more like a room than the previous one.
Prompt: a lavish hotel lobby
This looks more like a bathroom than a hotel lobby.
Prompt: a cute anime girl
Another character that would fit seamlessly into an obscure RPG Maker game.
Prompt: a cute anime boy
This is like an alternate-reality fever dream version of Avogado6's artwork.
Prompt: a kobold in a hoodie
Okay, so it took the mythological interpretation of the kobold instead of the furry version.
Prompt: a cute kobold in a hoodie
Same issue as the previous attempt.
Prompt: a redditor
I can't make out anything in this one.
Prompt: a sign that says, "ybubbus"
I wasn't expecting it to work, and it didn't, but I think it tried to make a storefront.
Prompt: an isometric view of a pixelated car
Isometric Pablo Picasso.
Prompt: the Notre Dame made of human flesh
I guess that might be the Notre Dame.
Prompt: a bottle of water
I was not expecting such an abstract image to come out of this prompt.
Prompt: a violent bottle of water
Not only is this one more recognizable as a bottle of water than the previous prompt, you can even see it trying to replicate the Shutterstock watermark.
Prompt: Francis Bacon in the style of Francis Bacon
Okay, wow. That actually looks like a Francis Bacon piece.
Prompt: Francis Bacon in the style of Francis Bacon in the style of Francis Bacon
Obviously a later Francis Bacon piece.
Prompt: a Pikachu poster
That's... not a Pikachu.
Prompt: Jim Carrey is an anti-vaxxer
Looks like we caught the bastard in the middle of some shape-shifting.
Prompt: Joe Biden's America
Literally Hide the Pain Harold.
Prompt: Donald Trump's America
Literally a Jellyfish.
Prompt: Barack Obama's America
Literally Boris Johnson.
Prompt: banana
A for effort.
Conclusion II
Dall-E Mini is much, much faster than Aleph2Image, at the expense of resolution. But that's a fair trade-off, and what I want to do now is see if I can modify these images to be better.
Special Thanks II
The Dall-E Mini Colab I used was made by "mega b#6696" on Discord and was incredibly easy to use.
Addendum: Batch-editing Dall-E Mini Output
Here are all of the previous Dall-E Mini images you saw, run through various filters:
Conclusion III
This is pretty fun.