Stable Diffusion
Experiments with Stable Diffusion image generation. Curated outputs, prompt exploration, and notes on the process.
It started as brute force. Run a prompt, see what comes out, change a word, run it again. Hundreds of generations, most discarded. The rhythm felt familiar — change one variable, observe the output, decide if the delta was signal or noise. Except the system was a neural network and the variables were adjectives. The early outputs were dense, overstuffed, the model trying to satisfy every token at once.
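That loop is easy to script. Below is a minimal sketch of the sweep, assuming the Hugging Face diffusers library and a CUDA GPU; the model ID, prompts, and seed are placeholders rather than the ones behind these outputs.

```python
# Hedged sketch of a brute-force prompt sweep: fix the seed, vary one
# phrase at a time, save everything, discard most of it later.
import torch
from diffusers import StableDiffusionPipeline

# Placeholder model ID; any Stable Diffusion checkpoint works the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

base = "a river at dusk, desert vegetation"
variants = ["", ", amber sky", ", amber sky, saturated water"]

for i, suffix in enumerate(variants):
    prompt = base + suffix
    # Same seed every run, so the prompt is the only variable that changes.
    generator = torch.Generator("cuda").manual_seed(42)
    image = pipe(prompt, generator=generator, num_inference_steps=30).images[0]
    image.save(f"out_{i:02d}.png")
```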
Then the river scenes started showing up. Unprompted warmth — amber skies, saturated water, desert vegetation. The model kept returning to the same palette no matter how the prompt shifted. You stop fighting it and start working with the grain. That’s when the outputs got interesting.
Prompting turned out to be a strange skill. Somewhere between writing and engineering. You’re describing something that doesn’t exist yet, in language precise enough for a model to interpret but loose enough to leave room for surprise. Too specific and the output is rigid. Too vague and it drifts into noise. The sweet spot is structured ambiguity — a phrase that constrains without dictating.
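One knob that maps loosely onto that tradeoff is classifier-free guidance: a low guidance scale lets the model drift away from the prompt, a high one forces rigid adherence. A small sweep makes the spectrum visible; this reuses the `pipe` from the earlier snippet, and the scale values are just illustrative sample points, not a recommendation.

```python
# Sketch: probe how tightly the output is pinned to the prompt by
# sweeping guidance_scale with everything else held constant.
prompt = "a half-remembered desert river at dusk"
for scale in (3.0, 7.5, 15.0):
    generator = torch.Generator("cuda").manual_seed(7)
    image = pipe(prompt, guidance_scale=scale, generator=generator).images[0]
    image.save(f"guidance_{scale:.1f}.png")
```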
The pixel art outputs were unexpected. The model renders them with a confidence that suggests the training data ran deep with retro game art. These feel less like generated images and more like recovered screenshots from games that were never shipped. Whole worlds compressed into a few hundred pixels, implied rather than rendered.
The best outputs are the ones that don’t quite make sense. Objects almost recognizable. Scenes that feel like memories of places you’ve never been. That’s where diffusion models are most honest — not when they faithfully reproduce a prompt, but when they wander off and show you something you didn’t ask for.