Machine Collaboration: Making Images with Generative AI
Working with AI to generate visuals isn’t like drawing, painting, or photography. It’s closer to whispering a dream into a box and watching it guess what you meant. The process is powerful, but not precise. You don’t tell it what to make. You suggest. You describe. You provoke.
A Black Box
The process for AI image generation is fundamentally opaque. You input words; you get images. But what happens in between is buried in billions of parameters and statistical patterns. You can’t fine-tune pixel by pixel. You can’t expect it to be exact. Asking an AI to make an exact thing is like asking a stranger to rebuild your childhood home based on a poem you wrote in high school. It might capture the feeling, or get lucky and hit a detail, but it’s not a blueprint.
Common Pitfalls
Currently, AI image tools don’t perform well with overstuffed prompts. Cramming too much into a single generation spreads an AI’s effort too thin. If you ask for ten realistic characters, all with specific appearances, a detailed background, a complex mood, precise lighting, etc., then something has to give. You’ll often end up with a murky soup instead of a coherent scene.
That’s because the model spreads its effort across everything you request: the more you ask for, the less clearly each thing is rendered. Prioritize. Simplify. Choose one or two anchors and let the rest stay loose. You’ll get better results, and they’ll feel more intentional. If you want clarity, give the model a focal point: “a portrait of a family of three.” If you’re okay with ambiguity, let it stay vague: “a family portrait.” Both can work; just don’t expect a perfect result from a blurry ask. It helps to either focus tightly or embrace the dream logic.
Prompting is an Art
We’ve all heard of prompt engineering, but the real skill is more like poetry than programming. The best prompts are expressive, visual, and suggestive. But with every adjective, every descriptor, you open the door to interpretation. AI is not literal. “A giant pizza” might output a pie that barely fits on a table or one as big as a building. That’s the thrill and the risk.
Prompting isn’t just about what you say. It’s also about how much you give. In AI, there are three common prompt styles: zero-shot, one-shot, and few-shot. Each method has its use, but more inputs don’t always make for a better output. Sometimes one clear and concise prompt is all you need.
Zero-Shot
• A zero-shot prompt asks the model to generate something from a simple description alone, with no examples
• The most common prompt style and often the most surprising
• Example: "a futuristic bicycle in a desert landscape"
One-Shot
• One-shot prompting gives a single example to guide the output
• Example: "an image like this sketch, but with a different color scheme"
Few-Shot
• Few-shot prompting offers a short series of examples to establish a pattern, style, or format
• Can help steer the model more consistently, but it also increases complexity and reduces flexibility
• Example: "a futuristic concept car based on these eight examples"
A Tasty Analogy
Creating AI-generated images is a lot like baking. First, you gather your ingredients: the elements you want in your image, such as the style, the setting, the tone, and the subject matter. Next, you write your recipe, which is your prompt. This step shapes how all the ingredients come together. Then comes the mixing and kneading, an important step that is often overlooked in AI generation. This iteration phase is where things start to take form but still need refinement, through repeated adjustments to prompt wording, order, and detail. We all want immediate results, but sometimes the bread comes out burnt and sometimes it’s undercooked. Occasionally it’s perfect by accident, but no matter what, it always needs a bit of hands-on effort from our human hearts and minds.
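The kneading phase is essentially a loop: generate, look at the result, adjust the prompt, and try again. A minimal sketch of that loop follows; both `generate` and `looks_right` are hypothetical placeholders, since in real work generation is a call to your image tool and evaluation is a human looking at the output:

```python
# Minimal sketch of the iterate-and-refine loop. generate() and
# looks_right() are placeholder stand-ins: in practice, generation
# is an API call and evaluation is human judgment, not code.

def generate(prompt: str) -> str:
    # Placeholder: pretend the "image" is the prompt echoed back.
    return f"image({prompt})"

def looks_right(image: str) -> bool:
    # Placeholder for human review of the generated image.
    return "warm lighting" in image

prompt = "a family portrait"
adjustments = [", warm lighting", ", shallow depth of field"]

for tweak in adjustments:
    image = generate(prompt)
    if looks_right(image):
        break
    # Refine the prompt wording and regenerate.
    prompt += tweak

print(prompt)  # the refined prompt after iteration
```

The point of the sketch is the shape of the process, not the code: each pass tweaks one thing, and you stop when the result finally reads the way you intended.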
Our Process of Organized Exploration
Behind the scenes, CARNEVALE’s process blends experimentation with structure. We use FigJam whiteboards to store and organize prompts and generations, then use that space to gather client feedback. It becomes a living workspace where every iteration, successful or strange, gets pinned and evaluated. We start with broad concept prompts, review early generations, and identify what feels right or wrong. We map this out visually with side-by-side comparisons, refinements, and new branches of exploration. FigJam becomes our shared visual diary, where clients can react and we can dial in the direction together. Our job is part designer, part translator, part curator: we guide both the machine and our clients toward a result we all believe in.
The Human Touch
Lastly, this isn’t replacing artists or experts. A good AI output still depends on the taste, direction, and vision of the person behind the prompt. It’s a tool, and like any tool, it can elevate good thinking or amplify bad taste. The magic is in the collaboration. We'd love to hear how others are approaching generative AI in their own visual work. What’s worked for you? What’s surprised you? If you have thoughts, questions, or stories to share, let’s chat.