Gen AI & ML
The demand for high-quality product visuals is ever-present in the online marketplace. Balancing the need for speed, budget, and artistic control can be a challenge. Can generative AI automate product imagery while maintaining artistic integrity? This article explores this possibility, using an experiment with ComfyUI to find a balance between automation and artistry.
Can a predefined workflow, tailored to a specific product category, generate diverse images with consistent style and quality?
What does effective automation look like in this context?
Following an initial exploration of home furniture in the previous article, this article turns to perfume as a deliberate test of automated image generation. Perfume's association with elegance makes it an ideal test case for assessing whether generative AI can capture subtle brand nuances. The article focuses on automating lifestyle shots of perfume bottles in curated settings. The intent is to create specialized workflows for specific product categories while ensuring consistent and aesthetically pleasing results.
To simulate the creative process of designing lifestyle imagery for perfume products, three assumptions are made:
Desired aesthetics, such as soft lighting, rich colors, and a sense of depth, can be predefined.
The product is centered in the composition, with surrounding elements that accentuate it.
Backdrops follow a theme aligned with the perfume's scent profile. In this research, I defined two backdrop themes: floral and beach.
The primary automation goals are:
Creative Direction Input: A user-friendly interface for providing visual feedback and adjustments.
Consistency: A unified visual style across all product images.
Scalability and Efficiency: Easy scaling of image production to meet demands.
The automation workflow is designed in four parts:
User-Friendly Front-End UI: The front end prioritizes simplicity and ease of use. It is a simple webpage with a chat interface where users upload a product image and enter a text prompt to select the background theme (Floral or Beach).
Automation Functions: These functions connect the front-end UI with the ComfyUI backend running on a remote server.
ComfyUI Backend Mechanics: A remote server is configured with ComfyUI and its dependencies. The automated processes include image masking, background theme generation based on the prompt, compositing, relighting, and color correction.
Output Retrieval: Generated images are automatically saved to a designated location, e.g. Google Drive.
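As a rough illustration of the second part (the functions that connect the front end to ComfyUI), the sketch below patches a workflow exported in ComfyUI's API format with the uploaded image name and a theme prompt, then posts it to ComfyUI's /prompt endpoint. It is a minimal sketch and not the code used in this experiment: the node IDs ("10" and "6"), the prompt wording in THEME_PROMPTS, and the function names are all assumptions made for illustration, and a real workflow would have its own node IDs.

```python
import copy
import json
import urllib.request

# Hypothetical prompt text for each theme; the article only names the themes.
THEME_PROMPTS = {
    "floral": "perfume bottle among fresh flowers, soft lighting, rich colors, depth",
    "beach": "perfume bottle on sand by the sea, soft lighting, rich colors, depth",
}


def build_payload(workflow, image_name, theme, client_id,
                  load_image_node="10", prompt_node="6"):
    """Return a /prompt request body with the image and theme prompt filled in.

    `workflow` is a dict in ComfyUI's API format (node id -> class_type/inputs).
    The node ids are assumptions; look them up in your own exported workflow.
    """
    key = theme.strip().lower()
    if key not in THEME_PROMPTS:
        raise ValueError(f"Unknown theme: {theme!r}")
    patched = copy.deepcopy(workflow)  # leave the template untouched
    patched[load_image_node]["inputs"]["image"] = image_name
    patched[prompt_node]["inputs"]["text"] = THEME_PROMPTS[key]
    return {"prompt": patched, "client_id": client_id}


def queue_prompt(server, payload):
    """POST the payload to ComfyUI's /prompt endpoint and return its JSON reply."""
    request = urllib.request.Request(
        f"http://{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

A front-end handler could call build_payload with the user's uploaded file name and chosen theme, pass the result to queue_prompt with the server address, and use the returned prompt ID to look up the finished image later.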
The results demonstrate the automation potential, with a web user interface communicating with a local server and output saved to Google Drive. They also validated process efficiency, with an average rendering time of 30-120 seconds per image on a GPU (RTX 3070+). Image quality and thematic consistency are demonstrated below.
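To put the reported rendering times in terms of scalability, the short calculation below converts seconds per image into images per hour. It assumes a single GPU rendering one image at a time with no queueing or upload overhead, which is a simplification and not a measured result.

```python
def images_per_hour(seconds_per_image: float) -> float:
    """Images rendered in one hour at a fixed per-image time (one GPU, sequential)."""
    return 3600 / seconds_per_image


slowest = images_per_hour(120)  # upper end of the reported range
fastest = images_per_hour(30)   # lower end of the reported range
print(f"{slowest:.0f}-{fastest:.0f} images per hour per GPU")
```

Under these assumptions, the reported range corresponds to roughly 30 to 120 images per hour on one GPU.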
Although this simple approach would benefit from additional processes (compositing, relighting, color correction, etc.), it demonstrates that automating sequential workflows can improve consistency, usability, and creative control when generating AI-driven content.
The experiment deliberately relied only on APIs to showcase the workflow's capabilities. While this approach demonstrates the potential of automation, it also reveals that certain nuanced creative decisions are still best made in tools like ComfyUI. As such tools continue to evolve, they provide greater flexibility for experimenting with and refining AI-driven content generation. Once workflows are well-defined, structuring them into sequential steps with human oversight enhances reliability, ensuring a balance between automation and creative control.
How do you see AI shaping creative workflows in your field? Which considerations do you find most important? I would love to hear your thoughts!
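The idea of sequential steps with human oversight can be sketched as a small pipeline that runs each step in order and then asks a reviewer to accept or reject the result. This is an illustrative sketch only: the step functions are placeholders that set flags on a job dictionary, and every name here is hypothetical, not part of ComfyUI or of the experiment's code.

```python
from typing import Callable, List

Step = Callable[[dict], dict]


def mask_product(job: dict) -> dict:
    # Placeholder: a real step would isolate the bottle from its background.
    return {**job, "masked": True}


def generate_background(job: dict) -> dict:
    # Placeholder: a real step would generate the themed backdrop.
    return {**job, "background": job["theme"]}


def composite(job: dict) -> dict:
    # Placeholder: a real step would merge the product and the backdrop.
    return {**job, "composited": True}


def run_pipeline(job: dict, steps: List[Step],
                 approve: Callable[[dict], bool]) -> dict:
    """Run the steps in order, log each one, then apply a human review gate."""
    for step in steps:
        job = step(job)
        job.setdefault("log", []).append(step.__name__)
    job["status"] = "approved" if approve(job) else "needs_revision"
    return job
```

In practice, the approve callback would be the point where a designer looks at the rendered image and either releases it or sends it back for another pass.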