Generate Image from Text

Posted Nov 21, 2024 Updated Jan 17, 2025

By Masaya Narita

Today, I used a generative AI model to create an image of the future. The following picture shows an example of a working environment for an engineer at a chemical company. Throughout the prompting, I was surprised by the quality of the generated images, which was much better than what I had gotten a couple of years ago.

Figure 1: Image generated by ChatGPT

Experience

I used the text-to-image function of ChatGPT to turn the future I envision into an image. Figure 1 shows my future workspace, where I work as a researcher in a chemical materials company. I was surprised to see high-quality images with fine details, even from a short prompt. On the other hand, the AI's ability to understand context sometimes interferes with image prompting. For example, when I added ‘minimalistic’ to my prompt for the future workspace, the output appeared to include Zen-related motifs.

Interest: Emerging issues on AI Image Generation

I looked into the problems that have been reported as AI image generation has become widespread, and found that many of them are related to the datasets the AI is trained on.

Generation of copyrighted images

If the dataset contains copyrighted images, the AI may output images that resemble existing ones. Although the integrity of a copyrighted work should normally be preserved, generative AI makes it much easier to produce images that are rearranged from copyrighted works.

AI-generated images contain bias

It has also been noted that dataset bias, a problem not limited to image generation, produces output that reinforces conventional stereotypes. An experiment conducted in 2023 using Midjourney reported that when the authors tried to generate images that reversed the usual stereotypes in global health, the AI swapped the races of the people depicted, and the intended images could not be generated[1].

Figure 2: Image generated from the prompt "Black African doctor is helping poor and sick White children, photojournalism" (retrieved from the paper[1])

Thoughts

AI image generation is a useful tool for materializing ideas, especially because it can be controlled in natural language. However, it is important to understand that the resulting image is not a 100% reflection of the idea in the user’s head, but merely a combination of existing images, the so-called dataset, assembled to meet the prompt’s requirements as far as possible.

Reference

[1] Alenichev, A., Kingori, P., & Peeters Grietens, K. (2023). Reflections before the storm: The AI reproduction of biased imagery in global health visuals. The Lancet Global Health, 11(10), e1496–e1498. https://doi.org/10.1016/S2214-109X(23)00329-7

This post is licensed under CC BY 4.0 by the author. Images and figures from external sources are excluded from this license and remain under their original copyright.