Rethinking How We Visualize Generative AI
I have taken all sorts of online courses on artificial intelligence, from the very basic 101s to classes on niche topics. Most of them feature, to some degree, the now-classical tropes of how the world has come to visualize “AI”. You have probably seen them too – the brain, the illuminated neural net, the robot or android, and even the all-seeing mechanical eye.
They are so common and so well understood that some instructors even call out their overuse. More to the point, the images that have become attached to what we collectively refer to as artificial intelligence are not terribly good at conveying what AI actually is as understood today, and they can even be misleading, reinforcing general misconceptions about AI, e.g. that it is conscious, or on the cusp of attaining man-made consciousness.
So I set off on a small journey, asking myself: if we throw out all of these worn, overused, and tired images that represent AI at this very moment, what visualizations would best replace them? What sort of imagery could capture what generative AI actually is?
As with any AI prompting, the quality of the inquiry leads to better and more refined answers. I could have simply asked any LLM, “If you ignore <list of all AI image tropes>, what is left to be our visual manifestation of AI?” That could be a useful start, but a bland one. Instead, I wanted to frame the structure of “generative AI” first, and only then seek a model’s assistance with the visualization.
In the broadest terms, generative AI consists of three components (sketched in miniature after this list):
- Foundation Training – All models, whether LLMs or derivatives for images, video, or any other data asset, must be trained on a large source set. For LLMs that training set is “the Internet”, with all the hazards that come with it. The Internet is not a fountain of truth but a chaotic medley of statements, assertions, opinions, debate, and oftentimes misdirection and outright falsification. Yet, collectively, these mountains and mountains of data are what the models need to build their abilities, to weave strands together and output something that mimics normal language responses.
- Prompt Interpretation – We humans then engage the model, and all of its back-end training data, through the prompts we feed in. Through these prompts, whether typed directly or delivered via APIs, agents, automations, and the like, the models are given purpose and direction, a means to harness their foundational training data.
- Generative Output – The emergent creation is what the model builds, based on our prompt directions and sourced from the model’s foundational set.
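To make the three stages a little more concrete, here is a deliberately tiny Python sketch. It is nothing like how real LLMs actually work; the corpus, the bigram table, and the prompt are toy stand-ins I chose purely for illustration, but they map one-to-one onto foundation training, prompt interpretation, and generative output.

```python
import random
from collections import defaultdict

# 1. Foundation Training: build a (toy) "model" from a source corpus.
#    Real models train on vast swaths of the Internet; this bigram table
#    merely mirrors the idea of absorbing patterns from source data.
corpus = (
    "generative ai turns vast training data into new output "
    "guided by the prompts we provide to the model"
).split()

bigrams = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    bigrams[current_word].append(next_word)

# 2. Prompt Interpretation: the prompt gives the model its direction.
prompt = "generative"

# 3. Generative Output: new text emerges, shaped by both the foundation
#    data and the prompt that steered it.
word = prompt
output = [word]
for _ in range(12):
    candidates = bigrams.get(word)
    if not candidates:
        break
    word = random.choice(candidates)
    output.append(word)

print(" ".join(output))
```

Run it a few times and the output can vary, which is rather the point: the same foundation and the same prompt can still yield different creations.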
I used this three-tiered approach to frame my inquiry to Copilot, asking it to suggest how generative AI, in the context of a vast foundational data set, prompt interpretation, and final creation, could be visualized in a way that gets closer to imagery that truly explains what it is.
After collaborating through multiple analogies and iterations, I arrived at the image concept attached to this article. I fully appreciate that it is not perfect, but I do feel it is “closer to perfect” than the classical, tired images are.
The image can almost be called a tribunal of three distinct concepts, each playing its role in this generative AI tapestry.
In the background is a huge stone wall covered in hieroglyphic-like shapes, representing the foundation data set the AI was trained on. A man holds a prism of sorts, angling the light so that it reflects off the monolithic wall. He represents us and our prompting as we attempt to harness the foundation data into something we desire to create. And this direction leads to the final aspect: the plant-like coral springing to life, or at least appearing to, through our guided positioning of the prism to catch the hieroglyphics in just the right way.
Yet it is entirely possible that the man did not want coral at all, but instead needed a spreadsheet with last month’s sales figures. In that case, he needs to go back to the drawing board and re-engineer his prompt!
Hopefully you found this exercise insightful or at least fun. How would you rethink the imagery associated with generative AI? Are there other, completely different visual cues that capture the essence of generative AI better?
