With the dawn of generative AI, everyone is rushing to integrate this powerful new capability into their products and services. At Momentum Design Lab, a growing share of our projects call for experiences driven by AI, which means thinking carefully about what this implies for the user. How do people interact with systems powered by artificial intelligence? What support do they need in transitioning to this new paradigm? How can we enable a sense of trust and control with this new technology?
In this article, we will discuss considerations around designing interfaces powered by generative AI, pulling from experiences in projects, conferences attended, and exploration of numerous AI tools and technologies.
Consideration 01: Guidance and Freedom
In our exploration of AI tools, we have seen varying levels of guidance and freedom in the interface, from the openness of a free text field, as in ChatGPT, to a constrained, wizard-like experience when generating photos with Remini. How do we determine the best level of guidance and structure for collecting user input?
Remini’s interface takes users through a series of steps to generate customized photos. There is no place to enter free text. Instead, users choose from a variety of predefined options.
To answer this question, we must empathize with the user by understanding their mindset when they are approaching the experience. How familiar will they be with interacting with AI? Are they excited by the technology itself, or the tool’s capacity to deliver the desired outcome seamlessly? What task are they trying to accomplish, and what outcomes are they expecting?
Asking people to enter something from infinite possibilities can be daunting and overwhelming. We are used to user interfaces that impose some form of constraint: before genAI, inputs that directly affected outputs were typically collected through elements like dropdowns and checkboxes, or as short words and phrases.
OpenAI’s ChatGPT interface with a free text field. Minimal guidance is still provided, with examples of where to start. This information is aimed at users who may be encountering this kind of interaction for the first time.
This feeling of being lost in infinite possibilities compounds if the task at hand is more nebulous, like wanting to learn or explore a topic. In cases like this, people may need more guidance on what to ask and where to start.
The design team at Duolingo, a language learning service, discussed an example of this at Figma Config this year: learners ran out of things to ask the AI language tutor after one to three questions. They were looking to the tutor for guidance, so the AI needed an opinion on what the learner should ask next and had to steer them in that direction. This mimicked the real-life relationship between a teacher and student much more closely than an ask-me-anything approach.
Duolingo Max interface that provides pre-populated options for the learner to choose from.
However, if the objective and the method for reaching the desired outcome are clear in the mind of the user, the free text field may be the best way to empower them to get the result they are looking for quickly. Providing total freedom by bypassing the steps of a wizard and removing the constraints of pre-populated options may result in less frustration and more time saved.
In many cases, a hybrid approach may be employed by mixing familiar, constrained inputs with free text entry. This ensures that necessary information to generate the expected outcome is collected, while giving the user freedom to express anything additional to further tailor the output.
An example of a hybrid approach from Uizard’s Autodesigner. This ensures that critical information is collected to create a desirable output while still giving the user freedom to enter anything.
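The hybrid pattern can be sketched in a few lines of code. The example below is a minimal illustration, not how Uizard or any other product is implemented: the option lists, the HybridInput fields, and the prompt wording are all hypothetical. It shows constrained choices being validated before an optional free text addition is appended.

```python
from dataclasses import dataclass

# Hypothetical constrained options; a real product would define its own.
ALLOWED_PLATFORMS = {"mobile", "tablet", "web"}
ALLOWED_STYLES = {"minimal", "playful", "corporate"}


@dataclass
class HybridInput:
    platform: str        # constrained: picked from a fixed list
    style: str           # constrained: picked from a fixed list
    free_text: str = ""  # unconstrained: anything the user wants to add


def validate(user_input: HybridInput) -> list[str]:
    """Return a list of problems; an empty list means the input is usable."""
    problems = []
    if user_input.platform not in ALLOWED_PLATFORMS:
        problems.append(f"unknown platform: {user_input.platform!r}")
    if user_input.style not in ALLOWED_STYLES:
        problems.append(f"unknown style: {user_input.style!r}")
    return problems


def build_prompt(user_input: HybridInput) -> str:
    """Combine the required, constrained fields with optional free text."""
    problems = validate(user_input)
    if problems:
        raise ValueError("; ".join(problems))
    lines = [f"Design a {user_input.style} interface for {user_input.platform}."]
    extra = user_input.free_text.strip()
    if extra:
        lines.append(f"Additional details from the user: {extra}")
    return "\n".join(lines)
```

The constrained fields guarantee that the minimum information reaches the prompt, while the free text field stays optional so that users who know exactly what they want are not slowed down.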
We think of the different approaches to genAI-powered interfaces as sitting on a spectrum between freedom and guidance. Depending on how well defined the task and method are in the mind of the user, we can employ different interaction patterns to meet them where they are. The critical questions to ask are: (1) What is the task the user needs to accomplish? Does the interface provide enough guidance to enable success without slowing down the user? (2) What are the expected outcomes? Does the interface ensure the appropriate quality and quantity of information will be entered to generate the expected outcome?
Consideration 02: Prompt Writing and the Design Process
Prompt writing considerations mapped to the design process.
To generate expected outcomes, prompt writing must happen in parallel with the design process. Prompt engineers and designers should work in close collaboration, as the interface and the prompt are tightly interwoven in creating the desired experience. Prompt writing should start early in the design process and remain a key component throughout.
Prompt engineers should be looped in during the discovery and research phase to understand context and scope. Prompt writing will play a large part in driving the experience. The prompt will determine what input is needed from the user and will therefore have influence over the direction of the wireframes. In strategy sessions, prompt writers and designers should be on the same page when determining the goals for the desired outputs.
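One way to see how the prompt drives the wireframes is to treat the prompt as a template and read the required inputs straight from it. The sketch below is illustrative only: the tutor prompt and its field names are hypothetical, and real prompts are usually more involved. It uses the standard library's string.Formatter to list the placeholders a template expects, which is effectively the list of things the interface must collect.

```python
from string import Formatter

# Hypothetical prompt template; the placeholder names are assumptions.
PROMPT_TEMPLATE = (
    "You are a language tutor. The learner speaks {native_language} and is "
    "studying {target_language} at a {level} level. Suggest three questions "
    "the learner could ask next."
)


def required_inputs(template: str) -> list[str]:
    """List the placeholder names in a template, in order of first use."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def missing_inputs(template: str, collected: dict[str, str]) -> list[str]:
    """Names the interface still needs to collect before generating."""
    return [
        name
        for name in required_inputs(template)
        if not collected.get(name, "").strip()
    ]


def render(template: str, collected: dict[str, str]) -> str:
    """Fill the template, refusing to run with incomplete input."""
    missing = missing_inputs(template, collected)
    if missing:
        raise ValueError(f"missing inputs: {', '.join(missing)}")
    return template.format(**collected)
```

If a prompt writer adds or removes a placeholder, required_inputs changes with it, which makes the dependency between prompt and interface visible to both designers and prompt engineers.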
During the ideation and design phases, designers and prompt engineers should work together to explore different approaches and refine what the desired outcome looks like. They must build the experience by modifying the prompt and design in conjunction with one another. It is important to note that when conducting testing, feedback can have implications for both the design as well as the underlying prompt. Iteration may apply to both areas, and each affects the other.
It is also likely that designers and others on the team will be responsible for prompt writing if there isn’t a designated role for this work. Prompt development will still align with the design process in the same way as shown above, but it will be important for team members to know how to write effective prompts.
Consideration 03: Cognitive Demand
A benefit of designing with generative AI is that it can be used to reduce cognitive load. With natural language processing, we have the opportunity to make interactions more natural and less cognitively demanding. However, we want to be careful when providing instructions or asking for input so that we do not add unnecessarily to the cognitive load of the user.
For example, if we opt for a high guidance interface like a wizard, we want to make sure that questions don’t rely on remembering the input to previous questions. We also want to design interfaces to encourage easy data collection from the user. Asking for smaller bits of information, focusing on one topic at a time, will help reduce the cognitive load.
A goal for an AI powered experience is for the user to communicate their thoughts and intentions as easily and naturally as possible. We want to be careful not to over-complicate the process or demand too much problem-solving from the user. Information we are asking people to provide should be easily associated with the output and not difficult to come up with.
In this example of Khroma, an AI color tool, users must select 50 colors from a variety of hues and shades to train the AI to their taste. The sheer number of required selections, combined with keeping track of progress and of the balance of colors, introduces a high level of cognitive load.
Consideration 04: User Control
Maintaining a sense of control for the user is crucial for a successful, safe experience and for avoiding frustration. This is especially important when designing with AI, as there may be a feeling of apprehension or fear in the beginning. Giving control to the user helps to build trust and a sense of ease with the technology.
Seeing how input relates to output helps users understand why and how they got where they are in the experience. Displaying the input alongside the output also provides context for deciding what modifications to make.
If the user generates an undesirable output, we don’t want them to be stuck with it. The ability to modify the input and the ability to regenerate the output are both important places to give the user control. Conversely, if the user gets a desired output, they need a way to keep it: results will be slightly different each time, so the ability to download, save, or share results is also important.
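These control points can be sketched as a small data model. The code below is a hypothetical illustration rather than any particular product's design: Session, Generation, and the generate callable (a stand-in for a call to whatever model is used) are all assumed names. Each output is stored with the input that produced it, so the two can be displayed together, the input can be edited and rerun, and the history can be exported for saving or sharing.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Optional


@dataclass
class Generation:
    prompt: str  # the input, kept so it can be shown next to the output
    output: str


@dataclass
class Session:
    generate: Callable[[str], str]  # stand-in for a call to any model
    history: list[Generation] = field(default_factory=list)

    def run(self, prompt: str) -> Generation:
        record = Generation(prompt=prompt, output=self.generate(prompt))
        self.history.append(record)
        return record

    def regenerate(self, edited_prompt: Optional[str] = None) -> Generation:
        """Rerun the last prompt, optionally with the user's edits."""
        if not self.history:
            raise ValueError("nothing to regenerate yet")
        prompt = self.history[-1].prompt if edited_prompt is None else edited_prompt
        return self.run(prompt)

    def export(self) -> str:
        """Serialize every input/output pair so results are not lost."""
        return json.dumps([asdict(g) for g in self.history], indent=2)
```

Keeping earlier generations in the history, instead of overwriting them, means a regenerated result never destroys one the user liked.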
Depending on the task, results could be generated instantly or it may take a long time for the AI to produce an output. In either case, paying attention to the details of the waiting experience is integral to making the interaction feel natural and seamless. If the output is generated instantly, we still want to present a short loader to the user, even if it doesn’t actually take that amount of time to load. This is to indicate that their input is being processed and to provide context for what they will see next. Be sure to follow best practices with progress indicators, such as setting expectations and giving the option to cancel a request, especially with longer wait times.
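Both behaviors, a minimum loader time and a cancel option, can be sketched with Python's asyncio. This is a minimal sketch under stated assumptions: the 0.6 second minimum is an arbitrary illustrative value rather than a recommendation, and generate is a hypothetical coroutine function standing in for the real model call.

```python
import asyncio

# Assumed minimum time to keep the loader visible; tune per product.
MIN_LOADER_SECONDS = 0.6


async def generate_with_loader(generate, min_seconds=MIN_LOADER_SECONDS):
    """Await generate(), but never return before min_seconds have passed."""
    result, _ = await asyncio.gather(generate(), asyncio.sleep(min_seconds))
    return result


async def generate_cancellable(generate, cancel_event, min_seconds=MIN_LOADER_SECONDS):
    """Run generation until it finishes or the user cancels.

    Returns the result, or None if cancel_event was set first.
    """
    work = asyncio.create_task(generate_with_loader(generate, min_seconds))
    cancel = asyncio.create_task(cancel_event.wait())
    done, _ = await asyncio.wait({work, cancel}, return_when=asyncio.FIRST_COMPLETED)
    if work in done:
        cancel.cancel()
        return work.result()
    # The user cancelled first: stop the generation and clean up the task.
    work.cancel()
    try:
        await work
    except asyncio.CancelledError:
        pass
    return None
```

In a real interface, the cancel button's handler would call cancel_event.set(), and the loader would be shown for as long as generate_cancellable is pending.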
Consideration 05: Ethics
At Momentum, ethics play an important role in our decisions and approach around AI. We are a human-centered organization that values empathy and humanity. We always want to consider the safety and well-being of the user.
Our approach to AI is that it is here to augment, not replace, humans. Even though the experience may be powered by AI, it is always driven by a human. The considerations outlined in this article enable people to drive with their intentions and ideas.
When asking for input from users, only ask for information necessary for the task at hand. We do not use the helpfulness of the AI tool to coerce the collection of information for other purposes. We want to ensure informed consent, which means people understand what we are asking for and why. We also want to make sure that people understand their rights around privacy and controlling their personal data.
It is also our responsibility as designers and makers to strive to create true representations. Do not distort or misrepresent the intent of the user in the output. This involves many considerations, including the data source used. If the data is biased, the output may in turn be distorted.
Finally, we want to ensure that what we design is inclusive and equitable. Reducing cognitive load is one way to make interfaces more widely accessible. We can also think about different input methods, such as voice, eye movement, or brain-wave input, to include more people in the AI revolution.
As the landscape changes rapidly and new capabilities are developed every day, we want to be prepared to address new challenges and build the future. AI is a powerful engine that holds a lot of potential, and if we can effectively apply human-centered design principles to create useful and meaningful interactions, the results will be truly transformative.
As you embark on your journey to design experiences powered by this extraordinary technology, be sure to consider the following questions:
What is the task the user needs to accomplish? Is there enough guidance provided to enable success without slowing down the user?
What are the expected outcomes? Does the interface ensure the appropriate quality and quantity of information will be entered to generate the expected outcome?
Is prompt writing being considered alongside the entire design process? Are designers and prompt engineers working in conjunction and collaboration with one another?
Is the implementation of AI helping to reduce cognitive load or is it introducing unnecessary complexity for the user?
Does the user have control over the experience? Are they always in the driver’s seat?
Are we only asking for relevant information? Do people understand what they are providing us and why?
Is the experience excluding anyone? Are there alternate ways we can make the functionality available to more people?