Generate Image with Chat Prompt

When building a chatbot, you may want to allow the user to generate an image. This can be done by creating a tool that generates an image using the experimental_generateImage function from the AI SDK.

Server

Let's create an endpoint at /api/chat that generates the assistant's response based on the conversation history. You will also define a tool called generateImage that will generate an image based on the assistant's response.

tools/generate-image.ts

import { openai } from '@ai-sdk/openai';
import { experimental_generateImage, tool } from 'ai';
import { z } from 'zod';

export const generateImage = tool({
  description: 'Generate an image',
  inputSchema: z.object({
    prompt: z.string().describe('The prompt to generate the image from'),
  }),
  execute: async ({ prompt }) => {
    const { image } = await experimental_generateImage({
      model: openai.imageModel('dall-e-3'),
      prompt,
    });
    // In production, save this image to blob storage and return a URL instead.
    return { image: image.base64, prompt };
  },
});

app/api/chat/route.ts

import { openai } from '@ai-sdk/openai';
import {
  convertToModelMessages,
  type InferUITools,
  stepCountIs,
  streamText,
  type UIMessage,
} from 'ai';
import { generateImage } from '@/tools/generate-image';

const tools = {
  generateImage,
};

export type ChatTools = InferUITools<typeof tools>;

export async function POST(request: Request) {
  const { messages }: { messages: UIMessage[] } = await request.json();

  const result = streamText({
    model: openai('gpt-4o'),
    messages: convertToModelMessages(messages),
    stopWhen: stepCountIs(5),
    tools,
  });

  return result.toUIMessageStreamResponse();
}

In production, you should save the generated image to blob storage and return a URL instead of the base64 image data. Otherwise, the base64 data is sent back to the model on subsequent steps, which may cause the generation to fail.
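As a minimal sketch of that production approach, the tool's execute function could persist the bytes and return a URL path instead of the base64 string. Here the local filesystem (under Next.js's public/ directory) stands in for a real blob store such as Vercel Blob or S3, and saveImage is a hypothetical helper, not part of the AI SDK:

```typescript
import { existsSync, mkdirSync, writeFileSync } from 'node:fs';
import { randomUUID } from 'node:crypto';
import path from 'node:path';

// Hypothetical helper: persist a base64-encoded PNG to disk and
// return a URL path the client can render directly.
export function saveImage(base64: string): string {
  const dir = path.join('public', 'generated');
  mkdirSync(dir, { recursive: true });

  const filename = `${randomUUID()}.png`;
  writeFileSync(path.join(dir, filename), Buffer.from(base64, 'base64'));

  // Files under public/ are served from the site root in Next.js.
  return `/generated/${filename}`;
}
```

The tool would then return something like { image: saveImage(image.base64), prompt }, and the client would use the returned URL as the image src directly rather than building a data URI.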

Client

Let's create a simple chat interface with useChat. It calls the /api/chat endpoint to generate the assistant's response. If the assistant's response contains a generateImage tool invocation, we display the tool result (the base64 image and the prompt) using the Next.js Image component.

app/page.tsx

'use client';

import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport, type UIMessage } from 'ai';
import Image from 'next/image';
import { type FormEvent, useState } from 'react';
import type { ChatTools } from './api/chat/route';

type ChatMessage = UIMessage<never, never, ChatTools>;

export default function Chat() {
  const [input, setInput] = useState('');
  const { messages, sendMessage } = useChat<ChatMessage>({
    transport: new DefaultChatTransport({
      api: '/api/chat',
    }),
  });

  const handleInputChange = (event: React.ChangeEvent<HTMLInputElement>) => {
    setInput(event.target.value);
  };

  const handleSubmit = async (event: FormEvent<HTMLFormElement>) => {
    event.preventDefault();
    sendMessage({
      parts: [{ type: 'text', text: input }],
    });
    setInput('');
  };

  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      <div className="space-y-4">
        {messages.map(message => (
          <div key={message.id} className="whitespace-pre-wrap">
            <div className="font-bold">{message.role}</div>
            {message.parts.map((part, partIndex) => {
              const { type } = part;

              if (type === 'text') {
                return (
                  <div key={`${message.id}-part-${partIndex}`}>{part.text}</div>
                );
              }

              if (type === 'tool-generateImage') {
                const { state, toolCallId } = part;

                if (state === 'input-available') {
                  return (
                    <div key={`${message.id}-part-${partIndex}`}>
                      Generating image...
                    </div>
                  );
                }

                if (state === 'output-available') {
                  const { input, output } = part;
                  return (
                    <Image
                      key={toolCallId}
                      src={`data:image/png;base64,${output.image}`}
                      alt={input.prompt}
                      height={400}
                      width={400}
                    />
                  );
                }
              }

              return null;
            })}
          </div>
        ))}
      </div>
      <form onSubmit={handleSubmit}>
        <input
          className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl"
          value={input}
          placeholder="Say something..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}