9. Image Generation and Analysis

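This chapter adds two image capabilities to the project: text-to-image (generating a picture from a text description) and image analysis (uploading a picture for the AI to analyze). Each capability uses a different model, and both are integrated into the existing chat interface.

1. Capability overview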

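| Capability | Model | Ollama endpoint | Spring AI support |
| --- | --- | --- | --- |
| Text-to-image | x/z-image-turbo | POST /api/generate | Not supported; call the endpoint directly with WebClient |
| Image analysis | qwen3.5:9b | POST /api/chat | Supported, through a multimodal UserMessage |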

2. Environment setup

2.1 Install the image generation model

bash
ollama pull x/z-image-turbo

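x/z-image-turbo is a text-to-image model from Alibaba's Tongyi Lab (6 billion parameters). It accepts prompts in Chinese and English and generates images at 1024×1024 resolution.

2.2 Image analysis model

Image analysis uses the already installed qwen3.5:9b, which supports vision (multimodal text and image input), so nothing else needs to be installed.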

3. Database and entity changes

3.1 schema.sql

The MESSAGES table gains an image column that stores base64-encoded image data:

sql
-- Recreate the table so the new column takes effect (this deletes existing messages)
DROP TABLE IF EXISTS MESSAGES;
CREATE TABLE IF NOT EXISTS MESSAGES (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    topic_id BIGINT NOT NULL,
    role VARCHAR(20) NOT NULL,
    content TEXT NOT NULL,
    thinking TEXT,
    image CLOB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (topic_id) REFERENCES TOPICS(id)
);

3.2 Message entity

Update the Message record with a new image field and a matching factory method:

src/main/java/com/albertstack/aichat/entity/Message.java

java
package com.albertstack.aichat.entity;

import org.springframework.data.annotation.Id;
import org.springframework.data.relational.core.mapping.Table;

import java.time.LocalDateTime;

@Table("MESSAGES")
public record Message(
        @Id Long id,
        Long topicId,
        String role,
        String content,
        String thinking,
        String image,
        LocalDateTime createdAt
) {
    public static Message create(Long topicId, String role, String content) {
        return new Message(null, topicId, role, content, null, null, LocalDateTime.now());
    }

    public static Message create(Long topicId, String role, String content, String thinking) {
        return new Message(null, topicId, role, content, thinking, null, LocalDateTime.now());
    }

    public static Message createWithImage(Long topicId, String role, String content, String image) {
        return new Message(null, topicId, role, content, null, image, LocalDateTime.now());
    }
}

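3.3 ChatRequest update

The chat request gains an optional image field for the image analysis scenario (the user uploads a picture and asks a question about it):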

src/main/java/com/albertstack/aichat/dto/ChatRequest.java

java
package com.albertstack.aichat.dto;

public record ChatRequest(Long topicId, String content, boolean thinking, String image) {}

3.4 ChatChunk update

ChatChunk gains an image factory method, so image generation results are also pushed over the SSE stream:

src/main/java/com/albertstack/aichat/dto/ChatChunk.java

java
package com.albertstack.aichat.dto;

public record ChatChunk(String type, String text) {
    public static ChatChunk thinking(String text) {
        return new ChatChunk("thinking", text);
    }

    public static ChatChunk content(String text) {
        return new ChatChunk("content", text);
    }

    public static ChatChunk image(String base64) {
        return new ChatChunk("image", base64);
    }
}

The SSE stream now carries three chunk types: thinking (the reasoning process), content (the text reply), and image (a base64-encoded picture).
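
To make the three chunk types concrete, here is a minimal sketch of how a consumer might route each chunk by its type field. ChunkCollector and its nested Chunk record are hypothetical names used only for illustration (Chunk mirrors the shape of ChatChunk); this is not part of the project code.

```java
import java.util.List;

// Hypothetical consumer-side sketch: collects the three chunk types separately.
class ChunkCollector {
    // Mirrors the shape of ChatChunk(type, text); redefined here to stay self-contained.
    record Chunk(String type, String text) {}

    final StringBuilder thinking = new StringBuilder();
    final StringBuilder content = new StringBuilder();
    String imageBase64; // last image chunk seen, or null if none arrived

    // Route one chunk to the matching buffer based on its type.
    void accept(Chunk chunk) {
        switch (chunk.type()) {
            case "thinking" -> thinking.append(chunk.text());
            case "content" -> content.append(chunk.text());
            case "image" -> imageBase64 = chunk.text();
            default -> throw new IllegalArgumentException("Unknown chunk type: " + chunk.type());
        }
    }

    public static void main(String[] args) {
        ChunkCollector collector = new ChunkCollector();
        List.of(new Chunk("thinking", "considering..."),
                new Chunk("content", "Hello"),
                new Chunk("image", "iVBORw0KGgo="))
            .forEach(collector::accept);
        System.out.println("content=" + collector.content
                + ", image present=" + (collector.imageBase64 != null));
    }
}
```

Text chunks are appended because they arrive incrementally, while an image chunk replaces the previous value because it carries a complete picture.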

4. Backend: image generation

4.1 Configuration

Add the image model setting to application.yaml:

yaml
ai:
  model:
    no-think: frob/qwen3.5-instruct:9b
    image: x/z-image-turbo:latest
  prompt:
    default: classpath:prompts/default.txt
    no-think: classpath:prompts/no-think.txt

4.2 WebConfig

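A base64-encoded image is roughly 1-2 MB, while WebFlux's default in-memory buffer for request and response bodies is only 256 KB, so the server-side limit must be raised globally through WebFluxConfigurer. Otherwise, uploading an image for analysis fails with 413 CONTENT_TOO_LARGE. Receiving an image response through WebClient hits the same limit and fails with DataBufferLimitException; that client-side buffer is raised separately in ImageService (section 4.4).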

src/main/java/com/albertstack/aichat/config/WebConfig.java

java
package com.albertstack.aichat.config;

import org.springframework.context.annotation.Configuration;
import org.springframework.http.codec.ServerCodecConfigurer;
import org.springframework.web.reactive.config.WebFluxConfigurer;

@Configuration
public class WebConfig implements WebFluxConfigurer {

    @Override
    public void configureHttpMessageCodecs(ServerCodecConfigurer configurer) {
        // A base64 image is about 1-2 MB; the 256 KB default is too small
        configurer.defaultCodecs().maxInMemorySize(10 * 1024 * 1024);
    }
}
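
As a rough check on why 256 KB is not enough, the sketch below estimates the base64 length of an image from its raw size. The 1 MiB input size is an assumption chosen for illustration, and Base64SizeCheck is a hypothetical helper, not project code.

```java
import java.util.Base64;

// Hypothetical helper: estimates how large an image becomes after base64 encoding.
class Base64SizeCheck {
    // The 256 KB default in-memory limit discussed above.
    static final long DEFAULT_LIMIT = 256 * 1024;

    // Padded base64 emits 4 characters for every 3 input bytes, rounding up.
    static long encodedLength(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        long raw = 1024 * 1024; // assumption: a 1 MiB PNG
        long encoded = encodedLength(raw);
        System.out.println("encoded length: " + encoded + " chars");
        System.out.println("exceeds default limit: " + (encoded > DEFAULT_LIMIT));

        // Cross-check the formula against the JDK encoder on a small input.
        int actual = Base64.getEncoder().encodeToString(new byte[10]).length();
        System.out.println("formula matches JDK encoder: " + (actual == encodedLength(10)));
    }
}
```

Base64 inflates the payload by about one third, so even an image well under 1 MB on disk can exceed the default buffer once it is embedded in JSON.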

4.3 ImageRequest DTO

src/main/java/com/albertstack/aichat/dto/ImageRequest.java

java
package com.albertstack.aichat.dto;

public record ImageRequest(Long topicId, String prompt) {}

4.4 ImageService

Spring AI 2.0.0-M3 does not yet provide an OllamaImageModel, so we call Ollama's /api/generate endpoint directly with WebClient.

src/main/java/com/albertstack/aichat/service/ImageService.java

java
package com.albertstack.aichat.service;

import com.albertstack.aichat.dto.ChatChunk;
import com.albertstack.aichat.entity.Message;
import com.albertstack.aichat.repository.MessageRepository;
import com.albertstack.aichat.repository.TopicRepository;
import com.albertstack.aichat.util.SecurityUtils;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.ExchangeStrategies;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.util.Map;

@Service
public class ImageService {

    private final WebClient webClient;
    private final TopicRepository topicRepository;
    private final MessageRepository messageRepository;
    private final String imageModel;

    public ImageService(@Value("${spring.ai.ollama.base-url}") String baseUrl,
                        @Value("${ai.model.image}") String imageModel,
                        TopicRepository topicRepository,
                        MessageRepository messageRepository) {
        // A base64 image is about 1 MB, so raise the buffer via exchangeStrategies (default 256 KB)
        this.webClient = WebClient.builder()
                .baseUrl(baseUrl)
                .exchangeStrategies(ExchangeStrategies.builder()
                        .codecs(config -> config.defaultCodecs()
                                .maxInMemorySize(10 * 1024 * 1024))
                        .build())
                .build();
        this.imageModel = imageModel;
        this.topicRepository = topicRepository;
        this.messageRepository = messageRepository;
    }

    public Flux<ChatChunk> generateImage(Long topicId, String prompt) {
        return SecurityUtils.getCurrentUserId()
                .flatMap(userId -> topicRepository.findByIdAndUserId(topicId, userId))
                .switchIfEmpty(Mono.error(new RuntimeException("话题不存在")))
                .flatMapMany(topic ->
                        // Save the user's prompt message
                        messageRepository.save(Message.create(topicId, "user", prompt))
                                .thenMany(Flux.concat(
                                        // Push a status hint first
                                        Flux.just(ChatChunk.content("正在生成图片,请稍候...")),
                                        // Generate the image and save it
                                        callOllamaImageApi(prompt)
                                                .flatMap(imageBase64 ->
                                                        messageRepository.save(
                                                                Message.createWithImage(topicId, "assistant", "", imageBase64)
                                                        )
                                                )
                                                .map(msg -> ChatChunk.image(msg.image()))
                                                .flux()
                                ))
                );
    }

    private Mono<String> callOllamaImageApi(String prompt) {
        return webClient.post()
                .uri("/api/generate")
                .bodyValue(Map.of(
                        "model", imageModel,
                        "prompt", prompt,
                        "stream", false
                ))
                .retrieve()
                .bodyToMono(Map.class)
                .map(response -> {
                    Object image = response.get("image");
                    if (image == null || image.toString().isBlank()) {
                        throw new RuntimeException("图像生成失败:模型未返回图片");
                    }
                    return image.toString();
                });
    }
}

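Core design:

  • SSE streaming: generateImage returns Flux<ChatChunk>. It first pushes a content chunk with a status hint ("正在生成图片,请稍候...", meaning "Generating image, please wait..."), then pushes an image chunk (the base64 picture) once generation finishes. Because the front end receives the result over SSE, a long generation does not run into a request timeout.
  • callOllamaImageApi: sends a request to Ollama's /api/generate endpoint. In the JSON the model returns, the image field holds the PNG picture as a base64 string.
  • Message persistence: both the user message (the prompt text) and the AI message (the generated image) are saved to the database.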
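
Since the image field is expected to hold a base64-encoded PNG, a stricter check than "not blank" is possible before saving. The following is a minimal sketch, assuming the PNG format stated above; PngCheck is a hypothetical helper that is not part of the project.

```java
import java.util.Base64;

// Hypothetical helper: verifies that a base64 string decodes to PNG data.
class PngCheck {
    // The 8-byte signature that starts every PNG file.
    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    static boolean looksLikePng(String base64) {
        if (base64 == null) {
            return false;
        }
        byte[] bytes;
        try {
            bytes = Base64.getDecoder().decode(base64);
        } catch (IllegalArgumentException e) {
            return false; // not valid base64
        }
        if (bytes.length < PNG_SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < PNG_SIGNATURE.length; i++) {
            if (bytes[i] != PNG_SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String sample = Base64.getEncoder().encodeToString(PNG_SIGNATURE);
        System.out.println("signature only: " + looksLikePng(sample));
        System.out.println("plain text: " + looksLikePng("not an image"));
    }
}
```

A check like this could run inside the map step of callOllamaImageApi, so that a malformed response fails early instead of being persisted.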

4.5 ImageController

Image generation also uses an SSE endpoint, consistent with the chat API:

src/main/java/com/albertstack/aichat/controller/ImageController.java

java
package com.albertstack.aichat.controller;

import com.albertstack.aichat.dto.ChatChunk;
import com.albertstack.aichat.dto.ImageRequest;
import com.albertstack.aichat.service.ImageService;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
@RequestMapping("/api/image")
public class ImageController {

    private final ImageService imageService;

    public ImageController(ImageService imageService) {
        this.imageService = imageService;
    }

    @PostMapping(value = "/generate", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ChatChunk> generate(@RequestBody ImageRequest request) {
        return imageService.generateImage(request.topicId(), request.prompt());
    }
}

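Image generation uses an SSE stream as well. Although it ultimately returns a single image, generation can take 30 seconds or more, and a synchronous request would time out. SSE keeps the connection open: a status hint is pushed first, and the image data is pushed once generation finishes.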
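
On the wire, each chunk arrives as a "data:" line in the text/event-stream body, followed by a blank line. The sketch below pulls those payloads out of a raw response body, which can be handy when a test reads the whole stream as one String. It assumes each event has a single data line (true for compact JSON); SseBodyParser is a hypothetical helper, not project code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts the payload of every "data:" line from an SSE body.
class SseBodyParser {
    static List<String> dataPayloads(String body) {
        List<String> payloads = new ArrayList<>();
        // \R matches any line break (\n, \r\n, or \r).
        for (String line : body.split("\\R")) {
            if (line.startsWith("data:")) {
                String payload = line.substring("data:".length());
                // The SSE format allows one optional space after the colon.
                if (payload.startsWith(" ")) {
                    payload = payload.substring(1);
                }
                payloads.add(payload);
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        String body = "data:{\"type\":\"content\",\"text\":\"working\"}\n\n"
                + "data:{\"type\":\"image\",\"text\":\"iVBORw0KGgo=\"}\n\n";
        dataPayloads(body).forEach(System.out::println);
    }
}
```

For image generation the body holds two events: the status hint first, then the image chunk.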

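5. Backend: image analysis

Image analysis reuses the existing /api/chat endpoint. When ChatRequest.image is not empty, the image and the text are combined into a multimodal message and sent to qwen3.5 (which supports vision).

5.1 ChatService extension

Update the buildAiMessages method to support messages that carry an image: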

src/main/java/com/albertstack/aichat/service/ChatService.java

java
package com.albertstack.aichat.service;

import com.albertstack.aichat.dto.ChatChunk;
import com.albertstack.aichat.entity.Message;
import com.albertstack.aichat.repository.MessageRepository;
import com.albertstack.aichat.repository.TopicRepository;
import com.albertstack.aichat.util.SecurityUtils;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.Generation;
import org.springframework.ai.content.Media;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

@Service
public class ChatService {

    private final ChatClient chatClient;
    private final ChatClient noThinkChatClient;
    private final TopicRepository topicRepository;
    private final MessageRepository messageRepository;

    private static final int CONTEXT_WINDOW = 20;

    public ChatService(ChatClient chatClient,
                       @Qualifier("noThinkChatClient") ChatClient noThinkChatClient,
                       TopicRepository topicRepository,
                       MessageRepository messageRepository) {
        this.chatClient = chatClient;
        this.noThinkChatClient = noThinkChatClient;
        this.topicRepository = topicRepository;
        this.messageRepository = messageRepository;
    }

    public Flux<ChatChunk> chat(Long topicId, String userContent, boolean thinking, String image) {
        return SecurityUtils.getCurrentUserId()
                .flatMap(userId -> topicRepository.findByIdAndUserId(topicId, userId))
                .switchIfEmpty(Mono.error(new RuntimeException("话题不存在")))
                .flatMapMany(topic -> {
                        // Save the user message (together with the image, if there is one)
                        Mono<Message> saveUser = image != null && !image.isBlank()
                                ? messageRepository.save(Message.createWithImage(topicId, "user", userContent, image))
                                : messageRepository.save(Message.create(topicId, "user", userContent));

                        return saveUser
                                .thenMany(messageRepository.findByTopicIdOrderByCreatedAtAsc(topicId))
                                .collectList()
                                .flatMapMany(history -> {
                                    List<org.springframework.ai.chat.messages.Message> aiMessages =
                                            buildAiMessages(history);

                                    ChatClient client = thinking ? chatClient : noThinkChatClient;

                                    StringBuilder fullContent = new StringBuilder();
                                    StringBuilder fullThinking = new StringBuilder();

                                    return client.prompt()
                                            .messages(aiMessages)
                                            .stream()
                                            .chatResponse()
                                            .mapNotNull(chunk -> {
                                                Generation result = chunk.getResult();
                                                if (result == null || result.getOutput() == null)
                                                    return null;

                                                Object thinkingObj = result.getMetadata().get("thinking");
                                                if (thinkingObj != null && !thinkingObj.toString().isEmpty()) {
                                                    String t = thinkingObj.toString();
                                                    fullThinking.append(t);
                                                    return ChatChunk.thinking(t);
                                                }

                                                String text = result.getOutput().getText();
                                                if (text != null && !text.isEmpty()) {
                                                    fullContent.append(text);
                                                    return ChatChunk.content(text);
                                                }

                                                return null;
                                            })
                                            .doOnComplete(() -> {
                                                String thinkingText = fullThinking.isEmpty()
                                                        ? null : fullThinking.toString();
                                                messageRepository.save(
                                                        Message.create(topicId, "assistant",
                                                                fullContent.toString(), thinkingText)
                                                ).subscribe();
                                            });
                                });
                });
    }

    private List<org.springframework.ai.chat.messages.Message> buildAiMessages(
            List<Message> history) {
        List<Message> recent = history.size() > CONTEXT_WINDOW
                ? history.subList(history.size() - CONTEXT_WINDOW, history.size())
                : history;

        List<org.springframework.ai.chat.messages.Message> aiMessages = new ArrayList<>();

        for (Message msg : recent) {
            switch (msg.role()) {
                case "user" -> {
                    if (msg.image() != null && !msg.image().isBlank()) {
                        // Multimodal message: text + image
                        byte[] imageBytes = Base64.getDecoder().decode(msg.image());
                        var media = new Media(Media.Format.IMAGE_PNG,
                                new ByteArrayResource(imageBytes));
                        aiMessages.add(UserMessage.builder()
                                .text(msg.content())
                                .media(media)
                                .build());
                    } else {
                        aiMessages.add(new UserMessage(msg.content()));
                    }
                }
                case "assistant" -> aiMessages.add(new AssistantMessage(msg.content()));
            }
        }

        return aiMessages;
    }
}

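Key changes:

  • chat method signature: adds a String image parameter.
  • Saving the user message: if an image is present, Message.createWithImage stores it together with the text.
  • buildAiMessages: checks whether each history message contains an image and, if so, builds a multimodal message with UserMessage.builder().text(...).media(...).build(). The Media constructor takes a Resource rather than a byte[], so the decoded byte array is wrapped in a ByteArrayResource.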
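
One thing to watch in buildAiMessages: it decodes the stored string directly and labels every image as Media.Format.IMAGE_PNG. If the front end ever sends a data URL (for example from FileReader.readAsDataURL) or a JPEG, that would need handling first. The sketch below shows one way to normalize the input. Whether the front end actually sends a data URL prefix is an assumption, and UploadedImage is a hypothetical helper, not project code.

```java
import java.util.Base64;

// Hypothetical helper: normalizes an uploaded image string before decoding.
class UploadedImage {
    // Strip an optional "data:<mime>;base64," prefix and return the bare base64 part.
    static String stripDataUrlPrefix(String image) {
        int comma = image.indexOf(',');
        return image.startsWith("data:") && comma >= 0
                ? image.substring(comma + 1)
                : image;
    }

    // Guess the MIME type from the leading magic bytes of the decoded image.
    static String sniffMimeType(byte[] bytes) {
        if (bytes.length >= 4 && (bytes[0] & 0xFF) == 0x89
                && bytes[1] == 'P' && bytes[2] == 'N' && bytes[3] == 'G') {
            return "image/png";
        }
        if (bytes.length >= 3 && (bytes[0] & 0xFF) == 0xFF
                && (bytes[1] & 0xFF) == 0xD8 && (bytes[2] & 0xFF) == 0xFF) {
            return "image/jpeg";
        }
        return "application/octet-stream"; // unknown format
    }

    public static void main(String[] args) {
        String upload = "data:image/png;base64,iVBORw0KGgo=";
        byte[] bytes = Base64.getDecoder().decode(stripDataUrlPrefix(upload));
        System.out.println(sniffMimeType(bytes));
    }
}
```

The sniffed type could then be turned into a MimeType (for instance with Spring's MimeTypeUtils.parseMimeType) and passed to Media in place of the fixed PNG constant.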

5.2 ChatController update

Pass the image parameter through:

java
@PostMapping(produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ChatChunk> chat(@RequestBody ChatRequest request) {
    return chatService.chat(request.topicId(), request.content(),
            request.thinking(), request.image());
}

6. Backend tests

6.1 ImageServiceTest

src/test/java/com/albertstack/aichat/service/ImageServiceTest.java

java
package com.albertstack.aichat.service;

import com.albertstack.aichat.dto.ChatChunk;
import com.albertstack.aichat.entity.Message;
import com.albertstack.aichat.entity.Topic;
import com.albertstack.aichat.entity.User;
import com.albertstack.aichat.repository.MessageRepository;
import com.albertstack.aichat.repository.TopicRepository;
import com.albertstack.aichat.repository.UserRepository;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.r2dbc.core.DatabaseClient;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.context.ReactiveSecurityContextHolder;
import reactor.test.StepVerifier;

import java.util.List;

import static org.junit.jupiter.api.Assertions.*;

@Slf4j
@SpringBootTest
class ImageServiceTest {

    @Autowired
    private ImageService imageService;

    @Autowired
    private UserRepository userRepository;

    @Autowired
    private TopicRepository topicRepository;

    @Autowired
    private MessageRepository messageRepository;

    @Autowired
    private DatabaseClient databaseClient;

    private Long userId;
    private Long topicId;

    @BeforeEach
    void setUp() {
        databaseClient.sql("SET REFERENTIAL_INTEGRITY FALSE").then().block();
        databaseClient.sql("TRUNCATE TABLE MESSAGES RESTART IDENTITY").then().block();
        databaseClient.sql("TRUNCATE TABLE TOPICS RESTART IDENTITY").then().block();
        databaseClient.sql("TRUNCATE TABLE USERS RESTART IDENTITY").then().block();
        databaseClient.sql("SET REFERENTIAL_INTEGRITY TRUE").then().block();

        userId = userRepository.save(User.create("Albert", "password")).block().id();
        topicId = topicRepository.save(Topic.create(userId, "图片测试")).block().id();
    }

    @Test
    void generateImage_shouldReturnImageChunk() {
        var auth = new UsernamePasswordAuthenticationToken(userId, null, List.of());

        String[] imageBase64 = {null};

        StepVerifier.create(
                imageService.generateImage(topicId, "a cute cat sitting on a desk")
                        .contextWrite(ReactiveSecurityContextHolder.withAuthentication(auth))
        )
        .thenConsumeWhile(chunk -> {
            log.info("收到 chunk => type={}, text length={}", chunk.type(),
                    chunk.text() != null ? chunk.text().length() : 0);
            if ("image".equals(chunk.type())) {
                imageBase64[0] = chunk.text();
            }
            return true;
        })
        .verifyComplete();

        assertNotNull(imageBase64[0], "应收到 image 类型的 chunk");
        log.info("图片 base64 长度: {}", imageBase64[0].length());

        // Verify message persistence
        List<Message> messages = messageRepository
                .findByTopicIdOrderByCreatedAtAsc(topicId)
                .collectList().block();

        assertEquals(2, messages.size(), "应有用户提示词和 AI 图片两条记录");
        assertEquals("user", messages.get(0).role());
        assertEquals("assistant", messages.get(1).role());
        assertNotNull(messages.get(1).image(), "AI 消息应包含图片");
        log.info(" ✅ 图片生成并持久化成功(图片大小: {} 字符)", messages.get(1).image().length());
    }
}

Note: the image generation test requires a running Ollama instance with the x/z-image-turbo model available. The first run can be slow while the model loads.
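
Because the test depends on an external service, it can help to probe Ollama first and skip the test when it is unavailable. The sketch below queries Ollama's GET /api/tags endpoint (which lists local models) with the JDK HTTP client. OllamaProbe is a hypothetical helper, the timeouts are arbitrary, and the substring match on the model name is a simplification that assumes the name appears verbatim in the response.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: checks whether Ollama is reachable and lists a given model.
class OllamaProbe {
    static boolean modelAvailable(String baseUrl, String modelName) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/tags"))
                    .timeout(Duration.ofSeconds(5))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // Substring check keeps the sketch dependency-free; a JSON parser would be stricter.
            return response.statusCode() == 200 && response.body().contains(modelName);
        } catch (IOException | IllegalArgumentException e) {
            return false; // Ollama is not reachable, or the URL is malformed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: Ollama listens on its default local address.
        System.out.println(modelAvailable("http://localhost:11434", "x/z-image-turbo"));
    }
}
```

In a JUnit 5 test, calling Assumptions.assumeTrue(OllamaProbe.modelAvailable(...)) at the start of the test would report it as skipped rather than failed when the model is missing.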

6.2 ChatServiceTest update

The ChatService.chat signature gained an image parameter, so the test class must be updated accordingly, along with a new test case for image analysis.

src/test/java/com/albertstack/aichat/service/ChatServiceTest.java

java
package com.albertstack.aichat.service;

import com.albertstack.aichat.dto.ChatChunk;
import com.albertstack.aichat.entity.Message;
import com.albertstack.aichat.entity.Topic;
import com.albertstack.aichat.entity.User;
import com.albertstack.aichat.repository.MessageRepository;
import com.albertstack.aichat.repository.TopicRepository;
import com.albertstack.aichat.repository.UserRepository;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.r2dbc.core.DatabaseClient;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.context.ReactiveSecurityContextHolder;
import reactor.core.publisher.Flux;
import reactor.test.StepVerifier;

import java.util.List;

import static org.junit.jupiter.api.Assertions.*;

@Slf4j
@SpringBootTest
class ChatServiceTest {

    @Autowired
    private ChatService chatService;

    @Autowired
    private ImageService imageService;

    @Autowired
    private UserRepository userRepository;

    @Autowired
    private TopicRepository topicRepository;

    @Autowired
    private MessageRepository messageRepository;

    @Autowired
    private DatabaseClient databaseClient;

    private Long userId;
    private Long topicId;

    @BeforeEach
    void setUp() {
        databaseClient.sql("SET REFERENTIAL_INTEGRITY FALSE").then().block();
        databaseClient.sql("TRUNCATE TABLE MESSAGES RESTART IDENTITY").then().block();
        databaseClient.sql("TRUNCATE TABLE TOPICS RESTART IDENTITY").then().block();
        databaseClient.sql("TRUNCATE TABLE USERS RESTART IDENTITY").then().block();
        databaseClient.sql("SET REFERENTIAL_INTEGRITY TRUE").then().block();

        User user = userRepository.save(User.create("Albert", "password")).block();
        userId = user.id();

        Topic topic = topicRepository.save(Topic.create(userId, "测试话题")).block();
        topicId = topic.id();
    }

    @Test
    void chat_withThinking_shouldReturnThinkingAndContent() {
        var auth = new UsernamePasswordAuthenticationToken(userId, null, List.of());

        Flux<ChatChunk> responseStream = chatService.chat(topicId, "1+1等于几?", true, null)
                .contextWrite(ReactiveSecurityContextHolder.withAuthentication(auth));

        StringBuilder thinkingText = new StringBuilder();
        StringBuilder contentText = new StringBuilder();

        StepVerifier.create(responseStream)
                .thenConsumeWhile(chunk -> {
                    if ("thinking".equals(chunk.type())) thinkingText.append(chunk.text());
                    else if ("content".equals(chunk.type())) contentText.append(chunk.text());
                    return true;
                })
                .verifyComplete();

        assertFalse(thinkingText.isEmpty(), "开启思考模式应返回思考过程");
        assertFalse(contentText.isEmpty(), "应返回回复内容");
        log.info(" ✅ 思考模式:思考过程 + 回复内容均已返回");
    }

    @Test
    void chat_withoutThinking_shouldReturnContentOnly() {
        var auth = new UsernamePasswordAuthenticationToken(userId, null, List.of());

        Flux<ChatChunk> responseStream = chatService.chat(topicId, "你好,请说'测试成功'", false, null)
                .contextWrite(ReactiveSecurityContextHolder.withAuthentication(auth));

        StringBuilder contentText = new StringBuilder();
        boolean[] hasThinking = {false};

        StepVerifier.create(responseStream)
                .thenConsumeWhile(chunk -> {
                    if ("thinking".equals(chunk.type())) hasThinking[0] = true;
                    if ("content".equals(chunk.type())) contentText.append(chunk.text());
                    return true;
                })
                .verifyComplete();

        assertFalse(contentText.isEmpty(), "应返回回复内容");
        assertFalse(hasThinking[0], "关闭思考模式不应返回思考过程");
        log.info(" ✅ 非思考模式:仅返回回复内容");
    }

    @Test
    void chat_withImage_shouldAnalyzeImage() {
        var auth = new UsernamePasswordAuthenticationToken(userId, null, List.of());

        // First generate a real image with the image generation model to use as test material
        String imageBase64 = imageService.generateImage(topicId, "a red apple")
                .contextWrite(ReactiveSecurityContextHolder.withAuthentication(auth))
                .filter(chunk -> "image".equals(chunk.type()))
                .map(ChatChunk::text)
                .next()
                .block();
        log.info("已生成测试图片({} 字符),用于图像分析测试", imageBase64.length());

        Flux<ChatChunk> responseStream = chatService.chat(
                topicId, "请用中文描述这张图片的内容", false, imageBase64
        ).contextWrite(ReactiveSecurityContextHolder.withAuthentication(auth));

        StringBuilder contentText = new StringBuilder();

        StepVerifier.create(responseStream)
                .thenConsumeWhile(chunk -> {
                    if ("content".equals(chunk.type())) contentText.append(chunk.text());
                    return true;
                })
                .verifyComplete();

        assertFalse(contentText.isEmpty(), "图像分析应返回描述内容");
        log.info("图像分析回复: {}", contentText);
        log.info(" ✅ 图像分析成功");

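        // Verify that the user message contains the image
        // Message order: [0] user prompt for image generation, [1] generated image, [2] user message for image analysis (with image), [3] analysis reply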
        List<Message> messages = messageRepository
                .findByTopicIdOrderByCreatedAtAsc(topicId)
                .collectList().block();

        Message analyzeUserMsg = messages.stream()
                .filter(m -> "user".equals(m.role()) && m.image() != null)
                .findFirst().orElse(null);
        assertNotNull(analyzeUserMsg, "应有一条包含图片的用户消息");
        log.info(" ✅ 用户图片已持久化");
    }

    @Test
    void chat_withInvalidTopic_shouldError() {
        var auth = new UsernamePasswordAuthenticationToken(userId, null, List.of());

        Flux<ChatChunk> responseStream = chatService.chat(99999L, "你好", false, null)
                .contextWrite(ReactiveSecurityContextHolder.withAuthentication(auth));

        StepVerifier.create(responseStream)
                .expectError(RuntimeException.class)
                .verify();
        log.info(" ✅ 无效 topicId 正确抛出异常");
    }
}

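Compared with the previous version, the changes are:

| Change | Description |
| --- | --- |
| chat() calls | Every call adds a fourth argument, image (plain chat passes null) |
| New chat_withImage_shouldAnalyzeImage | Verifies that a conversation carrying an image is analyzed and returns a description, and that the image field of the user message is persisted |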

6.3 ChatControllerTest update

ChatRequest gained an image field, so the controller test must be updated accordingly, along with new tests for the image generation endpoint.

src/test/java/com/albertstack/aichat/controller/ChatControllerTest.java

java
package com.albertstack.aichat.controller;

import com.albertstack.aichat.dto.ChatRequest;
import com.albertstack.aichat.dto.ImageRequest;
import com.albertstack.aichat.dto.RegisterRequest;
import com.albertstack.aichat.dto.TopicRequest;
import com.albertstack.aichat.common.ResultCode;
import tools.jackson.databind.ObjectMapper;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.webtestclient.autoconfigure.AutoConfigureWebTestClient;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.http.MediaType;
import org.springframework.r2dbc.core.DatabaseClient;
import org.springframework.test.web.reactive.server.WebTestClient;

import static com.albertstack.aichat.util.ApiTest.*;
import static org.junit.jupiter.api.Assertions.*;

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@AutoConfigureWebTestClient(timeout = "120000")
class ChatControllerTest {

    private static final Logger log = LoggerFactory.getLogger("ApiTest");

    @Autowired
    private WebTestClient originalWebTestClient;

    @Autowired
    private DatabaseClient databaseClient;

    private WebTestClient webTestClient;

    private final ObjectMapper objectMapper = new ObjectMapper();
    private String token;
    private Long topicId;

    @BeforeEach
    void setUp() throws Exception {
        // Raise the WebTestClient buffer (an image response is about 1 MB)
        webTestClient = originalWebTestClient.mutate()
                .codecs(config -> config.defaultCodecs().maxInMemorySize(10 * 1024 * 1024))
                .build();

        databaseClient.sql("SET REFERENTIAL_INTEGRITY FALSE").then().block();
        databaseClient.sql("TRUNCATE TABLE MESSAGES RESTART IDENTITY").then().block();
        databaseClient.sql("TRUNCATE TABLE TOPICS RESTART IDENTITY").then().block();
        databaseClient.sql("TRUNCATE TABLE USERS RESTART IDENTITY").then().block();
        databaseClient.sql("SET REFERENTIAL_INTEGRITY TRUE").then().block();

        byte[] authBody = webTestClient.post()
                .uri("/api/auth/register")
                .bodyValue(new RegisterRequest("Albert", "ps123456"))
                .exchange()
                .expectBody(byte[].class).returnResult().getResponseBody();
        token = objectMapper.readTree(authBody).get("data").get("token").asString();

        byte[] topicBody = webTestClient.post()
                .uri("/api/topics")
                .header("Authorization", "Bearer " + token)
                .bodyValue(new TopicRequest("聊天测试话题"))
                .exchange()
                .expectBody(byte[].class).returnResult().getResponseBody();
        topicId = objectMapper.readTree(topicBody).get("data").get("id").asLong();
    }

    @Test
    void chat_shouldReturnSSEStream() {
        log.info("---------- POST /api/chat(SSE 流式) ----------");

        webTestClient.post()
                .uri("/api/chat")
                .header("Authorization", "Bearer " + token)
                .contentType(MediaType.APPLICATION_JSON)
                .bodyValue(new ChatRequest(topicId, "你好,用一句话回答", false, null))
                .exchange()
                .expectStatus().isOk()
                .expectHeader().contentTypeCompatibleWith(MediaType.TEXT_EVENT_STREAM)
                .expectBody(String.class)
                .consumeWith(result -> {
                    String body = result.getResponseBody();
                    log.info("SSE 响应内容: {}", body);

                    if (body != null && !body.isBlank()) {
                        log.info("  ✅ AI 返回了有效内容(长度: {} 字符)", body.length());
                    } else {
                        log.error("  ❌ AI 未返回任何内容");
                        fail("AI 应该返回有效内容");
                    }
                });
    }

    @Test
    void chat_withoutToken_shouldReturn401() {
        of(webTestClient).post("/api/chat", new ChatRequest(topicId, "你好", false, null))
                .expectStatus(401)
                .verify();
    }

    @Test
    void generateImage_shouldReturnSSEWithImage() {
        log.info("---------- POST /api/image/generate(SSE 流式) ----------");

        webTestClient.post()
                .uri("/api/image/generate")
                .header("Authorization", "Bearer " + token)
                .contentType(MediaType.APPLICATION_JSON)
                .bodyValue(new ImageRequest(topicId, "a blue sky"))
                .exchange()
                .expectStatus().isOk()
                .expectHeader().contentTypeCompatibleWith(MediaType.TEXT_EVENT_STREAM)
                .expectBody(String.class)
                .consumeWith(result -> {
                    String body = result.getResponseBody();
                    assertNotNull(body, "应有 SSE 响应");
                    assertTrue(body.contains("\"type\":\"image\""), "应包含 image 类型的 chunk");
                    log.info("  ✅ 图片生成 SSE 流返回成功(长度: {} 字符)", body.length());
                });
    }

    @Test
    void generateImage_withoutToken_shouldReturn401() {
        webTestClient.post()
                .uri("/api/image/generate")
                .contentType(MediaType.APPLICATION_JSON)
                .bodyValue(new ImageRequest(topicId, "a blue sky"))
                .exchange()
                .expectStatus().isUnauthorized();
    }
}

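Compared with the previous version, the changes are:

Change | Description
WebTestClient buffer | Raised to 10 MB via mutate().codecs(); an image response is about 1 MB, which exceeds the 256 KB default
ChatRequest constructor | Every call now passes null as a fourth argument (no image)
timeout | Raised from 60000 to 120000 (image generation is slow)
New: generateImage_shouldReturnSSEWithImage | Verifies that the /api/image/generate SSE endpoint returns a chunk of type image
New: generateImage_withoutToken_shouldReturn401 | Verifies that the image generation endpoint requires authentication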
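
The buffer change is easier to see with a little arithmetic. The sketch below estimates the size of a base64-encoded image and compares it with the two limits mentioned above. The 750 KB image size is a made-up figure for illustration only, and the real SSE payload is slightly larger because of the JSON wrapper around the image data.

```typescript
// Base64 encodes every 3 input bytes as 4 output characters (with padding).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3)
}

const DEFAULT_BUFFER_BYTES = 256 * 1024 // the 256 KB default mentioned above
const RAISED_BUFFER_BYTES = 10 * 1024 * 1024 // the 10 MB limit used in the test

// Hypothetical image size, chosen only for illustration.
const pngBytes = 750 * 1024
const encoded = base64Length(pngBytes)

console.log(encoded) // 1024000
console.log(encoded > DEFAULT_BUFFER_BYTES) // true
console.log(encoded < RAISED_BUFFER_BYTES) // true
```

Base64 inflates the payload by roughly a third, so even a modest PNG is well past the default limit once it is embedded in an SSE chunk.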

7. Frontend updates

7.1 API module

Update the Message interface in topic.ts (adds an image field):

typescript
export interface Message {
  id: number
  topicId: number
  role: 'user' | 'assistant'
  content: string
  thinking: string | null
  image: string | null
  createdAt: string
}

Add the image generation API:

src/api/chat.ts

typescript
// Chat and image generation both use SSE streaming requests and bypass axios
export const CHAT_URL = '/api/chat'
export const IMAGE_GENERATE_URL = '/api/image/generate'
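
Because both endpoints return an SSE stream rather than a single JSON document, the client has to split the response into events before it can read each chunk. The following is a minimal sketch of that step, not the project's useSSE implementation. The only confirmed detail is that chunks carry a "type" field whose value can be "image"; the "content" field and the "content" type value used in the sample are assumptions.

```typescript
// Assumed chunk shape: only "type" (with the value "image") is confirmed by the tests.
interface SseChunk {
  type: string
  content?: string // assumed field name
}

// Split a raw SSE body into events (separated by a blank line) and
// JSON-decode the "data:" payload of each event.
function parseSseBody(raw: string): SseChunk[] {
  const chunks: SseChunk[] = []
  for (const event of raw.split(/\r?\n\r?\n/)) {
    const data = event
      .split(/\r?\n/)
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, ''))
      .join('\n')
    if (!data) continue
    try {
      chunks.push(JSON.parse(data) as SseChunk)
    } catch {
      // Ignore payloads that are not JSON.
    }
  }
  return chunks
}

const sample =
  'data:{"type":"content","content":"Generating..."}\n\n' +
  'data:{"type":"image","content":"iVBORw0KGgo="}\n\n'
const parsed = parseSseBody(sample)
console.log(parsed.map((c) => c.type)) // [ 'content', 'image' ]
```

A streaming client would apply the same splitting to a growing buffer and keep the trailing partial event for the next read.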

7.2 ChatView updates

Update src/views/ChatView.vue to add an image generation button, image upload, and image display:

vue
<template>
  <div class="h-screen flex bg-base-200">
    <!-- Left: topic list -->
    <aside class="w-64 bg-base-100 flex flex-col border-r border-base-300">
      <div class="p-3">
        <button class="btn btn-primary btn-sm w-full" @click="createTopic">
          + 新对话
        </button>
      </div>

      <div class="flex-1 overflow-y-auto">
        <ul class="menu p-2">
          <li v-for="topic in topics" :key="topic.id">
            <a
              :class="{ 'active': currentTopicId === topic.id }"
              class="flex justify-between items-center"
              @click="selectTopic(topic)"
            >
              <input
                v-if="editingTopicId === topic.id"
                v-model="editingTitle"
                class="input input-xs input-bordered flex-1 mr-1"
                @click.stop
                @keyup.enter="confirmEdit(topic)"
                @keyup.escape="cancelEdit"
                @blur="confirmEdit(topic)"
              />
              <span v-else class="truncate flex-1" @dblclick.stop="startEdit(topic)">
                {{ topic.title }}
              </span>
              <button
                class="btn btn-ghost btn-xs"
                @click.stop="deleteTopic(topic.id)"
              >
                ✕
              </button>
            </a>
          </li>
        </ul>

        <div v-if="topics.length === 0" class="text-center text-base-content/50 p-4 text-sm">
          暂无对话,点击上方按钮创建
        </div>
      </div>

      <div class="p-3 border-t border-base-300 flex justify-between items-center">
        <span class="text-sm">{{ authStore.username }}</span>
        <button class="btn btn-ghost btn-xs" @click="handleLogout">退出</button>
      </div>
    </aside>

    <!-- Right: chat area -->
    <main class="flex-1 flex flex-col">
      <header class="h-14 flex items-center px-4 border-b border-base-300 bg-base-100">
        <h2 class="text-lg font-medium">
          {{ currentTopic?.title || '选择或创建一个对话' }}
        </h2>
      </header>

      <!-- Message list -->
      <div ref="messageListRef" class="flex-1 overflow-y-auto p-4 space-y-4">
        <div v-if="!currentTopicId" class="h-full flex items-center justify-center text-base-content/50">
          ← 选择一个话题开始对话
        </div>

        <div
          v-for="msg in messages"
          :key="msg.id || msg.tempId"
          class="chat"
          :class="msg.role === 'user' ? 'chat-end' : 'chat-start'"
        >
          <!-- AI message -->
          <div v-if="msg.role === 'assistant'" class="chat-bubble bg-base-100 text-base-content border border-base-300">
            <details v-if="msg.thinking" class="mb-2">
              <summary class="cursor-pointer text-xs opacity-50 select-none">
                🧠 深度思考过程(点击展开)
              </summary>
              <div class="mt-1 p-2 rounded bg-base-200 text-xs whitespace-pre-wrap opacity-60">
                {{ msg.thinking }}
              </div>
            </details>
            <!-- Image message: supports preview and download -->
            <div v-if="msg.image" class="relative group">
              <img :src="'data:image/png;base64,' + msg.image"
                class="rounded max-w-sm cursor-pointer"
                alt="AI 生成的图片"
                @click="previewImage = msg.image" />
              <div class="absolute top-2 right-2 opacity-0 group-hover:opacity-100 transition">
                <a :href="'data:image/png;base64,' + msg.image"
                  :download="'image-' + (msg.id || 'gen') + '.png'"
                  class="btn btn-xs btn-neutral">
                  下载
                </a>
              </div>
            </div>
            <!-- Text message -->
            <div v-if="msg.content" class="prose prose-sm max-w-none" v-html="renderMarkdown(msg.content)"></div>
          </div>

          <!-- User message -->
          <div v-else class="chat-bubble bg-primary text-primary-content">
            <img v-if="msg.image" :src="'data:image/png;base64,' + msg.image"
              class="rounded max-w-xs mb-2 cursor-pointer"
              alt="上传的图片"
              @click="previewImage = msg.image" />
            <div class="whitespace-pre-wrap">{{ msg.content }}</div>
          </div>
        </div>

        <!-- Streaming response -->
        <div v-if="streamingThinking || streamingContent" class="chat chat-start">
          <div class="chat-bubble bg-base-100 text-base-content border border-base-300">
            <details v-if="streamingThinking" class="mb-2" open>
              <summary class="cursor-pointer text-xs opacity-50 select-none">
                🧠 正在思考...
              </summary>
              <div class="mt-1 p-2 rounded bg-base-200 text-xs whitespace-pre-wrap opacity-60">
                {{ streamingThinking }}<span class="animate-pulse">▊</span>
              </div>
            </details>
            <div v-if="streamingContent" class="prose prose-sm max-w-none" v-html="renderMarkdown(streamingContent)">
            </div>
            <span v-if="streamingContent" class="animate-pulse">▊</span>
          </div>
        </div>
      </div>

      <!-- Image preview overlay -->
      <div v-if="previewImage" class="fixed inset-0 z-50 bg-black/70 flex items-center justify-center"
        @click="previewImage = null">
        <img :src="'data:image/png;base64,' + previewImage" class="max-w-4xl max-h-[90vh] rounded shadow-lg" />
      </div>

      <!-- Uploaded image preview -->
      <div v-if="uploadedImage" class="px-4 pt-2">
        <div class="inline-flex items-center gap-2 p-2 rounded border border-base-300 bg-base-100">
          <img :src="'data:image/png;base64,' + uploadedImage"
            class="w-16 h-16 rounded object-cover cursor-pointer"
            @click="previewImage = uploadedImage" />
          <button class="btn btn-ghost btn-xs" @click="uploadedImage = null">✕</button>
        </div>
      </div>

      <!-- Input area -->
      <div class="p-4 border-t border-base-300 bg-base-100">
        <form @submit.prevent="handleSend" class="card border border-base-300 bg-base-100">
          <textarea
            v-model="inputText"
            placeholder="输入消息..."
            rows="3"
            class="w-full p-3 resize-none bg-transparent outline-none leading-normal"
            :disabled="!currentTopicId || isSending"
            @keydown.enter.exact.prevent="handleSend"
          ></textarea>
          <div class="flex justify-between items-center px-3 py-2 border-t border-base-200">
            <div class="flex gap-2">
              <button
                type="button"
                class="btn btn-sm"
                :class="mode === 'thinking' ? 'btn-primary' : 'btn-dash'"
                :disabled="!currentTopicId"
                @click="toggleMode('thinking')"
              >
                深度思考
              </button>
              <button
                type="button"
                class="btn btn-sm"
                :class="mode === 'image' ? 'btn-primary' : 'btn-dash'"
                :disabled="!currentTopicId"
                @click="toggleMode('image')"
              >
                生成图片
              </button>
              <input ref="fileInputRef" type="file" accept="image/*" class="hidden"
                @change="handleFileUpload" />
              <button
                type="button"
                class="btn btn-sm btn-dash"
                :disabled="!currentTopicId || isSending"
                @click="fileInputRef?.click()"
              >
                上传图片
              </button>
            </div>
            <button
              type="submit"
              class="btn btn-sm btn-primary"
              :disabled="!currentTopicId || !inputText.trim() || isSending"
            >
              <span v-if="isSending" class="loading loading-spinner loading-xs"></span>
              <span v-else>发送</span>
            </button>
          </div>
        </form>
      </div>
    </main>
  </div>
</template>

<script setup lang="ts">
import { ref, nextTick, onMounted } from 'vue'
import { useRouter } from 'vue-router'
import { marked } from 'marked'
import { markedHighlight } from 'marked-highlight'
import hljs from 'highlight.js'
import 'highlight.js/styles/github.css'
import { useAuthStore } from '../stores/auth'
import { useSSE } from '../composables/useSSE'
import {
  listTopicsApi,
  createTopicApi,
  updateTopicApi,
  deleteTopicApi,
  listMessagesApi,
  type Topic,
} from '../api/topic'
import { CHAT_URL, IMAGE_GENERATE_URL } from '../api/chat'

const router = useRouter()
const authStore = useAuthStore()
const { fetchSSE } = useSSE()

// Topic state
const topics = ref<Topic[]>([])
const currentTopicId = ref<number | null>(null)
const currentTopic = ref<Topic | null>(null)
const editingTopicId = ref<number | null>(null)
const editingTitle = ref('')

// Message state
interface DisplayMessage {
  id?: number
  tempId?: string
  role: 'user' | 'assistant'
  content: string
  thinking?: string | null
  image?: string | null
}
const messages = ref<DisplayMessage[]>([])
const inputText = ref('')
const isSending = ref(false)
const streamingContent = ref('')
const streamingThinking = ref('')
const messageListRef = ref<HTMLElement>()

// Mode: normal chat / deep thinking / image generation (mutually exclusive)
const mode = ref<'chat' | 'thinking' | 'image'>('chat')

// Image state
const uploadedImage = ref<string | null>(null)
const previewImage = ref<string | null>(null)
const fileInputRef = ref<HTMLInputElement>()

// --- Markdown configuration ---

marked.use(markedHighlight({
  langPrefix: 'hljs language-',
  highlight(code, lang) {
    if (lang && hljs.getLanguage(lang)) {
      return hljs.highlight(code, { language: lang }).value
    }
    return hljs.highlightAuto(code).value
  },
}))

function renderMarkdown(content: string): string {
  return marked.parse(content) as string
}

// --- Mode switching ---

function toggleMode(target: 'thinking' | 'image') {
  mode.value = mode.value === target ? 'chat' : target
}

// --- Topic operations ---

async function loadTopics() {
  const { data } = await listTopicsApi()
  topics.value = data
}

async function createTopic() {
  const { data } = await createTopicApi('新对话')
  topics.value.unshift(data)
  selectTopic(data)
}

async function selectTopic(topic: Topic) {
  currentTopicId.value = topic.id
  currentTopic.value = topic

  const { data } = await listMessagesApi(topic.id)
  messages.value = data.map((m) => ({
    id: m.id,
    role: m.role,
    content: m.content,
    thinking: m.thinking,
    image: m.image,
  }))

  await nextTick()
  scrollToBottom()
}

function startEdit(topic: Topic) {
  editingTopicId.value = topic.id
  editingTitle.value = topic.title
}

async function confirmEdit(topic: Topic) {
  if (editingTopicId.value === null) return
  editingTopicId.value = null

  const newTitle = editingTitle.value.trim()
  if (!newTitle || newTitle === topic.title) return

  const { data } = await updateTopicApi(topic.id, newTitle)
  const index = topics.value.findIndex((t) => t.id === topic.id)
  if (index !== -1) topics.value[index] = data
  if (currentTopicId.value === topic.id) currentTopic.value = data
}

function cancelEdit() {
  editingTopicId.value = null
}

async function deleteTopic(topicId: number) {
  await deleteTopicApi(topicId)
  topics.value = topics.value.filter((t) => t.id !== topicId)

  if (currentTopicId.value === topicId) {
    currentTopicId.value = null
    currentTopic.value = null
    messages.value = []
  }
}

// --- Unified send entry point ---

function handleSend() {
  if (mode.value === 'image') {
    sendImageGenerate()
  } else {
    sendChat()
  }
}

// Text chat / image analysis
async function sendChat() {
  const content = inputText.value.trim()
  if (!content || !currentTopicId.value || isSending.value) return

  inputText.value = ''
  isSending.value = true

  const currentImage = uploadedImage.value
  uploadedImage.value = null

  messages.value.push({
    tempId: Date.now().toString(),
    role: 'user',
    content,
    image: currentImage,
  })

  await nextTick()
  scrollToBottom()

  streamingContent.value = ''
  streamingThinking.value = ''

  await fetchSSE(
    CHAT_URL,
    {
      topicId: currentTopicId.value,
      content,
      thinking: mode.value === 'thinking',
      image: currentImage,
    },
    (text) => {
      streamingThinking.value += text
      scrollToBottom()
    },
    (text) => {
      streamingContent.value += text
      scrollToBottom()
    },
    () => {
      messages.value.push({
        tempId: Date.now().toString(),
        role: 'assistant',
        content: streamingContent.value,
        thinking: streamingThinking.value || null,
      })
      streamingContent.value = ''
      streamingThinking.value = ''
      isSending.value = false
    },
    (error) => {
      console.error('聊天出错:', error)
      streamingContent.value = ''
      streamingThinking.value = ''
      isSending.value = false
    },
  )
}

// Image generation
async function sendImageGenerate() {
  const prompt = inputText.value.trim()
  if (!prompt || !currentTopicId.value || isSending.value) return

  inputText.value = ''
  isSending.value = true

  messages.value.push({
    tempId: Date.now().toString(),
    role: 'user',
    content: prompt,
  })

  await nextTick()
  scrollToBottom()

  streamingContent.value = ''

  await fetchSSE(
    IMAGE_GENERATE_URL,
    { topicId: currentTopicId.value, prompt },
    () => {},
    (text) => {
      streamingContent.value += text
      scrollToBottom()
    },
    () => {
      streamingContent.value = ''
      isSending.value = false
      if (currentTopic.value) selectTopic(currentTopic.value)
    },
    (error) => {
      console.error('图片生成失败:', error)
      streamingContent.value = ''
      isSending.value = false
    },
  )
}

// --- Image operations ---

function handleFileUpload(event: Event) {
  const file = (event.target as HTMLInputElement).files?.[0]
  if (!file) return

  const reader = new FileReader()
  reader.onload = () => {
    const result = reader.result as string
    uploadedImage.value = result.split(',')[1]
  }
  reader.readAsDataURL(file)

  if (fileInputRef.value) fileInputRef.value.value = ''
}

function scrollToBottom() {
  nextTick(() => {
    if (messageListRef.value) {
      messageListRef.value.scrollTop = messageListRef.value.scrollHeight
    }
  })
}

function handleLogout() {
  authStore.logout()
  router.push('/login')
}

onMounted(() => {
  loadTopics()
})
</script>
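
The template above always prefixes image data with data:image/png;base64, even though the upload input accepts any image/* file. Browsers generally detect the real format when rendering, but the declared type can be made accurate. The sketch below is an optional refinement rather than part of the original component: sniffImageMime and toDataUrl are hypothetical helper names, and detection relies on the well-known base64 prefixes of common image file signatures.

```typescript
// Base64 prefixes of common image file signatures (PNG, JPEG, GIF, RIFF/WebP).
const SIGNATURES: Array<[string, string]> = [
  ['iVBORw0KGgo', 'image/png'],
  ['/9j/', 'image/jpeg'],
  ['R0lGOD', 'image/gif'],
  ['UklGR', 'image/webp'], // RIFF container; assumed to be WebP in an image context
]

// Guess the MIME type from the first characters of the base64 payload.
function sniffImageMime(base64: string): string {
  const match = SIGNATURES.find(([prefix]) => base64.startsWith(prefix))
  return match ? match[1] : 'image/png' // fall back to the template's current default
}

// Build a data URL whose declared type matches the detected format.
function toDataUrl(base64: string): string {
  return `data:${sniffImageMime(base64)};base64,${base64}`
}

console.log(toDataUrl('/9j/4AAQSkZJRg==')) // data:image/jpeg;base64,/9j/4AAQSkZJRg==
```

In the template, an expression such as :src="toDataUrl(msg.image)" could then replace the hardcoded prefix.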

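8. Functional testing

  1. Open http://localhost:5173, log in, and create a topic

  2. Normal chat: type a message and press Enter to send; verify that the AI reply streams in

  3. Deep thinking: click the「深度思考」(deep thinking) button so that it is highlighted, send a question, and verify that the thinking process can be collapsed and expanded

  4. Image generation: click the「生成图片」(generate image) button so that it is highlighted (deep thinking is switched off automatically), type a description, press Enter to send, and wait for the image. Hover over the image to verify the download button, and click the image to verify the full-screen preview

  5. Image analysis: turn off the「生成图片」mode, click「上传图片」(upload image) and choose an image, and verify the thumbnail preview. Enter the prompt "请描述这张图片" ("please describe this image"), send it, and verify the AI's analysis

  6. Switch to another topic and back; verify that the images and analysis results load correctly and that the images can still be previewed and downloaded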
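
Step 4 relies on the two mode buttons being mutually exclusive. The sketch below restates the component's toggleMode logic as a pure function (nextMode is a hypothetical name introduced here) so the expected transitions can be traced by hand.

```typescript
type Mode = 'chat' | 'thinking' | 'image'

// Clicking the active button returns to normal chat; clicking the other
// button switches modes, which implicitly turns the previous one off.
function nextMode(current: Mode, target: 'thinking' | 'image'): Mode {
  return current === target ? 'chat' : target
}

let mode: Mode = 'chat'
mode = nextMode(mode, 'thinking') // 'thinking'
mode = nextMode(mode, 'image') // 'image' (deep thinking is switched off)
mode = nextMode(mode, 'image') // 'chat'
console.log(mode) // chat
```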

Figure: chat interface showing image generation and image analysis

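9. Summary

This chapter added two image capabilities to the project:

Area | What was added
Database | MESSAGES table gains an image column (CLOB, base64)
Entity | Message gains an image field and the createWithImage factory method
Backend DTO | ImageRequest (image generation request); ChatRequest gains an image field; ChatChunk gains an image factory method
Backend config | WebConfig (10 MB buffer for the WebFlux codecs)
Backend Service | ImageService (WebClient with SSE streaming); ChatService supports multimodal messages (UserMessage.builder().media())
Backend Controller | ImageController (POST /api/image/generate, SSE endpoint)
Backend tests | ImageServiceTest; ChatServiceTest adds an image analysis case; ChatControllerTest adds image generation cases
Frontend API | chat.ts adds IMAGE_GENERATE_URL
Frontend page | ChatView: mutually exclusive mode buttons (deep thinking / image generation, btn-dash / btn-primary), image upload with thumbnail preview, full-screen image preview with download, unified send entry point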

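New project files:

File | Purpose
dto/ImageRequest.java | Image generation request DTO
config/WebConfig.java | WebFlux codec buffer configuration
service/ImageService.java | Image generation service (WebClient calling Ollama directly)
controller/ImageController.java | Image generation SSE endpoint
service/ImageServiceTest.java | Image generation tests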

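Backend API overview (all endpoints):

Method | Path | Description | Auth
GET | /api/ping | Health check | Not required
POST | /api/auth/register | User registration | Not required
POST | /api/auth/login | User login | Not required
GET | /api/topics | List topics | Required
POST | /api/topics | Create a topic | Required
PUT | /api/topics/{id} | Update a topic title | Required
DELETE | /api/topics/{id} | Delete a topic | Required
GET | /api/topics/{id}/messages | List messages | Required
POST | /api/chat | AI chat (SSE streaming, supports image analysis) | Required
POST | /api/image/generate | Image generation (SSE streaming) | Required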
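
As a quick illustration of what a call to one of the authenticated endpoints looks like, the sketch below assembles the request for POST /api/image/generate. The Bearer header and the topicId / prompt body fields follow the tests and the ChatView code; the function name buildImageGenerateRequest, the RequestSpec type, and the Accept header are assumptions added for the example.

```typescript
interface ImageGenerateBody {
  topicId: number
  prompt: string
}

// Hypothetical plain description of an HTTP request, independent of fetch or axios.
interface RequestSpec {
  url: string
  method: 'POST'
  headers: Record<string, string>
  body: string
}

function buildImageGenerateRequest(token: string, body: ImageGenerateBody): RequestSpec {
  return {
    url: '/api/image/generate',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Accept: 'text/event-stream', // assumption: ask for the SSE stream explicitly
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(body),
  }
}

const spec = buildImageGenerateRequest('demo-token', { topicId: 1, prompt: 'a blue sky' })
console.log(spec.headers.Authorization) // Bearer demo-token
console.log(spec.body) // {"topicId":1,"prompt":"a blue sky"}
```

Omitting the Authorization header corresponds to the generateImage_withoutToken_shouldReturn401 case, where the server responds with 401.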

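The next chapter covers tool calling and web search, taking the AI from "can only chat" to "can get things done" with time lookup, math calculation, and web search capabilities.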