Large-Model Application Architectures Explained: From Selection to Hands-On Java

2025-12-19 11:41:00
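Abstract
This article covers the four core technical architectures for building large-model applications: pure Prompt, Function Calling, RAG, and Fine-tuning. It explains the logic of each, provides Java example code for all but Fine-tuning, and sets out the cost ordering and the selection principle: prefer the lower-cost options, and consider Fine-tuning only when the first three cannot meet the requirement.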

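In large-model application development, the chosen technical architecture determines what the application can do, what it costs to build, and how quickly it can ship. This article breaks down four core architectures (pure Prompt, Function Calling, RAG, and Fine-tuning) and combines selection principles with Java example code, so that developers can match an architecture to their business scenario.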


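I. The Four Core Technical Architectures in Depth

1.1 Pure Prompt: a low-cost entry point for getting started quickly

The pure Prompt approach is the most basic form of large-model application development. It relies on prompt engineering: a carefully designed prompt steers the model toward the expected output, with no additional development or system changes beyond the API call itself.

Core logic

  1. Output quality depends heavily on how precise the prompt is; for the same question, different prompts can produce very different answers;
  2. It suits simple scenarios such as Q&A, copywriting, and code-snippet generation, where passing a well-designed prompt through the API is enough to deliver the core functionality.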
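To make the first point concrete, here is a minimal sketch of a prompt template that spells out the role, the task, and the output requirements, which are the three elements the demo below bakes into its prompt string. The class name `PromptTemplate` and its `build` method are hypothetical helpers introduced only for illustration.

```java
import java.util.List;

/** Hypothetical helper: assembles a prompt from a role, a task, and output rules. */
class PromptTemplate {

    static String build(String role, String task, List<String> outputRules) {
        StringBuilder sb = new StringBuilder();
        sb.append("You are ").append(role).append(".\n");
        sb.append("Task: ").append(task).append("\n");
        if (!outputRules.isEmpty()) {
            sb.append("Output requirements:\n");
            for (int i = 0; i < outputRules.size(); i++) {
                // Number the rules so the model can follow them one by one
                sb.append(i + 1).append(". ").append(outputRules.get(i)).append("\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A vague prompt versus a structured one for the same request
        String vague = "Write a singleton.";
        String precise = build(
                "a senior Java engineer",
                "Implement a thread-safe lazy singleton in Java",
                List.of("Include a complete code example", "Comment every non-trivial line"));
        System.out.println(vague);
        System.out.println("---");
        System.out.println(precise);
    }
}
```

Keeping the template in one place makes it easier to compare prompt variants against the same question.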

Minimal Java example (calling the DeepSeek model)

import com.google.gson.Gson;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PurePromptDemo {
    // DeepSeek API settings
    private static final String API_URL = "https://api.deepseek.com/v1/chat/completions";
    private static final String API_KEY = "your DeepSeek API key";
    private static final Gson GSON = new Gson();

    @SuppressWarnings("unchecked")
    public static String callModel(String prompt) throws Exception {
        // 1. Build the message list
        List<Map<String, String>> messages = new ArrayList<>();
        Map<String, String> userMsg = new HashMap<>();
        userMsg.put("role", "user");
        // A refined prompt states the role, the task, and the output format
        userMsg.put("content", "You are a senior Java engineer. " + prompt
                + ". Include a code example with complete comments in your answer.");
        messages.add(userMsg);

        // 2. Build the request body
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "deepseek-chat");
        requestBody.put("messages", messages);
        requestBody.put("temperature", 0.7);

        // 3. Send the HTTP request
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(API_URL))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + API_KEY)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(requestBody)))
                .build();

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // 4. Parse the response
        if (response.statusCode() == 200) {
            Map<String, Object> responseMap = GSON.fromJson(response.body(), Map.class);
            List<Map<String, Object>> choices = (List<Map<String, Object>>) responseMap.get("choices");
            // "message" is a JSON object; the generated text is in its "content" field
            Map<String, Object> message = (Map<String, Object>) choices.get(0).get("message");
            return (String) message.get("content");
        } else {
            throw new RuntimeException("Call failed: " + response.statusCode() + ", " + response.body());
        }
    }

    public static void main(String[] args) {
        try {
            // Example: ask for a singleton implementation
            String result = callModel("Implement a thread-safe lazy singleton in Java");
            System.out.println(result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


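1.2 Function Calling: the bridge between large models and traditional applications

Large models are good at understanding natural language, but they cannot query a database or execute business rules on their own. Function Calling addresses this: based on the user's intent, the model requests a call to a function in the traditional application, which closes the loop between language understanding and business operations.

Core flow

  1. Wrap existing capabilities as well-defined functions (for example querying orders, accessing a database, or calling a third-party API);
  2. Describe each function (name, parameters, purpose) in the request so the model can decide whether a call is needed;
  3. The model returns the name of the function to call and its arguments;
  4. The application executes the function and sends the result back to the model;
  5. The model incorporates the result and produces the final answer.

Important: some models (for example DeepSeek-R1) do not support Function Calling, so choose a compatible model (for example GPT-4o, or Qwen-turbo on Alibaba Cloud Bailian).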
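Steps 3 and 4 of the flow amount to a lookup from the function name the model returns to a local handler. The sketch below shows one way to organize that lookup. `FunctionRegistry`, `register`, and `dispatch` are hypothetical names, and the handlers return hand-written JSON strings purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry mapping function names returned by the model to local handlers. */
class FunctionRegistry {
    private final Map<String, Function<Map<String, String>, String>> handlers = new HashMap<>();

    void register(String name, Function<Map<String, String>, String> handler) {
        handlers.put(name, handler);
    }

    /** Runs the named function; an unknown name yields an error payload instead of an exception. */
    String dispatch(String name, Map<String, String> args) {
        Function<Map<String, String>, String> handler = handlers.get(name);
        if (handler == null) {
            // Returning the error as data lets the model see it and recover
            return "{\"error\":\"unknown function: " + name + "\"}";
        }
        return handler.apply(args);
    }

    public static void main(String[] args) {
        FunctionRegistry registry = new FunctionRegistry();
        registry.register("queryOrder",
                a -> "{\"userId\":\"" + a.get("userId") + "\",\"status\":\"paid\"}");

        // Name and arguments as the model would return them in step 3
        System.out.println(registry.dispatch("queryOrder", Map.of("userId", "U123456")));
        System.out.println(registry.dispatch("deleteOrder", Map.of()));
    }
}
```

Only registered names can ever be executed, so the model cannot trigger arbitrary code, and adding a new capability is a single `register` call.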


Java example (calling an order-query function)

import com.google.gson.Gson;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.*;

public class FunctionCallingDemo {
    private static final String API_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions";
    private static final String API_KEY = "your Alibaba Cloud Bailian API key";
    private static final Gson GSON = new Gson();

    // Stand-in for an existing business function: look up a user's order
    public static String queryOrder(String userId) {
        // In a real system this would be a database query
        Map<String, Object> order = new HashMap<>();
        order.put("orderId", "ORD20250501001");
        order.put("userId", userId);
        order.put("amount", 999.0);
        order.put("status", "paid");
        return GSON.toJson(order);
    }

    // Sends the request body and returns the "message" object of the first choice
    @SuppressWarnings("unchecked")
    private static Map<String, Object> chat(HttpClient client, Map<String, Object> requestBody) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(API_URL))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + API_KEY)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(requestBody)))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new RuntimeException("Call failed: " + response.statusCode() + ", " + response.body());
        }
        Map<String, Object> responseMap = GSON.fromJson(response.body(), Map.class);
        List<Map<String, Object>> choices = (List<Map<String, Object>>) responseMap.get("choices");
        return (Map<String, Object>) choices.get(0).get("message");
    }

    @SuppressWarnings("unchecked")
    public static void callWithFunction(String userQuery) throws Exception {
        // 1. Describe the function; "parameters" must be a JSON Schema object
        Map<String, Object> userIdSchema = new HashMap<>();
        userIdSchema.put("type", "string");
        userIdSchema.put("description", "Unique user ID");
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("type", "object");
        parameters.put("properties", Map.of("userId", userIdSchema));
        parameters.put("required", List.of("userId"));

        Map<String, Object> function = new HashMap<>();
        function.put("name", "queryOrder");
        function.put("description", "Look up order information by user ID; call only when the user asks about orders");
        function.put("parameters", parameters);

        // 2. Build the request body, including the function definition.
        // This uses the OpenAI-style "tools" / "tool_choice" fields, which replaced the
        // deprecated "functions" / "function_call" fields; check your provider's docs.
        List<Map<String, Object>> messages = new ArrayList<>();
        messages.add(Map.of("role", "user", "content", userQuery));
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "qwen-turbo");
        requestBody.put("messages", messages);
        requestBody.put("tools", List.of(Map.of("type", "function", "function", function)));
        requestBody.put("tool_choice", "auto"); // let the model decide whether to call the function

        // 3. Send the request
        HttpClient client = HttpClient.newHttpClient();
        Map<String, Object> message = chat(client, requestBody);

        // 4. Check whether the model asked for a function call
        List<Map<String, Object>> toolCalls = (List<Map<String, Object>>) message.get("tool_calls");
        if (toolCalls == null || toolCalls.isEmpty()) {
            // No call needed: print the model's answer directly
            System.out.println("Model answer: " + message.get("content"));
            return;
        }

        messages.add(message); // the assistant turn that requested the call(s)
        for (Map<String, Object> call : toolCalls) {
            Map<String, Object> fn = (Map<String, Object>) call.get("function");
            String funcName = (String) fn.get("name");
            Map<String, String> funcArgs = GSON.fromJson((String) fn.get("arguments"), Map.class);

            // Execute the local function
            String funcResult = "queryOrder".equals(funcName)
                    ? queryOrder(funcArgs.get("userId"))
                    : "{\"error\":\"unknown function\"}";

            // 5. Return the function result to the model
            messages.add(Map.of(
                    "role", "tool",
                    "tool_call_id", (String) call.get("id"),
                    "content", funcResult
            ));
        }

        // Send the conversation again so the model can produce the final answer
        Map<String, Object> finalMessage = chat(client, requestBody);
        System.out.println("Final answer: " + finalMessage.get("content"));
    }

    public static void main(String[] args) {
        try {
            // Example: the user asks about their own order
            callWithFunction("Look up my order information; my user ID is U123456");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


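1.3 RAG: retrieval-augmented generation, going beyond the model's built-in knowledge

RAG (Retrieval-Augmented Generation) is the main approach for dealing with a large model's stale knowledge and lack of specialized knowledge. By combining information retrieval with generation, it lets the model answer from an up-to-date or domain-specific knowledge base.

Pain points addressed

  1. A model's training data is fixed and cannot be updated in real time;
  2. General-purpose models lack vertical-domain knowledge;
  3. The context window is limited, so a large knowledge base cannot be passed in directly.

Core RAG modules

(1) Retrieval module

  1. Text splitting: split the documents into fixed-length chunks (for example 500 characters per chunk);
  2. Text embedding: convert each chunk into a vector and store it in a vector database (such as Milvus or Chroma);
  3. Text retrieval: embed the user's question and find the most similar chunks in the vector store.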
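The three retrieval steps can be sketched in memory without any external service. The example below splits text into fixed-size chunks and ranks stored vectors by cosine similarity. The overlap between adjacent chunks is a common practice added here as an assumption, not something the list above requires. The vectors in `main` are hand-written stand-ins for real embeddings, and `RetrievalSketch` is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** In-memory sketch of the retrieval steps; a real system would use an embedding model and a vector database. */
class RetrievalSketch {

    /** Splits text into chunks of at most chunkSize characters; adjacent chunks share overlap characters. */
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }

    /** Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indices of the k stored vectors most similar to the query, best first. */
    static List<Integer> topK(double[] query, List<double[]> stored, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            indices.add(i);
        }
        indices.sort(Comparator.comparingDouble((Integer i) -> cosine(query, stored.get(i))).reversed());
        return new ArrayList<>(indices.subList(0, Math.min(k, indices.size())));
    }

    public static void main(String[] args) {
        System.out.println(split("abcdefghij", 4, 1));

        // Toy 2-dimensional "embeddings"; real ones have hundreds or thousands of dimensions
        List<double[]> stored = List.of(new double[]{0, 1}, new double[]{1, 0}, new double[]{1, 1});
        System.out.println(topK(new double[]{1, 0}, stored, 2));
    }
}
```

A vector database performs the same ranking, but with approximate indexes that scale to millions of chunks.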

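(2) Generation module

  1. Prompt assembly: concatenate the retrieved chunks with the user's question to form the prompt;
  2. Answer generation: call the model with the assembled prompt to produce the answer.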
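The prompt-assembly step can be sketched as follows. Numbering the chunks and capping the total context length are illustrative choices rather than requirements. The character budget is a rough stand-in for the model's token limit, and `RagPromptBuilder` is a hypothetical name.

```java
import java.util.List;

/** Hypothetical helper that combines retrieved chunks and the question into one prompt. */
class RagPromptBuilder {

    /**
     * Adds chunks in ranked order until the next one would exceed maxContextChars,
     * then appends the question and an instruction to answer only from the references.
     */
    static String build(String question, List<String> chunks, int maxContextChars) {
        StringBuilder context = new StringBuilder();
        for (int i = 0; i < chunks.size(); i++) {
            String entry = "[" + (i + 1) + "] " + chunks.get(i) + "\n";
            if (context.length() + entry.length() > maxContextChars) {
                break; // keep the prompt within the budget; lower-ranked chunks are dropped
            }
            context.append(entry);
        }
        return "Answer the question using only the reference material below. "
                + "If the material does not contain the answer, say so.\n\n"
                + "Reference material:\n" + context
                + "\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "Product A pricing: Basic 99 yuan/month, Premium 199 yuan/month.",
                "Product A supports a 30-day refund.");
        System.out.println(build("How much is the Premium plan?", chunks, 500));
    }
}
```

Because the chunks arrive ranked by relevance, truncating from the end drops the least relevant material first.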

Java example (a simplified RAG pipeline on the Milvus vector database)

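import com.google.gson.Gson;
import io.milvus.client.MilvusClient;
import io.milvus.client.MilvusServiceClient;
import io.milvus.common.clientenum.ConsistencyLevelEnum;
import io.milvus.grpc.DataType;
import io.milvus.grpc.SearchResults;
import io.milvus.param.ConnectParam;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;
import io.milvus.param.R;
import io.milvus.param.collection.CreateCollectionParam;
import io.milvus.param.collection.FieldType;
import io.milvus.param.collection.LoadCollectionParam;
import io.milvus.param.dml.InsertParam;
import io.milvus.param.dml.SearchParam;
import io.milvus.param.index.CreateIndexParam;
import io.milvus.response.SearchResultsWrapper;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.*;

// Note: the Milvus calls follow the milvus-sdk-java 2.3.x (v1 client) API as best understood;
// names may differ in other SDK versions, so verify against the version you use.
public class RAGDemo {
    // Vector database settings
    private static final MilvusClient MILVUS_CLIENT = new MilvusServiceClient(
            ConnectParam.newBuilder()
                    .withHost("localhost")
                    .withPort(19530)
                    .build()
    );
    private static final String COLLECTION_NAME = "knowledge_base";
    // Must equal the embedding model's output size; text-embedding-ada-002 returns 1536 dimensions
    private static final int DIMENSION = 1536;

    // Model API settings.
    // Assumption: EMBEDDING_API is an OpenAI-compatible embeddings endpoint that serves the model
    // named in getEmbedding(); replace the URL, model name, and key with your provider's values.
    private static final String EMBEDDING_API = "https://api.deepseek.com/v1/embeddings";
    private static final String CHAT_API = "https://api.deepseek.com/v1/chat/completions";
    private static final String API_KEY = "your DeepSeek API key";
    private static final Gson GSON = new Gson();

    // 1. Initialize the vector database: create the collection, index it, and load it
    public static void initVectorDB() {
        FieldType idField = FieldType.newBuilder()
                .withName("id").withDataType(DataType.Int64)
                .withPrimaryKey(true).withAutoID(true) // let Milvus generate unique IDs
                .build();
        FieldType vectorField = FieldType.newBuilder()
                .withName("vector").withDataType(DataType.FloatVector)
                .withDimension(DIMENSION)
                .build();
        FieldType textField = FieldType.newBuilder()
                .withName("text").withDataType(DataType.VarChar)
                .withMaxLength(2048)
                .build();
        MILVUS_CLIENT.createCollection(CreateCollectionParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .addFieldType(idField)
                .addFieldType(vectorField)
                .addFieldType(textField)
                .build());
        MILVUS_CLIENT.createIndex(CreateIndexParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .withFieldName("vector")
                .withIndexType(IndexType.IVF_FLAT)
                .withMetricType(MetricType.COSINE) // match by cosine similarity
                .withExtraParam("{\"nlist\":128}")
                .build());
        // A collection must be loaded into memory before it can be searched
        MILVUS_CLIENT.loadCollection(LoadCollectionParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .build());
    }

    // 2. Embed a text chunk and insert it into the vector database
    public static void insertToVectorDB(String text) throws Exception {
        // Call the embedding API to get the vector
        List<Float> vector = getEmbedding(text);

        // Build the row; the "id" field is omitted because it is auto-generated
        List<InsertParam.Field> fields = new ArrayList<>();
        fields.add(new InsertParam.Field("vector", Collections.singletonList(vector)));
        fields.add(new InsertParam.Field("text", Collections.singletonList(text)));

        InsertParam insertParam = InsertParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .withFields(fields)
                .build();
        MILVUS_CLIENT.insert(insertParam);
    }

    // 3. Generate the embedding vector for a text (calls the embedding API)
    @SuppressWarnings("unchecked")
    private static List<Float> getEmbedding(String text) throws Exception {
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "text-embedding-ada-002");
        requestBody.put("input", text);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(EMBEDDING_API))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + API_KEY)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(requestBody)))
                .build();

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new RuntimeException("Embedding call failed: " + response.statusCode() + ", " + response.body());
        }
        Map<String, Object> responseMap = GSON.fromJson(response.body(), Map.class);
        List<Map<String, Object>> data = (List<Map<String, Object>>) responseMap.get("data");
        List<Double> vecList = (List<Double>) data.get(0).get("embedding");

        // Gson parses JSON numbers as Double; Milvus float vectors need Float
        List<Float> vector = new ArrayList<>(vecList.size());
        for (Double value : vecList) {
            vector.add(value.floatValue());
        }
        return vector;
    }

    // 4. Retrieve the text chunks most relevant to the query
    public static String searchRelevantText(String query) throws Exception {
        List<Float> queryVector = getEmbedding(query);

        SearchParam searchParam = SearchParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .withMetricType(MetricType.COSINE)
                .withVectorFieldName("vector")
                .withVectors(Collections.singletonList(queryVector))
                .withTopK(3) // return the 3 most relevant chunks
                .withOutFields(Collections.singletonList("text"))
                .withParams("{\"nprobe\":10}")
                .withConsistencyLevel(ConsistencyLevelEnum.STRONG) // see rows inserted just before the search
                .build();

        R<SearchResults> searchResponse = MILVUS_CLIENT.search(searchParam);
        if (searchResponse.getStatus() != R.Status.Success.getCode()) {
            throw new RuntimeException("Milvus search failed: " + searchResponse.getMessage());
        }
        SearchResultsWrapper resultsWrapper = new SearchResultsWrapper(searchResponse.getData().getResults());
        StringBuilder relevantText = new StringBuilder();
        // Index 0 refers to the first (and only) query vector
        for (Object text : resultsWrapper.getFieldData("text", 0)) {
            relevantText.append(text).append("\n");
        }
        return relevantText.toString();
    }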

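    // 5. Full RAG flow: retrieve, then generate
    @SuppressWarnings("unchecked")
    public static String ragAnswer(String query) throws Exception {
        // Retrieve the relevant text
        String relevantText = searchRelevantText(query);

        // Build the prompt by combining the retrieved text with the question
        List<Map<String, String>> messages = new ArrayList<>();
        messages.add(Map.of("role", "user", "content",
                "Answer the question based on the reference information below:\n" + relevantText
                        + "\nQuestion: " + query
                        + "\nRequirement: answer only from the reference information, and say so if it contains nothing relevant."));

        // Call the model to generate the answer
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "deepseek-chat");
        requestBody.put("messages", messages);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(CHAT_API))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + API_KEY)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(requestBody)))
                .build();

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new RuntimeException("Call failed: " + response.statusCode() + ", " + response.body());
        }
        Map<String, Object> responseMap = GSON.fromJson(response.body(), Map.class);
        List<Map<String, Object>> choices = (List<Map<String, Object>>) responseMap.get("choices");
        // "message" is a JSON object; the generated text is in its "content" field
        Map<String, Object> message = (Map<String, Object>) choices.get(0).get("message");
        return (String) message.get("content");
    }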

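    public static void main(String[] args) {
        try {
            // Initialize the vector database and insert sample knowledge (e.g. product documentation)
            initVectorDB();
            insertToVectorDB("Product A pricing: Basic plan 99 yuan/month, Premium plan 199 yuan/month, with a 30-day no-questions-asked refund");
            insertToVectorDB("Product A features: multi-device sync, data encryption, and custom report export");

            // Example: the user asks about Product A pricing
            String answer = ragAnswer("How much is the Premium plan of Product A? Does it support refunds?");
            System.out.println("RAG answer: " + answer);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}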


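1.4 Fine-tuning: adapting the model to enterprise-specific needs

Fine-tuning means training a pretrained large model further on an enterprise's private data, adjusting some of the model's parameters so that its output fits the enterprise's business scenarios more closely.


Core flow

  1. Choose a pretrained model: pick one that fits the business (for example Qwen-2.5 or DeepSeek-7B);
  2. Prepare the dataset: collect enterprise-specific data (for example an industry knowledge base or business conversation logs);
  3. Configure hyperparameters: set the learning rate, batch size, number of epochs, and so on;
  4. Train: update the model parameters through the forward pass, loss computation, and backpropagation;
  5. Validate and deploy: evaluate the fine-tuned model and deploy it to production.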
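Steps 3 and 4 are easier to picture on a toy problem. The sketch below is not fine-tuning code, since real fine-tuning updates millions of parameters inside a deep-learning framework. It only illustrates the forward pass, loss, gradient, and update cycle by fitting a single weight `w` in `y = w * x` with gradient descent. The class and method names are hypothetical.

```java
/** Toy illustration of a training loop: fits y = w * x by gradient descent on mean squared error. */
class TrainingLoopSketch {

    static double train(double[] xs, double[] ys, double learningRate, int epochs) {
        double w = 0.0; // the single trainable parameter
        for (int epoch = 0; epoch < epochs; epoch++) {
            double gradient = 0.0;
            for (int i = 0; i < xs.length; i++) {
                double prediction = w * xs[i];   // forward pass
                double error = prediction - ys[i]; // the loss for this sample is error^2
                gradient += 2 * error * xs[i];   // derivative of error^2 with respect to w
            }
            // Full-batch update: step against the mean gradient, scaled by the learning rate
            w -= learningRate * gradient / xs.length;
        }
        return w;
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3};
        double[] ys = {2, 4, 6}; // generated by y = 2x
        System.out.println("learned w = " + train(xs, ys, 0.05, 200));
    }
}
```

The learning rate and epoch count here play the same role as the hyperparameters in step 3: a rate that is too large makes the updates diverge, and too few epochs stop before the loss has settled.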

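Main drawbacks

  1. High cost: it requires substantial compute, such as a GPU cluster;
  2. High difficulty: hyperparameter tuning is complex and calls for specialist algorithm engineers;
  3. High risk: overfitting is common and degrades generalization.

Note: Java is not a mainstream language for model fine-tuning (Python is), so no Java fine-tuning code is provided here. For enterprise fine-tuning, PyTorch or TensorFlow is recommended; the fine-tuned model can then be called from Java through an API.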


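II. Architecture Selection Principles

2.1 Cost ordering (low to high)

Pure Prompt < Function Calling < RAG < Fine-tuning

2.2 Core selection principle

Among the options that achieve the required business outcome, prefer the one with lower development and operating cost. The following flow is a guide:

  1. Simple scenarios (general Q&A, copywriting): start with pure Prompt and meet the need through refined prompts;
  2. Integration with existing systems (looking up orders, calling interfaces): choose Function Calling;
  3. Up-to-date or specialized knowledge (enterprise knowledge-base Q&A, industry news answers): choose RAG;
  4. None of the first three is sufficient (highly customized industry Q&A, proprietary phrasing and tone): evaluate the cost, then consider Fine-tuning.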
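The selection flow can be written down as a small decision function. The sketch below is one possible encoding. The flag names are hypothetical, and returning both Function Calling and RAG when both needs apply is an assumption, since the flow above does not say how to handle combined needs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical encoding of the selection flow as a decision function. */
class ArchitectureSelector {

    /** Declared in ascending cost order, matching the cost ordering in 2.1. */
    enum Architecture { PURE_PROMPT, FUNCTION_CALLING, RAG, FINE_TUNING }

    static List<Architecture> recommend(boolean needsSystemIntegration,
                                        boolean needsExternalKnowledge,
                                        boolean cheaperOptionsInsufficient) {
        // Fine-tuning is the last resort, considered only after the cheaper options fall short
        if (cheaperOptionsInsufficient) {
            return List.of(Architecture.FINE_TUNING);
        }
        List<Architecture> picks = new ArrayList<>();
        if (needsSystemIntegration) {
            picks.add(Architecture.FUNCTION_CALLING);
        }
        if (needsExternalKnowledge) {
            picks.add(Architecture.RAG);
        }
        if (picks.isEmpty()) {
            picks.add(Architecture.PURE_PROMPT); // simple scenario: a refined prompt is enough
        }
        return picks;
    }

    public static void main(String[] args) {
        System.out.println(recommend(false, false, false)); // copywriting assistant
        System.out.println(recommend(true, false, false));  // order lookup bot
        System.out.println(recommend(false, true, false));  // knowledge-base Q&A
        System.out.println(recommend(false, false, true));  // cheaper options already ruled out
    }
}
```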

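III. Summary

The core of large-model application development is choosing the architecture that matches the scenario: pure Prompt is the first choice for getting started, Function Calling connects to existing systems, RAG addresses the model's knowledge limits, and Fine-tuning is the last resort. Most enterprises can meet their core needs with the first three, so there is no need to pursue fine-tuning blindly. Developers can use the Java examples in this article as a starting point for each architecture and iterate as business complexity grows.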
