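In large-model application development, the choice of technical architecture determines an application's capability boundary, development cost, and time to production. This article breaks down four core architectures (pure Prompt, Function Calling, RAG, and Fine-tuning) and combines selection principles with hands-on Java code to help developers match an architecture to their business scenario and ship large-model applications efficiently.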
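I. The Four Core Technical Architectures in Depth
1.1 Pure Prompt Mode: A Low-Cost Entry Point for Fast Delivery
Pure Prompt mode is the most basic form of large-model application development. It relies on prompt engineering: carefully designed prompts steer the model toward the expected output, with no system integration needed beyond the API call itself.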

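Core logic:
- Output quality depends heavily on how precise the prompt is; for the same question, different prompts can produce very different answers;
- It suits simple scenarios such as Q&A, copywriting, and writing code snippets, where passing a well-designed prompt through the API is enough to deliver the core functionality.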
Minimal Java example (calling a DeepSeek model):
import com.google.gson.Gson;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PurePromptDemo {
    private static final String API_URL = "https://api.deepseek.com/v1/chat/completions";
    // Placeholder: replace with your own key (in real projects, read it from an environment variable)
    private static final String API_KEY = "your DeepSeek API key";
    private static final Gson GSON = new Gson();

    @SuppressWarnings("unchecked")
    public static String callModel(String prompt) throws Exception {
        // Wrap the task in a role and explicit output requirements: this is the "prompt engineering" part
        List<Map<String, String>> messages = new ArrayList<>();
        Map<String, String> userMsg = new HashMap<>();
        userMsg.put("role", "user");
        userMsg.put("content", "You are a senior Java engineer. " + prompt
                + ". Include a code example with complete comments.");
        messages.add(userMsg);

        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "deepseek-chat");
        requestBody.put("messages", messages);
        requestBody.put("temperature", 0.7);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(API_URL))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + API_KEY)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(requestBody)))
                .build();
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() == 200) {
            Map<String, Object> responseMap = GSON.fromJson(response.body(), Map.class);
            List<Map<String, Object>> choices = (List<Map<String, Object>>) responseMap.get("choices");
            // "message" is a JSON object; the reply text lives in its "content" field
            Map<String, Object> message = (Map<String, Object>) choices.get(0).get("message");
            return (String) message.get("content");
        } else {
            throw new RuntimeException("Call failed: " + response.statusCode() + ", " + response.body());
        }
    }

    public static void main(String[] args) {
        try {
            String result = callModel("Implement a thread-safe lazy-initialized singleton in Java");
            System.out.println(result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
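The demo above concatenates the role and the output requirements into a single user message. As a minimal sketch of a more reusable layout, the role and constraints can go into a separate system message, which chat-completions APIs of this style generally accept. `PromptTemplate` and its methods are hypothetical names introduced here for illustration; they are not part of any SDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (illustration only): assembles chat messages from a role, constraints, and a task.
final class PromptTemplate {
    private final String role;
    private final List<String> constraints = new ArrayList<>();

    PromptTemplate(String role) {
        this.role = role;
    }

    // Adds one output requirement; returns this so calls can be chained.
    PromptTemplate constraint(String text) {
        constraints.add(text);
        return this;
    }

    // Returns a system message (role plus numbered constraints) followed by the user task.
    List<Map<String, String>> build(String task) {
        StringBuilder system = new StringBuilder(role);
        for (int i = 0; i < constraints.size(); i++) {
            system.append("\n").append(i + 1).append(". ").append(constraints.get(i));
        }
        List<Map<String, String>> messages = new ArrayList<>();
        messages.add(message("system", system.toString()));
        messages.add(message("user", task));
        return messages;
    }

    private static Map<String, String> message(String role, String content) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("role", role);
        m.put("content", content);
        return m;
    }
}
```

The returned list has the same shape as the `messages` list in the demo, so it could be placed in the request body unchanged; keeping the constraints in one place makes it easier to compare prompt variants for the same task.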
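1.2 Function Calling: The Bridge Between Large Models and Traditional Applications
Large models are good at understanding natural language, but they cannot operate a database or execute business rules directly. Function Calling addresses this gap: based on the user's intent, the model asks the application to run one of its functions, closing the loop between language understanding and business operations.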

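Core flow:
- Wrap existing functionality as well-defined functions (for example, querying orders, operating on a database, or calling a third-party API);
- Describe each function (name, parameters, purpose) in the request so the model can decide whether a call is needed;
- The model returns the name of the function to call and its arguments;
- The application executes the function and sends the result back to the model;
- The model combines the result into the final answer.
Key note: some models do not support Function Calling (DeepSeek-R1, for example, did not at its initial release), so check the provider's documentation and choose a compatible model such as GPT-4o or qwen-turbo on Alibaba Cloud Bailian.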
Java example (calling an "order query" function):
import com.google.gson.Gson;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.*;

// Uses the "tools" / "tool_calls" fields of the OpenAI-compatible protocol; the older
// "functions" / "function_call" fields are deprecated there. Check your provider's documentation.
public class FunctionCallingDemo {
    private static final String API_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions";
    // Placeholder: replace with your own key
    private static final String API_KEY = "your Alibaba Cloud Bailian API key";
    private static final Gson GSON = new Gson();
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Simulated business function; a real implementation would query the order system
    public static String queryOrder(String userId) {
        Map<String, Object> order = new HashMap<>();
        order.put("orderId", "ORD20250501001");
        order.put("userId", userId);
        order.put("amount", 999.0);
        order.put("status", "paid");
        return GSON.toJson(order);
    }

    // Sends one chat-completions request and returns the parsed JSON response
    @SuppressWarnings("unchecked")
    private static Map<String, Object> post(Map<String, Object> requestBody) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(API_URL))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + API_KEY)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(requestBody)))
                .build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new RuntimeException("Call failed: " + response.statusCode() + ", " + response.body());
        }
        return GSON.fromJson(response.body(), Map.class);
    }

    // Extracts choices[0].message from a chat-completions response
    @SuppressWarnings("unchecked")
    private static Map<String, Object> firstMessage(Map<String, Object> responseMap) {
        List<Map<String, Object>> choices = (List<Map<String, Object>>) responseMap.get("choices");
        return (Map<String, Object>) choices.get(0).get("message");
    }

    @SuppressWarnings("unchecked")
    public static void callWithFunction(String userQuery) throws Exception {
        // Function parameters are described as a JSON Schema object, not as a list
        Map<String, Object> userIdSchema = Map.of("type", "string", "description", "Unique user ID");
        Map<String, Object> parameters = Map.of(
                "type", "object",
                "properties", Map.of("userId", userIdSchema),
                "required", List.of("userId"));
        Map<String, Object> function = new HashMap<>();
        function.put("name", "queryOrder");
        function.put("description", "Look up order information by user ID; call only for order-related questions");
        function.put("parameters", parameters);
        Map<String, Object> tool = Map.of("type", "function", "function", function);

        // Message values are not all strings (tool_calls is a list), hence Map<String, Object>
        List<Map<String, Object>> messages = new ArrayList<>();
        messages.add(Map.of("role", "user", "content", userQuery));
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "qwen-turbo");
        requestBody.put("messages", messages);
        requestBody.put("tools", List.of(tool));
        requestBody.put("tool_choice", "auto");

        Map<String, Object> message = firstMessage(post(requestBody));
        List<Map<String, Object>> toolCalls = (List<Map<String, Object>>) message.get("tool_calls");
        if (toolCalls == null || toolCalls.isEmpty()) {
            // The model answered directly without requesting a function call
            System.out.println("Model answer: " + message.get("content"));
            return;
        }

        // Echo the assistant message back, then append one "tool" message per requested call
        messages.add(message);
        for (Map<String, Object> toolCall : toolCalls) {
            Map<String, Object> funcCall = (Map<String, Object>) toolCall.get("function");
            String funcName = (String) funcCall.get("name");
            // "arguments" is a JSON string that has to be parsed separately
            Map<String, Object> funcArgs = GSON.fromJson((String) funcCall.get("arguments"), Map.class);
            String funcResult = "queryOrder".equals(funcName)
                    ? queryOrder(String.valueOf(funcArgs.get("userId")))
                    : "{\"error\":\"unknown function\"}";
            messages.add(Map.of(
                    "role", "tool",
                    "tool_call_id", (String) toolCall.get("id"),
                    "content", funcResult));
        }

        // Second round: the model turns the function result into the final answer
        Map<String, Object> finalMessage = firstMessage(post(requestBody));
        System.out.println("Final answer: " + finalMessage.get("content"));
    }

    public static void main(String[] args) {
        try {
            callWithFunction("Look up my order information. My user ID is U123456");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
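The demo above handles a single function name inline. Once an application exposes several functions, one possible approach is a small registry that maps the name returned by the model to a local handler and turns unknown names or failures into an error payload the model can read. This is a sketch under assumptions: `FunctionRegistry`, `register`, and `dispatch` are hypothetical names, and arguments are simplified to a string map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry (illustration only): routes a model-chosen function name to a local handler.
final class FunctionRegistry {
    private final Map<String, Function<Map<String, String>, String>> handlers = new LinkedHashMap<>();

    // Registers a handler under the name advertised to the model.
    FunctionRegistry register(String name, Function<Map<String, String>, String> handler) {
        handlers.put(name, handler);
        return this;
    }

    // Runs the named function; unknown names and handler failures become JSON error strings.
    String dispatch(String name, Map<String, String> args) {
        Function<Map<String, String>, String> handler = handlers.get(name);
        if (handler == null) {
            return "{\"error\":\"unknown function: " + name + "\"}";
        }
        try {
            return handler.apply(args);
        } catch (RuntimeException e) {
            return "{\"error\":\"execution failed: " + name + "\"}";
        }
    }

    public static void main(String[] args) {
        FunctionRegistry registry = new FunctionRegistry()
                .register("queryOrder", a -> "{\"userId\":\"" + a.get("userId") + "\",\"status\":\"paid\"}");
        System.out.println(registry.dispatch("queryOrder", Map.of("userId", "U123456")));
        System.out.println(registry.dispatch("cancelOrder", Map.of()));
    }
}
```

Returning an error string instead of throwing keeps the conversation going: the error can be sent back as the function result, and the model can explain the failure to the user.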
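1.3 RAG: Retrieval-Augmented Generation, Extending the Model Beyond Its Built-In Knowledge
RAG (Retrieval-Augmented Generation) is the main way to address two weaknesses of large models: stale information and thin domain knowledge. It combines information retrieval with model generation so that answers are grounded in an up-to-date or specialized knowledge base.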

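Pain points it addresses:
- A model's training data is fixed and cannot be updated in real time;
- General-purpose models lack vertical-domain knowledge;
- The context window is limited, so a large knowledge base cannot be passed in directly.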
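Core RAG modules:
(1) Retrieval module
- Text splitting: split large documents into fixed-length chunks (for example, 500 characters per chunk);
- Text embedding: convert each chunk into a vector and store it in a vector database (such as Milvus or Chroma);
- Text retrieval: embed the user's question and fetch the most similar chunks from the vector store.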
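To make the retrieval steps concrete, here is a minimal in-memory sketch of splitting and similarity search. It assumes the vectors already come from an embedding model, adds a chunk overlap (a common refinement not mentioned above), and ranks by cosine similarity with a linear scan; a vector database does the same job with an index. `RetrievalSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory sketch (illustration only) of text splitting and top-k retrieval.
final class RetrievalSketch {

    // Splits text into chunks of at most `size` UTF-16 chars; consecutive chunks share `overlap` chars.
    static List<String> split(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }

    // Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of the k vectors most similar to the query, most similar first.
    static List<Integer> topK(float[] query, List<float[]> vectors, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            indices.add(i);
        }
        indices.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, vectors.get(i))));
        return new ArrayList<>(indices.subList(0, Math.min(k, indices.size())));
    }
}
```

For example, `split("abcdefghij", 4, 1)` yields the chunks `abcd`, `defg`, and `ghij`; the overlap keeps a sentence that straddles a chunk boundary from being lost to both neighbors.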
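(2) Generation module
- Prompt assembly: concatenate the retrieved chunks and the user's question into a single prompt;
- Answer generation: call the model with the assembled prompt to produce the answer.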
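The prompt-assembly step can be sketched as below. Because the context window is limited, this version stops adding chunks once a character budget is used up; the budget, the numbered `[n]` labels, and the name `RagPromptBuilder` are assumptions made for illustration, and a production system would more likely budget in tokens.

```java
import java.util.List;

// Hypothetical helper (illustration only): builds a RAG prompt from retrieved chunks within a size budget.
final class RagPromptBuilder {

    // Chunks are expected best-first; adding stops at the first chunk that would exceed maxContextChars.
    static String build(String question, List<String> chunks, int maxContextChars) {
        StringBuilder context = new StringBuilder();
        int used = 0;
        for (String chunk : chunks) {
            String entry = "[" + (used + 1) + "] " + chunk + "\n";
            if (context.length() + entry.length() > maxContextChars) {
                break;
            }
            context.append(entry);
            used++;
        }
        return "Answer the question using only the reference information below. "
                + "If the references do not contain the answer, say so.\n"
                + "References:\n" + context
                + "Question: " + question;
    }
}
```

Numbering the references also makes it possible to ask the model to cite which chunk supports each statement, which helps when checking answers against the knowledge base.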
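Java example (a simplified RAG pipeline on the Milvus vector database):
import com.google.gson.Gson;
import io.milvus.client.MilvusServiceClient;
import io.milvus.common.clientenum.ConsistencyLevelEnum;
import io.milvus.grpc.DataType;
import io.milvus.grpc.SearchResults;
import io.milvus.param.ConnectParam;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;
import io.milvus.param.R;
import io.milvus.param.collection.CreateCollectionParam;
import io.milvus.param.collection.FieldType;
import io.milvus.param.collection.LoadCollectionParam;
import io.milvus.param.dml.InsertParam;
import io.milvus.param.dml.SearchParam;
import io.milvus.param.index.CreateIndexParam;
import io.milvus.response.SearchResultsWrapper;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.*;

// Written against the v1 client API of milvus-sdk-java 2.3.x. Builder and class names differ
// between SDK versions, so verify them against the version you depend on.
public class RAGDemo {
    private static final MilvusServiceClient MILVUS_CLIENT = new MilvusServiceClient(
            ConnectParam.newBuilder()
                    .withHost("localhost")
                    .withPort(19530)
                    .build());
    private static final String COLLECTION_NAME = "knowledge_base";
    // Must equal the embedding model's output size; text-embedding-ada-002 returns 1536 dimensions
    private static final int DIMENSION = 1536;
    // text-embedding-ada-002 is an OpenAI model, so embeddings are requested from OpenAI's endpoint here
    // (assumption). Any OpenAI-compatible embeddings endpoint works if model and DIMENSION are adjusted.
    private static final String EMBEDDING_API = "https://api.openai.com/v1/embeddings";
    private static final String EMBEDDING_API_KEY = "your embedding API key";
    private static final String CHAT_API = "https://api.deepseek.com/v1/chat/completions";
    private static final String CHAT_API_KEY = "your DeepSeek API key";
    private static final Gson GSON = new Gson();
    private static final HttpClient HTTP_CLIENT = HttpClient.newHttpClient();

    // Creates the collection (auto ID, vector, text), builds a cosine index, and loads it for search
    public static void initVectorDB() {
        CreateCollectionParam createParam = CreateCollectionParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .addFieldType(FieldType.newBuilder().withName("id").withDataType(DataType.Int64)
                        .withPrimaryKey(true).withAutoID(true).build())
                .addFieldType(FieldType.newBuilder().withName("vector").withDataType(DataType.FloatVector)
                        .withDimension(DIMENSION).build())
                .addFieldType(FieldType.newBuilder().withName("text").withDataType(DataType.VarChar)
                        .withMaxLength(2048).build())
                .build();
        MILVUS_CLIENT.createCollection(createParam);
        MILVUS_CLIENT.createIndex(CreateIndexParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .withFieldName("vector")
                .withIndexType(IndexType.IVF_FLAT)
                .withMetricType(MetricType.COSINE)
                .withExtraParam("{\"nlist\":128}")
                .build());
        MILVUS_CLIENT.loadCollection(LoadCollectionParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .build());
    }

    // Embeds one text chunk and stores it with its vector; the primary key is generated by Milvus
    public static void insertToVectorDB(String text) throws Exception {
        List<InsertParam.Field> fields = new ArrayList<>();
        fields.add(new InsertParam.Field("vector", Collections.singletonList(getEmbedding(text))));
        fields.add(new InsertParam.Field("text", Collections.singletonList(text)));
        InsertParam insertParam = InsertParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .withFields(fields)
                .build();
        MILVUS_CLIENT.insert(insertParam);
    }

    // Posts a JSON body and returns the parsed JSON response; fails on non-200 status codes
    @SuppressWarnings("unchecked")
    private static Map<String, Object> postJson(String url, String apiKey, Map<String, Object> body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(GSON.toJson(body)))
                .build();
        HttpResponse<String> response = HTTP_CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new RuntimeException("Request failed: " + response.statusCode() + ", " + response.body());
        }
        return GSON.fromJson(response.body(), Map.class);
    }

    // Calls the embedding API and converts the result to the List<Float> form Milvus expects
    @SuppressWarnings("unchecked")
    private static List<Float> getEmbedding(String text) throws Exception {
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "text-embedding-ada-002");
        requestBody.put("input", text);
        Map<String, Object> responseMap = postJson(EMBEDDING_API, EMBEDDING_API_KEY, requestBody);
        List<Map<String, Object>> data = (List<Map<String, Object>>) responseMap.get("data");
        // Gson parses JSON numbers as Double when the target type is a raw Map
        List<Double> vecList = (List<Double>) data.get(0).get("embedding");
        List<Float> vector = new ArrayList<>(vecList.size());
        for (Double value : vecList) {
            vector.add(value.floatValue());
        }
        return vector;
    }

    // Embeds the query and returns the text of the 3 most similar chunks, one per line
    public static String searchRelevantText(String query) throws Exception {
        SearchParam searchParam = SearchParam.newBuilder()
                .withCollectionName(COLLECTION_NAME)
                .withVectorFieldName("vector")
                .withVectors(Collections.singletonList(getEmbedding(query)))
                .withMetricType(MetricType.COSINE)
                .withTopK(3)
                .withParams("{\"nprobe\":10}")
                .withOutFields(Collections.singletonList("text"))
                // Strong consistency so rows inserted just before the search are visible
                .withConsistencyLevel(ConsistencyLevelEnum.STRONG)
                .build();
        R<SearchResults> response = MILVUS_CLIENT.search(searchParam);
        if (response.getStatus() != R.Status.Success.getCode()) {
            throw new RuntimeException("Milvus search failed: " + response.getMessage());
        }
        SearchResultsWrapper wrapper = new SearchResultsWrapper(response.getData().getResults());
        StringBuilder relevantText = new StringBuilder();
        for (Object text : wrapper.getFieldData("text", 0)) {
            relevantText.append(text).append("\n");
        }
        return relevantText.toString();
    }

    // Retrieval + generation: put the retrieved chunks and the question into one prompt
    @SuppressWarnings("unchecked")
    public static String ragAnswer(String query) throws Exception {
        String relevantText = searchRelevantText(query);
        List<Map<String, String>> messages = new ArrayList<>();
        messages.add(Map.of("role", "user", "content",
                "Answer the question based on the reference information below:\n" + relevantText
                        + "\nQuestion: " + query
                        + "\nRequirement: answer only from the reference information, and say so if it is not covered"));
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "deepseek-chat");
        requestBody.put("messages", messages);
        Map<String, Object> responseMap = postJson(CHAT_API, CHAT_API_KEY, requestBody);
        List<Map<String, Object>> choices = (List<Map<String, Object>>) responseMap.get("choices");
        // "message" is a JSON object; the reply text lives in its "content" field
        Map<String, Object> message = (Map<String, Object>) choices.get(0).get("message");
        return (String) message.get("content");
    }

    public static void main(String[] args) {
        try {
            initVectorDB();
            insertToVectorDB("Product A pricing: Basic is 99 yuan/month, Premium is 199 yuan/month, with a 30-day no-questions-asked refund");
            insertToVectorDB("Product A features: multi-device sync, data encryption, custom report export");
            String answer = ragAnswer("How much is the Premium edition of Product A? Are refunds supported?");
            System.out.println("RAG answer: " + answer);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}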
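1.4 Fine-tuning: Adapting the Model to Enterprise-Specific Needs
Fine-tuning continues training a pretrained model on an enterprise's private data, adjusting some of the model's parameters so that its output fits the enterprise's business scenarios more closely.
Core flow:
- Choose a pretrained model: pick one that fits the business (for example, Qwen-2.5 or DeepSeek-7B);
- Prepare the dataset: collect enterprise-specific data (such as an industry knowledge base or business conversation logs);
- Configure hyperparameters: set the learning rate, batch size, number of training epochs, and so on;
- Train and optimize: update model parameters through the forward pass, loss computation, and backpropagation;
- Validate and deploy: test the fine-tuned model and deploy it to production.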
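The "forward pass, loss computation, backpropagation" step can be written compactly. As a sketch, assuming standard supervised fine-tuning with a token-level cross-entropy loss and plain gradient descent (the symbols below are notation introduced here, and real training runs usually use mini-batches and an Adam-style optimizer):

```latex
\mathcal{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{t=1}^{T_i}
    \log p_\theta\left(y_{i,t} \mid x_i,\, y_{i,<t}\right),
\qquad
\theta \leftarrow \theta - \eta \, \nabla_\theta \mathcal{L}(\theta)
```

Here $x_i$ is the $i$-th input, $y_i$ is its target output of length $T_i$, $N$ is the number of training examples, $\theta$ denotes the trainable parameters, and $\eta$ is the learning rate from the hyperparameter step. The forward pass computes $p_\theta$, the loss is $\mathcal{L}$, and backpropagation supplies the gradient used in the update.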
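Main drawbacks:
- High cost: training needs substantial compute, such as a GPU cluster;
- High difficulty: hyperparameter tuning is complex and calls for experienced algorithm engineers;
- High risk: overfitting is common and reduces the model's ability to generalize.
Note: Java is not a mainstream language for fine-tuning (Python is), so no Java example is given here. Enterprise fine-tuning is better implemented with PyTorch or TensorFlow, and the fine-tuned model can then be called from Java through an API.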
II. Principles for Choosing an Architecture
2.1 Cost Ranking (Low to High)
Pure Prompt < Function Calling < RAG < Fine-tuning

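2.2 Core Selection Principles
As long as the business outcome is met, prefer the option with lower development and operations cost. Use the following flow as a guide:
- Simple scenarios (such as general Q&A or copywriting): start with pure Prompt mode and meet the need through refined prompts;
- Integration with existing systems (such as looking up orders or calling interfaces): choose Function Calling;
- Up-to-date or specialized knowledge (such as enterprise knowledge-base Q&A or industry news): choose RAG;
- None of the first three is sufficient (such as highly customized industry Q&A or proprietary script generation): weigh the cost, then consider Fine-tuning.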
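The selection flow above can be expressed as a small decision function. This is only a sketch: the three boolean inputs are a simplification introduced here, the class and method names are hypothetical, and the function returns the costliest architecture the stated needs call for, leaving aside the fact that real systems often combine architectures (for example RAG together with Function Calling).

```java
// Hypothetical sketch (illustration only) of the selection flow as a decision function.
final class ArchitectureSelector {

    // Ordered by cost, lowest first, following the cost ranking in section 2.1.
    enum Architecture { PURE_PROMPT, FUNCTION_CALLING, RAG, FINE_TUNING }

    // Returns the costliest architecture required by the given needs.
    static Architecture select(boolean needsSystemActions,
                               boolean needsExternalKnowledge,
                               boolean cheaperOptionsInsufficient) {
        if (cheaperOptionsInsufficient) {
            return Architecture.FINE_TUNING;      // last resort, after weighing the cost
        }
        if (needsExternalKnowledge) {
            return Architecture.RAG;              // up-to-date or specialized knowledge
        }
        if (needsSystemActions) {
            return Architecture.FUNCTION_CALLING; // orders, interfaces, other existing systems
        }
        return Architecture.PURE_PROMPT;          // simple scenarios
    }

    public static void main(String[] args) {
        System.out.println(select(false, false, false));
        System.out.println(select(true, false, false));
        System.out.println(select(false, true, false));
    }
}
```

Checking the most expensive option first and falling through to the cheapest keeps the default at pure Prompt, which matches the principle of preferring the lowest-cost option that meets the need.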
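III. Summary
The core of large-model application development is choosing the architecture that matches the scenario: pure Prompt is the first choice for getting started, Function Calling connects the model to existing systems, RAG addresses knowledge limits, and Fine-tuning is the last resort. The core needs of most enterprises can be met with the first three architectures, so there is no need to pursue fine-tuning blindly. Developers can use the Java examples in this article as a starting point for each architecture and iterate as business complexity grows.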