AI Comic Drama Creation Tutorial: Midjourney + Gpt Image 2 + Seedance 2.0 + Suno (2026 Hands-On Guide)
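Article summary
This 2026 tutorial on AI comic drama creation walks through a workflow that uses Midjourney for images, Gpt Image 2 for the storyboard, Seedance 2.0 for video, and Suno for audio. It provides detailed prompts for building a character identity board, a storyboard, and the final video, and explains how each prompt is written and what it achieves, making it a useful reference for AI comic drama creation.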

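I recently came across a particularly interesting workflow on X: generate images with Midjourney, build a storyboard with Gpt Image 2, feed it to Seedance 2.0, and add audio with Suno.
The results of this workflow are impressive.
The prompts are highly structured, which yields consistent characters, coherent motion, and well-paced storytelling.
This tutorial provides all the prompts and explains, section by section, why each is written the way it is and what that achieves. The full-version English prompts are at the end of the article, ready to copy and paste.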

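Overall workflow
Kōda is one of the most popular AI comic drama creators on X right now. The workflow he shared has three steps.
- Generate a character identity board
- Generate a storyboard (18 consecutive panels)
- Feed them into Seedance 2.0 to generate the video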

Creating the character identity board

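Image generated with Gpt Image 2
Prompt:
Create an artistic 16:9 character identity board.
[Main character]: A young adult female traveling mage with an athletic, slim build; long, flowing blue-white hair with feather ornaments; glowing blue eyes; layered white-and-blue traveling robes with glowing patterns; holding a long staff topped with a round glowing head; an elegant fantasy pose.
Visual style: stylized anime-oil painting / cinematic fantasy concept art.
Pure white or soft off-white background; no environment, no logo, no watermark.
Design direction: create a cinematic identity board that feels like a high-end animation studio character study combined with an art-book layout. Asymmetric, elegant, visually memorable. Avoid grid and catalog-style layouts.
Important layout rule: do not let any character images overlap; every view needs clear separation and breathing room.
Main composition: one large, slightly off-center full-body hero view as the visual anchor. Arrange smaller supporting studies around it: neutral full-body, back view, side view, seated pose, leaning pose, crouching pose, high angle, low angle, expression close-up.
Identity lock: keep strict consistency across all views: same face, hairstyle, outfit, body proportions, and body language.
Art sections: include a small silhouette study area (2-3 simplified black silhouettes), an expression study area, and a detail study area (key features of the face, hair, and outfit).
Text design:
Character name: Mage of Light
Role: traveling mage / forest purifier
Core temperament: calm command, unhurried ritual power
Visual signature: blue-white glowing staff light trails, ethereal forest-guardian aesthetic
Overall style: minimal, cinematic, premium, art-book quality, clean, expressive.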
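Why write it this way, and why does it work?
- Structured sections: headings such as visual style, design direction, and identity lock make it clear to the AI that this is a character design sheet, not a single image.
- Detailed appearance: specifying details down to the eyes and clothing tells the AI the core visual features, which improves character consistency and helps keep the face from drifting in the later storyboard and video.
- Explicit layout: the no-overlap rule keeps the AI from cramming multiple views together into a mess.
- Art references: "high-end animation studio + art book" steers the AI toward professional-grade concept art rather than the generic AI look.
- Text block: helps the AI quickly grasp the character's personality and visual signature, so results are more stable when later prompts reference it.
Tips: first generate one or two base character images, then use this prompt to make the identity board, and save it for later reference. If you don't have Midjourney, other text-to-image tools work too.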
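
If you plan to reuse the identity board prompt for several characters, it can help to keep the character-specific fields separate from the fixed layout rules. The following is only a minimal sketch of that idea in Python: the `CharacterSheet` class and `build_identity_board_prompt` function are hypothetical names invented for this example, not part of Kōda's workflow or of any tool's API, and the prompt text is a shortened version of the one above.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterSheet:
    """Hypothetical container for the character-specific parts of the prompt."""
    subject: str
    visual_style: str
    name: str
    role: str
    temperament: str
    visual_signature: str
    # Supporting views listed in the "main composition" section of the prompt.
    poses: list[str] = field(default_factory=lambda: [
        "neutral full-body", "back view", "side view", "seated pose",
        "leaning pose", "crouching pose", "high angle", "low angle",
        "expression close-up",
    ])


def build_identity_board_prompt(sheet: CharacterSheet) -> str:
    """Assemble the prompt one section per line so each part stays editable."""
    lines = [
        "Create an artistic 16:9 character identity board.",
        f"[Main character]: {sheet.subject}",
        f"Visual style: {sheet.visual_style}",
        "Pure white or soft off-white background; no environment, no logo, no watermark.",
        "Important layout rule: do not let any character images overlap.",
        "Main composition: one large, slightly off-center full-body hero view; "
        "smaller supporting studies around it: " + ", ".join(sheet.poses) + ".",
        "Identity lock: same face, hairstyle, outfit, body proportions, "
        "and body language in every view.",
        "Text design:",
        f"Character name: {sheet.name}",
        f"Role: {sheet.role}",
        f"Core temperament: {sheet.temperament}",
        f"Visual signature: {sheet.visual_signature}",
    ]
    return "\n".join(lines)


mage = CharacterSheet(
    subject="young adult female traveling mage, long flowing blue-white hair, glowing blue eyes",
    visual_style="stylized anime-oil painting / cinematic fantasy concept art",
    name="Mage of Light",
    role="traveling mage / forest purifier",
    temperament="calm command, unhurried ritual power",
    visual_signature="blue-white glowing staff light trails",
)
prompt = build_identity_board_prompt(mage)
print(prompt)
```

Changing only the `CharacterSheet` fields keeps the layout and identity-lock wording identical across characters, which is the part of the prompt the explanation above credits for consistency.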

Generating the storyboard

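Image generated by Kōda
Prompt:
Create a 16:9 image: a compact 3x6 storyboard grid, 18 panels in total, in a monochrome light-gray rough-sketch style.
[Project card]
Title: Forest Purification (change to your own title)
Meta line: sudden void magic / fast 12-15 second action
Micro brief: A lone mage in an empty [scene] strikes the ground with her staff to create darkness, then uses glowing poi light trails to draw out hidden threats one by one, destroys them with a white flash, and finally restores the scene.
[Continuity header]
Sequence ID: your-scene-18P
Reference priority: use @[character ref] to control the mage's face, body, outfit, and staff. The storyboard controls the action flow and geography.
[Scene packet]
Premise: the mage uses NOX darkness + LUMOS poi magic to clear out hidden creatures.
Location: [describe the scene in detail, e.g. an empty modern city alley at night, neon lights, wet pavement]
Start -> End: the mage strikes her staff in an empty scene -> the scene is restored and clean after the creatures are destroyed.
[Style locks]
Style lock: clean monochrome rough sketch.
Effect lock: blue-white poi light trails, cuts that lure the creatures out one by one, a large white annihilating flash.
[Sequence]
Grid: 18 panels. P01 is the staff strike in the empty scene, the middle panels are the one-by-one luring and poi moves, and the last few panels are the flash + restoration.
Why write it this way, and why does it work?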
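- Project card: tells the AI up front what the project is and what its core goal is, so it grasps the overall direction and stays on topic.
- Continuity header: signals that everything must stay consistent from start to finish, so the AI pays more attention to coherent motion and fixed geography.
- Scene packet: spells out the start, middle, and end so the AI knows the whole story arc and does not generate unrelated content.
- Style locks: greatly reduce randomness and improve the consistency of lighting effects and narrative.
- 18-panel 3x6 grid: higher information density, well suited to short videos of around 15 seconds.
The full version is more complex, adding elements such as spatial continuity and an emotional arc, and gives more stable results. If you want to study it in detail, skip to the end of the article.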
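
The [Sequence] section above splits the 18 panels into an opening panel, a run of middle panels, and "the last few" closing panels. As a rough illustration of how that split could be worked out before writing the line by hand, here is a small Python sketch. The function names are hypothetical, and `closing_count=3` is purely an assumption, since the prompt does not say how many panels "the last few" covers.

```python
def panel_ids(rows: int, cols: int) -> list[str]:
    """Return panel labels P01..Pnn for a rows x cols storyboard grid."""
    return [f"P{i:02d}" for i in range(1, rows * cols + 1)]


def allocate_beats(ids: list[str], closing_count: int) -> dict[str, list[str]]:
    """Split panels into opening / middle / closing groups.

    closing_count is an assumption: the prompt only says "the last few panels".
    """
    if not 1 <= closing_count <= len(ids) - 2:
        raise ValueError("closing_count must leave room for opening and middle panels")
    return {
        "opening": ids[:1],
        "middle": ids[1:-closing_count],
        "closing": ids[-closing_count:],
    }


def span(ids: list[str]) -> str:
    """Format a group as 'P01' or 'P02-P15'."""
    return ids[0] if len(ids) == 1 else f"{ids[0]}-{ids[-1]}"


def build_sequence_line(beats: dict[str, list[str]]) -> str:
    """Write the [Sequence] grid line with explicit panel ranges."""
    total = sum(len(group) for group in beats.values())
    return (
        f"Grid: {total} panels. "
        f"{span(beats['opening'])}: staff strike in the empty scene. "
        f"{span(beats['middle'])}: one-by-one luring and poi moves. "
        f"{span(beats['closing'])}: flash + restoration."
    )


ids = panel_ids(rows=3, cols=6)
beats = allocate_beats(ids, closing_count=3)  # assumed value, adjust to taste
sequence_line = build_sequence_line(beats)
print(sequence_line)
```

With the assumed `closing_count=3`, the middle group runs from P02 to P15 and the closing group from P16 to P18. Explicit ranges like these are one way to give the model less room to guess where the payoff starts, in the same spirit as the locks described above.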

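Generating the video
Prompt:
Use @[storyboard ref] as the authoritative director's blueprint. Follow the panel order strictly and generate everything as one continuous sequence.
Use @[character ref] to control the main mage.
Environment: [detailed scene description]
Emotional guidance: calm command -> intense luring -> clean release.
Visual style: stylized fantasy action, glowing blue-white lighting, dramatic flashes.
Audio: ambient sound, poi swooshes, impact and flash sounds, no background music.
Panel beats: P01: staff strike creates darkness; P02-P10: lure the creatures with poi moves; last few panels: white flash + restore the scene.
Why write it this way, and why does it work?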
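- Follow the panels strictly: tells the AI not to improvise and to stay faithful to the storyboard, improving continuity of motion and camera work.
- One continuous sequence: even with hard cuts, the result feels like a short film rather than scattered clips.
- Emotional guidance: helps the AI follow the emotional shifts, making the video more engaging.
- Panel beats: pin down the action and lighting effects at each moment, avoiding problems such as light trails flying around at random or characters suddenly disappearing.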
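
Since the panel beats are meant to pin down every moment, it may be worth checking that the beat ranges actually cover every storyboard panel before sending the prompt. The sketch below is one possible way to do that in Python. The helper names are hypothetical, and the closing range `P16-P18` is an assumption standing in for "last few panels", which the prompt leaves unspecified.

```python
import re
from collections import Counter


def expand_span(span: str) -> list[int]:
    """Expand 'P01' or 'P02-P10' into a list of panel numbers."""
    match = re.fullmatch(r"P(\d+)(?:-P(\d+))?", span)
    if match is None:
        raise ValueError(f"unrecognized panel span: {span!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    if end < start:
        raise ValueError(f"span runs backwards: {span!r}")
    return list(range(start, end + 1))


def check_coverage(beats: dict[str, str], total_panels: int) -> tuple[list[int], list[int]]:
    """Return (uncovered, duplicated) panel numbers for a beat map."""
    counts = Counter(n for span in beats for n in expand_span(span))
    out_of_range = sorted(n for n in counts if not 1 <= n <= total_panels)
    if out_of_range:
        raise ValueError(f"panels outside the storyboard: {out_of_range}")
    uncovered = [n for n in range(1, total_panels + 1) if counts[n] == 0]
    duplicated = sorted(n for n, c in counts.items() if c > 1)
    return uncovered, duplicated


def format_beats(beats: dict[str, str]) -> str:
    """Render the beat map as a single 'Panel beats:' prompt line."""
    return "Panel beats: " + "; ".join(f"{s}: {a}" for s, a in beats.items()) + "."


beats = {
    "P01": "staff strike creates darkness",
    "P02-P10": "lure the creatures with poi moves",
    "P16-P18": "white flash + restore the scene",  # assumed range for "last few panels"
}
uncovered, duplicated = check_coverage(beats, total_panels=18)
print(format_beats(beats))
print("uncovered panels:", uncovered)    # [11, 12, 13, 14, 15] under the assumed closing range
print("duplicated panels:", duplicated)  # []
```

Any panels reported as uncovered are ones where the video model would have to rely on the storyboard image alone, so you can decide whether to extend a range or add another beat.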

Full-version prompts
Storyboard:
Create a 16:9 image.
[PROJECT CARD]
Create a compact designed masthead, not a table.
TITLE: FOREST SABER POI RITUAL
META LINE: nocturnal / sci-fi flow-art performance / dense 15-second burst-cut edit
PRIORITY: single-saber poi choreography, real micro-cut bursts, readable lake-energy payoff
MICRO BRIEF: C1 turns a lone energy saber into a fire-poi style light performance, compressing a full ritual dance into a fast 15-second forest sequence.
[CONTINUITY HEADER]
SEQUENCE ID: FOREST-SABER-POI-RITUAL-20P
REFERENCE PRIORITY: First provided image controls C1 face, body, wardrobe, proportions, silhouette, hair, and attitude; second provided image controls night forest base, grounded white craft, trunks, water edge, practical lights, and reflective surfaces. This storyboard controls staging, motion, geography, continuity, cut rhythm, and effect logic.
[SCENE PACKET]
PREMISE: C1 performs a single-saber flow-art ritual like fire poi, mixing Jedi-inspired forms, street-dance footwork, and martial precision until the clearing becomes a light show.
LOCATION: Deep night forest clearing, tall trunks, grounded white rescue craft screen right, rear ramp and cases behind, shallow lake foreground, wet shore at center-left, warm work lights, dark foliage side walls, open practice lane leading to the waterline.
START -> END: C1 stands calm in the practice lane with saber lowered -> after two burst-cut movement phrases, she jumps to the shore, plants the saber at the waterline, and pale energy rings spread across the lake.
ACTION CHAIN: ignition ritual -> micro-cut wrist/boot/blade burst -> figure-eight weave -> butterfly spin -> behind-back orbit -> body-wrap illusion -> blade pass near face -> water-reflection burst -> suspended breath -> leap -> shore stab -> lake-current release.
PROP / EFFECT STATE: One pale white energy saber only; saber stays in C1 hand until final stab; trails behave like long-exposure poi ribbons around her body, then flatten into clean energy rings across the lake; no sparks, no explosion, no damage.
MUST READ: This is not a combat drill; it is a single-saber poi performance where every cut reveals another fragment of one continuous light-dance phrase.
[CHARACTER SANITIZATION]
C1: young adult woman, short tousled blonde hair, compact athletic silhouette, cropped white poncho-like mantle over wrapped top, beige hanging sash, dark loose cropped trousers, boots, fingerless gloves, belt gear, single pale energy saber, flow-artist posture, sharp footwork, controlled fierce focus.
Remove contradictory traits, invisible psychology, excessive costume detail, and backstory that cannot appear in a panel.
[IDENTITY CONSISTENCY]
Provided character reference controls C1 face, body, wardrobe, proportions, hair, and silhouette; storyboard controls staging only. Keep C1's white mantle, beige sash, dark pants, boots, gloves, belt, single saber, and screen direction consistent. Do not redesign, age-shift, beautify, merge, duplicate, or add a second saber.
[STORYBOARD PURITY]
Panel images are visual-only low-detail monochrome light-gray rough sketches. Put panel numbers, beat names, and lens tags in the header strip outside each panel image. No color, labels, arrows, captions, subtitles, logos, watermarks, timing marks, diagrams, UI, ghost poses, duplicate bodies, or technical overlays inside panels.
[MASTER SHOT RULE]
P01 shows full playable geography: C1 center-left practice lane, lake foreground, craft screen right, ramp/cases behind, tall trunks background, foliage side edges, and stable travel direction from lane to waterline.
[EMOTIONAL ARC]
Still focus -> ignition charge -> playful flow control -> fragmented micro-cut intensity -> breath-held precision -> airborne decision -> waterline impact -> quiet awe, shown through stance, hand tension, blade proximity, footwork, cloak lift, and final stillness.
[STYLE LOCKS]
STYLE LOCK: clean monochrome rough-sketch storyboard panels on off-white paper, light-gray gesture lines, simplified forest/craft/water shapes, restrained amber and pale-blue accent only outside panel art, crisp cinematic hierarchy, no rendered panel lighting.
EFFECT LOCK: inside panels, saber trails and lake ripples are simple monochrome bright shapes only; final video effect is pale white saber bloom, long-exposure ribbon trails, tight poi loops, blade-through-lens flashes, water reflections, and flat expanding lake-energy rings.
ENVIRONMENT LOCK: tall vertical trunks, grounded white craft screen right, rear ramp, cases, shallow lake foreground, wet shore center-left, dense foliage pockets, warm practical work lights; preserve the same clearing and craft layout across all wide views.
[SPATIAL CONTINUITY LOCK]
P01, P09, P14, P17, P18, P19, and P20 share the same clearing layout. C1 starts in the center-left practice lane, travels toward the foreground waterline by P17, plants the saber at the wet shore in P19, and remains there as P20 pulls wide. Craft stays screen right, water foreground, trunks vertical background, cases/ramp behind. Allowed changes are camera distance, C1 pose, saber angle, cloak motion, trail density, reflection state, and lake-current spread. P20 is the same location with more distance, not a new establishing shot.
[DIRECTOR STRIP]
Bottom animatic track board aligned to panel columns. Tracks: BEAT LINE, CAMERA PATH, ACTION PATH, RHYTHM TRACK, ESCALATION MAP, STATE TRACK, STYLE TRACK. Use shot chips, thin lines, rhythm blocks, small intensity bars, one-to-three-word labels. No seconds or timestamps.
...(The full DIRECTOR STRIP content is long; copy the detailed part provided earlier as needed)
[SEQUENCE]
Grid: 20 panels in a compact 4x5 cinematic storyboard sheet; panel artwork stays monochrome rough sketch while the director strip makes a 15-second sequence feel longer through two burst-cut clusters, flow-art continuity, waterline impact, and lake-energy final hold.
Video generation:
Use @[storyboard ref] as the storyboard blueprint for the sequence. Treat every storyboard panel as a consecutive shot within a single cinematic sequence. Follow panel order exactly...
Use @[character ref] as the authoritative C1 character reference.
EMOTIONAL GUIDANCE: Valence: focused neutral to fierce flow-state to quiet awe. Arousal: calm ignition, burst peaks, breath hold, leap impact, final release...
AUDIO: No background music or score.
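Kōda's prompts are highly modular, so adapting them to other genres is fairly easy.
For a sci-fi theme, you can swap forest magic for outer space, replace the character with a mechanical agent, change the lighting effects to neon or holograms, set the scene in a space station or a futuristic cyber city, and turn the magic-based action into energy-based action.
- Learn the prompt logic with the simplified version first, then use the full version to raise quality.
- Character consistency is the most important link in the chain, so get the identity board right first.
- Structured locks are the main reason Kōda's work looks so good: they reduce the AI's random improvisation and keep motion and scenes coherent and professional.
There is a lot packed into these prompts. If you are building your own AI comic drama, Kōda's approach is well worth studying.
I hope this helps.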
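
The genre swap described above amounts to replacing a handful of slots while leaving the prompt structure untouched. A minimal sketch of that idea using Python's standard `string.Template` follows. The slot names and the sci-fi wording are illustrative assumptions, not taken from Kōda's prompts.

```python
from string import Template

# One line of the storyboard prompt with the genre-specific parts turned into slots.
MICRO_BRIEF = Template(
    "Micro brief: A lone $character in an empty $location uses $power to draw out "
    "hidden threats one by one, destroys them with $finisher, and restores the scene."
)

# Hypothetical theme dictionaries; edit the values to match your own story.
FANTASY = {
    "character": "mage",
    "location": "forest clearing",
    "power": "glowing poi light trails",
    "finisher": "a white flash",
}
SCI_FI = {
    "character": "mechanical agent",
    "location": "space station corridor",
    "power": "neon energy trails",
    "finisher": "a holographic pulse",
}


def render(theme: dict[str, str]) -> str:
    """Fill every slot; substitute() raises KeyError if a slot is missing."""
    return MICRO_BRIEF.substitute(theme)


fantasy_brief = render(FANTASY)
sci_fi_brief = render(SCI_FI)
print(fantasy_brief)
print(sci_fi_brief)
```

Using `substitute` rather than `safe_substitute` means a forgotten slot fails loudly instead of leaving a stray `$power` in the prompt.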




