Cinematic Shorts: Generating Film-Grade Storyboards from a Single Image

2025-12-12 11:07:18
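Summary
This article shares a "director-level" prompt that went viral overseas. Given a single reference image, it gets the AI to generate a nine-panel cinematic storyboard and to specify the camera movement, depth of field, and lighting for every shot.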


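Recently, Xiao A came across a prompt technique that was going viral on Twitter.

[Image: the viral prompt technique Xiao A found]

The logic is simple: give the AI a reference image, use the prompt to generate a nine-panel (3x3) storyboard, then turn the panels into video with Kling AI (可灵 AI).

We have made storyboards before, but Gemini 3 Pro paired with this prompt lifts the visual quality by several notches.

Today I'll show you how to turn a single image into a short film.

[Video: demo of the result]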

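I. The Prompt

Conventional AI image generation tends to be divergent, so you end up rerolling again and again, like pulling gacha cards. This prompt is instead built around extremely strict constraints, so it works out of the box.

Core prompt (can be copied as-is):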

<role> You are an award-winning trailer director + cinematographer + storyboard artist. Your job: turn ONE reference image into a cohesive cinematic short sequence, then output AI-video-ready keyframes. </role>

<input> User provides: one reference image (image). </input>

<non-negotiable rules - continuity & truthfulness>

  1. First, analyze the full composition: identify ALL key subjects (person/group/vehicle/object/animal/props/environment elements) and describe spatial relationships and interactions (left/right/foreground/background, facing direction, what each is doing).
  2. Do NOT guess real identities, exact real-world locations, or brand ownership. Stick to visible facts. Mood/atmosphere inference is allowed, but never present it as real-world truth.
  3. Strict continuity across ALL shots: same subjects, same wardrobe/appearance, same environment, same time-of-day and lighting style. Only action, expression, blocking, framing, angle, and camera movement may change.
  4. Depth of field must be realistic: deeper in wides, shallower in close-ups with natural bokeh. Keep ONE consistent cinematic color grade across the entire sequence.
  5. Do NOT introduce new characters/objects not present in the reference image. If you need tension/conflict, imply it off-screen (shadow, sound, reflection, occlusion, gaze). </non-negotiable rules - continuity & truthfulness>

<goal> Expand the image into a 10–20 second cinematic clip with a clear theme and emotional progression (setup → build → turn → payoff). The user will generate video clips from your keyframes and stitch them into a final sequence. </goal>

<step 1 - scene breakdown> Output (with clear subheadings):

  • Subjects: list each key subject (A/B/C…), describe visible traits (wardrobe/material/form), relative positions, facing direction, action/state, and any interaction.
  • Environment & Lighting: interior/exterior, spatial layout, background elements, ground/walls/materials, light direction & quality (hard/soft; key/fill/rim), implied time-of-day, 3–8 vibe keywords.
  • Visual Anchors: list 3–6 visual traits that must stay constant across all shots (palette, signature prop, key light source, weather/fog/rain, grain/texture, background markers). </step 1 - scene breakdown>

<step 2 - theme & story> From the image, propose:

  • Theme: one sentence.
  • Logline: one restrained trailer-style sentence grounded in what the image can support.
  • Emotional Arc: 4 beats (setup/build/turn/payoff), one line each. </step 2 - theme & story>

<step 3 - cinematic approach> Choose and explain your filmmaking approach (must include):

  • Shot progression strategy: how you move from wide to close (or reverse) to serve the beats
  • Camera movement plan: push/pull/pan/dolly/track/orbit/handheld micro-shake/gimbal—and WHY
  • Lens & exposure suggestions: focal length range (18/24/35/50/85mm etc.), DoF tendency (shallow/medium/deep), shutter “feel” (cinematic vs documentary)
  • Light & color: contrast, key tones, material rendering priorities, optional grain (must match the reference style) </step 3 - cinematic approach>

<step 4 - keyframes for AI video (primary deliverable)> Output a Keyframe List: default 9–12 frames (later assembled into ONE master grid). These frames must stitch into a coherent 10–20s sequence with a clear 4-beat arc. Each frame must be a plausible continuation within the SAME environment. Use this exact format per frame:

[KF# | suggested duration (sec) | shot type (ELS/LS/MLS/MS/MCU/CU/ECU/Low/Worm’s-eye/High/Bird’s-eye/Insert)]

  • Composition: subject placement, foreground/mid/background, leading lines, gaze direction
  • Action/beat: what visibly happens (simple, executable)
  • Camera: height, angle, movement (e.g., slow 5% push-in / 1m lateral move / subtle handheld)
  • Lens/DoF: focal length (mm), DoF (shallow/medium/deep), focus target
  • Lighting & grade: keep consistent; call out highlight/shadow emphasis
  • Sound/atmos (optional): one line (wind, city hum, footsteps, metal creak) to support editing rhythm

Hard requirements:
  • Must include: 1 environment-establishing wide, 1 intimate close-up, 1 extreme detail ECU, and 1 power-angle shot (low or high).
  • Ensure edit-motivated continuity between shots (eyeline match, action continuation, consistent screen direction / axis). </step 4 - keyframes for AI video>

<step 5 - contact sheet output (MUST OUTPUT ONE BIG GRID IMAGE)> You MUST additionally output ONE single master image: a Cinematic Contact Sheet / Storyboard Grid containing ALL keyframes in one large image.

  • Default grid: 3x3. If more than 9 keyframes, use 4x3 or 5x3 so every keyframe fits into ONE image.

Requirements:
  1. The single master image must include every keyframe as a separate panel (one shot per cell) for easy selection.
  2. Each panel must be clearly labeled: KF number + shot type + suggested duration (labels placed in safe margins, never covering the subject).
  3. Strict continuity across ALL panels: same subjects, same wardrobe/appearance, same environment, same lighting & same cinematic color grade; only action/expression/blocking/framing/movement changes.
  4. DoF shifts realistically: shallow in close-ups, deeper in wides; photoreal textures and consistent grading.
  5. After the master grid image, output the full text breakdown for each KF in order so the user can regenerate any single frame at higher quality. </step 5 - contact sheet output>

<final output format> Output in this order: A) Scene Breakdown B) Theme & Story C) Cinematic Approach D) Keyframes (KF# list) E) ONE Master Contact Sheet Image (All KFs in one grid) </final output format>


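Reading the Prompt in Detail

To keep the output on track, the prompt includes several key constraints:

  • Continuity: subjects, wardrobe, setting, time of day, and lighting must stay identical across every shot; only action, expression, blocking, framing, and camera movement may change.
  • Realistic depth of field: deep in wide shots and shallow in close-ups, which is what gives the frames their cinematic look.
  • No guessing: the AI must stick to visible facts rather than invent identities, locations, or brands, so the storyboard stays focused on the story potential in the image itself.
  • No new elements: nothing absent from the reference image may be introduced, which keeps the story tight.
  • Exhaustive detail: every keyframe must specify composition, action, camera movement, lens parameters, and lighting and color, with an optional line for sound and atmosphere.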

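Key point: the prompt requires four specific shot types, which amounts to teaching the AI hands-on how to shoot a film:

  1. An establishing wide shot that sets the environment
  2. An intimate close-up
  3. An extreme close-up (ECU) on a detail
  4. A power-angle shot (low angle or high angle)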
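
If you want to sanity-check the text breakdown Gemini returns before spending video credits, the hard requirements above can be verified mechanically. The following is only a minimal sketch. It assumes the model writes headers like `[KF3 | 2s | MCU]`, which is my reading of the format line in the prompt and not guaranteed output. The names `parse_keyframes`, `check_sequence`, and `REQUIRED` are hypothetical, and the mapping from shot abbreviations to the four required categories is likewise an assumption.

```python
import re
from dataclasses import dataclass

# Assumed header shape, e.g. "[KF3 | 2s | MCU]"; real model output may differ.
HEADER = re.compile(
    r"\[\s*KF\s*(\d+)\s*\|\s*(\d+(?:\.\d+)?)\s*(?:sec|s)?\s*\|\s*([^\]|]+?)\s*\]"
)

# Assumed mapping from the prompt's shot abbreviations to its four required shots.
REQUIRED = {
    "establishing wide": {"ELS", "LS"},
    "intimate close-up": {"MCU", "CU"},
    "extreme detail": {"ECU"},
    "power angle": {"LOW", "HIGH", "WORM'S-EYE", "BIRD'S-EYE"},
}


@dataclass
class Keyframe:
    number: int
    seconds: float
    shot_types: frozenset


def parse_keyframes(text: str) -> list:
    """Collect every keyframe header found in the model's text output."""
    frames = []
    for match in HEADER.finditer(text):
        # A header may combine tags with "/", e.g. "Low/CU"; normalise curly apostrophes.
        types = frozenset(
            part.strip().upper().replace("\u2019", "'")
            for part in match.group(3).split("/")
            if part.strip()
        )
        frames.append(Keyframe(int(match.group(1)), float(match.group(2)), types))
    return frames


def check_sequence(frames: list) -> list:
    """Return a list of problems; an empty list means the hard requirements are met."""
    problems = []
    total = sum(frame.seconds for frame in frames)
    if not 10 <= total <= 20:
        problems.append(f"total duration {total:g}s is outside 10-20s")
    for label, accepted in REQUIRED.items():
        if not any(frame.shot_types & accepted for frame in frames):
            problems.append(f"missing {label} shot")
    return problems
```

Calling `check_sequence(parse_keyframes(reply_text))` on the pasted reply gives you a short list of what to ask Gemini to redo, if anything.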

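II. Steps

The workflow is remarkably simple. Gemini produces everything in a single pass:

[Image: the Gemini interface]
  1. Paste the full prompt from this article into the chat box and select the Gemini 3 Pro model.
  2. Upload the image you want to expand.
  3. Wait a moment. Gemini returns a detailed storyboard analysis several thousand characters long, and it also generates a single 3x3 grid image containing nine continuous shots.
[Image: the nine-panel storyboard grid generated by Gemini]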

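III. Downstream Workflow

What does having this high-quality nine-panel storyboard mean?

It means you now have a visual script you can use right away.

  1. Video generation: crop the grid into its nine panels and feed each one to Kling 2.5 (可灵 2.5). Because the panels are highly consistent with one another, the jumps between the generated clips stay small, which cuts down on fine-tuning in post.
  2. Music generation: use Suno to generate background music that fits the theme.
  3. Editing: finally, assemble the clips in your editing software.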
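
Cropping the grid by hand gets tedious, so here is a small sketch of how the split could be scripted. It assumes the contact sheet is an evenly divided grid with no outer border or gutters, which the model does not guarantee, so you may need to trim the panels afterwards. `panel_boxes` and `split_grid` are hypothetical helper names. The second function relies on the Pillow library (`Image.open`, `crop`, `save`), which must be installed separately.

```python
def panel_boxes(width, height, columns=3, rows=3):
    """Return (left, top, right, bottom) crop boxes, ordered row by row."""
    boxes = []
    for row in range(rows):
        for col in range(columns):
            # Integer maths keeps adjacent panels flush even when the
            # image size is not an exact multiple of the grid size.
            left = col * width // columns
            top = row * height // rows
            right = (col + 1) * width // columns
            bottom = (row + 1) * height // rows
            boxes.append((left, top, right, bottom))
    return boxes


def split_grid(path, out_prefix="kf", columns=3, rows=3):
    """Save each panel of the contact sheet as its own PNG (needs Pillow)."""
    # Imported lazily so panel_boxes stays usable without Pillow installed.
    from PIL import Image

    with Image.open(path) as sheet:
        width, height = sheet.size
        for index, box in enumerate(panel_boxes(width, height, columns, rows), start=1):
            sheet.crop(box).save(f"{out_prefix}_{index:02d}.png")
```

For example, `split_grid("storyboard.png")` would write `kf_01.png` through `kf_09.png` next to the script. Pass `columns=4` if Gemini gave you a 4x3 sheet instead.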

And that is how an AI short with a complete setup, build, turn, and payoff comes together.

[Video: the final cut]

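Closing Thoughts

How much an AI tool can do depends largely on the person using it.

When we use specialized instructions to hand it professional domain knowledge (cinematography, directing, storytelling), the productivity it unlocks is striking.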
