Seedance 2.0 Prompt Guide: The Complete System for Cinematic AI Video

At the time of writing, Seedance 2.0 is the highest-ranked AI video model on the Artificial Analysis leaderboard. But the gap between a mediocre Seedance output and a cinematic one has nothing to do with the model - it's entirely in the prompt.

Hundreds of successful prompts from creators like @aimikoda, @LudovicCreator, @TheDorBrothers, @sebatheepan, and @CharaspowerAI, together with the official Seedance documentation and community testing, show clear patterns. Here's everything that actually works.

The 6-Step Formula

Every strong Seedance prompt follows this structure, whether the creator knows it or not:

[Subject] + [Action] + [Environment] + [Camera] + [Style] + [Constraints]

Example: "A dust-covered man stands alone in a wasteland. He slowly raises his hand - fingertips turn semi-transparent, air distorts. Handheld camera circles him in a 360-degree orbit with slight shake. 65mm IMAX, heavy film grain, strong lens flares. Eclipse-like atmosphere with dense gray fog."

That's subject (dust-covered man), action (raises hand, fingertips turn transparent), environment (wasteland), camera (handheld 360-degree orbit), style (IMAX, film grain, lens flares), and implicit constraints (the fog limits the environment complexity).
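If you assemble prompts in code, the six slots map onto a small data structure. The sketch below is one hypothetical way to do that in Python. `ShotPrompt` and `render` are illustrative names, not part of any Seedance tooling, and the function only joins the filled slots into sentences in formula order.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """One field per slot of the formula. All names are illustrative."""
    subject: str
    action: str
    environment: str
    camera: str
    style: str
    constraints: str = ""  # optional: constraints are often implicit

    def render(self) -> str:
        # Keep the slots in formula order and skip any left empty.
        slots = [self.subject, self.action, self.environment,
                 self.camera, self.style, self.constraints]
        sentences = [s.strip().rstrip(".") for s in slots if s.strip()]
        return ". ".join(sentences) + "." if sentences else ""


prompt = ShotPrompt(
    subject="A dust-covered man stands alone in a wasteland",
    action="He slowly raises his hand, fingertips turning semi-transparent",
    environment="Eclipse-like atmosphere with dense gray fog",
    camera="Handheld camera circles him in a 360-degree orbit with slight shake",
    style="65mm IMAX, heavy film grain, strong lens flares",
)
print(prompt.render())
```

Keeping each slot in its own field makes it easy to swap one element, such as the camera line, while leaving the rest of the shot untouched.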

Prompt Length: The Sweet Spot

Aim for 60-100 words for single shots and up to 300 words for timeline sequences.

Shorter prompts give Seedance more creative freedom - sometimes that's good, sometimes it's chaos. Longer prompts give you control, but the model loses focus beyond the sweet spot. If your prompt is over 100 words for a single shot, you're probably overspecifying.

The exception is timeline-based prompts (covered below) where you're scripting a multi-shot sequence. Those can run 200-300 words because each timestamp block is essentially its own mini-prompt.
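A word count is easy to automate. This is a minimal sketch, assuming whitespace-separated words are a good enough measure. `length_verdict` is a made-up helper, and the thresholds are simply the numbers from this section.

```python
def length_verdict(prompt: str, timeline: bool = False) -> str:
    """Compare a prompt's word count with the ranges suggested above."""
    words = len(prompt.split())
    if timeline:
        # Timeline prompts only have an upper bound in this guide.
        return "long" if words > 300 else "ok"
    if words < 60:
        return "short"
    return "long" if words > 100 else "ok"


print(length_verdict("Slow dolly in, smooth gimbal, steady motion, no zoom."))
```

A "short" verdict is not an error. It just means you are leaving more of the shot up to the model.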

Camera Control: One Movement at a Time

This is the single most important rule, and breaking it is the most common mistake.

Specify only one primary camera movement per shot. Multiple simultaneous movements cause jitter, artifacts, and confused output.

Wrong: "Camera pans left while dollying in and tilting up through the crowd"

Right: "Slow dolly in through the crowd, eye-level, steady gimbal"

Seedance responds to explicit camera vocabulary: dolly in/out, pan left/right, tracking shot, orbit, handheld, fixed/locked, crane up/down, push in. Use these terms - they're not decorative. The model understands them as specific instructions.

Default starter if you're unsure: "Slow dolly in, smooth gimbal, steady motion, no zoom." This produces clean, professional output while you figure out what you actually want.
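The one-movement rule can be checked with a rough keyword scan before you submit a prompt. The sketch below is a heuristic, not a parser. The pattern list is an assumption built from the vocabulary above plus "tilt" from the "Wrong" example, and it leaves out handheld and fixed/locked on the assumption that they describe the rig rather than a movement.

```python
import re

# Heuristic patterns for the movement terms listed above, plus "tilt" from
# the "Wrong" example. This list is an assumption, not a Seedance spec.
CAMERA_MOVES = {
    "dolly": r"\bdoll(?:y|ies|ying)\b",
    "pan": r"\bpan(?:s|ning)?\b",
    "tilt": r"\btilt(?:s|ing)?\b",
    "track": r"\btrack(?:s|ing)?\b",
    "orbit": r"\borbit(?:s|ing)?\b",
    "crane": r"\bcran(?:e|es|ing)\b",
    "push in": r"\bpush(?:es|ing)? in\b",
}


def camera_moves(prompt: str) -> list[str]:
    """Return the distinct camera movements mentioned in a prompt."""
    text = prompt.lower()
    return [name for name, pattern in CAMERA_MOVES.items()
            if re.search(pattern, text)]


wrong = "Camera pans left while dollying in and tilting up through the crowd"
right = "Slow dolly in through the crowd, eye-level, steady gimbal"
for prompt in (wrong, right):
    moves = camera_moves(prompt)
    print(len(moves), moves)
```

More than one hit is a cue to reread the prompt, not proof of a problem: a keyword scan can't tell a camera pan from a frying pan.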

Separate Camera from Subject

The second critical rule: describe what the camera does and what the subject does as separate actions.

Wrong: "Spinning camera around the dancing woman"

Right: "A woman dances slowly, arms raised. Camera orbits her in a steady 360-degree arc."

When you merge camera and subject movement into one description, Seedance can't tell which one is supposed to move. Usually both end up moving unpredictably.

Timeline Prompting: The Pro Format

Timeline-based prompts are the dominant format for professional Seedance work in 2026. You break the video into timestamp blocks, each describing a specific moment:

`[0:00-0:03]` Opening shot description
`[0:03-0:06]` Second beat
`[0:06-0:10]` Third beat

Each block gets its own camera instruction, action, and mood. This gives you editing-level control without post-production cuts.
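If you plan beats as data first, the timestamp blocks can be generated instead of typed by hand. This is a sketch under stated assumptions: beats are given in whole seconds, blocks must be contiguous, and `build_timeline` is a hypothetical helper that only formats text in the bracketed style shown above.

```python
def fmt(seconds: int) -> str:
    """Format whole seconds as m:ss, e.g. 3 -> '0:03'."""
    return f"{seconds // 60}:{seconds % 60:02d}"


def build_timeline(beats: list[tuple[int, int, str]]) -> str:
    """Turn (start, end, description) beats into timestamp blocks."""
    lines = []
    previous_end = 0
    for start, end, text in beats:
        # Each block must start where the last one ended and have length > 0.
        if start != previous_end or end <= start:
            raise ValueError(f"bad block {start}-{end}: gap, overlap, or empty")
        lines.append(f"[{fmt(start)}-{fmt(end)}] {text}")
        previous_end = end
    return "\n".join(lines)


print(build_timeline([
    (0, 3, "Opening shot description"),
    (3, 6, "Second beat"),
    (6, 10, "Third beat"),
]))
```

The contiguity check is a design choice for this sketch. It catches accidental gaps and overlaps, which are easy to introduce when you reshuffle beats.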

Real Example (excerpt, from @aimikoda)

FORMAT: 15s / 180 BPM / ONE CONTINUOUS SHOT

0:00-0:03: POV freefalls down a steep stairwell. The front wheel punches over the first steps, bars jackhammer below frame. Violent forward descent with brutal stair vibration. SFX: (city hum, tire chatter, rapid stair hits).

0:03-0:05: POV slashes left across a tiny landing, skips a broken crate, drops again as laundry cracks across the front hemisphere. Hard lateral shake. SFX: (cloth slap, skid, crowd shout).

Notice that each block has an action, camera behavior, physical detail, and audio cues. The SFX descriptions don't directly generate audio - they set the scene's energy level and inform the visual pacing.

Lighting: The Highest-ROI Prompt Element

Adding specific lighting descriptions produces more improvement per word than any other prompt element. Generic adjectives ("beautiful", "cinematic") do almost nothing. Specific lighting setups change everything.

Low ROI: "Beautiful cinematic lighting"

High ROI: "Harsh unshielded solar radiation, blinding white-hot rim lighting, deep chiaroscuro voids in the shadows"

High ROI: "Key light from upper camera-left with a warm fill bounce, golden hour backlight catching dust particles"

High ROI: "Eclipse-like atmosphere with dense gray fog. Only light source is an internal orange-red glow from the character, like a dying star"

Use real-world lighting references when possible. The model understands "Rembrandt lighting," "Apple keynote lighting," "documentary lighting," and "music video strobe" as distinct setups.

Style Anchors

Include at least one style keyword. Seedance responds to:

Format references: IMAX, 35mm film, 16mm, anamorphic, digital cinema

Lens specifications: 24mm wide, 50mm standard, 85mm portrait, 135mm telephoto. The model actually changes compression, bokeh, and spatial relationships based on these.

Film grain and texture: heavy film grain, clean digital, noise, halation

Cultural references: "Wes Anderson symmetry," "Ridley Scott atmosphere," "documentary realism," "Apple product ad." These work as shorthand for complex combinations of framing, color, and pacing.

The Word You Should Never Use

Never use "fast" in a Seedance prompt.

It almost guarantees quality degradation - motion blur artifacts, temporal inconsistency, and visual noise. If you need fast pacing, describe the speed through action and physics instead:

Wrong: "Fast car chase through the city"

Right: "A black sedan tears through narrow streets, tires screeching on wet asphalt, suspension compressing through sharp turns. Camera locked to the rear bumper, frame shaking from speed."

If you absolutely need one fast-moving element, keep everything else slow. One fast-moving subject with a steady camera works. A fast camera tracking a fast subject is where things fall apart.
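Catching the word itself is a one-line check. A minimal sketch, assuming you also want to catch "faster" and "fastest", with `mentions_fast` as a made-up name:

```python
import re


def mentions_fast(prompt: str) -> bool:
    """True if the prompt uses 'fast', 'faster', or 'fastest' as a word."""
    return re.search(r"\bfast(?:er|est)?\b", prompt, re.IGNORECASE) is not None


wrong = "Fast car chase through the city"
right = ("A black sedan tears through narrow streets, tires screeching on wet "
         "asphalt, suspension compressing through sharp turns.")
print(mentions_fast(wrong), mentions_fast(right))
```

The word boundaries keep it from firing on words like "breakfast". It only flags the word, so rewriting the shot in terms of action and physics is still up to you.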

Prompt Formats Compared

There are four main formats creators use with Seedance. Each has trade-offs:

Plain Text (Paragraph Style)

Best for: Simple single-shot concepts, rapid iteration

"A giant glacier wall collapses into a fjord beside a coastal city. The falling ice triggers a massive water displacement wave that surges toward the harbor. Camera sweeps over the collapsing glacier before racing toward the city."

Pros: Quick to write, natural language, good for concepting. Cons: Limited temporal control, no precise pacing.

Timeline Format

Best for: Multi-beat sequences, music videos, anything with pacing

"[0:00-0:03] Wide establishing, slow dolly. [0:03-0:06] Medium close-up, tracking. [0:06-0:10] Extreme close-up, hold."

Pros: Editing-level control, consistent results, professional standard. Cons: Takes longer to write, requires planning the sequence beforehand.

JSON Structured

Best for: API integration, complex multi-parameter control, reproducible results

`{ "shot_type": "Extreme long shot", "camera_movement": "360-degree barrel roll", "lens_spec": "22mm wide-angle, T1.5", "lighting": "Harsh solar radiation" }`

Pros: Every parameter isolated, easy to iterate on single elements, API-friendly. Cons: Verbose, less natural, overkill for simple shots.
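The "easy to iterate on single elements" point is simple to show in code. The sketch below uses only Python's standard `json` module. The field names are copied from the example above and are not an official schema, and `variant` is a hypothetical helper.

```python
import json

# Field names come from the example above; they are not an official schema.
base_shot = {
    "shot_type": "Extreme long shot",
    "camera_movement": "360-degree barrel roll",
    "lens_spec": "22mm wide-angle, T1.5",
    "lighting": "Harsh solar radiation",
}


def variant(base: dict, **overrides: str) -> str:
    """Return a JSON prompt with selected fields replaced; base is untouched."""
    return json.dumps({**base, **overrides}, indent=2)


print(variant(base_shot, lighting="Golden hour backlight catching dust particles"))
```

Because the base dict is never modified, you can generate several lighting or lens variants from one shot and compare the results side by side.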

Shot List Format

Best for: Storyboard-to-video, production workflows

"Shot 01 (0:00-2:00): Camera starts at ankle level. Shot 02 (2:00-3:30): Camera weaves between bodies. Shot 03 (3:30-5:00): Camera dips under a horse mid-stride."

Pros: Maps directly to traditional production planning, natural for directors. Cons: Can feel rigid, works better for action sequences than atmospheric pieces.

Image-to-Video

When using a reference image as a starting frame:

Describe only the motion. Don't redescribe what's visible in the image - Seedance can see it. Your text prompt should add what the image can't show: movement, camera path, temporal progression.

Use high-resolution reference images (1080p+). Lower-resolution references produce lower-quality output.

For character consistency across multiple generations, combine a reference image with repeated descriptive attributes in the text prompt. The reference image anchors the visual; the text reinforces specific details.
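The 1080p+ guideline can be turned into a quick pre-flight check. This sketch assumes "1080p+" means the shorter side of the image is at least 1080 pixels, so portrait references pass as well. `meets_1080p` is an illustrative name, and you would read the width and height with whatever image library you already use.

```python
def meets_1080p(width: int, height: int, min_short_side: int = 1080) -> bool:
    """True if the shorter side of the image is at least min_short_side pixels."""
    return min(width, height) >= min_short_side


for size in [(1920, 1080), (1280, 720), (1080, 1920)]:
    print(size, meets_1080p(*size))
```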

Known Limitations

Character deformation during complex movements - faces can distort during fast head turns or full-body action.

Facial consistency degrades across longer sequences - the same character may shift slightly between timeline blocks.

Full-body shots are more prone to artifacts than close-ups or medium shots.

Text rendering is unreliable - don't rely on Seedance to generate readable text in frames.

The Multi-Model Pipeline

Professional creators in 2026 don't use a single model. The emerging standard:

Seedance 2.0 for core cinematic scenes - highest visual quality, best at atmosphere and lighting.

Kling 3.0 for action sequences - better physics simulation, more natural human movement.

Runway Gen-4.5 for refinement passes - superior temporal consistency, Motion Brush for precise control.

Midjourney V8 / Uni-1 for source frames - generate the key image first, then animate with Seedance.
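In a scripted workflow, this split can live in a small routing table. The sketch below is illustrative only. The stage keys and `model_for` are made-up names, the model names are the ones listed above, and nothing here calls a real API.

```python
# Stage keys are hypothetical labels; model names are the ones listed above.
PIPELINE = {
    "cinematic": "Seedance 2.0",
    "action": "Kling 3.0",
    "refinement": "Runway Gen-4.5",
    "source_frame": "Midjourney V8 / Uni-1",
}


def model_for(stage: str) -> str:
    """Look up which model this guide suggests for a pipeline stage."""
    try:
        return PIPELINE[stage]
    except KeyError:
        raise ValueError(
            f"unknown stage {stage!r}; expected one of {sorted(PIPELINE)}"
        ) from None


print(model_for("action"))
```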

Know what each model is best at. Then push it there.