Every renderer you need — wired into one direction workspace.
Veyra is model-forward by design: pick the right engine for the shot, from Veo 3.1 and Kling 3 Pro to Nano Banana 2 and OmniHuman 1.5 — then stay in the timeline instead of re-learning a new app per vendor.
When new preview tiers land (Veo 3.1, Nano Banana 2, Kling 3, OmniHuman 1.5…), Veyra is structured so they can show up in-product without you rebuilding a pipeline.
Stills, motion, performance capture, analysis, and transcription: one creative space with credits and exports that map to how music videos (MVs) are actually made.
Full in-app model catalogue
The list below is what Veyra can route to today. Exact defaults and per-environment overrides depend on your deployment configuration: API keys, model selections set through environment variables, and which preview models your providers make available.
Stills + look dev
Every image model from the in-app stills picker, including Google-native, Imagen, FLUX, and the FAL text-to-image rows — IDs below match production.
Imagen 4 (preview) on the FAL text-to-image path — `shortLabel` matches the in-app picker.
Video (Veo + FAL i2v)
Full Veo list plus the complete FAL / partner image-to-video (i2v) catalogue: Hailuo, OmniHuman, LTX, Pika, and every Kling tier exposed in the app. Each row uses the same `fal-ai/...` ID you’ll see in exports and config.
Google Veo 3.1 (preview) — up to 4K where supported in-app.
Image + audio — performance and lip sync oriented.
Image + audio — performance / dialogue.
Flagship Kling i2v (broad duration range).
Lip sync
Align mouth performance to audio when the cut calls for it.
Kling — audio-to-video lip sync (FAL) for finishing closeups.
Planning + chat intelligence
Switch the brain behind the planner — Gemini (default) or GPT — same workspace, your choice per session where enabled.
Multi-step planning + structured JSON, with a flash fallback. Override via `GEMINI_PLANNER_MODEL`.
Selected planner micro-flows use this preview image model (e.g. some environment/location tools).
Audio + lyrics intelligence
Analyse the track, transcribe lyrics with word timing, and split stems when you need vocal-forward alignment.
Default `gemini-2.5-flash`, fallback `gemini-2.5-flash-lite`. Override the default via `GEMINI_AUDIO_ANALYSIS_MODEL`; the fallback can be overridden through its own environment variable.
On-device (beat service) transcription with selectable weights: tiny…large-v2, large-v3, plus .en variants where applicable.
High-quality stem split before word-level lyrics when you want vocals isolated.
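The default-plus-fallback arrangement for audio analysis can be pictured with a minimal sketch. It is an illustration only, not Veyra's actual code: `resolve_audio_models` and `analyse_with_fallback` are hypothetical helpers, and because the name of the fallback environment variable is not given here, the caller passes it in rather than the sketch guessing it. Only `GEMINI_AUDIO_ANALYSIS_MODEL` and the two model IDs come from the catalogue above.

```python
from typing import Callable, Mapping, Optional, Tuple

# Values stated in the catalogue above.
DEFAULT_MODEL = "gemini-2.5-flash"
DEFAULT_FALLBACK = "gemini-2.5-flash-lite"
PRIMARY_ENV = "GEMINI_AUDIO_ANALYSIS_MODEL"


def resolve_audio_models(
    env: Mapping[str, str],
    fallback_env: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (primary, fallback) model IDs for audio analysis.

    `fallback_env` is the name of the fallback override variable. Its real
    name is not documented here, so it is supplied by the caller (assumption).
    """
    primary = env.get(PRIMARY_ENV, "").strip() or DEFAULT_MODEL
    fallback = DEFAULT_FALLBACK
    if fallback_env:
        fallback = env.get(fallback_env, "").strip() or DEFAULT_FALLBACK
    return primary, fallback


def analyse_with_fallback(
    call_model: Callable[[str], str],
    env: Mapping[str, str],
    fallback_env: Optional[str] = None,
) -> str:
    """Try the primary model first; on any error, retry once with the fallback.

    `call_model` is a hypothetical stand-in for whatever client actually sends
    the track to the provider. Catching every Exception keeps the sketch short;
    a real service would likely be more selective.
    """
    primary, fallback = resolve_audio_models(env, fallback_env)
    try:
        return call_model(primary)
    except Exception:
        return call_model(fallback)
```

In a deployment, `env` would typically be `os.environ`. Blank or whitespace-only values are treated as unset so that an empty variable does not silently select no model.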
