Original Reddit post

A few experiments exploring how far generative video + fine-tuned orchestration layers can be pushed in rhythm, camera language, body transformation, and, most of all, audiovisual synchronization.

Breakdown: I used Uisato Studio’s Seedance 2.0 Video mode, with the “Intelligent” setup and the “Audioreactive Performance” prompt recipe. Inputs were:

  • the artist image [full-body recommended; I ended up using a mix of Midjourney + GPT Image + Image Studio]
  • a target audio excerpt not exceeding 14.9 seconds
  • a short director’s intent describing the look, tone, and what I wanted beyond the audioreactive performance

From there, the system generated the prompts, direction, and optimal setup. I reviewed it, made small adjustments, generated the clips, and then assembled the final piece in editing.

What other experiments would you like to see next?

Originally posted by u/TasTepeler on r/ArtificialInteligence