Shanghai (Gasgoo) - On February 18, Geely Auto Group and its tech ecosystem partner Stepfun announced the open-sourcing of two large multimodal AI models: Step-Video-T2V for video generation and Step-Audio for voice interaction.
The collaboration drew on both companies’ strengths in computing power, algorithms, and scenario-based training to significantly improve the models’ performance. Stepfun said the initiative aims to share the latest advances in large multimodal models with the global open-source community and contribute to its development.
Step-Video-T2V
With 30 billion parameters, Step-Video-T2V can generate high-quality videos of 204 frames at 540p resolution, with high information density and consistency.
To…