After OpenAI released its video generation model Sora, domestic vendors including Shengshu Technology, ZhiPu AI, and Alibaba quickly followed suit with their own video model products.
On September 24, ByteDance's Volcano Engine released two large models, DouBao Video Generation - PixelDance and DouBao Video Generation - Seaweed, as well as a DouBao music model and a simultaneous translation model.
The large model industry was jolted at the end of 2022, played catch-up in 2023, and gradually cooled in 2024, as vendors began adjusting their business focus, shifting from general-purpose models to applications and from training to inference.
Tan Dai, the president of Volcano Engine, told First Financial Daily that cost is a major reason why large model applications in China have not yet taken off.
Now that the industry has brought prices down, a shift toward applications is inevitable.
Tan Dai indicated that the DouBao large model is not engaging in a price war but is bringing prices back to a reasonable level.
Take the DouBao large model as an example: its average daily token usage now exceeds 1.3 trillion, a tenfold increase over May.
It generates an average of 50 million images per day and processes 850,000 hours of voice data daily.
Tan Dai believes that once price is no longer a barrier to innovation and enterprises deploy large models at scale, the ability to support higher concurrent traffic becomes a key factor in the industry's development.
Regarding the revenue pressure brought by price reductions, Tan Dai said that for B2B vendors, what matters most in technology and products is sustainability.
After the price reduction, Volcano Engine has not incurred negative gross margins.
Tan Dai did not disclose specific gross margin figures.
Regarding the timing of the launch of the video large model, Tan Dai said that the DouBao video large model was mainly used internally within ByteDance in the past.
It takes time for internal technology to be opened up externally and become a commercial business, and enterprise customers generally have concerns about security and stability.
Before this release, the DouBao video large model was trialed inside ByteDance in products such as the AI platform Jiemeng and the video editing app Jianying.
In July of this year, the country's first AI-generated continuous narrative sci-fi short series "Sanxingdui: Future Revelations" was launched on TikTok, produced by Bona Film Group (001330.SZ), with the AI platform Jiemeng providing the chief technical support.
The publicly released model addresses a limitation of most video generation models, which can only follow simple instructions: it can produce natural, coherent multi-shot actions and complex interactions among multiple subjects.
It is understood that the DouBao video generation model is based on the DiT (Diffusion Transformer) architecture, and that its DiT fusion computing unit lets videos switch freely between large dynamic motion and camera movements.
Additionally, its diffusion model training method is said to have solved the consistency problem in multi-shot switching, keeping the subject, style, and atmosphere consistent across shot transitions.

Currently, the DouBao video large model is being deployed in enterprise scenarios such as e-commerce marketing, animation education, urban culture and tourism, and micro-scriptwriting, and can also provide creative assistance for professional creators and artists.
The DouBao large model released this time covers all modalities including language, voice, image, and video.
However, regarding the future development of multi-modal AGI, Tan Dai told the reporter that the industry's technology still has to work much harder before it can even reach the threshold of AGI.
Volcano Engine's launch of the DouBao video large model, a product developed within ByteDance Group, for B-end (enterprise) customers also shows the group's emphasis on its cloud computing business in the AI 2.0 era.