Chinese tech giants, AI ‘godmother’ Li Fei-Fei race to seize the edge in world models
Summarized and contextualized by DistantNews.
TLDR
- Chinese tech giants and AI pioneer Li Fei-Fei are developing "world models" to extend AI beyond language to understanding the physical world.
- Alibaba unveiled Happy Oyster, an open-ended model for real-time virtual world creation, capable of generating longer video clips and continuous interaction.
- World Labs, co-founded by Li, launched Spark 2.0, an open-source rendering engine enabling lower-powered devices to view detailed 3D images.
The global race to advance artificial intelligence is heating up, with major Chinese tech firms and prominent AI researchers pushing the boundaries of what machines can understand. The focus is shifting from purely language-based AI to "world models" – systems designed to comprehend and interact with the physical reality around us.
Alibaba Group Holding on Thursday unveiled Happy Oyster, which it called an open-ended world model designed for real-time and “flowy” virtual world creation and interaction, according to a statement from the e-commerce group’s Alibaba Token Hub (ATH) business unit, newly formed to consolidate its core AI initiatives.
Alibaba, a titan of Chinese e-commerce, has entered this arena with Happy Oyster. The model targets virtual world creation with real-time, dynamic interaction and can generate video clips up to three minutes long. Where earlier AI video tools produced short, one-off clips, Happy Oyster offers a more fluid and continuous experience, letting users develop imaginative digital environments through text and image prompts.
Unlike conventional AI video tools, which generate one-off clips that top out at a dozen seconds or a few minutes, Happy Oyster could generate video clips of up to three minutes showing virtual worlds, the company said.
Meanwhile, Stanford professor Li Fei-Fei, often dubbed the "godmother of AI," is also making significant strides. Her company, World Labs, has introduced Spark 2.0, an open-source 3D rendering engine. This technology promises to democratize access to detailed 3D imagery, making it viewable even on less powerful devices like smartphones. This development could significantly broaden the applications of immersive technologies.
Happy Oyster's continuous interaction meant users could keep developing their imaginative worlds with new ideas, ATH said.
The advancements from Alibaba and World Labs show that the race to lead the next wave of AI innovation now involves both Chinese tech giants and prominent researchers such as Li. By focusing on world models and accessible 3D technologies, these efforts aim to bridge the gap between the digital and physical realms, potentially unlocking new possibilities in virtual reality, gaming, and beyond.
World Labs' Spark 2.0 is an open-source 3D Gaussian splatting rendering engine that aims to give even less powerful devices, such as smartphones, the ability to view large-scale and detailed 3D images.
Originally published by South China Morning Post. Summarized and contextualized by our editorial team with added local perspective.