
米言看科技 2024-04-18 18:05:33
OpenAI Hyperrealistic AI Videos and AI Video Generation for World Simulators

OpenAI Sora is a text-to-video AI model that marks a significant advance for artificial intelligence and its real-world applications. It can generate videos up to one minute long from text prompts while maintaining high visual quality. Sora uses a diffusion model: generation starts from static noise, which is refined step by step into a coherent video.

OpenAI also published research on video generation models as world simulators. The work explores large-scale training of generative models on video data. Specifically, OpenAI trains text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. The models use a transformer architecture that operates on spacetime patches of video and image latent codes. The largest model, Sora, can generate a minute of high-fidelity video. The results suggest that scaling video generation models is a promising path toward building general-purpose simulators of the physical world.

Hyperrealistic video can be used to generate highly useful AI training data. This is in line with training compute scaling by 100 times every year. By 2025, this OpenAI video generation could scale to many hours. By 2026, weeks of video could be generated every hour. Training data could then be generated at many multiples of real time.
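To make the "spacetime patches" idea above more concrete, here is a minimal sketch in plain Python. It is not OpenAI's implementation: the function names `patchify` and `make_grid`, the single-channel grid layout and the patch sizes are all assumptions made for illustration. It only shows how clips of different durations and resolutions, as well as still images, can each be cut into a sequence of fixed-size tokens that a transformer could consume.

```python
from typing import List

# Hypothetical layout: one value per latent cell, indexed as [frame][row][column].
Video = List[List[List[float]]]


def patchify(video: Video, pt: int, ph: int, pw: int) -> List[List[float]]:
    """Cut a [T][H][W] grid into non-overlapping pt x ph x pw blocks.

    Each block is flattened into one token (a list of pt * ph * pw values).
    """
    t_len, h_len, w_len = len(video), len(video[0]), len(video[0][0])
    if t_len % pt or h_len % ph or w_len % pw:
        raise ValueError("each dimension must be divisible by its patch size")
    tokens = []
    for t0 in range(0, t_len, pt):
        for h0 in range(0, h_len, ph):
            for w0 in range(0, w_len, pw):
                tokens.append([
                    video[t][h][w]
                    for t in range(t0, t0 + pt)
                    for h in range(h0, h0 + ph)
                    for w in range(w0, w0 + pw)
                ])
    return tokens


def make_grid(t_len: int, h_len: int, w_len: int) -> Video:
    """Build a toy grid whose cell value encodes its (t, h, w) position."""
    return [
        [[float(t * h_len * w_len + h * w_len + w) for w in range(w_len)]
         for h in range(h_len)]
        for t in range(t_len)
    ]


# Inputs with different durations and resolutions yield different token counts,
# but every token has the same size.
short_tokens = patchify(make_grid(4, 4, 4), 2, 2, 2)
wide_tokens = patchify(make_grid(8, 4, 6), 2, 2, 2)
# A still image is treated here as a one-frame video with a temporal patch size of 1.
image_tokens = patchify(make_grid(1, 4, 4), 1, 2, 2)

print(len(short_tokens), len(wide_tokens), len(image_tokens))  # prints "8 24 4"
```

In this toy version, the only thing that changes with the input's length or shape is the number of tokens, which is one way to read the claim that a single model can be trained on videos and images of variable durations, resolutions and aspect ratios.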

