mochi-1-preview

Mochi-1 is a state-of-the-art open-source text-to-video generation model developed by Genmo. It achieves high-fidelity motion and strong prompt adherence in preliminary evaluations, significantly narrowing the gap between closed and open video generation systems.

Core components:

  • Asymmetric Diffusion Transformer (AsymmDiT): A 10-billion-parameter diffusion model built on a novel architecture.
  • Asymmetric Variational Autoencoder (AsymmVAE): An encoder-decoder model with an asymmetric structure.
  • Text Encoding: Uses a single T5-XXL language model for prompt encoding, avoiding reliance on multiple pretrained language models.

Key features:

  • A single-GPU setup requires ~60 GB of VRAM for full operation; at least one NVIDIA H100 GPU is recommended for optimal performance. Multi-GPU operation is supported through a context-parallel implementation. Memory optimization options include CPU offloading, VAE tiling, and lower-precision (bfloat16) variants to reduce VRAM usage.
  • Current output resolution is 480p.
  • The model supports generation of photorealistic videos. Minor warping or distortion may occur in videos with extreme motion, and the model performs poorly on animated content.
  • Organizations are advised to implement additional safety measures before deploying in commercial applications.
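
As an illustration of the memory optimization options listed above, the sketch below loads the model through the Hugging Face diffusers library. The use of diffusers, the genmo/mochi-1-preview repository id, and the bf16 weight variant are assumptions that are not stated on this page, and the helper name load_mochi_low_vram is hypothetical. Actual VRAM savings depend on the hardware and the library version.

```python
def load_mochi_low_vram(model_id: str = "genmo/mochi-1-preview"):
    """Load Mochi with the VRAM-saving options named above (sketch, assumes diffusers)."""
    # Heavy dependencies are imported lazily so that defining this helper
    # does not require a GPU or a model download.
    import torch
    from diffusers import MochiPipeline

    # Lower precision: load the bfloat16 variant of the weights.
    pipe = MochiPipeline.from_pretrained(
        model_id, variant="bf16", torch_dtype=torch.bfloat16
    )
    # CPU offloading: keep each sub-model on the CPU until it is needed.
    pipe.enable_model_cpu_offload()
    # VAE tiling: decode the video latents tile by tile.
    pipe.enable_vae_tiling()
    return pipe


# Example usage (downloads the weights and needs a CUDA GPU):
#   from diffusers.utils import export_to_video
#   pipe = load_mochi_low_vram()
#   frames = pipe("A close-up of ocean waves at sunset", num_frames=84).frames[0]
#   export_to_video(frames, "mochi.mp4", fps=30)
```

The heavy imports sit inside the function so that the helper can be defined on a machine without a GPU; the model is only downloaded when the function is called.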

The video generation pipeline consists of the following components:

  • Text encoder: ~4.8B parameters,
  • Transformer: ~10B parameters,
  • VAE: ~460M parameters.

Total (transformer and VAE, excluding the text encoder): ~10.5B parameters


The 'VRAM requirements' entry is calculated based on the value provided by the authors (60 GB, fp32) with a multiplier corresponding to the assumed precision (e.g., 0.5 for 16-bit, 0.25 for 8-bit, and 0.125 for 4-bit).
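
The calculation described above can be sketched in a few lines of Python. The 60 GB baseline and the multipliers are the values stated in the text; the function and constant names are illustrative and not part of any published tool.

```python
# Baseline VRAM figure quoted by the model authors, treated here as fp32 (32-bit).
BASELINE_VRAM_GB = 60.0

# Multipliers from the text: bits of precision -> fraction of the baseline.
PRECISION_MULTIPLIER = {32: 1.0, 16: 0.5, 8: 0.25, 4: 0.125}


def estimate_vram_gb(bits: int, baseline_gb: float = BASELINE_VRAM_GB) -> float:
    """Scale the baseline VRAM figure by the multiplier for the given precision."""
    if bits not in PRECISION_MULTIPLIER:
        raise ValueError(f"unsupported precision: {bits} bits")
    return baseline_gb * PRECISION_MULTIPLIER[bits]


for bits in (4, 8, 16):
    print(f"{bits}-bit: {estimate_vram_gb(bits):.1f} GB")
```

This is a rough scaling rule, not a measurement: real memory use also depends on activations, resolution, and the number of frames generated.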


Announce Date: 22.10.2024
Parameters: 10B
VRAM requirements: 7.5 GB with 4-bit quantization, 15.0 GB with 8-bit quantization, 30.0 GB with 16-bit precision
Developer: genmo
License: Apache 2.0

Public endpoint

Use our pre-built public endpoints for free to test inference and explore mochi-1-preview capabilities. You can obtain an API access token on the token management page after registration and verification.
There are no public endpoints for this model yet.

Private server

Rent your own physically dedicated instance with hourly or long-term monthly billing.

We recommend deploying a private instance when you need to:

  • maximize endpoint performance,
  • enable full context for long sequences,
  • ensure top-tier security for data processing in an isolated, dedicated environment,
  • use custom weights, such as fine-tuned models or LoRA adapters.

Recommended configurations for hosting mochi-1-preview

Prices:
Name                    vCPU   RAM, MB   Disk, GB   GPU   Price, hour
teslav100-1.12.64.160   12     65536     160        1     $1.20
rtx5090-1.16.64.160     16     65536     160        1     $1.59
teslaa100-1.16.64.160   16     65536     160        1     $2.37
teslah100-1.16.64.160   16     65536     160        1     $3.83
h200-1.16.128.160       16     131072    160        1     $4.74

Prices:
Name                    vCPU   RAM, MB   Disk, GB   GPU   Price, hour
teslat4-1.16.16.160     16     16384     160        1     $0.33
teslaa2-1.16.32.160     16     32768     160        1     $0.38
teslaa10-1.16.32.160    16     32768     160        1     $0.53
rtx3090-1.16.24.160     16     24576     160        1     $0.88
rtx4090-1.16.32.160     16     32768     160        1     $1.15
teslav100-1.12.64.160   12     65536     160        1     $1.20
rtx5090-1.16.64.160     16     65536     160        1     $1.59
teslaa100-1.16.64.160   16     65536     160        1     $2.37
teslah100-1.16.64.160   16     65536     160        1     $3.83
h200-1.16.128.160       16     131072    160        1     $4.74

Prices:
Name                    vCPU   RAM, MB   Disk, GB   GPU   Price, hour
teslat4-1.16.16.160     16     16384     160        1     $0.33
rtx2080ti-1.10.16.500   10     16384     500        1     $0.38
teslaa2-1.16.32.160     16     32768     160        1     $0.38
teslaa10-1.16.32.160    16     32768     160        1     $0.53
rtx3080-1.16.32.160     16     32768     160        1     $0.57
rtx3090-1.16.24.160     16     24576     160        1     $0.88
rtx4090-1.16.32.160     16     32768     160        1     $1.15
teslav100-1.12.64.160   12     65536     160        1     $1.20
rtx5090-1.16.64.160     16     65536     160        1     $1.59
teslaa100-1.16.64.160   16     65536     160        1     $2.37
teslah100-1.16.64.160   16     65536     160        1     $3.83
h200-1.16.128.160       16     131072    160        1     $4.74
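
The configuration names in the price tables appear to encode the hardware: for example, teslah100-1.16.64.160 matches 1 GPU, 16 vCPU, 64 GB of RAM (65536 MB), and a 160 GB disk. The sketch below decodes a name under that assumed convention, which is inferred from the listed rows rather than documented on this page, and estimates the cost of a rental from the hourly price. The class and function names are illustrative, and the estimate assumes a flat hourly rate with no long-term monthly discount.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    gpu_model: str
    gpu_count: int
    vcpu: int
    ram_mb: int
    disk_gb: int


def parse_config_name(name: str) -> Config:
    """Decode '<gpu>-<gpus>.<vcpu>.<ram_gb>.<disk_gb>' (assumed naming scheme)."""
    gpu_model, specs = name.rsplit("-", 1)
    gpu_count, vcpu, ram_gb, disk_gb = (int(part) for part in specs.split("."))
    # The tables list RAM in MB, so convert from the GB value in the name.
    return Config(gpu_model, gpu_count, vcpu, ram_gb * 1024, disk_gb)


def rental_cost(price_per_hour: float, hours: float) -> float:
    """Cost in dollars at a flat hourly rate, rounded to cents."""
    return round(price_per_hour * hours, 2)


h100 = parse_config_name("teslah100-1.16.64.160")
print(h100)
print(rental_cost(3.83, 24))  # one day on the H100 configuration
```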

Need help?

Contact our dedicated neural networks support team at nn@immers.cloud or send your request to the sales department at sale@immers.cloud.