SHARP — Single-Image 3D Gaussian View Synthesis

Mirror of Apple's SHARP model weights, converted to safetensors format for use with ComfyUI-FFMPEGA.

Model Description

SHARP predicts 3D Gaussian Splatting (3DGS) parameters from a single photograph in under 1 second on a GPU, then renders camera-trajectory videos using gsplat.

  • Input: Single RGB image
  • Output: 3D Gaussian splat representation → camera trajectory video or .ply export
  • Speed: <1s prediction on GPU
  • Resolution: 1536×1536 internal processing
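
A .ply export can be inspected without any 3D tooling, because PLY files begin with a plain-text header. The sketch below uses only the Python standard library. It assumes the export follows the standard PLY header layout (an `element vertex N` line followed by `property` lines) and makes no assumption about which per-Gaussian properties SHARP actually writes. The function name `read_ply_header` and the sample bytes are hypothetical, for illustration only.

```python
import io


def read_ply_header(stream):
    """Parse a PLY header from a binary stream.

    Returns (vertex_count, property_names). Property names are reported
    as found; nothing here is specific to SHARP's output layout.
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    vertex_count = None
    properties = []
    current_element = None
    # readline() returns b"" at end of file, which stops the iteration.
    for raw in iter(stream.readline, b""):
        parts = raw.decode("ascii").split()
        if not parts:
            continue
        if parts[0] == "end_header":
            return vertex_count, properties
        if parts[0] == "element":
            current_element = parts[1]
            if current_element == "vertex":
                vertex_count = int(parts[2])
        elif parts[0] == "property" and current_element == "vertex":
            # The property name is the last token on the line.
            properties.append(parts[-1])
    raise ValueError("PLY header has no end_header line")


# Hypothetical three-point header, not a real SHARP export.
sample = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\n"
    b"element vertex 3\n"
    b"property float x\nproperty float y\nproperty float z\n"
    b"end_header\n"
)
count, props = read_ply_header(sample)
print(count, props)
```

For a file on disk, open it in binary mode (`open(path, "rb")`) and pass the file object to the same function; the vertex count then gives the number of exported Gaussians, assuming one vertex per Gaussian.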

Usage in ComfyUI-FFMPEGA

  1. Set llm_model to none
  2. Set no_llm_mode to sharp
  3. Connect an image to image_a or image_path_a
  4. Run the workflow; the model weights are downloaded automatically on first use

Parameters

Parameter            Default         Description
sharp_trajectory     rotate_forward  Camera motion: rotate_forward, swipe, shake, rotate
sharp_num_frames     60              Number of video frames (10–300)
sharp_max_disparity  0.08            Lateral camera range
sharp_max_zoom       0.15            Zoom intensity
sharp_save_ply       false           Export .ply Gaussian splat file
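
The usage steps and the parameter table can be combined into one settings dictionary. The following is a minimal sketch, not the node's actual implementation: the helper `build_sharp_settings` is hypothetical, and only the parameter names, defaults, trajectory options, and the 10–300 frame range come from this card. How such a dictionary would be passed to ComfyUI is not covered here.

```python
# Defaults copied from the parameter table in this card.
SHARP_DEFAULTS = {
    "sharp_trajectory": "rotate_forward",
    "sharp_num_frames": 60,
    "sharp_max_disparity": 0.08,
    "sharp_max_zoom": 0.15,
    "sharp_save_ply": False,
}
TRAJECTORIES = {"rotate_forward", "swipe", "shake", "rotate"}


def build_sharp_settings(image_path, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults
    and check the limits that the card states explicitly."""
    unknown = set(overrides) - set(SHARP_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**SHARP_DEFAULTS, **overrides}
    if params["sharp_trajectory"] not in TRAJECTORIES:
        raise ValueError(f"unsupported trajectory: {params['sharp_trajectory']}")
    if not 10 <= params["sharp_num_frames"] <= 300:
        raise ValueError("sharp_num_frames must be between 10 and 300")
    return {
        "llm_model": "none",         # usage step 1
        "no_llm_mode": "sharp",      # usage step 2
        "image_path_a": image_path,  # usage step 3 (image_a is the alternative input)
        **params,
    }


settings = build_sharp_settings(
    "photo.png", sharp_trajectory="swipe", sharp_num_frames=120
)
print(settings)
```

Only the frame count and trajectory name are checked, because the card gives no valid ranges for sharp_max_disparity or sharp_max_zoom.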

License

⚠️ Research Use Only — Non-Commercial

This model is licensed under the Apple Machine Learning Research License. Model weights are restricted to Research Purposes only — non-commercial scientific research and academic development.

See LICENSE for the full terms.

Citation

@article{stier2025sharp,
  title={SHARP: Synthesizing 3D Gaussians from a Single Monocular Image with High Fidelity and Accurate Geometry},
  author={Stier, Nikolai and Wadhwa, Neal and Szeliski, Richard},
  year={2025},
  url={https://github.com/apple/ml-sharp}
}

Original Repository

  • Code: github.com/apple/ml-sharp (BSD-like license)
  • Paper: See repository for links
  • Copyright: © 2025 Apple Inc. All Rights Reserved.