SHARP — Single-Image 3D Gaussian View Synthesis

Mirror of Apple's SHARP model weights, converted to safetensors format for use with ComfyUI-FFMPEGA.

Model Description

SHARP predicts 3D Gaussian Splat (3DGS) parameters from a single photograph in under 1 second, then renders camera trajectory videos using gsplat.

Input: Single RGB image
Output: 3D Gaussian splat representation → camera trajectory video or .ply export
Speed: <1s prediction on GPU
Resolution: 1536×1536 internal processing
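
When .ply export is enabled, a quick check is to read the file header and count the exported Gaussians. The sketch below is a minimal reader that uses only the standard PLY header keywords (`format`, `element`, `property`, `end_header`). The helper name `read_ply_header` and the sample header are illustrative assumptions; the property names SHARP actually writes may differ.

```python
import io


def read_ply_header(stream):
    """Parse a PLY header from a binary stream.

    Returns (format_name, element_counts, element_properties).
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")

    fmt = None
    counts = {}   # element name -> number of entries
    props = {}    # element name -> list of property names
    current = None

    while True:
        line = stream.readline()
        if not line:
            raise ValueError("file ended before end_header")
        tokens = line.decode("ascii").split()
        if not tokens:
            continue
        keyword = tokens[0]
        if keyword == "end_header":
            break
        if keyword == "format":
            fmt = tokens[1]
        elif keyword == "element":
            current = tokens[1]
            counts[current] = int(tokens[2])
            props[current] = []
        elif keyword == "property" and current is not None:
            # The property name is always the last token on the line.
            props[current].append(tokens[-1])
        # Other keywords such as "comment" are ignored.

    return fmt, counts, props


# Hypothetical in-memory header, for illustration only.
sample = (
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 3\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"property float opacity\n"
    b"end_header\n"
)
fmt, counts, props = read_ply_header(io.BytesIO(sample))
print(fmt, counts["vertex"], props["vertex"])
```

For a real export, open the file with `open(path, "rb")` and pass the handle to the same function; the `vertex` count is the number of Gaussians in the splat.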

Usage in ComfyUI-FFMPEGA

Set llm_model to none
Set no_llm_mode to sharp
Connect an image to image_a or image_path_a
Run the workflow; the model weights download automatically on first use

Parameters

| Parameter | Default | Description |
|---|---|---|
| `sharp_trajectory` | `rotate_forward` | Camera motion: rotate_forward, swipe, shake, rotate |
| `sharp_num_frames` | `60` | Number of video frames (10–300) |
| `sharp_max_disparity` | `0.08` | Lateral camera range |
| `sharp_max_zoom` | `0.15` | Zoom intensity |
| `sharp_save_ply` | `false` | Export .ply Gaussian splat file |
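
The parameters above can be collected and sanity-checked before a run. The following is a minimal sketch, not part of ComfyUI-FFMPEGA: the `SharpSettings` class and its `validate` method are hypothetical helpers, and only the field names, defaults, trajectory names, and the 10–300 frame range come from the table.

```python
from dataclasses import dataclass

# Trajectory names as listed in the parameter table.
TRAJECTORIES = ("rotate_forward", "swipe", "shake", "rotate")


@dataclass
class SharpSettings:
    """Hypothetical container mirroring the documented defaults."""

    sharp_trajectory: str = "rotate_forward"
    sharp_num_frames: int = 60
    sharp_max_disparity: float = 0.08
    sharp_max_zoom: float = 0.15
    sharp_save_ply: bool = False

    def validate(self):
        """Raise ValueError if a value falls outside the documented options."""
        if self.sharp_trajectory not in TRAJECTORIES:
            raise ValueError(
                f"unknown trajectory {self.sharp_trajectory!r}; "
                f"expected one of {TRAJECTORIES}"
            )
        if not 10 <= self.sharp_num_frames <= 300:
            raise ValueError("sharp_num_frames must be between 10 and 300")
        return self


# Example: a longer swipe with .ply export enabled.
settings = SharpSettings(
    sharp_trajectory="swipe",
    sharp_num_frames=120,
    sharp_save_ply=True,
).validate()
print(settings)
```

No range is documented for `sharp_max_disparity` or `sharp_max_zoom`, so the sketch leaves them unchecked.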

License

⚠️ Research Use Only — Non-Commercial

This model is licensed under the Apple Machine Learning Research License. Model weights are restricted to Research Purposes only — non-commercial scientific research and academic development.

See LICENSE for the full terms.

Citation

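@article{mescheder2025sharp,
  title={Sharp Monocular View Synthesis in Less Than a Second},
  author={Mescheder, Lars and Dong, Wei and Li, Shiwei and Bai, Xuyang and Santos, Marcel and Hu, Peiyun and Lecouat, Bruno and Zhen, Mingmin and Delaunoy, Ama{\"e}l and Fang, Tian and Tsin, Yanghai and Richter, Stephan R. and Koltun, Vladlen},
  year={2025},
  url={https://github.com/apple/ml-sharp}
}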

Original Repository

Code: github.com/apple/ml-sharp (BSD-like license)
Paper: See repository for links
