---
date: 2024-12-31
title: CV-Project-Report
status: DONE
author:
- AllenYGY
tags:
- NOTE
- CV
- Project
publish: true
---
Reproducing Mip-3DGS: An Improved 3D Gaussian Splatting Framework
In this project, our goal is to reproduce the results of Mip-3DGS (Mip-Splatting), an improved version of the original 3D Gaussian Splatting (3DGS) framework. While 3DGS has proven to be a breakthrough in balancing high-quality rendering with real-time performance, Mip-3DGS introduces several key enhancements that further improve efficiency and accuracy.
Our work validates the proposed improvements, namely the 3D Smoothing Filter and the 2D Mip Filter, which allow the model to better preserve fine-grained detail across scales while maintaining computational efficiency. Additionally, we analyze the effects of these modifications on rendering quality and performance through extensive experiments on benchmark datasets.
The results of our reproduction confirm that Mip-3DGS significantly outperforms the original 3DGS in visual fidelity, especially for scenes with high geometric complexity or large changes in viewing scale, while retaining real-time rendering speed. This report details our reproduction process, experimental findings, and insights gained from implementing the Mip-3DGS framework.
Rendering and representing 3D scenes efficiently and accurately has long been a fundamental challenge in computer vision, especially for real-time applications such as gaming and virtual reality. Traditional methods, such as voxel grids or mesh-based approaches, often fail to balance computational efficiency with visual fidelity. Although Neural Radiance Fields (NeRF) marked a significant breakthrough in 3D representation, their high computational cost and slow rendering speed make them unsuitable for real-time applications.
3D Gaussian Splatting (3DGS) emerged as a solution to these challenges, introducing a novel way to represent 3D scenes using Gaussian distributions. However, even with its advantages, 3DGS still struggles with highly detailed scenes and with regions that must be rendered at multiple scales. Mip-3DGS addresses these issues by extending the original framework with a 3D smoothing filter and a screen-space Mip filter that control each Gaussian's frequency content.
Concretely, 3DGS replaces dense volumetric grids with anisotropic Gaussian primitives, enabling efficient computation and real-time rendering. Because each Gaussian is optimized only at the training resolution, however, the representation lacks the flexibility to handle multi-resolution detail effectively.
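For reference, in the standard 3DGS formulation each primitive $k$ is an anisotropic Gaussian with center $\mathbf{p}_k$ and covariance $\boldsymbol{\Sigma}_k$, factorized into a rotation and a scale so the covariance stays positive semi-definite; this notation is reused by the filters described later:

$$\mathcal{G}_k(\mathbf{x}) = e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^{\top}\boldsymbol{\Sigma}_k^{-1}(\mathbf{x}-\mathbf{p}_k)}, \qquad \boldsymbol{\Sigma}_k = \mathbf{R}_k \mathbf{S}_k \mathbf{S}_k^{\top} \mathbf{R}_k^{\top}$$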
The objective of this project is to reproduce the results of the Mip-3DGS framework and validate its claimed improvements over the original 3DGS. Specifically, we aim to:
- reproduce the Mip-Splatting training and rendering pipeline on benchmark datasets;
- validate its two key components, the 3D Smoothing Filter and the 2D Mip Filter;
- evaluate rendering quality across multiple test resolutions to measure generalization to unseen scales.
Mip-Splatting enhances 3D Gaussian Splatting (3DGS) by addressing:
- high-frequency artifacts that appear when zooming in beyond the sampling rate seen during training;
- aliasing and brightness/dilation artifacts that appear when zooming out to lower sampling rates.
The hypothesis is that these improvements lead to higher rendering fidelity, consistent performance across scales, and better generalization to unseen resolutions compared to the original 3DGS.
The implementation builds on the 3D Gaussian Splatting framework and introduces two key enhancements: the 3D Smoothing Filter and the 2D Mip Filter.
To constrain high-frequency artifacts during zoom-in, the 3D Smoothing Filter modifies each Gaussian's representation in 3D space.
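Specifically, following the formulation in the Mip-Splatting paper, each Gaussian $\mathcal{G}_k$ is convolved with an isotropic low-pass Gaussian whose size is governed by the maximal sampling rate $\hat{\nu}_k$ at which that primitive is observed, with a scalar hyperparameter $s$ controlling the filter size:

$$\mathcal{G}_k(\mathbf{x})_{\mathrm{reg}} = \left(\mathcal{G}_k \otimes \mathcal{G}_{\frac{s}{\hat{\nu}_k}}\right)(\mathbf{x})$$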
Because the convolution of two Gaussians is itself a Gaussian, the filter can be applied efficiently in closed form.
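As we reconstruct it from the Mip-Splatting paper, the closed form inflates the covariance by $\frac{s}{\hat{\nu}_k}\mathbf{I}$ and rescales the amplitude so the filtered Gaussian keeps its integrated contribution:

$$\mathcal{G}_k(\mathbf{x})_{\mathrm{reg}} = \sqrt{\frac{|\boldsymbol{\Sigma}_k|}{\left|\boldsymbol{\Sigma}_k + \frac{s}{\hat{\nu}_k}\mathbf{I}\right|}}\; e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^{\top}\left(\boldsymbol{\Sigma}_k + \frac{s}{\hat{\nu}_k}\mathbf{I}\right)^{-1}(\mathbf{x}-\mathbf{p}_k)}$$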
This filter ensures that the maximum frequency of each Gaussian does not exceed the Nyquist limit, preventing high-frequency artifacts.
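Below is a minimal sketch of how this filter can be applied to a batch of Gaussians; the function name, the default value of `s`, and the convention of folding the amplitude factor into opacity are our assumptions, and the official implementation differs in low-level details:

```python
import numpy as np

def apply_3d_smoothing_filter(cov3d, opacity, nu_max, s=0.2):
    """Regularize 3D Gaussians with the smoothing filter (sketch).

    cov3d:   (N, 3, 3) per-Gaussian covariance matrices
    opacity: (N,)      per-Gaussian opacities
    nu_max:  (N,)      maximal sampling rate per Gaussian
    s:       scalar hyperparameter controlling filter size (assumed default)
    """
    # Add the isotropic low-pass covariance (s / nu_k) * I to each Gaussian.
    cov_reg = cov3d + (s / nu_max)[:, None, None] * np.eye(3)

    # Rescale amplitude by sqrt(|Sigma| / |Sigma + (s/nu) I|); folding the
    # factor into opacity keeps each Gaussian's integrated contribution fixed.
    det_ratio = np.linalg.det(cov3d) / np.linalg.det(cov_reg)
    opacity_reg = opacity * np.sqrt(det_ratio)
    return cov_reg, opacity_reg
```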
To mitigate aliasing and brightness/dilation artifacts during zoom-out, the 2D Mip Filter replaces the 2D dilation operation used in 3DGS. It simulates the physical imaging process by approximating a single-pixel box filter with a 2D Gaussian.
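In the paper's formulation, the projected screen-space covariance $\boldsymbol{\Sigma}_k^{\mathrm{2D}}$ is inflated by $s\,\mathbf{I}$, where the variance $s$ approximates the box filter of one pixel, and the amplitude is rescaled accordingly:

$$\mathcal{G}_k^{\mathrm{2D}}(\mathbf{x})_{\mathrm{mip}} = \sqrt{\frac{|\boldsymbol{\Sigma}_k^{\mathrm{2D}}|}{\left|\boldsymbol{\Sigma}_k^{\mathrm{2D}} + s\,\mathbf{I}\right|}}\; e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^{\top}\left(\boldsymbol{\Sigma}_k^{\mathrm{2D}} + s\,\mathbf{I}\right)^{-1}(\mathbf{x}-\mathbf{p}_k)}$$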
This operation ensures proper anti-aliasing by adapting the Gaussian spread to the screen-space pixel size.
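A companion sketch in the same style as above; the default value of `s` is an assumption. The normalization term is what distinguishes this filter from the fixed dilation in vanilla 3DGS, which brightens small Gaussians instead of preserving their energy:

```python
import numpy as np

def apply_2d_mip_filter(cov2d, opacity, s=0.1):
    """Replace 3DGS's screen-space dilation with the 2D Mip filter (sketch).

    cov2d:   (N, 2, 2) projected screen-space covariances
    opacity: (N,)      per-Gaussian opacities
    s:       variance approximating a single-pixel box filter (assumed default)
    """
    cov_mip = cov2d + s * np.eye(2)

    # Energy-preserving rescaling: small Gaussians are smoothed, not brightened.
    det_ratio = np.linalg.det(cov2d) / np.linalg.det(cov_mip)
    opacity_mip = opacity * np.sqrt(det_ratio)
    return cov_mip, opacity_mip
```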
The goal of this experiment was to evaluate the generalization capability of Mip-Splatting when trained on a single resolution and tested on multiple scales, simulating zoom-in and zoom-out scenarios. This setting highlights the ability of Mip-Splatting to handle out-of-distribution resolutions without explicit multi-scale supervision.
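A minimal sketch of this evaluation loop follows; `render_fn(view, scale)` and `view.gt_image(scale)` are hypothetical interfaces standing in for the reproduction code:

```python
import numpy as np

def psnr(rendered, gt, max_val=1.0):
    """PSNR in dB between two images with values in [0, max_val]."""
    mse = np.mean((rendered - gt) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def evaluate_multiscale(render_fn, test_views, scales=(1, 2, 4, 8)):
    """Average PSNR per downsampling factor over all test views."""
    results = {}
    for scale in scales:
        scores = [psnr(render_fn(view, scale), view.gt_image(scale))
                  for view in test_views]
        results["1x" if scale == 1 else f"1/{scale}"] = float(np.mean(scores))
    return results
```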
All values are PSNR in dB; models are trained at a single resolution and evaluated at four test scales.

| Method | PSNR (1×) | PSNR (1/2) | PSNR (1/4) | PSNR (1/8) |
| --- | --- | --- | --- | --- |
| Mip-Splatting | 34.5 | 34.2 | 33.8 | 33.2 |
| 3DGS | 32.7 | 31.5 | 30.2 | 28.5 |
| 3DGS + EWA | 33.1 | 32.8 | 32.0 | 30.5 |
| Mip-NeRF | 34.0 | 33.6 | 32.5 | 30.8 |
Zoom-In (Higher Scales): The 3D Smoothing Filter suppresses the high-frequency artifacts that vanilla 3DGS exhibits when rendering above the training sampling rate.
Zoom-Out (Lower Scales): The 2D Mip Filter prevents aliasing and dilation artifacts; Mip-Splatting loses only 1.3 dB from 1× to 1/8 resolution, versus 4.2 dB for 3DGS.
Generalization: Trained at a single resolution, Mip-Splatting outperforms 3DGS, 3DGS + EWA, and Mip-NeRF at every test scale, confirming robust behavior on out-of-distribution resolutions.
Approximation Errors in 2D Mip Filtering:
The 2D Mip Filter uses a Gaussian approximation for box filtering to maintain computational efficiency. While effective in most cases, this approximation can introduce inaccuracies, particularly for very small or large screen-space projections of Gaussians.
Increased Training Complexity:
The inclusion of the 3D Smoothing Filter adds computational overhead during training, as it requires computing the maximal sampling frequency of every Gaussian across all training views and keeping it up to date as the scene is optimized (see the sketch after this list).
Scalability Challenges for Large-Scale Scenes:
The number of Gaussian primitives can grow significantly in highly complex or expansive scenes, leading to higher memory usage and reduced efficiency. This limits Mip-Splatting’s scalability for unbounded environments like large landscapes or dense urban areas.
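To make the training-complexity point concrete, here is a sketch of the per-Gaussian maximal sampling rate computation, where each camera contributes a sampling rate of focal length over depth; the array layout and the use of Euclidean distance as a depth proxy are our assumptions:

```python
import numpy as np

def max_sampling_rate(positions, cam_centers, focals, visibility):
    """Per-Gaussian maximal sampling rate: max over visible cameras of f / d.

    positions:   (N, 3) Gaussian centers
    cam_centers: (M, 3) training-camera centers
    focals:      (M,)   focal lengths in pixels
    visibility:  (N, M) True where Gaussian k is visible in camera n
    """
    # Distance from every Gaussian to every camera (proxy for camera-space depth).
    depth = np.linalg.norm(positions[:, None, :] - cam_centers[None, :, :], axis=-1)
    nu = focals[None, :] / np.maximum(depth, 1e-8)
    # Only cameras that actually observe the Gaussian contribute.
    nu = np.where(visibility, nu, -np.inf)
    return nu.max(axis=1)  # -inf marks Gaussians visible in no camera
```

Because this scans all Gaussians against all training cameras, implementations typically recompute it only periodically during training rather than at every step.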
To address the limitations of Mip-Splatting and further enhance its capabilities, the following future directions are proposed:
Improved Filtering Approximations: Replace the Gaussian approximation of the screen-space box filter with more accurate yet still efficient alternatives, reducing the errors noted above for very small or very large projected Gaussians.
Scalability Enhancements: Control the growth of Gaussian primitives in expansive scenes, for example through level-of-detail schemes or pruning, to curb memory usage.
Dynamic Scene Support: Extend the framework beyond static scenes to handle motion and deformation.
Efficient Training Mechanisms: Reduce the overhead of computing and updating per-Gaussian maximal sampling frequencies during training.
Integration with Neural Representations: Combine Gaussian primitives with neural components to improve compactness and generalization.
Real-Time Optimization for Large Scenes: Optimize rendering and memory management so that unbounded environments such as large landscapes or dense urban areas remain renderable in real time.
Mip-Splatting introduces significant advancements over traditional 3D Gaussian Splatting (3DGS) by addressing critical issues such as high-frequency artifacts during zoom-in and aliasing during zoom-out. By integrating a 3D Smoothing Filter and 2D Mip Filter, the framework enables higher rendering fidelity and consistent performance across multiple resolutions. These improvements make Mip-Splatting more robust and adaptable for real-time rendering and novel view synthesis tasks.
Through the reproduction of the Mip-Splatting framework in this project, we verified its claimed enhancements in visual fidelity and computational efficiency. Experimental results demonstrated its ability to handle multi-scale scenes effectively while maintaining high-quality renderings. However, challenges remain, including scalability for large scenes, training complexity, and dynamic scene adaptability.
Future work can build upon this foundation by addressing scalability and real-time performance limitations, exploring dynamic scene support, and integrating neural representations for further improvements. Overall, Mip-Splatting represents a promising direction for efficient, real-time 3D scene reconstruction and rendering, bridging the gap between quality and speed in novel view synthesis applications.