Gaussian Splatting: The Future of Real-Time 3D, From Avatars to Autonomous Driving
The latest 50 papers on Gaussian Splatting: Dec. 27, 2025
Prepare to have your perception of 3D rendering and scene understanding radically reshaped! Gaussian Splatting (GS) has emerged as a game-changer, rapidly evolving from a novel rendering technique to a foundational pillar across diverse AI/ML applications. This digest dives into a flurry of recent breakthroughs that are pushing the boundaries of what’s possible, from generating ultra-high-resolution images and animating lifelike avatars to enabling city-scale virtual reality and enhancing autonomous navigation.
The Big Ideas & Core Innovations
The overarching theme in recent GS research is efficiency, scalability, and realism – tackling the inherent challenges of real-world complexity with innovative solutions. We’re seeing a push to make GS more adaptive, context-aware, and capable of handling vast, dynamic environments.
One significant leap is in dynamic scene reconstruction and animation. Researchers from Princeton University and Columbia University, in their paper “4D Gaussian Splatting as a Learned Dynamical System”, introduce EvoGS, which models 4D GS as a continuous-time dynamical system. This allows for robust motion prediction and continuous temporal extrapolation, outperforming traditional deformation-based methods. Building on this, the “Prior-Enhanced Gaussian Splatting for Dynamic Scene Reconstruction from Casual Video” paper from the University of Washington and Microsoft Research enhances dynamic GS by integrating high-fidelity 2D priors through novel loss functions, achieving more realistic and coherent results for AR/VR. This is echoed in “HGS: Hybrid Gaussian Splatting with Static-Dynamic Decomposition for Compact Dynamic View Synthesis” by Xi’an Jiaotong University, which achieves real-time rendering at 125 FPS while drastically reducing model size by explicitly separating static and dynamic scene components. For highly expressive facial animation, the work from University of California, San Diego and NVIDIA, titled “Instant Expressive Gaussian Head Avatar via 3D-Aware Expression Distillation”, distills 3D-aware expressions from 2D diffusion models into animatable 3D Gaussian avatars, running orders of magnitude faster than diffusion models while preserving 3D consistency. Complementing this, the “TimeWalker: Personalized Neural Space for Lifelong Head Avatars” paper from Shanghai AI Laboratory introduces a framework for modeling and animating human head avatars across an entire lifespan, using dynamic 2D Gaussian Splatting to capture age-related changes with impressive realism.
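To make the dynamical-system framing concrete, here is a minimal, hypothetical sketch (not the EvoGS implementation) of advancing 3D Gaussian centers through time by integrating a learned velocity field with explicit Euler steps; the `VelocityField` network, shapes, and step counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Toy MLP predicting a velocity for each Gaussian center at time t.
    Purely illustrative; the paper's actual architecture may differ."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, centers, t):
        # centers: (N, 3) Gaussian means; t: scalar time broadcast per Gaussian
        t_col = torch.full((centers.shape[0], 1), float(t), device=centers.device)
        return self.net(torch.cat([centers, t_col], dim=-1))

def rollout(centers, field, t0=0.0, t1=1.0, steps=20):
    """Integrate dx/dt = f(x, t) with explicit Euler, so the motion can be
    evaluated at any continuous time, including beyond observed timestamps."""
    dt = (t1 - t0) / steps
    x, t = centers.clone(), t0
    for _ in range(steps):
        x = x + dt * field(x, t)
        t += dt
    return x

if __name__ == "__main__":
    gaussians = torch.randn(1024, 3)          # stand-in for optimized 3DGS means
    future = rollout(gaussians, VelocityField())
    print(future.shape)                        # torch.Size([1024, 3])
```

Because the motion model is a continuous-time field rather than per-frame deformations, extrapolating past the training window is just integrating a little further.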
Efficiency and scalability are also paramount. “Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting” by NVIDIA and POSTECH introduces Q-Render, a sparse, transmittance-guided sampling strategy that drastically speeds up high-dimensional feature rendering. For large-scale environments, “Nebula: Enable City-Scale 3D Gaussian Splatting in Virtual Reality via Collaborative Rendering and Accelerated Stereo Rasterization” from Shanghai Jiao Tong University and Shanghai Qi Zhi Institute presents a collaborative rendering framework for city-scale 3DGS in VR that sharply reduces bandwidth and achieves significant speedups. Compression is also getting plenty of attention: “Voxel-GS: Quantized Scaffold Gaussian Splatting Compression with Run-Length Coding” by City University of Hong Kong achieves high compression ratios with faster coding, while “Lightweight 3D Gaussian Splatting Compression via Video Codec” from University of Missouri – Kansas City leverages video codecs for efficient 3DGS data compression, showing significant rate-distortion gains.
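For a feel of how the compression-oriented work operates, the following sketch quantizes a per-Gaussian attribute (opacity, say) to a small number of levels and run-length encodes the symbol stream. It mirrors the general quantize-then-entropy-code idea rather than the exact Voxel-GS pipeline; level counts and ordering are assumptions.

```python
import numpy as np

def quantize(values, n_levels=16):
    """Uniformly quantize a float attribute (e.g. opacity in [0, 1]) to integer symbols."""
    lo, hi = values.min(), values.max()
    scale = (n_levels - 1) / max(hi - lo, 1e-8)
    return np.round((values - lo) * scale).astype(np.int32), (lo, scale)

def run_length_encode(symbols):
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    runs, prev, count = [], symbols[0], 1
    for s in symbols[1:]:
        if s == prev:
            count += 1
        else:
            runs.append((int(prev), count))
            prev, count = s, 1
    runs.append((int(prev), count))
    return runs

if __name__ == "__main__":
    opacities = np.clip(np.random.beta(0.5, 0.5, size=100_000), 0, 1)
    symbols, _ = quantize(opacities)
    # In practice Gaussians are ordered spatially (e.g. by voxel/Morton index),
    # which makes neighboring symbols similar and runs much longer.
    runs = run_length_encode(symbols)
    print(f"{len(symbols)} symbols -> {len(runs)} runs")
```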
Quality and geometric consistency are being refined through advanced techniques. “DeMapGS: Simultaneous Mesh Deformation and Surface Attribute Mapping via Gaussian Splatting” by The University of Tokyo and CyberAgent offers a framework for high-quality mesh reconstruction by jointly optimizing geometry and surface attributes using structured Gaussian representation. “MatSpray: Fusing 2D Material World Knowledge on 3D Geometry” by the University of Tübingen integrates diffusion-based 2D material estimates into 3DGS, enabling physically accurate, relightable assets. In dealing with challenging inputs, “Breaking the Vicious Cycle: Coherent 3D Gaussian Splatting from Sparse and Motion-Blurred Views” sets a new state-of-the-art for generating coherent 3D splats from sparse and motion-blurred views, enhancing novel view synthesis and deblurring. The paper “MVGSR: Multi-View Consistent 3D Gaussian Super-Resolution via Epipolar Guidance” from Xi’an Jiaotong University introduces a novel framework for high-resolution 3DGS rendering by integrating multi-view information with epipolar constraints, enhancing consistency and detail.
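As a rough illustration of the epipolar guidance used for multi-view consistency, the sketch below builds the fundamental matrix from known intrinsics and a relative pose, then maps a pixel in one view to the epipolar line its match must lie on in another view; the matrices here are toy values, not MVGSR's actual pipeline.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def fundamental_matrix(K1, K2, R, t):
    """F from intrinsics K1, K2 and relative pose (R, t): F = K2^-T [t]_x R K1^-1."""
    E = skew(t) @ R                      # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def epipolar_line(F, pixel):
    """Line (a, b, c) with ax + by + c = 0 in view 2 for a pixel in view 1."""
    x = np.array([pixel[0], pixel[1], 1.0])
    l = F @ x
    return l / np.linalg.norm(l[:2])     # normalize so point-line distance is in pixels

if __name__ == "__main__":
    K = np.array([[1000, 0, 640], [0, 1000, 360], [0, 0, 1]], float)
    R = np.eye(3)                        # toy relative rotation
    t = np.array([0.2, 0.0, 0.0])        # toy baseline along x
    F = fundamental_matrix(K, K, R, t)
    print(epipolar_line(F, (700, 400)))  # the constraint that keeps upscaled detail view-consistent
```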
For specialized applications, GS is proving incredibly versatile. “Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis” from Huawei Paris Research Center improves novel view synthesis in urban driving scenes by combining LiDAR and RGB data with diffusion models, outperforming 3DGS in view extrapolation. Similarly, “UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations” from Huawei Noah’s Ark Lab tackles fisheye camera simulation in autonomous driving with unified 3D Gaussian representations. The “Computer vision training dataset generation for robotic environments using Gaussian splatting” paper from Nicolaus Copernicus University in Toruń introduces a pipeline for generating photorealistic, automatically labeled datasets for robotics, blending real and synthetic data using GS.
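The dataset-generation idea can be pictured with a tiny compositing sketch: a splat-rendered object (faked here as an RGBA array) is alpha-blended onto a real background, and a bounding-box label is derived automatically from the alpha mask. All array contents are placeholders, not the output of an actual GS renderer or the paper's pipeline.

```python
import numpy as np

def composite_with_label(background_rgb, render_rgba):
    """Alpha-blend a rendered RGBA layer over a real photo and return the
    composite plus an automatically derived bounding-box label."""
    rgb = render_rgba[..., :3].astype(np.float32)
    alpha = render_rgba[..., 3:4].astype(np.float32) / 255.0
    composite = (alpha * rgb + (1 - alpha) * background_rgb).astype(np.uint8)

    mask = alpha[..., 0] > 0.5                      # the label comes for free from alpha
    ys, xs = np.nonzero(mask)
    bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())) if xs.size else None
    return composite, bbox

if __name__ == "__main__":
    bg = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)   # stand-in real image
    layer = np.zeros((480, 640, 4), dtype=np.uint8)
    layer[200:300, 250:400, :3] = 180                               # fake rendered object
    layer[200:300, 250:400, 3] = 255
    img, bbox = composite_with_label(bg, layer)
    print(img.shape, bbox)                                          # (480, 640, 3) (250, 200, 399, 299)
```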
Under the Hood: Models, Datasets, & Benchmarks
Many of these innovations are underpinned by new models, datasets, and benchmarks that fuel progress:
- Quantile Rendering (Q-Render) & Gaussian Splatting Network (GS-Net): Introduced by NVIDIA and POSTECH (https://arxiv.org/pdf/2512.20927), GS-Net is a 3D neural network predicting high-dimensional Gaussian features, with Q-Render as an efficient sparse sampling strategy (illustrated in the sketch after this list). Code is likely to be released at https://github.com/NVIDIA/Gaussian-Splatting-Net.
- SkyLume Dataset: Developed by HKUST(GZ) and the University of Liverpool (https://arxiv.org/pdf/2512.14200), this is the first large-scale UAV dataset for illumination-robust urban scene reconstruction, including LiDAR-derived ground truth.
- MAD-CARS Dataset: From Yandex and HSE University (https://arxiv.org/pdf/2506.21520), this massive dataset comprises 70K 360° car videos for diverse asset reconstruction, crucial for autonomous driving simulation. Code is at https://github.com/zju3dv/ and https://github.com/hyzhou404/HUGSIM.
- SynDM Dataset: Introduced by Northeastern University (https://arxiv.org/pdf/2512.14406), this is the first synthetic dataset for dynamic monocular NeRFs with paired primary and rotated views, enabling better evaluation of view synthesis under large angular perturbations. Code for ExpanDyNeRF is at https://github.com/ostadabbas/ExpanDyNeRF.
- TimeWalker-1.0 Dataset: From Shanghai AI Laboratory and CUHK (https://arxiv.org/pdf/2412.02421), a large-scale collection of 40 celebrities’ lifelong images to train and evaluate lifelong head avatar modeling.
- Chorus Framework: Developed by University of Amsterdam, ETH Zürich, and others (https://arxiv.org/pdf/2512.17817), this multi-teacher pretraining framework aligns a native 3DGS encoder with diverse 2D foundation models to create holistic scene encodings. Code is available at https://huggingface.co/.
- Fast-2DGS: From Clemson University and Arizona State University (https://arxiv.org/pdf/2512.12815), a lightweight framework for efficient 2D image representation using Deep Gaussian Prior. Code is at https://github.com/Aztech-Lab/Fast-2DGS.
- Nexels: A neurally-textured surfel representation from the University of Toronto, Simon Fraser University, and others (https://arxiv.org/pdf/2512.13796) that decouples geometry and appearance for real-time novel view synthesis with sparse geometries. Code is at https://lessvrong.com/cs/nexels.
- Voxel-GS: Proposed by City University of Hong Kong and University of Missouri-Kansas City (https://arxiv.org/pdf/2512.17528), a framework for compressing Gaussian splatting point clouds using lightweight rate proxies and run-length coding. Code is at https://github.com/zb12138/VoxelGS.
- GSRender: A method for occupancy prediction using weakly supervised 3D Gaussian splatting for autonomous driving (https://arxiv.org/pdf/2412.14579). Code at https://github.com/Jasper-sudo-Sun/GSRender.
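To illustrate the transmittance-guided sampling idea behind Q-Render (first item in the list above), here is a hypothetical sketch that composites high-dimensional features front to back along one ray and stops once remaining transmittance falls below a threshold. The alpha compositing is the standard splatting formulation, but the early-exit policy, threshold, and names are illustrative assumptions, not the paper's code.

```python
import torch

def sparse_feature_render(alphas, features, tau=1e-2):
    """Composite per-Gaussian features along one ray, front to back, skipping the
    tail of Gaussians once accumulated transmittance drops below `tau`.

    alphas:   (N,) opacity of each depth-sorted Gaussian hit by the ray, in [0, 1]
    features: (N, D) high-dimensional feature attached to each Gaussian
    """
    out = torch.zeros(features.shape[1])
    T = 1.0                                   # remaining transmittance
    used = 0
    for a, f in zip(alphas, features):
        out += T * a * f                      # standard alpha-compositing weight
        T *= (1.0 - a.item())
        used += 1
        if T < tau:                           # early exit: the rest of the ray barely contributes
            break
    return out, used

if __name__ == "__main__":
    torch.manual_seed(0)
    alphas = torch.rand(200).clamp(0.05, 0.9)      # depth-sorted opacities for one ray
    feats = torch.randn(200, 64)                   # 64-dim features instead of RGB
    feat, used = sparse_feature_render(alphas, feats)
    print(f"composited {used} of 200 Gaussians")   # far fewer than N once T collapses
```

The payoff is that the cost of embedding a D-dimensional feature scales with the handful of Gaussians that actually matter for each ray, not with everything the ray intersects.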
Impact & The Road Ahead
The collective impact of this research is profound, touching upon virtually every aspect of 3D content creation and real-world perception. From AR/VR applications gaining city-scale virtual environments and hyper-realistic, expressive avatars, to autonomous driving benefiting from robust scene reconstruction under challenging conditions and efficient occupancy prediction, Gaussian Splatting is clearly a foundational technology. The advancements in compression, like those from “Voxel-GS” and “Lightweight 3D Gaussian Splatting Compression via Video Codec”, are crucial for enabling the widespread deployment of GS models on resource-constrained devices, such as lightweight drones as highlighted by “VLA-AN: An Efficient and Onboard Vision-Language-Action Framework for Aerial Navigation in Complex Environments” from Zhejiang University. Moreover, the ability to generate photorealistic synthetic data, as shown by works like “Computer vision training dataset generation for robotic environments using Gaussian splatting”, is set to accelerate the development and training of robust AI systems across industries. The meticulous work on disentangling appearance from geometry, as seen in “MatSpray” and “Using Gaussian Splats to Create High-Fidelity Facial Geometry and Texture” by Epic Games and Stanford University, promises new levels of realism and flexibility in digital asset creation. As new architectures like Qonvolutions from Samsung Research America in “Qonvolution: Towards Learning High-Frequency Signals with Queried Convolution” continue to emerge, enabling better learning of high-frequency signals, the visual fidelity of GS models will only continue to soar. The future of 3D is increasingly shaped by Gaussian Splatting, promising an exciting era of immersive, intelligent, and interactive experiences.