Gaussian Splatting Takes Over: From Real-Time Streaming to Physics-Aware Robots and Beyond

Latest 54 papers on Gaussian Splatting: May 16, 2026

3D Gaussian Splatting (3DGS) has rapidly emerged as a leading 3D representation, offering real-time rendering with high visual fidelity. Initially lauded for its ability to reconstruct static scenes, 3DGS is now being pushed into dynamic environments, robotic applications, and robust reconstruction under adverse conditions, with significant gains in efficiency and versatility. This blog post distills key innovations from a collection of recent papers that showcase the breadth and depth of these advances.

The Big Ideas & Core Innovations

The fundamental challenge many of these papers address is how to make 3DGS more robust, efficient, and intelligent, especially for complex real-world scenarios. A recurring theme is the move towards physics-aware and geometrically consistent representations.

For instance, PG-3DGS: Optimizing 3D Gaussian Splatting to Satisfy Physics Objectives by Lee et al. from Purdue University introduces a framework that couples differentiable physics simulation with 3DGS. It generates 3D structures that are not only photometrically accurate but also physically functional, as seen with pouring teapots and flying airplanes, by using a soft occupancy mask as a differentiable interface to grid-based physics simulators. Complementing this, Real2Sim: A Physics-driven and Editable Gaussian Splatting Framework for Autonomous Driving Scenes by Huang et al. from Rensselaer Polytechnic Institute unifies 4D Gaussian Splatting (4DGS) with a differentiable Material Point Method (MPM) solver. By treating each Gaussian primitive as a material particle, this approach enables physics-aware, high-fidelity synthesis of editable autonomous driving scenarios, including crucial corner cases like collisions.
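To make the "soft occupancy mask" idea more concrete, here is a minimal sketch of how a set of Gaussians could be turned into per-cell occupancy values on a regular grid. This is not the PG-3DGS implementation: the isotropic scale, the "at least one Gaussian covers the cell" blending rule, and the names `soft_occupancy` and `occupancy_grid` are all assumptions chosen for illustration. The sketch only shows the forward computation; written in an autodiff framework, the same smooth expression would let gradients flow from a grid-based simulator back to the Gaussian parameters.

```python
import math

def soft_occupancy(point, gaussians):
    """Soft occupancy in [0, 1] at a 3D point.

    gaussians: list of (center, sigma, opacity) tuples. Each Gaussian is
    assumed isotropic with scalar scale sigma (a simplification).
    """
    free = 1.0  # running product: "no Gaussian covers this point"
    for center, sigma, opacity in gaussians:
        d2 = sum((p - c) ** 2 for p, c in zip(point, center))
        alpha = opacity * math.exp(-0.5 * d2 / sigma ** 2)
        free *= 1.0 - alpha
    return 1.0 - free

def occupancy_grid(gaussians, lo, hi, res):
    """Sample soft occupancy at the cell centers of a res^3 grid over [lo, hi]^3."""
    step = (hi - lo) / res
    coords = [lo + (i + 0.5) * step for i in range(res)]
    return [[[soft_occupancy((x, y, z), gaussians) for z in coords]
             for y in coords] for x in coords]

# Hypothetical example: a single Gaussian at the origin.
gaussians = [((0.0, 0.0, 0.0), 0.5, 0.9)]
grid = occupancy_grid(gaussians, -1.0, 1.0, 4)
```

Because the mask varies smoothly with each center, scale, and opacity, it avoids the hard inside/outside threshold that would otherwise block gradients.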

Another significant thrust is improving reconstruction fidelity and handling challenging conditions.

Wu et al. from Fudan University introduce 3D Skew-Normal Splatting, replacing symmetric Gaussians with Azzalini’s Skew-Normal distribution to better model asymmetric structures like sharp boundaries, achieving continuous shape interpolation and consistent PSNR gains. Addressing noisy initializations, Zhou et al. from Fudan University present Denoising-GS: Gaussian Splatting with Spatial-aware Denoising, which reformulates 3DGS optimization as a denoising process, leveraging momentum-biased stochastic exploration and uncertainty-based pruning for state-of-the-art novel view synthesis with fewer primitives. For ‘in-the-wild’ scenarios, Kang et al. from Sun Yat-sen University introduce HarmoGS: Robust 3D Gaussian Splatting in the Wild via Conflict-Aware Gradient Harmonization, which explicitly reconciles cross-view gradient conflicts from transient distractors and illumination variations, leading to more stable optimization. Yuan et al. from Nankai University tackle underwater reconstruction with 3D-UIR: 3D Gaussian for Underwater 3D Scene Reconstruction via Physics Based Appearance-Medium Decoupling, disentangling object appearance from water medium effects through explicit physics-based modeling and depth-guided optimization. Liang et al. from HKUST present TransmissiveGS: Residual-Guided Disentangled Gaussian Splatting for Transmissive Scene Reconstruction and Rendering, a dual-Gaussian representation that disentangles reflected and transmitted radiance, enabling high-fidelity rendering of scenes with glass.
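As a rough illustration of why a skew-normal kernel can model asymmetric structures, the sketch below evaluates Azzalini's skew-normal density in one dimension, 2/scale · φ(z) · Φ(alpha · z) with z = (x − loc)/scale. The paper works with 3D primitives, so this is only a simplified analogue; the function and parameter names (`skew_normal_pdf`, `loc`, `scale`, `alpha`) are illustrative assumptions, not the authors' code.

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def skew_normal_pdf(x, loc=0.0, scale=1.0, alpha=0.0):
    """Azzalini skew-normal density; alpha controls the asymmetry."""
    z = (x - loc) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(alpha * z)

# alpha = 0 recovers the symmetric Gaussian; alpha > 0 shifts mass to the right.
symmetric = skew_normal_pdf(1.0, alpha=0.0)
skewed_right = skew_normal_pdf(1.0, alpha=4.0)
skewed_left = skew_normal_pdf(-1.0, alpha=4.0)
```

Setting `alpha` to zero gives back the ordinary Gaussian, so the symmetric kernel is a special case and the shape can be interpolated continuously between the two.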

Efficiency, compression, and adaptability are also central to the latest innovations:

- Zouein et al. from Trinity College Dublin present Efficient Dense Matching for Enhanced Gaussian Splatting Using AV1 Motion Vectors, repurposing AV1 video codec motion vectors to generate dense point clouds for 3DGS initialization, reducing training time by 63%.
- Yang et al. from Zhejiang University introduce SparseOIT: Improving Order-Independent Transparency 3DGS via Active Set Method, achieving significant speedups by exploiting sparsity in Gaussian activeness for OIT-based 3DGS training.
- Deng et al. from Shanghai Jiao Tong University address redundancy in SLAM systems with Compact 3D Gaussian Splatting For Dense Visual SLAM, achieving a 226% rendering speed increase and 2.21× memory compression through voxel-anchored representation and residual codebook-based vector quantization.
- Wang et al. from The University of Hong Kong propose Z-Order Transformer for Feed-Forward Gaussian Splatting, organizing Gaussians into spatially coherent sequences for faster inference and fewer primitives.
- Zhao et al. from University of Science and Technology of China introduce 3DGS3: Joint Super Sampling and Frame Interpolation for Real-Time Large-Scale 3DGS Rendering, a post-rendering framework achieving 96 FPS at 4K resolution by jointly performing super sampling and frame interpolation.
- Li et al. from Qilu University of Technology introduce PD-4DGS: Progressive Decomposition of 4D Gaussian Splatting for Bandwidth-Adaptive Dynamic Scene Streaming, the first progressive compression and streaming framework for 4DGS, drastically reducing first-frame latency.
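The phrase "spatially coherent sequences" can be illustrated with a generic Z-order (Morton) curve: quantize each Gaussian center to integer cell coordinates, interleave the bits of the three axes, and sort by the resulting code so that neighbors in 3D tend to be neighbors in the sequence. The sketch below is a textbook Morton ordering under assumed bounds and bit depth. How the Z-Order Transformer paper actually quantizes and orders its primitives is not described here, and the names `morton3` and `z_order_sort` are hypothetical.

```python
def morton3(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three non-negative integers.

    Bit b of ix lands at position 3*b, iy at 3*b + 1, iz at 3*b + 2.
    """
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code

def z_order_sort(points, lo, hi, bits=10):
    """Sort 3D points by the Morton code of their quantized coordinates.

    Coordinates are assumed to lie in the cube [lo, hi]^3; values outside
    are clamped to the boundary cells.
    """
    cells = (1 << bits) - 1

    def key(p):
        q = [min(cells, max(0, int((c - lo) / (hi - lo) * cells))) for c in p]
        return morton3(*q, bits=bits)

    return sorted(points, key=key)

# Hypothetical Gaussian centers inside the unit cube.
centers = [(0.9, 0.9, 0.9), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
ordered = z_order_sort(centers, 0.0, 1.0)
```

A sequence model then sees a 1D ordering in which nearby tokens usually correspond to nearby Gaussians, which is the property that motivates this kind of serialization.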

Finally, specialized applications and intelligent scene understanding are flourishing:

- Qureshi et al. from University of Maryland, College Park, propose PanoPlane: Plane-Aware Panoramic Completion for Sparse-View Indoor 3D Gaussian Splatting, using layout-anchored attention steering to reconstruct closed room geometry from sparse views.
- Zapała et al. from Wrocław University of Science and Technology introduce FaceParts: Segmentation and Editing of Gaussian Splatting Avatars for unsupervised segmentation and editing of 3DGS avatars.
- Budimir et al. from University of Zagreb present Sparse Code Uplifting for Efficient 3D Language Gaussian Splatting, achieving a 400x training speedup for open-vocabulary 3D understanding.
- Zhou et al. from The University of Sydney introduce PoseCompass: Intelligent Synthetic Pose Selection for Visual Localization, using a value-based ranking mechanism for efficient data augmentation in 3DGS-based absolute pose regression.
- Cao et al. from University of Electronic Science and Technology of China address query contamination in sparse-view reconstruction with GeoQuery: Geometry-Query Diffusion for Sparse-View Reconstruction, using geometry-indexed proxy queries.
- Li et al. from Harbin Institute of Technology introduce PairDropGS: Paired Dropout-Induced Consistency Regularization for Sparse-View Gaussian Splatting for improved training stability and reconstruction quality.
- Chi et al. from Xiaomi EV present PointForward: Feedforward Driving Reconstruction through Point-Aligned Representations, replacing per-pixel Gaussian prediction with sparse 3D queries for explicit cross-view consistency in driving scenes.
- Song et al. from Beijing Jiaotong University propose PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting, which uses 3DGS as an intermediate representation to distill 2D semantics from SAM for point cloud segmentation.
- Nguyen et al. from QUT introduce Ilov3Splat: Instance-Level Open-Vocabulary 3D Scene Understanding in Gaussian Splatting for joint geometry and semantic optimization.
- Xu et al. from The University of Sydney bring Aes3D: Aesthetic Assessment in 3D Gaussian Splatting, the first framework for 3DGS scene aesthetic evaluation.
- Lee et al. from Sungkyunkwan University address feature bias in localization with Disambiguating 2D-3D Correspondences in Gaussian Splatting-based Feature Fields for Visual Localization.
- Wang et al. from Harbin Institute of Technology propose SplatWeaver: Learning to Allocate Gaussian Primitives for Generalizable Novel View Synthesis for adaptive Gaussian allocation based on scene complexity.
- Galappaththige et al. from QUT Centre for Robotics introduce GS-DIFF: From Pixels to Primitives: Scene Change Detection in 3D Gaussian Splatting, the first method to detect scene changes directly in 3DGS primitive space.
- Chen et al. from Southwest Jiaotong University present SatSurfGS: Generalizable 2D Gaussian Splatting for Sparse-View Satellite Surface Reconstruction for satellite imagery.
- Yang et al. from Arizona State University introduce AdpSplit: Error-Driven Adaptive Splitting for Faster Geometry Discovery in 3D Gaussian Splatting, an error-driven adaptive split operator to accelerate training.
- Li et al. from Technical University of Munich present OpenGaFF: Open-Vocabulary Gaussian Feature Field with Codebook Attention for spatially consistent open-vocabulary 3D scene understanding.
- Chou et al. from Cornell University introduce KFC-W: Generating 3D-Consistent Videos from Unposed Internet Photos, which leverages 3DGS to generate 3D-consistent videos from unposed photos.
- Lee et al. from POSTECH propose RoDyGS: Robust Dynamic Gaussian Splatting for Casual Videos for 4D reconstruction from monocular videos, alongside the Kubric-MRig benchmark.
- Lee et al. from Seoul National University provide a systematic analysis of 4DGS with FreeTimeGS++: Secrets of Dynamic Gaussian Splatting and Their Principles, developing gated marginalization and neural velocity fields for stability.
- Jia et al. from KTH Royal Institute of Technology introduce Forecast-aware Gaussian Splatting for Predictive 3D Representation in Language-Guided Pick-and-Place Manipulation (Forecast-GS), augmenting 3DGS with task-conditioned final-state prediction for robotics.
- Wang et al. from Peking University introduce VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models, which aligns VLA visual encoder outputs with 3D-aware features derived from 3DGS.
- Kim et al. from KAIST introduce DySurface: Consistent 4D Surface Reconstruction via Bridging Explicit Gaussians and Implicit Functions for temporally consistent 4D mesh reconstruction using explicit Gaussians and implicit Signed Distance Functions.
- Deng et al. from Beijing Institute of Technology propose PaMoSplat: Part-Aware Motion-Guided Gaussian Splatting for Dynamic Scene Reconstruction, lifting 2D segmentation masks into 3D for part-aware motion.
- Xing et al. from Ke Holdings Inc. introduce AdaptSplat: Adapting Vision Foundation Models for Feed-Forward 3D Gaussian Splatting with a lightweight Frequency-Preserving Adapter.
- Jia et al. from Hefei University of Technology present SDTalk: Structured Facial Priors and Dual-Branch Motion Fields for Generalizable Gaussian Talking Head Synthesis for one-shot talking head synthesis.
- Song et al. from University of California, Los Angeles, introduce ConFixGS: Learning to Fix Feedforward 3D Gaussian Splatting with Confidence-Aware Diffusion Priors in Driving Scenes, which repairs feedforward 3DGS using confidence-aware diffusion priors.
- Mazzucchelli et al. from Arquimea Research Center propose BEA-GS: BEyond RAdiance Supervision in 3DGS for Precise Object Extraction for improved semantic object extraction using boundary and occupancy losses.
- Artioli et al. from Alpen-Adria-Universität present Thin-Client Interactive Gaussian Adaptive Streaming over HTTP/3 (TIGAS) for remote 3DGS rendering and streaming.
- Vaara et al. from University of Oulu introduce Differentiable Ray Tracing with Gaussians for Unified Radio Propagation Simulation and View Synthesis, bridging the computer vision and wireless communication domains.
- Li et al. from Beijing Normal-Hong Kong Baptist University introduce QuadBox: Accelerating 3D Gaussian Splatting with Geometry-Aware Boxes, using four adaptive axis-aligned bounding boxes to accelerate rendering.
- Gu et al. from Wuhan University propose ULF-Loc: Unbiased Landmark Feature for Robust Visual Localization with 3D Gaussian Splatting, which uses unbiased landmark features for robust visual localization.
- Wang et al. from Institute of Computing Technology, Chinese Academy of Sciences present Ground4D: Spatially-Grounded Feedforward 4D Reconstruction for Unstructured Off-Road Scenes, a pose-free framework for 4D reconstruction in off-road scenes.
- de Lutio et al. from NVIDIA introduce ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models, a two-stage pipeline combining 3DGS with auto-regressive video diffusion models to enhance imperfect reconstructions.
- Sunil et al. from IIT Palakkad introduce High-Fidelity Surface Splatting-Based 3D Reconstruction from Multi-View Images, improving IMLS Splatting with a novel compact polynomial kernel.

Under the Hood: Models, Datasets, & Benchmarks

The innovations above are built upon and validated by a rich ecosystem of models, datasets, and benchmarks:

Key Models & Backbones:
- DINOv2 / DINOv3-ConvNeXt / DINOv2-FiT3D: Extensively used as powerful vision foundation models for feature extraction, semantic consistency, and spatial awareness, especially in AdaptSplat, BEA-GS, and VEGA.
- FLAME parametric face model: Critical for providing structured facial priors in avatar-related tasks like FaceParts and SDTalk.
- SAM (Segment Anything Model) / SAM2: Leveraged for 2D semantic masking and instance-level understanding, then distilled into 3D Gaussians by PointGS, Ilov3Splat and HarmoGS.
- CLIP / OpenCLIP: Provides robust language features for open-vocabulary tasks in Sparse Code Uplifting, OpenGaFF, and Forecast-GS.
- VGGT (Visual Geometry Grounded Transformer): Used for robust pose estimation in challenging scenarios like motion-blurred scenes in AsyncEvGS and as a backbone for 4D reconstruction in Ground4D.
- Diffusion Models (e.g., Wan 2.1 T2V-14B): Integrated for generative scene completion and enhancement in PanoPlane, VidSplat, ArtiFixer and ConFixGS.

Datasets & Benchmarks:
- Mip-NeRF 360, Tanks & Temples, Deep Blending: Standard benchmarks for novel view synthesis and 3D reconstruction quality, appearing across almost all papers, e.g., 3D Skew-Normal Splatting, Denoising-GS, SparseOIT, 3DGS3.
- Waymo Open Dataset, nuScenes, KITTI, CARLA: Essential for autonomous driving scene reconstruction and simulation in Real2Sim, PointForward, and ConFixGS.
- Replica, ScanNet, Matterport3D: Indoor scene datasets for applications like panoramic completion and SLAM, seen in PanoPlane and Compact 3D Gaussian Splatting for Dense Visual SLAM.
- New Benchmarks:
  - Aesthetic3D Dataset: Introduced by Aes3D for 3D scene aesthetic assessment.
  - Kubric-MRig: A comprehensive benchmark for dynamic novel view synthesis, proposed by RoDyGS.
  - ReplicaMultiagent Plus: Provided by MAGS-SLAM for multi-agent SLAM.
  - AsyncEv-Deblur Dataset: For RGB-Event 3D reconstruction from motion-blurred scenes, introduced by AsyncEvGS.

Code Availability: Many papers promise future public code releases, and some already provide active links.

Impact & The Road Ahead

These advancements signify a profound shift in how we perceive and interact with 3D digital content. The ability to reconstruct, simulate, and edit physically plausible 3D scenes in real-time opens doors for transformative applications:

- Autonomous Driving: Real2Sim and PointForward are creating hyper-realistic, physics-aware synthetic data for training and testing autonomous vehicles, drastically reducing the costs and dangers of real-world data collection, especially for corner cases. ConFixGS specifically fixes feedforward 3DGS issues in driving scenes, making predictions more reliable.
- Robotics & Manipulation: Forecast-GS and VEGA integrate language-guided reasoning with 3DGS, enabling robots to understand and execute complex manipulation tasks based on human instructions, with explicit 3D state prediction and spatial awareness.
- AR/VR and Digital Twins: Real-time streaming solutions like TIGAS and progressive loading in PD-4DGS make high-fidelity 3D content accessible on lightweight devices, fueling interactive AR/VR experiences and scalable digital twin deployments. The use of Differentiable Ray Tracing with Gaussians extends digital twins to wireless communication, enabling physically accurate radio propagation simulations within visually realistic scenes.
- Content Creation & Editing: FaceParts and SDTalk empower creators with intuitive tools for 3D avatar editing and realistic talking head synthesis, while PaMoSplat opens up possibilities for part-aware dynamic scene manipulation. ArtiFixer demonstrates how to enhance and extend imperfect 3D reconstructions using generative models.
- Environmental Monitoring & Remote Sensing: SatSurfGS unlocks precise 3D surface reconstruction from sparse satellite imagery, invaluable for mapping, urban planning, and environmental analysis. GS-DIFF introduces a novel approach for detecting scene changes, critical for monitoring dynamic environments.

The road ahead for Gaussian Splatting is incredibly exciting. Expect continued innovations in handling extreme dynamic scenes, further integration with large language and vision models for more intelligent 3D understanding, and even more efficient compression and streaming techniques. The community is actively exploring how to bridge the gap between explicit Gaussian representations and implicit functions for robust surface extraction, as exemplified by DySurface, and how to achieve full physics-based interactions. The foundational work being laid down today is paving the way for a future where photorealistic, interactive, and intelligent 3D content is ubiquitous across all domains.
