Time Series Forecasting: Navigating Non-Stationarity, Enhancing Interpretability, and Scaling with LLMs
Latest 50 papers on time series forecasting: Dec. 21, 2025
Time series forecasting is at the forefront of AI/ML research, driven by its critical role in everything from economic predictions and smart city management to personalized healthcare. The inherent complexities of temporal data—non-stationarity, intricate inter-dependencies, and the ever-present challenge of ‘black box’ models—present formidable hurdles. However, recent breakthroughs, as illuminated by a collection of cutting-edge papers, are revolutionizing how we approach these challenges, pushing the boundaries of accuracy, efficiency, and interpretability.
The Big Idea(s) & Core Innovations
The overarching theme in recent research is the quest for more robust, adaptive, and understandable forecasting models. A major push is seen in addressing non-stationarity and concept drift. Researchers from Shanghai Jiao Tong University, Lifan Zhao and Yanyan Shen, in their paper “Proactive Model Adaptation Against Concept Drift for Online Time Series Forecasting”, introduce Proceed, a framework that proactively adapts models by estimating and translating concept drift into parameter adjustments. This directly tackles the feedback delay issue inherent in online forecasting. Similarly, Mert Sonmezer and Seyda Ertekin from Middle East Technical University present CANet in “CANet: ChronoAdaptive Network for Enhanced Long-Term Time Series Forecasting under Non-Stationarity”, which uses a Non-stationary Adaptive Normalization module to preserve temporal dynamics while adapting to statistical changes, effectively combating over-stationarization.
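Neither paper's code is reproduced in this post, but the shared idea of adapting to shifting statistics can be sketched in a few lines: normalize each input window by its own mean and variance, forecast in the normalized space, and invert the transform afterwards, with learnable affine parameters absorbing residual shift. The module below is a minimal, hypothetical illustration of that pattern, not the Proceed or CANet implementation.

```python
import torch
import torch.nn as nn

class AdaptiveInstanceNorm(nn.Module):
    """Minimal sketch of per-window (instance) normalization for forecasting.

    An illustrative approximation of the idea behind adaptive normalization
    modules such as CANet's, not the authors' code.
    """
    def __init__(self, num_channels: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Learnable affine parameters let the model adjust to distribution shift.
        self.gamma = nn.Parameter(torch.ones(num_channels))
        self.beta = nn.Parameter(torch.zeros(num_channels))

    def normalize(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, channels); statistics are computed per window.
        self.mean = x.mean(dim=1, keepdim=True)
        self.std = x.std(dim=1, keepdim=True) + self.eps
        return (x - self.mean) / self.std * self.gamma + self.beta

    def denormalize(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, horizon, channels); restore the original scale.
        return (y - self.beta) / self.gamma * self.std + self.mean
```

In use, the forecasting backbone sits between `normalize` and `denormalize`, so its inputs stay roughly stationary even when the raw series drifts.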
The integration of Large Language Models (LLMs) is another transformative trend. “Conversational Time Series Foundation Models: Towards Explainable and Effective Forecasting” by Defu Cao and collaborators from the University of Southern California and Amazon AWS, introduces TSOrchestr. This innovative framework uses LLMs as intelligent judges to coordinate ensembles of time series models, combining interpretability with numerical precision through SHAP-based finetuning. Expanding on this, “FiCoTS: Fine-to-Coarse LLM-Enhanced Hierarchical Cross-Modality Interaction for Time Series Forecasting” by Yafei Lyu et al. proposes a fine-to-coarse LLM-enhanced hierarchical cross-modality interaction framework, leveraging LLMs to filter noise and align text tokens with time series patches. Furthermore, “STELLA: Guiding Large Language Models for Time Series Forecasting with Semantic Abstractions” by J. Fan et al. and “Can Slow-Thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting” by Mingyue Cheng et al. explore how LLMs, especially ‘slow-thinking’ ones, can perform time series forecasting as a conditional reasoning task by integrating structured semantic information or multi-step reasoning. This not only enhances performance but also brings a new level of interpretability to a historically opaque field.
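As a rough illustration of forecasting framed as conditional reasoning, the helper below serializes a numeric history and a semantic hint into a reasoning-style prompt. The template, function name, and context string are illustrative assumptions, not the prompts used by STELLA or the slow-thinking study.

```python
def build_forecast_prompt(history, horizon, context=""):
    """Serialize a univariate history into a reasoning-style forecasting prompt.

    Hypothetical sketch only: the exact templates used by the cited papers
    are not reproduced here.
    """
    values = ", ".join(f"{v:.2f}" for v in history)
    return (
        f"Context: {context}\n"
        f"Observed series (oldest to newest): {values}\n"
        f"Think step by step about trend, seasonality, and recent shifts, "
        f"then output the next {horizon} values as a comma-separated list."
    )

prompt = build_forecast_prompt(
    [101.2, 103.5, 99.8, 104.1], horizon=3,
    context="daily electricity demand, winter week",
)
# The prompt would then be sent to an LLM; parsing the reply back into
# floats is omitted for brevity.
```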
Efficiency and scalability for long-term forecasting are also paramount. “DPWMixer: Dual-Path Wavelet Mixer for Long-Term Time Series Forecasting” by Qianyang Li and colleagues from Xi’an Jiaotong University introduces a Haar wavelet decomposition and dual-path modeling to efficiently disentangle trends and details with linear time complexity. Qingyuan Yang et al. from Northeastern University tackle this with “FRWKV: Frequency-Domain Linear Attention for Long-Term Time Series Forecasting”, achieving linear complexity by combining frequency-domain analysis with linear attention, outperforming traditional Transformers at long horizons. For resource-constrained scenarios, “TimeDistill: Efficient Long-Term Time Series Forecasting with MLP via Cross-Architecture Distillation” by Juntong Ni et al. from Emory University shows that knowledge distillation can enable lightweight MLP models to surpass even their complex teacher models in accuracy and efficiency. Even simpler models are making a comeback, as Ruslan Gokhman (Yeshiva University) reveals in “UrbanAI 2025 Challenge: Linear vs Transformer Models for Long-Horizon Exogenous Temperature Forecasting”, demonstrating that well-designed linear models can outperform complex Transformer-family architectures in long-horizon exogenous-only temperature forecasting.
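The dual-path mixer itself is beyond a short snippet, but the Haar decomposition that DPWMixer builds on is straightforward: pairwise averages form a coarse trend path and pairwise differences form a detail path, both computed in linear time. The sketch below assumes an even-length univariate window and is illustrative rather than the authors' code.

```python
import numpy as np

def haar_decompose(x: np.ndarray):
    """One level of the Haar wavelet transform along the time axis.

    x: array of shape (length,) with even length.
    Returns (approximation, detail), each of length len(x) // 2.
    Linear time: every sample is touched exactly once.
    """
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)   # coarse trend path
    detail = (even - odd) / np.sqrt(2)   # high-frequency detail path
    return approx, detail

def haar_reconstruct(approx: np.ndarray, detail: np.ndarray) -> np.ndarray:
    """Exact inverse of one Haar level."""
    even = (approx + detail) / np.sqrt(2)
    odd = (approx - detail) / np.sqrt(2)
    out = np.empty(even.size + odd.size)
    out[0::2], out[1::2] = even, odd
    return out

x = np.array([3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0])
a, d = haar_decompose(x)
assert np.allclose(haar_reconstruct(a, d), x)
```

Stacking this step recursively yields the wavelet pyramid, with each path small enough to be modeled by lightweight mixers.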
Multimodal and domain-specific challenges are also being addressed. “Adaptive Information Routing for Multimodal Time Series Forecasting” by Jun Seo et al. from LG AI Research introduces AIR, a framework that dynamically integrates textual information refined by LLMs into time series models. For specific applications, “Cross-Sample Augmented Test-Time Adaptation for Personalized Intraoperative Hypotension Prediction” by Kanxue Li et al. from Wuhan University proposes CSA-TTA to improve personalized intraoperative hypotension prediction by leveraging cross-sample augmentation, addressing rare event challenges in medical data.
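The exact CSA-TTA procedure is not reproduced here, but cross-sample augmented test-time adaptation can be caricatured as: mix the incoming window with similar reference windows, then nudge the model toward consistent predictions across those mixtures. Everything in the sketch below (function names, the mixing scheme, the consistency loss) is a hypothetical stand-in, not the paper's method.

```python
import torch
import torch.nn.functional as F

def cross_sample_augment(x, bank, k=4, alpha=0.2):
    """Mix a test window with its nearest neighbours from a reference bank.

    x: (length, channels) test window; bank: (n, length, channels).
    Returns k augmented copies of x. Hypothetical sketch, not CSA-TTA code.
    """
    dists = (bank - x).pow(2).sum(dim=(1, 2))
    neighbours = bank[dists.topk(k, largest=False).indices]
    lam = torch.rand(k, 1, 1) * alpha
    return (1 - lam) * x + lam * neighbours

def adapt_on_sample(model, x, bank, steps=1, lr=1e-4):
    """One-sample test-time adaptation with a prediction-consistency loss."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        aug = cross_sample_augment(x, bank)
        preds = model(aug)                                   # (k, horizon, channels)
        target = preds.mean(dim=0, keepdim=True).detach().expand_as(preds)
        loss = F.mse_loss(preds, target)                     # agree across mixtures
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model(x.unsqueeze(0))
```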
Under the Hood: Models, Datasets, & Benchmarks
The recent advancements highlight a shift towards hybrid architectures, leveraging the strengths of different modeling paradigms, alongside an increased focus on robust data handling and evaluation.
- Hybrid LLM-based Orchestration: TSOrchestr and FiCoTS showcase LLMs not just as text generators but as reasoning engines capable of guiding complex forecasting tasks. TSOrchestr is finetuned in R1 style on the GIFT-Eval benchmark, guided by SHAP-based faithfulness scores. FiCoTS leverages dynamic heterogeneous graphs for cross-modality alignment.
- Novel Architectures: CANet introduces a Non-stationary Adaptive Normalization module. DPWMixer uses a Haar Wavelet Pyramid and Dual-Path Trend Mixer. DB2-TransF by Moulik Gupta and Achyut Mani Tripathi (G B Pant, Indian Institute of Technology) replaces self-attention with learnable Daubechies wavelets for efficiency and accuracy. PeriodNet introduces period attention and iterative grouping. Sonnet, by Yuxuan Shu and Vasileios Lampos (University College London), uses spectral analysis and wavelet transforms with Multivariable Coherence Attention (MVCA) and Koopman dynamics. Higher-Order Transformers (HOT) by Soroush Omranpour et al. (Mila, McGill University) leverage Kronecker factorization for efficient multiway tensor data modeling. AdaMamba by MinCheol Jeon (Kyung Hee University) introduces adaptive normalization with a hybrid Patch–Mamba–MoE temporal encoder, achieving state-of-the-art results on benchmarks such as ETTh1/2, ETTm1/2, and Weather.
- Specialized Models: UniDiff is a unified diffusion framework for multimodal time series forecasting, using generative capabilities for diverse predictions. SimDiff, by Hang Ding et al. from Shanghai Jiao Tong University and Alibaba Group, is a simpler, end-to-end diffusion model using Normalization Independence and a Median-of-Means estimator for precise point forecasting. TARFVAE by Jiawen Wei et al. (Meituan) combines Transformer-based autoregressive flow with VAEs for efficient one-step generative forecasting. The Clustered Echo State Network (CESN) introduced by S.H. et al. (https://arxiv.org/pdf/2512.08963) improves multivariate time series prediction by organizing reservoir nodes into modular clusters, showing superior accuracy on datasets such as Indian NSE stock market prices and solar wind data.
- Evaluation & Robustness: The paper “Channel Dependence, Limited Lookback Windows, and the Simplicity of Datasets: How Biased is Time Series Forecasting?” by Ibram Abdelmalak et al. (University of Hildesheim) highlights the critical role of lookback window tuning and dataset complexity. “Hidden Leaks in Time Series Forecasting: How Data Leakage Affects LSTM Evaluation Across Configurations and Validation Strategies” by S. Albelali and M. Ahmed warns against 10-fold cross-validation’s susceptibility to data leakage. Co-TSFA by Joel Ekstrand et al. (Halmstad University) improves robustness under anomalous conditions using contrastive regularization.
- Resource Access: Many papers provide public codebases for further exploration: CSA-TTA (https://github.com/kanxueli/CSA-TTA), Proceed (https://github.com/SJTU-DMTai/OnlineTSF), SEED (https://github.com/saber1360/SEED), CANet (https://github.com/mertsonmezer/CANet), PIR (https://github.com/icantnamemyself/PIR), DB2-TransF (https://github.com/SteadySurfdom/DB2-TransF), Forecaster (https://github.com/MullenLab/Forecaster), TimeReasoner (https://github.com/MingyueCheng/TimeReasoner), Sonnet (https://github.com/ClaudiaShu/Sonnet), FRWKV (https://github.com/yangqingyuan-byte/FRWKV), UniDiff (https://github.com/UniDiff), IdealTSF (https://github.com/LuckyLJH/IdealTSF), TS-RAG (https://github.com/UConn-DSIS/TS-RAG), DPWMixer (https://github.com/hit636/DPWMixer), AutoHFormer (https://github.com/CoderPowerBeyond/AutoHFormer), Stateful Replay (https://github.com/wenzhangdu/stateful-replay), SimDiff (https://github.com/Dear-Sloth/SimDiff/tree/main), CDF-Forecasts-with-DLNs (https://github.com/Coopez/CDF-Forecasts-with-DLNs), Higher-Order Transformers (https://github.com/s-omranpour/HOT), and TARFVAE (https://github.com/Gavine77/TARFVAE).
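On the evaluation point above, a leakage-free alternative to shuffled k-fold validation is an expanding-window (walk-forward) split in which every test timestamp strictly follows its training data. The sketch below is a generic illustration with arbitrary fold sizes, not the protocol of any cited paper.

```python
import numpy as np

def walk_forward_splits(n_samples, n_folds=5, horizon=24):
    """Expanding-window (walk-forward) splits that never train on future data.

    A minimal illustration of a leakage-free alternative to shuffled k-fold
    cross-validation; fold sizes here are arbitrary.
    """
    fold_size = (n_samples - horizon) // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        test_end = min(train_end + horizon, n_samples)
        yield np.arange(0, train_end), np.arange(train_end, test_end)

for train_idx, test_idx in walk_forward_splits(1000, n_folds=3, horizon=48):
    # Unlike shuffled folds, every test index follows every train index.
    assert train_idx.max() < test_idx.min()
```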
Impact & The Road Ahead
These advancements herald a new era for time series forecasting, making it more robust, interpretable, and efficient across diverse domains. From critical applications like personalized medicine, where CSA-TTA and guideline-based LLMs for sepsis prediction by Michael Staniek et al. (Heidelberg University, Google DeepMind) offer life-saving potential, to optimizing energy grids with causal feature selection in residential load forecasting, the real-world impact is immense. In finance, the re-evaluation of Time Series Foundation Models (TSFMs) by Eghbal Rahimikia et al. (University of Manchester, UCL) emphasizes domain-specific pre-training, paving the way for more accurate financial predictions. The accessibility provided by platforms like Forecaster by Aaron D. Mullen et al. (University of Kentucky) will democratize advanced forecasting, enabling clinicians and domain experts to leverage these tools without extensive technical expertise. Furthermore, techniques like speculative decoding for accelerating TSFMs, introduced by Pranav Subbaraman et al. (UCLA), promise to make large models viable for latency-sensitive applications.
The road ahead involves further refinement of LLM integration, exploring more sophisticated hybrid architectures, and developing universally robust evaluation protocols. The challenge of balancing model complexity with interpretability and efficiency remains, but with these groundbreaking developments, we are closer than ever to truly intelligent and trustworthy time series forecasting systems.