In-Context Learning: Revolutionizing AI Adaptation and Generalization
Latest 50 papers on in-context learning: Nov. 16, 2025
In-context learning (ICL) has rapidly emerged as a cornerstone of modern AI, allowing large language models (LLMs) and other foundation models to adapt to new tasks and generalize to unseen data without requiring extensive fine-tuning. This paradigm shift empowers models to learn from a few examples provided in the prompt, making them incredibly versatile. Recent research showcases a burgeoning landscape of innovation, addressing fundamental questions of how ICL works, how to optimize its application, and how to extend its power to diverse domains, from personalized content detection to scientific discovery and even software engineering.
The Big Idea(s) & Core Innovations
At its heart, the latest breakthroughs in ICL revolve around refining how models leverage contextual information to learn and generalize. A critical area of focus is optimizing example selection and arrangement. Challenging the prevailing notion that example selection solely dictates ICL performance, research from UC San Diego and Cushing Academy, Boston in their paper, “Order Matters: Rethinking Prompt Construction in In-Context Learning”, reveals that the order of examples in prompts has a comparable impact. This calls for a re-evaluation of prompt engineering, emphasizing the synergy between what examples are chosen and how they are presented. Building on this, the “Efficient and Effective In-context Demonstration Selection with Coreset” paper by researchers including Zihua Wang and Yu Zhang from Southeast University and Alibaba Group introduces CoDR (Coreset-based Dual Retrieval), a novel framework that uses cluster-pruning to construct diverse coresets and a dual retrieval mechanism to enhance both efficiency and effectiveness in demonstration selection, especially in complex multimodal scenarios.
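These two levers can be made concrete in a few lines. The sketch below pairs a greedy diversity-based demonstration selector (a much-simplified stand-in for a cluster-pruned coreset, not CoDR's actual algorithm) with a prompt builder that makes the ordering of the chosen demonstrations explicit; the farthest-point heuristic and all names here are illustrative assumptions.

```python
import numpy as np
from itertools import permutations

def select_diverse_demos(embeddings, k):
    """Greedy farthest-point selection: a rough, hypothetical stand-in
    for coreset-style diverse demonstration selection."""
    chosen = [0]
    while len(chosen) < k:
        # distance of each candidate to its nearest already-chosen demo
        d = np.min(
            np.linalg.norm(embeddings[:, None] - embeddings[chosen], axis=-1),
            axis=1,
        )
        chosen.append(int(np.argmax(d)))  # pick the most distant candidate
    return chosen

def build_prompt(demos, query, order):
    """The same demonstrations in a different order yield a different
    prompt -- the variable that 'Order Matters' shows can rival selection."""
    body = "\n".join(f"Input: {demos[i][0]}\nOutput: {demos[i][1]}" for i in order)
    return f"{body}\nInput: {query}\nOutput:"

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 8))            # toy demonstration embeddings
idx = select_diverse_demos(emb, 3)
demos = [(f"x{i}", f"y{i}") for i in range(10)]
prompts = {build_prompt(demos, "x_new", p) for p in permutations(idx)}
print(len(prompts))  # 6 distinct prompts from the same 3 demonstrations
```

The six prompts are identical in content, so any performance difference among them is attributable purely to ordering.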
Another significant theme is extending ICL’s capabilities to new modalities and problem types. The “Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm” from Fudan University and other institutions proposes ‘Thinking with Video,’ showcasing how video generation models like Sora-2 can bridge visual and textual understanding for multimodal reasoning, leveraging ICL and self-consistency. Meanwhile, Shenggan from the University of California, Berkeley, in “In-Context Adaptation of VLMs for Few-Shot Cell Detection in Optical Microscopy”, adapts Vision-Language Models (VLMs) for few-shot cell detection in biomedical imaging, demonstrating the power of multimodal ICL in low-data regimes. For quality assessment in manufacturing, Zhang, Li, and Wang from University of California, Stanford, and MIT in “In-Context-Learning-Assisted Quality Assessment Vision-Language Models for Metal Additive Manufacturing” introduce ICL with VLMs for quality assessment in metal additive manufacturing, showing its superiority over traditional ML methods.
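Multimodal few-shot prompting of this kind typically interleaves image references with their annotations before the query image. The message schema below is a hedged illustration of that structure only; it is not the format used by any of these papers or by any specific VLM API.

```python
# A sketch of assembling a few-shot prompt for a VLM: each demonstration
# pairs an image reference with its annotation, and the query image comes
# last. The message schema and field names are illustrative assumptions.
def fewshot_vlm_messages(demos, query_image, instruction):
    messages = [{"role": "system", "content": instruction}]
    for image, annotation in demos:
        messages.append({"role": "user",
                         "content": [{"type": "image", "image": image}]})
        messages.append({"role": "assistant", "content": annotation})
    messages.append({"role": "user",
                     "content": [{"type": "image", "image": query_image}]})
    return messages

demos = [("slide_01.png", "3 cells at (12,40), (55,62), (80,21)"),
         ("slide_02.png", "no cells detected")]
msgs = fewshot_vlm_messages(demos, "slide_query.png",
                            "Count and localize cells in each microscopy image.")
print(len(msgs))  # 1 system + 2 per demonstration + 1 query = 6
```

In low-data regimes the whole "training set" is these few interleaved turns, which is what makes the approach attractive for biomedical imaging.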
Beyond just language, ICL is proving revolutionary for structured data. The “Generalization Can Emerge in Tabular Foundation Models From a Single Table” paper by researchers from University of Toronto, Polytechnique Montréal, and others challenges the necessity of massive datasets, showing that generalization in tabular ICL can emerge from a single real-world table via self-supervised pre-training, emphasizing feature count and task diversity over sheer data volume. Further innovating for tabular data, “Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning” from Lexsi Labs introduces a novel tabular foundation model with multi-scale sparse attention and cross-component memory, significantly boosting performance on high-dimensional tabular data.
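The generic recipe behind tabular ICL can be sketched by serializing table rows into few-shot demonstrations. The toy table, column names, and `feature = value -> label` template below are illustrative assumptions, not the serialization or architecture used by either paper (Orion-MSP, for instance, operates on tabular inputs directly rather than on text).

```python
# A minimal sketch of turning a single table into an in-context prompt.
# All column names and the template are hypothetical.
rows = [
    {"age": 34, "income": 52000, "label": "approved"},
    {"age": 21, "income": 18000, "label": "denied"},
    {"age": 45, "income": 87000, "label": "approved"},
]
query = {"age": 29, "income": 61000}

def serialize(row, features):
    """Render one row as 'feature = value' pairs."""
    return ", ".join(f"{k} = {row[k]}" for k in features)

features = ["age", "income"]
demos = "\n".join(f"{serialize(r, features)} -> {r['label']}" for r in rows)
prompt = f"{demos}\n{serialize(query, features)} -> "
print(prompt)
```

The single-table finding suggests that what matters in such setups is the diversity of features and tasks the model has seen, not the raw number of tables.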
Fundamental theoretical insights are also reshaping our understanding of ICL. Sushant Mehta and Ishan Gupta in “Scaling Laws and In-Context Learning: A Unified Theoretical Framework” establish power-law scaling relationships for ICL performance and prove that transformers implement gradient descent during forward passes. Furthermore, “Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding” by Qian Ma, Ruoxiang Xu, and Yongqiang Cai from Beijing Normal University highlights the critical role of positional encoding in enabling transformers to achieve the Universal Approximation Property (UAP) in ICL tasks.
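A toy numerical check conveys the flavor of the gradient-descent claim: with keys set to the in-context inputs, values to the targets, and the query scaled by the learning rate, a softmax-free (linear) attention readout matches one explicit gradient step on in-context least squares. This follows the standard linear-attention construction from the mechanistic ICL literature, not necessarily the exact framework of Mehta and Gupta.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, eta = 4, 16, 0.1
X = rng.normal(size=(n, d))        # in-context inputs x_i
w_true = rng.normal(size=d)
y = X @ w_true                     # in-context targets y_i
x_q = rng.normal(size=d)           # query token

# One explicit gradient step on L(w) = 0.5 * sum((X w - y)^2), from w = 0
w0 = np.zeros(d)
w1 = w0 - eta * (X.T @ (X @ w0 - y))
pred_gd = w1 @ x_q

# Linear (softmax-free) attention: keys = x_i, values = y_i,
# query = eta * x_q. Its readout is sum_i y_i * <x_i, eta * x_q>.
pred_attn = (y * (X @ (eta * x_q))).sum()

print(np.allclose(pred_gd, pred_attn))  # True
```

Both expressions reduce to `eta * sum_i y_i <x_i, x_q>`, which is why a single forward pass through such a layer behaves like a gradient-descent update on the in-context loss.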
Under the Hood: Models, Datasets, & Benchmarks
The advancement of in-context learning is inextricably linked to the development and rigorous testing of new models, specialized datasets, and challenging benchmarks:
- Heuristic Transformer (HT): Introduced by Oliver Dippel, Alexei Lisitsa, and Bei Peng from the Universities of Liverpool and Sheffield in “Heuristic Transformer: Belief Augmented In-Context Reinforcement Learning”, HT is an ICRL framework enhanced by belief-based reward modeling using a variational auto-encoder (VAE). It demonstrates superior performance across environments like Darkroom, Miniworld, and MuJoCo.
- Do-PFN: A foundation model for causal effect estimation from observational data using ICL, proposed by Jake Robertson et al. from Prior Labs and ELLIS Institute Tübingen in “Do-PFN: In-Context Learning for Causal Effect Estimation”. It’s evaluated on over 1,000 synthetic datasets and the RealCause benchmark. Code: https://github.com/jr2021/Do-PFN
- LG-DUMAP: Presented by Sai Puppala et al. from the University of Texas at El Paso and Southern Illinois University Carbondale in “LLM-Guided Dynamic-UMAP for Personalized Federated Graph Learning”, this framework enhances personalized federated graph learning using LLMs, prompt tuning, and variational methods for low-resource settings. It integrates a parametric UMAP-style manifold objective.
- PtychoBench: A novel multi-modal, multi-task benchmark dataset for X-ray ptychographic analysis, introduced by Robinson Umeike et al. from The University of Alabama and Argonne National Laboratory in “Adapting General-Purpose Foundation Models for X-ray Ptychography in Low-Data Regimes”. It compares SFT and ICL strategies using models like GPT-4o and DINOv3-based classifiers.
- FirstAidQA: The first synthetic QA dataset tailored to first aid and emergency response, comprising 5,500 question-answer pairs, created by Saiyma Sittul Muna et al. from Islamic University of Technology, Dhaka in “FirstAidQA: A Synthetic Dataset for First Aid and Emergency Response in Low-Connectivity Settings”. Publicly available at https://huggingface.co/datasets/i-am-mushfiq/FirstAidQA.
- BHEPC: The first large-scale, high-quality Bhili-Hindi-English Parallel Corpus with 110,000 sentences, introduced by Pooja Singh et al. from IIT Delhi in “Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus”. It benchmarks multilingual models like mT5, Qwen3, DeepSeek-V3, Gemma-2-9B, and GPT series.
- TCSR-SQL & TCD Dataset: Wenbo Xu et al. from Harbin Institute of Technology (Shenzhen) in “TCSR-SQL: Towards Table Content-aware Text-to-SQL with Self-retrieval” introduce a self-retrieval text-to-SQL method and the TCD dataset with 2,115 question-SQL pairs to handle ambiguous data in content-aware text-to-SQL tasks.
- LATTLE: A framework by Ibna Kowsar et al. from Tennessee State University in “LLM Attention Transplant for Transfer Learning of Tabular Data Across Disparate Domains” that transplants selective attention weights from an LLM to a gated feature tokenized transformer (gFTT) for cross-domain tabular data transfer learning.
- DoPE: A novel method by Jing Xiong et al. from The University of Hong Kong in “DoPE: Denoising Rotary Position Embedding” that improves length extrapolation in Transformers by addressing attention sinks using truncated matrix entropy. This is a theoretical contribution that enhances existing models.
- HAPAX: A training regime introduced by Kerem Şahin et al. from Northeastern University in “In-Context Learning Without Copying”, which suppresses inductive copying in LLMs, yet maintains strong abstractive ICL performance. Code: https://hapax.baulab.info.
- FTS-OBP: A flexible evaluation method for ABSA, along with a study on Small Decoder-Only Models (SLMs) for low-resource domains, presented by Yan Cathy Hua et al. from the University of Auckland in “Data-Efficient Adaptation and a Novel Evaluation Method for Aspect-based Sentiment Analysis”. Code: https://github.com/yhua219/ftsobp_and_edurabsa_slm.
Impact & The Road Ahead
The impact of these advancements spans a wide array of AI applications. From enhancing the robustness of autonomous emergency response systems with generative AI, as explored in “Advancing Autonomous Emergency Response Systems: A Generative AI Perspective”, to enabling secure, personalized federated learning with LLMs, as demonstrated by Dongcheng Li et al. from Guangxi Normal University in “Implicit Federated In-context Learning For Task-Specific LLM Fine-Tuning”, ICL is making AI more adaptable and efficient.
The research also points to intriguing avenues for future exploration. The finding that distinct mechanisms sustain repetition in language models, reported by Matéo Mahaut and Francesca Franzon from Universitat Pompeu Fabra in “Repetitions are not all alike: distinct mechanisms sustain repetition in language models”, deepens our understanding of LLM internal dynamics, which is crucial for improving generation quality. The critical analysis of Chain-of-Thought (CoT) prompting by Tianshi Zheng et al. from The Hong Kong University of Science and Technology and NVIDIA in “The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning” underscores the need for more nuanced reasoning methodologies beyond simple step-by-step thinking.
Further, the integration of ICL with privacy-preserving techniques, such as the differentially private framework using nearest neighbor search by Antti Koskela et al. from Nokia Bell Labs in “Differentially Private In-Context Learning with Nearest Neighbor Search”, opens pathways for secure and ethical AI deployment. The development of frameworks like “MCP4IFC: IFC-Based Building Design Using Large Language Models” by Bharathi Kannan Nithyanantham et al. from University of Rostock to enable LLMs to manipulate BIM data through natural language is poised to revolutionize industries like architecture, engineering, and construction.
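One generic way differential privacy can enter demonstration selection is report-noisy-max over similarity scores. The sketch below is an illustrative assumption for intuition only; it is not the nearest-neighbor mechanism of Koskela et al., and the epsilon, noise scale, and data are all hypothetical.

```python
import numpy as np

def dp_select_demo(sim, epsilon, rng):
    """Pick one demonstration via report-noisy-max (exponential-mechanism
    style) over similarity scores. A generic DP-selection sketch, assuming
    scores with sensitivity bounded by 1 (e.g. cosine similarity)."""
    noisy = sim + rng.gumbel(scale=2.0 / epsilon, size=sim.shape)
    return int(np.argmax(noisy))

rng = np.random.default_rng(0)
private_embs = rng.normal(size=(50, 16))              # private demo pool
private_embs /= np.linalg.norm(private_embs, axis=1, keepdims=True)
query = private_embs[7] + 0.01 * rng.normal(size=16)  # near item 7
sim = private_embs @ (query / np.linalg.norm(query))  # cosine similarities

picks = [dp_select_demo(sim, epsilon=8.0, rng=rng) for _ in range(200)]
# fraction of rounds that selected the true nearest neighbor (index 7);
# larger epsilon -> less noise -> higher fraction, at weaker privacy
print(picks.count(7) / len(picks))
```

The privacy/utility trade-off is visible directly: shrinking epsilon widens the Gumbel noise, so the selected demonstration leaks less about any single private record but is less often the true nearest neighbor.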
Ultimately, the journey through these diverse papers reveals a common thread: ICL is not merely a trick for LLMs but a fundamental emergent property of large-scale models, offering a powerful paradigm for rapid adaptation and generalization. As research continues to unravel its mechanistic underpinnings, optimize its application, and expand its reach, we can anticipate an even more intelligent, adaptable, and domain-agnostic future for AI.