Few-Shot Learning: Navigating the Data Desert with Intelligence and Efficiency
A digest of the latest 50 papers on few-shot learning (Sep. 29, 2025)
In the rapidly evolving landscape of AI/ML, the appetite for data is insatiable. Yet, real-world scenarios often present a stark reality: meticulously labeled datasets are scarce, expensive, or simply unavailable. This is the few-shot learning (FSL) dilemma – how do we train robust, performant models when examples are counted in tens, not millions? Recent research offers exciting breakthroughs, pushing the boundaries of what’s possible with limited data and redefining how models learn and generalize.
The Big Idea(s) & Core Innovations:
The core challenge these papers address is to make AI models learn effectively from minimal examples, mirroring human-like rapid adaptation. A pervasive theme is the ingenious use of Large Language Models (LLMs) and Vision-Language Models (VLMs) as powerful engines for this adaptive learning, often through sophisticated prompt engineering and contextual cues. For instance, in “Mechanism of Task-oriented Information Removal in In-context Learning,” researchers from JAIST, RIKEN, and the University of Chicago propose that In-context Learning (ICL) isn’t about learning new tasks, but rather the art of removing task-irrelevant information. They pinpoint ‘Denoising Heads’ within attention mechanisms as crucial for this filtering, fundamentally altering our understanding of how ICL works.
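To make the ‘denoising head’ idea concrete, here is a minimal head-ablation probe — a rough sketch in the spirit of the paper, not the authors' actual protocol: zero out one attention head at a time and watch how the log-probability of the correct few-shot label moves. The model, prompt, and threshold below are all illustrative choices.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# A tiny few-shot sentiment prompt; the probe asks how much each head
# contributes to predicting the correct continuation " positive".
prompt = "great -> positive\nawful -> negative\nwonderful ->"
target = " positive"
ids = tok(prompt + target, return_tensors="pt").input_ids
n_target = len(tok(target).input_ids)

def label_logprob(head_mask):
    # head_mask: (n_layer, n_head), 1 keeps a head, 0 ablates it
    with torch.no_grad():
        logits = model(ids, head_mask=head_mask).logits
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    tgt = ids[0, 1:]
    token_logps = logp[torch.arange(tgt.numel()), tgt]
    return token_logps[-n_target:].sum().item()

L, H = model.config.n_layer, model.config.n_head
baseline = label_logprob(torch.ones(L, H))
for layer in range(L):
    for head in range(H):
        mask = torch.ones(L, H)
        mask[layer, head] = 0.0          # ablate a single head
        drop = baseline - label_logprob(mask)
        if drop > 0.5:                   # arbitrary threshold for illustration
            print(f"layer {layer}, head {head}: logprob drop = {drop:.2f}")
```

Heads whose removal sharply degrades the label log-probability are candidates for the filtering role the paper attributes to denoising heads.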
Building on the power of LLMs, papers like “Semantic-Aware Fuzzing: An Empirical Framework for LLM-Guided, Reasoning-Driven Input Mutation” by Meng Lu et al. from Queen’s University and McGill University demonstrate how reasoning-based LLMs can revolutionize binary fuzzing. Their framework significantly boosts code coverage and bug discovery by using LLMs to generate semantically meaningful input mutations, even with zero-shot or few-shot prompts, obviating the need for fine-tuning.
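The general shape of such a loop is easy to sketch. Below is a hedged illustration of few-shot, LLM-guided input mutation, not the authors' framework: `query_llm` stands in for any chat-completion client and `run_target` for any coverage-instrumented harness; the prompt and seed format are invented for the example.

```python
# Few-shot prompt asking the model for semantically meaningful mutations.
FEW_SHOT_PROMPT = """You mutate program inputs while keeping them well-formed.

Seed: <config version=1 depth=4/>
Mutation: <config version=1 depth=2147483647/>

Seed: <SEED>
Return exactly one mutated input that probes a boundary condition:"""

def mutate(seed: str, query_llm) -> str:
    return query_llm(FEW_SHOT_PROMPT.replace("<SEED>", seed)).strip()

def fuzz(seed: str, query_llm, run_target, budget: int = 100):
    corpus, covered = [seed], set()
    for _ in range(budget):
        candidate = mutate(corpus[-1], query_llm)
        new_edges, crashed = run_target(candidate)   # instrumented execution
        if crashed:
            print("crashing input:", candidate)
        if new_edges - covered:                      # keep inputs that add coverage
            covered |= new_edges
            corpus.append(candidate)
    return corpus
```

The key departure from classical fuzzers is that mutation is delegated to a model that can reason about input semantics, rather than to random bit flips.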
The idea of intelligent guidance extends beyond language. In “Expert-Guided Explainable Few-Shot Learning for Medical Image Diagnosis”, presented at the MICCAI 2025 Workshop on Data Engineering in Medical Imaging, researchers propose integrating radiologist annotations into few-shot medical image diagnosis. By aligning Grad-CAM heatmaps with expert-defined regions, their method enhances both accuracy and crucial interpretability in low-data clinical settings. Similarly, Gao Yu Lee et al. from Nanyang Technological University, in “ANROT-HELANet: Adversarially and Naturally Robust Attention-Based Aggregation Network via The Hellinger Distance for Few-Shot Classification”, introduce the Hellinger distance to build few-shot classifiers that resist both adversarial perturbations and natural noise.
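To ground the metric itself: the Hellinger distance between two discrete distributions p and q is sqrt(0.5 · Σ(√p − √q)²), bounded in [0, 1]. The sketch below uses it for simple prototype matching; ANROT-HELANet's attention-based aggregation is considerably more involved, and the softmax normalization here is our illustrative choice, not the paper's.

```python
import torch

def hellinger(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    # p, q: batches of discrete distributions (non-negative, rows sum to 1)
    return torch.sqrt(0.5 * ((p.sqrt() - q.sqrt()) ** 2).sum(dim=-1))

def classify(query_emb: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    # Turn embeddings into distributions, then pick the class whose
    # prototype is nearest in Hellinger distance.
    q = torch.softmax(query_emb, dim=-1)        # (d,) -> distribution
    p = torch.softmax(prototypes, dim=-1)       # (n_classes, d)
    return torch.argmin(hellinger(p, q.unsqueeze(0)), dim=-1)
```

Because the square root compresses large per-channel discrepancies, distances of this family tend to be less sensitive to a few heavily perturbed features, which is the intuition behind its robustness.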
Another significant innovation lies in optimizing existing models for few-shot scenarios. The paper “Improving Instruct Models for Free: A Study on Partial Adaptation” by Ozan Irsoy et al. from Bloomberg reveals a counter-intuitive finding: reducing instruction-tuning strength in LLMs can actually improve few-shot ICL performance, highlighting the ‘partial adaptation’ trade-off. In the vision-language domain, Taha Koleilat et al. from Concordia University in “Singular Value Few-shot Adaptation of Vision-Language Models” introduce CLIP-SVD, a parameter-efficient technique that uses Singular Value Decomposition (SVD) to adapt VLMs with just 0.04% of total parameters, yielding state-of-the-art results across natural and biomedical datasets. Meanwhile, Phuoc-Nguyen Bui et al. from Sungkyunkwan University propose “Attn-Adapter: Attention Is All You Need for Online Few-shot Learner of Vision-Language Model”, a lightweight online few-shot learner that dynamically refines CLIP embeddings through dual attention mechanisms for enhanced generalization.
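The core trick behind singular-value adaptation is compact enough to sketch. Making no assumption about CLIP-SVD's exact layer selection or training recipe, the snippet below factors one linear layer once, freezes the singular vectors, and trains only the singular values; the class and variable names are ours.

```python
import torch
import torch.nn as nn

class SVDAdaptedLinear(nn.Module):
    def __init__(self, linear: nn.Linear):
        super().__init__()
        U, S, Vh = torch.linalg.svd(linear.weight.data, full_matrices=False)
        self.register_buffer("U", U)        # frozen singular vectors
        self.register_buffer("Vh", Vh)      # frozen singular vectors
        self.s = nn.Parameter(S.clone())    # the only trainable tensor
        self.bias = nn.Parameter(linear.bias.detach(), requires_grad=False)

    def forward(self, x):
        W = self.U @ torch.diag(self.s) @ self.Vh   # reassembled weight
        return nn.functional.linear(x, W, self.bias)

layer = nn.Linear(512, 512)
adapted = SVDAdaptedLinear(layer)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(trainable)  # 512 singular values vs. 262,656 original parameters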
Beyond these, the “From Channel Bias to Feature Redundancy: Uncovering the ‘Less is More’ Principle in Few-Shot Learning” paper by Ji Zhang et al. from Southwest Jiaotong University introduces a critical insight: for pre-trained vision models, most features can be harmful in few-shot settings due to channel bias and redundancy. Their AFIA method effectively prunes these redundant features, demonstrating that sometimes, less is indeed more. This complements insights from “The Few-shot Dilemma: Over-prompting Large Language Models” by Jiang, A. Q. et al. from Meta and Google DeepMind, which warns against over-prompting LLMs, suggesting a balanced approach to prompt engineering for better generalization.
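A toy version of feature pruning shows why this can help: score each channel on the support set and keep only the most discriminative fraction. The variance-ratio criterion below is an illustrative stand-in, not AFIA's actual scoring rule.

```python
import torch

def select_channels(support_feats, support_labels, keep_ratio=0.2):
    # support_feats: (n_support, d), support_labels: (n_support,)
    classes = support_labels.unique()
    means = torch.stack([support_feats[support_labels == c].mean(0)
                         for c in classes])
    between = means.var(dim=0, unbiased=False)          # spread of class means
    within = torch.stack([support_feats[support_labels == c].var(0, unbiased=False)
                          for c in classes]).mean(0) + 1e-6
    score = between / within              # high = channel separates the classes
    k = max(1, int(keep_ratio * support_feats.shape[1]))
    return score.topk(k).indices          # channel indices to keep

# Prototypes and queries are then compared in the pruned feature space only.
```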
Under the Hood: Models, Datasets, & Benchmarks:
The advancements are powered by innovative models and validated by robust datasets and benchmarks:
- MOLECULES (ICRL): The Nanyang Technological University and MIT paper, “Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning”, introduces In-Context Representation Learning (ICRL) to enable LLMs to integrate non-text modalities (e.g., molecular data) in a training-free manner. Code available at https://github.com/ztlmememe/LLMxFM_ICRL.
- MOMEMTO (Time Series): Pohang University of Science and Technology introduces MOMEMTO in “MOMEMTO: Patch-based Memory Gate Model in Time Series Foundation Model”, a time series foundation model for anomaly detection built around a patch-based memory gate module and evaluated on 23 univariate benchmark datasets (see the sketch after this list).
- RRDataset (AI-Generated Image Detection): Chunxiao Li et al. from Beijing Normal University present RRDataset in “Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios”, a benchmark for AI-generated image detection under real-world conditions, including internet transmission and re-digitization. Data available at https://zenodo.org/records/14963880.
- U-DIADS-TL (Historical Documents): The ICDAR 2025 FEST competition, detailed in “ICDAR 2025 Competition on FEw-Shot Text line segmentation of ancient handwritten documents (FEST)” by S. Zottin et al. from University of Udine, introduces the U-DIADS-TL dataset for few-shot text line segmentation in ancient manuscripts. Related code from “Few-Shot Connectivity-Aware Text Line Segmentation in Historical Documents” by R. Sterzinger et al. from TU Graz is available at https://github.com/RafaelSterzinger/acpr_few_shot_hist.
- DAC-FCF (Bearing Fault Diagnosis): Shengke Sun et al. from Nanjing University of Science and Technology present DAC-FCF in “An Advanced Convolutional Neural Network for Bearing Fault Diagnosis under Limited Data”, which combines Conditional CLR-GAN (CCLR-GAN) and a 1D-Fourier CNN for improved fault diagnosis. Code is at https://github.com/sunshengke/DAC-FCF.
- Galaxea Open-World Dataset (Robotics): The Galaxea Team introduces the Galaxea Open-World Dataset in “Galaxea Open-World Dataset and G0 Dual-System VLA Model” for mobile manipulation, alongside G0, a dual VLM/VLA model. The dataset and related code are at https://opengalaxea.github.io/G0/ and https://github.com/Stanford-ILIAD/openvla-mini.
- MOLE Dataset (Metadata Extraction): Zaid Alyafeai et al. from KAUST release the MOLE dataset in “MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs” for evaluating LLM-based metadata extraction from scientific papers. The dataset and code are available at https://huggingface.co/datasets/IVUL-KAUST/MOLE and https://github.com/IVUL-KAUST/MOLE/.
- WEBEYETRACK (Eye-Tracking): Eduardo Davalos et al. from Trinity University and Vanderbilt University introduce WEBEYETRACK in “WEBEYETRACK: Scalable Eye-Tracking for the Browser via On-Device Few-Shot Personalization”, an open-source framework for browser-friendly few-shot gaze estimation. Code is at https://github.com/RedForestAI/WebEyeTrack.
- JVLGS (Gas Leak Segmentation): Xinlong Zhao et al. from University of British Columbia propose JVLGS in “JVLGS: Joint Vision-Language Gas Leak Segmentation” for gas leak segmentation using visual and textual modalities. Code available at https://github.com/GeekEagle/JVLGS.
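As promised above, here is a heavily simplified sketch of the patch-memory idea behind MOMEMTO: patches are reconstructed from a small learned memory bank, and patches the memory cannot reconstruct well are flagged as anomalous. The gate design, sizes, and scoring below are all our illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PatchMemoryGate(nn.Module):
    def __init__(self, patch_len=16, n_items=32):
        super().__init__()
        # Learned memory bank of "normal" patch prototypes.
        self.memory = nn.Parameter(torch.randn(n_items, patch_len))

    def forward(self, patches):                       # (batch, n_patches, patch_len)
        # Soft-attention read: mix memory items by similarity to each patch.
        attn = torch.softmax(patches @ self.memory.T, dim=-1)
        recon = attn @ self.memory                    # memory-based reconstruction
        score = (patches - recon).pow(2).mean(-1)     # per-patch anomaly score
        return recon, score

x = torch.randn(4, 8, 16)          # toy batch: 4 series, 8 patches of length 16
gate = PatchMemoryGate()
_, anomaly = gate(x)
print(anomaly.shape)               # torch.Size([4, 8])
```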
Impact & The Road Ahead:
These advancements in few-shot learning have profound implications across various domains. In robotics, methods like O3Afford by Zhiyuan Li et al. from MIT and Stanford University (“O3Afford: One-Shot 3D Object-to-Object Affordance Grounding for Generalizable Robotic Manipulation”) and MimicDroid by Rutav Shah et al. from The University of Texas at Austin (“MimicDroid: In-Context Learning for Humanoid Robot Manipulation from Human Play Videos”) are enabling robots to learn complex manipulation tasks from minimal demonstrations or even human play videos, paving the way for more adaptable and autonomous systems.

In healthcare, few-shot techniques are making inroads into critical applications like cough classification (“Cough Classification using Few-Shot Learning”) and surgical skill assessment (“Exploring Pre-training Across Domains for Few-Shot Surgical Skill Assessment”), addressing the perennial challenge of limited annotated medical data. The application of LLMs in patient information extraction (“A Study of Large Language Models for Patient Information Extraction: Model Architecture, Fine-Tuning Strategy, and Multi-task Instruction Tuning”) and clinical document summarization (“MaLei at MultiClinSUM: Summarisation of Clinical Documents using Perspective-Aware Iterative Self-Prompting with LLMs” by Libo Ren et al. from University of Manchester) promises to revolutionize medical communication and research.
Industrial applications are also seeing significant gains. “Multi-task and few-shot learning in virtual flow metering” by Kristian Løvland et al. from NTNU and Solution Seeker AS shows how few-shot learning can maintain high performance in virtual flow metering even with very limited data from new wells, a critical factor for the petroleum industry. “TransMatch: A Transfer-Learning Framework for Defect Detection in Laser Powder Bed Fusion Additive Manufacturing” by Mohsen Asghari Ilani and Yaser Mike Banad from University of Oklahoma tackles quality assurance in additive manufacturing with impressive accuracy, leveraging semi-supervised few-shot learning. Furthermore, LLM-driven quantum programming in QAgent by Zhenxiao Fu et al. from Indiana University Bloomington (“QAgent: An LLM-based Multi-Agent System for Autonomous OpenQASM programming”) and network traffic classification with FlowletFormer by Liming Liu et al. from Tsinghua University (“FlowletFormer: Network Behavioral Semantic Aware Pre-training Model for Traffic Classification”) indicate a future where complex systems are managed and optimized with unprecedented intelligence and efficiency.
The horizon for few-shot learning is bright, characterized by a move towards more robust, interpretable, and generalizable models. Future work will likely focus on combining theoretical understandings of generalization with practical, efficient adaptation strategies, perhaps further refining prompt engineering, model architectures, and novel loss functions. As AI continues to integrate into highly specialized and data-scarce domains, few-shot learning will be the cornerstone of its success, enabling intelligent systems to truly learn and adapt with human-like efficiency.