AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy
Large language models (LLMs) show impressive capabilities, matching and sometimes exceeding human performance in many domains. This study explores the potential of LLMs to augment judgement in...
Towards Meta-Pruning via Optimal Transport
Structural pruning of neural networks conventionally relies on identifying and discarding less important neurons, a practice often resulting in significant accuracy loss that necessitates subsequent...
Retrieval-Augmented Thought Process as Sequential Decision Making
Large Language Models (LLMs) have demonstrated their strong ability to assist people and show "sparks of intelligence". However, several open challenges hinder their wider application: such as concerns...
Towards a mathematical theory for consistency training in diffusion models
Consistency models, which were proposed to mitigate the high computational overhead during the sampling phase of diffusion models, facilitate single-step sampling while attaining state-of-the-art...
Tuning-Free Stochastic Optimization
Large-scale machine learning problems make the cost of hyperparameter tuning ever more prohibitive. This creates a need for algorithms that can tune themselves on-the-fly. We formalize the notion of...
IR-Aware ECO Timing Optimization Using Reinforcement Learning
Engineering change orders (ECOs) in late stages make minimal design fixes to recover from timing shifts due to excessive IR drops. This paper integrates IR-drop-aware timing analysis and ECO timing...
Multi-level Optimal Control with Neural Surrogate Models
Optimal actuator and control design is studied as a multi-level optimisation problem, where the actuator design is evaluated based on the performance of the associated optimal closed loop. The...
Scalable Structure Learning for Sparse Context-Specific Causal Systems
Several approaches to graphically representing context-specific relations among jointly distributed categorical variables have been proposed, along with structure learning algorithms. While existing...
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
Diffusion models have gained attention in text processing, offering many potential advantages over traditional autoregressive models. This work explores the integration of diffusion models and...
Mixed Q-Functionals: Advancing Value-Based Methods in Cooperative MARL with...
Tackling multi-agent learning problems efficiently is a challenging task in continuous action domains. While value-based algorithms excel in sample efficiency when applied to discrete action domains,...
Towards Unified Alignment Between Agents, Humans, and Environment
The rapid progress of foundation models has led to the prosperity of autonomous agents, which leverage the universal capabilities of foundation models to conduct reasoning, decision-making, and...
Task-conditioned adaptation of visual features in multi-task policy learning
Successfully addressing a wide variety of tasks is a core ability of autonomous agents, which requires flexibly adapting the underlying decision-making strategies and, as we argue in this work, also...
Graph Structure Inference with BAM: Introducing the Bilinear Attention Mechanism
In statistics and machine learning, detecting dependencies in datasets is a central challenge. We propose a novel neural network model for supervised graph structure learning, i.e., the process of...
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
Recently, instruction-following audio-language models have received broad attention for human-audio interaction. However, the absence of benchmarks capable of evaluating audio-centric interaction...
Generalization Bounds for Heavy-Tailed SDEs through the Fractional...
Understanding the generalization properties of heavy-tailed stochastic optimization algorithms has attracted increasing attention over the past years. While illuminating interesting aspects of...
Contrastive Multiple Instance Learning for Weakly Supervised Person ReID
The acquisition of large-scale, precisely labeled datasets for person re-identification (ReID) poses a significant challenge. Weakly supervised ReID has begun to address this issue, although its...
Towards a Foundation Model for Brain Age Prediction using coVariance Neural...
Brain age is the estimate of biological age derived from neuroimaging datasets using machine learning algorithms. Increasing brain age with respect to chronological age can reflect increased...
A Flow-based Credibility Metric for Safety-critical Pedestrian Detection
Safety is of utmost importance for perception in automated driving (AD). However, a prime safety concern in state-of-the-art object detection is that standard evaluation schemes utilize safety-agnostic...
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for...
We investigate the test risk of continuous-time stochastic gradient flow dynamics in learning theory. Using a path integral formulation we provide, in the regime of a small learning rate, a general...
AutoMathText: Autonomous Data Selection with Language Models for Mathematical...
To improve language models' proficiency in mathematical reasoning via continual pretraining, we introduce a novel strategy that leverages base language models for autonomous data selection. Departing...