Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask...
We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a...
MinMaxMin $Q$-learning
MinMaxMin $Q$-learning is a novel optimistic Actor-Critic algorithm that addresses the problem of overestimation bias ($Q$-estimates exceeding the true $Q$-values) inherent in conservative...
SQT -- std $Q$-target
Std $Q$-target is a conservative, ensemble, actor-critic $Q$-learning algorithm built on a single key formula: the standard deviation of the $Q$-networks, which serves as an "uncertainty penalty",...
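The std-as-penalty idea can be sketched generically: subtract a multiple of the ensemble's standard deviation from the mean $Q$-estimate, so that targets the networks disagree on are discounted. A minimal sketch, assuming a penalized target of the form mean − β·std (the β weighting and exact combination are illustrative assumptions; the snippet above does not give the paper's formula):

```python
import numpy as np

def sqt_target(q_values: np.ndarray, beta: float = 0.5) -> np.ndarray:
    """Penalized Q-target: ensemble mean minus a std-based uncertainty penalty.

    q_values: shape (n_networks, batch), each ensemble member's
    Q-estimate for the same (state, action) batch.
    beta: penalty coefficient (hypothetical; not from the paper).
    """
    mean_q = q_values.mean(axis=0)
    std_q = q_values.std(axis=0)   # ensemble disagreement = uncertainty
    return mean_q - beta * std_q   # conservative target

# Three Q-networks, batch of two state-action pairs
ensemble = np.array([[1.0, 2.0],
                     [1.2, 2.4],
                     [0.8, 1.6]])
targets = sqt_target(ensemble, beta=0.5)
```

The more the networks disagree on a pair, the further its target is pulled below the ensemble mean.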
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Given the generational gap in available hardware between lay practitioners and the most endowed institutions, LLMs are becoming increasingly inaccessible as they grow in size. Whilst many approaches...
Do Transformer World Models Give Better Policy Gradients?
A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy....
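Backpropagating through an unrolled world model can be illustrated with a scalar toy model where the chain rule through the rollout is exact and easy to verify numerically. A minimal sketch (the linear dynamics, quadratic cost, and linear policy are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def rollout_return(k, a=0.9, b=0.5, s0=1.0, T=10):
    """Unroll a scalar world model s_{t+1} = a*s_t + b*u_t with a
    linear policy u_t = k*s_t and return R(k) = -sum_t s_t^2."""
    s, R = s0, 0.0
    for _ in range(T):
        s = a * s + b * k * s   # closed-loop step through the model
        R -= s * s
    return R

def policy_gradient(k, a=0.9, b=0.5, s0=1.0, T=10):
    """Exact dR/dk from differentiating through the unrolled rollout:
    s_t = (a + b*k)^t * s0, so ds_t/dk = t * (a + b*k)^(t-1) * b * s0."""
    m = a + b * k
    g = 0.0
    for t in range(1, T + 1):
        s_t = m**t * s0
        ds_dk = t * m**(t - 1) * b * s0
        g += -2.0 * s_t * ds_dk
    return g

k = -0.4
eps = 1e-6
fd = (rollout_return(k + eps) - rollout_return(k - eps)) / (2 * eps)
grad = policy_gradient(k)
```

In practice an autodiff framework computes the same chain-rule product through the learned model's computational graph; the finite-difference check confirms the analytic gradient.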
ApiQ: Finetuning of 2-Bit Quantized Large Language Model
Memory-efficient finetuning of large language models (LLMs) has recently attracted considerable attention as LLMs grow in size, primarily due to the constraints posed by GPU memory limitations and...
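For intuition on what 2-bit quantization entails, here is a textbook per-tensor uniform quantizer with four levels (a generic sketch only; ApiQ's actual quantization and finetuning scheme is not described in the snippet above):

```python
import numpy as np

def quantize_2bit(w):
    """Per-tensor uniform 2-bit quantization: map weights onto the
    4 levels a 2-bit code can represent (indices 0..3)."""
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / 3.0          # 4 levels -> 3 intervals
    q = np.round((w - lo) / scale).astype(np.int8)
    return q, scale, lo

def dequantize_2bit(q, scale, lo):
    """Reconstruct approximate float weights from 2-bit codes."""
    return q * scale + lo

w = np.array([-1.0, -0.2, 0.3, 1.0])
q, s, lo = quantize_2bit(w)
w_hat = dequantize_2bit(q, s, lo)
```

The round-trip error is bounded by half the quantization step, which is why 2-bit storage is so lossy and why careful initialization of the quantized network matters for subsequent finetuning.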
Denoising Diffusion Probabilistic Models in Six Simple Steps
Denoising Diffusion Probabilistic Models (DDPMs) are a very popular class of deep generative model that have been successfully applied to a diverse range of problems including image and video...
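The forward (noising) process at the heart of a DDPM can be sampled in closed form, which is typically one of the first steps in such expositions. A minimal sketch using the standard formulation $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ with a linear $\beta$ schedule (the schedule values here are illustrative):

```python
import numpy as np

def ddpm_forward_sample(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    with eps ~ N(0, I) and alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps, eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # common linear schedule
x0 = rng.standard_normal(8)
xt, eps = ddpm_forward_sample(x0, t=999, betas=betas, rng=rng)
```

At the final timestep alpha_bar is nearly zero, so $x_t$ is essentially pure Gaussian noise; the learned reverse process is trained to invert this corruption step by step.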
Diffusion World Model
We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM...
Evading Data Contamination Detection for Language Models is (too) Easy
Large language models are widespread, with their performance on benchmarks frequently guiding user preferences for one model over another. However, the vast amount of data these models are trained on...
Review of multimodal machine learning approaches in healthcare
Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple...
MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters
This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance. Moving...
INViT: A Generalizable Routing Problem Solver with Invariant Nested View...
Recently, deep reinforcement learning has shown promising results for learning fast heuristics to solve routing problems. However, most solvers struggle to generalize to an unseen...
pFedMoE: Data-Level Personalization with Mixture of Experts for...
Federated learning (FL) has been widely adopted for collaborative training on decentralized data. However, it faces the challenges of data, system, and model heterogeneity. This has inspired the...
Improving the accuracy of freight mode choice models: A case study using the...
The US Census Bureau has collected two rounds of experimental data from the Commodity Flow Survey, providing shipment-level characteristics of nationwide commodity movements, published in 2012 (i.e.,...
PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks
While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their...
Privacy-preserving data release leveraging optimal transport and particle...
We present a novel approach for differentially private data synthesis of protected tabular datasets, a relevant task in highly sensitive domains such as healthcare and government. Current...
Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to...
This paper presents a new approach that integrates deep learning with computational chess, using both the Mixture of Experts (MoE) method and Monte-Carlo Tree Search (MCTS). Our methodology employs a...
PICL: Physics Informed Contrastive Learning for Partial Differential Equations
Neural operators have recently grown in popularity as Partial Differential Equation (PDE) surrogate models. Learning solution functionals, rather than functions, has proven to be a powerful approach...
Adaptive Block Sparse Regularization under Arbitrary Linear Transform
We propose a convex and fast signal reconstruction method for block sparsity under arbitrary linear transform with unknown block structure. The proposed method is a generalization of the similar...
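Convex block-sparse reconstruction is typically built around the proximal operator of the group $\ell_{2,1}$ penalty, which shrinks whole blocks toward zero rather than individual entries. A generic sketch of that operator (the paper's specific regularizer under an arbitrary linear transform, and its handling of unknown block structure, are not shown in the snippet above):

```python
import numpy as np

def block_soft_threshold(x, blocks, lam):
    """Proximal operator of lam * sum_b ||x_b||_2: each block is
    scaled by max(0, 1 - lam/||x_b||), so weak blocks vanish entirely
    while strong blocks are shrunk but keep their direction."""
    out = x.copy()
    for b in blocks:
        norm = np.linalg.norm(x[b])
        shrink = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out[b] = shrink * x[b]
    return out

x = np.array([3.0, 4.0, 0.1, -0.1])
blocks = [slice(0, 2), slice(2, 4)]   # two blocks of two entries
y = block_soft_threshold(x, blocks, lam=1.0)
```

Here the first block (norm 5) survives with a 20% shrink while the second (norm ≈ 0.14) is zeroed out, which is exactly the block-sparsity-inducing behavior.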
Cross-Space Adaptive Filter: Integrating Graph Topology and Node Attributes...
The vanilla Graph Convolutional Network (GCN) uses a low-pass filter to extract low-frequency signals from graph topology, which may lead to the over-smoothing problem when GCN goes deep. To this end,...
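The low-pass filter in question is the symmetrically normalized adjacency (with self-loops) applied at every vanilla GCN layer; stacking many layers applies it repeatedly, which smooths away high-frequency node signals and produces the over-smoothing problem. A small sketch:

```python
import numpy as np

def gcn_filter(adj):
    """Symmetrically normalized adjacency with self-loops,
    D^{-1/2} (A + I) D^{-1/2} -- the low-pass filter of a vanilla
    GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]

# 4-node path graph; repeated filtering drives a high-frequency
# signal toward zero, illustrating over-smoothing in deep GCNs.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
S = gcn_filter(A)
x = np.array([1.0, -1.0, 1.0, -1.0])       # alternating = high frequency
x_deep = np.linalg.matrix_power(S, 20) @ x  # 20 "layers" of filtering
```

All eigenvalues of `S` other than the top one have magnitude below 1, so twenty applications nearly annihilate the alternating signal while a smooth signal would be preserved.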