Channel: cs.LG updates on arXiv.org
Browsing all 448 articles

Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask...

We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a...


MinMaxMin $Q$-learning

MinMaxMin $Q$-learning is a novel optimistic Actor-Critic algorithm that addresses the problem of overestimation bias ($Q$-estimates exceeding the true $Q$-values) inherent in conservative...

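The conservative baseline this abstract alludes to is typically the clipped double-$Q$ target, which takes the minimum over two $Q$-estimates for the next state. A minimal numpy sketch of that baseline (function name and values are illustrative, not from the paper):

```python
import numpy as np

def clipped_double_q_target(rewards, q1_next, q2_next, gamma=0.99):
    """Conservative TD target: take the element-wise minimum of two
    Q-estimates for the next state. This counters overestimation
    bias, but the minimum can itself underestimate the true Q-values."""
    return rewards + gamma * np.minimum(q1_next, q2_next)
```

The abstract alone does not give MinMaxMin's exact formula, so this only illustrates the conservative target that an optimistic variant would relax.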

SQT -- std $Q$-target

Std $Q$-target is a conservative, ensemble, actor-critic, $Q$-learning-based algorithm built on a single key $Q$-formula: the $Q$-networks' standard deviation, which serves as an "uncertainty penalty",...

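An ensemble standard deviation used as an "uncertainty penalty" suggests a TD target of the following shape; the penalty weight `beta`, the use of the ensemble mean, and the function name are my assumptions, not details from the paper:

```python
import numpy as np

def std_penalized_target(rewards, q_ensemble_next, gamma=0.99, beta=0.5):
    """Conservative TD target: ensemble mean minus a multiple of the
    ensemble standard deviation, so disagreement among the Q-networks
    (uncertainty) lowers the target value.
    q_ensemble_next: array of shape (n_networks, batch)."""
    q_mean = q_ensemble_next.mean(axis=0)
    q_std = q_ensemble_next.std(axis=0)   # the "uncertainty penalty"
    return rewards + gamma * (q_mean - beta * q_std)
```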

Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes

Given the generational gap in available hardware between lay practitioners and the most endowed institutions, LLMs are becoming increasingly inaccessible as they grow in size. Whilst many approaches...


Do Transformer World Models Give Better Policy Gradients?

A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy....

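Backpropagating a return through an unrolled world model, as described above, can be illustrated on a one-dimensional linear system where the reverse-mode gradient can be written by hand; the dynamics, reward, and linear policy here are toy stand-ins, not the paper's setup:

```python
def unrolled_policy_gradient(k, s0=1.0, a=1.0, b=1.0, horizon=5):
    """Unroll s_{t+1} = a*s_t + b*u_t with linear policy u_t = k*s_t
    and reward r_t = -s_t**2, then hand-derive d(total reward)/dk by
    the chain rule through the unrolled computational graph."""
    # forward pass: record the state trajectory
    states = [s0]
    for _ in range(horizon):
        s = states[-1]
        states.append(a * s + b * k * s)   # s_{t+1} = (a + b*k) * s_t
    total_reward = -sum(s * s for s in states)
    # backward pass: ds_{t+1}/dk = (a + b*k) * ds_t/dk + b * s_t
    dsdk = [0.0]                           # s0 does not depend on k
    for s in states[:-1]:
        dsdk.append((a + b * k) * dsdk[-1] + b * s)
    grad = -sum(2 * s * d for s, d in zip(states, dsdk))
    return total_reward, grad
```

A gradient-ascent step on `k` moves the closed-loop gain `a + b*k` toward zero, the policy that drives the state (and its quadratic cost) down.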

ApiQ: Finetuning of 2-Bit Quantized Large Language Model

Memory-efficient finetuning of large language models (LLMs) has recently attracted considerable attention with the increasing size of LLMs, primarily due to the constraints posed by GPU memory limitations and...


Denoising Diffusion Probabilistic Models in Six Simple Steps

Denoising Diffusion Probabilistic Models (DDPMs) are a very popular class of deep generative models that has been successfully applied to a diverse range of problems including image and video...

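A standard step in any DDPM treatment is the closed-form forward (noising) marginal q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I), where abar_t is the cumulative product of 1 - beta_t over the noise schedule. A minimal numpy sketch (the linear schedule is illustrative):

```python
import numpy as np

def ddpm_forward_sample(x0, t, betas, rng):
    """Sample x_t directly from x_0 using the closed-form marginal
    q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps
    return xt, eps  # training regresses a network onto eps

# illustrative linear noise schedule
betas = np.linspace(1e-4, 0.02, 1000)
```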

Diffusion World Model

We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM...


Evading Data Contamination Detection for Language Models is (too) Easy

Large language models are widespread, with their performance on benchmarks frequently guiding user preferences for one model over another. However, the vast amount of data these models are trained on...


Review of multimodal machine learning approaches in healthcare

Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple...


MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters

This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance. Moving...


INViT: A Generalizable Routing Problem Solver with Invariant Nested View...

Recently, deep reinforcement learning has shown promising results for learning fast heuristics to solve routing problems. However, most solvers struggle to generalize to an unseen...


pFedMoE: Data-Level Personalization with Mixture of Experts for...

Federated learning (FL) has been widely adopted for collaborative training on decentralized data. However, it faces the challenges of data, system, and model heterogeneity. This has inspired the...


Improving the accuracy of freight mode choice models: A case study using the...

The US Census Bureau has collected two rounds of experimental data from the Commodity Flow Survey, providing shipment-level characteristics of nationwide commodity movements, published in 2012 (i.e.,...


PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks

While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their...


Privacy-preserving data release leveraging optimal transport and particle...

We present a novel approach for differentially private data synthesis of protected tabular datasets, a relevant task in highly sensitive domains such as healthcare and government. Current...


Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to...

This paper presents a new approach that integrates deep learning with computational chess, using both the Mixture of Experts (MoE) method and Monte-Carlo Tree Search (MCTS). Our methodology employs a...


PICL: Physics Informed Contrastive Learning for Partial Differential Equations

Neural operators have recently grown in popularity as Partial Differential Equation (PDE) surrogate models. Learning solution functionals, rather than functions, has proven to be a powerful approach...


Adaptive Block Sparse Regularization under Arbitrary Linear Transform

We propose a convex and fast signal reconstruction method for block sparsity under arbitrary linear transform with unknown block structure. The proposed method is a generalization of the similar...

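The classic building block for convex block-sparse reconstruction is the group-lasso proximal operator (block soft-thresholding), which shrinks or zeroes an entire block at once; the paper's method handles arbitrary linear transforms and unknown block structure, so this only sketches the underlying primitive:

```python
import numpy as np

def block_soft_threshold(v, lam):
    """Proximal operator of the group-lasso penalty lam * ||v||_2:
    shrinks the whole block toward zero, and zeroes it out entirely
    when its norm falls below lam."""
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v
```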

Cross-Space Adaptive Filter: Integrating Graph Topology and Node Attributes...

The vanilla Graph Convolutional Network (GCN) uses a low-pass filter to extract low-frequency signals from the graph topology, which may lead to the over-smoothing problem when the GCN goes deep. To address this,...

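The low-pass behaviour described above is easy to demonstrate: repeatedly applying the renormalized GCN filter D^(-1/2) (A + I) D^(-1/2) to a high-frequency graph signal smooths it out, which is the over-smoothing effect when the network goes deep. A toy numpy sketch (graph and signal are illustrative):

```python
import numpy as np

def gcn_propagate(adj, features, layers):
    """Apply the renormalized GCN filter D^(-1/2) (A + I) D^(-1/2)
    `layers` times; it acts as a low-pass filter on the graph signal."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    s = d_inv_sqrt @ a_hat @ d_inv_sqrt           # normalized filter
    for _ in range(layers):
        features = s @ features
    return features

# toy path graph 0-1-2-3 carrying a high-frequency (alternating) signal
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, -1.0, 1.0, -1.0])
```

After a few layers the alternating signal is nearly flat, so node features become hard to distinguish, which is precisely the over-smoothing problem for deep GCNs.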