Channel: cs.LG updates on arXiv.org
Browsing all 448 articles

Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask...

We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a...


MinMaxMin $Q$-learning

MinMaxMin $Q$-learning is a novel optimistic Actor-Critic algorithm that addresses the problem of overestimation bias ($Q$-estimates exceeding the true $Q$-values) inherent in conservative...

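The conservative baseline this abstract alludes to is typically the clipped double-$Q$ target, which takes the minimum over two $Q$-estimates for the next state. A minimal numpy sketch of that baseline (function name and values are illustrative, not from the paper):

```python
import numpy as np

def clipped_double_q_target(rewards, q1_next, q2_next, gamma=0.99):
    """Conservative TD target: take the element-wise minimum of two
    Q-estimates for the next state. This counters overestimation
    bias, but the minimum can itself underestimate the true Q-values."""
    return rewards + gamma * np.minimum(q1_next, q2_next)
```

The abstract alone does not give MinMaxMin's exact formula, so this only illustrates the conservative target that an optimistic variant would relax.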

SQT -- std $Q$-target

Std $Q$-target is a conservative, ensemble, actor-critic, $Q$-learning-based algorithm built on a single key $Q$-formula: the $Q$-networks' standard deviation, which serves as an "uncertainty penalty",...

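An ensemble standard deviation used as an "uncertainty penalty" suggests a TD target of the following shape; the penalty weight `beta`, the use of the ensemble mean, and the function name are my assumptions, not details from the paper:

```python
import numpy as np

def std_penalized_target(rewards, q_ensemble_next, gamma=0.99, beta=0.5):
    """Conservative TD target: ensemble mean minus a multiple of the
    ensemble standard deviation, so disagreement among the Q-networks
    (uncertainty) lowers the target value.
    q_ensemble_next: array of shape (n_networks, batch)."""
    q_mean = q_ensemble_next.mean(axis=0)
    q_std = q_ensemble_next.std(axis=0)   # the "uncertainty penalty"
    return rewards + gamma * (q_mean - beta * q_std)
```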

Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes

Given the generational gap in available hardware between lay practitioners and the most endowed institutions, LLMs are becoming increasingly inaccessible as they grow in size. Whilst many approaches...


Do Transformer World Models Give Better Policy Gradients?

A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy....

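Backpropagating a return through an unrolled world model, as described above, can be illustrated on a one-dimensional linear system where the reverse-mode gradient can be written by hand; the dynamics, reward, and linear policy here are toy stand-ins, not the paper's setup:

```python
def unrolled_policy_gradient(k, s0=1.0, a=1.0, b=1.0, horizon=5):
    """Unroll s_{t+1} = a*s_t + b*u_t with linear policy u_t = k*s_t
    and reward r_t = -s_t**2, then hand-derive d(total reward)/dk by
    the chain rule through the unrolled computational graph."""
    # forward pass: record the state trajectory
    states = [s0]
    for _ in range(horizon):
        s = states[-1]
        states.append(a * s + b * k * s)   # s_{t+1} = (a + b*k) * s_t
    total_reward = -sum(s * s for s in states)
    # backward pass: ds_{t+1}/dk = (a + b*k) * ds_t/dk + b * s_t
    dsdk = [0.0]                           # s0 does not depend on k
    for s in states[:-1]:
        dsdk.append((a + b * k) * dsdk[-1] + b * s)
    grad = -sum(2 * s * d for s, d in zip(states, dsdk))
    return total_reward, grad
```

A gradient-ascent step on `k` moves the closed-loop gain `a + b*k` toward zero, the policy that drives the state (and its quadratic cost) down.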

ApiQ: Finetuning of 2-Bit Quantized Large Language Model

Memory-efficient finetuning of large language models (LLMs) has recently attracted considerable attention with the increasing size of LLMs, primarily due to the constraints posed by GPU memory limitations and...


Denoising Diffusion Probabilistic Models in Six Simple Steps

Denoising Diffusion Probabilistic Models (DDPMs) are a very popular class of deep generative models that has been successfully applied to a diverse range of problems including image and video...

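A standard step in any DDPM treatment is the closed-form forward (noising) marginal q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I), where abar_t is the cumulative product of 1 - beta_t over the noise schedule. A minimal numpy sketch (the linear schedule is illustrative):

```python
import numpy as np

def ddpm_forward_sample(x0, t, betas, rng):
    """Sample x_t directly from x_0 using the closed-form marginal
    q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps
    return xt, eps  # training regresses a network onto eps

# illustrative linear noise schedule
betas = np.linspace(1e-4, 0.02, 1000)
```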

Diffusion World Model

We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM...


Evading Data Contamination Detection for Language Models is (too) Easy

Large language models are widespread, with their performance on benchmarks frequently guiding user preferences for one model over another. However, the vast amount of data these models are trained on...


Review of multimodal machine learning approaches in healthcare

Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple...


MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters

This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance. Moving...


INViT: A Generalizable Routing Problem Solver with Invariant Nested View...

Recently, deep reinforcement learning has shown promising results for learning fast heuristics to solve routing problems. However, most solvers struggle to generalize to an unseen...


pFedMoE: Data-Level Personalization with Mixture of Experts for...

Federated learning (FL) has been widely adopted for collaborative training on decentralized data. However, it faces the challenges of data, system, and model heterogeneity. This has inspired the...


Improving the accuracy of freight mode choice models: A case study using the...

The US Census Bureau has collected two rounds of experimental data from the Commodity Flow Survey, providing shipment-level characteristics of nationwide commodity movements, published in 2012 (i.e.,...


PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks

While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their...


Privacy-preserving data release leveraging optimal transport and particle...

We present a novel approach for differentially private data synthesis of protected tabular datasets, a relevant task in highly sensitive domains such as healthcare and government. Current...


Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to...

This paper presents a new approach that integrates deep learning with computational chess, using both the Mixture of Experts (MoE) method and Monte-Carlo Tree Search (MCTS). Our methodology employs a...


PICL: Physics Informed Contrastive Learning for Partial Differential Equations

Neural operators have recently grown in popularity as Partial Differential Equation (PDE) surrogate models. Learning solution functionals, rather than functions, has proven to be a powerful approach...


Adaptive Block Sparse Regularization under Arbitrary Linear Transform

We propose a convex and fast signal reconstruction method for block sparsity under arbitrary linear transform with unknown block structure. The proposed method is a generalization of the similar...

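The classic building block for convex block-sparse reconstruction is the group-lasso proximal operator (block soft-thresholding), which shrinks or zeroes an entire block at once; the paper's method handles arbitrary linear transforms and unknown block structure, so this only sketches the underlying primitive:

```python
import numpy as np

def block_soft_threshold(v, lam):
    """Proximal operator of the group-lasso penalty lam * ||v||_2:
    shrinks the whole block toward zero, and zeroes it out entirely
    when its norm falls below lam."""
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v
```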

Cross-Space Adaptive Filter: Integrating Graph Topology and Node Attributes...

The vanilla Graph Convolutional Network (GCN) uses a low-pass filter to extract low-frequency signals from the graph topology, which may lead to the over-smoothing problem when the GCN goes deep. To address this,...

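The low-pass behaviour described above is easy to demonstrate: repeatedly applying the renormalized GCN filter D^(-1/2) (A + I) D^(-1/2) to a high-frequency graph signal smooths it out, which is the over-smoothing effect when the network goes deep. A toy numpy sketch (graph and signal are illustrative):

```python
import numpy as np

def gcn_propagate(adj, features, layers):
    """Apply the renormalized GCN filter D^(-1/2) (A + I) D^(-1/2)
    `layers` times; it acts as a low-pass filter on the graph signal."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    s = d_inv_sqrt @ a_hat @ d_inv_sqrt           # normalized filter
    for _ in range(layers):
        features = s @ features
    return features

# toy path graph 0-1-2-3 carrying a high-frequency (alternating) signal
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, -1.0, 1.0, -1.0])
```

After a few layers the alternating signal is nearly flat, so node features become hard to distinguish, which is precisely the over-smoothing problem for deep GCNs.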