cs.LG updates on arXiv.org

A Tale of Tails: Model Collapse as a Change of Scaling Laws

February 12, 2024, 9:00 pm

As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data....

Distilling Symbolic Priors for Concept Learning into Neural Networks

February 12, 2024, 9:00 pm

Humans can learn new concepts from a small number of examples by drawing on their inductive biases. These inductive biases have previously been captured by using Bayesian models defined over symbolic...

Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models

February 12, 2024, 9:00 pm

Large Language Models (LLMs) based on Mixture-of-Experts (MoE) architecture are showing promising performance on various tasks. However, running them on resource-constrained settings, where GPU memory...

Informativeness of Reward Functions in Reinforcement Learning

February 12, 2024, 9:00 pm

Reward functions are central in specifying the task we want a reinforcement learning agent to perform. Given a task and desired optimal behavior, we study the problem of designing informative reward...

FedImpro: Measuring and Improving Client Update in Federated Learning

February 12, 2024, 9:00 pm

Federated Learning (FL) models often experience client drift caused by heterogeneous data, where the distribution of data differs across clients. To address this issue, advanced research primarily...

Clients Collaborate: Flexible Differentially Private Federated Learning with...

February 12, 2024, 9:00 pm

To defend against privacy leakage of user data, differential privacy is widely used in federated learning, but it is not free. The addition of noise randomly disrupts the semantic integrity of the...

Guided Sketch-Based Program Induction by Search Gradients

February 12, 2024, 9:00 pm

Many tasks can be easily solved using machine learning techniques. However, some tasks cannot readily be solved using statistical models, requiring a symbolic approach instead. Program induction is one...

Non-linear Fusion in Federated Learning: A Hypernetwork Approach to Federated...

February 12, 2024, 9:00 pm

Federated Learning (FL) has emerged as a promising paradigm in which multiple clients collaboratively train a shared global model while preserving data privacy. To create a robust and practicable FL...

In-Context Data Distillation with TabPFN

February 12, 2024, 9:00 pm

Foundation models have revolutionized tasks in computer vision and natural language processing. However, in the realm of tabular data, tree-based models like XGBoost continue to dominate. TabPFN, a...

Contextual Stochastic Vehicle Routing with Time Windows

February 12, 2024, 9:00 pm

We study the vehicle routing problem with time windows (VRPTW) and stochastic travel times, in which the decision-maker observes related contextual information, represented as feature variables, before...

DeepCover: Advancing RNN Test Coverage and Online Error Prediction using...

February 12, 2024, 9:00 pm

Recurrent neural networks (RNNs) have emerged as powerful tools for processing sequential data in various fields, including natural language processing and speech recognition. However, the lack of...

Tree Ensembles for Contextual Bandits

February 12, 2024, 9:00 pm

We propose a novel framework for contextual multi-armed bandits based on tree ensembles. Our framework integrates two widely used bandit methods, Upper Confidence Bound and Thompson Sampling, for both...

Training dynamics in Physics-Informed Neural Networks with feature mapping

February 12, 2024, 9:00 pm

Physics-Informed Neural Networks (PINNs) have emerged as an iconic machine learning approach for solving Partial Differential Equations (PDEs). Although its variants have achieved significant progress,...

OpenFedLLM: Training Large Language Models on Decentralized Private Data via...

February 12, 2024, 9:00 pm

Trained on massive publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields. While more data contributes to better performance, a disconcerting...

Assessing Uncertainty Estimation Methods for 3D Image Segmentation under...

February 12, 2024, 9:00 pm

In recent years, machine learning has witnessed extensive adoption across various sectors, yet its application in medical image-based disease detection and diagnosis remains challenging due to...

Learning Attributed Graphlets: Predictive Graph Mining by Graphlets with...

February 12, 2024, 9:00 pm

The graph classification problem has been widely studied; however, achieving an interpretable model with high predictive performance remains a challenging issue. This paper proposes an interpretable...

Clustering Techniques Selection for a Hybrid Regression Model: A Case Study...

February 12, 2024, 9:00 pm

This work addresses the performance comparison between four clustering techniques with the objective of achieving strong hybrid models in supervised learning tasks. A real dataset from a bio-climatic...

Generating Chain-of-Thoughts with a Direct Pairwise-Comparison Approach to...

February 12, 2024, 9:00 pm

To improve the ability of large language models (LLMs) to handle complex reasoning problems, chain-of-thoughts (CoT) methods were proposed to guide LLMs to reason step-by-step, facilitating problem...

Solving Deep Reinforcement Learning Benchmarks with Linear Policy Networks

February 12, 2024, 9:00 pm

Although Deep Reinforcement Learning (DRL) methods can learn effective policies for challenging problems such as Atari games and robotics tasks, algorithms are complex and training times are often...

Topological Neural Networks: Mitigating the Bottlenecks of Graph Neural...

February 12, 2024, 9:00 pm

The irreducible complexity of natural phenomena has led Graph Neural Networks to be employed as a standard model to perform representation learning tasks on graph-structured data. While their capacity...

Understanding Test-Time Augmentation

February 12, 2024, 9:00 pm

Test-Time Augmentation (TTA) is a very powerful heuristic that takes advantage of data augmentation during testing to produce averaged output. Despite the experimental effectiveness of TTA, there is...

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF

February 12, 2024, 9:00 pm

Bilevel optimization has recently been applied to many machine learning tasks. However, its applications have been restricted to the supervised learning setting, where static objective functions with...

Discriminative Adversarial Unlearning

February 12, 2024, 9:00 pm

We introduce a novel machine unlearning framework founded upon the established principles of the min-max optimization paradigm. We capitalize on the capabilities of strong Membership Inference Attacks...

LiRank: Industrial Large Scale Ranking Models at LinkedIn

February 12, 2024, 9:00 pm

We present LiRank, a large-scale ranking framework at LinkedIn that brings to production state-of-the-art modeling architectures and optimization methods. We unveil several modeling improvements,...

For Better or For Worse? Learning Minimum Variance Features With Label...

February 12, 2024, 9:00 pm

Data augmentation has been pivotal in successfully training deep learning models on classification tasks over the past decade. An important subclass of data augmentation techniques - which includes...

RAMP: Boosting Adversarial Robustness Against Multiple $l_p$ Perturbations

February 12, 2024, 9:00 pm

There is considerable work on improving robustness against adversarial attacks bounded by a single $l_p$ norm using adversarial training (AT). However, the multiple-norm robustness (union accuracy) of...

Forecasting Events in Soccer Matches Through Language

February 12, 2024, 9:00 pm

This paper introduces an approach to predicting the next event in a soccer match, a challenge bearing remarkable similarities to the problem faced by Large Language Models (LLMs). Unlike other methods...

Monitored Markov Decision Processes

February 12, 2024, 9:00 pm

In reinforcement learning (RL), an agent learns to perform a task by interacting with an environment and receiving feedback (a numerical reward) for its actions. However, the assumption that rewards...

Towards a Systematic Approach to Design New Ensemble Learning Algorithms

February 12, 2024, 9:00 pm

Ensemble learning has been a focal point of machine learning research due to its potential to improve predictive performance. This study revisits the foundational work on ensemble error decomposition,...

Estimating Player Performance in Different Contexts Using Fine-tuned Large...

February 12, 2024, 9:00 pm

This paper introduces an innovative application of Large Event Models (LEMs), akin to Large Language Models, to the domain of soccer analytics. By learning the "language" of soccer - predicting...

A Kalman Filter Based Framework for Monitoring the Performance of In-Hospital...

February 12, 2024, 9:00 pm

Unlike in a clinical trial, where researchers get to determine the least number of positive and negative samples required, or in a machine learning study where the size and the class distribution of...

Explain Variance of Prediction in Variational Time Series Models for Clinical...

February 12, 2024, 9:00 pm

In healthcare, thanks to many model-agnostic methods, the explainability of the prediction scores made by deep learning applications has improved. However, we note that for daily or hourly risk of...

Generative Nowcasting of Marine Fog Visibility in the Grand Banks area and...

February 12, 2024, 9:00 pm

This study presents the application of generative deep learning techniques to evaluate marine fog visibility nowcasting using the FATIMA (Fog and turbulence interactions in the marine atmosphere)...

Scalable Kernel Logistic Regression with Nyström Approximation: Theoretical...

February 12, 2024, 9:00 pm

The application of kernel-based Machine Learning (ML) techniques to discrete choice modelling using large datasets often faces challenges due to memory requirements and the considerable number of...

Embedding Compression for Teacher-to-Student Knowledge Transfer

February 12, 2024, 9:00 pm

Common knowledge distillation methods require the teacher model and the student model to be trained on the same task. However, the usage of embeddings as teachers has also been proposed for different...

Convergence of Gradient Descent with Small Initialization for Unregularized...

February 12, 2024, 9:00 pm

We study the problem of symmetric matrix completion, where the goal is to reconstruct a positive semidefinite matrix $\rm{X}^\star \in \mathbb{R}^{d\times d}$ of rank-$r$, parameterized by...

Low-Rank Learning by Design: the Role of Network Architecture and Activation...

February 12, 2024, 9:00 pm

Our understanding of learning dynamics of deep neural networks (DNNs) remains incomplete. Recent research has begun to uncover the mathematical principles underlying these networks, including the...

ExGRG: Explicitly-Generated Relation Graph for Self-Supervised Representation...

February 12, 2024, 9:00 pm

Self-supervised Learning (SSL) has emerged as a powerful technique in pre-training deep learning models without relying on expensive annotated labels, instead leveraging embedded signals in unlabeled...

Corruption Robust Offline Reinforcement Learning with Human Feedback

February 12, 2024, 9:00 pm

We study data corruption robustness for reinforcement learning with human feedback (RLHF) in an offline setting. Given an offline dataset of pairs of trajectories along with feedback about human...

Dynamic Graph Information Bottleneck

February 12, 2024, 9:00 pm

Dynamic graphs are widespread in the real world and carry complicated spatial and temporal feature patterns, which makes their representation learning challenging. Dynamic Graph Neural Networks (DGNNs) have shown...

Electricity Price Forecasting in the Irish Balancing Market

February 12, 2024, 9:00 pm

Short-term electricity markets are becoming more relevant due to less-predictable renewable energy sources, attracting considerable attention from the industry. The balancing market is the closest to...

Multi-class real-time crash risk forecasting using convolutional neural...

February 12, 2024, 9:00 pm

The performance of an artificial neural network (ANN) in forecasting crash risk is shown in this paper. To begin, some traffic and weather data are acquired as raw data. This data is then analyzed, and...

Entropy-Regularized Token-Level Policy Optimization for Large Language Models

February 12, 2024, 9:00 pm

Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. Traditional approaches often depend on meticulously designed prompts, high-quality examples,...

Feed-Forward Neural Networks as a Mixed-Integer Program

February 12, 2024, 9:00 pm

Deep neural networks (DNNs) are widely studied in various applications. A DNN consists of layers of neurons that compute affine combinations, apply nonlinear operations, and produce corresponding...

FL-NAS: Towards Fairness of NAS for Resource Constrained Devices via Large...

February 12, 2024, 9:00 pm

Neural Architecture Search (NAS) has become the de facto tool in the industry for automating the design of deep neural networks for various applications, especially those driven by mobile and edge...

Scaling Intelligent Agents in Combat Simulations for Wargaming

February 12, 2024, 9:00 pm

Remaining competitive in future conflicts with technologically-advanced competitors requires us to accelerate our research and development in artificial intelligence (AI) for wargaming. More...

Comparison of machine learning and statistical approaches for digital...

February 12, 2024, 9:00 pm

Several methods have been proposed for correcting the elevation bias in digital elevation models (DEMs), for example, linear regression. Nowadays, supervised machine learning enables the modelling of...

A Masked language model for multi-source EHR trajectories contextual...

February 12, 2024, 9:00 pm

Using electronic health records data and machine learning to guide future decisions needs to address challenges, including 1) long/short-term dependencies and 2) interactions between diseases and...

Sign Rank Limitations for Attention-Based Graph Decoders

February 12, 2024, 9:00 pm

Inner product-based decoders are among the most influential frameworks used to extract meaningful data from latent embeddings. However, such decoders have shown limitations in representation capacity...

Using remotely sensed data for air pollution assessment

February 12, 2024, 9:00 pm

Air pollution constitutes a global problem of paramount importance that affects not only human health, but also the environment. The existence of spatial and temporal data regarding the concentrations...
