From Uncertainty to Precision: Enhancing Binary Classifier Performance...
The assessment of binary classifier performance traditionally centers on discriminative ability using metrics such as accuracy. However, these metrics often disregard the model's inherent uncertainty,...
HYPO: Hyperspherical Out-of-Distribution Generalization
Out-of-distribution (OOD) generalization is critical for machine learning models deployed in the real world. However, achieving this can be fundamentally challenging, as it requires the ability to...
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic...
Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems. Despite the significant...
Predictive Churn with the Set of Good Models
Machine learning models in modern mass-market applications are often updated over time. One of the foremost challenges faced is that, despite increasing overall performance, these updates may flip...
Universal link predictor by In-context Learning
Link prediction is a crucial task in graph machine learning, where the goal is to infer missing or future links within a graph. Traditional approaches leverage heuristic methods based on widely...
LoRA-drop: Efficient LoRA Parameter Pruning based on Output Evaluation
Low-Rank Adaptation (LoRA) introduces auxiliary parameters for each layer to fine-tune the pre-trained model under limited computing resources. But it still faces challenges of resource consumption...
Model Collapse Demystified: The Case of Regression
In the era of large language models like ChatGPT, the phenomenon of "model collapse" refers to the situation whereby as a model is trained recursively on data generated from previous generations of...
Online Sequential Decision-Making with Unknown Delays
In the field of online sequential decision-making, we address the problem with delays utilizing the framework of online convex optimization (OCO), where the feedback of a decision can arrive with an...
Boundary Exploration for Bayesian Optimization With Unknown Physical Constraints
Bayesian optimization has been successfully applied to optimize black-box functions where the number of evaluations is severely limited. However, in many real-world applications, it is hard or...
Tighter Bounds on the Information Bottleneck with Application to Deep Learning
Deep Neural Nets (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters. The quality of the learned representations impacts the DNN's...
G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding...
Given a graph with textual attributes, we enable users to "chat with their graph": that is, to ask questions about the graph using a conversational interface. In response to a user's questions, our...
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative...
We propose a new algorithm for model-based distributional reinforcement learning (RL), and prove that it is minimax-optimal for approximating return distributions with a generative model (up to...
Foundational Inference Models for Dynamical Systems
Ordinary differential equations (ODEs) underlie dynamical systems which serve as models for a vast number of natural and social phenomena. Yet inferring the ODE that best describes a set of noisy...
Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in...
In the evolving field of machine learning, ensuring fairness has become a critical concern, prompting the development of algorithms designed to mitigate discriminatory outcomes in decision-making...
Identifying architectural design decisions for achieving green ML serving
The growing use of large machine learning models highlights concerns about their increasing computational demands. While the energy consumption of their training phase has received attention, fewer...
Only the Curve Shape Matters: Training Foundation Models for Zero-Shot...
We present General Time Transformer (GTT), an encoder-only style foundation model for zero-shot multivariate time series forecasting. GTT is pretrained on a large dataset of 200M high-quality time...
Weisfeiler-Leman at the margin: When more expressivity matters
The Weisfeiler-Leman algorithm (1-WL) is a well-studied heuristic for the graph isomorphism problem. Recently, the algorithm has played a prominent role in understanding the expressive power of...
TransAxx: Efficient Transformers with Approximate Computing
Vision Transformer (ViT) models, which were recently introduced by the transformer architecture, have been shown to be very competitive and have often become a popular alternative to Convolutional Neural Networks...
NeuralSentinel: Safeguarding Neural Network Reliability and Trustworthiness
The usage of Artificial Intelligence (AI) systems has increased exponentially, thanks to their ability to reduce the amount of data to be analyzed and the user effort while preserving a high rate of...
ClusterTabNet: Supervised clustering method for table detection and table...
We present a novel deep-learning-based method to cluster words in documents which we apply to detect and recognize tables given the OCR output. We interpret table structure bottom-up as a graph of...