Publication Library
Automating Model Search for Large Scale Machine Learning
Description: The proliferation of massive datasets combined with the development of sophisticated analytical techniques has enabled a wide variety of novel applications such as improved product recommendations, automatic image tagging, and improved speech-driven interfaces. A major obstacle to supporting these predictive applications is the challenging and expensive process of identifying and training an appropriate predictive model. Recent efforts aiming to automate this process have focused on single node implementations and have assumed that model training itself is a black box, limiting their usefulness for applications driven by large-scale datasets. In this work, we build upon these recent efforts and propose an architecture for automatic machine learning at scale comprised of a cost-based cluster resource allocation estimator, advanced hyperparameter tuning techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching and optimal resource allocation. The result is TUPAQ, a component of the MLbase system that automatically finds and trains models for a user’s predictive application with comparable quality to those found using exhaustive strategies, but an order of magnitude more efficiently than the standard baseline approach. TUPAQ scales to models trained on Terabytes of data across hundreds of machines.
Created At: 15 December 2024
Updated At: 15 December 2024
AI Sleeper Agents
Description: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.
Created At: 15 December 2024
Updated At: 15 December 2024
The Kelly Criterion in Blackjack Sports Betting and the Stock Market
Description: Kelly Criterion in Blackjack Sports Betting, and the Stock Market
Created At: 15 December 2024
Updated At: 15 December 2024
Introduction to Causal Inference
Description: Causal inference goes beyond prediction by modeling the outcome of interventions and formalizing counterfactual reasoning. Instead of restricting causal conclusions to experiments, causal inference explicates the conditions under which it is possible to draw causal conclusions even from observational data. In this paper, I provide a concise introduction to the graphical approach to causal inference, which uses Directed Acyclic Graphs (DAGs) to visualize, and Structural Causal Models (SCMs) to relate probabilistic and causal relationships. Successively, we climb what Judea Pearl calls the “causal hierarchy” — moving from association to intervention to counterfactuals. I explain how DAGs can help us reason about associations between variables as well as interventions; how the do-calculus leads to a satisfactory definition of confounding, thereby clarifying, among other things, Simpson’s paradox; and how SCMs enable us to reason about what could have been. Lastly, I discuss a number of challenges in applying causal inference in practice.
Created At: 15 December 2024
Updated At: 15 December 2024
Multi-Agent Deep Q-Network with Layer-based Communication Channel for Autonomous Internal Logistics
Description: In smart manufacturing, scheduling autonomous internal logistic vehicles is crucial for optimizing operational efficiency. This paper proposes a multi-agent deep Q-network (MADQN) with a layer-based communication channel (LBCC) to address this challenge. The main goals are to minimize total job tardiness, reduce the number of tardy jobs, and lower vehicle energy consumption. The method is evaluated against nine well-known scheduling heuristics, demonstrating its effectiveness in handling dynamic job shop behaviors like job arrivals and workstation unavailabilities. The approach also proves scalable, maintaining performance across different layouts and larger problem instances, highlighting the robustness and adaptability of MADQN with LBCC in smart manufacturing.
Created At: 15 December 2024
Updated At: 15 December 2024