Publication Library

LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models

Description: Physical reasoning is an important skill for robotic agents operating in the real world. However, solving such reasoning problems often involves hypothesizing and reflecting over complex multi-body interactions under a multitude of physical forces, and learning all such interactions poses a significant hurdle for state-of-the-art machine learning frameworks, including large language models (LLMs). To study this problem, we propose a new physical reasoning task and dataset, dubbed TraySim. The task involves predicting the dynamics of several objects on a tray that receives an external impact; the domino effect of the ensuing object interactions and their dynamics offers a challenging yet controlled setup, and the goal of reasoning is to infer the stability of the objects after the impact. To solve this complex physical reasoning task, we present LLMPhy, a zero-shot black-box optimization framework that leverages the physics knowledge and program synthesis abilities of LLMs and synergizes them with the world models built into modern physics engines. Specifically, LLMPhy uses an LLM to generate code that iteratively estimates the physical hyperparameters of the system (friction, damping, layout, etc.) via an implicit analysis-by-synthesis approach with a (non-differentiable) simulator in the loop, and then uses the inferred parameters to imagine the dynamics of the scene toward solving the reasoning task. To show the effectiveness of LLMPhy, we present experiments on our TraySim dataset to predict the steady-state poses of the objects. Our results show that the combination of the LLM and the physics engine leads to state-of-the-art zero-shot physical reasoning performance, while demonstrating superior convergence over standard black-box optimization methods and better estimation of the physical parameters.
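
The simulator-in-the-loop pattern the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: `simulate` and `propose_parameters` are hypothetical stand-ins for the physics engine and the LLM proposal step, and the toy quadratic loss replaces the real trajectory discrepancy.

```python
# Hedged sketch of an LLMPhy-style analysis-by-synthesis loop.
# Both functions below are stand-ins: the paper uses a real physics engine
# and an LLM prompted with the optimization trace.
import random

def simulate(params):
    """Stand-in for the non-differentiable simulator; returns a scalar
    discrepancy between simulated and observed dynamics (toy quadratic)."""
    target = {"friction": 0.4, "damping": 0.1}
    return sum((params[k] - target[k]) ** 2 for k in target)

def propose_parameters(history):
    """Stand-in for the LLM proposal step; here we just perturb the best
    candidate seen so far instead of synthesizing code from physics knowledge."""
    if not history:
        return {"friction": random.uniform(0, 1), "damping": random.uniform(0, 1)}
    best_params, _ = min(history, key=lambda h: h[1])
    return {k: max(0.0, v + random.gauss(0, 0.05)) for k, v in best_params.items()}

history = []
for _ in range(50):
    params = propose_parameters(history)   # analysis: hypothesize parameters
    loss = simulate(params)                # synthesis: roll out the simulator
    history.append((params, loss))

best_params, best_loss = min(history, key=lambda h: h[1])
print(best_params, best_loss)
# With the inferred parameters fixed, the same simulator would then be rolled
# forward from the impact to predict which objects remain upright.
```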

Created At: 13 November 2024

Updated At: 13 November 2024

Automation of the NIST Cryptographic Module Validation Program: September 2024 Status Report

Description: The Cryptographic Module Validation Program (CMVP) validates third-party assertions that cryptographic module implementations satisfy the requirements of Federal Information Processing Standards (FIPS) Publication 140-3, Security Requirements for Cryptographic Modules. The NIST National Cybersecurity Center of Excellence (NCCoE) has undertaken the Automated Cryptographic Module Validation Project (ACMVP) to support improvement in the efficiency and timeliness of CMVP operations and processes. The goal is to demonstrate a suite of automated tools that would permit organizations to perform testing of their cryptographic products according to the requirements of FIPS 140-3, then directly report the results to NIST using appropriate protocols. This is a status report of progress made so far with the ACMVP and the planned next steps for the project.

Created At: 04 November 2024

Updated At: 04 November 2024

Private, Augmentation-Robust and Task-Agnostic Data Valuation Approach for Data Marketplace

Description: Evaluating datasets in data marketplaces, where the buyer aims to purchase valuable data, is a critical challenge. In this paper, we introduce an innovative task-agnostic data valuation method called PriArTa, which computes the distance between the distribution of the buyer's existing dataset and that of the seller's dataset, allowing the buyer to determine how effectively the new data can enhance its dataset. PriArTa is communication-efficient, enabling the buyer to evaluate datasets without needing access to the entire dataset from each seller. Instead, the buyer requests that sellers perform specific preprocessing on their data and send back the results. Using this information and a scoring metric, the buyer can evaluate the dataset. The preprocessing is designed to let the buyer compute the score while preserving the privacy of each seller's dataset, mitigating the risk of information leakage before the purchase. A key feature of PriArTa is its robustness to common data transformations, ensuring consistent value assessment and reducing the risk of purchasing redundant data. The effectiveness of PriArTa is demonstrated through experiments on real-world image datasets, showing its ability to perform privacy-preserving, augmentation-robust data valuation in data marketplaces.
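
The buyer/seller exchange described above follows a general pattern that can be illustrated with a small sketch. The mean/covariance summary and the Frechet-style distance below are assumptions chosen for illustration only; they are not PriArTa's actual privacy-preserving preprocessing or scoring metric.

```python
# Illustrative sketch of the general pattern: sellers send only a compact
# summary of their (preprocessed) features, and the buyer scores each seller
# against its own summary. Not PriArTa's actual construction.
import numpy as np
from scipy.linalg import sqrtm

def summarize(features: np.ndarray):
    """Seller-side preprocessing: reduce raw features to mean and covariance,
    so the raw data never leaves the seller."""
    return features.mean(axis=0), np.cov(features, rowvar=False)

def score(summary_a, summary_b) -> float:
    """Buyer-side scoring: a Frechet-style distance between the two summaries;
    a small score suggests the candidate data is largely redundant."""
    (mu_a, cov_a), (mu_b, cov_b) = summary_a, summary_b
    covmean = sqrtm(cov_a @ cov_b).real
    return float(np.sum((mu_a - mu_b) ** 2) + np.trace(cov_a + cov_b - 2 * covmean))

rng = np.random.default_rng(0)
buyer_summary = summarize(rng.normal(0.0, 1.0, size=(500, 8)))   # buyer's data
seller_summary = summarize(rng.normal(0.5, 1.0, size=(500, 8)))  # candidate data
print(score(buyer_summary, seller_summary))
```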

Created At: 04 November 2024

Updated At: 04 November 2024

Multi-Agent Deep Q-Network with Layer-based Communication Channel for Autonomous Internal Logistics Vehicle Scheduling in Smart Manufacturing

Description: In smart manufacturing, scheduling autonomous internal logistic vehicles is crucial for optimizing operational efficiency. This paper proposes a multi-agent deep Q-network (MADQN) with a layer-based communication channel (LBCC) to address this challenge. The main goals are to minimize total job tardiness, reduce the number of tardy jobs, and lower vehicle energy consumption. The method is evaluated against nine well-known scheduling heuristics, demonstrating its effectiveness in handling dynamic job shop behaviors like job arrivals and workstation unavailabilities. The approach also proves scalable, maintaining performance across different layouts and larger problem instances, highlighting the robustness and adaptability of MADQN with LBCC in smart manufacturing.
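
The per-agent learning rule underlying a deep Q-network can be sketched as below. The state dimension, network shape, and random minibatch are placeholders for illustration; the paper's layer-based communication channel (LBCC) and its job-shop state and reward encodings are not reproduced here.

```python
# Minimal per-agent DQN update; placeholder dimensions and random data.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Per-agent Q-network mapping a vehicle/job-shop state to action values."""
    def __init__(self, state_dim: int = 16, n_actions: int = 5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))
    def forward(self, s):
        return self.net(s)

q, q_target = QNet(), QNet()
q_target.load_state_dict(q.state_dict())
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
gamma = 0.99

# One gradient step on a (state, action, reward, next_state) minibatch for one
# agent; in the scheduling setting the reward would encode tardiness and
# energy-consumption terms, and states would include shared-channel information.
s = torch.randn(32, 16)
a = torch.randint(0, 5, (32,))
r = torch.randn(32)
s_next = torch.randn(32, 16)
with torch.no_grad():
    target = r + gamma * q_target(s_next).max(dim=1).values
pred = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
loss = nn.functional.mse_loss(pred, target)
opt.zero_grad(); loss.backward(); opt.step()
```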

Created At: 04 November 2024

Updated At: 04 November 2024

An Introduction to Causal Inference

Description: Causal inference goes beyond prediction by modeling the outcome of interventions and formalizing counterfactual reasoning. Instead of restricting causal conclusions to experiments, causal inference explicates the conditions under which it is possible to draw causal conclusions even from observational data. In this paper, I provide a concise introduction to the graphical approach to causal inference, which uses Directed Acyclic Graphs (DAGs) to visualize, and Structural Causal Models (SCMs) to relate probabilistic and causal relationships. Successively, we climb what Judea Pearl calls the “causal hierarchy” — moving from association to intervention to counterfactuals. I explain how DAGs can help us reason about associations between variables as well as interventions; how the do-calculus leads to a satisfactory definition of confounding, thereby clarifying, among other things, Simpson’s paradox; and how SCMs enable us to reason about what could have been. Lastly, I discuss a number of challenges in applying causal inference in practice.
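
The association-versus-intervention distinction discussed above can be made concrete with a tiny simulated SCM. The structural equations below are invented for illustration (a confounder Z influencing both X and Y) and are not taken from the paper; by construction the true causal effect of X on Y is 1.0.

```python
# Worked example: observational association vs. the do-operator in a small SCM
# with confounding (Z -> X, Z -> Y, X -> Y). Equations are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def sample(do_x=None):
    """Draw from the SCM; passing do_x simulates the do-operator by cutting
    the Z -> X edge and setting X by fiat."""
    z = rng.binomial(1, 0.5, n)                      # confounder
    x = rng.binomial(1, 0.2 + 0.6 * z) if do_x is None else np.full(n, do_x)
    y = 1.0 * x + 2.0 * z + rng.normal(0.0, 1.0, n)  # true effect of X is 1.0
    return x, y

# Associational contrast from observational data: biased upward by Z.
x, y = sample()
assoc = y[x == 1].mean() - y[x == 0].mean()

# Interventional contrast under do(X=1) vs. do(X=0): recovers the true effect.
_, y1 = sample(do_x=1)
_, y0 = sample(do_x=0)
causal = y1.mean() - y0.mean()

print(f"association: {assoc:.2f}   intervention: {causal:.2f}   (true effect: 1.0)")
```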

Created At: 04 November 2024

Updated At: 04 November 2024
