Publication Library

DSA Committee Activities and Scope

Description: About DSA Committees

Created At: 11 May 2025

Updated At: 11 May 2025

X-WAV
Artificial General Intelligence's Five Hard National Security Problems

Description: The potential emergence of artificial general intelligence (AGI) is plausible and should be taken seriously by the U.S. national security community. Yet the pace and potential progress of AGI's emergence — as well as the composition of a post-AGI future — is shrouded in a cloud of uncertainty. This poses a challenge for strategists and policymakers trying to discern what potential threats and opportunities might emerge on the path to AGI and once AGI is achieved. This paper puts forth five hard problems that AGI's emergence presents for U.S. national security: (1) wonder weapons, (2) systemic shifts in power, (3) nonexperts empowered to develop weapons of mass destruction, (4) artificial entities with agency, and (5) instability. In much of the discourse on AGI, policymakers and analysts argue past one another with differing opinions on which issues deserve immediate focus and resources. Yet the authors have observed that proposals to advance progress on one problem can undermine progress on — if not outright ignore — another. They offer these five hard national security problems to help structure such discourse by providing a common language to communicate about risks and opportunities of AGI and a rubric to evaluate alternative strategies.

Created At: 11 May 2025

Updated At: 11 May 2025

Ψ-Arena - Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback

Description: Large language models (LLMs) have shown promise in providing scalable mental health support, while evaluating their counseling capability remains crucial to ensure both efficacy and safety. Existing evaluations are limited by the static assessment that focuses on knowledge tests, the single perspective that centers on user experience, and the open-loop framework that lacks actionable feedback. To address these issues, we propose Ψ-Arena, an interactive framework for comprehensive assessment and optimization of LLM-based counselors, featuring three key characteristics: (1) Realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients, (2) Tripartite evaluation that integrates assessments from the client, counselor, and supervisor perspectives, and (3) Closed-loop optimization that iteratively improves LLM counselors using diagnostic feedback. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives. Moreover, reflection-based optimization results in up to a 141% improvement in counseling performance. We hope PsychoArena provides a foundational resource for advancing reliable and human-aligned LLM applications in mental healthcare.

Created At: 09 May 2025

Updated At: 09 May 2025

Benchmarking LLMs' Swarm Intelligence

Description: See: https://github.com/x66ccff/swarmbench Large Language Models (LLMs) show potential for complex reasoning, yet their capacity for emergent coordination in Multi-Agent Systems (MAS) when operating under strict constraints (such as limited local perception and communication, characteristic of natural swarms) remains largely unexplored, particularly concerning the nuances of swarm intelligence. Existing benchmarks often do not fully capture the unique challenges of decentralized coordination that arise when agents operate with incomplete spatio-temporal information. To bridge this gap, we introduce SwarmBench, a novel benchmark designed to systematically evaluate the swarm intelligence capabilities of LLMs acting as decentralized agents. SwarmBench features five foundational MAS coordination tasks within a configurable 2D grid environment, forcing agents to rely primarily on local sensory input (k × k view) and local communication. We propose metrics for coordination effectiveness and analyze emergent group dynamics. Evaluating several leading LLMs in a zero-shot setting, we find significant performance variations across tasks, highlighting the difficulties posed by local information constraints. While some coordination emerges, results indicate limitations in robust planning and strategy formation under uncertainty in these decentralized scenarios. Assessing LLMs under swarm-like conditions is crucial for realizing their potential in future decentralized systems. We release SwarmBench as an open, extensible toolkit, built upon a customizable and scalable physical system with defined mechanical properties. It provides environments, prompts, evaluation scripts, and the comprehensive experimental datasets generated, aiming to foster reproducible research into LLM-based MAS coordination and the theoretical underpinnings of Embodied MAS. Our code repository is available at https://github.com/x66ccff/swarmbench

Created At: 09 May 2025

Updated At: 09 May 2025

Multi-agent Embodied AI - Advances and Future Directions

Description: Embodied artificial intelligence (Embodied AI) plays a pivotal role in the application of advanced technologies in the intelligent era, where AI systems are integrated with physical bodies that enable them to perceive, reason, and interact with their environments. Through the use of sensors for input and actuators for action, these systems can learn and adapt based on real-world feedback, allowing them to perform tasks effectively in dynamic and unpredictable environments. As techniques such as deep learning (DL), reinforcement learning (RL), and large language models (LLMs) mature, embodied AI has become a leading field in both academia and industry, with applications spanning robotics, healthcare, transportation, and manufacturing. However, most research has focused on single-agent systems that often assume static, closed environments, whereas real-world embodied AI must navigate far more complex scenarios. In such settings, agents must not only interact with their surroundings but also collaborate with other agents, necessitating sophisticated mechanisms for adaptation, real-time learning, and collaborative problem-solving. Despite increasing interest in multi-agent systems, existing research remains narrow in scope, often relying on simplified models that fail to capture the full complexity of dynamic, open environments for multi-agent embodied AI. Moreover, no comprehensive survey has systematically reviewed the advancements in this area. As embodied AI rapidly evolves, it is crucial to deepen our understanding of multi-agent embodied AI to address the challenges presented by real-world applications. To fill this gap and foster further development in the field, this paper reviews the current state of research, analyzes key contributions, and identifies challenges and future directions, providing insights to guide innovation and progress in this field.

Created At: 09 May 2025

Updated At: 09 May 2025
