Publication Library
Anomaly Detection in Time Series Data Using Reinforcement Learning, Variational Autoencoder, and Active Learning
Description: This paper presents a novel approach to detecting anomalies in time series data, a task that is pivotal in domains such as data centers, sensor networks, and finance. Traditional methods often struggle with manual parameter tuning and cannot adapt to new anomaly types. Our method overcomes these limitations by integrating Deep Reinforcement Learning (DRL) with a Variational Autoencoder (VAE) and Active Learning. By incorporating a Long Short-Term Memory (LSTM) network, our approach models sequential data and its dependencies effectively, allowing for the detection of new anomaly classes with minimal labeled data. Evaluations on real-world datasets show that our combination of DRL-VAE and Active Learning significantly improves on existing methods.
Created At: 07 April 2025
Updated At: 07 April 2025
Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions
Description: In this work, we introduce MedAgentSim, an open-source simulated clinical environment with doctor, patient, and measurement agents designed to evaluate and enhance LLM performance in dynamic diagnostic settings. Unlike prior approaches, our framework requires doctor agents to actively engage with patients through multi-turn conversations, requesting relevant medical examinations (e.g., temperature, blood pressure, ECG) and imaging results (e.g., MRI, X-ray) from a measurement agent to mimic the real-world diagnostic process. Additionally, we incorporate self-improvement mechanisms that allow models to iteratively refine their diagnostic strategies. We enhance LLM performance in our simulated setting by integrating multi-agent discussions, chain-of-thought reasoning, and experience-based knowledge retrieval, facilitating progressive learning as doctor agents interact with more patients. We also introduce an evaluation benchmark for assessing the LLM's ability to engage in dynamic, context-aware diagnostic interactions. While MedAgentSim is fully automated, it also supports a user-controlled mode, enabling human interaction with either the doctor or patient agent. Comprehensive evaluations in various simulated diagnostic scenarios demonstrate the effectiveness of our approach. Our code, simulation tool, and benchmark are available at https://medagentsim.netlify.app/
Created At: 07 April 2025
Updated At: 07 April 2025
LLM Post-Training - A Deep Dive into Reasoning Large Language Models
Description: Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Pretraining on vast web-scale data has laid the foundation for these models, yet the research community is now increasingly shifting focus toward post-training techniques to achieve further breakthroughs. While pretraining provides a broad linguistic foundation, post-training methods enable LLMs to refine their knowledge, improve reasoning, enhance factual accuracy, and align more effectively with user intents and ethical considerations. Fine-tuning, reinforcement learning, and test-time scaling have emerged as critical strategies for optimizing LLM performance, ensuring robustness, and improving adaptability across various real-world tasks. This survey provides a systematic exploration of post-training methodologies, analyzing their role in refining LLMs beyond pretraining and addressing key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs. We highlight emerging directions in model alignment, scalable adaptation, and inference-time reasoning, and outline future research directions. We also provide a public repository to continually track developments in this fast-evolving field: https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
Created At: 07 April 2025
Updated At: 07 April 2025
Improving Sense-Making with Artificial Intelligence
Description: The report identifies 20 challenges associated with scaling sense-making processes in five key areas: collection orchestration; data access and sharing; data fusion and analysis; model management; and skills and training. The report suggests that AI capabilities, such as natural language processing, computer vision, planning systems, prediction/classification, and expert systems, can be combined to address these challenges.
Created At: 05 April 2025
Updated At: 05 April 2025
State-of-play and future trends on the development of oversight frameworks for emerging technologies - Part 1
Description: As technologies become more pervasive and form a critical aspect of our societal infrastructure, governance and wider oversight mechanisms have a key role to play in ensuring that benefits from technology are maximised and risks are managed proactively. The goal of technology oversight is to ensure that technology is developed, deployed and used in a responsible and ethical manner, and that it does not pose undue risks or harm to individuals or society as a whole. Wellcome commissioned RAND Europe to undertake a study on the state-of-play and future trends on the development of oversight frameworks for emerging technologies. The specific objective of the study is to identify and analyse a suite of oversight frameworks and mechanisms (including associated emerging trends and novel approaches) that are in use, in development or under debate in different jurisdictions across the globe for a set of emerging technologies. The technologies of interest include genomics (specifically engineering biology), human embryology, organoids, neurotechnology, artificial intelligence (AI) (specifically its application and use as a research tool) and data platforms. The study findings are presented in two related documents: the global technology landscape review report and the technology oversight report (this report). The two reports should be read alongside each other. This report examines notable oversight mechanisms that are either established or under development across a selection of global jurisdictions, offering key learning and insights that could inform future technology oversight discussions.
Created At: 05 April 2025
Updated At: 05 April 2025