Publication Library

Real-World Evidence, Causal Inference, and Machine Learning

Description: The current focus on real-world evidence (RWE) is occurring at a time when at least two major trends are converging. The first is the progress made in observational research design and methods over the past decade. The second is the development of numerous large observational healthcare databases around the world, which is creating repositories of improved data assets to support observational research. Objective: This paper examines the implications of the improvements in observational methods and research design, as well as the growing availability of real-world data, for the quality of RWE. These developments have been very positive. On the other hand, unstructured data, such as medical notes, and the sparsity of data created by merging multiple data assets are not easily handled by traditional health services research statistical methods. In response, machine learning methods are gaining increased traction as potential tools for analyzing massive, complex datasets. Conclusions: Machine learning methods have traditionally been used for classification and prediction rather than causal inference. The prediction capabilities of machine learning are valuable by themselves. However, using machine learning for causal inference is still evolving. Machine learning can be used for hypothesis generation, followed by the application of traditional causal methods. But relatively recent developments, such as targeted maximum likelihood methods, are directly integrating machine learning with causal inference.
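
The abstract above mentions targeted maximum likelihood methods as a way to integrate machine learning with causal inference. As a rough sketch of that general idea (not the paper's own method), the snippet below estimates an average treatment effect with a simplified doubly robust (AIPW) estimator, a close relative of targeted maximum likelihood; the simulated data, variable names, and model choices are all illustrative assumptions.

```python
# Illustrative sketch only: a simplified augmented inverse-probability-weighted (AIPW)
# estimator of an average treatment effect, using ML models for the nuisance functions.
# Data, variable names, and model choices are assumptions, not taken from the paper.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 5))                      # baseline covariates
propensity = 1 / (1 + np.exp(-X[:, 0]))          # true treatment model (simulated)
A = rng.binomial(1, propensity)                  # treatment indicator
Y = 2.0 * A + X[:, 1] + rng.normal(size=n)       # outcome with true effect = 2.0

# Machine-learned nuisance models: propensity score and outcome regressions.
ps_model = GradientBoostingClassifier().fit(X, A)
e_hat = np.clip(ps_model.predict_proba(X)[:, 1], 0.01, 0.99)

out_treated = GradientBoostingRegressor().fit(X[A == 1], Y[A == 1])
out_control = GradientBoostingRegressor().fit(X[A == 0], Y[A == 0])
mu1, mu0 = out_treated.predict(X), out_control.predict(X)

# Doubly robust estimate of the average treatment effect.
ate = np.mean(mu1 - mu0
              + A * (Y - mu1) / e_hat
              - (1 - A) * (Y - mu0) / (1 - e_hat))
print(f"Estimated ATE: {ate:.2f} (true effect: 2.0)")
```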

Created At: 14 December 2024

Updated At: 14 December 2024

Using Machine Learning Applied to Real-World Healthcare Data for Predictive Analytics: An Applied Example in Bariatric Surgery

Description: Objectives: Laparoscopic metabolic surgery (MxS) can lead to remission of type 2 diabetes (T2D); however, treatment response to MxS can be heterogeneous. Here, we demonstrate an open-source predictive analytics platform that applies machine-learning techniques to a common data model; we develop and validate a predictive model of antihyperglycemic medication cessation (validated proxy for A1c control) in patients with treated T2D who underwent MxS. Methods: We selected patients meeting the following criteria in 2 large US healthcare claims databases (Truven Health MarketScan Commercial [CCAE]; Optum Clinformatics [Optum]): underwent MxS between January 1, 2007, and October 1, 2013 (first = index); aged ≥18 years; continuous enrollment 180 days pre-index (baseline) to 730 days post-index; baseline T2D diagnosis and treatment. The outcome was no antihyperglycemic medication treatment from 365 to 730 days after MxS. A regularized logistic regression model was trained using the following candidate predictor categories measured at baseline: demographics, conditions, medications, measurements, and procedures. A 75% to 25% split of the CCAE group was used for model training and testing; the Optum group was used for external validation. Results: 13,050 (CCAE) and 3477 (Optum) patients met the study inclusion criteria. Antihyperglycemic medication cessation rates were 72.9% (CCAE) and 70.8% (Optum). The model possessed good internal discriminative accuracy (area under the curve [AUC] = 0.778 [95% CI = 0.761-0.795] in the CCAE test set, N = 3527) and transportability (external AUC = 0.759 [95% CI = 0.741-0.777] in Optum, N = 3477). Conclusion: The application of machine learning techniques to real-world healthcare data can yield useful predictive models to assist patient selection. In future practice, establishment of prerequisite technological infrastructure will be needed to implement such models for real-world decision support.
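
The workflow described above (regularized logistic regression, a 75%/25% train/test split, and AUC-based evaluation) can be sketched roughly as follows; synthetic data stands in for the claims databases, and the feature set and parameters are illustrative assumptions rather than the study's actual specification.

```python
# Illustrative sketch of the described workflow: regularized logistic regression,
# 75/25 train-test split, and AUC evaluation. Synthetic data stands in for the claims data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for baseline predictors (demographics, conditions, medications, ...).
X, y = make_classification(n_samples=13050, n_features=200, n_informative=30,
                           weights=[0.27, 0.73], random_state=42)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Regularized (LASSO-style) logistic regression, as in the study's model class.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Internal test AUC: {auc:.3f}")
# External validation would repeat the AUC calculation on an independent database.
```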

Created At: 14 December 2024

Updated At: 14 December 2024

Learning Multiple Initial Solutions to Optimization Problems

Description: Sequentially solving similar optimization problems under strict runtime constraints is essential for many applications, such as robot control, autonomous driving, and portfolio management. The performance of local optimization methods in these settings is sensitive to the initial solution: poor initialization can lead to slow convergence or suboptimal solutions. To address this challenge, we propose learning to predict multiple diverse initial solutions given parameters that define the problem instance. We introduce two strategies for utilizing multiple initial solutions: (i) a single-optimizer approach, where the most promising initial solution is chosen using a selection function, and (ii) a multiple-optimizers approach, where several optimizers, potentially run in parallel, are each initialized with a different solution, with the best solution chosen afterward. We validate our method on three optimal control benchmark tasks: cart-pole, reacher, and autonomous driving, using different optimizers: DDP, MPPI, and iLQR. We find significant and consistent improvement with our method across all evaluation settings and demonstrate that it efficiently scales with the number of initial solutions required. The code is available at https://github.com/EladSharony/miso.
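
As a rough illustration of the multiple-optimizers strategy described above (not the authors' implementation, which is available at the linked repository), the sketch below runs a local optimizer from several candidate initial solutions and keeps the best result; the toy objective and the way candidates are generated are assumptions.

```python
# Illustrative sketch of the "multiple-optimizers" strategy: run a local optimizer from
# several candidate initial solutions and keep the best result. The toy objective and the
# candidate generator are assumptions; see the authors' repository for their method.
import numpy as np
from scipy.optimize import minimize

def objective(x):
    """Toy non-convex objective with several local minima."""
    return np.sin(3 * x[0]) + 0.1 * x[0] ** 2 + np.cos(2 * x[1]) + 0.1 * x[1] ** 2

# Stand-in for a learned predictor: here we simply sample diverse candidate initializations.
rng = np.random.default_rng(0)
candidate_inits = rng.uniform(-3, 3, size=(8, 2))

# Initialize one optimizer per candidate (these could run in parallel), then select the best.
results = [minimize(objective, x0, method="L-BFGS-B") for x0 in candidate_inits]
best = min(results, key=lambda r: r.fun)
print(f"Best solution: {best.x}, objective value: {best.fun:.3f}")
```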

Created At: 14 December 2024

Updated At: 14 December 2024

Enhancing literature review with LLM and NLP methods. Algorithmic trading case

Description: This study utilizes machine learning algorithms to analyze and organize knowledge in the field of algorithmic trading. By filtering a dataset of 136 million research papers, we identified 14,342 relevant articles published between 1956 and Q1 2020. We compare traditional practices—such as keyword-based algorithms and embedding techniques—with state-of-the-art topic modeling methods that employ dimensionality reduction and clustering. This comparison allows us to assess the popularity and evolution of different approaches and themes within algorithmic trading. We demonstrate the usefulness of Natural Language Processing (NLP) in the automatic extraction of knowledge, highlighting the new possibilities created by the latest iterations of Large Language Models (LLMs) like ChatGPT. The rationale for focusing on this topic stems from our analysis, which reveals that research articles on algorithmic trading are increasing at a faster rate than the overall number of publications. While stocks and main indices comprise more than half of all assets considered, certain asset classes, such as cryptocurrencies, exhibit a much stronger growth trend. Machine learning models have become the most popular methods in recent years. The study demonstrates the efficacy of LLMs in refining datasets and addressing intricate questions about the analyzed articles, such as comparing the efficiency of different models. Our research shows that by decomposing tasks into smaller components and incorporating reasoning steps, we can effectively tackle complex questions supported by case analyses. This approach contributes to a deeper understanding of algorithmic trading methodologies and underscores the potential of advanced NLP techniques in literature reviews.
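
A compact sketch of the kind of pipeline compared in the study, vectorizing abstracts, reducing dimensionality, and clustering them into topics, is shown below using scikit-learn components; the sample texts and parameter choices are illustrative assumptions, not the study's actual corpus or settings.

```python
# Illustrative sketch of a topic-modeling pipeline of the kind compared in the study:
# vectorize article abstracts, reduce dimensionality, and cluster into topics.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

abstracts = [
    "High-frequency trading strategies for equity index futures",
    "Deep reinforcement learning for cryptocurrency portfolio allocation",
    "GARCH-based volatility forecasting for stock indices",
    "Sentiment analysis of financial news for algorithmic trading signals",
]

# TF-IDF features -> low-dimensional representation -> cluster assignments (topics).
tfidf = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
reduced = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
topics = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
print(topics)
```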

Created At: 14 December 2024

Updated At: 14 December 2024

The Hybrid Forecast of S&P 500 Volatility ensembled from VIX, GARCH and LSTM models

Description: Predicting the S&P 500 index’s volatility is crucial for investors and financial analysts, as it helps assess market risk and make informed investment decisions. Volatility represents the level of uncertainty or risk related to the size of changes in a security’s value, making it an essential indicator for financial planning. This study explores four methods to improve the accuracy of volatility forecasts for the S&P 500: the established GARCH model, known for capturing historical volatility patterns; an LSTM network that utilizes past volatility and log returns; a hybrid LSTM-GARCH model that combines the strengths of both approaches; and an advanced version of the hybrid model that also factors in the VIX index to gauge market sentiment. The analysis is based on a daily dataset that includes data for the S&P 500 and the VIX index, covering the period from January 3, 2000, to December 21, 2023. Through rigorous testing and comparison, we found that machine learning approaches, particularly the hybrid LSTM models, significantly outperform the traditional GARCH model. Including the VIX index in the hybrid model further enhances its forecasting ability by incorporating real-time market sentiment. The results of this study offer valuable insights for achieving more accurate volatility predictions, enabling better risk management and strategic investment decisions in the volatile environment of the S&P 500.
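
The abstract does not specify the exact hybrid architecture, but one plausible sketch of an LSTM-GARCH hybrid with a VIX-style feature is shown below; the simulated series, library choices (arch, TensorFlow/Keras), and hyperparameters are assumptions rather than the study's specification.

```python
# Illustrative sketch of a hybrid LSTM-GARCH volatility model with a VIX-style feature.
# Simulated series stand in for S&P 500 returns and the VIX; the architecture and
# hyperparameters are assumptions, not the study's exact specification.
import numpy as np
import pandas as pd
from arch import arch_model
import tensorflow as tf

rng = np.random.default_rng(0)
n = 2000
returns = pd.Series(rng.normal(0, 1, n))          # stand-in for S&P 500 log returns (%)
vix = pd.Series(15 + 5 * rng.random(n))           # stand-in for the VIX index

# Step 1: fit GARCH(1,1) and extract the conditional volatility series.
garch_res = arch_model(returns, vol="Garch", p=1, q=1).fit(disp="off")
garch_vol = garch_res.conditional_volatility

# Step 2: assemble features (past returns, GARCH volatility, VIX) into sliding windows
# that predict next-day realized volatility (here proxied by the absolute return).
features = np.column_stack([returns, garch_vol, vix])
target = returns.abs().values
lookback = 21
X = np.array([features[i - lookback:i] for i in range(lookback, n - 1)])
y = target[lookback:n - 1]

# Step 3: small LSTM head on top of the engineered features.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(lookback, features.shape[1])),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)
print(model.predict(X[-1:], verbose=0))
```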

Created At: 14 December 2024

Updated At: 14 December 2024
