Publication Library
Hunting Tomorrow's Leaders: Using Machine Learning to Forecast S&P 500 Additions and Removal
Description: This study applies machine learning to predict S&P 500 membership changes: key events that profoundly impact investor behavior and market dynamics. Quarterly data from WRDS datasets (2013 onwards) were used, incorporating features such as industry classification, financial data, market data, and corporate governance indicators. Using a Random Forest model, we achieved a test F1 score of 0.85, outperforming logistic regression and SVC models. This research not only showcases the power of machine learning for financial forecasting but also emphasizes model transparency through SHAP analysis and feature engineering. The model's real-world applicability is demonstrated with predicted changes for Q3 2023, such as the addition of Uber (UBER) and the removal of SolarEdge Technologies (SEDG). By incorporating these predictions into a trading strategy, i.e., buying stocks announced for addition and shorting those marked for removal, we anticipate capturing alpha and enhancing investment decision-making, offering valuable insights into index dynamics.
Created At: 19 December 2024
Updated At: 19 December 2024
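Note on the F1 score cited in the entry above: F1 is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation. The function name and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score from counts of true positives (tp),
    false positives (fp), and false negatives (fn)."""
    if tp == 0:
        # With no true positives, precision and recall are zero (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted changes that were real
    recall = tp / (tp + fn)     # share of real changes that were predicted
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT results from the study.
example = f1_score(tp=8, fp=2, fn=2)  # roughly 0.8
```

Because index additions and removals are rare events, F1 is often preferred over plain accuracy in this kind of setting: a model that never predicts a change would score high accuracy but an F1 of zero.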
To VaR or Not to VaR
Description: We consider economic obstacles that limit the reliability and accuracy of value-at-risk (VaR). Investors who manage large market transactions should take into account the impact of the randomness of large trade volumes on predictions of price probability and VaR assessments. We introduce market-based probabilities of price and return that depend on the randomness of market trade values and volumes. In contrast, the conventional frequency-based price probability describes the case of constant trade volumes. We derive the dependence of market-based price volatility on the volatilities and correlation of trade values and volumes. In the coming years, that dependence will limit the accuracy of price probability predictions to Gaussian approximations, and even the forecasts of market-based price volatility will be inaccurate and highly uncertain.
Created At: 19 December 2024
Updated At: 19 December 2024
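The contrast drawn in the entry above, between frequency-based price statistics and statistics that account for trade volumes, can be illustrated with a toy calculation. This sketch assumes the volume-weighted average price as a simple stand-in for a volume-aware mean. It is not the paper's derivation, and the function names and trade data are made up for illustration.

```python
def frequency_mean(prices):
    """Plain average: every trade counts equally, as if volumes were constant."""
    return sum(prices) / len(prices)


def volume_weighted_mean(prices, volumes):
    """Total traded value divided by total traded volume."""
    total_value = sum(p * u for p, u in zip(prices, volumes))
    return total_value / sum(volumes)


# Made-up trades: three small trades and one very large trade at a higher price.
prices = [100.0, 101.0, 99.0, 110.0]
volumes = [10.0, 10.0, 10.0, 1000.0]

plain = frequency_mean(prices)                    # treats all four trades alike
weighted = volume_weighted_mean(prices, volumes)  # dominated by the large trade
```

With equal volumes the two averages coincide. They diverge only when trade sizes vary, which is the situation the entry above is concerned with.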
Expressions of Market-Based Correlations Between Prices and Returns of Two Assets
Description: This paper derives the expressions of correlations between prices of two assets, returns of two assets, and price-return correlations of two assets that depend on statistical moments and correlations of the current values, past values, and volumes of their market trades. The usual frequency-based expressions of correlations of time series of prices and returns describe a partial case of our model when all trade volumes and past trade values are constant. Such assumptions are rather far from market reality, and their use results in excess losses and wrong forecasts. Traders, banks, and funds that perform multi-million market transactions or manage billion-valued portfolios should consider the impact of large trade volumes on market prices and returns. The use of the market-based correlations of prices and returns of two assets is mandatory for them. The development of macroeconomic models and market forecasts, such as those being created by BlackRock's Aladdin, JP Morgan, and the U.S. Fed, is impossible without the use of market-based correlations of prices and returns of two assets.
Created At: 19 December 2024
Updated At: 19 December 2024
Neuroscience of Flow States in the Modern World
Description: A Review on the Role of the Neuroscience of Flow States in the Modern World
Created At: 19 December 2024
Updated At: 19 December 2024
Beyond Labeling Oracles: What Does It Mean to Steal ML Models?
Description: Model extraction (ME) attacks are designed to steal trained models with only query access, as is often provided through APIs that ML-as-a-Service providers offer. Machine Learning (ML) models are expensive to train, in part because data is hard to obtain, and a primary incentive for model extraction is to acquire a model while incurring less cost than training from scratch. Literature on model extraction commonly claims or presumes that the attacker is able to save on both data acquisition and labeling costs. We thoroughly evaluate this assumption and find that the attacker often does not. This is because current attacks implicitly rely on the adversary being able to sample from the victim model’s data distribution. We thoroughly research factors influencing the success of model extraction. We discover that prior knowledge of the attacker, i.e., access to in-distribution data, dominates other factors like the attack policy the adversary follows to choose which queries to make to the victim model API. Our findings urge the community to redefine the adversarial goals of ME attacks, as current evaluation methods misinterpret ME performance.
Created At: 15 December 2024
Updated At: 15 December 2024