My research involves the development of new machine learning and econometric methodologies for empirical finance, asset pricing and climate risk applications. More specifically, my research interests are threefold:
a) I focus on how to measure and hedge climate risk; b) I present new machine learning methods for empirical asset pricing models based on big data; and c) I develop multivariate GARCH models and shrinkage estimation techniques for large-dimensional covariance matrices and factor models.
Journal Publications (8):
In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in sample estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors. To study them, we let 164 teams test six hypotheses on the same sample. We find that non-standard errors are sizeable, on par with standard errors. Their size (i) co-varies only weakly with team merits, reproducibility, or peer rating, (ii) declines significantly after peer-feedback, and (iii) is underestimated by participants.
Conditional heteroskedasticity of the error terms is a common occurrence in financial factor models, such as the CAPM and Fama-French factor models. This feature necessitates the use of heteroskedasticity consistent (HC) standard errors to make valid inference for regression coefficients. In this paper, we show that using weighted least squares (WLS) or adaptive least squares (ALS) to estimate model parameters generally leads to smaller HC standard errors compared to ordinary least squares (OLS), which translates into improved inference in the form of shorter confidence intervals and more powerful hypothesis tests. In an extensive empirical analysis based on historical stock returns and commonly used factors, we find that conditional heteroskedasticity is pronounced and that WLS and ALS can dramatically shorten confidence intervals compared to OLS, especially during times of financial turmoil.
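The comparison above can be illustrated with a minimal NumPy sketch on simulated data. This is not the paper's implementation: the data-generating process, the variable names, and the assumption that the true error variances are known (so they can serve directly as WLS weights) are all illustrative. In practice the skedastic function would have to be estimated. HC0 is used as the simplest heteroskedasticity-consistent variant.

```python
import numpy as np

def hc0_standard_errors(X, y, weights=None):
    """Return (coefficients, HC0 standard errors) for OLS or, if weights are given, WLS."""
    if weights is None:
        weights = np.ones(len(y))
    sw = np.sqrt(weights)
    Xw = X * sw[:, None]              # WLS is OLS on rescaled data
    yw = y * sw
    XtX_inv = np.linalg.inv(Xw.T @ Xw)
    beta = XtX_inv @ Xw.T @ yw
    resid = yw - Xw @ beta
    # "Meat" of the sandwich estimator: sum of e_i^2 * x_i x_i'
    meat = (Xw * resid[:, None] ** 2).T @ Xw
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated one-factor model (hypothetical numbers): the error volatility
# grows with the absolute size of the factor return.
rng = np.random.default_rng(0)
n = 5000
market = 0.05 * rng.standard_normal(n)
sigma = 0.01 + 0.5 * np.abs(market)
stock = 0.001 + 1.2 * market + sigma * rng.standard_normal(n)
X = np.column_stack([np.ones(n), market])

beta_ols, se_ols = hc0_standard_errors(X, stock)
beta_wls, se_wls = hc0_standard_errors(X, stock, weights=1.0 / sigma ** 2)
print("OLS slope and HC0 standard error:", beta_ols[1], se_ols[1])
print("WLS slope and HC0 standard error:", beta_wls[1], se_wls[1])
```

In this simulated setting the WLS slope should come with a visibly smaller HC standard error than the OLS slope, which is the mechanism behind the shorter confidence intervals described above.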
Existing factor models struggle to model the covariance matrix for a large number of stocks and factors. Therefore, we introduce a new covariance matrix estimator that first shrinks the factor model coefficients and then applies nonlinear shrinkage to the residuals and factors. The estimator blends a regularized factor structure with conditional heteroskedasticity of residuals and factors and displays superior all-around performance against various competitors. We show that for the proposed double-shrinkage estimator, it is enough to use only the market factor or the most important latent factor(s). Thus, there is no need for laboriously taking into account the factor zoo.
Modeling and forecasting dynamic (or time-varying) covariance matrices has many important applications in finance, such as Markowitz portfolio selection. Multivariate GARCH models are a popular tool to this end. Historically, such models did not perform well in large dimensions due to the so-called curse of dimensionality. The recent DCC-NL model of Engle et al. (2019) is able to overcome this curse via nonlinear shrinkage estimation of the unconditional correlation matrix. In this paper, we show how performance can be increased further by using open/high/low/close (OHLC) price data instead of simply using daily returns. A key innovation, for the improved modeling of not only dynamic variances but also of dynamic covariances, is the concept of a regularized return, obtained from a volatility proxy in conjunction with a smoothed sign (function) of the observed return.
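The abstract does not spell out how the regularized return is computed, so the following is only a rough sketch of the general idea under two stand-in assumptions: the volatility proxy is the Parkinson high-low range estimator, and the smoothed sign is a hyperbolic tangent with a hypothetical `smoothness` parameter. The paper's actual choices may differ.

```python
import numpy as np

def parkinson_volatility(high, low):
    """Parkinson range-based volatility proxy: ln(H/L) / sqrt(4 ln 2)."""
    return np.log(high / low) / np.sqrt(4.0 * np.log(2.0))

def regularized_return(ret, vol_proxy, smoothness=0.005):
    """Volatility proxy times a smoothed sign of the observed return.

    tanh is an assumed smoothing function; `smoothness` is a hypothetical
    tuning parameter controlling how quickly the sign saturates at +/-1.
    """
    smoothed_sign = np.tanh(ret / smoothness)
    return smoothed_sign * vol_proxy

# Three illustrative trading days (made-up prices and returns)
high = np.array([101.5, 102.0, 100.8])
low = np.array([99.0, 100.1, 98.9])
ret = np.array([0.012, -0.0001, -0.015])

vol = parkinson_volatility(high, low)
reg = regularized_return(ret, vol)
print(reg)
```

The point of the construction is that a day with a wide trading range but a close-to-close return near zero still carries volatility information through the proxy, while the smoothed sign keeps the direction from flipping abruptly on tiny returns.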
Existing shrinkage techniques struggle to model the covariance matrix of asset returns in the presence of multiple asset classes. Therefore, we introduce a Blockbuster shrinkage estimator that clusters the covariance matrix accordingly. Besides the definition and derivation of a new asymptotically optimal linear shrinkage estimator, we propose an adaptive Blockbuster algorithm that clusters the covariance matrix even if the asset classes (or their number) are unknown and change over time. It displays superior all-around performance on historical data against a variety of state-of-the-art linear shrinkage competitors. Additionally, we find that for small and medium-sized investment universes the proposed estimator outperforms even recent nonlinear shrinkage techniques. Hence, this new estimator can be used to deliver more efficient portfolio selection and detection of anomalies in the cross-section of asset returns. Furthermore, due to the general structure of the proposed Blockbuster shrinkage estimator, the application is not restricted to financial problems.
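For readers unfamiliar with linear shrinkage, a generic sketch may help. It shrinks the sample covariance matrix toward a scaled identity target with a fixed, hand-picked intensity. This is not the Blockbuster estimator: the block-structured target and the asymptotically optimal intensity from the paper are not reproduced here, and the function name and the value 0.5 are assumptions for illustration.

```python
import numpy as np

def linear_shrinkage(sample_cov, intensity):
    """Convex combination of the sample covariance and a scaled identity target."""
    p = sample_cov.shape[0]
    # Target: identity scaled by the average sample variance
    target = np.trace(sample_cov) / p * np.eye(p)
    return (1.0 - intensity) * sample_cov + intensity * target

# 60 observations of 40 simulated assets: few observations per asset,
# so the sample covariance matrix is poorly conditioned.
rng = np.random.default_rng(1)
returns = rng.standard_normal((60, 40))
sample_cov = np.cov(returns, rowvar=False)
shrunk_cov = linear_shrinkage(sample_cov, intensity=0.5)

print("condition number, sample:", np.linalg.cond(sample_cov))
print("condition number, shrunk:", np.linalg.cond(shrunk_cov))
```

Shrinking pulls the extreme sample eigenvalues toward their mean while leaving the total variance (the trace) unchanged, which is why the shrunk matrix is better conditioned for portfolio optimization.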
Many researchers seek factors that predict the cross-section of stock returns. In finance, the key is to replicate anomalies by long-short portfolios based on their factor scores, with the influence of microcaps alleviated via New York Stock Exchange (NYSE) breakpoints and value-weighted returns. In econometrics, the key is to include a covariance matrix estimator of stock returns for the (mimicking) portfolio construction. This paper marries these two strands of literature in order to test the zoo of cross-sectional anomalies by injecting size controls, basically NYSE breakpoints and value-weighted returns, into efficient sorting. Thus, we propose to use a covariance matrix estimator for ultra-high dimensions (up to 5,000) taking into account large, small and microcap stocks. We demonstrate that using a nonlinear shrinkage estimator of the covariance matrix substantially enhances the power of tests for cross-sectional anomalies: On average, ‘Student’ t-statistics more than double.
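The two size controls mentioned above can be sketched for a single period as follows. The sketch covers only the traditional sorting side (NYSE breakpoints plus value weighting), not the covariance-based efficient sorting proposed in the paper. The 30%/70% breakpoints, the function name, and the toy data are assumptions.

```python
import numpy as np

def long_short_return(score, ret, mktcap, is_nyse, low_q=0.3, high_q=0.7):
    """One-period long-short return with NYSE breakpoints and value weights."""
    # Breakpoints come from NYSE stocks only, so that many tiny non-NYSE
    # stocks cannot determine where the portfolio cutoffs fall.
    low_bp, high_bp = np.quantile(score[is_nyse], [low_q, high_q])
    long_leg = score >= high_bp
    short_leg = score <= low_bp
    # Value weighting limits the influence of microcaps within each leg.
    long_ret = np.average(ret[long_leg], weights=mktcap[long_leg])
    short_ret = np.average(ret[short_leg], weights=mktcap[short_leg])
    return long_ret - short_ret

# Toy cross-section: five NYSE stocks and three microcaps with extreme
# scores and extreme returns (all numbers made up).
score = np.array([0.0, 1.0, 2.0, 3.0, 4.0, -1.0, 5.0, 6.0])
ret = np.array([0.00, 0.01, 0.02, 0.03, 0.04, -0.10, 0.20, 0.30])
mktcap = np.array([100.0, 100.0, 100.0, 100.0, 100.0, 1.0, 1.0, 1.0])
is_nyse = np.array([True, True, True, True, True, False, False, False])

spread = long_short_return(score, ret, mktcap, is_nyse)
print(spread)
```

In this toy example the three microcaps fall into the extreme portfolios but barely move the spread, because each carries one hundredth of the weight of an NYSE stock.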
We propose a new method, VASA, based on variable subsample aggregation of model predictions for equity returns using a large-dimensional set of factors. To demonstrate the effectiveness, robustness and dimension reduction power of VASA, we perform a comparative analysis of VASA against state-of-the-art machine learning algorithms. As a performance measure, we explore not only the global predictive R² but also the stock-specific R²s and their distribution. While the global R² indicates the average forecasting accuracy, we find that high variability in the stock-specific R²s can be detrimental to portfolio performance, due to the higher prediction risk. Since VASA shows minimal variability, portfolios formed on this method outperform the portfolios based on more complicated methods like random forests and neural nets.
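A stylized sketch of the aggregation idea: fit many small linear models, each on a random subset of the predictors, and average their predictions. The details here (ordinary least squares as the base learner, subsampling of variables only, the number and size of the subsets, and all names) are assumptions made for illustration and should not be read as the VASA specification.

```python
import numpy as np

def subsample_aggregate_predict(X_train, y_train, X_test,
                                n_models=300, n_vars=5, seed=0):
    """Average the predictions of OLS fits on random subsets of the predictors."""
    rng = np.random.default_rng(seed)
    n_train, p = X_train.shape
    n_test = X_test.shape[0]
    pred_sum = np.zeros(n_test)
    for _ in range(n_models):
        # Draw a random subset of predictor columns without replacement
        cols = rng.choice(p, size=n_vars, replace=False)
        A = np.column_stack([np.ones(n_train), X_train[:, cols]])
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        B = np.column_stack([np.ones(n_test), X_test[:, cols]])
        pred_sum += B @ coef
    return pred_sum / n_models

# Simulated data: 30 candidate predictors, only the first three matter.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 30))
y = X[:, :3].sum(axis=1) + rng.standard_normal(500)
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

pred = subsample_aggregate_predict(X_train, y_train, X_test)
print("out-of-sample correlation:", np.corrcoef(pred, y_test)[0, 1])
```

Because each submodel sees only a few predictors, the averaged forecast is a heavily shrunken one, which is one plausible reason such aggregation yields stable predictions across stocks.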
This paper injects factor structure into the estimation of time-varying, large-dimensional covariance matrices of stock returns. Existing factor models struggle to model the covariance matrix of residuals in the presence of time-varying conditional heteroskedasticity in large universes. Conversely, rotation-equivariant estimators of large-dimensional time-varying covariance matrices forsake directional information embedded in market-wide risk factors. We introduce a new covariance matrix estimator that blends factor structure with time-varying conditional heteroskedasticity of residuals in large dimensions up to 1000 stocks. It displays superior all-around performance on historical data against a variety of state-of-the-art competitors, including static factor models, exogenous factor models, sparsity-based models, and structure-free dynamic models. This new estimator can be used to deliver more efficient portfolio selection and detection of anomalies in the cross-section of stock returns.
Working Papers (1):
We propose and implement a procedure to dynamically hedge climate change risk. First, we construct a climate change index through textual analysis of newspapers and scientific databases based on the latest advances in machine learning and textual analysis. Second, we present a new approach to compute (factor) mimicking portfolios to build climate (change) risk hedge portfolios. The new mimicking portfolio approach is much more efficient than traditional simple long-short sorting portfolios because it takes into account new methodologies for estimating large dynamic covariance matrices, resulting in a more elaborate time-varying optimization problem.
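To make the contrast with simple sorting concrete, here is a static sketch of a covariance-based mimicking portfolio: weights proportional to the inverse covariance matrix of returns times the covariance between returns and the target index. It uses the plain sample covariance on simulated data, whereas the paper relies on estimators of large dynamic covariance matrices. All names and numbers are illustrative assumptions.

```python
import numpy as np

def mimicking_weights(returns, factor):
    """Weights proportional to inv(Cov(returns)) @ Cov(returns, factor)."""
    cov_matrix = np.cov(returns, rowvar=False)
    centered_r = returns - returns.mean(axis=0)
    centered_f = factor - factor.mean()
    cov_rf = centered_r.T @ centered_f / (len(factor) - 1)
    return np.linalg.solve(cov_matrix, cov_rf)

# Simulated returns of 10 assets exposed to a hypothetical climate index
rng = np.random.default_rng(3)
T, N = 500, 10
climate_index = rng.standard_normal(T)
loadings = rng.uniform(-1.0, 1.0, size=N)
returns = np.outer(climate_index, loadings) + 0.5 * rng.standard_normal((T, N))

w_mimic = mimicking_weights(returns, climate_index)
# Naive benchmark: long the high-loading half, short the low-loading half
w_sort = np.where(loadings > np.median(loadings), 1.0, -1.0)

corr_mimic = np.corrcoef(returns @ w_mimic, climate_index)[0, 1]
corr_sort = np.corrcoef(returns @ w_sort, climate_index)[0, 1]
print("correlation with index, mimicking:", corr_mimic)
print("correlation with index, sorting:  ", corr_sort)
```

In sample, the covariance-based weights maximize the correlation between the portfolio and the index among all linear combinations of the assets, so they can never do worse than the equal-weighted sort on that criterion. How much of the advantage survives out of sample depends on the quality of the covariance matrix estimate.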
Work in Progress (4):
- Heterogeneous Predictability in Asset Pricing (with Simon Hediger, Bryan Kelly and Markus Leippold)
- A Covariance Matrix Estimator for Unbalanced and Missing Data (with Robert Engle)
- Large Subsampled Covariance Matrices
- Innosuisse project: Asset Allocation through Reinforcement Learning for Swiss Pension Funds (with Simon Broda and Patrick Walker)