Abstracts
Alexander Steinicke:
Worst-Case Optimal Investment in Incomplete Markets
We study and solve the worst-case optimal portfolio problem of an investor with logarithmic preferences facing the possibility of a market crash. We work in a Lévy market with stochastic market coefficients. To tackle this problem, we enhance the martingale approach developed by F. Seifried in 2010. A utility crash-exposure transformation into a backward stochastic differential equation (BSDE) setting allows us to characterize the optimal indifference strategies. Further, we address the existence of those indifference strategies for market models with an unbounded market price of risk. To compute the strategies numerically, we solve the corresponding (non-Lipschitz) BSDEs through their associated PDEs, which requires analyzing continuity and boundedness properties of CIR forward processes. We demonstrate our approach for Heston’s stochastic volatility model, Bates’ stochastic volatility model including jumps, and Kim-Omberg’s model for a stochastic excess return.
Xinwei Shen:
Distributional Learning: From Methodology to Applications
Estimating the full (conditional) distribution is crucial to many applications. However, existing methods such as quantile regression typically struggle with high-dimensional response variables. To this end, distributional learning models the target distribution via a generative model, which enables inference via sampling. In this talk, we introduce a distributional learning method called engression. We then demonstrate the applications of engression to several statistical problems including extrapolation in nonparametric regression, causal effect estimation, and dimension reduction, as well as scientific problems such as climate downscaling.
Davy Paindaveine:
Rank Tests for PCA Under Weak Identifiability
In a triangular array framework where n observations are randomly sampled from a p-dimensional elliptical distribution with shape matrix V_n, we consider the problem of testing the null hypothesis H_0: theta = theta_0, where theta is the (fixed) leading unit eigenvector of V_n and theta_0 is a given unit p-vector. The dependence of the shape matrix on the sample size allows us to consider challenging asymptotic scenarios in which the parameter of interest theta is unidentified in the limit, because the ratio of the two leading eigenvalues of V_n converges to one. We study the corresponding limiting experiments under such weak identifiability, and we show that these may be LAN or non-LAN. While earlier work in this framework was strictly limited to Gaussian distributions, where the study of local log-likelihood ratios could simply rely on explicit expressions, our asymptotic investigation allows for essentially arbitrary elliptical distributions. Even in non-LAN experiments, our results enable us to investigate the asymptotic null and non-null properties of multivariate rank tests. These nonparametric tests are shown to behave excellently under weak identifiability: not only do they maintain the target nominal size irrespective of the amount of weak identifiability, but they also keep their uniform efficiency properties under such non-standard scenarios.
Alessia Caponera:
Reproducing Kernel Approach to Tomographic Data
Many natural phenomena pose challenges wherein the function of interest cannot be directly measured. For instance, the density of a brain cannot be measured directly, but only evaluated through 2D sectional images via Computerized Tomography (CT). In such a setup, where the true random function is a latent feature, how can we estimate its mean function and covariance tensor using discretized indirect observations? In this talk, we consider the tomographic operator as an operator between reproducing kernel Hilbert spaces (RKHS) and establish representer theorems to address the problem of mean and covariance estimation. We also present the uniform rates of convergence of our estimators with respect to our observation scheme, and we evaluate efficiency through simulation results across various tomographic configurations.
Based on joint work with Ho Yun and Victor M. Panaretos.
Sirio Legramanti:
Leveraging Covariates in Bayesian Nonparametric Clustering: An Application to Transportation Networks
In clustering, observed individual data are often accompanied by covariates that can assist the clustering process itself. This is the case, for example, for transportation networks, where each node has spatial coordinates, and it is often desirable that clusters of nodes are spatially cohesive. In fact, the obtained clusters may be used to inform public policy decisions, and it may be preferable that such policies are uniform over neighboring areas. Naturally, depending on the application, different notions of closeness can be used to define such neighborhoods, thus potentially requiring proper transformations of the spatial covariates. Motivated by real-world data about the monthly subscriptions to the public transportation system of the Bergamo province (Italy), we show how to incorporate properly transformed spatial covariates into a state-of-the-art stochastic block model, while allowing the contribution of the covariates to be weighted.
(joint work with Valentina Ghidini and Raffaele Argiento)
Lukas Steinberger:
Statistical Efficiency in Local Differential Privacy
We develop a theory of asymptotically efficient estimation in regular parametric models when data confidentiality is ensured by local differential privacy (LDP). The idea of LDP is that individual data owners should be able to release an anonymized or sanitized version Z_i of their possibly sensitive information X_i by drawing Z_i from a pre-specified conditional distribution Q that satisfies the formal α-differential privacy constraint. The problem is now to identify a randomization mechanism Q, generating Z_i, and an estimator θ̂ that uses the sanitized data to estimate the population parameter, with minimal variance among all data-generation and estimation schemes satisfying the privacy constraint. Starting from a regular parametric model for the iid unobserved sensitive data X_1, ..., X_n, we establish local asymptotic mixed normality (along subsequences) of the model describing the sanitized observations Z_1, ..., Z_n. This result readily implies convolution and local asymptotic minimax theorems. In the case p = 1, the optimal asymptotic variance is found to be the inverse of the supremal Fisher information, where the supremum runs over all α-differentially private (marginal) privacy mechanisms. We present a numerical algorithm for finding a (nearly) optimal privacy mechanism and an estimator based on the corresponding sanitized data that achieves this asymptotically optimal variance under mild assumptions. In special cases, such as the Gaussian location model, our theory also enables us to identify exact closed-form expressions for efficient privacy mechanisms and estimators.
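To make the LDP setup concrete, the following minimal sketch uses binary randomized response, a classical α-differentially private mechanism, together with a debiased mean estimator. It is an illustration only and is not the (nearly) optimal mechanism or estimator of the talk; the function names `keep_probability`, `sanitize`, and `estimate_mean` are hypothetical.

```python
import math
import random


def keep_probability(alpha: float) -> float:
    """Probability of reporting the true bit under randomized response."""
    return math.exp(alpha) / (1.0 + math.exp(alpha))


def sanitize(x: int, alpha: float, rng: random.Random) -> int:
    """Draw Z_i given X_i = x: keep the bit with probability q, else flip it.

    The ratio q / (1 - q) equals e^alpha, which is the alpha-LDP constraint
    for a binary mechanism.
    """
    return x if rng.random() < keep_probability(alpha) else 1 - x


def estimate_mean(z: list[int], alpha: float) -> float:
    """Debiased estimate of P(X = 1) computed from the sanitized bits only."""
    q = keep_probability(alpha)
    z_bar = sum(z) / len(z)
    # E[Z] = (2q - 1) * p + (1 - q), so invert this affine map.
    return (z_bar - (1.0 - q)) / (2.0 * q - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha, p, n = 1.0, 0.3, 100_000  # illustrative values, not from the talk
    x = [1 if rng.random() < p else 0 for _ in range(n)]
    z = [sanitize(xi, alpha, rng) for xi in x]
    print(estimate_mean(z, alpha))
```

The estimator only ever sees the sanitized Z_i; a smaller α brings the keep probability closer to 1/2 and inflates the variance of the estimate, which is the privacy-efficiency trade-off the abstract refers to.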
Wolfgang Karl Härdle:
Pricing Kernels Are Often Non-monotone...
Nonparametric estimation of the pricing kernel has led to a "puzzle" that challenges finance principles. The apparent non-monotonicity of the empirical pricing kernel has been attributed to an improper use of past and future observations. A new CDI spline-based smoothing technique, reflecting the forward-looking information sets, puts the estimated pricing kernel into a seemingly theory-compatible shape. Our findings, though, show equivalence between the two techniques. We discover that CDI cannot be fully consistent since it relies on averaging in a world of almost constant stochastic dynamics. Empirical insights point to economic phenomena rather than to technical flaws.
Kartik G. Waghmare:
Conditional Independence in Continuous Domain
The concept of conditional independence allows us to distinguish between direct and indirect associations in data and thus has a long history in statistics. Its application to statistical modeling and inference in the context of continuous time stochastic processes or random fields, however, has been obstructed by technical difficulties arising from working with infinite-dimensional function spaces. In a second-order or Gaussian setting, we show how the theory of reproducing kernels allows us to fruitfully study conditional independence in a continuous context by circumventing these difficulties and how the insights it furnishes can be applied to the problems of (a) covariance estimation for partially observed functional data with application to longitudinal studies in medicine and (b) modeling stochastic processes as graphical models corresponding to continuous graphs.
Thomas Nagler:
The Surprising Effect of Reshuffling the Data During Hyperparameter Tuning
Tuning parameter selection is crucial for optimizing the predictive power of statistical and machine learning models. The standard protocol evaluates various parameter configurations using a resampling estimate of the generalization error to guide optimization and select a final parameter configuration. Without much evidence, paired resampling splits, i.e., either a fixed train-validation split or a fixed cross-validation scheme, are often recommended. We show that, surprisingly, reshuffling the splits for every configuration often improves the final model's generalization performance on unseen data. Our theoretical analysis explains how reshuffling affects the asymptotic behavior of the validation loss surface and provides a bound on the expected regret in the limiting regime. This bound connects the potential benefits of reshuffling to the signal and noise characteristics of the underlying optimization problem. We confirm our theoretical results in a controlled simulation study and demonstrate the practical usefulness of reshuffling in a large-scale, realistic hyperparameter optimization experiment.
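The difference between paired and reshuffled splits can be sketched as follows. This is a toy illustration under stated assumptions, not the experimental setup of the talk: the data-generating model, the one-dimensional ridge regression, and the names `make_data`, `validation_loss`, and `tune` are all hypothetical.

```python
import random


def make_data(n, rng):
    """Toy regression data y = 2x + noise (purely illustrative)."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def validation_loss(xs, ys, lam, split_seed, train_frac=0.8):
    """Fit 1-D ridge regression on a random train split, return validation MSE."""
    idx = list(range(len(xs)))
    random.Random(split_seed).shuffle(idx)
    cut = int(train_frac * len(idx))
    train, valid = idx[:cut], idx[cut:]
    # Closed-form ridge slope without intercept: sum(xy) / (sum(x^2) + lam).
    beta = sum(xs[i] * ys[i] for i in train) / (sum(xs[i] ** 2 for i in train) + lam)
    return sum((ys[i] - beta * xs[i]) ** 2 for i in valid) / len(valid)


def tune(xs, ys, grid, reshuffle, seed=0):
    """Return the grid value with the smallest validation loss, plus all losses.

    reshuffle=False: every configuration is scored on the same split (paired).
    reshuffle=True: each configuration is scored on its own freshly drawn split.
    """
    losses = {}
    for k, lam in enumerate(grid):
        split_seed = seed + k + 1 if reshuffle else seed
        losses[lam] = validation_loss(xs, ys, lam, split_seed)
    return min(losses, key=losses.get), losses


if __name__ == "__main__":
    xs, ys = make_data(200, random.Random(0))
    grid = [0.01, 0.1, 1.0, 10.0]
    print("paired:    ", tune(xs, ys, grid, reshuffle=False)[0])
    print("reshuffled:", tune(xs, ys, grid, reshuffle=True)[0])
```

The only change between the two protocols is the split seed passed to each configuration; whether reshuffling helps in a given problem depends on its signal and noise characteristics, as the abstract notes.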
Sören Christensen:
Strategic Randomization: Equilibria in Markovian Stopping Games
This talk explores equilibrium concepts in stopping games with underlying diffusion processes. We address key challenges in classical existence results for equilibria, focusing on two common approaches: (1) equilibria in general randomized stopping times, which are too broad to allow for explicit solutions and fail to respect subgame perfection, and (2) equilibria in first-entry times, which rely on overly restrictive conditions.
As an alternative, we introduce the concept of Markovian randomized stopping times, providing a unifying framework to overcome these limitations. We establish general existence theorems for Markovian equilibria across a wide range of stopping games. This approach has two key advantages: (1) it enables explicit solutions in diverse scenarios, and (2) it offers clear game-theoretic interpretations, enhancing its relevance for both theory and applications.
The talk is based on joint work with Boy Schultz and Kristoffer Lindensjö.