Title: Clustering high‐frequency financial time series based on information theory
Abstract

Clustering large financial time series data enables pattern extraction that facilitates risk management. The knowledge gathered from unsupervised learning is useful for improving portfolio optimization and making stock trading recommendations. Most methods available in the literature for clustering financial time series are based on exploiting linear relationships between time series. However, prices of different assets (stocks) may have non‐linear relationships which may be quantified using information based measures such as mutual information (MI). To estimate the empirical mutual information between time series of stock returns, we employ a novel kernel density estimator (KDE) based jackknife mutual information estimation (JMI), and compare it with the widely‐used binning method. We then propose an average distance gradient change algorithm and an algorithm based on the average silhouette criterion that use pairwise and groupwise MI of high‐frequency financial stock returns. Through numerical studies, we provide insights into the impact of the clustering on asset allocation and risk management based on the nonlinear information structure of the US stock market.
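As a rough illustration of the binning baseline mentioned above, the following is a minimal sketch of a plug-in mutual information estimate built from an equal-width two-dimensional histogram. It is not the jackknife KDE estimator (JMI) proposed in the paper, and the function names, the bin count, and the synthetic data are assumptions made only for illustration.

```python
import math
import random
from collections import Counter


def bin_index(value, lo, hi, n_bins):
    """Map a value in [lo, hi] to an equal-width bin index in 0..n_bins-1."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls into the last bin


def binned_mutual_information(x, y, n_bins=8):
    """Plug-in MI estimate (in nats) from a 2-D equal-width histogram.

    Hypothetical helper: n_bins=8 is an arbitrary choice, and the estimate
    is known to depend on it.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    x_lo, x_hi = min(x), max(x)
    y_lo, y_hi = min(y), max(y)
    pairs = [
        (bin_index(a, x_lo, x_hi, n_bins), bin_index(b, y_lo, y_hi, n_bins))
        for a, b in zip(x, y)
    ]
    joint = Counter(pairs)               # joint bin counts
    px = Counter(i for i, _ in pairs)    # marginal counts for x
    py = Counter(j for _, j in pairs)    # marginal counts for y
    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for return series (not real market data).
    x = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    y = [a * a + 0.1 * rng.gauss(0.0, 1.0) for a in x]  # nonlinear in x
    z = [rng.gauss(0.0, 1.0) for _ in range(4000)]      # independent of x
    print("MI(x, y):", binned_mutual_information(x, y))
    print("MI(x, z):", binned_mutual_information(x, z))
```

In this synthetic setup the dependence of `y` on `x` is purely nonlinear, so a linear correlation measure would see little, while the binned MI estimate for the dependent pair should come out clearly larger than for the independent pair.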

 
PAR ID:
10367682
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Applied Stochastic Models in Business and Industry
Volume:
38
Issue:
1
ISSN:
1524-1904
Page Range / eLocation ID:
p. 4-26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Hydrologic variability can present severe financial challenges for organizations that rely on water for the provision of services, such as water utilities and hydropower producers. While recent decades have seen rapid growth in decision‐support innovations aimed at helping utilities manage hydrologic uncertainty for multiple objectives, support for managing the related financial risks remains limited. However, the mathematical similarities between multi‐objective reservoir control and financial risk management suggest that the two problems can be approached in a similar manner. This paper demonstrates the utility of Evolutionary Multi‐Objective Direct Policy Search for developing adaptive policies for managing the drought‐related financial risk faced by a hydropower producer. These policies dynamically balance a portfolio, consisting of snowpack‐based financial hedging contracts, cash reserves, and debt, based on evolving system conditions. Performance is quantified based on four conflicting objectives, representing the classic tradeoff between “risk” and “return” in addition to decision‐makers’ unique preferences toward different risk management instruments. The dynamic policies identified here significantly outperform static management formulations that are more typically employed for financial risk applications in the water resources literature. Additionally, this paper combines visual analytics and information theoretic sensitivity analysis to improve understanding about how different candidate policies achieve their comparative advantages through differences in how they adapt to real‐time information. The methodology presented in this paper should be applicable to any organization subject to financial risk stemming from hydrology or other environmental variables (e.g., wind speed, insolation), including electric utilities, water utilities, agricultural producers, and renewable energy developers.

     
  2. Abstract Background

    A cell exhibits a variety of responses to internal and external cues. These responses are possible, in part, due to the presence of an elaborate gene regulatory network (GRN) in every single cell. In the past 20 years, many groups worked on reconstructing the topological structure of GRNs from large-scale gene expression data using a variety of inference algorithms. Insights gained about participating players in GRNs may ultimately lead to therapeutic benefits. Mutual information (MI) is a widely used metric within this inference/reconstruction pipeline as it can detect any correlation (linear and non-linear) between any number of variables (n-dimensions). However, the use of MI with continuous data (for example, normalized fluorescence intensity measurement of gene expression levels) is sensitive to data size, correlation strength and underlying distributions, and often requires laborious and, at times, ad hoc optimization.

    Results

    In this work, we first show that estimating MI of a bi- and tri-variate Gaussian distribution using k-nearest neighbor (kNN) MI estimation results in significant error reduction as compared to commonly used methods based on fixed binning. Second, we demonstrate that implementing the MI-based kNN Kraskov–Stögbauer–Grassberger (KSG) algorithm leads to a significant improvement in GRN reconstruction for popular inference algorithms, such as Context Likelihood of Relatedness (CLR). Finally, through extensive in-silico benchmarking we show that a new inference algorithm CMIA (Conditional Mutual Information Augmentation), inspired by CLR, in combination with the KSG-MI estimator, outperforms commonly used methods.

    Conclusions

    Using three canonical datasets containing 15 synthetic networks, the newly developed method for GRN reconstruction, which combines CMIA and the KSG-MI estimator, achieves an improvement of 20–35% in precision-recall measures over the current gold standard in the field. This new method will enable researchers to discover new gene interactions or better choose gene candidates for experimental validations.

     
  3. Volatility modeling is crucial in finance, especially when dealing with intraday transaction‐level asset returns. The irregular and high‐frequency nature of the data presents unique challenges. While stochastic volatility (SV) models are widely used for understanding patterns in volatility of daily stock returns which constitute regularly spaced time series, new classes of models must be introduced for analyzing volatility in irregularly spaced intraday data. Specifically, these models must accommodate the random gaps between successive transactional events. By modeling the gaps using autoregressive conditional duration (ACD) models, we describe a hierarchical irregular SV autoregressive conditional duration (IR‐SV‐ACD) model for estimating and forecasting intertransaction gaps and the volatility of log‐returns. We carry out the analysis in the Bayesian framework via the Hamiltonian Monte Carlo (HMC) algorithm with No‐U‐turn sampler (NUTS) in R using the cmdstanr package. The fits and forecasts are obtained using Monte Carlo averages based on the posterior samples. We illustrate this approach using simulation studies and real data analysis for intraday prices available at microseconds level of health stocks traded on the New York Stock Exchange (NYSE). The log‐returns and gaps are calculated for the stocks and are used for modeling.

     
  4. Abstract

    This paper describes risk-pooling friendships and other social networks among pastoralists in Karamoja, Uganda. Social networks are of critical importance for risk management in an environment marked by volatility and uncertainty. Risk management or risk pooling mainly takes the form of “stock friendships”: an informal insurance system in which men established mutually beneficial partnerships with unrelated or related individuals through livestock transfers in the form of gifts or loans. Friends accepted the obligation to assist each other during need, ranging from the time of marriage to times of distress. Anthropologists and economists claim that social networks are critical for recouping short-term losses such as food shortage, as well as for ensuring long-term sustainability through the building of social capital and rebuilding of herds. To this end, I present ethnographic data on friendship, kinship, and other networks among male and female pastoralists in Karamoja. Using qualitative and quantitative data on these relationships and norms of livestock transfers and other mutual aid, I show the enduring importance of social networks in the life of Karamoja’s pastoralists today. I also demonstrate how exchange networks were utilized by participants during a drought. On this basis, I argue that appreciating historical and traditional mechanisms of resilience among pastoralists is vital for designing community-based risk management projects. I discuss how traditional safety net systems have been used successfully by NGOs to assist pastoralists in the wake of disaster, and how the same can be done by harnessing risk-pooling friendships in Karamoja.

     
  5. We show that endogenous variation in risk aversion over the business cycle can jointly explain financial market responses to high-frequency monetary policy shocks with standard asset pricing moments. We newly integrate a workhorse New Keynesian model with countercyclical risk aversion via habit formation preferences. In the model, a surprise increase in the policy rate lowers consumption relative to habit, raising risk aversion. Endogenously time-varying risk aversion in the model is crucial to explain the large fall in the stock market, the cross-section of industry returns, and the increase in long-term bond yields in response to a surprise policy rate increase.