The importance of the data declaration date

I will never forget the iconic mistake I made during my graduate training programme as a quantitative analyst.

What-if-analysis

My first assignment was to design and implement a multi-factor ranking model for the head of quants at an investment firm. This involved delivering a ‘What-if-analysis’ tool that would allow a fund manager to:

  • Create a basket of instruments on which to run a model
  • Specify the period start and end, together with a rebalance frequency
  • Select factors on which to rank the instruments
  • Specify the weights associated with each factor to arrive at a cumulative ranking for generating the long and short signals
  • Display the simulated funds’ composition and performance for the period
  • Change the factors or weights, resulting in a different composition
  • Determine the optimal associated weights to generate the best return over the period.
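The weighted ranking step described in the list above can be sketched roughly as follows. This is a minimal illustration, not the original tool: the function names, instruments, factor values and weights are all hypothetical, and it assumes that a higher factor value is always better.

```python
def rank_descending(values):
    """Rank instruments on one factor: 1 = highest factor value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def composite_scores(factor_values, weights):
    """Weighted sum of per-factor ranks; a lower score is a better overall rank."""
    scores = {}
    for factor, values in factor_values.items():
        for name, rank in rank_descending(values).items():
            scores[name] = scores.get(name, 0.0) + weights[factor] * rank
    return scores


def long_short_signals(scores, n):
    """The best n instruments become longs, the worst n become shorts."""
    ordered = sorted(scores, key=scores.get)
    return ordered[:n], ordered[-n:]


# Hypothetical factor values for four made-up instruments.
factor_values = {
    "earnings_growth": {"AAA": 0.30, "BBB": 0.10, "CCC": 0.20, "DDD": -0.05},
    "momentum": {"AAA": 0.05, "BBB": 0.15, "CCC": 0.10, "DDD": -0.10},
}
weights = {"earnings_growth": 0.6, "momentum": 0.4}

longs, shorts = long_short_signals(composite_scores(factor_values, weights), n=1)
print(longs, shorts)
```

Changing the weights changes the cumulative ranking, and with it the composition of the simulated fund.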

Shooting the lights out

I immersed myself in the problem. I developed the application over a 6-month period. While testing the beta version, I played with various factors and weights. At this point I found a set of factors and weights that appeared to ‘shoot the lights out’. I showed this model to the head of quants who agreed with me: I had achieved something remarkable – a consistent high-return model, achieving in some cases annual returns of around 100%.

My iconic error

The model was an investor’s dream. It seemed too good to be true. It was too good to be true. So where had I gone wrong?

The concept of a data declaration date was not foreign to me. All financial line item forecasts (estimates) come with a declaration date (revision date). When calculating factors such as a 12-month forecasted earnings growth, one needs to consider the most recent forecast declared on, or before, the date under consideration.
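As a rough sketch of that rule, the snippet below picks the most recent forecast declared on, or before, the date under consideration. The revision history, the values and the function name `forecast_as_at` are hypothetical and serve only to illustrate the selection.

```python
from datetime import date

# Hypothetical revision history for one instrument's 12-month forecast EPS.
forecasts = [
    {"declared": date(2017, 9, 1), "value": 4.10},
    {"declared": date(2017, 11, 15), "value": 4.35},
    {"declared": date(2018, 2, 1), "value": 3.90},
]


def forecast_as_at(forecasts, as_at):
    """Return the latest forecast declared on or before as_at, or None."""
    known = [f for f in forecasts if f["declared"] <= as_at]
    if not known:
        return None  # nothing had been declared yet
    return max(known, key=lambda f: f["declared"])["value"]


# The February 2018 revision is ignored when simulating December 2017.
print(forecast_as_at(forecasts, date(2017, 12, 31)))
```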

My blind spot was in not realising that the data declaration date is just as important for historically declared fundamental data, such as headline earnings per share. My data load process would import fundamental data and save it against the date it was applicable to, without consideration of the date on which it was actually declared. In other words, at each simulated point in time the model used data that would not yet have been available on that date.

Knowing the unknowable

The following figure illustrates the ‘information as at date’ (IAAD) concept. When a period up to the IAAD was modelled retrospectively, the model would, for example, use the stored value for EPS3. This led to the ‘shooting the lights out’ scenario, because the modeller had overlooked the fact that the value for EPS3 had not yet been reported at that time. The model was effectively knowing the unknowable:

[Figure: The state before declaration (My iconic error graph)]

How Quintessence deals with value dates

Quintessence stores research data against the dates to which they are applicable (the value date). In addition, Quintessence stores the date on which the values were declared (the declaration date). This allows What-if-analysis models to look at any period in time, and only consider what information would have been available at the time, regardless of what values are stored for the period.
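The difference between the two lookups can be sketched as follows. This is an illustrative assumption only, not Quintessence's actual schema or API: the `Observation` record, both function names and the EPS figures are hypothetical.

```python
from datetime import date
from typing import NamedTuple


class Observation(NamedTuple):
    value_date: date  # the date the figure applies to
    declared: date    # the date the figure was actually declared
    value: float


# Hypothetical headline EPS figures for two financial years.
eps_history = [
    Observation(date(2016, 12, 31), date(2017, 3, 15), 2.40),
    Observation(date(2017, 12, 31), date(2018, 3, 14), 3.10),
]


def naive_lookup(history, as_at):
    """Flawed: filters on the value date only, so it can see undeclared data."""
    candidates = [o for o in history if o.value_date <= as_at]
    if not candidates:
        return None
    return max(candidates, key=lambda o: o.value_date).value


def point_in_time_lookup(history, as_at):
    """Only considers figures that had been declared on or before as_at."""
    candidates = [
        o for o in history if o.value_date <= as_at and o.declared <= as_at
    ]
    if not candidates:
        return None
    # Latest applicable period first; latest declaration breaks ties.
    return max(candidates, key=lambda o: (o.value_date, o.declared)).value


as_at = date(2018, 1, 31)
print(naive_lookup(eps_history, as_at))          # uses the undeclared 2017 figure
print(point_in_time_lookup(eps_history, as_at))  # falls back to the 2016 figure
```

On 31 January 2018 the 2017 figure had not yet been declared in this example, so only the second lookup reflects what a fund manager could have known at the time.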

Here ends the lesson:

Never underestimate the importance of data declaration dates when modelling with data, retrospectively or otherwise.

Contact us for more information on the importance of the data declaration date, or read about Quintessence features.

Related Articles

The information you need, right on schedule

How to package complex algorithms in Excel

Retrieving data into Excel® from an integrated source

Copyright © 2018 Quintessence
All trademarks are the property of their respective owners. All rights reserved.