Quants Corner: Working with CEIC point-in-time data, a multi-factor approach

Getting to the point with PIT data

Four years ago Britain’s Queen Elizabeth II observed that, “recollections may vary.” The late monarch could have been speaking about many key economic datasets, which are subject to revisions that can see a single data point change dramatically over time. A recent – and glaring – example is the 12-month figure for job growth in the US through March of this year, which was initially reported at 1.7 million but was cut in September to 849,000.

Changes of this magnitude rekindle the debate over the relative value, from a quant perspective, of preliminary numbers versus the revised numbers that are more accurate but can come with substantial lags.

Non-quants tend to assume the more accurate revised data is the most valuable when it comes to generating actionable investment decisions. Why trade on numbers that may not tell the right story? But, from a quant perspective, having only the latest released data gives an incomplete picture. Working with the initial numbers, even when they turn out to be misleading, can help to eliminate look-ahead bias and support more realistic back-testing.

In this Quants Corner, we explore the value of CEIC’s point-in-time data in building investment factors and test the benefits of utilizing the results in a multi-factor model rather than simply using them on a stand-alone basis.

Unlocking the door with the CEIC key

Point-in-time, for the purposes of this analysis, means what was known at a given moment. CEIC’s point-in-time data preserves every published indicator value with its specific date, giving users the ability to reconstruct exactly what was known at that moment. For each factor, the release history captures, separately, every publication date and the indicator value published on it. Those indicator values, which are measurements reported by specific economic or financial sources, can and do change as new information is released. Each of these versions is known as a vintage.

A good, and topical, example is US consumer price index (CPI) data. The indicator value of the CPI is released monthly, and as more data is collected or adjustments are made, the number can be revised. Each of these – the initial release, updates, and later revisions – represents a different vintage. For example, the US CPI index for January ’24 was first published by CEIC with a value of 141.3734 on 30 April ’24. This was later revised to 141.4397 on 27 June that same year.
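
To make the idea of a vintage concrete, here is a minimal Python sketch of an "as-of" lookup. It uses the two CPI values quoted above; the dictionary layout and the function name `value_as_of` are our own illustrative assumptions, not CEIC's API.

```python
from datetime import date

# Vintages for one data date (January 2024), keyed by publication date.
# The two values are the ones quoted in the text; the layout is hypothetical.
vintages = {
    date(2024, 4, 30): 141.3734,  # first release
    date(2024, 6, 27): 141.4397,  # later revision
}

def value_as_of(vintages, as_of):
    """Return the latest value published on or before `as_of`, or None."""
    known = [published for published in vintages if published <= as_of]
    if not known:
        return None  # nothing had been released yet
    return vintages[max(known)]

first = value_as_of(vintages, date(2024, 5, 15))    # sees only the first release
revised = value_as_of(vintages, date(2024, 7, 1))   # sees the revision
unknown = value_as_of(vintages, date(2024, 3, 1))   # before any release
```

A back-test that asks for the value as of mid-May gets the first release, which is what a market participant would have seen at that time.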

To create a factor from this data we utilize a flat file, which is a straightforward two-dimensional table with vintages as column names and data dates as row names. The CPI level is a data point released monthly and, by itself, does not tell us how much prices have changed over time. To capture the rate of change, we apply two steps to the flat file:

Shift the data dates forward by a year to compute twelve-month lagged CPI.
Divide the latest CPI level by the lagged CPI, subtract 1 and multiply the result by 100.
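
The two steps above can be sketched in a few lines of Python. This is a simplified illustration, assuming the flat file is held as a nested dictionary (vintage, then data date); the CPI levels are made-up round numbers and `yoy_inflation` is a hypothetical helper name.

```python
def yoy_inflation(flat_file):
    """flat_file: {vintage: {(year, month): cpi_level}} -> same shape, in % year on year."""
    out = {}
    for vintage, levels in flat_file.items():
        # Step 1: shift data dates forward by a year, so the level dated
        # (year - 1, month) becomes the twelve-month lag for (year, month).
        lagged = {(y + 1, m): level for (y, m), level in levels.items()}
        # Step 2: divide the level by its lag, subtract 1, multiply by 100.
        out[vintage] = {
            d: (levels[d] / lagged[d] - 1.0) * 100.0
            for d in levels
            if d in lagged  # skip dates with no lag available in this vintage
        }
    return out

# Hypothetical levels: the January 2024 figure is revised between vintages.
flat_file = {
    "2024-02": {(2023, 1): 100.0, (2024, 1): 103.0},
    "2024-03": {(2023, 1): 100.0, (2024, 1): 104.0},
}
inflation = yoy_inflation(flat_file)
```

Because the calculation is done within each vintage, the inflation rate for a given month uses only the levels that were published in that vintage.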

This transformation results in a flat file where months are rows, vintages are columns, and entries are the latest-known percentage inflation. The latest available inflation figure, for each date and each country, now provides a point-in-time indicator that can be back-tested.

A satisfying inflation of returns

A key strength of this approach is that it measures indicators as they were available to market participants in real time, not with the benefit of hindsight (i.e., data revised later). This ensures that any tested trading strategy is based on what could realistically have been known at the time.

The results are back tests that are not artificially skewed by information that wasn’t actually available at the time. More generally, indicators tend to perform well when they:

Capture economic signals (like inflation trends) that matter most for markets.
Can be applied across countries for consistent comparisons.

The point-in-time indicator of the latest-known percentage inflation figure, produced from CEIC’s CPI data, satisfies these conditions. Furthermore, this approach is not limited to the CPI. It can be applied to any indicator or factor where the change, rather than the level, conveys the true economic signal. This raises two questions: which factors fit this criterion, and can they be combined in a multi-factor model to produce better risk-adjusted returns?

To answer both questions, we created a pool of 13 factors derived from CEIC datasets and combined five of them as inputs for a standard EPFR multi-factor country model. Once the factors are applied, the model ranks the countries and sorts them into five equal baskets. The final step is to go long the top quintile and short the bottom quintile.
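
The ranking and long/short step can be illustrated with a short sketch. It assumes equal weighting within each basket and uses placeholder country codes and scores; it is not the EPFR model itself, only the generic quintile mechanics.

```python
def quintile_baskets(scores):
    """scores: {country: score}. Returns five lists of countries, best scores first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    return [ranked[i * n // 5:(i + 1) * n // 5] for i in range(5)]

def long_short_weights(scores):
    """Go long the top quintile and short the bottom one, equally weighted (an assumption)."""
    baskets = quintile_baskets(scores)
    longs, shorts = baskets[0], baskets[-1]
    weights = {country: 0.0 for country in scores}
    for country in longs:
        weights[country] = 1.0 / len(longs)
    for country in shorts:
        weights[country] = -1.0 / len(shorts)
    return weights

# Ten hypothetical countries, C0 to C9, scored 0 to 9.
scores = {f"C{i}": float(i) for i in range(10)}
weights = long_short_weights(scores)
```

With ten countries each basket holds two names, so the two highest-scoring countries each get a weight of 0.5 and the two lowest each get -0.5.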

The results based on the five factors selected are promising:

[Figure: back-test results and static weights for the five selected factors]

Before applying the static weights shown in our back-test results above, we must standardize the values of each factor to make them comparable. This is achieved through cross-sectional z-scoring, which allows us to build an overall model and produce a multi-factor signal.
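
As a rough illustration of this step, the sketch below z-scores each factor across countries on a single date and then blends the z-scores with static weights. The factor names, values and weights are placeholders chosen for the example, not the ones used in our model.

```python
from statistics import mean, pstdev

def zscore(cross_section):
    """Standardize {country: value} to zero mean and unit (population) standard deviation."""
    mu = mean(cross_section.values())
    sd = pstdev(cross_section.values())
    if sd == 0:
        return {country: 0.0 for country in cross_section}
    return {country: (value - mu) / sd for country, value in cross_section.items()}

def combine(factors, weights):
    """factors: {name: {country: raw value}}; weights: {name: static weight}."""
    # Keep only countries covered by every factor, then z-score within that set.
    countries = set.intersection(*(set(cs) for cs in factors.values()))
    z = {name: zscore({c: cs[c] for c in countries}) for name, cs in factors.items()}
    return {c: sum(weights[name] * z[name][c] for name in factors) for c in sorted(countries)}

# Hypothetical raw factor values on very different scales.
factors = {
    "inflation": {"A": 1.0, "B": 2.0, "C": 3.0},
    "growth": {"A": 30.0, "B": 20.0, "C": 10.0},
}
static_weights = {"inflation": 0.75, "growth": 0.25}  # placeholder weights
signal = combine(factors, static_weights)
```

After z-scoring, both factors are expressed in standard deviations from the cross-sectional mean, so the weights act on comparable quantities rather than on raw units.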

This multi-factor indicator is then back-tested to compare with results from the single-factor indicators. The resulting multi-factor Alpha has an even higher Sharpe ratio, as shown below, than any individual indicator. This means the multi-factor approach is safer to bet on, offering stronger risk-adjusted returns and greater reliability.
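
For readers less familiar with the metric, a Sharpe ratio can be computed from a series of periodic returns along the following lines. This sketch assumes monthly long/short returns and a zero risk-free rate; the return series is invented and the function name is our own.

```python
from math import sqrt
from statistics import mean, stdev

def sharpe_ratio(returns, periods_per_year=12):
    """Annualized mean return divided by annualized volatility (zero risk-free rate assumed)."""
    sd = stdev(returns)  # sample standard deviation; needs at least two returns
    if sd == 0:
        return float("nan")
    return mean(returns) / sd * sqrt(periods_per_year)

# Hypothetical monthly returns of a long/short strategy.
monthly_returns = [0.01, -0.005, 0.02, 0.0, 0.015]
sharpe = sharpe_ratio(monthly_returns)
```

Because the ratio scales return by volatility, doubling every return leaves it unchanged, which is why it is a convenient yardstick for comparing single-factor and multi-factor results.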

The back tests also show that results have been positive eight of the past 10 years:

[Figure: multi-factor back-test results by year]

They also hold up when the quintiles are refreshed at different frequencies.

[Figure: back-test results with quintiles refreshed at different frequencies]

Why stop now?

With CEIC offering access to over 30,000 traditional, alternative and differentiated data sets, we have barely scratched the surface when it comes to creating new factors that can be used singly or in combination to enhance returns.

Below is a table showing the universe of factors we created from CEIC data for this exercise.

[Table: universe of factors created from CEIC data for this exercise]

There are literally hundreds of potential indicators that can be created using CEIC’s PIT data. A future iteration of our multi-factor model may utilize factors derived from data on the number of seated diners, motor vehicle sales growth and the bankruptcy numbers captured in Google searches.

Stay tuned.

If you are not a CEIC client, explore how we can assist you in generating alpha by registering for a trial of our product: https://hubs.la/Q02f5lQh0
