A better gold valuation tool for investors

We launched Qaurum℠ almost two years ago in response to a vocal need for more robust and accessible gold valuation analytics. While these exist in abundance for other asset classes, gold investors have historically had to settle for something more cursory or incomplete.

Housed on our data and research portal Goldhub, Qaurum is an interactive tool, powered by our Gold Valuation Framework, that strives to help investors understand how gold prices are determined by the interaction of macroeconomic drivers and gold’s supply and demand (Focus 1). 

A question investors often ask is how Qaurum compares to a more simplistic model based on just two inputs: US real interest rates and a dollar index. This ‘simple’ model is often referenced in financial literature and appears to have worked well over the last few years. 

While US real rates and the dollar can explain gold well, the gold market is deeper and broader than these two factors imply. Seeing gold through such a narrow lens can be limiting both in understanding gold and in forming a robust view of its future performance. We find that…

  • A simple model assumes that prices are almost exclusively determined by US financial indicators, and it is often constructed in levels, which can pose statistical issues such as spurious relationships

  • There is a clear structural shift in the relationship between gold and US real rates in late 2008, which may or may not reverse in future

  • There have been several instances in history where the underlying relationships flipped and gold, the USD and real rates moved in tandem

  • Qaurum does more than just predict movements in the gold price. It’s designed to promote an understanding of the various and sometimes contradictory forces that drive gold and to offer a forward-looking view of how gold might respond in a variety of scenarios

  • On measures of in-sample and out-of-sample goodness-of-fit, Qaurum scores higher than the simple model (Table 1)

Table 1: Comparing the goodness-of-fit between Qaurum and the simple model

R-squared, Directional accuracy and Root Mean Squared Error (RMSE) across 4 sets of samples. All values in %.

Goodness-of-fit                 In-sample range (to)           Out-of-sample range (from)
measure               Model     2016   2017   2018   2019      2017   2018   2019   2020
R-squared             Qaurum      59     59     59     59         -      -      -      -
                      Simple      47     37     36     36         -      -      -      -
Directional accuracy  Qaurum      84     84     84     84       100    100    100    100
                      Simple      77     77     77     77        60     60     80    100
RMSE                  Qaurum       9    8.8    8.7    8.6       4.9    6.9    9.6   10.1
                      Simple    18.1   17.6     17   17.4       7.9    9.7   13.3   11.9

Samples incrementally increase by one year. For Qaurum, the full sample covers 1989 to 2020. For the simple model, the full sample runs from 1999 to 2020

Source: World Gold Council

The purpose of Qaurum

Before jumping into what the simple model is and how the two compare, it is worth highlighting why Qaurum and its underlying Gold Valuation Framework methodology were created.

Focus 1: Qaurum and the GVF serve four key purposes:

 

  • To provide a quantitative framework with which to understand both the breadth and depth of the gold market. That is to say, the importance of regions as well as sources of demand and supply. 
  • To quantify the interplay between macro factors and supply and demand in order to capture gold’s dual nature as a consumer good and investment asset as well as the impact of its scarcity on long-term returns
  • To generate a multifaceted view of scenario-based price paths. By multifaceted, we mean both a broad set of inputs to generate the view, as well as a broad set of scenarios to encapsulate positive and negative risks
  • To generate a view on long-run implied gold returns to facilitate asset allocation decisions and to help generate medium-term valuation assessments

It is these purposes combined that distinguish our proprietary model, Qaurum, from a simple model. So, let’s first take a look at the simple model in more detail: why it works well and why it perhaps doesn’t.

The simple model

What is the simple model and why is it so compelling? There are variations on what is classified as a simple model. But one using just one or two variables appears most common. 

Chart 1 illustrates how closely the simple model tracks the gold price. This one is constructed using the US 10-year Treasury Inflation Protected yield (TIP) as a proxy for real interest rates and the broad trade-weighted US dollar index to represent the US dollar.

 

Chart 1: The simple model has explained gold price movements well over the last 15 years

Simple model of gold price explained using US 10-year TIP yield and Broad US dollar index

Model estimated using OLS, in levels, on data from Jan 2007 to Dec 2020

Sources: Bloomberg, World Gold Council

 

The reason these two drivers capture gold prices so well lies in the opportunity-cost argument. Gold competes with all assets but most closely with those that resemble it best: a perpetual asset that holds its value over time and has no credit risk. In this case, sovereign US bonds are the closest observed proxy for such an asset.

Gold’s relationship with the US dollar captures some overlapping dynamics of monetary debasement in the long term, but also short-term dynamics such as globally competing real yields and non-dollar bloc buying for consumption and other non-investment reasons.

This model, and variations of it, are often used to explain gold price movements and appear to work very well, with an R-squared of 0.85 (see the Appendix Table 2 footnote in the full report for a caveat on applying R-squared to level data).

The model in Chart 1 is estimated on monthly end-of-period data. Table 2 in the appendix (in the full report) shows the R-squared and directional accuracy of the model based on different frequencies and using either end-of-period values or average values.

The caveats of the simple model

The first caveat, arguably well acknowledged by those who promote the simple model, is that it is too narrow in scope. The gold market operates globally, and more than two thirds of annual demand for investment gold comes from countries outside the US and Europe.[1] In addition, 58% of gold demand since 2010 has come from non-investment sources.[2] The velocity of investment is clearly higher than that of other sources of demand, as evidenced by volumes. However, as one decreases the frequency at which gold price movements are measured, investment’s relative importance shrinks through netting out, and other demand becomes more influential. Chart 2 shows that although US and EU investment can dominate demand at times in the short run, over longer periods it partially nets out. The sum of US/EU investment is 5,442t while the sum of all other demand is 29,781t.

 

Chart 2: US/EU investment partially nets out in the long run versus other sources of demand

Non-US/EU demand vs other sources of demand

Green bars are the sum of Jewellery fabrication, Technology, less recycling, central bank demand and non-EU/US bar & coin demand. Red bars are the sum of US/EU bar & coin demand, US/EU ETF demand and changes in COMEX futures positioning.

Sources: Bloomberg, Metals Focus, World Gold Council

 

The simple model is often estimated in levels – that is to say, the gold price is estimated as a function of the level of interest rates and the level of the dollar. While perhaps more intuitive, estimating a model in this format could be statistically problematic unless the two series have a long-run relationship; i.e. they are cointegrated. If they are not, then the relationship may be spurious.[3]
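The risk of a spurious fit in levels can be illustrated with a small simulation. The sketch below is not the model discussed here: it generates pairs of unrelated random walks (a purely synthetic assumption) and compares the average R-squared of a one-variable regression in levels with the same regression in period-on-period changes. The function names and the choice of 168 monthly observations (the length of Jan 2007 to Dec 2020) are illustrative.

```python
import random

def r_squared(x, y):
    """R-squared of a one-variable OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def random_walk(n, rng):
    """Cumulative sum of independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def diff(series):
    """Period-on-period changes of a series."""
    return [b - a for a, b in zip(series, series[1:])]

def average_r2(n_sims=200, n_obs=168, seed=0):
    """Average R-squared in levels and in changes across independent random-walk pairs."""
    rng = random.Random(seed)
    levels = changes = 0.0
    for _ in range(n_sims):
        x, y = random_walk(n_obs, rng), random_walk(n_obs, rng)
        levels += r_squared(x, y)
        changes += r_squared(diff(x), diff(y))
    return levels / n_sims, changes / n_sims

mean_r2_levels, mean_r2_changes = average_r2()
print(f"levels: {mean_r2_levels:.3f}  changes: {mean_r2_changes:.3f}")
```

Because the two series are unrelated by construction, any sizeable R-squared in levels is an artefact of their shared trending behaviour, which is why differencing (or a cointegration test) is the usual safeguard.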

However, there is clearly an economic link between interest rates and gold. Whether it is driven by the opportunity-cost motive or by the link interest rates have to the health of the economy, it is undeniably there. By deflating the gold price using inflation, one achieves a correct form for this relationship. However, given that one series is now deflated by realised inflation and the other by expected inflation, this may lead to other unintended consequences. To avoid the potential pitfalls of modelling in levels and also achieve the statistically desirable property of stationarity,[4] the common procedure is to difference the data, i.e. convert it to period-on-period changes.

However, before looking at a model in changes, it is worth noting that many variations of the simple model seen in use are estimated over a period that begins approximately during the Great Financial Crisis. But TIPs and the dollar index have joint data going back as far as early 1998. If we re-estimate the model using the full dataset, as is common practice in the absence of a sound reason to do otherwise, we see that its accuracy is considerably worse prior to the crisis (Chart 3).

 

Chart 3: The simple model has less accurately explained gold prices when using full history of TIPs data

The simple model of gold price explained using US 10-year TIP yield and Broad US dollar index

Model estimated using OLS, in levels, on data from Jan 2007 to Dec 2020

Sources: Bloomberg, World Gold Council

 

This shift suggests a change in one of the relationships – a structural break. By converting the data to a stationary differenced series, we can formally test for such a break.

The analysis strongly suggests a shift in the gold and real rate relationship at the end of 2008, with a c. six-fold increase in gold’s sensitivity to 10-year TIPs (see Chart 8 in the Appendix in the full report for detail).[5] This is likely the result of the new monetary policy regime that unfolded during the global financial crisis, which promoted the 10-year yield (nominal and real) as a global risk barometer, from which many assets – not just gold – began taking their cues to a much larger degree. But we can’t be sure.
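The text does not name the break test used, so the following is only a sketch of one standard option, the Chow test, applied to synthetic data. It simplifies the regression described in the footnote to a single regressor (changes in the real yield) and borrows the two reported coefficients, -0.019 and -0.123, as the assumed slopes before and after the break. The noise levels, sample sizes and function names are assumptions made for illustration.

```python
import random

def ssr_simple_ols(x, y):
    """Sum of squared residuals from OLS of y on an intercept and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

def chow_f(x, y, split, k=2):
    """Chow F-statistic for a break at index `split`; k = parameters per regime."""
    pooled = ssr_simple_ols(x, y)
    separate = ssr_simple_ols(x[:split], y[:split]) + ssr_simple_ols(x[split:], y[split:])
    return ((pooled - separate) / k) / (separate / (len(x) - 2 * k))

def simulate(beta_before, beta_after, n=276, split=129, seed=1):
    """Synthetic monthly changes: y = beta * x + noise, with beta switching at `split`."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 0.2) for _ in range(n)]  # assumed yield-change volatility
    y = [(beta_before if i < split else beta_after) * x[i] + rng.gauss(0.0, 0.01)
         for i in range(n)]
    return x, y

# 129 months before the break and 147 after, roughly Jan 1998-Sep 2008 and Oct 2008-Dec 2020
f_break = chow_f(*simulate(-0.019, -0.123), split=129)
f_stable = chow_f(*simulate(-0.123, -0.123), split=129)
print(f"F with break: {f_break:.1f}   F without break: {f_stable:.1f}")
```

A large F-statistic relative to the F(k, n - 2k) critical value points to different coefficients in the two sub-periods. The test presumes the break date is known in advance; when it is not, tests that search over candidate dates are generally preferred.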

More importantly, we don’t know if that structural change will unravel over the next few years. There is some credence to the argument that while the Fed and other central banks aren’t directly controlling yields, their policy actions are dampening the signals bond yields convey, making them less useful as a barometer going forward.[6] This is further compounded by how low yields are, as the zero bound for nominal yields makes them less effective as hedges in a crisis.

This presents a clear risk when using only two variables to forecast gold. Moreover, a model in ‘changes’ doesn’t work so well over periods of more than a few months – too short for most investor horizons when forecasting – as the errors accumulate at a much faster rate than a model in levels due to a greater presence of noise.

In addition, the simple model implies a stable inverse relationship between gold and both the US dollar and real interest rates. Yet looking back over 20 years (Chart 4), there have been several instances when gold has moved in the same direction as the dollar and the real interest rate. This shows that these relationships in fact vary over time and suggests that other variables are required to capture moves in gold prices more fully.
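Flagging such periods is a simple sign comparison. The sketch below assumes three equally long lists of month-end values; the function name and the sample figures are made up for illustration and are not the data behind Chart 4.

```python
def same_direction_periods(gold, dollar, real_yield):
    """Return indices t (>= 1) where all three series moved the same way from t-1 to t."""
    def sign(value):
        return (value > 0) - (value < 0)

    hits = []
    for t in range(1, len(gold)):
        moves = {sign(series[t] - series[t - 1]) for series in (gold, dollar, real_yield)}
        # Exactly one distinct non-zero sign means all three rose or all three fell
        if len(moves) == 1 and 0 not in moves:
            hits.append(t)
    return hits

# Made-up month-end values, for illustration only
gold = [1500.0, 1520.0, 1510.0, 1490.0, 1530.0]
dollar = [100.0, 101.0, 102.0, 101.5, 101.0]
real_yield = [0.50, 0.55, 0.60, 0.40, 0.45]
print(same_direction_periods(gold, dollar, real_yield))  # [1, 3]
```

In this toy sample, all three series rise in period 1 and all three fall in period 3, which are the kinds of months a strictly inverse two-factor relationship would not anticipate.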

 

Chart 4: Months when gold, the US dollar and US real yields have moved in the same direction

Vertical lines denote months during which gold, real yields and the US dollar have moved in the same direction

Gold is LBMA PM price, US dollar = Broad trade-weighted dollar index DXY, US real yield = US 10-year TIP yield. These periods overlap where the colours are darker.

Sources: Bloomberg, World Gold Council

 

Comparing the two models

We can illustrate in-sample that Qaurum is at least as good as the simple model at explaining changes in the gold price. But more importantly, especially from a forecasting perspective, Qaurum also ranks higher than the simple model in an out-of-sample comparison, both in terms of the size of the error (Root Mean Squared Error: RMSE) and in terms of directional accuracy.

Out-of-sample testing

One critical aspect of building a model and validating its accuracy is its performance on new data, also referred to as out-of-sample data. A common failing is that a model is built to explain a sample of data so accurately (known as over-fitting) that it has very little explanatory ability when new data arrives. One way to check for this is to split the data into two samples: an in-sample subset, which is used to estimate the model, and a separate out-of-sample subset on which to test the estimated model.
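An incrementally growing split of this kind can be sketched as follows. The helper name is hypothetical, and the assumption that every out-of-sample range runs through to the final year is ours; the ranges follow the "to 2016 / from 2017" pattern of Table 1.

```python
def expanding_splits(years, first_test_year, last_test_year):
    """Yield (in_sample, out_of_sample) year lists; the in-sample grows by one year each time."""
    for start in range(first_test_year, last_test_year + 1):
        in_sample = [y for y in years if y < start]
        out_of_sample = [y for y in years if start <= y <= last_test_year]
        yield in_sample, out_of_sample

years = list(range(1989, 2021))  # annual observations, 1989 to 2020
splits = list(expanding_splits(years, 2017, 2020))
for ins, outs in splits:
    print(f"in-sample {ins[0]}-{ins[-1]} | out-of-sample {outs[0]}-{outs[-1]}")
```

The model would be re-estimated on each in-sample list and scored only on the matching out-of-sample list, so no test-period data leaks into estimation.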

To address the question “Is Qaurum as good as the simple model?”, we looked at three common metrics of model performance: R-squared, directional accuracy and Root mean squared error (RMSE). We analysed the performance metrics both in-sample and out-of-sample.  Our out-of-sample range was set incrementally from 2017 to 2020 in four estimates.

The R-squared metric only covers the in-sample data. This is displayed in Chart 5. Changing the sample changes the R-squared for the simple model, with a clear slide between 2016 and 2017. For Qaurum, the R-squared remains practically unchanged at 0.59 for each of these samples.

The second metric we looked at was directional accuracy (Chart 6), defined as the share of years in which the model correctly predicted the direction of prices. Starting with the in-sample values, Qaurum has consistently better accuracy, missing five years in the sample from 1989, with only one of these since 1997. The simple model also missed five years: 1999, 2000, 2005, 2008 and 2014. Because its sample is shorter, its accuracy as a percentage of observations is therefore lower, at 77% vs 84%.
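As a minimal sketch, directional accuracy can be computed as the share of periods in which predicted and actual returns share a sign. The function name is illustrative. The arithmetic check below assumes the simple model's full sample of 1999 to 2020 stated under Table 1 and uses the five missed years listed above.

```python
def directional_accuracy(predicted, actual):
    """Share of periods in which predicted and actual returns have the same sign."""
    hits = sum(1 for p, a in zip(predicted, actual) if p * a > 0)
    return hits / len(actual)

# Arithmetic check for the simple model, assuming the full 1999-2020 sample
sample_years = range(1999, 2021)
missed = {1999, 2000, 2005, 2008, 2014}
share = (len(sample_years) - len(missed)) / len(sample_years)
print(f"{share:.1%}")  # 77.3%
```

Seventeen correct calls out of 22 years is consistent with the 77% reported. The same five misses over a longer sample would give a higher percentage, which is why the two models differ on this measure despite missing the same number of years.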

 

Chart 5: Qaurum’s in-sample R-squared is consistently higher than the simple model’s

Comparison between Qaurum and simple model across four samples of incrementally increasing length

R-squared is not calculated in Qaurum when solving for return. We obtain an R-squared by regressing our implied return on actual gold returns. We don’t report out-of-sample R-squared due to limited observations.

Sources: World Gold Council

 

 

Chart 6: Qaurum consistently achieves higher directional accuracy than the simple model, both in- and out-of-sample

Comparison between Qaurum and simple model across four sets of samples

Sources: World Gold Council

 

More important, however, is the out-of-sample accuracy. Qaurum gets the direction right in every instance after 2016. The simple model misses two years in the first two samples and one in the third, before correctly predicting the direction of the return in 2020, giving a 100% accuracy score in that sample.

The final metric (Chart 7), and one commonly used to assess the out-of-sample performance of models, is RMSE. This can be interpreted as the standard deviation of the forecast error. The in-sample error for the simple model is almost twice that of Qaurum, while its out-of-sample error is higher in each instance.
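For reference, RMSE is the square root of the average squared forecast error. The sketch below uses made-up numbers, and the function name is illustrative.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual values."""
    errors = [p - a for p, a in zip(predicted, actual)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Made-up annual returns in %, for illustration only
predicted = [10, 5, -2, 8]
actual = [7, 9, -2, 8]
print(rmse(predicted, actual))  # 2.5
```

The standard-deviation reading holds exactly only when the average error is zero: in general, squared RMSE equals the variance of the errors plus the square of their mean, so a persistent bias also raises RMSE.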

We also tested whether the errors were statistically different between the in-sample data and out-of-sample data using a two-sample unequal-variance test. The p-values were all larger than 0.38 for Qaurum and 0.30 for the simple model, so we cannot reject the hypothesis that the RMSE values are consistent across samples.
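A two-sample unequal-variance test is commonly implemented as Welch's t-test; that this is the exact test used here, and which error series were compared, are assumptions. The sketch below computes the Welch statistic and its approximate degrees of freedom on made-up samples. Converting these to a p-value requires the t-distribution, which is not in the Python standard library (SciPy's `scipy.stats.ttest_ind` with `equal_var=False` does both steps).

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Made-up error samples with equal means but different variances
t_stat, dof = welch_t([1, 2, 3], [0, 2, 4])
print(t_stat, round(dof, 3))
```

A t statistic close to zero, as in this toy case, corresponds to a large p-value, meaning the hypothesis of equal means cannot be rejected.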

 

Chart 7: RMSE is considerably lower for Qaurum in-sample and consistently lower out of sample

Comparison between Qaurum and simple model across four sets of samples

Sources: World Gold Council

 

Conclusion

In summary, we were asked why Qaurum, with its apparent intricacies, is needed when a simple model explains the gold price so well. In our view, Qaurum serves several purposes, and its accuracy is not the only important aspect to consider. Above, we highlighted some of the pitfalls of using a very lean model to describe gold’s behaviour. Such models may have statistical issues, or they may simply be short-sighted and therefore less able to adapt to changing environments.

We also measured goodness-of-fit for both models, both in-sample and out-of-sample, and the results show that Qaurum was superior on all metrics and in all instances. This is not to say that Qaurum is infallible. It has its limitations as well: it is constrained to annual data, has few observations and relies on a broad but limited set of drivers. Despite this, it has worked very well since its launch. In particular, its directional accuracy over the last 24 years, during which the model missed just a single year (2009), is encouraging and suggests that it is capturing a broad enough set of drivers to allow for changing sensitivities in the model; that is, for the varying importance of gold’s drivers in different environments.

Models, by design, strive for simplicity. Financial markets are complicated, and models can help remove some of the complexity. But gold is a particularly rich market, being global in nature and with a multitude of functions and uses. As such, a very simple model that captures only a proportion of its characteristics may not be apt. For example, equity investors might use price-to-book/cash-flow/earnings to value companies or indices. But while these measures are necessary, they alone are not sufficient to establish a robust investment strategy. So why should investors settle for a two-factor model for gold? Qaurum strives to present a simplification of the gold market, accurately and consistently, while also capturing its depth and breadth.

[1] 10-year average of US and European ETF + Bar & coin demand as a share of global ETF + Bar & coin demand. Figure excludes OTC investment.

[2] Sum of Jewellery and Technology demand as a share of total gold demand.

[3] A spurious relationship between two variables is loosely defined as one where the relationship exists because of the presence of some third variable that ‘imparts correlation to them’. Time is a good example of a third variable, which often makes completely unrelated things look like they are correlated.

[4] Stationarity in the strictest sense refers to the property whereby a stochastic process has the same probability distribution at any point in time. It is more common to settle for a weaker form where the mean and variance are stationary.

[5] A regression of Δ log gold on Δ log dollar index and Δ US 10-year TIP yield sees a significant shift in the coefficient on the TIP yield from -0.019 to -0.123 from the estimation period Jan 1998 to Sep 2008 to Oct 2008 to Dec 2020.

[6] “Real rates ain’t as real as you think”, Financial Times (ft.com)

Important disclaimers and disclosures

© 2021 World Gold Council. All rights reserved. World Gold Council and the Circle device are trademarks of the World Gold Council or its affiliates.

All references to LBMA Gold Price are used with the permission of ICE Benchmark Administration Limited and have been provided for informational purposes only. ICE Benchmark Administration Limited accepts no liability or responsibility for the accuracy of the prices or the underlying product to which the prices may be referenced. Other content is the intellectual property of the respective third party and all rights are reserved to them. 

Reproduction or redistribution of any of this information is expressly prohibited without the prior written consent of World Gold Council or the appropriate copyright owners, except as specifically provided below. Information and statistics are copyright © and/or other intellectual property of the World Gold Council or its affiliates (collectively, “WGC”) or third-party providers identified herein. All rights of the respective owners are reserved.

The use of the statistics in this information is permitted for the purposes of review and commentary (including media commentary) in line with fair industry practice, subject to the following two pre-conditions: (i) only limited extracts of data or analysis be used; and (ii) any and all use of these statistics is accompanied by a citation to World Gold Council and, where appropriate, to Metals Focus, Refinitiv GFMS or other identified copyright owners as their source. World Gold Council is affiliated with Metals Focus.

WGC does not guarantee the accuracy or completeness of any information nor accepts responsibility for any losses or damages arising directly or indirectly from the use of this information.

This information is for educational purposes only and by receiving this information, you agree with its intended purpose. Nothing contained herein is intended to constitute a recommendation, investment advice, or offer for the purchase or sale of gold, any gold-related products or services or any other products, services, securities or financial instruments (collectively, “Services”). This information does not take into account any investment objectives, financial situation or particular needs of any particular person. 

Diversification does not guarantee any investment returns and does not eliminate the risk of loss. Past performance is not necessarily indicative of future results. The resulting performance of any investment outcomes that can be generated through allocation to gold are hypothetical in nature, may not reflect actual investment results and are not guarantees of future results. WGC does not guarantee or warranty any calculations and models used in any hypothetical portfolios or any outcomes resulting from any such use. Investors should discuss their individual circumstances with their appropriate investment professionals before making any decision regarding any Services or investments.

This information may contain forward-looking statements, such as statements which use the words “believes”, “expects”, “may”, or “suggests”, or similar terminology, which are based on current expectations and are subject to change. Forward-looking statements involve a number of risks and uncertainties. There can be no assurance that any forward-looking statements will be achieved. WGC assumes no responsibility for updating any forward-looking statements.

Information regarding Qaurum℠ and the Gold Valuation Framework

Note that the resulting performance of various investment outcomes that can be generated through use of Qaurum, the Gold Valuation Framework and other information are hypothetical in nature, may not reflect actual investment results and are not guarantees of future results. Neither WGC nor Oxford Economics provides any warranty or guarantee regarding the functionality of the tool, including without limitation any projections, estimates or calculations.