Beyond Diversification

On markets, I don’t have a crystal ball. None of us do. But after decades of investing, I can usually offer an answer that’s rooted in experience and informed by the research and insights of the talented investors I’m lucky enough to call my colleagues. (Location 109)

Great investors have an almost insatiable intellectual curiosity. They want to know why things are the way they are and how they might be different. Many people love engaging with capital markets. What sets the great investors apart is their all-consuming passion for uncovering insights. (Location 114)

Great investors use that collaboration to arrive at their own original thinking. Our corporate culture at T. Rowe Price empowers these investors to thrive. (Location 118)

He focuses on one of the key, yet rarely discussed, building blocks of the model: the (clearly unrealistic) assumption that investors can borrow all they want at the risk-free rate. (Location 199)

That’s far from Elton and Gruber’s description of the CAPM as “one of the most important discoveries in the field of finance.” Taleb’s concern is that modern portfolio theory and the CAPM were derived under Gaussian assumptions. (Location 245)

In my view, despite the multidecade academic equivocation on its merit, the CAPM is as good as any other starting point for multi-asset investors, because it relates expected returns to risk. (Location 257)

Beta = (asset volatility/market volatility) × correlation (Location 266)

Expected return = risk-free rate + beta × (market expected return – risk-free rate) (Location 268)
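
The two formulas above translate directly into code. The sketch below is only an illustration: the function names and the input values are assumptions, not figures from the book.

```python
def capm_beta(asset_vol, market_vol, correlation):
    """Beta = (asset volatility / market volatility) x correlation."""
    return (asset_vol / market_vol) * correlation


def capm_expected_return(risk_free, beta, market_expected_return):
    """Expected return = risk-free rate + beta x market risk premium."""
    return risk_free + beta * (market_expected_return - risk_free)


# Hypothetical inputs, chosen only for illustration.
beta = capm_beta(asset_vol=0.20, market_vol=0.10, correlation=0.5)
expected = capm_expected_return(risk_free=0.02, beta=beta,
                                market_expected_return=0.06)
print(f"beta = {beta:.2f}, expected return = {expected:.2%}")
```

With these inputs the asset is twice as volatile as the market but only half correlated with it, so its beta is 1 and its expected return equals the market's.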

To account for the “historical data” critique, our estimate of the risk-free rate must be forward-looking. We can think of the CAPM as a simple building block approach: we start with the current risk-free rate and add a risk premium (scaled by the asset’s beta). (Location 280)

“When the tide comes down, we’ll see who’s been skinny-dipping” is a quote frequently attributed to Warren Buffett. With apologies for my overuse of the tide analogy, my point is that bear (or even just weak) financial markets are harder to navigate for unskilled investors and can make life difficult for plan sponsors, financial advisors, and individual investors. (Location 303)

As much as we try to keep it simple, it’s never easy to forecast returns. Markowitz and Sharpe seem to disagree on the theory behind CAPM, even though it’s one of the most important foundations of modern finance. (Location 317)

To estimate the total market risk premium, we also need to forecast bond returns. (In theory, we need to forecast returns on all assets, including illiquid investments, but for practical reasons I’ll focus on liquid markets.) (Location 346)

Bond investors tend to worry about rising rates because of the short-term losses that occur when the rate hikes aren’t already priced into the forward curve. However, contrary to conventional wisdom, this example illustrates how rising rates are good for bonds: higher rates mean higher reinvestment rates and, ultimately, higher expected returns. (Location 358)

have converged to the initial yield to maturity at a time horizon that matches duration. This result holds across a variety of rate paths. (Location 369)
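
A rough way to check this convergence numerically is sketched below. It assumes a single, immediate, parallel shift in yields (a much simpler rate path than a realistic one), a hypothetical 10-year bond with a 3% annual coupon, and reinvestment of every coupon at the new yield. All names and numbers are illustrative assumptions.

```python
def price(cash_flows, y):
    """Present value of (time_in_years, amount) cash flows at yield y."""
    return sum(cf / (1 + y) ** t for t, cf in cash_flows)


def macaulay_duration(cash_flows, y):
    """Present-value-weighted average time to the cash flows."""
    p = price(cash_flows, y)
    return sum(t * cf / (1 + y) ** t for t, cf in cash_flows) / p


def horizon_return(cash_flows, y0, y1, horizon):
    """Annualized return to `horizon` if yields jump from y0 to y1 at once.

    Cash flows received before the horizon are reinvested at y1; those
    due after it are valued at y1 as of the horizon date.
    """
    p0 = price(cash_flows, y0)
    value = sum(cf * (1 + y1) ** (horizon - t) for t, cf in cash_flows)
    return (value / p0) ** (1 / horizon) - 1


# Hypothetical 10-year bond, 3% annual coupon, bought at a 3% yield.
bond = [(t, 3.0) for t in range(1, 10)] + [(10, 103.0)]
y0 = 0.03
d = macaulay_duration(bond, y0)
results = {shock: horizon_return(bond, y0, y0 + shock, d)
           for shock in (-0.02, 0.0, 0.02)}
for shock, r in results.items():
    print(f"shock {shock:+.0%}: annualized return to duration = {r:.3%}")
```

Under these assumptions, the price loss from a rate rise and the gain from reinvesting at higher rates roughly offset each other at a horizon equal to the bond's duration, so the annualized return stays close to the initial yield whichever way rates move.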

What explains these inconsistencies? The main reason is that I use a valuation approach to determine the market risk premium, while the CAPM is a statistical, risk-pricing approach. (Location 447)

For stocks, the next step is to refine our earnings forecasts. Then we can decompose expected returns into more detailed building blocks. When I estimated the equity market risk premium as an input to the CAPM, I showed that the simplest way to forecast equity returns based on valuation is to invert the price-to-earnings ratio. (Location 532)

Jeremy Grantham, founder and chief investment officer of GMO, also argues that recent earnings growth is sustainable, a regime change of sorts, due to “increased monopoly, political, and brand power.” (Location 556)

I don’t find that argument particularly convincing. My colleague David Giroux, portfolio manager and CIO at T. Rowe Price and one of the most talented investors I’ve ever worked with, recently pointed out in an internal memo (Q1 2018) that recent earnings on the S&P 500 have been inflated by a recovery in the energy sector and a weak US dollar, two factors that may not prevail going forward. (Location 562)

Realized return = dividend yield + price change (Location 589)

Realized return = dividend yield + growth + valuation change (Location 591)

For income, there’s good news. Even though the price component of dividend yield is quite unpredictable, companies tend to adjust their payout ratio to smooth dividend yields over time. (Dividend yield = dividend/price, and the payout ratio is the percentage of earnings paid out as dividends.) (Location 602)

Corporate finance fundamentals give us the “sustainable growth rate,” defined as the return on equity (ROE) multiplied by the retention rate (the percentage of earnings that’s not paid out as dividends). (Location 607)
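
The building blocks above (income, growth, valuation change) can be combined in a few lines. This is a minimal sketch with hypothetical inputs; setting the valuation change to zero by default is an assumption made here for illustration.

```python
def sustainable_growth(roe, payout_ratio):
    """Sustainable growth = ROE x retention rate (1 - payout ratio)."""
    retention_rate = 1.0 - payout_ratio
    return roe * retention_rate


def building_block_return(dividend_yield, growth, valuation_change=0.0):
    """Return = dividend yield + growth + valuation change."""
    return dividend_yield + growth + valuation_change


# Hypothetical inputs: 12% ROE, half of earnings paid out, 2% dividend yield.
growth = sustainable_growth(roe=0.12, payout_ratio=0.50)
forecast = building_block_return(dividend_yield=0.02, growth=growth)
print(f"growth = {growth:.2%}, forecast return = {forecast:.2%}")
```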

However, over the long run, there’s a powerful mean-reversion effect in valuation levels. Hence, a fair assumption is that valuation changes average out to zero over time. The problem is that it’s hard to model the speed of mean reversion, and the path is never smooth. (Location 628)

For my project, I compared the effectiveness of two valuation change models: one that assumed mean reversion and another that used proprietary data on institutional investor flows. (Location 634)

There’s an elephant in the room of this debate around how to best estimate our “growth” building block. Ultimately, to forecast earnings growth, we need fundamental analysis. (Location 726)

I recently set up a horse race between the following ratios: price to earnings (P/E), price to book (P/B), and price to cash flow (P/CF). I wanted to know which of these valuation measures would produce the highest correlation with subsequent valuation changes, which I defined as:

Valuation change = realized return – (income + growth) (Location 745)

So far, we’ve focused on relatively long-term return forecasts, or “capital markets assumptions.” These types of forecasts are useful for strategic asset allocation. Also, we saw that we can apply the equity building block model at shorter horizons, as in the equity country allocation model we discussed. (Location 866)

While valuation is the main driver for our decisions, we account for macro factors (growth, inflation, central bank policy, geopolitical factors, etc.), index-level fundamentals (sales, earnings, margins, leverage, etc.), and technicals (sentiment, positioning, flows, momentum, etc.). Before each meeting, every committee member pores through an up-to-date book of over 180 pages, full of all types of relevant data, return signals, and risk analytics. (Location 905)

First, we reviewed an analysis prepared by Chris Faulkner-MacDonagh on the impact of macro factors on stocks vs. bonds. While the committee focuses on relative valuations, we also consider macro factors. We know that markets anticipate, or price in, growth and inflation consensus forecasts. Therefore, we always compare our views with what’s priced in. (Location 947)

Simultaneously, while most fixed income portfolio managers remain slightly overweight spread risk, equity portfolio managers across the firm have been de-risking their portfolios (which leads to a “doubling-up” effect of sorts: underweight stocks in Tactical Asset Allocation, and underweight equity beta in security selection). (Location 955)

In 2018, I tested the effectiveness of three valuation metrics as predictors of stock returns: price to earnings (P/E), price to book (P/B), and price to cash flow (P/CF). (Location 984)

Valuation seemed to work consistently across equity asset classes and time horizons. Out of 192 correlations, across absolute and relative return forecasts, 187 had the expected sign, consistent with the “buy cheap (valuations), sell high (valuations)” intuition: a remarkable hit rate of 97%. (Location 997)

Also, it was surprising that the signals seemed to work well at shorter horizons (six months and one year). (Location 999)

Consistent with my prior valuation horse race, to forecast absolute returns, P/CF worked better than P/B and P/E. At the one-year horizon, P/CF had an average correlation of over –43% with forward returns, compared with –26% for P/B and –22% for P/E. (Location 1007)

P/E worked better than P/CF at the six-month, one-year, and two-year horizons and worked about the same as P/CF at the three-year horizon. Why didn’t P/CF win? What happened to my simple explanation that cash flows are harder to game and thereby more reliable? (Location 1010)

I’ve coauthored papers with Rob. He has one of the most intuitive minds I’ve ever come across. He grasps issues quickly. When I asked him my question, he nailed the answer in two words: “Systemwide noise.” (Location 1032)

Systemwide noise. Another way to think about this important distinction is that some investors may seek to predict when to invest in stocks (a time series question), while others may focus on whether value stocks will outperform growth stocks (a cross-sectional question). (Location 1038)

These factors are mapped to the pair trades we’ve discussed in this book so far (stocks versus bonds, value versus growth stocks, say small cap versus large cap, high yield versus investment-grade bonds, etc.). (Location 1168)

Asness simply means that “momentum works.” Forbes responds with obvious skepticism: “Sort of the investing equivalent of horoscopes.” (Location 1240)

The key to momentum, however, is not to ignore valuation. It’s to combine it with valuation, as Asness explains in his interview, and as we’ll discuss further. Investors should look for opportunities when both signals agree. For example, buy a cheap asset when its momentum turns positive rather than on the way down, or sell an expensive asset with negative momentum. (Location 1255)

Geczy and Samonov found similar results. They showed that the momentum signal is strongest at the one-month horizon and degrades gradually as the investment horizon gets longer. For stocks, the signal turns into mean reversion roughly around the one-year horizon. (Location 1285)

In my introduction to this book I talked about the GIGO (garbage-in, garbage-out) critique. I explained that quants tend to get annoyed with the GIGO critique. Another critique, perhaps even more irritating and a favorite of academic journal referees, is to question an approach’s “robustness.” (Location 1465)

The subtext was “Welcome newbie, you need to learn how we do things here. We use judgment.” One of the reasons I was taken aback by his “I made it up” response was that he was a well-known quant investor. He used econometrics, statistics, and systematic strategies as core parts of his process. Had he become disillusioned with mathematical approaches? (Location 1469)

In his recent book Principles, legendary investor Ray Dalio (2017) explains the evolution of his approach to marry quantitative analysis with judgment: Rather than blindly following the computer’s recommendations, I would have the computer work in parallel with my own analysis and then compare the two. When the computer’s decision was different from mine, I would examine why. Most of the time, it was because I had overlooked something. In those cases, the computer taught me. But sometimes I would think about some new criteria my system would’ve missed, so I would teach the computer. We helped each other. (Location 1483)

At the top of their paper, they picked a brilliant quote. It gets at the heart of the issue, which is that asset prices are set by humans. The quote is from the famous physicist and Nobel laureate Richard Feynman: “Imagine how much harder physics would be if electrons had feelings!” (Location 1497)

I began my presentation with the observation that the financial services industry seems obsessed with return forecasting. Asset owners, investment managers, sell-side strategists, and financial media pundits all invest considerable time and resources to predict the direction of markets. Even so, I argued, risk-based investing may provide easier and more robust ways to improve portfolio performance, often without the need to forecast returns directly. (Location 1609)

Whatever the root cause, I insisted that investors must manage exposure to large and sudden losses. And to do so, they must recognize that volatility—and thereby exposure to loss—is not stable through time. (Location 1621)

This example made it clear that a constant (fixed weight) asset allocation does not deliver a constant risk exposure. (Location 1625)

The managed volatility strategy is designed for that purpose. It relies on the short-term predictability in volatility. In a nutshell, the strategy adjusts the asset mix over time to stabilize a portfolio’s risk. (Location 1631)

Equity futures were allocated 70% to the S&P 500 and 30% to the MSCI EAFE (Europe, Australasia, and the Far East) Index futures, to reflect the neutral US/non-US equity mix inside the balanced strategy. Last, we imposed a minimum daily trade size of 1% and maximum trade size of 10% of the portfolio’s notional. To forecast volatility, we used a DCC–EGARCH model (dynamic conditional correlation, exponential generalized autoregressive conditional heteroscedasticity) with fat-tailed distributions. (Location 1665)
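
The mechanics can be sketched as follows. This is not the study's algorithm: the DCC–EGARCH forecast is not implemented (the volatility forecast is simply passed in), and the scaling rule, the neutral exposure, the target volatility, and the exposure cap are all assumptions for illustration. Only the 1% minimum and 10% maximum daily trade sizes come from the text.

```python
def desired_equity_exposure(forecast_vol, target_vol, neutral_exposure,
                            max_exposure):
    """Scale the neutral exposure by target / forecast volatility, capped."""
    scaled = neutral_exposure * target_vol / forecast_vol
    return max(0.0, min(scaled, max_exposure))


def bounded_trade(current, desired, min_trade=0.01, max_trade=0.10):
    """Daily trade toward `desired`, honoring min and max trade sizes."""
    gap = desired - current
    if abs(gap) < min_trade:
        return 0.0                       # gap too small to trade
    size = min(abs(gap), max_trade)      # cap the size of a single day's trade
    return size if gap > 0 else -size


# Hypothetical day: forecast volatility spikes to 25% against a 10% target.
desired = desired_equity_exposure(forecast_vol=0.25, target_vol=0.10,
                                  neutral_exposure=0.60, max_exposure=0.75)
trade = bounded_trade(current=0.60, desired=desired)
print(f"desired exposure = {desired:.0%}, today's trade = {trade:+.0%}")
```

In this sketch the trade cap means the portfolio moves toward its desired exposure over several days rather than in one step.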

As expected, over the 18-year period studied, managed volatility consistently stabilized volatility compared with its static benchmark. The algorithm worked particularly well during the 2008–2009 financial crisis. The strategy was quite tactical. (Location 1673)

In terms of managing tail risk specifically, one way to explain how managed volatility works is to represent portfolio returns as a “mixture of distributions,” which is related to the concept of risk regimes. (Location 1687)

Stefan Hubrich points out that there is another way to think about how managed volatility may increase Sharpe ratios in certain market environments: think of time diversification as similar to cross-asset diversification. (Location 1702)

Ultimately, volatility is forecastable because it “clusters”; hence, managed volatility stabilizes realized volatility and reduces tail risk. (Location 1712)

Investors can use managed volatility to reduce the tail-risk exposure in covered call writing. The idea is to think of risk-based strategies as a set of tools, rather than stand-alone approaches. Low or even negative correlations between risk-based strategies can add a lot of value to a portfolio, even when the individual strategies’ Sharpe ratios are relatively low. (Location 1762)

I asked Rob if he could calculate the correlation between past and future volatility, skewness, and kurtosis for each of these 33 return series. I also asked him to vary the trailing windows as follows: one month (21 days), six months, one year, three years, and five years. I asked him to vary the forecast horizons in the same way and to repeat the entire experiment for daily, weekly, and monthly data. (Location 1911)

First, persistence in volatility was highest at short horizons. It’s easier to forecast volatility one month ahead than one year ahead, and so on. Across all combinations of data windows, forecast horizons, and data frequencies, month-to-month volatility based on daily data showed the most persistence. The average correlation between this month’s daily volatility and next month’s was +68%, with little variation across asset classes and bets. It was the highest positive correlation across all risk forecast permutations for 32 out of 33 asset classes or bets—a remarkably consistent result. (Location 1929)
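
The persistence measure described here, the correlation between one month's volatility and the next month's, can be sketched as below. The return series is synthetic and deterministic, built so that volatility clusters by construction; it is not data from the study, and the function names are assumptions.

```python
import statistics


def block_volatilities(daily_returns, window=21):
    """Standard deviation of each non-overlapping block of `window` days."""
    n_blocks = len(daily_returns) // window
    return [statistics.stdev(daily_returns[i * window:(i + 1) * window])
            for i in range(n_blocks)]


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def volatility_persistence(daily_returns, window=21):
    """Correlation between each block's volatility and the next block's."""
    vols = block_volatilities(daily_returns, window)
    return pearson(vols[:-1], vols[1:])


# Synthetic series: calm and turbulent months alternate in runs of three.
scales = [0.005, 0.005, 0.005, 0.02, 0.02, 0.02] * 2
returns = [s if day % 2 == 0 else -s
           for s in scales for day in range(21)]
persistence = volatility_persistence(returns)
print(f"month-to-month volatility correlation: {persistence:.2f}")
```

Changing `window` varies the estimation window and forecast horizon together; testing mismatched windows, as the study did, would need separate lookback and forward blocks.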

A more surprising result was that shorter estimation windows also produced better forecasts at longer horizons. This result is important for asset allocation because it means we can also improve risk-adjusted returns without tactical volatility management. (Location 1936)

On the other hand, when it came to data frequency (daily, weekly, or monthly), our experiments revealed that for the same estimation window, more data were better than less data. (Location 1948)

We didn’t expect this result. We expected that it would be better to match data frequency between estimation and forecast. In other words, we expected that daily data would best forecast daily volatility, weekly data would best forecast weekly volatility, and so on. But we found that on average, daily data almost always produced the best forecasts, even for forward monthly volatility. (Location 1951)

persistence works best at short horizons, shorter estimation windows work better even for longer horizons, and higher-frequency data work better than lower-frequency data. (Location 1955)

Another consistent pattern emerged at longer horizons: mean reversion. It started to appear when we used a three-year estimation window to forecast three-year volatility. It became strongest with the five-year estimation window for five-year forward realized volatility, across daily, weekly, and monthly frequencies. (Location 1957)

A key takeaway from our results is that while we found persistence at short horizons, especially one month ahead, the results were the opposite for long horizons. Five years of calm markets are more likely to be followed by five years of turbulence, and vice versa. (Location 1977)

But it has been well documented that correlations tend to increase in down markets, especially during crashes (i.e., “left-tail events”). Studies have shown that this effect is pervasive for a large variety of financial assets, including individual stocks, country equity markets, global equity industries, hedge funds, currencies, and international bond markets. (Location 2038)

We added another angle (as usual, Mark’s idea): we showed that not only did correlations increase on the downside, but they also significantly decreased on the upside. This asymmetry is the opposite of what investors want. Indeed, who wants diversification on the upside? Upside unification (or antidiversification) would be preferable. During good times, we should seek to reduce the return drag from diversifiers. (Location 2058)

We also calculated the stock-bond correlation when bonds were in their bottom 5%. We found that bonds diversify stocks when stocks sell off, but stocks do not diversify bonds when bonds sell off. Double conditioning would fail to reveal this lack of symmetry in the diversification between the two assets. (Location 2078)
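
The single-conditioning calculation can be sketched as follows: measure the correlation only over the periods when one asset is in its own worst tail, then swap the roles. The 16 data points below are invented purely to show the mechanics and are not empirical evidence. A 25% tail replaces the 5% tail only because the sample is tiny.

```python
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def tail_correlation(x, y, pct=0.05):
    """Correlation of x and y over periods when x is in its bottom `pct`."""
    k = max(2, int(round(len(x) * pct)))
    worst = sorted(range(len(x)), key=lambda i: x[i])[:k]
    return pearson([x[i] for i in worst], [y[i] for i in worst])


# Invented returns (%): four stock crashes, four bond sell-offs, eight calm.
stocks = [-10, -9, -8, -7, -4, -3, -2, -1, 1, 2, 3, 4, 1, 2, 3, 4]
bonds = [4, 3, 2, 1, -10, -9, -8, -7, 1, 1, 2, 2, 1, 1, 2, 2]

when_stocks_sell_off = tail_correlation(stocks, bonds, pct=0.25)
when_bonds_sell_off = tail_correlation(bonds, stocks, pct=0.25)
print(f"stocks in their tail: {when_stocks_sell_off:+.2f}")
print(f"bonds in their tail:  {when_bonds_sell_off:+.2f}")
```

Conditioning on each asset separately is what exposes the asymmetry; a calculation that requires both assets to be in their tails at once would blend the two cases.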

But there is more to these statistics than meets the eye. Private assets’ reported returns suffer from the smoothing bias. In fact, in a paper my former colleagues Niels Pedersen, Fei He, and I published in the Financial Analysts Journal in 2014, titled “Asset Allocation: Risk Models for Alternative Investments” (which won a Graham and Dodd Award for Excellence in Research), we show that private assets’ diversification advantage is almost entirely illusory. (Location 2156)

In light of our results, and as reinforced by the events of 2020, I recommend that investors avoid the use of full-sample correlations in portfolio construction—or, at least, that they stress-test their correlation assumptions. Scenario analysis, either historical or forward-looking, should take a bigger role in asset allocation than it does. (Location 2252)

Finally, investors should look beyond diversification to manage portfolio risk. Tail-risk hedging (with equity put options or proxies), risk factors that embed short positions or defensive momentum strategies, and dynamic risk-based strategies all provide better left-tail protection than traditional diversification. (Location 2265)

Can we forecast correlations? The answer: sort of. As we discussed in the previous chapter, the left-tail events that drive correlation shifts in the markets are basically impossible to predict. Perhaps all we can do is build the airplane to withstand the turbulence whenever it occurs. Nonetheless, if volatility is persistent, and if we can forecast portfolio volatility, which is in part driven by correlations among the underlying assets, shouldn’t we observe some persistence in correlations as well? (Location 2344)

As we did for our volatility study, we mixed the frequencies between lookback and forecast data. By intuition, most investors match these frequencies. To forecast weekly correlations, they use past weekly correlations; to forecast monthly correlations, they use past monthly correlations; etc. Instead, we tested which of the daily, weekly, or monthly correlations best predicted future correlations at all frequencies. Our volatility study had revealed that higher frequencies led to better forecasts across the board. Would the same result hold for correlations? (Location 2384)

The month-to-month correlation in daily volatility was +68% (across asset classes and bets), compared with +4% for skewness, +2% for kurtosis, and +40% for correlations. Like volatilities, correlations were more persistent at short horizons than at long horizons, with some mean reversion observed at the three-year and five-year horizons (when we used three-year and five-year lookback windows). (Location 2389)

Though we found that shorter windows generally worked better, the “sweet spot” for correlation predictability was between six months and one year. We found a similar pattern in the longer French data sample for US stocks. (Location 2393)

For example, suppose we forecast the correlation between two asset classes for the next 52 weeks. As with volatility estimates, many investors may be tempted to match the lookback with the forecast windows. Erb, Harvey, and Viskanta (1994) seem to consider this choice as self-evident. In this case, we would use the correlation observed over the last 52 weeks as our forecast for correlation over the next 52 weeks. (Location 2397)

Yet when averaged across our asset class pairs, the best estimate of forward 52-week correlations would have been the trailing one-year daily correlations, with a +25% average correlation between trailing and forward correlations, compared with +22% for weekly and +14% for monthly data. (Location 2400)

The rule of thumb for asset allocators is that correlations show some useful predictability when we use daily data from the last 6 to 12 months. (Location 2405)

But ultimately, most investors care about exposure to loss. Conditional value at risk (CVaR), for example, is a popular risk measure. It provides an estimate of the average loss size during sell-offs, usually defined as the bottom 1, 5, or 10% of outcomes. For example, an annual CVaR of –10% at the 5% level means that in the worst 5% of years (roughly one year in 20), we can expect to lose about 10% on average. (Location 2408)
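
A historical CVaR estimate is simply the average of the worst outcomes. The sketch below assumes equally weighted historical returns and uses an invented sample of 100 annual returns.

```python
def cvar(returns, tail=0.05):
    """Average of the worst `tail` fraction of returns (at least one)."""
    k = max(1, int(round(len(returns) * tail)))
    worst = sorted(returns)[:k]
    return sum(worst) / k


# Invented sample: 100 annual returns evenly spaced from -10% to +89%.
annual_returns = [i / 100 for i in range(-10, 90)]
estimate = cvar(annual_returns, tail=0.05)
print(f"CVaR at the 5% level: {estimate:.1%}")
```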

Even so, across all our analyses, Rob and I focused on the most basic of risk forecasts: equally-weighted, historical estimates. Based on a large sample of asset class returns, we answered questions that anyone who builds risk forecasts in the real world must answer, yet are almost always swept under the rug in the existing literature: Which lookback window is most predictive for various forecast horizons, and which data frequency should we use (daily, weekly, monthly)? We evaluated these choices for forecasting volatility, skewness, kurtosis, correlations, (Location 2427)

The key takeaway from all our risk forecasting studies is this: basic parameter choices matter a great deal. A refrain I use in this book is that complexity is not always best. This rule applies to how we decide to spend time and energy to improve risk models. (Location 2437)

There are several analytical tools available to asset allocators to model fat tails, such as historical analysis, blended probability distributions, and scenario analysis. (Location 2454)

However, a strict interpretation of Taleb’s black swan theory seems impossible to implement with models and data, because, in his words (Taleb, 2010), a black swan “lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility.” (Location 2462)

Yet though I’m sure he would disagree with many of the concepts I present in this book, Nassim Taleb makes a valid point: too often fat tails are ignored, even by very smart people who should know better. Fat tails matter because asset allocators must forecast exposure to loss when they construct portfolios. (Location 2496)

He hints that many hedge funds, under the cloak of nontransparency that they enjoy compared with the transparency of mutual funds (many hedge funds operate as a “black box”), may use this type of strategy to juice their track records. (Location 2516)

If the last few minutes have been rough, the probability that the plane will continue to experience turbulence is higher than it would be if the air had been smooth for a while. The advanced techniques we used in our paper account for the fact that asset returns (as well as fundamental and economic data) tend to cluster, like air turbulence. (Location 2608)

Suppose we form a view that there’s a 10% probability of a recession over the next year, and we think the market hasn’t priced in this risk. In that case, we would apply a 10% weight to the risk-off regime, instead of 5% as I did earlier. As a result, we would get a fatter-tailed mixture. (Location 2618)
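
The effect of raising the risk-off weight from 5% to 10% can be illustrated with a two-regime mixture of normal distributions. The regime means and volatilities below are hypothetical, and the normal-mixture form is an assumption made for this sketch.

```python
import math


def normal_cdf(x, mean, vol):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (vol * math.sqrt(2.0))))


def mixture_tail_probability(threshold, risk_off_weight, risk_on, risk_off):
    """P(return < threshold) for a two-regime mixture of normals."""
    w = risk_off_weight
    return ((1.0 - w) * normal_cdf(threshold, *risk_on)
            + w * normal_cdf(threshold, *risk_off))


# Hypothetical regimes: (annual mean, annual volatility).
risk_on = (0.08, 0.10)
risk_off = (-0.20, 0.25)

base = mixture_tail_probability(-0.20, 0.05, risk_on, risk_off)
stressed = mixture_tail_probability(-0.20, 0.10, risk_on, risk_off)
print(f"P(loss worse than 20%): {base:.1%} at 5% weight, "
      f"{stressed:.1%} at 10% weight")
```

Doubling the weight on the risk-off regime shifts probability mass into the left tail, which is what a fatter-tailed mixture means in practice.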

When we shock a risk factor and infer movements in other related risk factors, we should use stress betas. To use a simple example, as of February 28, 2019, based on the last 12 months, the beta between the equity and credit factors was 0.13. Hence, a shock of –10% on the equity factor would translate into a loss of 0.13 × –10% = –1.3% for credit factor exposures. (Location 2772)
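
The arithmetic in this example is a one-line propagation of the shock through the beta. The function name is hypothetical; the beta of 0.13 and the –10% shock are the figures from the text.

```python
def implied_shock(beta, primary_shock):
    """Shock implied for a related factor, given its beta to the shocked one."""
    return beta * primary_shock


credit_shock = implied_shock(beta=0.13, primary_shock=-0.10)
print(f"implied credit factor shock: {credit_shock:.1%}")
```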

This instability in factor exposures has consequences for portfolio construction. It motivates the use of risk factor models (rather than asset class–based models) in portfolio optimization even if we continue to invest across asset classes. (Location 2952)

But for factors such as momentum, we don’t know whether excess returns represent compensation for risk or a behavioral anomaly. Some authors have argued that momentum is caused by investors who extrapolate recent performance. (Location 2981)

The methodology, partly because of its weighting scheme as well as how it uses leverage, “achieves its large, highly significant alpha, by hugely overweighting micro- and nano-cap stocks. . . . These stocks have limited capacity and are expensive to trade.” (Location 3013)

Covered call writing, which we discussed in Chapter 7, is a strategy that delivers another risk premium with sound theoretical foundations and persistence over time: the volatility risk premium. (Location 3017)

Part of the appeal (and the hype) behind risk premiums as building blocks for portfolio construction is their low correlation with each other and with traditional asset classes. (Location 3024)

When the authors combine value and momentum strategies across markets (individual US, UK, European, and Japanese stocks, equity country indexes, currencies, global government bonds, and commodity futures), they obtain a stratospheric, hardly-ever-seen-in-practice Sharpe (return-to-risk) ratio of 1.59. (Location 3029)

Risk premium strategies are too easy to build. It has been estimated that there are at least 300 published factors, with roughly 40 newly discovered factors announced each year. In the real world, it’s not that easy to beat markets on a risk-adjusted basis. In the paper “Will Your Factor Deliver?” (2016), Noah Beck, Jason Hsu, Vitali Kalesnik, and Helge Kostka explain the issue as follows: (Location 3036)

Based on my extensive review of the literature, and my own research, I would argue that private equity is not a free lunch. It has a role to play in some investors’ portfolios, but it’s a smaller role than suggested by a naïve interpretation of the data. (Location 3601)

Many sophisticated investors are also skeptical of private equity’s supposed superiority as an asset class. In the 2006 article “You Can’t Eat IRR,” Howard Marks, cofounder of Oaktree Capital, first explains that the IRR methodology has some advantages. (Location 3702)

However, Marks also points out a major issue with IRRs: they adjust for when the capital was called, not actually deployed, by the private equity manager. (Location 3705)

This study reveals an asymmetry in how public market returns affect private equity valuations: private equity managers tend to mark up their investments when public market returns are positive, while they don’t mark them down as much when public market returns are negative. (Location 3720)

Although we reach similar conclusions, our approach is different in that we pick stocks (i.e., we look for mispricings within ETF constituents), whereas Cahan et al. pick ETFs (they look for mispricing across ETFs, based on stock-level analysis). (Location 3896)

Following are strategic asset allocations for portfolios that broadly incorporate the principles we’ve discussed throughout this book. Tactical asset allocation should also be applied around these strategic weights, to take advantage of relative valuation opportunities based on the approach we discussed in Part One of this book. (Location 4117)

Valuation-based expected return models also point to lower returns over the next 5 to 10 years, across a wide range of equity and fixed income asset classes. (Location 4305)

On risk forecasting, the good news is that volatility seems fairly predictable (at least more than returns). (Location 4313)