I’d been wondering for a while just how long a ‘long term investment’ is. There’s lots of good stuff written on this, but for me there’s no substitute for getting my hands dirty in data. So I dusted off Excel and took a look at the numbers. This analysis is not unique, I’m sure. It’s probably not even entirely accurate. I’m looking forward to hearing about all the ways in which it is flawed, so that I can do v2.

Here are the results (click to get a good look). The method, and a little analysis of the results, are below.

You’re looking at 6 different lines, one for each ‘investment length’. For example, the blue line shows the distribution of returns on a 1 year investment in the S&P500, taken over all the possible 1 year periods that occurred between Jan 1950 and Oct 2008. The orange line shows the distribution of returns on a 30 year investment, which, as you’d expect, is much tighter.

A few things stand out for me:

- Five years is not a ‘long term investment’, unless you’re willing to accept a fair chance (~25%) of performing at (or well under) a savings-account-like return (~4%).
- Even at twenty years, there is a ~20% chance of getting a return between 2% and 4%, which would be a bit disappointing as an investor, IMO.
- Based on the last 58 years of data, it looks like 30 years could be considered ‘long’ (but remember what they say about past performance).

Disclaimer: I’m really not any sort of an expert at this. This is 30 minutes in Excel with data from Yahoo Finance. There are a million nuances that I’m missing.

Of course, this is for investing a lump sum of money at one point in time. Most consumers would actually invest continually over a long period of time, which creates a dollar-cost-averaging effect. Maybe I’ll look at that next.

What does this chart say to you? Let me know in the comments.

Methodology

I calculated the annualized return by dividing the closing price of the later month (‘month B’) by the closing price of the earlier month (‘month A’). I then raised that number to the power of 1 over the number of years between month A and month B, and subtracted 1. I then measured the frequency of each return by counting the number of occurrences in buckets 2% wide (i.e. count x where: 2%<x<=4%), and dividing by the total number of periods of that length in the test (shown as ‘n’ in the legend). Finally, I plotted the mid-point of each bucket (i.e. 3%) on the horizontal axis, and the corresponding frequency on the vertical axis.
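For anyone who would rather not rebuild this in Excel, here is a minimal Python sketch of the steps above. It is not the spreadsheet itself: the function names are made up for illustration, and it assumes the input is a plain list of monthly closing prices in date order with no gaps.

```python
import math
from collections import Counter


def annualized_return(start_price, end_price, years):
    """(end / start) ^ (1 / years) - 1, as described in the methodology."""
    return (end_price / start_price) ** (1.0 / years) - 1.0


def return_distribution(monthly_closes, years, bucket_width=0.02):
    """Frequency of annualized returns over every possible period of `years`.

    Assumes `monthly_closes` is ordered oldest to newest, one price per month.
    Returns ({bucket mid-point: frequency}, n), where n is the number of periods.
    """
    months = years * 12
    returns = [
        annualized_return(monthly_closes[i], monthly_closes[i + months], years)
        for i in range(len(monthly_closes) - months)
    ]
    n = len(returns)
    # Bucket k covers ((k - 1) * width, k * width], e.g. 2% < x <= 4% for k = 2.
    counts = Counter(math.ceil(r / bucket_width) for r in returns)
    distribution = {
        (k - 0.5) * bucket_width: count / n for k, count in sorted(counts.items())
    }
    return distribution, n


# Synthetic prices growing at a steady 3% a year, purely to exercise the code.
synthetic_closes = [100.0 * 1.03 ** (m / 12) for m in range(37)]
distribution, n = return_distribution(synthetic_closes, years=1)
for midpoint, frequency in distribution.items():
    print(f"{midpoint:.0%}: {frequency:.0%} of {n} periods")
```

Swapping the synthetic list for real monthly closes and looping over the different investment lengths should give one curve per length, though I have not checked this against the chart.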

I’m looking forward to hearing your thoughts, and suggestions for v2.

tom

October 25, 2008 at 11:38 am |

You’re missing the effect of dividends for one thing.

October 25, 2008 at 12:40 pm |

Ideally, you ought to convert prices from nominal to real to remove any “artificial” returns that are just products of inflation. This will depress the numbers.

October 25, 2008 at 2:29 pm |

Please do this same analysis starting with 1929, or earlier. With all the tv “experts” saying it is a great time to buy, I would like to see what my return is over the long run, given that we are probably only 1/2 way down this stock market slide. Given that the US investment banks do not have to determine the market price of their assets for two years (instead of the usual one), I figure there are still a lot of losses hidden on paper.

So, is it really a good time to buy if the stock market has another 50% to lose? How many decades will it take to make that money back? Will I be dead by then? 🙂

October 25, 2008 at 2:52 pm |

Did you include dividends?

October 25, 2008 at 4:25 pm |

Alas, this does not include dividends. That will be in the next version (by analyzing Total Shareholder Return), assuming I can find enough data to do it.

Including dividends will shift the curve to the right, but what will it do to the variance? I’m pondering that now.

October 25, 2008 at 5:23 pm |

Could you make your spreadsheet available with this post? Then people could play around with it.

October 26, 2008 at 1:05 am |

Really interesting analysis!

Things I take away:

1) Raw eyeballing says you’re likely to get a 6-8% return, which is a lot less than the 10% that is quoted as the long term average. As other people have said, dividends change this (hopefully for the better)

2) 10yr gives you the best chance to get >10% returns. Interesting.

I’d like to see dividends and cost averaging tried out.

October 26, 2008 at 2:35 am |

Good stuff!

Were you trying to take the Geometric Mean for the Returns? If so, wouldn’t you divide (month B closing price-month A closing price) by month A closing price for each month instead of just

???I would have probably used for returns each month and then taken the Mean for each year to get an average return.

Very interesting stuff though!

http://exponentialsmoothing.wordpress.com/

October 26, 2008 at 2:37 am |

opps,

what i meant to say was:

Good stuff!

Were you trying to take the Geometric Mean for the Returns? If so, wouldn’t you divide (month B closing price-month A closing price) by month A closing price ((B-A)/A) for each month instead of just (B/A)???

I would have probably used ((B-A)/A) for returns each month and then taken the Mean for each year to get an average return.

Very interesting stuff though!

http://exponentialsmoothing.wordpress.com/

October 26, 2008 at 4:41 am |

Thanks exposmooth !

To answer your question in one sentence: ‘yes, and I think I did’

To answer it in detail:

I am trying to get at a number that represents the annualized return (let’s call it R) between two points in time. Let’s call the value of the index at the beginning of that period ‘A’ and the value after ‘n’ years ‘Z’. Then R represents a return which, if we got it every year and compounded it, would grow our investment from value A to be worth Z after ‘n’ years. Given that, I am not concerned with the value of the investment at any point between A and Z (for this analysis), so the definition of R cannot be affected by any intervening values.

My original definition for the calculation of R was:

R = ((Z/A)^(1/n))-1

You asked whether I was taking the geometric mean of the returnS.

There is only one return in my calculation process, as I am only concerned with two points in time, which yields only one return.

However, we can still get at ‘R’ through a geometric mean, if we take it over the yearly growth factors rather than over ‘A’ and ‘Z’ themselves. Call the index value at the end of year i ‘V_i’, so that V_0 = A and V_n = Z, and call the growth factor for year i ‘g_i’, where g_i = V_i/V_(i-1). Then:

R = Geo(g_1, ..., g_n)-1

(where Geo() is the geometric mean function)

i.e. R is the geometric mean of the n yearly growth factors, minus 1.

to expand:

R = ((g_1.g_2. ... .g_n)^(1/n))-1

Each intermediate value appears once on the top and once on the bottom of that product, so everything cancels except the two end points:

g_1.g_2. ... .g_n = (V_1/A).(V_2/V_1). ... .(Z/V_(n-1)) = Z/A

so

R = ((Z/A)^(1/n))-1

Which gets us back to the original definition of R.

So in long, and in short, ‘yes’, this could be thought of as a measure of geometric mean 🙂
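Here is a quick numerical check of that, as a minimal Python sketch. The year-end index values are made up for illustration and the helper names are hypothetical. It compares R computed from the two end points with the geometric mean of the yearly growth factors, and also shows the plain average of the yearly (B-A)/A returns for contrast.

```python
import math


def annualized_return(start_value, end_value, years):
    """R = ((Z / A) ^ (1 / n)) - 1, using only the two end points."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up year-end index values: A = 100, Z = 150, n = 4 years.
year_end_values = [100.0, 120.0, 90.0, 135.0, 150.0]

# Yearly growth factors (this year's value / last year's value).
growth_factors = [
    later / earlier
    for earlier, later in zip(year_end_values, year_end_values[1:])
]
years = len(growth_factors)

r_from_endpoints = annualized_return(year_end_values[0], year_end_values[-1], years)
r_from_geo_mean = geometric_mean(growth_factors) - 1.0

# Plain (arithmetic) average of the yearly simple returns, for comparison.
r_arithmetic = sum(g - 1.0 for g in growth_factors) / years

print(f"R from end points:          {r_from_endpoints:.4%}")
print(f"R from geometric mean:      {r_from_geo_mean:.4%}")
print(f"Average of yearly returns:  {r_arithmetic:.4%}")
```

The first two figures should agree to within rounding, because the intermediate values cancel out of the product. The plain average of the yearly returns comes out higher whenever the yearly returns differ, so it would overstate the compounded growth from A to Z.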

October 26, 2008 at 2:56 pm |

my answer would be long enough until our investments make profits.

October 27, 2008 at 12:32 am |

very interesting! others seem to have mentioned divs and inflation – how about a look at the returns available from a random walk strategy?