This topic has been nagging at me for more than a month – and over the last few days, I’ve managed to take some time away from my research for Dr Wealth Insiders (fun stuff – learn more about how you can become an Insider here!) to pen my thoughts on it.
Finance academics and professionals are constantly trying to come up with a “magic” score or metric that will consistently keep delivering superior results… just like the One Ring to rule them all…
Here is a (non-comprehensive) list of some of the more common scores you might have come across:
- Piotroski F-score
- Beneish M-score
- Altman Z-score
- Greenblatt’s Magic Formula
- Zacks Style Score
- Morningstar Star Ratings
- Your own ratings based on your own set of criteria
I don’t have any issues with them per se – what I DO have issues with is how some investors, writers and even some professionals (mis)use them.
Below are three things I have gripes with:
1. They Are Just Based On ONE Year Of Performance
Take for instance the beloved Piotroski F-score. For readers who don’t know what the F-score is, it is a scoring system based on the following 9 criteria that evaluates whether a stock is financially strong. A company that scores 8 or 9 is considered a strong stock, while a score below 2 indicates a weak company.
Piotroski F-Score Criteria
- Return on Assets (1 point if it is positive in the current year, 0 otherwise);
- Operating Cash Flow (1 point if it is positive in the current year, 0 otherwise);
- Change in Return on Assets (ROA) (1 point if ROA is higher in the current year compared to the previous one, 0 otherwise);
- Accruals (1 point if Operating Cash Flow/Total Assets is higher than ROA in the current year, 0 otherwise);
- Change in Leverage (long-term) ratio (1 point if the ratio is lower this year compared to the previous one, 0 otherwise);
- Change in Current ratio (1 point if it is higher in the current year compared to the previous one, 0 otherwise);
- Change in the number of shares (1 point if no new shares were issued during the last year, 0 otherwise);
- Change in Gross Margin (1 point if it is higher in the current year compared to the previous one, 0 otherwise);
- Change in Asset Turnover ratio (1 point if it is higher in the current year compared to the previous one, 0 otherwise);
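To make the mechanics concrete, here is a minimal Python sketch of the nine checks above. The field names and the two-dictionary layout are my own assumptions for illustration, not from Piotroski’s paper:

```python
# Minimal sketch of the Piotroski F-score using two years of
# (hypothetical) financial data. Field names are illustrative.

def f_score(cur, prev):
    """cur/prev: dicts of one year's financials each. Returns 0-9."""
    points = 0
    roa_cur = cur["net_income"] / cur["total_assets"]
    roa_prev = prev["net_income"] / prev["total_assets"]
    points += roa_cur > 0                                          # 1. positive ROA
    points += cur["op_cash_flow"] > 0                              # 2. positive CFO
    points += roa_cur > roa_prev                                   # 3. improving ROA
    points += cur["op_cash_flow"] / cur["total_assets"] > roa_cur  # 4. accruals
    points += cur["lt_debt_ratio"] < prev["lt_debt_ratio"]         # 5. lower leverage
    points += cur["current_ratio"] > prev["current_ratio"]         # 6. higher liquidity
    points += cur["shares_out"] <= prev["shares_out"]              # 7. no dilution
    points += cur["gross_margin"] > prev["gross_margin"]           # 8. higher margin
    points += (cur["revenue"] / cur["total_assets"]
               > prev["revenue"] / prev["total_assets"])           # 9. asset turnover
    return int(points)
```

Notice that every single input comes from just the current and prior year’s statements – which is exactly the issue with it.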
If you look at how the points are calculated, all of them rely on a single year of financial results (or, at most, a single year-over-year change).
This can be dangerous because one or two years of results may not be representative of the long-run financial health of the business.
For instance, a firm’s current ratio could be higher than it normally would be simply because the company sold off equipment or a piece of land the previous year to downsize in the face of weak demand or internal problems – and yet it would still score a ‘1’ on that criterion.
Or, the business is a cyclical one and scoring highly in the past year doesn’t mean it will continue to do so in future years.
Or worse, the company could have artificially inflated its numbers, changed accounting policies (e.g. switching from FIFO to LIFO, or adjusting assets’ useful lives), or conducted buybacks to make metrics like ROA (Return on Assets), Operating Cash Flow and Number of Shares look good for that particular year.
All these nitty-gritty details that supposedly aren’t important can turn into value/growth/dividend/[insert label] traps that keep you stuck with a cash-haemorrhaging business for a long time.
2. They Screen Out Many Other Potentially Good Investment Ideas
Using a rigid scoring system may keep some bad investment ideas at bay – but it also keeps a lot of the other potentially good ones out of sight and out of mind.
For instance, if you’re using your own metrics, you might consider only companies with a P/E (price-to-earnings) ratio below a certain cut-off, let’s say 15.
You might get a lot of undervalued companies, but you would also have missed out on a lot of great businesses that have grown strongly – like Visa, Google or Alibaba.
Great businesses usually trade at a premium – and chances are, that “undervalued pick” you got might just be another “undervalued trap”!
Moreover, the metrics commonly used by investors often cannot (and should not) be interpreted so straightforwardly as “good” or “bad”.
For example, an investor might only look for companies that generated positive FCF (free cash flow) in that particular year, leaving out the ones that didn’t. This excludes a host of potentially great businesses that generated negative FCF that year because they acquired a new building or subsidiary and used their cash to fund the project.
By the time the project is completed and turns cashflow-accretive on the financial statements, the market might already have priced it in and the stock gone up another 50%…
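Sketched as a toy screen (all names and figures made up), here is how a single-year P/E-plus-FCF filter behaves – it drops the company that just made a large one-off investment along with the genuinely weak ones:

```python
# Sketch of a naive single-year screen: P/E below 15 AND positive FCF.
# All company names and figures are invented for illustration.

stocks = {
    "SlowCo":   {"pe": 9,  "fcf": 20},    # cheap, cash-generative
    "GrowthCo": {"pe": 35, "fcf": 150},   # great business at a premium
    "BuildCo":  {"pe": 12, "fcf": -80},   # negative FCF this year only:
                                          # it just bought a new building
}

passed = [name for name, s in stocks.items()
          if s["pe"] < 15 and s["fcf"] > 0]
print(passed)  # only SlowCo survives the screen
```

Both GrowthCo and BuildCo are screened out, even though one is a compounder trading at a premium and the other merely spent a year investing in itself.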
On a similar note, using predefined scoring systems like the Altman Z or Piotroski F scores does not mean you will do superbly well compared to the market or your other investor peers.
Most investors are aware of these scores and many (I repeat – MANY) are using the SAME scoring systems to generate investment ideas.
This means if you come across a supposedly great “undervalued” stock – chances are, a hundred or a thousand other investors have come across it too with the same screens.
Who’s to say that those investors have not already bid up the price to fair values (or even more)?
3. We Assume The Scores Are Fixed and Applicable on Every Stock
Now this one might seem a bit hard to understand – but every criterion in your scoring system has a “weight” (whether you are aware of it or not).
In the Piotroski F-score, the weight for each criterion is ‘1’.
In your own scoring system, the weights are also ‘1’ by default.
In the Altman Z-score and Beneish M-score, the weights differ for each criterion.
By using the scores as-is, we are assuming that these weights and criteria are fixed throughout our analysis and throughout time (if you are looking at these scores every quarter or every year).
That is highly dangerous – and a risk that you might not even know you are taking on.
Markets are not static – and investors put value on different things at different times. (In technical-speak, risk premiums are constantly being eroded and new ones are being discovered all the time.)
So why should we subject our analysis to static assumptions in our weights and our scoring systems?
For instance, if we look at the F-score once again, the Accruals criterion carries a weight of ‘1’. The corresponding accruals term in the Beneish M-score carries a weight of 4.679.
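For reference, these are the published Beneish coefficients (worth double-checking against the original paper before relying on them); the point here is simply how unequal the weights are compared to the F-score’s flat ‘1’ per criterion:

```python
# Beneish M-score: eight inputs, each with a very different weight.
# Note TATA (total accruals / total assets) carries 4.679, versus the
# flat weight of 1 given to every Piotroski criterion.

M_WEIGHTS = {
    "DSRI": 0.920, "GMI": 0.528, "AQI": 0.404, "SGI": 0.892,
    "DEPI": 0.115, "SGAI": -0.172, "TATA": 4.679, "LVGI": -0.327,
}

def m_score(ratios):
    """ratios: dict keyed like M_WEIGHTS. Higher score = more fraud-like."""
    return -4.84 + sum(M_WEIGHTS[k] * ratios[k] for k in M_WEIGHTS)
```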
In recent years, though, it has apparently been shown that the “hidden value” that has been ascribed to a company using fewer accruals has been eroded away.
Would this render that criteria useless? Maybe – if more empirical research is done on it.
If so, the scoring would move to an 8-point system rather than a 9-point one – and it would render Piotroski’s original findings inconclusive for today’s markets.
Let’s now turn our attention to the Altman Z-score for a minute.
The score tests the likelihood of bankruptcy for a company. A score below 1.8 indicates a high risk of bankruptcy while companies with scores above 3 are unlikely to go bankrupt.
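Those cut-offs come from a weighted formula. As a sketch, here is the original (1968, public manufacturing firms) version of the Z-score, with hypothetical input ratios:

```python
# Sketch of the original Altman Z-score (1968 version for public
# manufacturing firms). Inputs are the five financial ratios.

def altman_z(wc_ta, re_ta, ebit_ta, mve_tl, sales_ta):
    """Returns Z; below ~1.8 = distress zone, above ~3.0 = safe zone."""
    return (1.2 * wc_ta        # working capital / total assets
            + 1.4 * re_ta      # retained earnings / total assets
            + 3.3 * ebit_ta    # EBIT / total assets
            + 0.6 * mve_tl     # market value of equity / total liabilities
            + 1.0 * sales_ta)  # sales / total assets

z = altman_z(0.2, 0.3, 0.15, 1.5, 1.1)  # hypothetical ratios, ~3.16: "safe"
```

Note how hard-coded everything is: five fixed weights plus two fixed cut-offs – which is precisely what Altman says has drifted over time.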
The creator of the score, Edward Altman, had recently mentioned that Z-scores of American businesses have been trending lower and lower – and the original cut-offs are getting increasingly irrelevant.
“Over time, I began to observe that the average Z-score of American companies mainly, but even global companies, began to get lower and lower.
[The bond market] became more available for both investment grade and non-investment grade companies and companies periodically took advantage of low interest rates to raise their leverage. As a result, the financial risk of companies began to increase. Also with global competition, companies’ profitability began to diminish.
And so the average Z-score became lower and lower, which meant that more firms would have been classified as likely bankrupt using the Z model if we kept the original cutoff scores. In order to modernize the model, we needed bond-rating equivalence of the scores, which changes constantly and adds on an updated nature to the interpretations of the scores.”
– Edward Altman, Source: CFA Institute Blog
What and how to update these scoring systems is beyond the scope of this article. The key takeaway is that these scores are not written in stone.
Another misuse of these scores is that some investors blindly apply them to any and every stock universe.
F-score for the Singapore market. Z-score for the Hong Kong market. Hey, why not layer on the Magic Formula to strengthen the filtering, and a Beneish M-score to scrub those frauds away? The more the better, right?
Well… not quite.
The thing is – each scoring system has its limitations. It has been my experience that adding more “things” to the mix to try to account for those limitations does not make the whole analysis better – it just morphs into a new (untested) thing that cannot be taken to replicate the original results of the individual scoring systems.
Essentially – the sum of the parts is starkly different from the usefulness of each score individually.
The statistics back me up: in regression analysis, the more variables you add, the more variance you introduce into your estimates – extra predictors can make the in-sample fit look better while the out-of-sample results only get noisier.
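A quick simulation (pure NumPy, with an arbitrary seed of my choosing) illustrates the point: extra predictors that are pure random noise can only improve a regression’s in-sample fit, even though they carry no information at all.

```python
import numpy as np

# Adding predictors to an OLS regression never hurts IN-SAMPLE fit, even
# when the new predictors are pure noise -- which is how stacking more
# scores/criteria can look better while adding nothing but variance.

rng = np.random.default_rng(0)
n = 60
x_true = rng.normal(size=(n, 1))
y = 2.0 * x_true[:, 0] + rng.normal(size=n)      # one real signal + noise

def r_squared(X, y):
    X1 = np.column_stack([np.ones(len(y)), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

r2_small = r_squared(x_true, y)
junk = rng.normal(size=(n, 20))                  # 20 pure-noise predictors
r2_big = r_squared(np.column_stack([x_true, junk]), y)
print(r2_small, r2_big)  # r2_big >= r2_small: guaranteed for nested OLS
```

The in-sample improvement is guaranteed mathematically for nested models – the junk variables “explain” nothing real, yet the fit statistics reward you for adding them.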
Meanwhile, other investors are also misusing these scores individually.
For instance, Joseph Piotroski did NOT intend for the score to be used on just any basket of stocks. His methodology first restricted the universe to the cheapest 20% of firms by P/B (price-to-book) – the highest book-to-market quintile – BEFORE even applying the 9 criteria.
Piotroski found that the 9 criteria can “eliminate firms with poor future prospects” and significantly improve returns from that low P/B basket. Moreover, the effect is “concentrated in small and medium-sized firms, companies with low share turnover, and firms with no analyst following”.
Finally – and this is often overlooked – his research design was based on US historical data only.
Hence, extrapolating these conclusions to the ENTIRE stock market in Singapore or Hong Kong would be a statistical faux pas. (PS: I have not had the chance to find out if the F-score research had been replicated for Singapore or HK, but even then, the score cannot be applied to the entire stock market!)
Even for the Altman Z-score, Edward Altman notes:
I would say the vast majority of people are misusing the Z-score because they are applying it across the board regardless of the sector, the industry. And what we found over the years is that non-manufacturers, especially in certain industries like services or retail, have on average higher Z-scores than manufacturing companies.
– Edward Altman, Source: CFA Institute Blog
These little nuances make a difference. A little thought goes a very long way.
After all my ranting about misuse and nit-picking on tiny details…
Investors could do much better for their portfolios if they would just exercise care and caution when using these prescribed tools (some of them are “FREE”) and understand why they are scoring those particular criteria.
It’s easy to be lazy and blindly use these scores / checklists / points / ratings… but I see these tools as just a preliminary screen – NOT an investing method or “strategy”, as some articles like to call it.
For many investors, it can be a great starting point to find wonderful stocks.
But ultimately, you still need to dig deeper into the business and the numbers – learn to think independently and trust your own judgement.