We are bombarded with information every day and don’t have enough capacity to process and analyse it all.
One way we try to simplify things is to look at the numbers.
For example, we look at figures and statistics over time – the performance of markets, changes in interest rates, the purchasing managers' index and year-on-year comparisons.
Numbers are easier to process, chart and analyse, so we focus on them – but are they telling us the full story?
Are we missing out on what the associated text is saying?
Numbers rarely exist in isolation. They are often accompanied by analysis and commentary in the form of text.
Take annual reports, for example.
New investors treat a company’s annual report as an accurate and faithful account of its performance.
Seasoned investors know that an annual report is only a starting point.
It says what the company officers want to say.
The real messages are buried in the text and the numbers have been “managed” to meet expectations.
Is it possible to automate text processing?
This paper by BangRae Lee, Jun-Hwan Park, Leenam Kwon, Young-Ho Moon, YoungHo Shin, GyuSeok Kim, and Han-joon Kim analyses the relationship between business text patterns and financial performance in corporate data.
Specifically, they use the annual reports that US-listed companies file in 10-K format, which cover financial performance, the state of the business, competitiveness and the risks the companies face in their industry.
These reports talk about the past. What can text analysis tell us about the future?
Text mining is a way to process and extract insights from text
Text mining techniques process text and analyse it using descriptive statistics, clustering and sentiment analysis.
For example, the length of text in company annual reports can be expressed in terms of the number of sentences, the number of words and the number of words per sentence.
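To make that concrete, here is a minimal sketch of those three length measures. The function name `length_stats` and the sample text are made up for illustration, and the sentence splitting is deliberately naive (it breaks on full stops, question marks and exclamation marks), so treat it as a rough starting point rather than the method used in the paper.

```python
import re


def length_stats(text: str) -> dict:
    """Count sentences, words and words per sentence in a piece of text."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # A word is any run of letters, digits or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    n_sentences = len(sentences)
    n_words = len(words)
    return {
        "sentences": n_sentences,
        "words": n_words,
        # Guard against empty input to avoid dividing by zero.
        "words_per_sentence": n_words / n_sentences if n_sentences else 0.0,
    }


# Invented sample text, not taken from any real annual report.
sample = "Revenue grew this year. We launched two new products. Users responded well."
stats = length_stats(sample)
print(stats)
```

Real 10-K filings contain abbreviations, decimals and tables that would trip up a split this simple, so a proper sentence tokenizer would be the next step.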
Clustering involves grouping companies that have similar statistics and then comparing their performance.
For example, we could take the average compound annual growth rate (CAGR) of one group and compare it with that of another group of companies.
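A rough sketch of that comparison is below. CAGR is the standard formula, (end value / start value) raised to the power 1 / years, minus 1. Everything else is an assumption for illustration: the four companies and their figures are invented, and splitting them at the median words per sentence is a simple stand-in for a real clustering algorithm, not the approach taken in the paper.

```python
from statistics import mean, median


def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Invented data: (name, words per sentence, sales at start, sales at end).
companies = [
    ("A", 28.0, 100.0, 121.0),
    ("B", 31.0, 200.0, 242.0),
    ("C", 19.0, 150.0, 150.0),
    ("D", 21.0, 80.0, 88.2),
]
YEARS = 2

# Stand-in for clustering: split companies at the median sentence length.
cutoff = median(c[1] for c in companies)
long_writers = [cagr(s, e, YEARS) for _, wps, s, e in companies if wps > cutoff]
short_writers = [cagr(s, e, YEARS) for _, wps, s, e in companies if wps <= cutoff]

avg_long = mean(long_writers)
avg_short = mean(short_writers)
print(f"Longer sentences:  average CAGR {avg_long:.1%}")
print(f"Shorter sentences: average CAGR {avg_short:.1%}")
```

With more companies and more text statistics, the median split could be swapped for a proper clustering method; the comparison of group averages would stay the same.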
Finally, sentiment analysis looks at how positive, negative or neutral the text is – a way of measuring the subjective content and tone of text.
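One simple way to measure tone is to count words from a positive list and a negative list. The sketch below does that. The two word lists and the `tone_score` function are hypothetical and far too small for real use, and the paper may well use a different sentiment method, so this only shows the general idea.

```python
import re

# Tiny, made-up word lists for illustration only.
POSITIVE = {"growth", "strong", "improved", "gain", "success"}
NEGATIVE = {"loss", "decline", "weak", "risk", "adverse"}


def tone_score(text: str) -> float:
    """Return a score from -1 (all negative) to +1 (all positive).

    Text with no words from either list scores 0, i.e. neutral.
    """
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(w in POSITIVE for w in words)
    negative_hits = sum(w in NEGATIVE for w in words)
    total = positive_hits + negative_hits
    return (positive_hits - negative_hits) / total if total else 0.0


# Invented sentences, not quotes from any report.
upbeat = tone_score("Strong growth offset a small loss.")
neutral = tone_score("We sell software to businesses.")
print(upbeat, neutral)
```

Word counting like this misses negation ("no loss") and context, which is one reason tone scores need to be treated with caution.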
Does it work?
It’s still early days for this kind of technology, but the paper points out some interesting things.
Companies with good performance talk about products, services, users and business, while those with poor performance talk about the government, contracts, results and the future.
Companies that do well may also write more – longer sentences and more words about how they are doing.
Finally, and this is an interesting result, the tone of the text has no relationship with sales performance.
The takeaway: don’t get sucked in if the company officers predict good times ahead, or if they are pessimistic about things.
That says more about them than the company.
It’s possible that text mining techniques will help us make better forecasts as we continue to use and refine them.