How to analyse a dataset


When we say we analyse data, what do we actually mean?

For many, it means looking at rows and columns in a spreadsheet, with a sense of quiet desperation.

So, where should we start?

The American Institute of Certified Public Accountants (AICPA) defines data analytics (in the context of audits) as “The science and art of discovering and analyzing patterns, identifying anomalies and extracting other useful information in data underlying or related to the subject matter of an audit through analysis, modeling and visualization for the purpose of planning or performing the audit.”

That’s probably as good a place as any.

The first thing we try to do when looking at a dataset is figure out where to look more closely.

We’re trying to take a journey from data to insight – and that involves a few steps.

Take patterns, for example. Patterns tell us that something happens regularly – and so help us predict the future.

For example, a heatmap of energy use in a building that consistently shows hot spots of high energy usage outside normal working hours is a pattern worth looking at more closely.
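As a rough sketch of the numbers behind such a heatmap, we could average energy use for each weekday and hour, then list the out-of-hours cells that sit above some level. Everything here is an assumption made for illustration: the function names, the (timestamp, kWh) reading format, the 8am to 6pm working day and the threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_grid(readings):
    """Average kWh per (weekday, hour) cell from (datetime, kWh) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, kwh in readings:
        cell = (timestamp.weekday(), timestamp.hour)  # Monday is weekday 0
        totals[cell] += kwh
        counts[cell] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


def out_of_hours_hot_spots(grid, threshold, start_hour=8, end_hour=18):
    """Cells outside the assumed working day whose average exceeds threshold."""
    return sorted(
        cell
        for cell, average in grid.items()
        if (cell[1] < start_hour or cell[1] >= end_hour) and average > threshold
    )


# Made-up readings for two Mondays, purely for illustration.
readings = [
    (datetime(2024, 1, 1, 2), 40.0),
    (datetime(2024, 1, 8, 2), 44.0),
    (datetime(2024, 1, 1, 10), 50.0),
    (datetime(2024, 1, 1, 22), 5.0),
]
grid = hourly_grid(readings)
hot_spots = out_of_hours_hot_spots(grid, threshold=20.0)
```

With these made-up readings, the only cell flagged is Monday at 2am. The 10am reading is higher but falls inside the assumed working day.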

The human brain is wired to detect patterns – but sometimes we can fool ourselves into thinking a pattern exists and then convince ourselves by selecting only evidence that confirms our belief.

That’s where computers come in. Visualising the data and running a correlation analysis gives us a check on whether the pattern we think we see is really there.
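To make the correlation check concrete, here is a minimal sketch of the Pearson correlation coefficient using only the Python standard library. The function name and the sample figures are made up for illustration. A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests none, but it says nothing about cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    if sq_dev_x == 0 or sq_dev_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return co_dev / sqrt(sq_dev_x * sq_dev_y)


# Hypothetical figures: hours a building is occupied vs. energy used (kWh).
hours_occupied = [1, 2, 3, 4, 5]
energy_kwh = [2, 4, 6, 8, 10]
r = pearson(hours_occupied, energy_kwh)
```

Here the second series is exactly double the first, so the coefficient comes out at 1. Real data will be messier, which is why it helps to plot it as well.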

Once we have an expected pattern, then we have something to compare against when looking at new data values in the future.

Then there are deviations.

Most things – and their associated data points – fluctuate.

They go up and down – sometimes up for a while and sometimes the other way.

Our task is to figure out which movements are significant.

We can do that with methods like control charts, where we work out where we expect values to be most of the time and call out the ones that fall outside the bands we have set.
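A simplified version of the control chart idea can be sketched as follows: take a baseline period, set bands at the mean plus or minus three standard deviations, and flag new values that land outside. The function names and numbers are hypothetical, and the three-sigma band is just a common convention. Proper control charts often estimate the spread differently, for example from moving ranges.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Lower and upper bands: baseline mean +/- n_sigma sample standard deviations."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - n_sigma * spread, centre + n_sigma * spread


def out_of_control(values, lower, upper):
    """(index, value) pairs for values that fall outside the bands."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Made-up baseline readings (mean 100, sample standard deviation 2).
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
lower, upper = control_limits(baseline)  # bands at 94 and 106

new_readings = [101, 99, 115, 100, 84]
flagged = out_of_control(new_readings, lower, upper)  # [(2, 115), (4, 84)]
```

The readings of 115 and 84 are called out. The others are ordinary fluctuation within the bands.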

Finally, there are outliers.

These are the data points that are simply different from everything else.

They could just be wrong, or they could point to a major problem.

For example, an outlier could be a sign of fraud, or of an equipment breakdown.
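One common way to pick out such points is the interquartile range rule: anything more than 1.5 times the interquartile range below the first quartile or above the third is treated as a candidate outlier. The sketch below is an illustration rather than the only method. The function name and the sample values are made up, and the 1.5 multiplier is a convention, not a law.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR outside the first or third quartile."""
    q1, _median, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical invoice amounts; one is clearly unlike the rest.
amounts = [12, 14, 13, 15, 14, 13, 12, 95, 14, 13]
suspects = iqr_outliers(amounts)  # [95]
```

The rule only tells us which points to look at. Whether 95 is a typing error or something more serious still needs a person to check.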

A thematic review of audit quality by the UK’s Financial Reporting Council (FRC) finds that a lot of firms talk about their use of data analytics in auditing financial statements.

Not many, however, have the in-house capability to get and process data in this way.

It seems to make more sense to have a centralised team that can provide this kind of specialised IT capability, from extracting data from other systems and getting it into the right format to carrying out the necessary analysis.

One point to note is that some companies offshore this kind of data capture and analysis – which may become an issue as more governments create controls over data governance, security and privacy.

On a practical basis, however, if we have tools and processes that can analyse patterns, deviations and anomalies or outliers, we’re off to a good start.
