How today’s data explosion and AI are multiplying business risks

There is an old adage: “If you don’t look, you won’t see.” It is common sense, of course, but seeing requires both looking and knowing where to look. The data companies use to do business comes from many sources, both inside and outside the organization, and much of it is created by the companies themselves. Yet many companies don’t know where all of their data is, where it came from, how it is being used, its quality, whether they are using it legally, or whether it is efficient to access and use.

A recent survey showed that it is not uncommon for businesses to rely on several hundred separate data sources, and few businesses know all of their data sources, much less manage all of them properly; it is simply difficult to keep up. The problem is only growing, magnified by the huge growth in the amount of data businesses use, and that growth shows no sign of slowing.

Big Data

The term Big Data was coined in the 1990s, but what the term means has changed drastically as the amount of data being produced in the world has grown by many orders of magnitude. (In fact, the huge surge in the amount of data being generated and used today due to new AI and ML technologies is making many people think another term, such as ‘Huge Data’, may be more appropriate in today’s world.)

I like Intel Corp.’s summary of the term ‘Big Data’:
‘Big data is a concept that describes the flood of unstructured data created by everything from social media posts and network traffic to the Internet of Things (IoT), public safety cameras, and global weather data. Unlike small data—which can be structured, stored, and analyzed in a relational database—big data exceeds the capacity of tables, rows, and columns in complexity and processing. Small data and big data lie on a spectrum. You know you’ve entered the big data realm when you see extreme data volume, velocity, and variety.’

Big Data Volume

As you might have guessed, big data is big. (Huge, in fact!). Big data sets easily exceed a petabyte (1,000 terabytes) and can reach into the exabytes (1,000 petabytes). Data sets this large are beyond human comprehension and traditional computing capacity. Making sense of big data—identifying meaningful patterns, extracting insights, and putting it all to work—requires machine learning, AI, and serious computing power.
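To put those scales in perspective, here is a minimal sketch of the unit magnitudes involved, assuming decimal (SI) units where each step is a factor of 1,000:

```python
# Decimal (SI) data-size units, each 1,000x the previous.
UNITS = {
    "terabyte (TB)": 10**12,
    "petabyte (PB)": 10**15,
    "exabyte (EB)": 10**18,
    "zettabyte (ZB)": 10**21,
}

def how_many(larger: str, smaller: str) -> int:
    """How many of `smaller` fit in one `larger` unit."""
    return UNITS[larger] // UNITS[smaller]

print(how_many("petabyte (PB)", "terabyte (TB)"))    # 1,000 TB in a PB
print(how_many("zettabyte (ZB)", "terabyte (TB)"))   # a billion TB in a ZB
```

So a single zettabyte is a billion terabyte drives’ worth of data, which is why traditional tooling gives way to distributed computing at this scale.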

Big Data Velocity

Big data doesn’t arrive in a daily expense report or a month’s worth of transaction data. Big data comes in real time, in extremely high volumes. An example: Google receives, on average, over 40,000 search queries per second, analyzes them, answers them, and serves up analytics-driven advertising for each and every one. That’s big data velocity.
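A quick back-of-the-envelope calculation shows what that velocity adds up to over a single day (the 40,000 queries-per-second figure is the one cited above; the rest is simple arithmetic):

```python
# Back-of-the-envelope: what a sustained 40,000 queries/second amounts to per day.
queries_per_second = 40_000        # average figure cited above
seconds_per_day = 24 * 60 * 60     # 86,400

queries_per_day = queries_per_second * seconds_per_day
print(f"{queries_per_day:,} queries per day")  # 3,456,000,000
```

That is roughly 3.5 billion events per day from one service alone, each needing to be ingested, analyzed, and answered in real time.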

Big Data Variety

On top of coming in petabytes per second, big data comes in every conceivable data type, format, and form. Big data includes pictures, video, audio, and text. Big data can be structured, like census data, or completely unstructured, like pictures from social posts.

Big data could come from video posts, the sensors in a factory, or all the cell phones using a specific app.

Big Data Phase 3.0

Although web-based unstructured content remains the main focus for many organizations in data analysis, data analytics, and big data, new possibilities for retrieving valuable information are emerging from mobile devices. At the same time, the rise of sensor-based, internet-enabled devices is accelerating data generation like never before. Famously coined the ‘Internet of Things’ (IoT), millions of TVs, thermostats, wearables, and even refrigerators are now generating enormous volumes of data every day. And the race to extract meaningful and valuable information from these new data sources has only just begun.

The rise of ‘Edge Computing’

Major advancements in network speeds, compute power, and ML have driven the rise of AI technologies and tools. You can’t turn on the news or talk with someone today without the topic of AI and how it is changing our world coming up. As a result, ‘Edge Computing’ is set to reshape how data is managed and processed for critical sectors of the economy. Edge computing, which refers to computing done near the source of data collection rather than in the cloud or a centralized data center, represents the next frontier for big data.

AI models’ demand for data is outpacing supply

The amount of available data has exploded. In just 10 years, from 2010 to 2020, the total amount of new data generated per year grew 32x, from 2 zettabytes created in 2010 to over 64 zettabytes created in 2020 alone. Figure 2 draws on the Statista research dataset to illustrate this.
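That 32x figure implies a striking compound annual growth rate. A minimal sketch, using only the 2 ZB and 64 ZB endpoints from the Statista figures cited above:

```python
# Implied compound annual growth rate (CAGR) of global data creation, 2010-2020.
start_zb, end_zb, years = 2, 64, 10   # endpoints from the Statista figures above

growth_factor = end_zb / start_zb                # 32x overall
cagr = growth_factor ** (1 / years) - 1          # ~0.414, i.e. ~41% per year
print(f"{growth_factor:.0f}x total, ~{cagr:.1%} per year")
```

In other words, data creation grew by roughly 41% every year for a decade, compounding fast enough to double the annual total about every two years.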

