This blog is part of the IEEE IoT Brain Trust series, a collection of blogs exploring IoT across industries.

Nearly everyone who has recently viewed advertisements on the Web, seen recommendations in online shopping, benefited from fraud and security protection, or used any number of other applications built on data from the “crowd” has felt the impact of Big Data. We are now starting to feel the effect of a much bigger “crowd”: the Internet of Things (IoT). The opportunities for innovation from the combination of Big Data and IoT will, once again, change how we think about using the Internet.

Much of the data generated by the IoT exists to be analyzed, in order to make better decisions and, in many cases, to control processes. One could ask: if IoT data is not analyzed and, in most cases, acted upon, what is the point? Conversely, IoT will provide sources of data that are, in aggregate, huge and often very latency sensitive. The half-life of data will become much shorter.

This figure shows the IMADC (instrument, monitor, analyze, decide, control) lifecycle of IoT and Big Data. It starts with instrumentation and monitoring, both functions of IoT. The monitoring function supplies data, often in the form of a data stream, for analysis, either in [near] real time or after storage for later investigation, and usually both. A common example is classification: with the data passing by, can we decide whether we have an example of alert/no alert, fraud/not fraud, disease/healthy, within specification/out of specification, or any of a universe of other classifications? This analysis supports a decision, which can be complex and part of a larger decision process, or one that is passed directly to an automatic control device. The cycle then begins again. This lifecycle is not substantially different from any data lifecycle. What is different is the level of analysis that can be executed, because of the properties of the data and the new analytic techniques that have evolved along with Big Data.
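To make the lifecycle concrete, here is a minimal sketch in Python of the monitor–analyze–decide–control loop described above. The sensor readings, the threshold, and the control action are all hypothetical, chosen only to illustrate the "within specification/out of specification" classification mentioned in the text:

```python
# Minimal sketch of the IMADC loop: a monitored stream is classified
# reading-by-reading, and each classification drives a control decision.
# All values and actions here are hypothetical.

ALERT_THRESHOLD = 100.0  # hypothetical "within specification" limit


def analyze(reading: float) -> str:
    """Classify a single reading as the stream passes by."""
    return "alert" if reading > ALERT_THRESHOLD else "normal"


def decide_and_control(label: str) -> str:
    """Pass the classification on to a (simulated) control action."""
    return "throttle process" if label == "alert" else "no action"


# Simulated monitoring stream from an instrumented device.
stream = [96.2, 98.7, 103.5, 99.1]

for reading in stream:
    label = analyze(reading)
    action = decide_and_control(label)
    print(f"{reading:>6.1f} -> {label:<6} -> {action}")
```

In a real deployment the analysis step would typically be a trained classifier rather than a fixed threshold, and the control step would feed an actuator or a larger decision process, but the shape of the loop is the same.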

What difference does all of this make to you?

The first thing to think about is what you could do if you had the right types and amounts of data. There are lots of new applications, not all of which require huge volume or velocity, but all of which require thinking differently about data. Some might be classified into the following areas:

  • Improved, more transparent operations (often the first target)
  • Optimized processes
  • Reduced loss – fraud, reliability, security
  • Better customer understanding and targeting
  • Automation and decreased latency in products and services
  • New revenue streams, perhaps from new models of doing business

As you think these through, I have a piece of advice. Think very broadly about the data that you have or might generate. There is a wealth of open data available from governments and many other sources. It can often be combined with your proprietary data to very good advantage. In addition, you may be able to inexpensively generate more very valuable data about your operations and your customers’ behavior by using small devices for tracking.

One of my favorite examples comes from a surprising venue: the Dallas Museum of Art. Its director fully leveraged the power of trading free memberships for data, turning it into a new business model. Previously, the museum charged admission; it now offers free admission and membership in return for one’s information. In this way, the director counted on attracting more visitors and measuring their interests, and he hopes to use that data to persuade major donors, foundations, and the city to increase their giving.

Finally, do not mistake “lots of data” for “Big Data”. There are plenty of large data warehouses that are used only in very traditional ways, and there is also an increasing number of smaller data sets that use the data mining, statistics, and machine learning associated with Big Data in very innovative ways to create new value. It’s not only the size of the data but also how one uses it.


Networked & Programmable World: IoT + Big Data

We have gone through a number of quantum leaps in the amount of data available for analysis over the last 20 years. For a long time the primary sources of huge amounts of data, outside of scientific fields like astronomy, meteorology, and now biology, were transactions generated by the operations of companies in industries such as telecom and finance. In the mid-1990s, the emergence of the WWW made it possible for individuals to be both originators and consumers of data, in huge aggregate amounts. Widespread availability of broadband access significantly increased the amount of data that individuals could generate and consume, and added unstructured data in the form of text, speech, and video to the mix. Then came mobility: the ability to carry what amounts to a powerful computer in your pocket, connected to worldwide networks 24x7, caused a further explosion not only in data volume but also in what we might think of as the inconvenience index and half-life of information. That is, people now expect answers in minutes rather than weeks, and the useful lifespan of new information is shrinking fast. In addition, whereas we were formerly willing to go to a computer center for printouts or to a television for news, we now expect information to be within arm’s length at all times, that is, in our pocket. All of this has created a huge amount of data, much of it available for analysis, arriving at tremendous rates, and in a variety of forms – the definition of Big Data.

What is changing?

The advent of the Internet of Things adds trillions of potential sources and consumers of data to the billions of people who make up the “crowd”. The next “crowd” is the stunning number of devices operating on their own that will be connected to the ‘net and generating and/or using data. Some of these will be traditional sensors measuring all sorts of physical phenomena – rainfall, temperature, water levels, etc. Some will be sensors that report on human environments and are attached to automobiles, smart phones, jewelry, refrigerators, cameras, and so on. Many will be used for control, such as household power consumption, smart cities, and automobiles. Before long, these devices will outnumber people by several orders of magnitude.

Just how much data are we talking about?

There are lots of estimates of how much data is out there and how fast it is growing. Nearly all of them are measured in zettabytes (10^21 bytes) and show exponential growth. One such set of numbers, from an IDC study sponsored by EMC, measures 2014 data at 4.4 ZB and estimates 2020 data at 44 ZB. Fascinating numbers, but for most of us a few giga- or terabytes is plenty. That same study estimates that the number of “things” connected to networks will grow from roughly 20 billion today to over 30 billion in 2020. In any case, that is many times the number of people, and it is growing much faster.
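It is worth pausing on what "exponential growth" means here. Taking the two figures cited above at face value, a quick calculation shows the implied compound annual growth rate:

```python
# Implied compound annual growth rate from the cited estimates:
# 4.4 ZB in 2014 growing to 44 ZB in 2020, a 10x increase over 6 years.
start_zb, end_zb, years = 4.4, 44.0, 6

cagr = (end_zb / start_zb) ** (1 / years) - 1
print(f"{cagr:.1%}")  # prints "46.8%", i.e. roughly 47% per year
```

In other words, the data volume in these estimates roughly halves in "age" every couple of years, which is consistent with the shrinking half-life of information discussed earlier.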

How should you think about this, and what might you do now?

First, if you aren’t already using Big Data and its techniques on the data that you have, or might generate, it is time to start. My experience is that a small, multi-skilled team with a bias toward finding something of value quickly is the way to start. That said, even if you are already swimming in the Big Data ocean, get ready for another really big wave of data and opportunity. Some of the “things” in the IoT will likely be your own, but there is also a fast-growing pool of “Open Data” to add to them. It is certainly time to think expansively, and strategically, about how you are going to come out ahead using these two trends.