Predictive Analytics as a service for IoT


This post is a personal viewpoint based on my teaching (IoT and Machine Learning) at the City sciences program at UPM in Madrid – Technical University of Madrid and at Oxford University (with a mobile perspective).

Predictive analytics is critical for IoT, but most companies do not have the skillsets to develop their own predictive analytics engine. The objective of this effort is to provide a predictive analytics interface for Hypercat. We aim to provide a solution accessed through a Hypercat API and a library. Whenever possible, we will use Open Source. We will also encapsulate industry best practices into the solution. This post also extends the discussions at the event Smart cities need a Trusted IoT foundation.

Data and Analytics will be the key differentiator for IoT.

A single sensor collecting data at one-second intervals will generate 31.5 million datapoints a year (source: Intel/WindRiver). However, the value lies not just in one sensor’s datapoints but in the collective intelligence gleaned from thousands (indeed millions) of sensors working together.
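The 31.5 million figure is simple arithmetic, and a short sketch shows how quickly it scales with the number of sensors. The function name and parameters below are my own, chosen for illustration, and a non-leap year is assumed.

```python
# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def datapoints_per_year(sensors: int = 1, interval_seconds: float = 1.0) -> int:
    """Number of readings produced in one year by `sensors` devices,
    each emitting one reading every `interval_seconds` seconds."""
    return int(sensors * SECONDS_PER_YEAR / interval_seconds)


print(datapoints_per_year())              # 31536000, i.e. about 31.5 million
print(datapoints_per_year(sensors=1000))  # 31536000000
```

A thousand such sensors already produce tens of billions of datapoints a year, which is why the collective view matters more than any single stream.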

As I discuss below, this information (and more specifically the rate of IoT-based sensor information and its real-time nature) will make a key difference for IoT and Predictive analytics.

IoT and predictive analytics will change the nature of decision making and will change the competitive landscape of industries. Industries will have to make thousands of decisions in near real time. With predictive analytics, each decision will improve the model for subsequent decisions (also in near real time). We will recognize patterns, make adjustments and improve performance based on data from multiple people and sensors.

IoT and Predictive analytics will enable devices to identify, diagnose and report issues more precisely and quickly as they occur. This will create a ‘closed loop’ model where the Predictive model improves with experience. We will thus go from identifying patterns to making predictions, all in real time.

However, the road to this vision is not straightforward. The two worlds of IoT and Predictive analytics do not meet easily.

Predictive analytics needs the model to be trained before the model makes a prediction. Creating a model and updating it on a continuous real-time basis with streaming IoT data is a complex challenge. It also does not fit the traditional MapReduce model and its inherently batch-processing nature. This challenge is already being addressed (Moving Hadoop beyond batch processing and MapReduce) but will become increasingly central as IoT becomes mainstream.
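To make the contrast with batch training concrete, here is a minimal sketch of a model that learns one observation at a time. It is a plain linear model updated by stochastic gradient descent, which is only one of many possible online techniques and not a description of any particular product. The class name, the learning rate and the simulated sensor stream are all assumptions made for illustration.

```python
import random


class OnlineLinearModel:
    """Linear model y ~ w * x + b, updated one observation at a time (SGD)."""

    def __init__(self, learning_rate: float = 0.1):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: float) -> float:
        return self.w * x + self.b

    def update(self, x: float, y: float) -> float:
        """Predict first, then nudge the weights using the observed error."""
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error
        return error


def simulated_sensor_stream(n: int, seed: int = 0):
    """Hypothetical stream: y = 2x + 1 plus a little Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        yield x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)


model = OnlineLinearModel()
for x, y in simulated_sensor_stream(5000):
    model.update(x, y)  # no batch job: the model changes with every reading

print(model.w, model.b)  # should end up close to the true values 2 and 1
```

The point of the sketch is that the model is never retrained from scratch. Each new reading adjusts it slightly, which is the behaviour a batch-oriented pipeline does not provide naturally.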


IoT and Predictive analytics – opportunities

For IoT and Predictive analytics, processing will take place both in the Cloud and closer to the edge. Not all data will be sent to the Cloud at all times. The newly launched Egburt from Camgian Microsystems is an example of this new trend. Some have called this trend ‘Data gravity’, where computing power is brought to the data as opposed to processing data in a centralized location.
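As a rough illustration of processing at the edge, the sketch below reduces a window of raw readings to a small summary before anything is sent to the Cloud. The function name, the summary fields and the threshold are hypothetical; a real edge device would choose these according to its own constraints.

```python
from statistics import mean


def summarise_at_edge(readings, threshold):
    """Reduce a window of raw readings to one small summary.

    Only readings above `threshold` are kept individually, so the
    Cloud receives a handful of numbers instead of the full window.
    """
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "alerts": [r for r in readings if r > threshold],
    }


# Hypothetical temperature readings collected locally on the device.
window = [20.1, 20.3, 19.9, 35.2, 20.0]
summary = summarise_at_edge(window, threshold=30.0)
print(summary)
```

Here five readings become one summary with a single alert value, and the same idea applies to windows of thousands of readings.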

In addition, the sheer volume of IoT data leads to challenges and opportunities. For example, 100 million points per second in a time series is not uncommon. This leads to specific challenges for IoT (Internet of Things – time series data challenge).

Here are some examples of possible opportunities for IoT and Predictive analytics where groups of sensors work together:

  • We could undertake system-wide predictive maintenance of offshore equipment like wind farms across multiple turbines (i.e. the overall system as opposed to a specific turbine). If we predict a high likelihood of failure in one turbine, we could dynamically reduce the load on that turbine by switching it to a lower performance level.
  • Manage the overall performance of a group of devices. Again for the wind farm example, individual turbines could be tuned together to achieve optimal performance, since individual pieces of equipment have an impact on the overall performance.
  • Manage the ‘domino effect’ of failure. As devices are connected (and interdependent), failure of one could cascade across the whole network. By using predictive analytics, we could anticipate such cascading failure and also reduce its impact.
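The ‘domino effect’ can be sketched as a walk over a dependency graph: given which devices depend on which, we can list everything that is exposed if one device fails. The graph, the device names and the function name below are invented for illustration, and the sketch treats every dependency as a certain path for failure, which is a simplification.

```python
from collections import deque


def devices_at_risk(dependents, failed):
    """Return every device reachable from `failed` via 'depends on' edges.

    `dependents` maps a device to the devices that rely on it.
    This is a breadth-first search, so each device is visited once.
    """
    at_risk = set()
    queue = deque([failed])
    while queue:
        device = queue.popleft()
        for dependent in dependents.get(device, []):
            if dependent != failed and dependent not in at_risk:
                at_risk.add(dependent)
                queue.append(dependent)
    return at_risk


# Hypothetical wind farm: two turbines rely on one substation,
# and a controller relies on the first turbine.
dependents = {
    "substation": ["turbine_1", "turbine_2"],
    "turbine_1": ["controller_1"],
    "turbine_2": [],
}

print(sorted(devices_at_risk(dependents, "substation")))
# ['controller_1', 'turbine_1', 'turbine_2']
```

A predictive model would add a probability to each edge rather than treating it as certain, but even this simple reachability view shows which parts of the network to protect first.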

IoT and Predictive analytics – challenges

Despite the benefits, the two worlds of IoT and Predictive analytics do not meet very naturally.

In a nutshell, Predictive analytics involves extracting information from existing data sets to identify patterns which help predict future outcomes and trends for new (unseen) scenarios. This allows us to predict what will happen in the future with an acceptable level of reliability.

To do this, we must:

a)      Identify patterns from existing data sets

b)      Create a model which will predict the future


Doing these two steps in real time is a challenge. Traditionally, data is fed to a system in a batch. But for IoT, we have a continuous stream of new observations in real time. The outcome (i.e. the business decision) also has to be made in real time. Today, some systems like credit card authorization perform some real-time validations, but for IoT the scale and scope will be much larger.


So, this leads to more questions:

a)      Can the predictive model be built in real time?

b)      Can the model be updated in real time?

c)       How much historical data can be used for this model?

d)      How can the data be pre-processed and at what rate?

e)      How frequently can the model be retrained?

f)       Can the model be incrementally updated?


Real time also requires many architectural changes, e.g. in-memory processing, stream processing, etc.
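One possible answer to the questions about how much historical data to use and how to process a stream in memory is a fixed-size sliding window. The sketch below keeps only the most recent readings and flags a new value that is far from the window's mean. The class name, window size and z-score threshold are assumptions for illustration, not recommendations.

```python
from collections import deque
from statistics import mean, pstdev


class RollingWindowDetector:
    """Keeps the last `window_size` readings in memory and flags outliers."""

    def __init__(self, window_size: int = 50, z_threshold: float = 3.0):
        self.window = deque(maxlen=window_size)  # old readings drop off automatically
        self.z_threshold = z_threshold

    def observe(self, value: float) -> bool:
        """Return True if `value` is anomalous relative to the current window,
        then add it to the window."""
        is_anomaly = False
        if len(self.window) >= 2:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.z_threshold:
                is_anomaly = True
        self.window.append(value)
        return is_anomaly


detector = RollingWindowDetector(window_size=5)
stream = [10.0, 10.2, 10.1, 9.9, 10.0, 10.1, 25.0, 10.0]
flags = [detector.observe(v) for v in stream]
print(flags)  # only the reading of 25.0 is flagged
```

Note the trade-off this exposes: a small window adapts quickly but is easily distorted (the 25.0 reading inflates the window's spread for the next few values), while a large window is more stable but slower to follow genuine change.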



According to Gartner analyst Joe Skorupa: “The enormous number of devices, coupled with the sheer volume, velocity and structure of IoT data, creates challenges, particularly in the areas of security, data, storage management, servers and the data center network, as real-time business processes are at stake.”

Thus, IoT will affect many areas: Security, Business processes, Consumer Privacy, Data, Storage Management, Server Technologies, Data Center Network, etc.

The Hypercat platform provides a mechanism to manage these complex changes.

We can model every sensor+actuator and person as a Digital entity. We can assign predictive behaviour to digital objects (a Digital entity has processing power, an agenda and access to meta data). We can model and assign predictive behaviour to multiple levels of objects (from the whole refinery to a valve).
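The idea of assigning predictive behaviour at multiple levels can be sketched as a tree of entities, where each level combines its own predicted risk with that of its parts. This is not the Hypercat API. The class, its fields and the independence assumption used to combine risks are all mine, for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DigitalEntity:
    """Hypothetical digital entity: anything from a valve to a whole refinery."""
    name: str
    own_risk: float = 0.0  # predicted failure probability of this entity alone
    children: List["DigitalEntity"] = field(default_factory=list)

    def risk(self) -> float:
        """Probability that this entity or any of its parts fails.

        Assumption: failures are independent, so survival
        probabilities multiply down the tree.
        """
        survive = 1.0 - self.own_risk
        for child in self.children:
            survive *= 1.0 - child.risk()
        return 1.0 - survive


valve = DigitalEntity("valve", own_risk=0.10)
pump = DigitalEntity("pump", own_risk=0.20)
refinery = DigitalEntity("refinery", children=[valve, pump])

print(round(refinery.risk(), 2))  # 0.28, i.e. 1 - (0.9 * 0.8)
```

The same `risk()` call works on the valve, the pump or the refinery, which is the sense in which behaviour is assigned at every level of the hierarchy.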

We can model time-varying data and predict behaviour based on inputs at a point in time. The behaviour is flexible (resolved at run time) and creates a risk prediction and a feedback loop to modify behaviour in real time along with a set of rules.

We can thus cover the whole lifecycle: starting with discovery of new IoT services in a federated manner, through managing security and privacy, to ultimately creating autonomous, emergent behaviour for each entity.

All this takes place in the context of a security and interoperability framework.


Predictive analytics as a service?

Based on the above, predictive analytics cannot be an API; rather, it would be a dynamic service which can provide the right data, to the right person, at the right time and place. The service would be self-improving (self-learning) in real time.

I welcome comments on the above. You can email me at ajit.jaokar at or post in the Hypercat LinkedIn forum