A methodology for solving problems with Data Science for Internet of Things



This (long!) blog is based on my forthcoming book:  Data Science for Internet of Things.

It is also the basis for the course I teach  Data Science for Internet of Things Course.   

Welcome your comments. 

Please email me at ajit.jaokar at futuretext.com  - Email me also for a pdf version if you are interested in joining the course

Here, we start off with the question: At which points could you apply analytics to the IoT ecosystem, and what are the implications? We then extend this to a broader question: Could we formulate a methodology to solve Data Science for IoT problems? I have illustrated my thinking through a number of companies and examples. I personally work with an Open Source strategy (based on R, Spark and Python), but the methodology applies to any implementation. We are currently working with a range of implementations including AWS, Azure, GE Predix and Nvidia. Thus, the discussion is vendor agnostic.

I also mention some trends I am following, such as Apache NiFi.

The Internet of Things and the flow of Data

As we move towards a world of 50 billion connected devices,  Data Science for IoT (IoT  analytics) helps to create new services and business models.  IoT analytics is the application of data science models  to IoT datasets.  The flow of data starts with the deployment of sensors.  Sensors detect events or changes in quantities. They provide a corresponding output in the form of a signal. Historically, sensors have been used in domains such as manufacturing. Now their deployment is becoming pervasive through ordinary objects like wearables. Sensors are also being deployed through new devices like Robots and Self driving cars. This widespread deployment of sensors has led to the Internet of Things.

Features of a typical wireless sensor node are described in this paper (wireless embedded sensor architecture). Typically, data arising from sensors is in time series format and is often geotagged. This means there are two main forms of analytics for IoT: time series and spatial analytics. Time series analytics typically leads to insights such as anomaly detection. Thus, classifiers are commonly used in IoT analytics to detect anomalies. But by looking at historical trends, streaming, and combining data from multiple events (sensor fusion), we can get new insights. And more use cases for IoT keep emerging, such as Augmented Reality (think Pokemon Go + IoT).

Meanwhile, sensors themselves continue to evolve. Sensors have shrunk due to technologies like MEMS. Also, their communications protocols have improved through new technologies like LoRa. These protocols lead to new forms of communication for IoT such as Device to Device, Device to Server, or Server to Server. Thus, whichever way we look at it, IoT devices create a large amount of data. Typically, the goal of IoT analytics is to analyse the data as close to the event as possible. We see this requirement in many ‘Smart city’ type applications such as Transportation, Energy grids, Utilities like Water, Street lighting, Parking etc.

IoT data transformation techniques

Once data is captured through the sensor, there are a few analytics techniques that can be applied to it. Some of these are unique to IoT. For instance, not all data may be sent to the Cloud/Lake. We could perform temporal or spatial analysis. Considering the volume of data, some may be discarded at source or summarized at the Edge. Data could also be aggregated, and aggregate analytics could be applied at the ‘Edge’. For example, if you want to detect failure of a component, you could look for spikes in values for that component over a recent span (thereby potentially predicting failure). You could also correlate data across multiple IoT streams. Typically, in stream processing, we are trying to find out what is happening now (as opposed to what happened in the past). Hence, the response should be near real-time.

Sensor data could also be ‘cleaned’ at the Edge. Missing values in sensor data could be filled in (imputing values), sensor data could be combined to infer an event (Complex Event Processing), and data could be normalized across sensors, time and devices. We could also handle different data formats or multiple communication protocols and manage thresholds.
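To make two of these transformations concrete, here is a minimal Python sketch of imputing missing values and then finding spikes over a recent span. The function names, the forward-fill strategy, the window size and the 3-sigma rule are my own illustrative assumptions, not a prescribed method.

```python
from statistics import mean, pstdev

def impute_missing(readings):
    """Fill gaps (None) with the last seen value (simple forward fill).

    Assumes the first reading is present; a leading None stays None.
    """
    filled, last = [], None
    for value in readings:
        if value is None:
            value = last
        filled.append(value)
        last = value
    return filled

def find_spikes(readings, window=5, k=3.0):
    """Return indices whose value lies more than k standard deviations
    from the mean of the preceding `window` readings."""
    spikes = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu, sigma = mean(recent), pstdev(recent)
        if sigma > 0 and abs(readings[i] - mu) > k * sigma:
            spikes.append(i)
    return spikes

# Hypothetical temperature readings with one gap and one spike
raw = [20.1, 20.3, None, 20.2, 20.4, 20.3, 35.0, 20.2]
clean = impute_missing(raw)
spikes = find_spikes(clean)
print(spikes)
```

In practice the window and the multiplier k would be tuned per sensor, and a more robust statistic (such as the median) may be preferable when spikes are frequent.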



Applying IoT Analytics to the Flow of Data


Here, we address the possible locations and types of analytics that could be applied to IoT datasets.



Some initial thoughts:

  • IoT data arises from sensors and ultimately resides in the Cloud.
  • We use the concept of a ‘Data Lake’ to refer to a repository of data.
  • We consider four possible avenues for IoT analytics: ‘Analytics at the Edge’, ‘Streaming Analytics’, NoSQL databases and ‘IoT analytics at the Data Lake’.
  • For Streaming analytics, we could build an offline model and apply it to a stream.
  • If we consider cameras as sensors, Deep learning techniques could be applied to image and video datasets (for example CNNs).
  • Even when IoT data volumes are high, not all scenarios need data to be distributed. It is very much possible to run analytics on a single node using a non-distributed architecture with Python or R.
  • Feedback mechanisms are a key part of IoT analytics. Feedback is part of multiple IoT analytics modalities, e.g. Edge, Streaming etc.
  • CEP (Complex event processing) can be applied at multiple points, as we see in the diagram.


We now describe various analytics techniques which could apply to IoT datasets

Complex event processing

Complex Event Processing (CEP) can be used at multiple points for IoT analytics (e.g. Edge, Stream, Cloud etc.).

In general, Event processing is a method of tracking and  analyzing  streams  of  data and deriving a conclusion from them. Complex event processing, or CEP, is event processing that combines data from multiple sources to infer events or patterns that suggest more complicated circumstances. The goal of complex event processing is to identify meaningful events (such as opportunities or threats) and respond to them as quickly as possible.

In CEP, the data is in motion. In contrast, a traditional query (e.g. against an RDBMS) acts on static data. Thus, CEP is mainly about stream processing, but the algorithms underlying CEP can also be applied to historical data.

CEP relies on a number of techniques for events, including pattern detection, abstraction, filtering, aggregation and transformation. CEP algorithms model event hierarchies and detect relationships (such as causality, membership or timing) between events. They create an abstraction of event-driven processes. Thus, typically, CEP engines act as event correlation engines: they analyze a mass of events, pinpoint the most significant ones, and trigger actions.

Most CEP solutions and concepts can be classified into two main categories: Aggregation-oriented CEP and Detection-oriented CEP. An aggregation-oriented CEP solution is focused on executing on-line algorithms as a response to event data entering the system, for example to continuously calculate an average based on data in the inbound events. Detection-oriented CEP is focused on detecting combinations of events called event patterns or situations, for example by looking for a specific sequence of events. For IoT, CEP techniques are concerned with deriving a higher order value or abstraction from discrete sensor readings.
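The two categories can be illustrated with a small Python sketch. This is only a toy illustration of the ideas, not how any particular CEP engine works; the class, function and event names are hypothetical.

```python
from collections import deque

class RunningAverage:
    """Aggregation-oriented: update an average as each event arrives."""
    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Incremental mean: no need to keep past events in memory
        self.mean += (value - self.mean) / self.count
        return self.mean

def detect_sequence(events, pattern):
    """Detection-oriented: return the indices at which `pattern`
    completes as a contiguous sequence of events."""
    recent = deque(maxlen=len(pattern))
    hits = []
    for i, event in enumerate(events):
        recent.append(event)
        if list(recent) == list(pattern):
            hits.append(i)
    return hits

avg = RunningAverage()
for reading in [10, 20, 30]:
    avg.update(reading)

# Hypothetical event names from a building sensor network
events = ["door_open", "motion", "temp_high", "smoke", "motion"]
hits = detect_sequence(events, ["temp_high", "smoke"])
print(avg.mean, hits)
```

Real CEP engines typically add time windows, non-contiguous patterns and correlation across streams on top of these basic building blocks.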

CEP uses techniques like Bayesian networks, neural networks, Dempster-Shafer methods, Kalman filters etc. Some more background at Developing a complex event processing architecture for IoT

Streaming analytics

Real-time systems differ in the way they perform analytics. Specifically, real-time systems perform analytics on short time windows over data streams. Hence, the scope of real-time analytics is a ‘window’ which typically comprises the last few time slots. Making predictions on real-time data streams involves building an offline model and applying it to a stream. Models incorporate one or more machine learning algorithms which are trained using the training data. Models are first built offline based on historical data (spam, credit card fraud etc.). Once built, the model can be applied to the real-time stream to find deviations in the stream data. Deviations beyond a certain threshold are tagged as anomalies.
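As a rough sketch of this "build offline, apply online" idea, the Python below fits a very simple model (mean and standard deviation) on historical data and then scores a stream over a sliding window. The model, window size and threshold are illustrative assumptions; a real system would use a trained machine learning model and a streaming framework.

```python
from collections import deque
from statistics import mean, stdev

def fit_offline(history):
    """'Train' a toy model offline: the normal level and spread."""
    return {"mean": mean(history), "std": stdev(history)}

def score_stream(model, stream, window=3, threshold=3.0):
    """Apply the offline model to a stream, one window at a time.

    Returns the time slots at which the window average deviates from
    the model by more than `threshold` standard deviations.
    """
    recent = deque(maxlen=window)
    anomalies = []
    for t, value in enumerate(stream):
        recent.append(value)
        if len(recent) < window:
            continue  # wait until the first full window
        deviation = abs(mean(recent) - model["mean"]) / model["std"]
        if deviation > threshold:
            anomalies.append(t)
    return anomalies

# Hypothetical historical readings and a live stream with a jump
model = fit_offline([50, 52, 49, 51, 50, 48, 51, 49])
anomalies = score_stream(model, [50, 51, 49, 70, 72, 71, 50])
print(anomalies)
```

Note that the window keeps flagging a slot or two after the stream returns to normal, because the jump is still inside the window; this lag is a general property of windowed analytics.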

IoT ecosystems can create many logs depending on the status of IoT devices. By collecting these logs for a period of time and analyzing the sequence of event patterns, a model to predict a fault can be built, including the probability of failure for the sequence. This model to predict failure is then applied to the stream (online). A technique like the Hidden Markov Model can be used for detecting failure patterns based on the observed sequence. Complex Event Processing can be used to combine events over a time frame (e.g. the last one minute) and correlate patterns to detect the failure pattern.
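A full Hidden Markov Model is beyond a short example, so the sketch below uses a simpler stand-in: a first-order Markov chain in which the device states are read directly from the logs rather than hidden. It estimates, from historical log sequences, the probability that a failure follows a given state. The state names and log data are hypothetical.

```python
from collections import Counter, defaultdict

def fit_transitions(sequences):
    """Estimate P(next state | current state) from logged sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for state, ctr in counts.items()
    }

def failure_probability(transitions, state, fail="FAIL"):
    """Probability that the next logged state is a failure."""
    return transitions.get(state, {}).get(fail, 0.0)

# Hypothetical device logs collected over a period of time (offline)
logs = [
    ["OK", "WARN", "FAIL"],
    ["OK", "OK", "WARN", "OK"],
    ["OK", "WARN", "WARN", "FAIL"],
]
transitions = fit_transitions(logs)

# Applied online: score the latest state seen on the stream
p_fail_after_warn = failure_probability(transitions, "WARN")
print(p_fail_after_warn)
```

An HMM extends this by treating the true device condition as hidden and the log events as noisy observations of it.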

Typically, streaming systems could be implemented using Apache Kafka and Apache Spark.


Some interesting links on streaming I am tracking:

Newer versions of Kafka designed for IoT use cases

Data Science Central: Stream processing and streaming analytics, how it works

IoT 101: Everything you need to know to start your IoT project, Part One

IoT 101: Everything you need to know to start your IoT project, Part Two


Edge Processing

Many vendors like Cisco and Intel are proponents of Edge Processing (also called Edge Computing). The main idea behind Edge Computing is to push processing away from the core and towards the Edge of the network. For IoT, that means pushing processing towards the sensors or a gateway. This enables data to be initially processed at the Edge device, possibly enabling smaller datasets to be sent to the core. Devices at the Edge may not be continuously connected to the network. Hence, these devices may need a copy of the master data/reference data for processing in an offline format. Edge devices may also include other features like:

•    Apply rules and workflow against that data

•    Take action as needed

•    Filter and cleanse the data

•    Store local data for local use

•    Enhance security

•    Provide governance admin controls
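A minimal Python sketch of a few of these Edge functions (filter and cleanse, apply a rule, and send a smaller summary to the core) might look like the following. The function name, the valid sensor range and the alert threshold are illustrative assumptions only.

```python
from statistics import mean

def process_at_edge(readings, low, high, alert_above):
    """Cleanse a batch locally and return (summary, alerts).

    Only the small summary and any alerts would be sent to the core;
    the raw readings can stay on the device for local use.
    """
    # Filter and cleanse: drop readings outside the sensor's valid range
    valid = [r for r in readings if low <= r <= high]
    # Apply a rule: anything above the alert threshold needs action
    alerts = [r for r in valid if r > alert_above]
    if not valid:
        return None, alerts
    summary = {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": mean(valid),
    }
    return summary, alerts

# Hypothetical temperature batch; -999.0 is a sensor error code
batch = [21.0, 21.5, -999.0, 22.0, 80.0, 21.5]
summary, alerts = process_at_edge(batch, low=-40.0, high=125.0, alert_above=60.0)
print(summary, alerts)
```

Here six raw readings are reduced to one summary record plus one alert, which is the kind of data reduction that makes Edge processing attractive.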

IoT analytics techniques applied at the Data Lake

Data Lakes

The concept of a Data Lake is similar to that of a Data Warehouse or a Data Mart. In this context, we see a Data Lake as a repository for data from different IoT sources. A Data Lake is typically built on a platform such as Hadoop, which means that data in the lake is preserved in its raw format. Unlike a Data Warehouse, data in a Data Lake is not pre-categorised. From an analytics perspective, Data Lakes are relevant in the following ways:

  • We could monitor the stream of data arriving in the lake for specific events, or we could correlate different streams. Both of these tasks use Complex event processing (CEP). CEP could also apply to data when it is stored in the lake to extract broad, historical perspectives.
  • Similarly, Deep learning and other techniques could be applied to IoT datasets in the Data Lake when the data is ‘at rest’. We describe these below.

ETL (Extract Transform and Load)

Companies like Pentaho are applying ETL techniques to IoT data

Deep learning

Some deep learning techniques could apply to IoT datasets. If you consider images and video as sensor data, then we could apply various convolutional neural network techniques to this data.

It gets more interesting when we consider RNNs (Recurrent Neural Networks) and Reinforcement learning. For example – Reinforcement learning and time series – Brandon Rohrer How to turn your house robot into a robot – Answering the challenge – a new reinforcement learning robot

Over time, we will see far more complex options, for example for self-driving cars and the use of Recurrent Neural Networks (Mobileye).



Systems level optimization and process level optimization for IoT is another complex area where we are doing work.



Visualization is necessary for analytics in general and IoT analytics is no exception


NoSQL databases

NoSQL databases today offer a great way to implement IoT analytics. For instance,

Apache Cassandra for IoT

MongoDB and IoT tutorial


Other  IoT analytic techniques

In this section, I list some IoT  technologies where we could implement analytics


A Methodology to solve Data Science for IoT problems

We started off with the question: At which points could you apply analytics to the IoT ecosystem and what are the implications? But behind this work is a broader question: Could we formulate a methodology to solve Data Science for IoT problems? I am exploring this question as part of my teaching, both online and at Oxford University, along with Jean-Jacques Bernard.

Here is more on our thinking:

  • CRISP-DM is a data mining process methodology used in analytics. More on CRISP-DM HERE and HERE (pdf documents).
  • From a business perspective (top down), we can extend CRISP-DM to incorporate an understanding of the IoT domain, i.e. add domain specific features. This includes understanding the business impact, handling high volumes of IoT data, understanding the nature of data coming from various IoT devices etc.
  • From an implementation perspective (bottom up), once we have an understanding of the data and the business processes, for each IoT vertical we first find the analytics (what is being measured, optimized etc.). Then we find the data needed for those analytics. Then we provide examples of that implementation using code. Extending CRISP-DM to an implementation methodology, we could have process (workflow), templates, code, use cases, data etc.
  • For implementation in R, we are looking to initially use Open source R and Spark and the h2o.ai API.



We started off with the question: At which points could you apply analytics to the IoT ecosystem and what are the implications? We then extended this to a broader question: Could we formulate a methodology to solve Data Science for IoT problems? The above is comprehensive but not absolute. For example, you can implement deep learning algorithms on mobile devices (Qualcomm Snapdragon machine learning development kit for mobile devices). So, even as I write it, I can think of exceptions!


This article is part of my forthcoming book on Data Science for IoT and also the courses I teach

Welcome your comments.  Please email me at ajit.jaokar at futuretext.com  - Email me also for a pdf version if you are interested. If you want to be a part of my course please see the testimonials at Data Science for Internet of Things Course.  

Book review: About Time Series Databases and a New look at Anomaly detection by Ted Dunning and Ellen Friedman


 This blog is a review of two books. Both are available for free from the MapR site and are written by Ted Dunning and Ellen Friedman (published by O'Reilly): About Time Series Databases: New ways to store and access data and A new look at Anomaly Detection

 The  MapR platform is a key part of the Data Science for the Internet of Things (IoT) course – University of Oxford and I shall be covering these issues in my course

 In this post, I discuss the significance of Time series databases from an IoT perspective, based on my review of these books. Specifically, we discuss Classification and Anomaly detection, which often go together for typical IoT applications. The books are easy to read, with analogies like HAL (2001: A Space Odyssey), and I recommend them.


Time Series data

The idea of time series data is not new. Historically, time series data could be stored even in simple structures like flat files. The difference now is the huge volume of data and the future applications made possible by collecting this data, especially for IoT. These large scale time series databases and applications are the focus of the book. Large scale time series applications typically need a NoSQL database like Apache Cassandra, Apache HBase, MapR-DB etc. The book’s focus is Apache HBase and MapR-DB for the collection, storage and access of large-scale time series data.

  Essentially, time series data involves measurements or observations of events as a function of the time at which they occurred. The airline ‘black box’ is a good example of time series data. The black box records data many times per second for dozens of parameters throughout the flight, including altitude, flight path, engine temperature and power, indicated air speed, fuel consumption, and control settings. Each measurement includes the time it was made. The analogy applies to sensor data. Increasingly, with the proliferation of IoT, time series data is becoming more common and universal. The data acquired through sensors is typically stored in Time Series Databases. The TSDB (Time series database) is optimized for queries based on a range of time.
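To show why keeping measurements ordered by time makes range queries cheap, here is a small in-memory Python sketch. It is only an illustration of the access pattern, not how HBase, MapR-DB or OpenTSDB are implemented; the class and method names are hypothetical.

```python
from bisect import bisect_left, bisect_right

class TimeSeries:
    """Measurements kept sorted by timestamp for fast range queries."""
    def __init__(self):
        self._times = []
        self._values = []

    def append(self, timestamp, value):
        # Assumption: readings arrive in time order, as from one sensor
        if self._times and timestamp < self._times[-1]:
            raise ValueError("timestamps must be non-decreasing")
        self._times.append(timestamp)
        self._values.append(value)

    def query_range(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp <= end."""
        lo = bisect_left(self._times, start)   # binary search, O(log n)
        hi = bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))

# Hypothetical altitude readings, one per second
altitude = TimeSeries()
for t, metres in enumerate([1000, 1010, 1025, 1030, 1020]):
    altitude.append(t, metres)

result = altitude.query_range(1, 3)
print(result)
```

Because the data is sorted by time, a range query needs two binary searches and one contiguous slice, rather than a scan of every measurement.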


Time series data applications

Time series databases apply to many IoT use cases for example:

  • Trucking, to reduce taxes according to how much trucks drive on public roads (which sometimes incur a tax). It’s not just a matter of how many miles a truck drives but rather which miles.
  • A smart pallet can be a source of time series data that might record events of interest such as when the pallet was filled with goods, when it was loaded or unloaded from a truck, when it was transferred into storage in a warehouse, or even the environmental parameters involved, such as temperature.
  • Similarly, commercial waste containers, called dumpsters in the US, could be equipped with sensors to report on how full they are at different points in time.
  • Cell tower traffic can also be modelled as a time series, and anomalies like flash crowd events can be used to provide early warning.
  • Data Center Monitoring can be modelled as a time series to predict outages and plan upgrades.
  • Similarly, Satellites, Robots and many more devices can be modelled as Time series data

From these readings captured in a Time Series database, we can derive analytics such as:

Prognosis: What are the short- and long-term trends for some measurement or ensemble of measurements?

Introspection: How do several measurements correlate over a period of time?

Prediction:  How do I build a machine-learning model based on the temporal behaviour of many measurements correlated to externally known facts?

Introspection:  Have similar patterns of measurements preceded similar events?

Diagnosis:  What measurements might indicate the cause of some event, such as a failure?


Classification and Anomaly detection for IoT

The books give examples of the usage of Anomaly detection and Classification for IoT data.

For Time series IoT based readings, anomaly detection and Classification go together. Anomaly detection determines what normal looks like, and how to detect deviations from normal.

When searching for anomalies, we don’t know what their characteristics will be in advance. Once we know characteristics, we can use a different form of machine learning i.e. classification

Anomaly in this context just means different than expected; it does not refer to desirable or undesirable. Anomaly detection is a discovery process to help you figure out what is going on and what you need to look for. The anomaly-detection program must discover interesting patterns or connections in the data itself.

Anomaly detection and classification go together when it comes to finding a solution to real-world problems. Anomaly detection is used first in the discovery phase—to help you figure out what is going on and what you need to look for. You could use the anomaly-detection model to spot outliers, then set up an efficient classification model to assign new examples to the categories you’ve already identified. You then update the anomaly detector to consider these new examples as normal and repeat the process

The book goes on to give examples of the usage of these techniques with EKG data.

For example, for the challenge of finding an approachable, practical way to model normal for a very complicated curve such as the EKG, we could use a type of machine learning known as deep learning.

 Deep learning involves letting a system learn in several layers, in order to deal with large and complicated problems in approachable steps. Curves such as the EKG have repeated components separated in time rather than superposed. We take advantage of the repetitive and separated nature of an EKG curve in order to accurately model its complicated shape to detect normal patterns using Deep learning

The book also refers to a data structure called t-digest for the accurate calculation of extreme quantiles. t-digest was developed by one of the authors, Ted Dunning, as a way to accurately estimate extreme quantiles for very large data sets with limited memory use. This capability makes t-digest particularly useful for selecting a good threshold for anomaly detection. The t-digest algorithm is available in Apache Mahout as part of the Mahout math library. It’s also available as open source at https://github.com/tdunning/t-digest
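To illustrate what "selecting a threshold from an extreme quantile" means, the Python sketch below computes the quantile exactly by sorting all the scores. Note that this is not t-digest: sorting needs all the data in memory, which is exactly the cost t-digest is designed to avoid by estimating the quantile instead. The function name and the sample scores are my own illustrative assumptions.

```python
import math

def quantile_threshold(scores, q=0.999):
    """Exact nearest-rank quantile of `scores` (not a t-digest estimate)."""
    ordered = sorted(scores)
    rank = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[rank]

def flag_anomalies(new_scores, threshold):
    """Anything scoring above the historical extreme quantile is flagged."""
    return [s for s in new_scores if s > threshold]

# Hypothetical anomaly scores observed during normal operation
history = [0.01 * i for i in range(1000)]
threshold = quantile_threshold(history, q=0.999)

flagged = flag_anomalies([1.2, 4.5, 25.0], threshold)
print(threshold, flagged)
```

With a threshold at the 99.9th percentile, roughly one in a thousand normal readings would be flagged, so the choice of quantile directly sets the false alarm rate.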


Anomaly detection is a complex field and needs a lot of data.

For example: what happens if you only save a month of sensor data at a time, but the critical events leading up to a catastrophic part failure happened six weeks or more before the event?

IoT from a large scale Data standpoint

To conclude, much of the complexity for IoT analytics comes from the management of Large scale data.

Collectively, Interconnected Objects and the data they share make up the Internet of Things (IoT).

Relationships between objects and people, between objects and other objects, conditions in the present, and histories of their condition over time can be monitored and stored for future analysis, but doing so is quite a challenge.

However, the rewards are also potentially enormous. That’s where machine learning and anomaly detection can provide a huge benefit.

For Time series, the book covers themes such as

Storing and Processing Time Series Data

The Direct Blob Insertion Design

Why Relational Databases Aren’t Quite Right

Architecture of Open TSDB

Value Added: Direct Blob Loading for High Performance

Using SQL-on-Hadoop Tools

Using Apache Spark SQL

 Advanced Topics for Time Series Databases(Stationary Data, Wandering Sources, Space-Filling Curves )

For Anomaly detection:

Windows and Clusters

 Anomalies in Sporadic Events

Website Traffic Prediction

Extreme Seasonality Effects




IoT and Machine Learning workshop in Palo Alto – part of the Internet of Things World event






As you know from previous posts, I have been very interested in IoT / Smart cities and Algorithms

So, it's nice to conduct this workshop, based on the program “Big data analytics and algorithms for cities” at the City sciences program of the Technical University of Madrid.

 IoT and Machine Learning is a unique one day workshop which explores Machine learning techniques for IoT.

The workshop is designed as an exploratory/introductory workshop for participants who are interested in using machine learning techniques for IoT data.

Arthur Samuel, the pioneering AI scientist, defined machine learning as ‘The field of study that gives computers the ability to learn without being explicitly programmed’. Machine learning includes examples such as the driverless car, which requires data from other cars, street lights, people and a range of sensors, coupled with the analytics to make real-time decisions.

Hence, unlike programmers who work with pre-defined logic for a problem domain (using statements like if-then-else, loops etc.), for data scientists, the logic is often non-deterministic.

Thus, given an IoT data set, the machine learning algorithm has to deduce a logic based on a pattern in the data.

The first part of the workshop will explain Machine learning techniques. This will be followed by understanding how these techniques could be applied to IoT datasets. We will use Smart city datasets to explore Machine learning and we will explore specific techniques like sensor fusion.

Machine learning techniques we explore are:

  • Supervised and unsupervised learning
  • Neural Networks
  • Machine Learning System Design
  • Clustering
  • Anomaly Detection
  • Recommender Systems
  • Large-Scale Machine learning systems
  • Programming paradigms and Languages for machine learning
  • Computation at the edge or Computation at the core

Problem domains include:

  • Prediction examples based on datasets (energy, pollution)
  • Optimization based (traffic routing, commute optimization)
  • Pattern identifying (predict hotspots based on health care data)
  • New business processes based on machine learning for objects that have to navigate an unpredictable domain (driverless cars, drones)

Note that this course is introductory but still needs a basic understanding of, and aptitude for, Mathematics.

Please contact me for any queries/thoughts/comments

workshop link  in Palo Alto on IoT and Machine Learning

webinos – sensor based scenarios – managed service scenarios for sensor networks ..

Firstly, some goals of the webinos project adapted from the webinos introduction

The purpose of the Webinos project is to define and deliver an Open Source platform, which will enable web applications and services to be used and shared consistently and securely over a broad spectrum of connected devices.

In practice, this is more than the APIs on individual devices. So, within a web based scenario, the service should be able to potentially

-  Run across devices and domains

-  Share preferences, status and synchronized information across multiple devices

-  This applies to device features as well, e.g. using smartphones as input devices

-  Allow consistent access to developers

-  Manage user authentication, cross device events, metrics and application distribution across devices

The tagline of the webinos project is “Secure Web Operating System Application Delivery Environment”, indicating that security (and also privacy)  is a significant part of the project.

Some of the functionality already exists in proprietary implementations, for example Sky Go, which allows Sky subscribers to watch Sky TV on their designated iPhone and iPad devices. Webinos aims to do this and a lot more for the web across platforms by creating truly distributed applications.

In practice, this means for web based applications, webinos will allow for the following:

  • Applications which make optimal use of the resources on the featured devices of TV, Automotive, Tablet, PC and Mobile
  • Applications which interoperate over diverse device types
  • Applications which can make use of services on other devices owned by the same person
  • Applications which can make use of services on devices owned by other people
  • Discovery mechanisms to find services, devices and people, on multiple interconnect technologies – even when they are not connected to the internet
  • Efficient communication mechanisms, that can pass messages over different physical bearers, can navigate firewalls, and make sensible use of scarce network resources
  • Promiscuous communication mechanisms, that will find the best physical connection to pass messages over (not just IP)
  • Strongly authenticated communication mechanisms that work bidirectionally, so we know we really are talking to the remote service or device we thought we were, tackling head on the spoofing and phishing weaknesses of the web
  • And finally, implementing distributed, user centric policy:
    • allowing the user to define what applications work on what devices,
    • to define what information is exposed to other services
    • and ensuring these capabilities are interoperable and transferable – ensuring a user stays in control of their devices and their applications

To implement this functionality, the webinos architecture introduces three components: the webinos web runtime (WRT), the webinos personal zone hub (PZH) and the webinos personal zone proxy (PZP).

A webinos web runtime is a special type of browser which is capable of rendering the latest JavaScript, HTML4/5 and CSS specifications. It is responsible for rendering the UI elements of the webinos application. A webinos WRT must be able to access the webinos root object from JavaScript. Via this root object, the third party developer will be able to access the webinos functionality. A webinos WRT differs from a normal browser or web runtime in that all extended JavaScript functions, as well as some normal browser behaviours (such as XHR), must be mediated by the webinos policy enforcement layer. The webinos web runtime is tightly coupled to the PZP and presents environmental properties and critical events to the PZP.



In webinos, the Personal Zone is a conceptual construct, that is implemented on a distributed basis from a single Personal Zone Hub (PZH) and multiple Personal Zone Proxy (PZP)s

The webinos personal zone hub (PZH) provides a fixed entity to which all requests and messages can be sent and routed on: a personal postbox, as it were. The PZH also holds the authoritative master copy of a number of critical data elements that are to be synced between Personal Zone Proxies (PZPs) and the Personal Zone Hub (PZH), for example certificates.

The PZH enables functionality like the creation of a User authentication service,  secure session creation for transport of messages and synchronisation between the PZP and PZH. The PZH also stores the policy files.

The webinos personal zone proxy (PZP) acts in place of the Personal Zone Hub when there is no internet access to the central server. The PZP fulfils most, if not all, of the functions described above when there is no PZH access. In addition to the PZH proxy function, the PZP is responsible for all discovery using local hardware based bearers (Bluetooth, ZigBee, NFC etc.). Unlike the PZH, the PZP does not issue certificates and identities.

For optimisation reasons PZPs are capable of talking directly PZP-PZP, without routing messages through the PZH
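The relationship between the hub and the proxies can be sketched in a few lines of Python. This is purely a conceptual model of the roles described above and does not use the real webinos API (which is JavaScript based); every class, method and device name here is hypothetical, and the "certificate" is just a placeholder string.

```python
class PersonalZoneHub:
    """Fixed entity: routes messages and holds the master certificates."""
    def __init__(self):
        self.proxies = {}
        self.certificates = {}

    def enrol(self, pzp):
        # Only the hub issues certificates; proxies just hold a copy
        cert = f"cert-for-{pzp.name}"
        self.certificates[pzp.name] = cert
        self.proxies[pzp.name] = pzp
        pzp.hub, pzp.certificate = self, cert

    def route(self, sender, target, message):
        self.proxies[target].inbox.append((sender, message))
        return "via-PZH"

class PersonalZoneProxy:
    """Runs on each device; can talk to local peers without the hub."""
    def __init__(self, name):
        self.name = name
        self.hub = None
        self.certificate = None
        self.inbox = []
        self.local_peers = {}

    def discover(self, peer):
        # Stands in for discovery over a local bearer such as Bluetooth
        self.local_peers[peer.name] = peer

    def send(self, target, message, hub_reachable=True):
        peer = self.local_peers.get(target)
        if peer is not None:
            peer.inbox.append((self.name, message))  # direct PZP-PZP
            return "direct"
        if hub_reachable and self.hub is not None:
            return self.hub.route(self.name, target, message)
        raise ConnectionError(f"no route to {target}")

hub = PersonalZoneHub()
phone, tv = PersonalZoneProxy("phone"), PersonalZoneProxy("tv")
hub.enrol(phone)
hub.enrol(tv)

first = phone.send("tv", "pause")                         # routed by the hub
phone.discover(tv)
second = phone.send("tv", "play", hub_reachable=False)    # no internet needed
print(first, second)
```

The design point is that the device keeps working offline: once a peer has been discovered locally, messages no longer depend on the hub being reachable.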

Thus, a webinos application has the following characteristics:

-  A webinos application runs “on device” (where that device could also be internet addressable i.e. a server).

-  A webinos application is packaged, as per packaging specifications, and executes within the WRT.

-  A webinos application has its access to security sensitive capabilities, mediated by the active policy.

-  A webinos application can expose some or all of its capability as a webinos service

-  An application developer is granted access to webinos capabilities via the webinos root JavaScript object.

An application developer programs and packages the application according to the webinos specification. They use the API to gain access to functionality. While much of the distributed capabilities of webinos are transparent to the developer, the developer is able to access functionality like discovery and service binding.

So, how will this all work for sensor based (zero screen) devices, especially in a smart cities ecosystem?

This is the problem I am trying to address:

Consider some use cases: In all cases, we are essentially considering a group of managed devices under some secure, distributed, private and managed data scenario (for example – the data is owned by the customer)

a)  Consider a standard Bluetooth heart rate monitor which could be worn by a customer. In this case, the data is stored on the mobile device and accessed by the customer. However, in a variant of this use case, the data could be transmitted to the physician. This is no different from wearing the more expensive heart rate monitors which doctors normally prescribe, except that the data could be transmitted to the doctor in close to real time. Here, the PZP would be on the mobile device and the PZH would be on the server or with the doctor. We could even conceive of a ‘managed third party service’ which specialises in handling data from multiple customers on their behalf and which the doctor can access. Such a managed service would need the security framework which webinos provides but would be far cheaper than existing medical alternatives, since it is based on inexpensive devices which can be hooked together.


b)  A second scenario could be based on an open energy monitor based device such as emonbase. It is based on the idea that customers own their own data and consequently could use that data to either negotiate or switch energy providers. Once again, you could have many devices within the home each running a PZP connected to a PZH which runs on a PC or a home gateway. The above principles apply for distributed and secure data management and also for a secure, third party managed service independent of the specific energy provider (in which case, the PZH is managed by the third party).

These are exploratory ideas and I am still thinking about them, hence they will evolve. Comments and feedback welcome.


Apps for smart cities event

I am happy to announce the Apps for Smart Cities event, to be held in Amsterdam at the end of March 2012. Here is some background on how this idea came about, and I would welcome your comments, speaking proposals etc. Please email me at ajit.jaokar at futuretext.com


A couple of weeks ago, I gave the keynote at PICNIC in Amsterdam where I spoke on the topic of ‘What makes a city smart’. My view was that the more a city behaves like the Internet, the ‘smarter’ it is. While this definition sounds very generic, it is relevant because the Internet is a platform and is thus an enabler of innovation. This innovation is created by the people. So, in discussion with Appsterdam, we proposed the idea of ‘Apps for Smart cities’, an event about grassroots innovation for Smart cities. Today, apps are a core component of the Mobile and also the Web ecosystem. So, most people are familiar with apps, either as developers or as users. When we extend the idea to ‘apps for smart cities’, we get the concept of apps which incorporate both hardware and software.

So, what does a Smart City look like?

I am also on the advisory board of the World Smart Capital program, which is modelled on the lines of the World Design Capital. The World Smart Capital has produced a Smart Cities manifesto at PICNIC.

They define a Smart city as:

A city can be defined as smart when investments in human and social capital and traditional (ex transport) and modern(ex ICT) communications infrastructure fuel sustainable economic development and a high quality of life with a wise management of natural resources through participatory governance


This is a very comprehensive definition. It goes beyond the traditional, IT-led emphasis on sensors and embedded systems.


The concept of a smart city seems to revolve around six areas:

- Smart mobility

-   Smart economy

-   Smart environment

-   Smart living

-   Smart people

-   Smart governance


Obviously, mobility plays an important role, especially since today's mobile phones incorporate multiple sensors. Finally, smart cities lead to a more participatory style of governance and to an emphasis on new challenges like privacy and security for citizens.


So, the Apps for smart cities event will focus on creating apps for the above including hardware and software and also mobile devices.

Already, we have some interesting supporters like Pachube and we are pleased to have them aboard. We are also speaking, through Appsterdam, with various other organizations. So, watch this space  :)

There is a deeper philosophy behind this:

-          The value of open hardware is in the empowerment of communities  which Chris Anderson has famously termed the next Industrial Revolution or ‘the long tail of things’.

-          The tools of factory production, from electronics assembly to 3-D printing, are now available to individuals, in batches as small as a single unit and “Hardware is becoming much more like software,” as MIT professor Eric von Hippel puts it.

-          As Chris Anderson says: “We’ve seen this picture before: It’s what happens just before monolithic industries fragment in the face of countless small entrants, from the music industry to newspapers. Lower the barriers to entry and the crowd pours in.” and “Thus the new industrial organizational model. It’s built around small pieces, loosely joined. Companies are small, virtual, and informal.”

These are the ideas we want to explore in the Apps for Smart Cities event.

We welcome your comments. Contact me at ajit.jaokar at futuretext.com. We are looking for speakers, sponsors and ideas.

So, what are the toolkits on my radar? (Please feel free to suggest more)

Apart from Arduino of course, plus Cisco, IBM and Vodafone, here are some more ideas on my radar:


A modular, open source system for building devices, from a well-known US-based open source hardware company which recently unveiled a plan to cooperate with Ford.

Funnel  http://gainer.cc/

Funnel is a toolkit to sketch your idea physically, and consists of software libraries and hardware. By using Funnel, the user can handle sensors and/or actuators with various programming languages such as ActionScript 3, Processing, and Ruby. In addition, the user can set filters on input or output ports: range division, filtering (e.g. LPF, HPF), scaling and oscillators. It is actually a redesigned Arduino platform.
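
For readers who have not met these filter types, here is a small Python sketch of what scaling, low-pass and high-pass filtering, and range division do to a stream of sensor readings. This is not Funnel's API: the function names and the choice of an exponential moving average as the low-pass filter are my own assumptions, made purely for illustration.

```python
def scale(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max]."""
    ratio = (value - in_min) / (in_max - in_min)
    return out_min + ratio * (out_max - out_min)


def low_pass(samples, alpha=0.5):
    """Exponential moving average: a simple first-order low-pass filter."""
    smoothed = []
    previous = None
    for sample in samples:
        if previous is None:
            previous = sample  # the first sample seeds the filter
        else:
            previous = alpha * sample + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def high_pass(samples, alpha=0.5):
    """Keep what the low-pass filter removes (input minus its smoothed version)."""
    return [s - l for s, l in zip(samples, low_pass(samples, alpha))]


def range_division(value, thresholds):
    """Return the index of the range that value falls into."""
    return sum(value >= t for t in sorted(thresholds))
```

For example, a raw 10-bit reading could first be passed through `scale` to land in the range 0 to 1, then through `range_division` with two thresholds to turn it into one of three discrete states (say low, medium and high).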

Gainer  http://gainer.cc/

Gainer is an environment for user interfaces and media installations. By using the Gainer environment, the user can handle sensors and/or actuators with a PC on various programming environments such as Flash, Max/MSP, Processing and so on.

Make controller   http://www.makingthings.com/resources/downloads/

The Make Controller 2.0 & Interface Board Kit includes the Make Controller Version 2.0 and the new Interface Board that makes adding sensors and motors easier than ever! Also available with the Application Board. The Make Controller is built around the AT91SAM7X256, and adds the essential components (like the crystal, voltage regulator, filter capacitors, etc.) required to run it, while bringing almost all the processor’s signal lines out to standard 0.1″ spaced sockets.

Wiring   http://wiring.org.co/

Wiring is an open source programming environment and electronics i/o board for exploring the electronic arts, tangible media, teaching and learning computer programming and prototyping with electronics. It illustrates the concept of programming with electronics and the physical realm of hardware control which are necessary to explore physical interaction design and tangible

Sun SPOTs   https://spots-hardware.dev.java.net/

Project Sun SPOT was created to encourage the development of new applications and devices. It is designed from the ground up to allow programmers who never before worked with embedded devices to think beyond the keyboard, mouse and screen and write programs that interact with each other, the environment and their users in completely new ways. A Java programmer can use standard Java development tools such as NetBeans to write code.

Pinguino    http://pinguino.cc/

Pinguino is an Arduino-like prototyping platform based on 8-bit or powerful 32-bit ©Microchip PIC Microcontrollers with built-in USB module (no FTDI chip).

Speaking at World smart capital – Amsterdam – What makes a city smart ..

I will post details of my talk and the slides here soon.

Book review – Internet of Things – Global Technological and Societal Trends Smart Environments and Spaces to Green

Book review: Internet of Things – Global Technological and Societal Trends Smart Environments and Spaces to Green ICT by Ovidiu Vermesan and Peter Friess

I have an interest in the Internet of things both from a business perspective and from a PhD / research perspective. I have covered IOT before, for example in ‘How would the Internet of things look like if it were driven by NFC (vs RFID)’. Hence, I was interested in the book and the publishers kindly sent me a complimentary copy for review.

At around 95 euros, the book is clearly a reference book and I asked about the pricing / positioning of the book. The book is a collection of papers specifically written for the publication by various experts. In that sense, the papers are not available elsewhere (for example on Google Scholar) as I understand it. The editors, who are clearly well known in this space, have thus created a collection of papers on IOT with a specific perspective.

So, with that in mind, here are my comments

The book is a collection of papers each focussed on specific themes:

Chapters 1, 2 and 3 focus on the IOT vision in Europe. IOT has a lot of visibility at the European Commission and in FP projects, but EU documents are often complex and hard to read. Hence, these three chapters provide a good view of EU priorities, themes and research clusters.

Chapter 4 is from Dr Alessandro Bassi of the iot-A project. This project is an ambitious attempt to create a reference architecture for IOT but the chapter itself is quite high level

Chapter 5 is from a good friend Rob van Kranenburg and as usual Rob takes a visionary, socio economic perspective of IOT and does a good job

Chapter 6 is from Prof Ken Sakamura, who is one of the best known experts in this space. He provides a Japanese / uID perspective.

Chapter 7 covers technologies, applications and governance in the Internet of things. This chapter covers technologies in detail and is written by Chinese authors, which makes it even more interesting for me since IOT has a lot of emphasis in China.

Chapter 8 discusses IOT from a perspective of mobile. This could be a whole book! But the chapter is very interesting albeit limited by the structure of one chapter.

Chapter 9. Opportunities and challenges for IOT technologies is a long chapter about technologies and future challenges like security, privacy etc. Again, this could be a whole book!

Chapter 10 is about IOT and network virtualization, written by authors from ETRI in Korea. ETRI does some very cutting edge work, so it is insightful.

Chapter 11 is about interoperability, standardization and governance of IOT and chapter 12 is about IPv6, IOT and M2M.

My analysis:

This book is an excellent reference book and its core strength lies in providing an ‘on ramp’ for IOT and in offering multiple perspectives. IOT is complex and will develop differently in various geographies (for example China and the EU). Each topic can be explored in detail but it's nice to have a quick starting point for sectors (anyone who has seen IOT FP7 projects will agree that there is often too much documentation rather than too little!).

Thus, there is a lot of value which the book brings

My only suggestion would be that the editors could have provided more editorial commentary across the papers, for example their view on China, Japan etc. Since each of the authors is well respected, readers get value from the specific chapters, but there could be more that ties the chapters together.

Also, I could not find any emphasis on ‘Green ICT’ although the title suggests that. In any case, if you have a commercial/ research interest in this space, I would recommend it.

The publishers link is: River publishers – Internet of Things – Global Technological and Societal Trends Smart Environments and Spaces to Green ICT

How would the Internet of things look like if it were driven by NFC (vs RFID)

As NFC gathers momentum in Europe and North America, I have been thinking of yet another gedankenexperiment:

How would the industry shape up if the Internet of things were driven by NFC?

To understand this, we have to break down the concepts.

Internet of things

Firstly, Internet of things is a concept driven largely by academia so far.

There are several partially overlapping definitions: (source Wikipedia)

Casagras [5]: “A global network infrastructure, linking physical and virtual objects through the exploitation of data capture and communication capabilities. This infrastructure includes existing and evolving Internet and network developments. It will offer specific object-identification, sensor and connection capability as the basis for the development of independent cooperative services and applications. These will be characterised by a high degree of autonomous data capture, event transfer, network connectivity and interoperability.”

SAP [6]: “A world where physical objects are seamlessly integrated into the information network, and where the physical objects can become active participants in business processes. Services are available to interact with these ‘smart objects’ over the Internet, query and change their state and any information associated with them, taking into account security and privacy issues.”

ETP EPOSS [7]: “The network formed by things/objects having identities, virtual personalities operating in smart spaces using intelligent interfaces to connect and communicate with the users, social and environmental contexts.”

CERP-IoT [8]: “Internet of Things (IoT) is an integrated part of Future Internet and could be defined as a dynamic global network infrastructure with self configuring capabilities based on standard and interoperable communication protocols where physical and virtual ‘things’ have identities, physical attributes, and virtual personalities and use intelligent interfaces, and are seamlessly integrated into the information network. In the IoT, ‘things’ are expected to become active participants in business, information and social processes where they are enabled to interact and communicate among themselves and with the environment by exchanging data and information ‘sensed’ about the environment, while reacting autonomously to the ‘real/physical world’ events and influencing it by running processes that trigger actions and create services with or without direct human intervention. Interfaces in the form of services facilitate interactions with these ‘smart things’ over the Internet, query and change their state and any information associated with them, taking into account security and privacy issues.”

Other [9]: “The future Internet of Things links uniquely identifiable things to their virtual representations in the Internet containing or linking to additional information on their identity, status, location or any other business, social or privately relevant information at a financial or non-financial pay-off that exceeds the efforts of information provisioning and offers information access to non-predefined participants. The provided accurate and appropriate information may be accessed in the right quantity and condition, at the right time and place at the right price. The Internet of Things is not synonymous with ubiquitous / pervasive computing, the Internet Protocol (IP), communication technology, embedded devices, its applications, the Internet of People or the Intranet / Extranet of Things, yet it combines aspects and technologies of all of these approaches.”

If we identify the common elements for IOT then:

1)      Objects should be uniquely identified

2)      They should be network enabled and hence objects can be queried and activated remotely

3)      Services enabled through such ‘smart objects’ will be co-operative
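
These three common elements can be sketched in a few lines of Python. Everything here is hypothetical and illustrative only: the `SmartObject` and `Registry` classes, the in-memory dictionary standing in for the network, and the thermostat/heater example are assumptions of mine, not part of any of the definitions quoted above.

```python
import uuid


class SmartObject:
    """An object with a unique identity whose state can be read and changed."""

    def __init__(self, kind, state=None):
        self.object_id = str(uuid.uuid4())  # 1) uniquely identified
        self.kind = kind
        self.state = dict(state or {})

    def query(self):
        return dict(self.state)  # return a copy so callers cannot mutate state

    def activate(self, **changes):
        self.state.update(changes)
        return self.query()


class Registry:
    """Hypothetical lookup service standing in for network reachability."""

    def __init__(self):
        self._objects = {}

    def register(self, obj):
        self._objects[obj.object_id] = obj
        return obj.object_id

    def query(self, object_id):  # 2) queried remotely, by id
        return self._objects[object_id].query()

    def activate(self, object_id, **changes):  # 2) activated remotely, by id
        return self._objects[object_id].activate(**changes)


def cooperate(registry, sensor_id, actuator_id, threshold):
    """3) A co-operative service: one object's reading drives another object."""
    reading = registry.query(sensor_id)["temperature"]
    return registry.activate(actuator_id, on=reading < threshold)
```

Registering a thermostat with `{"temperature": 17}` and a heater with `{"on": False}`, then calling `cooperate` with a threshold of 19, switches the heater on without the two objects knowing anything about each other beyond their identifiers.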

In addition, some other notes for IOT

1)     The original idea of the Auto-ID Center is based on RFID tags and unique identification through the Electronic Product Code. So, IOT is tied to the idea of RFID/barcodes

2)     IOT is different from ambient intelligence / pervasive computing / ubiquitous computing, which are ideas designed such that machines modify their behaviour to fit into the environment instead of forcing humans to change their behaviour.

3)     There is an alternate view of IOT which is fulfilled by making objects web addressable. That means the object has an agent in the cloud, and objects can communicate in the cloud without directly communicating with each other. IPv6 has a role to play in this space, i.e. if objects become internet addressable

4)     IOT systems will be event driven and complex (i.e. not deterministic)

5)     But the most important consideration for IOT is the scale: IOT aims for trillions of objects which will lead to billions of parallel and simultaneous interactions requiring massively parallel systems

The uptake of NFC

The original concept for IOT came from the RFID ecosystem. NFC could be seen as a subset of IOT. NFC is compatible with RFID and the main difference is the range. Also, RFID started with supply chain, asset tracking etc and NFC with transportation. So far, RFID has not become ubiquitous as a technology, but NFC is on the verge of a major uptake in Europe and North America. NFC has applications in access control, consumer electronics, healthcare, information exchange, coupons, payments and transportation. Thus, at an application level, NFC and RFID are comparable.

The uptake of NFC in EU and North America is driven by various factors:

a)     Three different constituencies are driving NFC: credit cards (Visa), telecoms (SIM) and Web (Google Wallet, PayPal)

b)    NFC will show an initial uptake through interactions(informational type requests) and a portion of these could be transactions

Analysis for IOT

As we have seen before, the various definitions of IOT have some common elements. But let us imagine what IOT would look like if NFC were the driving technology

The key requirement to fulfil the true potential of IOT is scale. Now, if NFC takes off, then most of the requirements for IOT could be fulfilled except the scale of interactions. This means that the more emergent/complex services for IOT may not emerge (at least initially) with NFC, but NFC will still be useful.

In addition:

a)     If mobile devices take up NFC, then we are likely to see more A2P (application to person, e.g. payment) rather than person to person services. This is good because it provides an initial use case and then, as more devices and objects become NFC enabled, more complex use cases will emerge, leading to network effects

b)    Hence, the larger scale vision of IOT will not be realised unless you achieve large scale standardization and interoperability. In the West, I do not see governments attempting this level of standardization. This makes NFC very significant, because much of the promise of IOT will be achieved through NFC, but without the scale

c)     Japan, South Korea, Singapore and of course China could achieve standardization in their respective countries. That could achieve scale / the IOT vision within their local geographies

d)      China is different since it is a large scale market in addition to being a creator of technology. So, internally within China, a lot could be achieved which will add value, especially considering the emphasis in China based on the Chinese premier Wen Jiabao’s vision that: Internet + Internet of Things = Wisdom of the Earth.

e)      Can China influence standards? This is a more complex and perhaps a non-technological question. But the observation I make is that the rate of uptake of NFC will mean that a parallel ecosystem based on NFC will develop in the West, so influencing standards on a global basis may not be so relevant as a competitive advantage.

Conclusion :

I suspect that NFC will achieve many of the goals of IOT, but not at scale. We may, however, see scale in specific geographies where governments can influence standards and achieve interoperability. We saw the same with Korea and Japan for mobile ecosystems: both achieved high mobile growth within their respective geographies but could not translate it into global uptake.

I also find the alternative view of IOT (that of making objects web addressable) interesting, especially when tied to the Cloud.

In any case, I love studying ecosystems and IOT will be very interesting ..

Connected Home Global Summit

Connected Home Global Summit is another interesting event which I am tracking.

Verizon Wireless, Comcast, Vodafone, BBC, Telecom Italia, Cable and Wireless and Virgin Mobile are some of the 20 operators, broadcasters and content providers confirmed to speak at the 2nd annual Connected Home Global Summit, 24 – 26 May, London, UK.

Focused on the technology choices and business models that will monetize the Connected Home, the event is co-located with the inaugural Connected Home Global Industry Awards, which celebrate excellence and innovation in the burgeoning marketplace.

From the agenda, I find these themes interesting

- A View of the Technological Roadmap and Services Enabled by a Broadband Connected Home
- DLNA – The Platform Enabling Seamless and Interoperable Environment for Sharing Multimedia In and Around the Connected Home
- Multi-room/multi-device Content: How operators can keep ownership of the connected home? (I smile :)
- Case Study: Delivering a 3 Screen Strategy
- Bouygues Telecom Case Study: Connected Home Services Delivered Through a Box
- Enabling Secure and Guaranteed Delivery to CE Devices in the Home
- Exploring Net Neutrality – The Equality of Internet Traffic
- Applications and Services – Monetising the Connected Home Eco-System
- Assessing how Content and Services will Drive the Connected Home
- Unpicking the Value Chain: Who Owns What, and Who Pays Who For It?
- The Connected Home as a Brokerage Platform for Broadband Operators – Ingredients and Recipes
- Understanding the Extent to which Usability is a Key Differentiator for Connected Home Services
- The Impact Content Delivery Networks will have on the Connected Home
- Orange Case Study: Social TV, Taking TV Beyond the Television Screen
- In Search of Seamless Connectivity & Content Sharing: Assessing the Use of Tablets as Universal Remote Controls of the Connected Home
- Tablet Mania – Which Devices will Drive the Connected Home?
- How can “There” be Made Part of “Here”? – Exploring the Trends and Opportunities around Open TV Platforms
- Getting the Best out of Combining Mobile and TV Applications
- Smart TVs: Opportunities for pay-TV operators
- From Single Connected Devices to a Central Point of Access and Control
- The Future Connected Home – Focus on the Users, and All Else Will Follow
- Future Technologies – What’s Next? Exploring the Future of Connected TVs, 3D and Beyond
- Opening the Home Gateway to the Outside World
- Content Everywhere – Value Added Services for the Connected Consumer

more at
Connected Home Global Summit

Does net neutrality apply to machine to machine communications?

I am exploring this for a talk at LTE summit in Amsterdam next week

In a nutshell

Net neutrality = all packets are created (commercially) equal

It matters because networks are about communication and innovation shifts to the edge of the network

However, this innovation is driven by people(who are at the edge of the network)

So, in an M2M (vs a P2P) scenario, there are no ‘people at the edge’ to innovate

Furthermore, M2M network traffic is predictable

Machines don’t want to talk more than they are designed to

When they do communicate, they often need network level QoS

So, does net neutrality apply to M2M networks?


Image source: Telecoms.com