IoT and the Rise of the Predictive Organization


In The Godfather Part II, Hyman Roth says to Michael Corleone:

             “Michael – we are bigger than US Steel.”

Over the holiday season, I said the same to my friend Jeremy Geelan when comparing the Mobile industry to the IoT.

The term Internet of Things was coined by the British technologist Kevin Ashton in 1999 to describe a system in which the Internet is connected to the physical world via ubiquitous sensors. After languishing in the depths of academia (at least here in Europe …), IoT had its Netscape moment early in 2014 when Google acquired Nest.

Mobile is huge and has dominated the Tech landscape for the last decade.

But the Internet of Things (IoT) will be bigger.

How big?

Here are some numbers. Source: adapted from David Wood’s blog.

By 2020, we are expected to have 50 billion connected devices.

To put that in context:

  • The first commercial citywide cellular network was launched in Japan by NTT in 1979.
  • The milestone of 1 billion mobile phone connections was reached in 2002.
  • The 2 billion mobile phone connections milestone was reached in 2005.
  • The 3 billion mobile phone connections milestone was reached in 2007.
  • The 4 billion mobile phone connections milestone was reached in February 2009.
  • We reached 7.2 billion active mobile connections in 2014.

So, 50 billion connections by 2020 is a massive number, and no one doubts that figure any more.

But IoT is much more than the number of connections – it’s all about the Data and the intelligence that can be gleaned from the Data.

As more objects are becoming embedded with sensors and gain the ability to communicate, new business models emerge.

IoT also creates new pathways for information to travel – especially across an Organization’s boundary, across its value chain and in engaging with its customers.

This Data – and the Intelligence gleaned from it – will fundamentally transform organizations, creating a new kind of ‘Predictive Organization’ which has Predictive analytics / Machine Learning at its core, i.e. algorithms that learn from experience.

Machine learning is the study of algorithms and systems that improve their performance with experience. There are broadly two ways for algorithms to learn: supervised learning (where the algorithm is trained in advance using labelled data sets) and unsupervised learning (with no prior training – for example, methods like Clustering).
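To make the distinction concrete, here is a minimal sketch (assuming scikit-learn and NumPy are available; the ‘sensor reading’ numbers are invented purely for illustration) of the same toy data handled by a supervised classifier and by an unsupervised clustering algorithm:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB   # supervised: learns from labelled examples
from sklearn.cluster import KMeans           # unsupervised: finds structure without labels

rng = np.random.default_rng(0)
# Invented two-feature 'sensor readings': 50 normal and 50 overheating samples
readings = np.vstack([rng.normal(20, 1, (50, 2)), rng.normal(35, 1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)       # labels exist only in the supervised case

# Supervised learning: train on labelled data, then classify a new reading
clf = GaussianNB().fit(readings, labels)
print(clf.predict([[34.5, 35.2]]))           # -> [1] (overheating)

# Unsupervised learning: no labels; the algorithm groups similar readings itself
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(readings)
print(km.labels_[:5])
```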

Machine Learning algorithms take billions of Data points as inputs and extract actionable insights from the data. So, the Predictive Organization starts with the prediction process and then creates a feedback loop through measuring and managing. Crucially, this takes place across the boundary of the Enterprise.

I believe there are twelve unique characteristics of IoT based Predictive analytics/machine learning:

1)     Time Series Data: Processing sensor data (see the sketch after this list).

2)     Beyond sensing: Using Data for improving lives and businesses.

3)     Managing IoT Data.

4)     The Predictive Organization: Rethinking the edges of the Enterprise: Supply Chain and CRM impact

5)     Decisions at the ‘Edge’

6)     Real time processing.

7)     Cognitive computing – Image processing and beyond.

8)     Managing Massive Geographic scale.

9)     Cloud and Virtualization.

10)  Integration with Hardware.

11)  Rethinking existing Machine Learning Algorithms  for the IoT world.

12)  Correlating IoT data to social data – the Datalogix model for IoT.
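As a small illustration of characteristic 1, here is a minimal sketch – assuming pandas is available and using invented temperature readings – of smoothing a stream of sensor data and flagging readings that deviate sharply from the recent trend:

```python
import pandas as pd

# Invented temperature readings, one per minute, with a single spike at 09:04
readings = pd.Series(
    [20.1, 20.3, 20.2, 20.4, 35.0, 20.5, 20.6],
    index=pd.date_range("2015-01-05 09:00", periods=7, freq="min"),
)

# A rolling mean smooths the stream; a simple threshold flags anomalous readings
rolling_mean = readings.rolling(window=3, min_periods=1).mean()
anomalies = readings[(readings - rolling_mean).abs() > 5]
print(anomalies)   # -> the 35.0 spike at 09:04
```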

Indeed, one could argue that IoT leads to the creation of new types of organization – for instance, organizations based on the sharing economy, built by converging the digital and the physical world.

I will be launching a newsletter starting in Jan 2015 to cover these ideas in detail.

You can sign up for the newsletter at futuretext IoT Machine Learning – Predictive Analytics – newsletter

I will also be launching a course/certification for “Data Science in IoT” at Oxford, London and San Francisco - email me at ajit.jaokar at futuretext.com if you want to know more

Image source: Wikipedia

IoT Machine Learning – Predictive Analytics – newsletter

Greetings!

In January, I am launching a newsletter focusing on IoT, Machine Learning and Predictive analytics.

This is a key, complex domain which I believe will be very significant going forward.

Some of the themes I will cover are:

  • Time series data from sensors
  • Real time analytics
  • Streaming
  • In-memory databases, etc.
  • Startup business models

The question is:

What other topics should I include, considering the niche theme, i.e. IoT and Machine Learning / Predictive Analytics?

You can sign up on www.futuretext.com or email me at [email protected] or respond in Twitter @ajitjaokar

Image source: http://ml.cmu.edu/

ForumOxford: Internet of Things Conference 2015 listed among 40 most important #IoT events to attend this year ..

What a nice way to end the year ..

Jeremy Geelan, who created the list of the top 40 Internet of Things Conferences to attend in 2015, has added the forumoxford : 2015 Internet of Things conference to that list.

Date: 6 November, 2014

Venue: Rewley House, University of Oxford
URL: forumoxford : 2015 Internet of Things conference

Co-chaired by me and Tomi Ahonen, and now in its 10th year. Mark the dates!

The full list again: list of the top 40 Internet of Things Conferences to attend in 2015.

 

Infographic – The evolution of wireless networks

PS
I get many such requests to post infographics, but this one is good and comes from a reliable source (New Jersey Institute of Technology – Online Master of Science in Electrical Engineering).

Infographic – The evolution of wireless networks

New Jersey Institute of Technology’s Online Master of Science in Electrical Engineering

Space Clouds: Turtles in Space – Learning to Code

Here is something I have been thinking about as part of the Countdown Institute.

The Countdown Institute teaches young people aged 10 to 16 programming skills using Space exploration.

I have been a fan of Seymour Papert’s Turtles based on my work at feynlabs.

Turtles in Python (Python Turtles) and in general (Turtle Graphics) are a great way of learning to code.

Object Oriented paradigms (like Turtles) are an easy way to start learning Programming (as opposed to Procedural Paradigms) because they help to tie back to the problem / context easily. The Turtles concept also downplays the more complex aspects of OO programming such as Inheritance and Polymorphism.
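As a small illustration – using Python’s standard-library turtle module, which needs a graphical display to run – the learner works with a single object whose data and behaviour map directly onto what appears on screen:

```python
import turtle

pen = turtle.Turtle()       # one object: its data (position, heading) is visible on screen
for _ in range(4):          # its behaviour (forward, left) drives the drawing of a square
    pen.forward(100)
    pen.left(90)
turtle.done()
```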

Countdown enables young people to learn coding by solving problems in a specific context – in this case, Space exploration.

But we need a simple and consistent way to model problems. Space Clouds is a data/modelling layer which relates Space exploration to the code written within that context. We can think of the Space Cloud as a unifying Data layer of software objects/classes. It is a consistent way of modelling a problem and getting kids to code.

From a programmatic standpoint, we have varying space objects (Satellites, Drones, Planets, Space missions etc.).

Like an Object (such as a Turtle), each of these is an Object with behaviour and data.

Each lesson starts with describing (modelling) the objects involved in the ‘world’ – for example, in a high altitude balloon mission, the jet stream could be defined as part of the space cloud.

This is a very easy paradigm for a child to understand, i.e. I switch on a device and the ‘sky lights up’, so to speak.

Depending on the problem, the Objects could be Planets, Satellites or missions (Orion, Rosetta) – a hypothetical sketch follows below.
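As a sketch of what such a model could look like in Python – the class name, attributes and methods below are purely illustrative assumptions, not part of any existing Countdown or Space Clouds library – a space object carries data and behaviour just as a Turtle does:

```python
class Satellite:
    """Hypothetical Space Cloud object: data plus behaviour, Turtle-style."""

    def __init__(self, name, altitude_km):
        self.name = name
        self.altitude_km = altitude_km    # data
        self.longitude_deg = 0.0          # data: position along the orbit

    def advance(self, degrees):
        """Behaviour: move the satellite along its orbit."""
        self.longitude_deg = (self.longitude_deg + degrees) % 360


sat = Satellite("DemoSat", altitude_km=400)
sat.advance(90)
print(sat.name, sat.longitude_deg)        # -> DemoSat 90.0
```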

Space Clouds is a simple, context-specific modelling language for Space exploration, created with the goal of teaching young people to code. Space Clouds is Programming Language agnostic. Current modelling languages like UML are designed for modelling entire systems and are not really suited to learning to code.

The idea of Space Clouds can be thought of as ‘Turtles in Space’.

A recent blog on learning to code said that no-fuss setups and task-oriented tools are key features for getting more kids to code.

Space Clouds takes a similar approach by simplifying (limiting) input in the early stages and connecting to a specific context.

Image source: Valiant Turtle – Wikipedia


Implementing Tim Berners-Lee’s vision of Rich Data vs. Big Data


INTRODUCTION:

In a previous blog post (Magna Carta for the Web), I discussed the potential of Tim Berners-Lee’s vision of Rich Data.

When I met Tim at the EIF event in Brussels, I asked him about the vision of Rich Data. I also thought more about how this vision could actually be implemented from a Predictive/Machine learning standpoint.

To recap the vision from the previous post:

So what is Rich Data? It’s Data (and Algorithms) that would empower the individual. According to Tim Berners-Lee: “If a computer collated data from your doctor, your credit card company, your smart home, your social networks, and so on, it could get a real overview of your life.” Berners-Lee was visibly enthusiastic about the potential applications of that knowledge, from living more healthily to picking better Christmas presents for his nephews and nieces. This, he said, would be “rich data”. (Motherboard)

This blog explores a possible way this idea could be implemented. I hope I can implement it, perhaps as part of an Open Data Institute incubated start-up.

To summarize my view here:

The world of Big Data needs to maintain large amounts of Data because the past is used to predict the future. This is needed because we do not voluntarily share data and Intent. Here, I propose that to engender Trust, both the Algorithms and the ‘training’ should be transparent – which leads to greater Trust and greater sharing. This in turn means we do not need to hold large amounts of Data (Big Data) to determine Predictions (Intents). Instead, Intents will be known (shared voluntarily) by people at the point of need. This would create a world of Rich Data – where the Intent is determined algorithmically using smaller data sets (and without the need to maintain a large amount of historical data).

BACKGROUND AND CHALLENGES:

Thus, to break it down further, here are some more thoughts:

a)      Big Data vs. Rich Data: To gain insights from data, we currently collect all the data we can lay our hands on (Big Data). In contrast, for Rich Data, instead of collecting all data in one place in advance, you need access to many small data sets for a given person and situation. But crucially, this ‘linking of datasets’ should happen at the point of need and dynamically. For example: Personal profile, Contextual information and risk profile – for a person who is at risk of Diabetes or a Stroke, linked only at the point of a medical emergency (vs. gathered in advance).

b)      Context already exists: Much of this information exists already. The mobile industry has done a great job of capturing contextual information accurately – for example location – and tying it to content (Geo-tagged images).

c)       The ‘segment of one’ idea has been tried in many variants: Segmenting has been tried – with some success – in Retail (The future of Retail is segment of One), in a BCG perspective paper (Segment of One marketing – pdf) and in Inc magazine (Audience segmenting – targeting your customers). Segmentation is already possible.

d)      Intents are not linked to context: The feedback loop is not complete because, currently, while context exists, it is not tied to Intent. Most people do not trust advertisers and others with their intent.

e)      Intents (Predictions) are based on the past: Because we do not trust providers with Intent, Intent is gleaned through Big Data. Intents are related to Predictions. Predictions are based on a large number of historical observations, either of the individual or of related individuals. To create accurate predictions in this way, we need large amounts of centralized data and any other forms of Data. That’s the Big Data world we live in.

f)       IoT: IoT will not solve the problem. It will create an order of magnitude more contextual information – but providers will not be trusted and datasets will not be shared. And we will continue to create larger datasets with bigger volumes.

CREATING A TRUST FRAMEWORK FOR SHARING DATA AT AN ALGORITHMIC LEVEL

To recap:

a)      To gain insights from data, we currently collect all the data we can lay our hands on. This is the world of Big Data.

b)      We take this approach because we do not know the Intent.

c)       Rather, we (as people) do not trust providers with Intent.

d)      Hence, in the world of Big Data, we need a lot of Data. In contrast, for Rich Data, instead of collecting all data in one place in advance, you need access to many small data sets for a given person and situation. But crucially, this ‘linking of datasets’ should happen at the point of need and dynamically. For example: Personal profile, Contextual information and risk profile – for a person who is at risk of Diabetes or a Stroke, linked only at the point of a medical emergency (vs. gathered in advance).

 

From an algorithmic standpoint, the overall objective is: to determine the maximum likelihood of sharing under a Trust framework. Given a set of trust frameworks and a set of personas (for example, a person with a propensity for a stroke), we want to know the probability of sharing information, and under which trust framework.

  • We need a small number of observations for an individual
  • We need an inbuilt trust framework for sharing
  • We need the Calibration of Trust to be ‘people driven’ and not provider driven

POSSIBLE ALGORITHMIC APPROACH

A possible way to implement the above could be through a Naive Bayes Classifier.

  • In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.
  • Workings: Let {f1, …, fm} be a predefined set of m features. A classifier is a function f that maps input feature vectors x ∈ X to output class labels y ∈ {1, …, C}, where X is the feature space. Our goal is to learn f from a labelled training set of N input–output pairs (xn, yn), n = 1 : N; this is an example of supervised learning, i.e. the algorithm has to be trained. (The resulting decision rule is written out after this list.)
  • An advantage of Naive Bayes is that it only requires a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification.
  • This represents the basics of Naive Bayes. Tom Mitchell, in a Carnegie Mellon paper, says: “A hundred independently drawn training examples will usually suffice to obtain a maximum likelihood estimate of P(Y) that is within a few percent of its correct value when Y is a Boolean variable. However, accurately estimating P(X|Y) typically requires many more examples.”
  • In addition, we need to consider feature selection and dimensionality reduction. Feature selection is the process of selecting a subset of relevant features for use in model construction. Feature selection is different from dimensionality reduction: both methods seek to reduce the number of attributes in the dataset, but dimensionality reduction methods do so by creating new combinations of attributes, whereas feature selection methods include and exclude attributes present in the data without changing them. Examples of dimensionality reduction methods include Principal Component Analysis (PCA).
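For reference, in the notation above, the ‘naive’ independence assumption yields the standard Naive Bayes decision rule (a well-known result, stated here for completeness):

```latex
\hat{y} = \arg\max_{y \in \{1,\dots,C\}} \; P(y)\prod_{j=1}^{m} P(x_j \mid y)
```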

IMPLEMENTATION

  • Thus, a combination of Naive Bayes and PCA may be a start to implementing Rich Data: Naive Bayes needs a relatively small amount of data, and PCA will reduce dimensionality (a minimal sketch follows after this list).
  • How to incorporate Trust? The next question is: how to incorporate Trust? Based on the above, Trust becomes a feature (an input vector) to the algorithm with an appropriate weightage. The output is then the probability of sharing under a Trust framework for a given persona.
  • Who calibrates the Trust? A related and bigger question is: how to calibrate Trust within the Algorithm? This is indeed the Holy Grail and underpins the foundation of the approach. Prediction in research has grown exponentially due to the availability of Data – but Predictive science is not perfect (good paper: The Good, the Bad, and the Ugly of Predictive). Predictive Algorithms gain their intelligence in two ways: Supervised learning (like Naive Bayes, where the algorithm learns through training Data) or Unsupervised learning, where the algorithm tries to find hidden structure in unlabelled data.
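As a minimal sketch of the first bullet above – assuming scikit-learn, with invented data and an invented feature layout in which the last column encodes a trust-framework score – PCA reduces dimensionality before a Naive Bayes classifier outputs the probability of sharing:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Invented training data: each row is [persona/context features ..., trust_framework_score]
X = rng.normal(size=(200, 10))
y = (X[:, -1] + 0.5 * X[:, 0] > 0).astype(int)   # 1 = shared data, 0 = did not share

# PCA reduces dimensionality; Gaussian Naive Bayes then classifies
model = make_pipeline(PCA(n_components=4), GaussianNB()).fit(X, y)

new_person = rng.normal(size=(1, 10))            # a new persona under a given trust framework
print(model.predict_proba(new_person))           # [P(does not share), P(shares)]
```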

 

So, if we have to calibrate trust for a Supervised learning algorithm, the workings must be open and the trust (propensity to share) must be learned from the personas themselves – e.g. people at risk of a stroke, the elderly, etc. Such an Open algorithm, which learns from the people and whose workings are transparent, will engender trust. It will in turn lead to greater sharing – and to a different type of predictive algorithm which will need smaller amounts of historical data, but will track a larger number of Data streams to determine value at their intersection. This in turn will complete the feedback loop and tie intent to context.

Finally, I do not propose that a specific algorithm (such as Naive Bayes) is the answer – rather, I propose that both the Algorithms and the ‘training’ should be transparent, which leads to greater Trust and greater sharing. This in turn means we do not need to hold large amounts of Data (Big Data) to determine Predictions (Intents). Instead, Intents will be known (shared voluntarily) by people at the point of need. This would create a world of Rich Data – where the Intent is determined algorithmically using smaller data sets (and without the need to maintain a large amount of historical data).

Comments welcome – at ajit.jaokar at futuretext.com