IoT and the Rise of the Predictive Organization












I will be launching a newsletter starting in Jan 2015 to cover these ideas in detail.

You can sign up for the newsletter at futuretext IoT Machine Learning – Predictive Analytics – newsletter

I will also be launching a course/certification for “Data Science in IoT” at Oxford, London and San Francisco – email me at ajit.jaokar at if you want to know more


In The Godfather Part II, Hyman Roth said to Michael Corleone:

             “Michael – we are bigger than US Steel”.

Over the holiday season,  I said this to my friend Jeremy Geelan when I was comparing the Mobile industry to the IoT.

The term Internet of Things was coined by the British technologist Kevin Ashton in 1999, to describe a system where the Internet is connected to the physical world via ubiquitous sensors. After languishing in the depths of academia (at least here in Europe …), IoT had its Netscape moment early in 2014 when Google acquired Nest.

Mobile is huge and has dominated the Tech landscape for the last decade.

But the Internet of Things (IoT) will be bigger.

How big?

Here are some numbers. Source: adapted from David Wood’s blog.

By 2020, we are expected to have 50 billion connected devices

To put in context:

  • The first commercial citywide cellular network was launched in Japan by NTT in 1979.
  • The milestone of 1 billion mobile phone connections was reached in 2002.
  • The 2 billion mobile phone connections milestone was reached in 2005.
  • The 3 billion mobile phone connections milestone was reached in 2007.
  • The 4 billion mobile phone connections milestone was reached in February 2009.
  • We reached 7.2 billion active mobile connections in 2014.

So, 50 billion by 2020 is roughly seven times the number of mobile connections we had in 2014, and no one doubts that number any more.

But IoT is much more than the number of connections – it’s all about the Data and the intelligence that can be gleaned from the Data.

As more objects are becoming embedded with sensors and gain the ability to communicate, new business models emerge.

IoT also creates new pathways for information to travel – especially across an organization’s boundary, across its value chain and in engaging with its customers.

This Data, and the Intelligence gleaned from it, will fundamentally transform organizations, creating a new kind of ‘Predictive Organization’ which has Predictive analytics / Machine Learning at its core, i.e. algorithms that learn from experience.

Machine learning is the study of algorithms and systems that improve their performance with experience. There are broadly two ways for algorithms to learn: Supervised learning (where the algorithm is trained in advance using labelled data sets) and Unsupervised learning (with no prior training, for example with methods like Clustering).

Machine Learning algorithms take billions of Data points as inputs and extract actionable insights from the data. So, the Predictive Organization starts with the prediction process and then creates a feedback loop through measuring and managing. Crucially, this takes place across the boundary of the Enterprise.

I believe there are twelve unique characteristics of IoT based Predictive analytics / Machine Learning:

1)     Time Series Data: Processing sensor data.

2)     Beyond sensing: Using Data for improving lives and businesses.

3)     Managing IoT Data.

4)     The Predictive Organization: Rethinking the edges of the Enterprise: Supply Chain and CRM impact

5)     Decisions at the ‘Edge’

6)     Real time processing.

7)     Cognitive computing – Image processing and beyond.

8)     Managing Massive Geographic scale.

9)     Cloud and Virtualization.

10)  Integration with Hardware.

11)  Rethinking existing Machine Learning Algorithms  for the IoT world.

12)  Correlating IoT data to social data – the Datalogix model for IoT

Indeed, one could argue that IoT leads to the creation of new types of organization, for instance organizations based on the sharing economy and on the convergence of the digital and the physical world.


Image source: wikipedia

IoT Machine Learning – Predictive Analytics – newsletter





In January,  I am launching a newsletter focusing on IoT, Machine Learning and Predictive analytics

This is a key, complex domain which I believe will be very significant going forward

some of the themes I will cover are:

Time series data from sensors 

Real time analytics


In memory databases etc

Startup business models

The question is:

What other topics should I include, considering the niche theme, i.e. IoT and Machine Learning / Predictive Analytics?

You can sign up on or email me at or respond in Twitter @ajitjaokar



ForumOxford: Internet of Things Conference 2015 listed among 40 most important #IoT events to attend this year ..

What a nice way to end the year ..

Jeremy Geelan, who created a list of the top 40 Internet of Things Conferences to attend in 2015, has added the forumoxford : 2015 Internet of Things conference to that list.

Date: 6 November, 2014

Venue: Rewley House, University of Oxford
URL: forumoxford : 2015 Internet of Things conference

Co-chaired by me and Tomi Ahonen. Now in its 10th year. Mark the dates!

Full list again: list of the top 40 Internet of Things Conferences to attend in 2015


Infographic – The evolution of wireless networks

I get many such requests to post infographics ..
But this one is good
Comes from a reliable source (New Jersey Institute of Technology - Online Masters of Science in Electrical Engineering)

Infographic – The evolution of wireless networks

New Jersey Institute of Technology’s Online Master of Science in Electrical Engineering

Space Clouds: Turtles in Space – Learning to Code

Here is something I have been thinking about as part of the Countdown Institute.

The Countdown Institute teaches young people aged 10 to 16 programming skills using Space exploration.

I have been a fan of Seymour Papert’s Turtles based on my work at feynlabs.

Turtles in Python (Python Turtles) and in general (Turtle Graphics) are a great way of learning to code.

Object Oriented paradigms (like Turtles) are an easy way to start learning Programming (as opposed to Procedural Paradigms) because they help to tie back to the problem / context easily. The Turtles concept also downplays the more complex aspects of OO programming such as Inheritance and Polymorphism.

Countdown enables young people to learn coding by solving problems in a specific context – in this case, Space exploration.

But we need a simple and consistent way to model problems. Space Clouds is a data/modelling layer which relates Space exploration to coding. We can think of the Space Cloud as a unifying Data layer of software objects/classes. It is a consistent way of modelling a problem and getting kids to code.

From a programmatic standpoint, we have varying space objects (Satellites, Drones, Planets, Space missions etc).

Like a Turtle, each of these is an Object that has behaviour and data.
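To make the "behaviour and data" idea concrete, here is a minimal sketch of what one such object might look like in Python. The class name `Balloon`, its fields and its methods are my own illustrative assumptions, not part of any existing Space Clouds library, and the sketch deliberately avoids inheritance in keeping with the Turtle approach.

```python
from dataclasses import dataclass


@dataclass
class Balloon:
    """Hypothetical space object: it holds data (name, altitude) and has behaviour."""
    name: str
    altitude_km: float = 0.0

    def climb(self, km: float) -> None:
        # Behaviour: climbing changes the object's own data.
        self.altitude_km += km

    def report(self) -> str:
        return f"{self.name} is at {self.altitude_km:.1f} km"


balloon = Balloon("Countdown-1")   # hypothetical mission name
balloon.climb(12.5)
balloon.climb(17.5)
print(balloon.report())            # Countdown-1 is at 30.0 km
```

A learner only needs to know two verbs (`climb`, `report`) to get started, much as a Turtle only needs `forward` and `turn`.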

Each lesson starts with describing (modelling) the objects involved in the ‘world’. For example, in a high altitude balloon lesson, the jet stream could be defined as part of the space cloud.

This is a very easy paradigm for a child to understand, i.e. I switch on a device and the ‘sky lights up’, so to speak.

Depending on the problem – the Objects could be Planets, Satellites, missions(Orion, Rosetta)

Space Clouds is a simple, context specific modelling language for the context of space exploration created with the goal of teaching young people to code. Space Clouds is Programming Language agnostic. Current modelling languages like UML are designed for modelling entire systems and are not really suited for learning to code. 

The idea of Space Clouds can be thought of as the concept of ‘Turtles in Space’.

A recent blog on learning to code said that No-fuss setups and Task Oriented tools are key features to get more kids to code.

Space Clouds takes a similar approach by simplifying (limiting) input in early stages and connecting to a specific context

Image source Valiant turtle – wikipedia



Implementing Tim Berners-Lee’s vision of Rich Data vs. Big Data










In a previous blog post (Magna Carta for the Web), I discussed the potential of Tim Berners-Lee’s vision of Rich Data.

When I met Tim at the EIF event in Brussels, I asked him about the vision of Rich Data. I also thought more about how this vision could actually be implemented from a Predictive/Machine learning standpoint.

To recap the vision from the previous post:

So what is Rich Data? It’s Data (and Algorithms) that would empower the individual. According to Tim Berners-Lee: “If a computer collated data from your doctor, your credit card company, your smart home, your social networks, and so on, it could get a real overview of your life.” Berners-Lee was visibly enthusiastic about the potential applications of that knowledge, from living more healthily to picking better Christmas presents for his nephews and nieces. This, he said, would be “rich data”. (Motherboard)

This blog explores a possible way this idea could be implemented. I hope I can implement it, perhaps as part of an Open Data Institute incubated start-up.

To summarize my view here:

The world of Big Data needs to maintain large amounts of Data because the past is used to predict the future. This is needed  because we do not voluntarily share data and Intent. Here,  I propose that to engender Trust, both the Algorithms and the ‘training’ should be transparent – which leads to greater Trust and greater sharing.  This in turn does not need us to hold large amounts of Data (Big Data) to determine Predictions(Intents). Instead, Intents will be known (shared voluntarily) by people at the point of need. This would create a world of Rich Data – where the Intent is determined algorithmically using smaller data sets (and without the need to maintain a large amount of historical data)


Thus, to break it down further, here are some more thoughts:

a)      Big Data vs. Rich Data: To gain insights from data, we currently collect all the data we can lay our hands on (Big Data). In contrast, for Rich Data, instead of collecting all data in one place in advance, you need access to many small data sets for a given person and situation. But crucially, this ‘linking of datasets’ should happen at the point of need and dynamically. For example: Personal profile, Contextual information and risk profile (say, for a person who is at risk of Diabetes or a Stroke) would be linked only at the point of a medical emergency (vs. gathered in advance).

b)      Context already exists: Much of this information exists already. The mobile industry has done a great job of capturing contextual information accurately, for example location, and tying it to content (Geo tagged images).

c)       The ‘segment of one’ idea has been tried in many variants: Segmenting has been tried, with some success. See, in Retail, The future of Retail is segment of One; the BCG perspective paper (Segment of One marketing – pdf); and Inc magazine (Audience segmenting – targeting your customers). Segmentation is already possible.

d)      Intents are not linked to context: The feedback loop is not complete because currently while context exists – it is not tied to Intent. Most people do not trust advertisers and others with their intent

e)      Intent (Predictions) are based on the past:  Because we do not trust providers with Intent – Intent is gleaned through Big Data. Intents are related to Predictions. Predictions are based on a large number of historical observations either of the individual or related individuals. To create accurate predictions in this way, we need large amounts of centralized data and any other forms of Data.  That’s the Big Data world we live in

f)       IoT: IoT will not solve the problem. It will create an order of magnitude more contextual information – but providers will not be trusted and datasets will not be shared. And we will continue to create larger datasets with bigger volumes.


To recap:

a)      To gain insights from data, we currently collect all the data we can lay our hands on. This is the world of Big Data.

b)      We take this approach because we do not know the Intent.

c)       Rather, we (as people) do not trust providers with Intent.

d)      Hence, in the world of Big Data, we need a lot of Data. In contrast, for Rich Data, instead of collecting all data in one place in advance, you need access to many small data sets for a given person and situation. But crucially, this ‘linking of datasets’ should happen at the point of need and dynamically. For example: Personal profile, Contextual information and risk profile (say, for a person who is at risk of Diabetes or a Stroke) would be linked only at the point of a medical emergency (vs. gathered in advance).


From an algorithmic standpoint, the overall objective is to determine the maximum likelihood of sharing under a Trust framework. Given a set of trust frameworks and a set of personas (for example, a person with a propensity for a stroke), we want to know the probability of sharing information, and under which trust framework.

We need a small number of observations for an individual

We need an inbuilt trust framework for sharing

We need the Calibration of Trust to be ‘people driven’ and not provider driven


A possible way to implement the above could be through a Naive Bayes Classifier.

  • In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.
  • Workings: Let {f1, . . . , fm} be a predefined set of m features. A classifier is a function f that maps input feature vectors x ∈ X to output class labels y ∈ {1, . . . , C} where X is the feature space. Our goal is to learn f from a labelled training set of N input-output pairs, (xn, yn), n = 1 : N; this is an example of supervised learning i.e. the algorithm has to be trained
  • An advantage of Naive Bayes is that it only requires a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification.
  • This represents the basics of Naive Bayes. Tom Mitchell in a Carnegie Mellon paper says “A hundred independently drawn training examples will usually suffice to obtain a maximum likelihood estimate of P(Y) that is within a few percent of its correct value when Y is a Boolean variable. However, accurately estimating P(X|Y) typically requires many more examples.”
  • In addition, we need to consider feature selection and dimensionality reduction. Feature selection is the process of selecting a subset of relevant features for use in model construction. Feature selection is different from dimensionality reduction. Both methods seek to reduce the number of attributes in the dataset, but a dimensionality reduction method does so by creating new combinations of attributes, whereas feature selection methods include and exclude attributes present in the data without changing them. Examples of dimensionality reduction methods include Principal Component Analysis (PCA).


  • Thus, a combination of Naive Bayes and PCA may be a start to implementing Rich Data. Naive Bayes needs a relatively small amount of data. PCA will reduce dimensionality.
  • How to incorporate Trust? Based on the above, Trust becomes a feature (an input to the algorithm) with an appropriate weightage. The output is then the probability of sharing under a Trust framework for a given persona.
  • Who calibrates the Trust? A related and bigger question is: How to calibrate Trust within the Algorithm? This is indeed the Holy Grail and underpins the foundation of the approach. Prediction in research has grown exponentially due to the availability of Data – but Predictive science is not perfect (Good paper: The Good, the Bad, and the Ugly of Predictive). Predictive Algorithms gain their intelligence in two ways: Supervised learning (like Naive Bayes, where the algorithm learns through training Data) or Unsupervised learning, where the algorithm tries to find hidden structure in unlabeled data.
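As a rough illustration of the idea above, here is a minimal sketch of a categorical Naive Bayes classifier in plain Python, where the trust framework is simply one of the input features and the output is the probability of sharing for a given persona. The training rows, persona names and trust framework labels are all invented for illustration; the PCA step and any feature weighting are left out to keep the sketch short.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (persona, trust_framework) -> did the person share?
TRAINING = [
    (("stroke_risk", "open_algorithm"), True),
    (("stroke_risk", "open_algorithm"), True),
    (("stroke_risk", "provider_controlled"), False),
    (("elderly", "open_algorithm"), True),
    (("elderly", "provider_controlled"), False),
    (("elderly", "provider_controlled"), True),
]


def train(rows):
    """Count class frequencies and per-feature value frequencies for each class."""
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def probability_of_sharing(model, features):
    """Return P(share=True | features) under the naive independence assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for i, value in enumerate(features):
            # Laplace (add-one) smoothing so unseen values never give a zero probability.
            score *= (feature_counts[(i, label)][value] + 1) / (count + len(values[i]))
        scores[label] = score
    return scores.get(True, 0.0) / sum(scores.values())


model = train(TRAINING)
for framework in ("open_algorithm", "provider_controlled"):
    p = probability_of_sharing(model, ("stroke_risk", framework))
    print(f"stroke_risk under {framework}: P(share) = {p:.2f}")
```

With only six observations the numbers mean very little, but the shape of the calculation is the point: a handful of rows per persona is enough to estimate the probabilities, and comparing the output across trust frameworks answers the "under which trust framework" question.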


So, if we have to calibrate trust for a Supervised learning algorithm – the workings must be open and the trust (propensity to share) must be created from the personas themselves, e.g. people at risk of a stroke, the elderly etc. Such an Open algorithm that learns from the people and whose workings are transparent will engender trust. It will in turn lead to greater sharing – and a different type of predictive algorithm which will need smaller amounts of historical data, but will track a larger number of Data streams to determine value at their intersection. This in turn will complete the feedback loop and tie intent to context.

Finally, I do not propose that a specific algorithm (such as Naive Bayes) is the answer – rather I propose that both the Algorithms and the ‘training’ should be transparent – which leads to greater Trust and greater sharing.  This in turn does not need us to hold large amounts of Data (Big Data) to determine Predictions(Intents). Instead, Intents will be known (shared voluntarily) by people at the point of need. This would create a world of Rich Data – where the Intent is determined algorithmically using smaller data sets (and without the need to maintain a large amount of historical data)

Comments welcome – at ajit.jaokar at 

Predictive Analytics as a service for IoT


This post is a personal viewpoint based on my teaching (IoT and Machine Learning) at the City sciences program at UPM in Madrid – Technical University of Madrid and at Oxford University (with a mobile perspective).

Predictive Analytics are critical for IoT, but most companies do not have the skillsets to develop their own Predictive analytics engine.  The objective of this effort is to provide a predictive analytics interface for Hypercat. We aim to provide a solution accessed through a Hypercat API and a library. Whenever possible, we will use Open Source. We will also encapsulate industry best practices into the solution. The post is also related to extending the discussions at the event Smart cities need a Trusted IoT foundation

Data and Analytics will be the key differentiator for IoT.

A single sensor collecting data at one-second intervals will generate 31.5 million datapoints per year (source Intel/WindRiver). However, the value lies not just in one sensor’s datapoints, but rather in the collective intelligence gleaned from thousands (indeed millions) of sensors working together.
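The 31.5 million figure is easy to check, and the same arithmetic shows how quickly volumes grow once many sensors are involved. The fleet size below is an arbitrary assumption for illustration.

```python
def readings_per_year(interval_seconds: float = 1.0, days: int = 365) -> int:
    """Number of readings one sensor produces in a year at a fixed sampling interval."""
    seconds_per_day = 60 * 60 * 24
    return int(days * seconds_per_day / interval_seconds)


per_sensor = readings_per_year()
print(f"{per_sensor:,} readings per sensor per year")         # 31,536,000

sensors = 1_000  # hypothetical fleet size
print(f"{per_sensor * sensors:,} readings per year in total")  # 31,536,000,000
```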

As I discuss below, this information (and more specifically the rate of IoT based sensor information and its real time nature) will make a key difference for IoT and Predictive analytics.

IoT and predictive analytics will change the nature of decision making and will change the competitive landscape of industries. Industries will have to make thousands of decisions in near real-time. With predictive analytics, each decision will improve the model for subsequent decisions (also in near real time). We will recognize patterns, make adjustments and improve performance based on data from multiple people and sensors

IoT and Predictive analytics will enable devices to identify, diagnose and report issues more precisely and quickly as they occur. This will create a ‘closed loop’ model where the Predictive model improves with experience. We will thus go from identifying patterns to making predictions – all in real time  

However, the road to this vision is not quite straightforward. The two worlds of IoT and Predictive analytics do not meet easily.

Predictive analytics needs the model to be trained before the model makes a prediction. Creating a model and updating it on a continuous real-time basis with streaming IoT data is a complex challenge. Also, it does not fit the traditional model of MapReduce and its inherently batch processing nature. This challenge is being addressed already (Moving Hadoop beyond batch processing and MapReduce) but will become increasingly central as IoT becomes mainstream.


IoT and Predictive analytics – opportunities

For IoT and Predictive analytics, processing will take place both in the Cloud and closer to the edge. Not all data will be sent to the Cloud at all times. The newly launched Egburt from Camgian Microsystems is an example of this new trend. Some have called this trend ‘Data gravity’, where computing power is brought to the data as opposed to processing Data in a centralized location.

In addition, the sheer volume of IoT data leads to challenges and opportunities. For example, 100 million points per second in a time series is not uncommon. This leads to specific challenges for IoT (Internet of Things – time series data challenge).

Here are some examples of possible opportunities for IoT and Predictive analytics where groups of sensors work together:

  • We could undertake system wide predictive maintenance of offshore equipment like wind farms for multiple turbines (i.e. the overall system as opposed to a specific turbine).  If we predict a high likelihood of failure in one turbine, we could dynamically reduce the load on that turbine by switching to a lower performance.
  • Manage overall performance of a group of devices – again for the wind farm example – individual turbines could be tuned together to achieve optimal performance where individual pieces of equipment have an impact on the overall performance
  • Manage the ‘domino effect’ of failure – as devices are connected (and interdependent) – failure of one could cascade across the whole network. By using predictive analytics – we could anticipate such cascading failure and also reduce its impact
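To illustrate the first of these opportunities, here is a minimal sketch of system-wide load rebalancing. It assumes that some predictive model has already produced a failure probability per turbine; the threshold, the derating factor, the turbine ids and the greedy redistribution rule are all hypothetical choices of mine, not an industry method.

```python
def rebalance(loads, failure_prob, capacity, threshold=0.5, derate_to=0.5):
    """Reduce load on at-risk turbines and spread the shortfall over healthy ones.

    loads, failure_prob and capacity are dicts keyed by turbine id.
    Returns the new loads and any shortfall that could not be reassigned.
    """
    new_loads = dict(loads)
    shortfall = 0.0
    for turbine, p in failure_prob.items():
        if p >= threshold:
            reduced = loads[turbine] * derate_to
            shortfall += loads[turbine] - reduced
            new_loads[turbine] = reduced
    # Greedily hand the shortfall to healthy turbines, up to each one's capacity.
    for turbine, p in failure_prob.items():
        if p < threshold:
            extra = min(capacity[turbine] - new_loads[turbine], shortfall)
            new_loads[turbine] += extra
            shortfall -= extra
    return new_loads, shortfall


loads = {"T1": 2.0, "T2": 2.0, "T3": 2.0}         # MW, hypothetical
failure_prob = {"T1": 0.8, "T2": 0.1, "T3": 0.2}  # output of some predictive model
capacity = {"T1": 3.0, "T2": 3.0, "T3": 3.0}
print(rebalance(loads, failure_prob, capacity))
```

The decision is made for the wind farm as a whole: the at-risk turbine is protected while total output is preserved where the other turbines have headroom.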

IoT and Predictive analytics – challenges

Despite the benefits, the two worlds of IoT and Predictive analytics do not meet very naturally

In a nutshell, Predictive analytics involves extracting information from existing data sets to identify patterns which help predict future outcomes and trends for new (unseen) scenarios.  This allows us to predict what will happen in future with an acceptable level of reliability.

To do this, we must

a)      Identify patterns from existing data sets

b)      Create a model which will predict the future


Doing these two steps in Real time is a challenge. Traditionally, data is fed to a system in a batch. But for IoT, we have a continuous stream of new observations in real time. The outcome (i.e. the business decision) also has to be made in real time. Today, some systems like Credit card authorization perform some real time validations – but for IoT, the scale and scope will be much larger.


So, this leads to more questions:

a)      Can the predictive model be built in real time?

b)      Can the model be updated in real time?

c)       How much historical data can be used for this model?

d)      How can the data be pre-processed and at what rate?

e)      How frequently can the model be retrained?

f)       Can the model be incrementally updated?
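On the last question, some models can indeed be updated incrementally. As a minimal sketch, the class below keeps a running mean and variance using Welford's online algorithm and flags readings that fall far from the running mean. It needs no stored history, only three numbers per sensor. The readings and the three-standard-deviation threshold are illustrative assumptions, and a real predictive model would be richer than this.

```python
import math


class RunningStats:
    """Welford's online algorithm: mean and variance updated one reading at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; zero until there are at least two readings.
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0

    def is_anomaly(self, x: float, z: float = 3.0) -> bool:
        """Flag a reading more than z standard deviations from the running mean."""
        return self.n > 1 and self.std > 0 and abs(x - self.mean) > z * self.std


stats = RunningStats()
for reading in [20.1, 19.9, 20.0, 20.2, 19.8]:  # hypothetical temperature stream
    stats.update(reading)

print(stats.is_anomaly(25.0))  # True
print(stats.is_anomaly(20.1))  # False
```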


Real time also brings many architectural changes, for example In memory processing, stream processing etc.



According to Gartner analyst Joe Skorupa, “The enormous number of devices, coupled with the sheer volume, velocity and structure of IoT data, creates challenges, particularly in the areas of security, data, storage management, servers and the data center network, as real-time business processes are at stake.”

Thus, IoT will affect many areas: Security, Business processes, Consumer Privacy, Data Storage Management, Server Technologies, the Data Center Network etc.

The Hypercat platform provides a mechanism to manage these complex changes.

We can model every sensor+actuator and person as a Digital entity. We can assign predictive behaviour to digital objects (a Digital entity has processing power, an agenda and access to meta data). We can model and assign predictive behaviour to multiple levels of objects (from the whole refinery to a valve).

We can model time varying data and predict behaviour based on inputs at a point in time.  The behaviour is flexible (resolved at run time) and creates a risk prediction and a feedback loop to modify behaviour in real time along with a set of rules

We can thus cover the whole lifecycle – Starting with discovery of new IoT services in a federated manner, managing security and privacy to ultimately creating autonomous, emergent behaviour for each entity

All this in context of a security and Interoperability framework


Predictive analytics as a service?

Based on the above, predictive analytics cannot be just an API. It would be more of a dynamic service which can provide the right data, to the right person, at the right time and place. The service would be self improving (self learning) in real time.

I welcome comments on the above. You can email me at ajit.jaokar at or post in the Hypercat LinkedIn forum






Small Data: A Deterministic and predictive approach


Image source: Daniel Villatoro 


In this blog/article, I expand on the idea of ‘Small data’.

I present a generic model for Small data combining Deterministic and Predictive components

Although I have presented the ideas in context of IoT(which I understand best) – the same algorithms and approach could apply to domains such as Retail, Telecoms, Banking etc

We could have a number of data sets which may be individually small, but it is possible to find value at their intersection. This approach is similar to the mobile industry / foursquare scenario of knowing the context to provide the best service/offer etc to a customer segment of one. That’s a powerful idea in itself and a reason to consider Small Data. However, I wanted to extend the deterministic aspects of Small data (intersection of many small data sets) by also considering the predictive aspects. The article describes a general approach for adding a predictive component to Small data which comprises three steps: a) a limited set of features is extracted, b) their dimensionality is reduced (for example using clustering) and c) finally we use a classification and recognition method like Hidden Markov Models to recognize a higher order metric (for example walking or footfall).


Last week, I gave an invited talk on IoT and Machine Learning at the Bigdap conference organized by the Ontic project. The Ontic project is an EU FP7 project doing some interesting work on Big Data and Analytics, mainly from a Telco perspective.

The audience was technical, and this was reflected in the themes of the event (for example: Techniques, models and algorithms for Big data; Scalable Data Mining and Machine learning techniques and mechanisms; Big Data Security and Privacy challenges; Cleaning Big Data (noise reduction), acquisition & integration; Multidimensional Big Data; Algorithms for enhancing data quality).

This blog post is inspired by some conversations following my talk with Daniel Villatoro (BBVA) and Dr Alberto Mozo (UPM/Ontic). It extends many of the ideas and papers I referenced in my talk.


In his talk, Daniel referred to ‘small data’ (image from his slides, used with permission). In this context, as per the slide, Small data refers to the intersection of various elements like customers, offers, social context etc in a small retailer context. Small data is an interesting concept and I wanted to explore it more. So, I spent the weekend thinking more about it.

When you have the data elements, the concept of small data is deterministic. It is similar to the mobile industry / foursquare scenario of knowing the context to provide the best service/offer etc. Thus, given the right datasets, you can find value at the intersection. This works even if the individual Data sets are small, as long as you find enough intersecting datasets to create a customer segment of one at their intersection.

That’s a powerful idea in itself and a reason to consider Small Data.
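In code, the deterministic side of Small Data is little more than a set intersection. Here is a minimal sketch; the three data sets and the customer ids are invented for illustration.

```python
# Hypothetical small data sets, each holding customer ids.
nearby_now = {"c17", "c42", "c99"}           # location context
likes_coffee = {"c03", "c42", "c58", "c99"}  # preference profile
opted_in = {"c42", "c58"}                    # consent to receive offers


def segment(*datasets):
    """Return the ids present in every data set, i.e. their intersection."""
    return set.intersection(*datasets)


print(segment(nearby_now, likes_coffee, opted_in))  # {'c42'}
```

Each set on its own says little, but three small sets are already enough to narrow down to a segment of one.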

However, I wanted to extend the deterministic aspects of Small data (intersection of many small data sets) by also considering the predictive aspects. In the case of Predictive aspects, we want to infer insights from relatively limited data sets

In addition, I was also looking for a good use case to teach my students @citysciences. Hence, this blog will explore the predictive aspects of Small data in an IoT context

I believe the ideas I discuss could apply to any scenario (ex retail/banking) and indeed also to Big Data sets

A caveat:

The examples I have considered below strictly apply to Wireless Sensor Networks(WSNs). WSNs differ from IoT because there is potentially communication between the nodes. The topology of the WSNs can vary from a simple star network to an advanced multi-hop wireless mesh network. The propagation technique between the hops of the network can be routing or flooding.  In contrast, IoT nodes do not necessarily communicate between each other in this way. But for the purposes of our example, the examples are valid because we are interested in the insights inferred from the Data.

Predictive characteristics of Small data

From a predictive standpoint, I propose that Small data will have the following characteristics:

1)      The Data is missing or incomplete

2)      The data is limited

3)      Alternatively, we have Large data sets which need to be converted to a smaller data set to make it more relevant(ex a small retailer)  to the problem at hand

4)      The need for inferred metrics i.e. higher order metrics derived from raw data

This complements the deterministic aspects of Small data i.e. finding a number of data sets to identify the value at their intersection even if each data set itself may be small(Small data)

So, based on papers I reference below, I propose three methodologies that can be used for understanding Small data from a predictive standpoint

1)      Feature extraction

2)      Dimensionality reduction

3)      Feature Classification and recognition

To discuss these in detail, I use the problem of monitoring physical activity for assisted living patients. These patients live in an apartment which is monitored in a privacy-aware manner. Here, we use sensors and infer behaviour based on the sensor readings, yet we want to protect the privacy of the patient.

The papers I have referred to are (also in my talk):

  • Activity Recognition Using Inertial Sensing for Healthcare, Wellbeing and Sports Applications: A Survey – Akin Avci, Stephan Bosch, Mihai Marin-Perianu, Raluca Marin-Perianu, Paul Havinga University of Twente, The Netherlands
  • Robust location-aware activity recognition: Lu and Fu 

This problem is a ‘small data’ problem because we have limited data, some of it is missing (not all sensors can be monitoring at all times) and we have to infer behaviour based on raw sensor readings. We will complement this with the deterministic interpretation of Small Data (where we accurately know a reading).

Small data: Assisted Living Scenario

Source: Robust Location-Aware Activity Recognition Using Wireless Sensor Network in an Attentive Home, Ching-Hu Lu, Student Member, IEEE, and Li-Chen Fu, Fellow, IEEE

In an assisted living scenario, the goal is to recognize activity based on the observations of specific sensors. Traditionally, researchers used vision sensors for activity recognition. However, that is very privacy invasive.  The challenge is thus to recognize human behaviour based on raw readings / activity from multiple sensors. In addition, in an assisted living system, the subject being monitored may have a disorder (for example Cognitive disorders or Chronic conditions).

The techniques presented below could also apply to other scenarios – ex to detect Quality of Experience in Telecoms or in general for any situation where we have to infer insights from relatively limited data sets(ex footfall)

The steps/methods for retrieving activity information from raw sensor data are: preprocessing, segmentation, feature extraction, dimensionality reduction and classification.

In this post, we will consider the last three, i.e. feature extraction, dimensionality reduction and classification. We could use these three techniques in situations where we want to create a predictive component for ‘small data’.


Small data: Extracting predictive insights

In the above scenario, we could extract new insights using the following predictive techniques (even when we have less data):

 1)      Feature extraction

Feature extraction takes raw data readings as input and finds the main characteristics of a data segment that accurately represent the original data. The resulting smaller set of features can be described as an abstraction of the raw data: the purpose of feature extraction is to transform large quantities of input data into a reduced set of features. This reduced set is represented as an n-dimensional feature vector, which is then used as the input to a classification algorithm.
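As a rough illustration of this step, the sketch below cuts a stream of readings into windows and turns each window into a small feature vector. The window size, the four statistics chosen (mean, standard deviation, range, RMS) and the sample readings are all my own assumptions for the example, not values from the papers.

```python
import math
from statistics import mean, pstdev

def sliding_windows(samples, size, step):
    """Split a list of readings into fixed-size windows (overlapping if step < size)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]

def extract_features(window):
    """Summarise one window as a 4-dimensional feature vector."""
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    return [mean(window), pstdev(window), max(window) - min(window), rms]

# Hypothetical accelerometer magnitudes: a quiet period followed by movement
readings = [0.1, 0.0, 0.2, 0.1, 1.9, 2.2, 1.8, 2.1]
feature_vectors = [extract_features(w) for w in sliding_windows(readings, size=4, step=4)]
```

Eight raw readings become two feature vectors of four numbers each; with real sensors sampling many times per second the reduction is far larger.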

 2)      Dimensionality Reduction

Dimensionality reduction methods aim to increase accuracy and reduce computational effort. If the dimensionality of a feature set is too high, some features may be irrelevant and provide no useful information for classification. By reducing the number of features involved in the classification process, less computational effort and memory are needed to perform the classification. The two general forms of dimensionality reduction are feature selection and feature transform.

Feature selection methods: these select the features that are most discriminative and contribute most to the performance of the classifier, in order to create a subset of the existing features. For example, SVM-based feature selection has been used to pick out the most important features, with the conclusion that five attributes were enough to classify daily activities accurately. K-means clustering uncovers structure in a set of samples by grouping them according to a distance metric; used for feature selection, it ranks individual features according to their discriminative properties and their relationships to one another.
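To make the idea of ranking features concrete, here is a minimal sketch. Note that it uses neither SVM-based selection nor K-means: I have swapped in a much simpler filter, a two-class Fisher score (squared difference of class means divided by the summed class variances), purely for illustration. The data, labels and function names are hypothetical.

```python
from statistics import mean, pvariance

def fisher_scores(X, y):
    """Score each feature column by how well it separates two classes."""
    a, b = sorted(set(y))  # this sketch assumes exactly two class labels
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row, label in zip(X, y) if label == a]
        col_b = [row[j] for row, label in zip(X, y) if label == b]
        spread = pvariance(col_a) + pvariance(col_b)
        scores.append((mean(col_a) - mean(col_b)) ** 2 / (spread + 1e-12))
    return scores

def select_top_k(X, y, k):
    """Keep the k highest-scoring feature columns (indices returned in column order)."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]

# Hypothetical columns: mean acceleration, an uninformative noisy reading, std of acceleration
X = [[0.1, 0.5, 0.05], [0.2, 0.1, 0.04], [0.1, 0.9, 0.06],
     [1.9, 0.4, 0.60], [2.1, 0.8, 0.55], [2.0, 0.2, 0.65]]
y = ["sit", "sit", "sit", "walk", "walk", "walk"]
keep, X_reduced = select_top_k(X, y, k=2)
```

The noisy middle column scores close to zero and is dropped, which is the behaviour we want from any feature selection method, whatever ranking criterion it uses.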

Feature transform methods: these try to map the high-dimensional feature space into a much lower dimension, yielding fewer features that are combinations of the original features. They are useful in situations where multiple features collectively provide good discrimination but individually would provide poor discrimination. Principal Component Analysis (PCA) is a well-known and widely used statistical analysis method that can be used to transform the original features into a lower-dimensional space.
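A minimal sketch of a feature transform is shown below: it finds only the first principal component, using power iteration on the covariance matrix, and projects each sample onto it. This is a simplified stand-in for a full PCA (which would normally come from a numerical library), and the two-feature data set is made up.

```python
import math

def first_principal_component(X, iterations=200):
    """Return the column means and the unit vector of greatest variance."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in X]
    # Sample covariance matrix of the centered data
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def project(X, means, v):
    """Map each sample to a single number: its coordinate along v."""
    return [sum((row[j] - means[j]) * v[j] for j in range(len(v))) for row in X]

# Two strongly correlated (hypothetical) features collapse onto one axis
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
means, direction = first_principal_component(X)
X_1d = project(X, means, direction)
```

Here two features that move together are replaced by one combined feature, which is the sense in which the new features are combinations of the original ones.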

 3)     Classification and Recognition: The selected or reduced features from the dimensionality reduction process are used as inputs for the classification and recognition methods.  

For example, Nearest Neighbor (NN) algorithms, such as k-NN, classify activities based on the closest training examples in the feature space.
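A minimal k-NN sketch, assuming Euclidean distance and a simple majority vote. The two-dimensional feature vectors and the activity labels are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors: [mean acceleration, std of acceleration]
train_X = [[0.10, 0.05], [0.20, 0.04], [0.15, 0.06],
           [2.00, 0.60], [1.90, 0.55], [2.10, 0.65]]
train_y = ["sitting", "sitting", "sitting", "walking", "walking", "walking"]

prediction = knn_predict(train_X, train_y, [1.8, 0.5])
```

There is no training step as such: the labelled examples are the model, which suits situations where we only have a small amount of data.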

 Naïve Bayes is a simple probabilistic classifier based on Bayes’ theorem which can be used for Classification.
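As a sketch of how this works, the code below implements a Gaussian variant of Naïve Bayes: each feature is assumed to be normally distributed within each class, and independent of the other features. The Gaussian assumption, the small variance floor and the training data are my own choices for the example.

```python
import math
from statistics import mean, pvariance

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for every class."""
    model = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "means": [mean(c) for c in cols],
            # Small floor keeps the variance from being exactly zero
            "vars": [pvariance(c) + 1e-6 for c in cols],
        }
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior (Bayes' theorem, up to a constant)."""
    def log_posterior(params):
        total = math.log(params["prior"])
        for value, mu, var in zip(x, params["means"], params["vars"]):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))

train_X = [[0.10, 0.05], [0.20, 0.04], [0.15, 0.06],
           [2.00, 0.60], [1.90, 0.55], [2.10, 0.65]]
train_y = ["sitting", "sitting", "sitting", "walking", "walking", "walking"]
model = fit_gaussian_nb(train_X, train_y)
prediction = predict_gaussian_nb(model, [1.95, 0.58])
```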

Support Vector Machines (SVMs) are supervised learning methods used for classification. In the assisted living scenario, an SVM-based activity recognition system using objects fitted with sensors can be used to recognize drinking, phoning and writing activities.

Hidden Markov Models (HMMs) are statistical models that can also be used for activity recognition. I used a simple analogy to explain Hidden Markov analysis, taken from a paper that explains HMMs by inferring temperatures in the distant past from tree ring sizes.
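The tree ring analogy can be sketched in a few lines. The hidden states are hot and cold years, the observations are small, medium and large tree rings, and the Viterbi algorithm recovers the most likely sequence of hidden states. All the probabilities below are illustrative numbers I have assumed for the example; treat them as placeholders rather than as values quoted from the paper.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the single most likely hidden state sequence for the observations."""
    # best[t][s] = probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the most likely final state
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return list(reversed(path))

states = ["hot", "cold"]
start_p = {"hot": 0.6, "cold": 0.4}
trans_p = {"hot": {"hot": 0.7, "cold": 0.3},
           "cold": {"hot": 0.4, "cold": 0.6}}
emit_p = {"hot": {"small": 0.1, "medium": 0.4, "large": 0.5},
          "cold": {"small": 0.7, "medium": 0.2, "large": 0.1}}

rings = ["small", "medium", "small", "large"]
years = viterbi(rings, states, start_p, trans_p, emit_p)
```

With these assumed numbers the decoded sequence is cold, cold, cold, hot. In the assisted living scenario the hidden states would be activities and the observations would be the (reduced) sensor features.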

Gaussian Mixture Models (GMMs) can be used to recognize transitions between activities.

Artificial Neural Networks can also be used to detect occurrences, for example falls.

Thus, we get a scenario as below:

Figure: sensors (adapted from Activity Recognition Using Inertial Sensing for Healthcare, Wellbeing and Sports Applications: A Survey)

Figure: activity (adapted from Robust location-aware activity recognition: Lu and Fu)

Small Data: Complementing the Deterministic by the predictive

To conclude:

Small Data can be a deterministic problem when we know a number of data sets and the value lies at their intersection. This strategy is possible with Mobile context-based services and Location-based services. The results so achieved can also be complemented by a predictive component of Small data.

In this case, a limited set of features is extracted, their dimensionality is reduced (for example, using clustering) and finally we use a classification and recognition method like Hidden Markov Models to actually recognize a higher-order metric (for example, walking or retail footfall).

I believe that these ideas could be adapted to many domains. Data science is an engineering problem. It’s like building a Bridge where there is no fixed solution in advance: every Bridge is different and presents a unique set of challenges. I like the blog post Machine Learning is not a Kaggle competition, whose author (Julia Evans) correctly emphasizes that we need to understand the business problem first. So, I think the above approach could apply to many business scenarios, for example in Retail (footfall), Healthcare and Airport lounges, by inferring predictive insights from data streams.


Ardusat, Countdown Institute at CTIA connected for Good event (part of super mobility week) in Las Vegas

In October, we fully launch the Countdown Institute in Miami (lab Miami) for STEM education

Countdown is based on Ardusat technology, which allows you to conduct experiments in space on a live Cubesat-based satellite

Essentially, the Ardusat is based on Cubesat and contains Arduino sensors, which allow us to learn Computer Science in the context of Space exploration experiments

Sunny Washington, President of Ardusat, is speaking at the CTIA connected for good event (part of the Super Mobility week) in Las Vegas today

It’s great to see this

The talk reflects the hard work our team in Miami has been putting in working with Ardusat (Richard, Jessica, Alex and also the faculty Nelson, Willie and Patrick)

If you are at CTIA – say Hi to the Ardusat team!