IoT analytics, Edge Computing and Smart Objects


The term ‘Smart Objects’ has been around since the days of Ubiquitous Computing.

However, now that we have started building Smart Objects, I believe the meaning and the definition have evolved.

Here is my view on how the definition of Smart Objects has changed in the world of Edge Computing and increasing processing capacity.

At a minimum, a Smart Object should have three things:

a) An identity, e.g. an IPv6 address
b) Sensors / actuators
c) A radio (Bluetooth, cellular, etc.)

In addition, a Smart Object could incorporate:

a) Physical context, e.g. location
b) Social context, e.g. proximity in social media

Extending this further, smartness could also incorporate analytics.

Some of these analytics could be performed on the device itself, e.g. the computing-at-the-edge concept promoted by Intel, Cisco and others.

However, Edge Computing as discussed today still has some limitations.

For example:

a) The need to incorporate multiple feeds from different sensors to reach a decision ‘at the edge’

b) The need for a workflow process, i.e. actions based on readings – again, often at the edge, with its accompanying security and safety measures

To manage multiple sensor feeds, we need to understand concepts like sensor fusion (pdf) (source: Freescale).

We already have some rudimentary workflow through mechanisms like IFTTT (If This Then That); a minimal sketch combining both ideas follows below.
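
To make these two ideas concrete, here is a minimal Python sketch (all sensor values, filter weights and thresholds are made up purely for illustration): it fuses a gyroscope rate and an accelerometer-derived angle with a complementary filter, then applies an IFTTT-style rule to the fused reading.

```python
# Minimal sketch: fuse two noisy tilt estimates with a complementary filter,
# then apply an IFTTT-style rule to the fused value. All numbers are illustrative.

ALPHA = 0.98           # weight given to the gyro-integrated estimate
TILT_LIMIT_DEG = 30.0  # hypothetical safety threshold for the actuator rule

def fuse(prev_angle, gyro_rate, accel_angle, dt):
    """Complementary filter: trust the gyro short-term, the accelerometer long-term."""
    gyro_angle = prev_angle + gyro_rate * dt
    return ALPHA * gyro_angle + (1 - ALPHA) * accel_angle

def rule_engine(angle_deg):
    """IF the fused tilt exceeds the limit THEN trigger an action (IFTTT-style)."""
    if angle_deg > TILT_LIMIT_DEG:
        return "actuator: level the platform"
    return "no action"

angle = 0.0
# (gyro rate in deg/s, accelerometer-derived angle in deg) sampled at 10 Hz
samples = [(2.0, 0.5), (40.0, 5.0), (120.0, 33.0), (5.0, 31.0)]
for gyro_rate, accel_angle in samples:
    angle = fuse(angle, gyro_rate, accel_angle, dt=0.1)
    print(f"fused angle={angle:5.1f} deg -> {rule_engine(angle)}")
```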

In addition, the rise of CPU capacity leads to greater intelligence on the device – for example, the Qualcomm Zeroth platform, which enables Deep Learning algorithms on the device.

So, in a nutshell, it’s an evolving concept, especially if we include IoT analytics in the definition of Smart Objects (and accept that some of these analytics could be performed at the Edge).

We cover these ideas in the #DataScience for #IoT course and also in the courses I teach at Oxford University.

Comments welcome

 

 

Become a Data Scientist for the Internet of Things – download free paper

 


An Introduction to Deep Learning and its role for IoT / future cities

Note: the paper below is best read as a PDF, which you can download for free below.

 

An Introduction to Deep Learning and its role for IoT / future cities

By Ajit Jaokar

@ajitjaokar

Please connect with me on LinkedIn if you want to stay in touch and receive future updates.

Background and Abstract

This article is part of an evolving theme. Here, I explain the basics of Deep Learning and how Deep learning algorithms could apply to IoT and Smart city domains. Specifically, as I discuss below, I am interested in complementing Deep learning algorithms with IoT datasets. I elaborate these ideas in the Data Science for Internet of Things program, which enables you to work towards becoming a Data Scientist for the Internet of Things (modelled on the course I teach at Oxford University and UPM – Madrid). I will also present these ideas at the International Conference on City Sciences at Tongji University in Shanghai and at the Data Science for IoT workshop at the IoT World event in San Francisco.


Deep Learning

Deep learning is often thought of as a set of algorithms that ‘mimics the brain’. A more accurate description would be an algorithm that ‘learns in layers’. Deep learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts.

The obscure world of deep learning algorithms came into the public limelight when Google researchers fed 10 million random, unlabelled images from YouTube into their experimental Deep Learning system. They then instructed the system to recognize the basic elements of a picture and how these elements fit together. The system, comprising 16,000 CPUs, was able to identify images that shared similar characteristics (such as images of cats). This canonical experiment showed the potential of Deep learning algorithms. Deep learning algorithms apply to many areas including Computer Vision, image recognition, pattern recognition, speech recognition, behaviour recognition etc.

 

How does a Computer Learn?

To understand the significance of Deep Learning algorithms, it’s important to understand how computers think and learn. Since the early days, researchers have attempted to create computers that think. Until recently, this effort has been rules-based, adopting a ‘top-down’ approach. The top-down approach involves writing enough rules for all possible circumstances. But this approach is obviously limited by the number of rules and by its finite rule base.

To overcome these limitations, a bottom-up approach was proposed. The idea here is to learn from experience. The experience is provided by ‘labelled data’: labelled data is fed to a system and the system is trained based on the responses. This approach works for applications like spam filtering. However, most data (pictures, video feeds, sounds, etc.) is not labelled, and if it is, it is not labelled well.

The other issue is in handling problem domains which are not finite. For example, the problem domain in chess is complex but finite because there is a finite number of primitives (32 chess pieces) and a finite set of allowable actions (on 64 squares). But in real life, at any instant, we potentially have a very large or even infinite number of alternatives. The problem domain is thus very large.

A problem like playing chess can be ‘described’ to a computer by a set of formal rules. In contrast, many real-world problems are easily understood by people (intuitive) but not easy to describe (represent) to a computer. Examples of such intuitive problems include recognizing words or faces in an image. Such problems are hard to describe to a computer because the problem domain is not finite. Thus, the problem description suffers from the curse of dimensionality, i.e. as the number of dimensions increases, the volume of the space increases so fast that the available data becomes sparse. Computers cannot be trained on sparse data. Such scenarios are not easy to describe because there is not enough data to adequately represent the combinations of the dimensions. Nevertheless, such ‘infinite choice’ problems are common in daily life.

How do Deep learning algorithms learn?

Deep learning addresses ‘hard/intuitive’ problems which have few or no rules and high dimensionality. Here, the system must learn to cope with unforeseen circumstances without knowing the rules in advance. Many existing systems like Siri’s speech recognition and Facebook’s face recognition work on these principles. Deep learning systems are possible to implement now because of three reasons: high CPU power, better algorithms and the availability of more data. Over the next few years, these factors will lead to more applications of Deep learning systems.

Deep Learning algorithms are modelled on the workings of the brain. The brain may be thought of as a massively parallel analog computer which contains about 10^10 simple processors (neurons), each of which requires a few milliseconds to respond to input. To model the workings of the brain, in theory, each neuron could be designed as a small electronic device which has a transfer function similar to a biological neuron. We could then connect each neuron to many other neurons to imitate the workings of the brain. In practice, it turns out that this model is not easy to implement and is difficult to train.

So, we make some simplifications in the model mimicking the brain. The resultant neural network is called a “feed-forward back-propagation network”. The simplifications/constraints are: we change the connectivity between the neurons so that they are in distinct layers; each neuron in one layer is connected to every neuron in the next layer; signals flow in only one direction; and finally, we simplify the neuron design to ‘fire’ based on simple, weight-driven inputs from other neurons. Such a simplified network (a feed-forward neural network model) is more practical to build and use.

Thus:

a) Each neuron receives a signal from the neurons in the previous layer

b) Each of those signals is multiplied by a weight value.

c) The weighted inputs are summed, and passed through a limiting function which scales the output to a fixed range of values.

d) The output of the limiter is then broadcast to all of the neurons in the next layer.

Image and parts of the description in this section adapted from: the Seattle Robotics site.
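
As a minimal numerical sketch of steps (a) to (d), using NumPy and a sigmoid as the limiting function (the weights and inputs below are invented purely for illustration):

```python
import numpy as np

def limiter(x):
    """Sigmoid 'limiting function' that scales the summed input to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

inputs = np.array([0.9, 0.1, 0.4])             # signals from the previous layer (a)
weights = np.array([[0.2, -0.5, 0.8],          # one row of weights per neuron (b)
                    [0.7,  0.3, -0.1]])

weighted_sums = weights @ inputs               # weighted inputs are summed (c)
outputs = limiter(weighted_sums)               # passed through the limiter (c)
print(outputs)                                 # broadcast to the next layer (d)
```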

The most common learning algorithm for artificial neural networks is called Back Propagation (BP), which stands for “backward propagation of errors”. To use the neural network, we apply the input values to the first layer, allow the signals to propagate through the network and read the output. A BP network learns by example, i.e. we must provide a learning set that consists of some input examples and the known correct output for each case. So, we use these input-output examples to show the network what type of behaviour is expected. The BP algorithm allows the network to adapt by adjusting the weights, propagating the error value backwards through the network. Each link between neurons has a unique weighting value. The ‘intelligence’ of the network lies in the values of the weights. With each iteration of the errors flowing backwards, the weights are adjusted. The whole process is repeated for each of the example cases. Thus, to detect an object, programmers would train a neural network by rapidly sending across many digitized versions of data (for example, images) containing those objects. If the network did not accurately recognize a particular pattern, the weights would be adjusted. The eventual goal of this training is to get the network to consistently recognize the patterns that we recognize (e.g. cats).
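
Below is a small, self-contained sketch of such a training loop: a tiny feed-forward network learns the XOR function from labelled examples by propagating the error backwards and adjusting the weights. The layer sizes, learning rate and epoch count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # input examples
y = np.array([[0], [1], [1], [0]], dtype=float)               # known correct outputs

W1 = rng.normal(size=(2, 3))   # input -> hidden weights
W2 = rng.normal(size=(3, 1))   # hidden -> output weights
lr = 0.5                       # learning rate

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(10000):
    # forward pass: signals propagate layer by layer
    hidden = sigmoid(X @ W1)
    output = sigmoid(hidden @ W2)

    # backward pass: propagate the error and adjust the weights
    output_error = (output - y) * output * (1 - output)
    hidden_error = (output_error @ W2.T) * hidden * (1 - hidden)
    W2 -= lr * hidden.T @ output_error
    W1 -= lr * X.T @ hidden_error

print(np.round(output, 2))   # should approach [0, 1, 1, 0]
```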

How does Deep Learning help to solve intuitive problems?

The whole objective of Deep Learning is to solve ‘intuitive’ problems i.e. problems characterized by High dimensionality and no rules.  The above mechanism demonstrates a supervised learning algorithm based on a limited modelling of Neurons – but we need to understand more.

Deep learning allows computers to solve intuitive problems because:

  • With Deep learning, Computers can learn from experience but also can understand the world in terms of a hierarchy of concepts – where each concept is defined in terms of simpler concepts.
  • The hierarchy of concepts is built ‘bottom up’ without predefined rules by addressing the ‘representation problem’.

This is similar to the way a child learns what a dog is, i.e. by understanding the sub-components of the concept, e.g. the behaviour (barking), the shape of the head, the tail, the fur etc., and then putting these concepts together into one bigger idea, i.e. the dog itself.

The (knowledge) representation problem is a recurring theme in Computer Science.

Knowledge representation incorporates theories from psychology which look to understand how humans solve problems and represent knowledge. The idea is that if, like humans, computers could gather knowledge from experience, this would avoid the need for human operators to formally specify all of the knowledge that the computer needs to solve a problem.

For a computer, the choice of representation has an enormous effect on the performance of machine learning algorithms. For example, based on the sound pitch, it is possible to know if the speaker is a man, woman or child. However, for many applications, it is not easy to know what set of features represent the information accurately. For example, to detect pictures of cars in images, a wheel may be circular in shape – but actual pictures of wheels may have variants (spokes, metal parts etc). So, the idea of representation learning is to find both the mapping and the representation.

If we can find representations and their mappings automatically (i.e. without human intervention), we have a flexible design to solve intuitive problems. We can adapt to new tasks and we can even infer new insights without observation. For example, based on the pitch of a voice, we can infer an accent and hence a nationality. The mechanism is self-learning. Deep learning applications are best suited for situations which involve large amounts of data and complex relationships between different parameters. Training a neural network involves repeatedly showing it that: “Given an input, this is the correct output”. If this is done enough times, a sufficiently trained network will mimic the function you are simulating. It will also ignore inputs that are irrelevant to the solution. Conversely, it will fail to converge on a solution if you leave out critical inputs. This model can be applied to many scenarios, as we see below in a simplified example.

An example of learning through layers

Deep learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts. This approach works for subjective and intuitive problems which are difficult to articulate.

Consider image data. Computers cannot understand the meaning of a collection of pixels. Mappings from a collection of pixels to a complex Object are complicated.

With deep learning, the problem is broken down into a series of hierarchical mappings – with each mapping described by a specific layer.

The input (representing the variables we actually observe) is presented at the visible layer. Then a series of hidden layers extracts increasingly abstract features from the input, with each layer concerned with a specific mapping. However, note that this process is not predefined, i.e. we do not specify what the layers select.

For example: From the pixels, the first hidden layer identifies the edges

From the edges, the second hidden layer identifies the corners and contours

From the corners and contours, the third hidden layer identifies the parts of objects

Finally, from the parts of objects, the fourth hidden layer identifies whole objects

Image and example source: Yoshua Bengio book – Deep Learning
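
As an illustration of such a layer hierarchy, here is a sketch of a small convolutional network, assuming TensorFlow/Keras is installed. The input size, filter counts and 10-class output are placeholders; the key point is that the features each layer extracts (edges, contours, object parts) are learned during training rather than specified by the programmer.

```python
from tensorflow.keras import layers, models

# Sketch only: layer sizes are illustrative, and what each layer actually
# learns to detect is determined by training, not by these comments.
model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),  # pixels -> edge-like features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),   # edges -> corners and contours
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # contours -> parts of objects
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),    # parts -> whole-object classes
])
model.summary()
```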

Implications for IoT

To recap:

  • Deep learning algorithms apply to many areas including Computer Vision, Image recognition, pattern recognition, speech recognition, behaviour recognition etc
  • Deep learning systems are possible to implement now because of three reasons: High CPU power, Better Algorithms and the availability of more data. Over the next few years, these factors will lead to more applications of Deep learning systems.
  • Deep learning applications are best suited for situations which involve large amounts of data and complex relationships between different parameters.
  • Solving intuitive problems: Training a Neural network involves repeatedly showing it that: “Given an input, this is the correct output”. If this is done enough times, a sufficiently trained network will mimic the function you are simulating. It will also ignore inputs that are irrelevant to the solution. Conversely, it will fail to converge on a solution if you leave out critical inputs. This model can be applied to many scenarios

In addition, we have limitations in the technology. For instance, we have a long way to go before a Deep learning system can figure out that you are sad because your cat died (although it seems CogniToys, based on IBM Watson, is heading in that direction). The current focus is more on identifying photos or guessing a person’s age from photos (as with Microsoft’s Project Oxford API).

And we indeed have a way to go, as Andrew Ng reminds us when he asks us to think of Artificial Intelligence as building a rocket ship:

“I think AI is akin to building a rocket ship. You need a huge engine and a lot of fuel. If you have a large engine and a tiny amount of fuel, you won’t make it to orbit. If you have a tiny engine and a ton of fuel, you can’t even lift off. To build a rocket you need a huge engine and a lot of fuel. The analogy to deep learning [one of the key processes in creating artificial intelligence] is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.”

Today, we are still limited by technology from achieving scale. Google’s neural network that identified cats had 16,000 nodes. In contrast, a human brain has an estimated 100 billion neurons!

There are some scenarios where back-propagation neural networks are well suited:

  • A large amount of input/output data is available, but you’re not sure how the input relates to the output. Thus, we have a large number of “Given an input, this is the correct output” type scenarios which can be used to train the network, because it is easy to create a number of examples of correct behaviour.
  • The problem appears to have overwhelming complexity. The complexity arises from a low rule base, high dimensionality and data which is not easy to represent. However, there is clearly a solution.
  • The solution to the problem may change over time, within the bounds of the given input and output parameters (i.e., today 2+2=4, but in the future we may find that 2+2=3.8) and Outputs can be “fuzzy”, or non-numeric.
  • Domain expertise is not strictly needed because the output can be purely derived from inputs: This is controversial because it is not always possible to model an output based on the input alone. However, consider the example of stock market prediction. In theory, given enough cases of inputs and outputs for a stock value, you could create a model which would predict unknown scenarios if it was trained adequately using deep learning techniques.
  • Inference:  We need to infer new insights without observation. For example, based on the pitch of the sound – we can infer an accent and hence a nationality

Given an IoT domain, we could consider the top-level questions:

  • What existing applications can be complemented by Deep learning techniques by adding an intuitive component? (e.g. in smart cities)
  • What metrics are being measured and predicted? And how could we add an intuitive component to the metric?
  • What applications exist in Computer Vision, image recognition, pattern recognition, speech recognition, behaviour recognition etc. which also apply to IoT?

Now, extending more deeply into the research domain, here are some areas of interest that I am following.

Complementing Deep Learning algorithms with IoT datasets

In essence, these techniques/strategies complement Deep learning algorithms with IoT datasets.

1) Deep learning algorithms and Time series data: Time series data (coming from sensors) can be thought of as a 1D grid taking samples at regular time intervals, while image data can be thought of as a 2D grid of pixels. This allows us to model Time series data with Deep learning algorithms (most sensor / IoT data is time series data). It is relatively less common to explore Deep learning and Time series together – but there are already some instances of this approach (Deep Learning for Time Series Modelling to predict energy loads using only time and temperature data). A minimal windowing sketch follows after this list.

2) Multiple modalities: multimodality in Deep learning algorithms is being explored – in particular, cross-modality feature learning, where better features for one modality (e.g. video) can be learned if multiple modalities (e.g. audio and video) are present at feature-learning time.

3) Temporal patterns in Deep learning: in their recent paper, Ph.D. student Huan-Kai Peng and Professor Radu Marculescu, from Carnegie Mellon University’s Department of Electrical and Computer Engineering, propose a new way to identify the intrinsic dynamics of interaction patterns at multiple time scales. Their method involves building a deep-learning model that consists of multiple levels; each level captures the relevant patterns of a specific temporal scale. The newly proposed model can also be used to explain how short-term patterns relate to long-term patterns. For example, it becomes possible to describe how a long-term pattern on Twitter can be sustained and enhanced by a sequence of short-term patterns, including characteristics like popularity, stickiness, contagiousness and interactivity. The paper can be downloaded HERE.
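
Returning to point 1) above, here is a minimal sketch of the ‘1D grid’ view of sensor data: a synthetic time series is cut into fixed-length windows, each paired with the next reading – the usual framing before feeding the data to a deep learning (or any other supervised) model.

```python
import numpy as np

# Synthetic sensor readings standing in for real IoT data.
readings = np.sin(np.linspace(0, 20, 200)) + 0.05 * np.random.randn(200)

def make_windows(series, window=24):
    """Each training example is `window` past samples; the target is the next sample."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

X, y = make_windows(readings)
print(X.shape, y.shape)   # e.g. (176, 24) input windows and 176 targets
```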

Implications for Smart cities

I see Smart cities as an application domain for the Internet of Things. Many definitions exist for Smart cities/future cities. From our perspective, Smart cities refer to the use of digital technologies to enhance performance and wellbeing, to reduce costs and resource consumption, and to engage more effectively and actively with citizens (adapted from Wikipedia). Key ‘smart’ sectors include transport, energy, healthcare, water and waste. A more comprehensive list of Smart City/IoT application areas is: intelligent transport systems (including automatic and autonomous vehicles), medical and healthcare, environment, waste management, air quality, water quality, accident and emergency services, and energy including renewables. In all these areas we could find applications to which we could add an intuitive component based on the ideas above.

Typical domains will include Computer Vision, image recognition, pattern recognition, speech recognition and behaviour recognition. Of special interest are new areas such as self-driving cars – e.g. the Lutz pod – and even larger vehicles such as self-driving trucks.

Conclusions

Deep learning involves learning through layers, which allows a computer to build a hierarchy of complex concepts out of simpler concepts. Deep learning is used to address intuitive applications with high dimensionality. It is an emerging field and over the next few years, due to advances in technology, we are likely to see many more applications in the Deep learning space. I am specifically interested in how IoT datasets can be used to complement Deep learning algorithms. This is an emerging area with some examples shown above. I believe that it will have widespread applications, many of which we have not yet fully explored (as in the Smart city examples).

I see this article as part of an evolving theme. Future updates will explore how Deep learning algorithms could apply to IoT and Smart city domains. Also, I am interested in complementing Deep learning algorithms using IoT datasets.

I elaborate these ideas in the Data Science for Internet of Things program (modelled on the course I teach at Oxford University and UPM – Madrid). I will also present these ideas at the International Conference on City Sciences at Tongji University in Shanghai and at the Data Science for IoT workshop at the IoT World event in San Francisco.

Please connect with me on LinkedIn if you want to stay in touch and receive future updates.

Does the ‘app economy’ still exist?


Something extraordinary happened last week

An app (Meerkat), which was a ‘massive hit’ at SXSW and which had launched only two months earlier, raised $14m in funding.

Three days after that, its popularity plunged rapidly following the launch of Twitter’s Periscope.

Probably never to return to its height.

A few days after that, Meerkat and Periscope were neck and neck.

In two months, an app went from launch to funding ($14m) to plunge.

Some blame the Tech journalists – and there is some truth in that.

A whole ecosystem has grown up to support the ‘app economy’ – including the VCs, tech journalists, conference creators, hackathons and industry analysts who rank apps.

Sentiment changes rapidly.

Now, some articles call it Schrödinger’s meerkat (is it dead or is it alive?).

Others have taken to defending the tech journalists themselves, e.g. from the Guardian: ‘Tech journalists may have been wrong about Meerkat but they’re right to get excited about new apps’.

But there is a wider question here ..

App uptake metrics (e.g. downloads) have become a bit like the obsessions of the dot-com era.

There is a lot of activity but it is transient (as we see in the case of Meerkat) because the value no longer lies in the App itself.

For long term success, the value (if it exists) lies beyond the app.

Here are some reasons why the app economy dynamic is changing and value is shifting away from the app:

a) Even when the app has been poor, the company has done well when the value lay beyond the app. The best example of this is LinkedIn – whose app and website I find frustrating. I sometimes need to use wikiHow to understand even the basics, such as deleting a contact. The app could be a lot better – but we still use LinkedIn despite the app.

b) APIs are becoming increasingly important and are managing much of the complexity, for example healthcare APIs. The app then becomes a simple interface – the APIs do the work.

c) ‘App-only’ brands are hard to sustain and expand: unlike LinkedIn – where the value lies beyond the app – for Rovio (Angry Birds) the product (and the value) was in the app itself. And 2014 was a bad year for Rovio. It’s unclear if the popularity of the brand will ever return.

d) Content has a fleeting timescale and it’s getting even shorter: diminishing popularity timescales apply to all online content. Gangnam Style broke the YouTube popularity counter – but look again. Gangnam Style was launched in July 2012. Google Trends shows that it peaked in Dec 2012 – with a precipitous drop soon after – and it has been dropping in popularity ever since (even as cumulative views increase). Content apps may have the same problem. Beyond the first year (or two), they appear to be from an older era, especially if the user base is younger. The Draw Something app had the same problem of a drop in popularity.

e) Asking ‘which apps do IoT developers use?’ is like focussing on the dashboard and ignoring the engine: it is the wrong question because it places more emphasis on the app than on the vertical (IoT). It’s like asking which web development technique a company uses for its website – does it matter? IoT is a hugely complex domain. The same will apply to automotive apps, healthcare apps, etc.

f) Apps are not open: coming back to Meerkat, Twitter’s move reminds us that apps and social media are not open. If Twitter does a deal with operators for ‘sponsored data’, that’s even worse for innovation like Meerkat (and I expect that type of deal will become increasingly common, further suppressing long-tail innovation).

Analysis:

Apps continue to drive Long tail innovation

But for the reasons mentioned above, there is a fundamental shift in the ecosystem

Value is now closely tied to the vertical

In some ways, it is a natural maturing of the ecosystem

But when tied to a specific vertical – the value apportioned to the app is relatively less

Knowledge of, and integration with, the vertical now become more important than the app in this maturing phase (leaving aside the openness issue).

For example, for IoT, IBM has bet $3 billion on IoT – but the focus is on analyzing data coming from many different devices.

The skillsets to do this are not the same as for the app – although there will undoubtedly be an app interface.

So, does the app economy still exist?

Increasingly, not in the form we know it (across verticals)

In a more maturing phase, we will see deeper integration with specific verticals.

For other forms of apps – there is no way to predict economic value even over short periods

PS – if you are interested in IoT, have a look at this (upskill to Big Data, Data Science and IoT).

We will also have an online version. Please contact me at ajit.jaokar at futuretext.com

Great to be on this list: The Internet of Things Landscape 2015: Top 100 Individuals and Brands


Great to be on this list http://www.onalytica.com/…/the-internet-of-things-landscap…/ (full list needs a free download) – I am No 90 (for individuals)

Good list of people and brands to follow

Data Science for Internet of Things course – London


Hello all

Over the past few years, I have been teaching a specialized course at Oxford University for Telecoms and Big Data

This year, I have also started teaching a new course for Data Science and IoT.

Here, we apply Predictive algorithms to IoT datasets.

It’s a complex course and we have currently launched it with a few corporates through Oxford.

Independent of the academic course, I have also launched a version with fablab London

The outline below gives you the approach, content and modules.

If you can commute to London and want to master Data Science for Internet of Things – have a look at London Data Science for IoT

Alternatively, we will have an online version for $600.

This course is ideally suited to developers who want to transition their career towards Data Science (with an emphasis on the Internet of Things).
By working with very small groups – I believe the program can truly make a difference
If you are interested in knowing more, please have a look at Data Science for IoT – London or please contact me for the Online version
I will also continue to share papers/research in this space as we develop more.

Book review: About Time Series Databases and a New look at Anomaly detection by Ted Dunning and Ellen Friedman

Introduction

This blog is a review of two books, both written by Ted Dunning and Ellen Friedman (published by O’Reilly) and available for free from the MapR site: About Time Series Databases: New ways to store and access data and A New Look at Anomaly Detection.

The MapR platform is a key part of the Data Science for the Internet of Things (IoT) course – University of Oxford, and I shall be covering these issues in my course.

In this post, I discuss the significance of time series databases from an IoT perspective based on my review of these books. Specifically, we discuss Classification and Anomaly detection, which often go together for typical IoT applications. The books are easy to read, with analogies like HAL (Space Odyssey), and I recommend them.

 

Time Series data

The idea of time series data is not new. Historically, time series data can be stored even in simple structures like flat files. The difference now is the huge volume of data and the future applications made possible by collecting this data – especially for IoT. These large-scale time series databases and applications are the focus of the book. Large-scale time series applications typically need a NoSQL database like Apache Cassandra, Apache HBase, MapR-DB etc. The book’s focus is Apache HBase and MapR-DB for the collection, storage and access of large-scale time series data.

Essentially, time series data involves measurements or observations of events as a function of the time at which they occurred. The airline ‘black box’ is a good example of time series data. The black box records data many times per second for dozens of parameters throughout the flight, including altitude, flight path, engine temperature and power, indicated air speed, fuel consumption and control settings. Each measurement includes the time it was made. The analogy applies to sensor data. Increasingly, with the proliferation of IoT, time series data is becoming more common and universal. The data acquired through sensors is typically stored in time series databases. The TSDB (time series database) is optimized for queries based on a range of time.

 

Time series data applications

Time series databases apply to many IoT use cases for example:

  • Trucking, to reduce taxes according to how much trucks drive on public roads (which sometimes incur a tax). It’s not just a matter of how many miles a truck drives but rather which miles.
  • A smart pallet can be a source of time series data that might record events of interest such as when the pallet was filled with goods, when it was loaded or unloaded from a truck, when it was transferred into storage in a warehouse, or even the environmental parameters involved, such as temperature.
  • Similarly, commercial waste containers, called dumpsters in the US, could be equipped with sensors to report on how full they are at different points in time.
  • Cell tower traffic can also be modelled as a time series and anomalies like flash crowd events that can be used to provide early warning.
  • Data Center Monitoring can be modelled as a Time series to predict outages and plan upgrades
  • Similarly, Satellites, Robots and many more devices can be modelled as Time series data

From these readings captured in a Time Series database, we can derive analytics such as:

Prognosis: What are the short- and long-term trends for some measurement or ensemble of measurements?

Introspection: How do several measurements correlate over a period of time?

Prediction:  How do I build a machine-learning model based on the temporal behaviour of many measurements correlated to externally known facts?

Introspection:  Have similar patterns of measurements preceded similar events?

Diagnosis:  What measurements might indicate the cause of some event, such as a failure?

 

Classification and Anomaly detection for IoT

The books give examples of the use of Anomaly detection and Classification for IoT data.

For Time series IoT based readings, anomaly detection and Classification go together. Anomaly detection determines what normal looks like, and how to detect deviations from normal.

When searching for anomalies, we don’t know what their characteristics will be in advance. Once we know characteristics, we can use a different form of machine learning i.e. classification

Anomaly in this context just means different from what is expected – it does not refer to desirable or undesirable. Anomaly detection is a discovery process to help you figure out what is going on and what you need to look for. The anomaly-detection program must discover interesting patterns or connections in the data itself.

Anomaly detection and classification go together when it comes to finding a solution to real-world problems. Anomaly detection is used first, in the discovery phase – to help you figure out what is going on and what you need to look for. You could use the anomaly-detection model to spot outliers, then set up an efficient classification model to assign new examples to the categories you’ve already identified. You then update the anomaly detector to consider these new examples as normal, and repeat the process. A minimal sketch of this two-phase workflow follows below.
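
Here is a small sketch of that two-phase workflow, assuming scikit-learn is available and using synthetic two-dimensional sensor readings. Phase one uses an Isolation Forest (one common anomaly detector, standing in for whichever detector you prefer) to flag outliers without labels; phase two trains an ordinary classifier on the categories discovered.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
normal = rng.normal(loc=20.0, scale=1.0, size=(500, 2))   # e.g. temperature, vibration
faults = rng.normal(loc=26.0, scale=1.0, size=(20, 2))    # injected anomalous readings
readings = np.vstack([normal, faults])

# Phase 1: anomaly detection (discovery) - no labels needed
detector = IsolationForest(contamination=0.05, random_state=0)
flags = detector.fit_predict(readings)            # -1 = anomaly, 1 = normal

# Phase 2: once the anomalies have been inspected and labelled, train a classifier
labels = (flags == -1).astype(int)
classifier = LogisticRegression().fit(readings, labels)
print(classifier.predict([[20.5, 19.8], [26.5, 25.9]]))   # expect roughly [0 1]
```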

The book goes on to give examples of the use of these techniques on EKG data.

For example, for the challenge of finding an approachable, practical way to model normal for a very complicated curve such as the EKG, we could use a type of machine learning known as deep learning.

Deep learning involves letting a system learn in several layers, in order to deal with large and complicated problems in approachable steps. Curves such as the EKG have repeated components separated in time rather than superposed. We take advantage of the repetitive and separated nature of an EKG curve in order to accurately model its complicated shape and detect normal patterns using Deep learning.

The book also refers to a data structure called t-digest for accurate calculation of extreme quantiles. t-digest was developed by one of the authors, Ted Dunning, as a way to accurately estimate extreme quantiles for very large data sets with limited memory use. This capability makes t-digest particularly useful for selecting a good threshold for anomaly detection. The t-digest algorithm is available in Apache Mahout as part of the Mahout math library. It’s also available as open source at https://github.com/tdunning/t-digest
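
The snippet below is only a simplified stand-in for what t-digest is used for here: choosing an extreme quantile of a metric as an anomaly threshold. t-digest does this with bounded memory on streaming data; NumPy’s percentile needs all the data in memory and is shown purely to illustrate the idea on synthetic latencies.

```python
import numpy as np

rng = np.random.default_rng(2)
latencies_ms = rng.lognormal(mean=3.0, sigma=0.4, size=100_000)  # synthetic metric

threshold = np.percentile(latencies_ms, 99.9)     # extreme quantile as the threshold
anomalies = latencies_ms[latencies_ms > threshold]
print(f"threshold={threshold:.1f} ms, flagged={anomalies.size} readings")
```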

 

Anomaly detection is a complex field and needs a lot of data.

For example: what happens if you only save a month of sensor data at a time, but the critical events leading up to a catastrophic part failure happened six weeks or more before the event?

IoT from a large scale Data standpoint

To conclude, much of the complexity for IoT analytics comes from the management of Large scale data.

Collectively, Interconnected Objects and the data they share make up the Internet of Things (IoT).

Relationships between objects and people, between objects and other objects, conditions in the present, and histories of their condition over time can be monitored and stored for future analysis, but doing so is quite a challenge.

However, the rewards are also potentially enormous. That’s where machine learning and anomaly detection can provide a huge benefit.

For Time series, the book covers themes such as:

  • Storing and Processing Time Series Data
  • The Direct Blob Insertion Design
  • Why Relational Databases Aren’t Quite Right
  • Architecture of Open TSDB
  • Value Added: Direct Blob Loading for High Performance
  • Using SQL-on-Hadoop Tools
  • Using Apache Spark SQL
  • Advanced Topics for Time Series Databases (Stationary Data, Wandering Sources, Space-Filling Curves)

For Anomaly detection:

  • Windows and Clusters
  • Anomalies in Sporadic Events
  • Website Traffic Prediction
  • Extreme Seasonality Effects
  • etc.

 

Links again:

About Time Series Databases: New ways to store and access data and A New Look at Anomaly Detection by Ted Dunning and Ellen Friedman (published by O’Reilly).

Also, the link for the Data Science for the Internet of Things (IoT) course – University of Oxford, where I hope to cover these issues in more detail in the context of MapR.

Data Science for Internet of Things (IoT) course – University of Oxford

I am pleased to announce a unique course  – Data Science for the Internet of Things (IoT) course – University of Oxford

We are launching first with very limited places. We are already collaborating with MapR, Sigfox, Hypercat, Red Ninja and many others.

So the course will be based on practical insights from current systems

Everyone finishing the course will receive a University of Oxford certificate showing that they have completed the course

Course is fully online

Have a look at Data Science for the Internet of Things (IoT) course – University of Oxford for more.

Feedback is welcome and I will update a lot more over the next few weeks.

If you want to avail of this unique certification, please email me for more information: ajit.jaokar at futuretext.com

Infographic: Fascinating Advancements in Electrical/Computer Engineering by Ohio State University


Programming for Data Science the Polyglot approach: Python vs. R OR Python + R + SQL


In this post, I discuss a possible new approach to teaching Programming for Data Science.

Programming for Data Science is focussed on the R vs. Python question.  Everyone seems to have a view including the venerable Nature journal (Programming – Pick up Python).

Here, I argue that we should look beyond the Python vs. R debate and look to teach R, Python and SQL together. To do this, we need to look at the big picture first (the problem we are solving in Data Science) and then see how that problem is broken down and solved by different approaches. In doing so, we can more easily master multiple approaches and then even combine them if needed.

On first impressions, this Polyglot approach (ability to master multiple languages) sounds complex.

Why teach 3 languages together?  (For simplicity – I am including SQL as a language here)

Here is some background

Outside of Data Science, I also co-founded a social enterprise to teach Computer Science to kids, Feynlabs. At Feynlabs, we have been working on ways to accelerate learning to code. One way to do this is to compare and contrast multiple programming languages. This approach makes sense for Data Science also, because a learner can potentially approach Data Science from many directions.

To learn programming for Data Science, it would thus help to build up from an existing foundation the learner is already familiar with and then relate new ideas to this foundation through other approaches. From a pedagogical standpoint, this approach is similar to that of David Ausubel, who stressed the importance of prior knowledge in being able to learn new concepts: “The most important single factor influencing learning is what the learner already knows.”

But first, we address what problem we are trying to solve and how that problem can be broken down.

I also propose to make this approach part of the Data Science for IoT course/certification, but I also expect to teach it as a separate module – probably in a workshop format in London and the USA. If you are interested in knowing more, please sign up on the mailing list HERE.

Data Science – the problem we are trying to solve

Data science involves the extraction of knowledge from data. Ideally, we need lots of data from a variety of sources.  Data Science lies at the intersection of multiple disciplines: Programming, Statistics, Algorithms, Data analysis etc. The quickest way to solve Data Science problems is to start analyzing data as soon as possible. However, Data Science also needs a good understanding of the theory – especially the machine learning approaches.

A Data Scientist typically approaches a problem using a methodology like OSEMN (Obtain, Scrub, Explore, Model, Interpret). Some of these steps are common to a classic data warehouse and are similar to the classic ETL (Extract Transform Load) approach. However, the modelling and interpreting stages are unique to Data Science. Modelling needs an understanding of Machine Learning algorithms and how they fit together. For example: Unsupervised algorithms (Dimensionality reduction, Clustering) and Supervised algorithms (Regression, Classification).

To understand Data Science, I would expect some background in Programming. Certainly, one would not expect a Data Scientist to start from “Hello World”. But on the other hand, the syntax of a language is often over-rated. Languages have quirks – and they are easy to get around with most modern tools.

So, if we try to look at the problem / big picture first (e.g. the Obtain, Scrub, Explore, Model and Interpret stages), it is easier to fit the programming languages into the stages. Machine Learning has two phases: the Model Building phase and the Prediction phase. We first build the model (often in batch mode – and it takes longer). We then perform predictions on the model in a dynamic/real-time mode. Thus, to understand Programming for Data Science, we can divide the learning into four stages: the tool itself (IDE), data management, modelling and visualization.

Tools, IDE and Packages

After understanding the base syntax, it’s easier to understand a language in terms of its packages and libraries. Both Python and R have a vast number of packages (such as Statsmodels) – often distributed as libraries (scikit-learn). Both languages are interpreted. Both have good IDEs, such as Spyder and iPython for Python and RStudio for R. If using Python, you would probably use a library like scikit-learn and a distribution of Python such as the Anaconda distribution. With R, you would use RStudio and install specific packages using R’s CRAN package management system.

Data management

Apart from R and Python, you would also need to use SQL. I include SQL because SQL plays a key role in the data scrubbing stage. Some have called this stage the janitor work of Data Science, and it takes a lot of time. SQL also plays a part in SQL-on-Hadoop approaches like Apache Drill, which allow users to write SQL queries on data stored in Hadoop and receive results.

With SQL, you are manipulating data in Sets. However, once the data is inside the Programming environment, it is treated differently depending on the language.
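
A small sketch of this hand-off, using an in-memory SQLite table of made-up sensor readings: the set-based scrubbing happens in SQL, and the clean result is pulled into pandas for the later stages. The same pattern applies to SQL-on-Hadoop tools such as Apache Drill.

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (device_id TEXT, ts TEXT, temp_c REAL);
    INSERT INTO readings VALUES
        ('d1', '2015-05-01 10:00', 21.4),
        ('d1', '2015-05-01 10:05', NULL),   -- missing reading to scrub out
        ('d2', '2015-05-01 10:00', 19.8);
""")

# Set-based scrubbing in SQL, then hand the clean rows to the analysis environment
clean = pd.read_sql_query(
    "SELECT device_id, ts, temp_c FROM readings WHERE temp_c IS NOT NULL", conn)
print(clean)
```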

In R, everything is a vector and R data structures and functions are vectorized. This means most functions in R work on vectors (i.e. on all the elements, not on individual elements in a loop). Thus, in R, you read your data into a data frame and use a built-in model (here are the steps / packages for linear regression). In Python, if you did not use a library like scikit-learn, you would need to make many decisions yourself, and that can be a lot harder. However, with a package like scikit-learn, you get a consistent, well-documented interface to the models. That makes your job a lot easier by letting you focus on the usage.
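
For example, here is a minimal scikit-learn sketch of that consistent fit/predict interface on made-up data (the R equivalent would be a one-line lm() call on a data frame):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # feature, e.g. hour of day
y = np.array([2.1, 3.9, 6.2, 7.8])           # target, e.g. energy load

model = LinearRegression().fit(X, y)         # model building phase
print(model.predict([[5.0]]))                # prediction phase, roughly 10
```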

Data Exploration and Visualization

After the data modelling stage, we come to data exploration and visualization. Here, for Python, the pandas package is a powerful tool for data exploration. Here is a simple and quick intro to the power of Python pandas (YouTube video). Similarly, R uses the dplyr and ggplot2 packages for data exploration and visualization.
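
A short pandas sketch of the exploration stage on made-up sensor data (dplyr and ggplot2 play the equivalent role on the R side; the plot call assumes matplotlib is installed):

```python
import pandas as pd

df = pd.DataFrame({
    "device": ["d1", "d1", "d2", "d2"],
    "temp_c": [21.4, 22.0, 19.8, 20.1],
})

print(df.describe())                          # quick summary statistics
print(df.groupby("device")["temp_c"].mean())  # aggregate by device
df.plot(y="temp_c", kind="line")              # simple visualization (needs matplotlib)
```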

A moving goalpost and a Polyglot approach

Finally, much of this discussion is a rapidly moving goalpost. For example, in R, large calculations need the data to be loaded in a matrix (e.g. n×n matrix manipulation). But with platforms like Revolution Analytics, that can be overcome. Especially with the acquisition of Revolution Analytics by Microsoft – and with Microsoft’s history of creating good developer tools – we can expect that development in R will be simplified.

Also, since both R and Python operate in the context of Hadoop for Data Science, we would expect to leverage the Hadoop architecture through HDFS connectors, both for Python Hadoop frameworks and for R Hadoop integration. One could also argue that we are already living in a post-Hadoop/MapReduce world with Spark and Storm, especially for real-time calculations, and that at least some Hadoop functions may be replaced by Spark.

Here is a good introduction to Apache Spark and a post about Getting started with Spark in Python. Interestingly, the Spark programming guide includes integration with three languages (Scala, Java and Python) but not R. But the power of open source means we have SparkR, which integrates R with Spark.
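
As a minimal taste of Spark from Python (assuming a local Spark installation), here is the classic map/reduce pattern over a distributed collection; the same pattern scales to much larger datasets on a cluster:

```python
from pyspark import SparkContext

sc = SparkContext("local", "spark-intro")         # local mode for experimentation
values = sc.parallelize(range(1, 1001))           # distribute some sample values
result = values.map(lambda x: x * x).reduce(lambda a, b: a + b)
print(result)
sc.stop()
```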

The approach of covering multiple languages has some support – for instance, with the Beaker notebook. You could also achieve the same effect by working on the command line, for example as in Data Science at the Command Line.

Conclusions

Even in a brief blog post, you can get a lot of insight when you look at the wider problem of Data Science and compare how different approaches address segments of that problem. You just need to get the bigger picture of how these languages fit together for Data Science and understand the major differences (for example, vectorization in R).

The use of good IDEs, packages, etc. softens the impact of the programming itself.

It then changes our role, as Data Scientists, to mixing and matching a palette of techniques as APIs – sometimes spanning languages.

I hope to teach this approach as part of Data Science for IoT course/certification

Programming for Data Science will also be a separate module/talk over the next few months at fablab london, the London IT contractors meetup group, CREATE Miami (a venture accelerator at Miami Dade College), the City Sciences conference in Shanghai (as part of a larger paper) and MCS Madrid.

For more schedules and details please sign up HERE