Data Science for Internet of Things Course – Big Picture and Outline

Over the last year and a half, I have been teaching and evolving the concept of Data Science for Internet of Things

Here is what our current course outline looks like (and the rationale behind the approach). Comments welcome.

If you are interested in being a part of the next batch, please contact me at

Data Science for Internet of Things is based on time series data from IoT devices – but with three additional techniques: Deep learning, Sensor fusion (Complex Event Processing) and Streaming.

We consider Deep learning because we treat cameras as sensors, but we also include reinforcement learning with neural networks for IoT devices.

The course is based on templates (code) for the above in R, Python and Spark (Scala). It is hence suited to people with a Programming background (even if in other languages).

The ideas learnt in the core modules are implemented in Projects. Projects could last as long as six months

The diagram is representative of the course (not an application per se). It shows the core modules (e.g. time series). The advanced modules (e.g. Sensor fusion) are built on these.

Much of our work has been published in leading blogs like KDnuggets and Data Science Central etc

The course has evolved based on active contributions from participants, e.g. Jean Jacques Barnard (methodology), Peter Marriot (Python), Sibanjan Das (H2O/Deep learning), Shiva Soleimani (methodology), Yongkang Gao (Nvidia TK1), Raj Chandrasekaran (Spark) and Vinay Mendiratta (systems level optimization of IoT sensors). We plan to open source most of our code.

We use Apache Spark for Streaming and Apache Flink for sensor fusion.

Ironically, due to the emphasis on Data, the course is not strictly an IoT course, i.e. we are concerned primarily with applying predictive learning algorithms to IoT datasets.

Finally, the course is personalized. I see it more as coaching than a course – for example, you can choose to focus on a smaller subset of topics, which is decided in the personal learning plan at the outset.

Interested? Email for details of the September batch (now in its fourth batch).

Data Science foundation for Programmers – One day workshop in London, Miami and New York




Data Science foundation for Programmers is a one day course that introduces Programmers to developing Data Science applications.

The hands-on course uses the R Programming language to introduce machine learning algorithms.

The program includes a one day workshop followed by a one week online session to complete the Programming Exercises.

Workshop Outline

What is Data Science

  •  An introduction to Data Science
  • Data Science process flow/steps
  • Machine Learning algorithms
  • How to choose an algorithm

The R Programming Language

  • Why should you learn R and who is using it
  • R in the ‘Big Data world’
  • R syntax(Assignments, Data Structures, Flow Control, Functions)
  • R packages – an overview
  • Loading and Handling Data in R
  • Example Datasets

Exploratory Data analysis

1) Understanding your Data:

In this section, we understand the characteristics of the data which help us later in choosing an algorithm. This includes

  • Summary in R
  • Distributions
  • Dimensions of Data – Mean, Standard deviation, Mode
  • Data correlations

2) Preprocessing Data :

Here, we cover the steps in preprocessing data, including Scale, Center, Standardize, Normalize and Principal Component Analysis.
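
As a rough illustration of the first four preprocessing steps, here is a minimal sketch in plain Python. The function names, the sample `readings` and the exact definitions (scale = divide by the standard deviation, normalize = rescale to the range 0 to 1) are my assumptions for illustration; the workshop itself uses R, and R packages may define these steps slightly differently.

```python
import statistics

def center(values):
    """Subtract the mean so the data is centred on zero."""
    mean = statistics.mean(values)
    return [v - mean for v in values]

def scale(values):
    """Divide by the sample standard deviation (assumed definition of 'scale')."""
    sd = statistics.stdev(values)
    return [v / sd for v in values]

def standardize(values):
    """Center, then scale: the result has mean 0 and standard deviation 1."""
    return scale(center(values))

def normalize(values):
    """Rescale to the range 0..1 (min-max normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sensor readings, e.g. temperatures
readings = [18.0, 20.0, 22.0, 24.0, 26.0]
centered = center(readings)
standardized = standardize(readings)
normalized = normalize(readings)
```

Centering and scaling matter most for algorithms that are sensitive to the magnitude of each feature, which is one reason they usually come before Principal Component Analysis.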

3) Visualizing Data :

In this section, we discuss techniques for data visualization in R

From Programming to Statistical Programming

  • Making Predictions – Supervised and unsupervised learning
  • Understanding Linear Regression
  • Non-linear regression techniques (e.g. Support vector machines, k-nearest neighbours, Decision trees)
  • Linear classification techniques (e.g. Logistic regression)
  • Non-linear classification techniques (e.g. Neural networks)
  • Model Evaluation

R in the wider context

  • R in the Big Data world – ex Apache Spark
  • Deep learning
  • R and Python

Dates and Venue:


July 22 9:30 am to 4:30 pm – venue in central London


Workshop one: Tuesday Aug 2 and Wednesday Aug 3 in the evening 5:30 to 8 pm

Workshop two: Saturday Aug 6 full day (9 am to 4:30 pm)

New York

Aug 10 and 11 in the evening 5:30 to 8 pm


$199 USD for New York and Miami and

£140 GBP + VAT for UK

For registration (including Online option) please contact


a) Workshops have limited places – please get in touch quickly if interested

b) Outline and Syllabus subject to change

c) The course is hands-on – you will need to bring your own laptop with R installed beforehand (instructions will be provided).

d) You do not need to already know R Programming but you must have some Programming background in a language.


Internet of Things world – the world’s largest IoT event

This year, as usual, we are proud to support IoT world in Santa Clara – the world’s largest IoT event

There is a great list of speakers

Topics I liked from the agenda include ..

  • How the Internet of Things Is Poised to Drive Business and Societal Transformation On a Global Scale
  • Revolutionizing Business Models; Driving Innovation & Embracing Disruption in a Future Connected World
  • Creating Value from connecting “things” Assessing the commercial feasibility and monetization of IoT
  • How will the “Internet of Things” remake your industry?
  • How secure is IoT in the home
  • Examining the role of Big data, Processing and Analytics for the Industrial Internet
  • The future of mobility: Assessing long term investment in car tech and how the industry will change over the next 10 years
  • The competing visions on the future of the connected car: Which horse should you bet on?
  • Examining the Smart Watch Trends, Timelines & Predictions
  • The Rise to Smarter Health Through Wearables
  • Examining Advancements and Adoption of Industrial Robotics
  • Building the “Smart Factory”
  • Showcasing a Connected City’s innovations and plans to move towards a smarter city
  • Transforming Public Services with IoT
  • Next Generation Data Visualization & Predictive Analytics
  • How is the Internet of Things changing healthcare?
  • Remote Patient Monitoring with IoT
  • Building IoT solutions and taking data analytics into the cloud
  • Fog Computing: A Platform for IoT & Analytics
  • Hacking IoT
  • Building a Business Case for More Integrations in the Home
  • IoT in Shipping
  • Monetizing the Smart City with Location and Context
  • Connected Aviation Opportunities – Understanding Passenger Experience
  • Energy, Environment & Agriculture
  • Data Analytics & Visualization for Energy and Utility Companies
  • Utility Provider Case Study: Making the move towards the connected home versus providing
  • Examining the security risks associated with smart meters and smart grids
  • Embracing IoT in Agriculture: A Monsanto Case Study
  • IoT and the Farm: Challenges and Goals for the Future
  • Delivering an effective UX for the consumer
  • Vision for IoT: How Image Sensors Are Set To Influence the Next Generation of Connected Devices
  • Is your data center ready for IoT?
  • Moving towards the Next Generation of IoT Data with Machine Learning
  • Panel: Examining the role of Big data, Processing and Analytics for the Industrial Internet

for the full agenda and speakers see IoT world in Santa Clara – the world’s largest IoT event

Deep Learning for Internet of Things Using H2O

Pleased to see this blog on KDnuggets, co-authored by me: Deep Learning for Internet of Things Using H2O



Deep Learning Applications for Smart cities

Background and Approach

This blog is based on my talk in London at the Connected City Summit on Deep Learning Applications for Smart cities. The talk is based on a forthcoming paper created with the help of my students at UPM/citysciences on the same theme. Please email me at  ajit.jaokar at  or follow me  @ajitjaokar  for more details.

Here are some notes on our approach:

  • When we speak of Machines – the media dramatizes the issue.  Yet,  city officials and planners plan for ten to twenty years in the future. They will have to consider many of these issues in a pragmatic way.
  • Deep Learning / Artificial Intelligence will impact many aspects of Smart cities. We decided to approach the subject in a pragmatic manner and to explore the impact of Deep Learning/AI technology on the lives of future citizens.

How could self-learning machines affect humanity in cities?

Initially, we started off with the usual Smart City approach i.e. domains such as Security – Transport – Health – Governance – Environment etc

Then, we were inspired by a statement “Man becomes the sex organs of the machine world – the bee of the plant world – enabling machines to evolve ever new forms” – Marshall McLuhan

It indicates that disruptive innovations like Deep Learning and AI cannot be viewed in silos. Instead, we decided to reframe the problem in a more disruptive way by asking these questions:

    What can Machines learn from Observations?

    What can Machines learn from Data?

    What impact does it have on new services, culture, citizens?

    What are the threats?

    How will the lives of future citizens be impacted through self learning machines?


The shortest introduction to Deep learning:

Here is a brief introduction to Deep Learning. I have previously written about the Evolution of Deep Learning models and An introduction to Deep Learning and its role for future cities.

Deep Learning can be seen as a specific form of Machine Learning that leads to creating Self Learning Machines. The whole objective of Deep Learning is to solve ‘intuitive’ problems, i.e. problems characterized by High dimensionality and no rules. With Deep learning, Computers can learn from experience and can also understand the world in terms of a hierarchy of concepts, where each concept is defined in terms of simpler concepts. The hierarchy of concepts is built ‘bottom up’ without predefined rules. This is similar to the way a child learns ‘what a dog is’, i.e. by understanding the sub-components of a concept (e.g. the behavior (barking), the shape of the head, the tail, the fur etc) and then putting these concepts together in one bigger idea, i.e. the Dog itself.

More specifically, Reinforcement Learning (RL), often combined with Deep Learning, is making a huge impact in systems such as AlphaGo. RL is based on a system of rewards. It is usually treated as a learning paradigm of its own, distinct from both supervised and unsupervised learning: an RL agent learns by receiving a reward or reinforcement from its environment, without any form of supervision other than its own decision making policy.

In machine learning, the environment is typically formulated as a Markov decision process (MDP) as many reinforcement learning algorithms for this context utilize dynamic programming techniques. The main difference between the classical techniques and reinforcement learning algorithms is that the latter do not need knowledge about the MDP and they target large MDPs where exact methods become infeasible. Reinforcement learning differs from standard supervised learning in that correct input/output pairs are never presented, nor sub-optimal actions explicitly corrected. Further, there is a focus on on-line performance, which involves finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). (adapted from wikipedia)
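
To make the idea of rewards and the exploration/exploitation balance concrete, here is a minimal sketch of tabular Q-learning in Python. The five-state corridor environment, the reward of 1 at the right-hand end, and the values of ALPHA, GAMMA and EPSILON are all assumptions made up for illustration; they are not taken from AlphaGo or from any particular library.

```python
import random

N_STATES = 5                  # states 0..4; state 4 is the goal (terminal)
ACTIONS = ["left", "right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Hypothetical environment: walk along a corridor, reward 1 on reaching the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def choose(q, state, rng):
    """Epsilon-greedy: explore with probability EPSILON, otherwise exploit."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    # Break ties at random so the agent does not get stuck early on
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = choose(q, state, rng)
            nxt, reward = step(state, action)
            # Q-learning update: no model of the MDP is needed, only observed transitions
            target = reward + GAMMA * max(q[nxt].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = train()
# Greedy policy for each non-terminal state
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

Note that the agent is never told that "right" is the correct action; it only ever sees the reward at the end of the corridor, and the value of earlier steps is learnt by propagating that reward backwards.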


Here are the trends we note from the themes noted above. Link sources from Home of AI info and the web

What are machines learning from Data and Observations?

  • New computer program first to recognize sketches more accurately than a human
  • Deep Learning Algorithm ‘Paints’ in the Style of Any Artist it Copies
  • New big data system developed at MIT is more intuitive than humans
  • Artificial intelligence breakthrough as intuition algorithm beats humans in data test
  • MIT Develops Device That Can See People Through Walls
  • Lie-detecting algorithm spots fibbing faces better than humans
  • Machines That Can See Depression on a Person’s Face
  • An algorithm aims to be able to replace human intuition
  • ‘Psychic Robot’ System Guesses Intentions From Your Movements
  • MIT’s intelligent drone can avoid crashes and fly at 30 MPH
  • Facebook working on AI that can tell what’s in photos
  • Computer Algorithms Could Aid Schizophrenia Diagnose
  • Robot Radiologists Will Soon Analyze Your X-Rays
  • Predicting change in the Alzheimer’s brain
  • A new computer program that can diagnose cancer in just two days!
  • Machine learning to help predict online gambling addiction
  • Predicting people’s daily activities with deep learning
  • MIT Scientists Create An AI System That Can Determine How Memorable Your Face Is
  • This Algorithm Is Better At Predicting Human Behaviour Than Humans Are
  • New Artificial Intelligence: Russia Endows Robots With Collective Mind
  • Scientist Develop New Machine Which Can Calculate Pattern Recognition with Near Human speed
  • Machine Vision Algorithm Learns to Recognize Hidden Facial Expressions
  • Artificial Intelligence: Scientists Developed a Handwriting Algorithm
  • Computer With Built-In Algorithm Beats Man In A Turing Test
  • Machine learning to differentiate between positive and negative emotions using pupil diameter


Self learning for Robots(from observation)

  • Giving robots a more nimble grasp
  • Why it is hard to teach robots to choose wisely
  • Machine learning plays vital role in the evolution of Man
  • Designing Robots That Learn as Effortlessly as Babies
  • How Robots Can Quickly Teach Each Other to Grasp New Objects
  • Why IBM just bought billions of medical images for Watson to look at
  • Read my lips: truly empathic robots will be a long time coming


Learning Culture, Humanity, emotions and ethics

  • Smart Programs Read Shakespeare
  • Artificial intelligence learns how to put together interactive stories just as good as a human
  • How do you teach a machine to be moral?
  • ‘Psychic Robot’ System Guesses Intentions From Your Movements
  • Lie detection software learns from real court cases
  • Why Helping Humanity Should Be Core to Learning
  • Could Artificial Morals and Emotions Make Robots Safer?
  • AI: In search of the sarcasm algorithm
  • Microsoft Teaches Computers To Be Funny
  • Microsoft’s Project Oxford Can Now Detect Emotions from Photos
  • Robots are learning to disobey humans: Watch as machine says ‘no’ to voice commands
  • Robots could be converted to religion someday: Scientists
  • Intimacy & Falling In Love With A Robot Could Happen In 50 Years Because Of Artificial …
  • Health
  • If We Want Humane AI, It Has to Understand All Humans
  • Humai Is Working On A Way To Bring Your Loved Ones Back From The Dead
  • Mum Robot Goes Darwinian on Her Kids

How does that (self) learning affect services and our lives in future cities

  • Artificial intelligence comes to toys
  • Beyond the Pill: Data Is the New Drug – Google Life Sciences Rebrands As Verily, Uses Big Data To Figure Out Why We Get Sick
  • Nvidia Aims To Power Flying Vehicles with Jetson TX1 Board
  • Motorcycle-riding robot may take on world champion racer
  • Meet Mercedes-Benz’s Vision Tokyo, a self-driving car for the megacity
  • How artificial intelligence could lead to self-healing airplanes
  • Trains with brains: how Artificial Intelligence is transforming the railway industry
  • A self-driving sailboat to patrol the oceans and monitor the environment
  • Malaysia testing ‘artificial intelligence’ for prisons
  • Real-Time Seizure Detection Possible with Learning Algorithm
  • Facebook Is Helping People With Blindness “See” the Photos on Their Walls
  • Mitsubishi Electric uses machine-learning tech to detect distracted drivers
  • Tinder matches made easy with new intelligent algorithm
  • Deep Learning Algorithm Successfully Identifies Potential Intracranial Haemorrhaging
  • An artificial intelligence based third Umpire
  • When children talk to toys, some are talking back
  • Predicting change in the Alzheimer’s brain
  • Robotic Automation Meets Agriculture
  • Food delivered by drones, driverless cabs and cyber PAs to organise your party: A revolution in …
  • AI will soon be forecasting the weather
  • How Artificial Intelligence Can Fight Air Pollution in China
  • Starfish-killing robot to protect Great Barrier Reef
  • Self-Driving Car Tech Allows Vehicle To ‘See’ Environment In Real Time
  • US Company On Plan To Bring People Back From Dead Using Artificial Intelligence
  • A trillion tiny robots in the cloud: The future of AI in an algorithm world
  • Teforia Is A Tea Brewing Robot That Uses Algorithms To Pour The Perfect Cup
  • Japanese artificial intelligence passes university exams (but still can’t quite get into the country’s …
  • Facebook AI built to help visually impaired people
  • Problem of Climate Change and Global Conflicts Can Be Solved Using Human and Computer …


Risks to humanity and cities

  • ‘Only movies build bad robots’ – famous last words?
  • Why human-in-the-loop computing is the future of machine learning
  • As Robots Steal Millennials’ Jobs, Young Workers Focus On Skills, Not Careers
  • Millions of jobs at risk from artificial intelligence
  • Davos report projects 5 million jobs will be lost to new technologies by 2020
  • Can Humanity Rein In The Rise Of The Machines?
  • Christian leader warns of ‘Frankenstein monsters’ due transhumanism
  • The rise of the killer robots — and why we need to stop them
  • Producer of Russia’s Armata T-14 plans to create army of AI robots
  • Inside the Pentagon’s Effort to Build a Killer Robot
  • How Technology Could Prevent Another Paris-Like Attack
  • Kaspersky deepens security offering through machine learning
  • Robots will declare war on humans within 25 years, claims artificial intelligence expert
  • Law firm bosses envision Watson-type computers replacing young lawyers
  • Hitachi Hires First ‘Artificial Intelligence’ Boss To Manage Workers

Conclusion and Evolution

We reframed the problem of Deep Learning and Smart cities by asking the Question:

How could self-learning machines affect humanity in cities?

    What can Machines learn from Observations?

    What can Machines learn from Data?

    What impact does it have on new services, culture, citizens?

    What are the threats?

Please contact me at ajit.jaokar at to know more updates – especially if you are a city official. We are also planning to explore the implementation of these ideas by working with companies like Nvidia.

I would also like to thank the students who helped me with this project.

Wind energy forecasting using R : Data Science for Internet of Things Project

Participants from the Data Science for Internet of Things course are working on some excellent projects.

Here is a great example -

Created by Vaijayanti – and other collaborators from the course

Project name: 4Wind

Domain:  Renewables/Wind energy

Problem statement: To make a day-ahead forecast of wind power (10-minute interval data)

Data: collected from around 5000 turbines

Duration: She has been working on it for more than a year, and the work is still ongoing

Programming Approach: Using R for data science to evaluate various models: ARIMA, Holt-Winters forecasting and neural networks. ACF and PACF plots reveal that 35 lags matter. Hence the data is recoded to have 35 columns of lags for both wind speed and wind power.
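
To illustrate what recoding a series into columns of lags means, here is a minimal Python sketch (the project itself uses R). The helper names `autocorrelation` and `make_lag_rows` and the tiny `wind_power` series are hypothetical, and only 3 lags are used so that the result is easy to read; the project uses 35.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at a given lag (the quantity shown in an ACF plot)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def make_lag_rows(series, n_lags):
    """Turn a series into (lag features, target) rows for a supervised model."""
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]  # lag 1 first
        rows.append((lags, series[t]))
    return rows

# Hypothetical 10-minute wind power samples
wind_power = [10, 12, 15, 14, 13, 17]
rows = make_lag_rows(wind_power, n_lags=3)
first_features, first_target = rows[0]
```

Each row pairs the previous `n_lags` samples with the value to be predicted, which is the shape a neural network or regression model expects.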

R packages: forecast, neuralnet, nnet

Challenges: Training neural networks has become heavy on a PC. Hence, we are experimenting with multiple options, including Distributed (Spark), AWS, Azure etc. The 35 lags were based on ACF and PACF analysis. However, this number could be different: more analysis might yield different lags as predictors. Also, we could use not just consecutive lags but also the samples from a day before and a year before as predictors.
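
For the day-before and year-before predictors, the lag offsets follow directly from the 10-minute sampling interval. The sketch below works them out in Python; the helper names are hypothetical and a 365-day year is assumed (leap years are ignored).

```python
from datetime import timedelta

SAMPLE_INTERVAL = timedelta(minutes=10)

def lag_in_samples(period):
    """How many 10-minute samples fit in the given period."""
    return period // SAMPLE_INTERVAL

day_lag = lag_in_samples(timedelta(days=1))      # 6 per hour * 24 hours
year_lag = lag_in_samples(timedelta(days=365))   # assumes a 365-day year

def seasonal_features(series, t, offsets):
    """Values at the given lag offsets before time t (None where history is too short)."""
    return [series[t - k] if t - k >= 0 else None for k in offsets]

# Hypothetical series where each value equals its own index
series = list(range(200))
features = seasonal_features(series, 150, [1, day_lag, year_lag])
```

The year-before lag needs more than a year of history for every turbine, so it is only usable on the later part of the data.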

UI: Using the Shiny package 

Publication/Open source methodology: Contribution to Data Science for IoT methodology and to forthcoming book on Data Science for IoT

Image source: wikipedia



The role of Constructivism in teaching Data Science for Internet of Things

The role of Constructivism in teaching Data Science


Constructivism, Data Science and Online Learning

I enjoy teaching.  I teach at University level (Oxford and UPM – Madrid). I also teach young people the principles of Computer Science using Space technology using a live Satellite in collaboration with Ardusat. Over the last decade, education itself has changed with more emphasis on Online education. This allows us (as teachers/tutors) the opportunity to explore new modes of  learning.  However, the principles of learning have not changed. I believe, in many ways, we now need to come back to the more traditional/timeless principles of teaching in a Digital world

This document explains the learning philosophy and the principles underpinning the Data Science for Internet of Things course. Specifically, it discusses a learning technique called Constructivism which I have been exploring for the teaching  of complex topics like Data Science for IoT. I  am thankful to Jean Jacques Barnard (a participant in the course) for his comments and feedback to this post.

Constructivism (constructivist learning philosophy) is based on learning mechanisms in which knowledge is internalized by learners through processes of accommodation and assimilation. The learner has an internal representation of a concept. They then encounter new ideas and concepts. These ideas and concepts are assimilated through learning. In doing so, they may incorporate new experiences into the existing framework. Alternately, if they find that the new experience contradicts their existing framework – they may evolve their internal representation or change their perception.  Thus, learning is an ongoing process of concept assimilation. We incorporate constructivist ideas into the course content, Projects and the methodology. Here are the implications and findings of this approach.

To summarize

a)   To acquire new knowledge in complex domains, the context is important. Specifically, the starting point of the Learner.

b)   New concepts and ideas have hierarchical dependencies and also lateral connections. These need to be incorporated into the learning process

c) The process is slow and personalized. It is not possible to use this approach in a mass/massive mode

d)  The Physical context matters. But Projects need to be seen in the wider context. Projects and Methodology go together. Projects provide the Physical context. The methodology provides the panoramic view to the problem in context of the Big Picture

e)  There is a broad structure to acquiring knowledge but not a fixed path



1)    Greater learning agility and flexibility

When I first met course participant Priscila Grison a few years ago in San Francisco, she mentioned an interesting Spanish language book by Julio Cortázar called Rayuela (in English, Hopscotch). The author has a unique structure for the book. An author’s note suggests that the book would best be read in one of two possible ways, either progressively from chapters 1 to 56 or by “hopscotching” through the entire set of 155 chapters according to a “Table of Instructions” designated by the author. Cortázar also leaves the reader the option of choosing a unique path through the narrative.

I find this idea fascinating, i.e. that the author should create one structure but the reader could assimilate it in more than one way. Why could that not be true of Online Education also? The ‘Educo’ in ‘Education’ is to draw from within. Online learning offers us that unique ability to learn from within – a concept now becoming more common through ideas like ‘Learning agility’, i.e. an individual must have the ability to learn, adapt, and apply in quick cycles. By using the constructivist philosophy, we incorporate a ‘structure’ (a linear flow of modules) but also an ‘unstructure’ through various means, such as Personal learning plans, choice of programming languages etc. This leads to greater learning agility.


2)    Implications for Projects in a constructivist sense

Projects are the core of our course – but we use the idea of Projects in a wider learning context. The key idea in constructivism is that the starting point is familiar to the learner. Hence, we spend a lot of time in the first few months trying to understand the learner’s current state of understanding. After this, the next phase is spent on the core modules, which relate new ideas to the existing understanding, thereby building on existing concepts. In turn, these are further expanded in the project. The project itself is set against a methodology, Creating an Open methodology for Internet of Things – IoT analytics and Data Science (created by me, Jean Jacques and another course participant, Shiva Soleimani), which shows the learner how their knowledge fits into the wider context of the problem.


3) Acquiring new and complex skills

Constructivism is well suited to acquiring new and complex skills because new knowledge is both hierarchical and laterally related. For example, PCA needs an understanding of Eigenvalues, which in turn is based on Matrix multiplication. Similarly, many concepts in Python and R are related (lateral correlation of new concepts).
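
The hierarchy in the PCA example can be seen in a few lines of code. The sketch below, in plain Python, builds up from matrix-vector multiplication to the dominant eigenvalue (via power iteration) to the first principal component. Power iteration is my choice for illustration, not necessarily how R or Python libraries compute PCA, and the function names and sample `points` are hypothetical.

```python
import math

def mat_vec(matrix, vector):
    """Level 1: matrix multiplication (here, matrix times vector)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def covariance_matrix(points):
    """Sample covariance matrix of the centred data."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    centered = [[p[d] - means[d] for d in range(dims)] for p in points]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]

def dominant_eigenpair(matrix, iterations=100):
    """Level 2: largest eigenvalue and its eigenvector by repeated multiplication.

    Assumes the starting vector is not orthogonal to the dominant eigenvector.
    """
    vec = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec

# Level 3: the first principal component is the dominant eigenvector
# of the covariance matrix. These points lie on the line y = x.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
eigenvalue, component = dominant_eigenpair(covariance_matrix(points))
```

For these points the first principal component points along the diagonal, which is where all the variance lies. A learner who is comfortable with `mat_vec` can follow every later step.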


4) Not Massive (Not a MOOC)

The strategy of Constructivism will not scale, because it needs greater direct engagement with the participants. That may be for the good! I have never heard a teacher use the word ‘Massive’ in the context of teaching. VCs use ‘Massive’ for MOOCs, i.e. Massive Open Online Courses – but then again VCs have been known to search for mythical beasts like Unicorns. If you really think about it, the word Massive is associated with teaching only as a Business model. We do not believe that ‘Massive’ helps teaching because it is really not possible to give individual attention in the massive model.

5) Not Free but Affordable

An education system based on Constructivism will not be free, but it can be affordable. To create a large/Massive system we need a free service with minimal personal engagement. In contrast, everyone remembers from their childhood at least one teacher who took an extra interest in their work. Such personal attention is not possible on a large scale. Instead of free, a more affordable model is possible. Such models are not likely to be funded by traditional VCs. But they could still be viable, niche institutions.


In industry, the analogy would be the German concept of Mittelstand companies (common in Germany, Austria and Switzerland). Mittelstand companies have no real equivalent outside the German-speaking countries. It would be wrong to call them SMEs (Small to medium enterprises) because they operate on a specific ethos which includes qualities such as: Long-term focus, Independence, Emotional attachment to the business, Investment into the workforce, Flexibility, Lean hierarchies, Innovation, technical excellence, craftsmanship, Social responsibility etc. The point here is: It is possible to create a viable venture on the Web which serves a niche customer base, taking a long term view. This model is not so common, but is still very achievable. That’s our aspiration.

6) Community and personalization

This strategy makes us rethink the word ‘community’. Everyone on the Web likes to think of themselves as a ‘community’. But this is yet another word that has morphed into a different meaning in a commercial context. Take the case of LinkedIn. Is LinkedIn a ‘community’? Or is it more a case of many people trying to sell things to others (with very little in common other than that function)? On the Web, communities reduce to forums where the involvement of the tutor is minimised. But with a smaller group, it is possible to have calls, meetings etc. So, we encourage a lot more communication between the tutor and the participant, i.e. actually speaking, Skype calls etc. Again, you cannot do this on a mass scale.

To Conclude ..

To conclude, all this sounds less like training and more like going back to the first Principles of Teaching, or even Coaching, for complex topics such as Data Science for IoT. Going back to timeless values (even when they don’t quite scale!) may well be called for in the next generation. Dr Hessa Al Jaber, ex Minister of ICT for Qatar (with whom I was honoured to work at the World Economic Forum), says in an insightful post, Why my daughter’s generation faces challenges mine didn’t: “In the midst of this digital age, and global connectivity fueling all of these conflicting ideas, her generation might start to feel lost. Values that seemed immutable have begun to shift.”

I  welcome comments (ajit.jaokar at ). I believe this is a new approach – and it’s a learning experience for us all!

Image:  Jean Piaget – the founder of Constructivism – source – Wikipedia

MWC 2016: 3 developments to watch which impact IoT analytics

Introduction

I have attended the Mobile World Congress for the last 7 years (5 of them as a speaker/panellist).

This year, I did not go. Partly, this is reflective of my change in business focus (to Data Science for Internet of Things course) .

But I also believe that today, MWC does not reflect disruptive innovation. The show has more than 100,000 participants, but we do not see fundamental game-changing innovations like DeepMind’s work on Go from Google last week.

Thus, in keeping with my current focus, this blog highlights developments to watch from MWC 2016 which impact IoT analytics. From an IoT / IoT analytics standpoint, the Telecoms industry would only play a major role with 5G (beyond 2020), e.g. see The rise of 5G: the network for the Internet of Things and The plans for 5g to power the Internet of Things

Before I list the areas that impact IoT analytics, here is some background. These factors have remained the same for most of the Mobile Data Industry’s lifetime

a)      Telecoms Operators only make money when Data passes through the Cellular network. That is not always the case(ex WiFi, Bluetooth etc).

b)      Telecom Operator business models are  not suited to selling other types of products(ex to Enterprise)

c)       The interface to the customer, once so prized, is now gone to Web and Social media players like Facebook

d)      Telecoms have a long history of successful engineering led standards. But these standards have not been successful at higher layers of the stack. So, again we see many players unite to launch oneM2M – but the history of Telecoms standards at these layers is poor

e)      Device vendors like Samsung, Apple etc are more successful with innovation but take an independent approach as best they can

f)       Even when Data passes through the Telecoms network, the Operator may not be able to monetize it for legal reasons(ex just because an Operator knows your location, they do not have the permission to sell you advertising)

Having said that, from an IoT Analytics standpoint, there are some areas which impact IoT data(hence IoT analytics) which are unique to Telecoms. This blog discusses three such developments which I am tracking


Analysis and Focus areas

1) LPWA battles

Low power wide area (LPWA) networks are a key battleground for the industry at the moment. With the arrival of 5G (including 5G devices) years away, the main players today are Sigfox and LoRa. The Operators have been forced to push LTE-M in a hurry (see Lora vs LTE vs Sigfox). The momentum is certainly with LoRa and Sigfox.


2) Platforms

There is no shortage of IoT platforms, including Telecoms-oriented platforms like Jasper (now acquired by Cisco). However, we will see maturity in Platforms and new partnerships. Here are three developments:

Analytics platforms spanning multiple network types: ItalTel launches solutions for healthcare and Infrastructure. This is an example of a more mature platform supporting multiple network types (Sigfox and LoRa) and even WebRTC. The platform claims to have analytics built into it.

Open Source IoT analytics platforms like Kaa IoT, who were at the show and whom I am following with interest.

Ericsson and Amazon Cloud Deal: a deal which combines the IoT cloud PaaS vendors with the Telecoms network providers.

3) eSIM

And finally, the eSIM. For a while, I have been following the work of companies like Gemalto, who have long advocated eSIM technology. Today, in the age of IoT, it's a technology whose time has come.

The embedded SIM (also called eSIM or eUICC) is a new secure element that is designed to remotely manage multiple mobile network operator subscriptions and is compliant with GSMA specifications.

eSIM could help customers to set up and manage subscriptions on devices remotely via a single embedded SIM, in a process which will be a lot cheaper, easier and faster without sacrificing any level of security. There were a few key eSIM announcements, such as Gemalto with Jasper Wireless, and also Sierra Wireless and Valeo on telematics.



IoT analytics (Data Science for IoT) is still a nascent field and Telecoms will be a key enabler for IoT. I see the initial impact around areas like Security, LPWA and Platforms which facilitate analytics for IoT. If you want to see more of my work, please see the Data Science for Internet of Things course.

Book review: Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large Scale Data Analysis by Mohammed Guller

I have been reading and reviewing a number of excellent books for the Data Science for IoT course and also my Oxford University course.

Big Data Analytics with Spark by Mohammed Guller is for data scientists, business analysts, data architects, and data analysts looking for a better and faster tool for large-scale data analysis. It is also for software engineers and developers building Big Data products. The book covers a subject which I have been focussing on through my teaching and research. It provides a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. The book covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML.

My analysis:

The book covers MLlib, Scala, Spark and Analytics in detail but it is also readable. It also includes code for all these sections. The only recommendations I would make are a better index and releasing the code on GitHub. However, the book PDF can be bought for an extra $5 (so you can copy and paste the code if you need it).

I see the book comprising three sections:

a) The main theme of the book i.e. Big Data Analytics with Spark

b) The first five chapters leading up to the theme

c) The last three chapters on Spark deployment

The main theme of the book i.e. Big Data Analytics with Spark

  • Chapter 6: Spark Streaming (23 pages): Introduces Spark Streaming and shows an example app. Includes a Spark Streaming introduction, how Spark Streaming works, and a Spark Streaming example app.
  • Chapter 7: Spark SQL and Dataframes (50 pages): Introduces Spark SQL along with a few examples.
  • Chapter 8: MLlib and SparkML (50 pages): Introduces machine learning and MLlib along with a few examples. Covers a machine learning introduction, linear regression, logistic regression, classification, clustering, recommender systems, building a machine learning application with MLlib, and MLBase.
  • Chapter 9: GraphX (23 pages): Introduces graph analysis and GraphX along with a few examples.

The first five chapters leading up to the theme

  • Chapter 1: Big Data Technology Landscape: Cluster computing (Hadoop MapReduce, HDFS, Hive), Data serialization (Avro, Protocol Buffers), Columnar storage (Parquet), Messaging systems (Kafka, ZeroMQ), NoSQL databases (HBase, Cassandra), Distributed SQL query engines (Apache Drill, Impala, PrestoDB)
  • Chapter 2: Functional Programming in Scala (30 pages): Introduces Scala so that readers can understand and write Spark applications in Scala, which is the primary language supported by Spark. This includes key functional programming concepts, basic Scala constructs, the Scala shell etc.
  • Chapter 3: Spark’s Essentials (35 pages): Introduces Spark fundamentals and key concepts: what Spark is, why Spark is hot, why Spark is faster than Hadoop MapReduce, and Resilient Distributed Datasets (RDD).
  • Chapter 4: Spark Shell (10 pages): Introduces the Spark shell and shows how it can be used for interactive data analysis. Covers a Spark shell introduction and interactive data analysis in the Spark shell.
  • Chapter 5: A Stand-alone Spark Application (10 pages): Provides step-by-step directions for writing and running a Spark application. Covers the basic structure of a stand-alone Spark application and compiling a Spark application.

The last three chapters (Deployment Chapters)

  • Chapter 10: Deploying Spark – a walkthrough of Spark deployment with different cluster management technologies such as YARN, Mesos, and services like AWS (EC2)
  • Chapter 11: Monitoring a Spark Cluster (20 pages)

Overall, I very much recommend this book: Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large Scale Data Analysis by Mohammed Guller. I also plan to use this book in the Data Science for IoT course and also in my Oxford University course, which I will teach later in the year.

#DataScience for #IoT meetup – (Nvidia Jetson demo) pleased to announce sponsor – Shack 15

For the first #DataScience for #IoT meetup, it is great to have an amazing sponsor and venue: Shack 15 (@SHACK15LDN)

The meetup will feature a talk and demo on the Nvidia Jetson TK1.

We will be demoing the Nvidia Jetson TK1 and also discussing actual implementations, e.g. the Nvidia Jetson TK1 with Caffe on MNIST.

The talk will be presented by Yongkang Gao.

SHACK15 is London’s up and coming data science hub. They strive to create an ecosystem of developers from the most innovative and creative data science startups, particularly those in their early stage, together with academics from the most prestigious institutions in the UK and across the globe. SHACK15 is run by Meltwater, the global leaders in social media and news analytics powered by their state-of-the-art developer platform, Fairhair.

If you are interested in Deep learning, please register here: #DataScience for #IoT meetup