MWC 2016: 3 developments to watch which impact IoT analytics

Introduction

I have attended the Mobile World Congress for the last 7 years (5 of them as a speaker/panellist).

This year, I did not go. Partly, this reflects my change in business focus (to the Data Science for Internet of Things course).

I also believe that MWC today does not reflect disruptive innovation. The show has more than 100,000 participants, but we do not see fundamental game-changing innovations like DeepMind's Go work from Google last week.

Thus, in keeping with my current focus, this blog highlights developments to watch from MWC 2016 which impact IoT analytics. From an IoT / IoT analytics standpoint, the Telecoms industry would only play a major role with 5G (beyond 2020); for example, see The rise of 5G: the network for the Internet of Things and The plans for 5g to power the Internet of Things.

Before I list the areas that impact IoT analytics, here is some background. These factors have remained the same for most of the Mobile Data Industry’s lifetime:

a)      Telecoms Operators only make money when Data passes through the Cellular network. That is not always the case (ex WiFi, Bluetooth etc).

b)      Telecom Operator business models are not suited to selling other types of products (ex to Enterprise).

c)      The interface to the customer, once so prized, is now gone to Web and Social media players like Facebook.

d)      Telecoms have a long history of successful engineering-led standards. But these standards have not been successful at higher layers of the stack. So, again we see many players unite to launch oneM2M, but the history of Telecoms standards at these layers is poor.

e)      Device vendors like Samsung, Apple etc are more successful with innovation but take an independent approach as best they can.

f)      Even when Data passes through the Telecoms network, the Operator may not be able to monetize it for legal reasons (ex just because an Operator knows your location, they do not have the permission to sell you advertising).

Having said that, from an IoT Analytics standpoint, there are some areas which impact IoT data (and hence IoT analytics) and which are unique to Telecoms. This blog discusses three such developments which I am tracking.


Analysis and Focus areas

1) LPWA battles

Low power wide area (LPWA) networks are a key battleground for the industry at the moment. With the arrival of 5G (including 5G devices) years away, the main players today are Sigfox and LoRa. The Operators have been forced to push LTE-M in a hurry (see LoRa vs LTE vs Sigfox). The momentum is certainly with LoRa and Sigfox.


2) Platforms

There is no shortage of IoT based platforms, including Telecoms oriented platforms like Jasper (now acquired by Cisco). However, we will see maturity in Platforms and new partnerships. Here are three developments:

Analytics platforms spanning multiple network types: Italtel launches solutions for healthcare and infrastructure. This is an example of a more mature platform supporting multiple network types (Sigfox and LoRa) and even WebRTC. The platform claims to have analytics built into it.

Open Source IoT analytics platforms like Kaa IoT, who were at the show and whom I am following with interest.

Ericsson – Amazon Cloud Deal: A deal which combines the IoT cloud PaaS vendors with the Telecoms network providers.

3) eSIM

And finally, the eSIM. For a while, I have been following the work of companies like Gemalto, who have long advocated eSIM technology. Today, in the age of IoT, it's a technology whose time has come.

The embedded SIM (also called eSIM or eUICC) is a new secure element designed to remotely manage multiple mobile network operator subscriptions. It is compliant with GSMA specifications.

eSIM could help customers to set up and manage subscriptions on devices remotely via a single embedded SIM, in a process which will be a lot cheaper, easier and faster without sacrificing any levels of security. There were a few key eSIM announcements, such as Gemalto with Jasper Wireless, and also Sierra Wireless and Valeo on telematics.



IoT analytics (Data Science for IoT) is still a nascent field and Telecoms will be a key enabler for IoT. I see the initial impact around areas like Security, LPWA and Platforms which facilitate analytics for IoT. If you want to see more of my work, please see the Data Science for Internet of Things course.

Book review: Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large Scale Data Analysis, by Mohammed Guller

I have been reading and reviewing a number of excellent books for the Data Science for IoT course and also my Oxford University course.

Big Data Analytics with Spark by Mohammed Guller is for data scientists, business analysts, data architects, and data analysts looking for a better and faster tool for large-scale data analysis. It is also for software engineers and developers building Big Data products.

The book covers a subject which I have been focussing on through my teaching and research. It provides a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. The book covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML.

My analysis: The book covers MLlib, Scala, Spark and Analytics in detail but it is also readable. It also includes code for all these sections. The only recommendations I would make are a better index and releasing the code on GitHub. However, the book PDF can be bought for an extra $5 (so you can copy and paste the code if you need it).

I see the book comprising three sections:

a)      The main theme of the book i.e. Big Data Analytics with Spark

b)      The first five chapters leading up to the theme

c)      The last three chapters on Spark deployment

The main theme of the book i.e. Big Data Analytics with Spark

  • Chapter 6: Spark Streaming (23 pages): Introduces Spark Streaming and shows an example app. Includes a Spark Streaming introduction, how Spark Streaming works and a Spark Streaming example app.
  • Chapter 7: Spark SQL and Dataframes (50 pages): Introduces Spark SQL along with a few examples.
  • Chapter 8: MLlib and SparkML (50 pages): Introduces machine learning and MLlib along with a few examples. Covers a machine learning introduction, linear regression, logistic regression, classification, clustering, recommender systems, and building a machine learning application with MLlib, MLBase.
  • Chapter 9: GraphX (23 pages): Introduces graph analysis and GraphX along with a few examples.

The first five chapters leading up to the theme

  • Chapter 1: Big Data Technology Landscape: Cluster computing (Hadoop MapReduce, HDFS, Hive), Data serialization (Avro, Protocol Buffers), Columnar storage (Parquet), Messaging systems (Kafka, ZeroMQ), NoSQL databases (HBase, Cassandra), Distributed SQL Query engines (Apache Drill, Impala, PrestoDB)
  • Chapter 2: Functional Programming in Scala (30 pages): Introduces Scala so that readers can understand and write Spark applications in Scala, which is the primary language supported by Spark. This includes key functional programming concepts, basic Scala constructs, the Scala Shell etc
  • Chapter 3: Spark’s Essentials (35 pages): Introduces Spark fundamentals and key concepts: What is Spark, Why Spark is hot, Why Spark is faster than Hadoop MapReduce, Resilient Distributed Datasets (RDD)
  • Chapter 4: Spark Shell (10 pages): Introduces Spark Shell and shows how it can be used for interactive data analysis. Covers a Spark shell introduction and interactive data analysis in Spark-shell
  • Chapter 5: A Stand-alone Spark Application (10 pages): Provides step-by-step directions for writing and running a Spark application. Covers the basic structure of a stand-alone Spark application and compiling a Spark application

The last three chapters (Deployment Chapters)

  • Chapter 10: Deploying Spark – a walkthrough of Spark deployment with different cluster management technologies such as YARN, Mesos, and services like AWS (EC2)
  • Chapter 11: Monitoring a Spark Cluster (20 pages)

Overall, I very much recommend Big Data Analytics with Spark by Mohammed Guller. I also plan to use this book in the Data Science for IoT course and also in my Oxford University course, which I will teach later in the year.

#DataScience for #IoT meetup – (Nvidia Jetson demo) pleased to announce sponsor – Shack 15

For the first #DataScience for #IoT meetup, it is great to have an amazing sponsor and venue: Shack 15 (@SHACK15LDN).

The meetup will feature a talk and demo of the Nvidia Jetson TK1.

We will be demoing the Nvidia Jetson TK1 and also discussing actual implementations, ex the Nvidia Jetson TK1 with Caffe on MNIST.

The talk will be presented by Yongkang Gao  

SHACK15 is London’s up and coming data science hub. They strive to create an ecosystem of developers from the most innovative and creative data science startups, particularly those in their early stage, together with academics from the most prestigious institutions in the UK and across the globe. SHACK15 is run by Meltwater, the global leaders in social media and news analytics powered by their state-of-the-art developer platform, Fairhair.

If you are interested in Deep learning, please register here #DataScience for #IoT meetup

Using Space exploration to teach young people about Data Science – we go live today

PS – see our latest updates
 Please email me at ajit.jaokar at with subject Young Data Scientist if you want to know more as we launch the book/initiative
Many of you know that I co-founded a social startup in the USA teaching young people Data science using space exploration. We have been working closely with Ardusat.
We go live today:
10am Utah MST
12pm Miami EST
5pm London GMT
The globe will have a countdown timer.
A lot of our work will start after we get live data from space.

AM2 from Ajit Jaokar on Vimeo.

The lipstick robot – a great way to explain Deep learning

I love motivational examples in teaching complex ideas

I use this simple little video to teach Deep learning to my students in Oxford, UPM and Data science for Internet of Things.

Think of ideas like teaching a computer to recognize images of cats using Deep learning, or training a computer to play Pacman using Deep learning.

They all work in the same way

You let the deep learning system iterate with many examples and, in each case, you tell the computer using a classifier whether its interpretation was correct or not (i.e. is it a cat or not, Pacman scores etc).
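A minimal sketch of this iterate-and-correct loop, assuming a single artificial neuron (a perceptron) rather than a real deep network. The task (learning logical AND), the function names and the parameters are all illustrative assumptions, not something taken from the video:

```python
# Toy perceptron: learns logical AND from labelled examples.
# A deep network stacks many such units, but the feedback idea is the same.

def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

def train(examples, epochs=20, lr=1.0):
    """Show every example repeatedly and nudge the weights after each mistake."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in examples:
            # Feedback step: 0 if the guess was right, +1 or -1 if it was wrong.
            error = label - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(examples)
predictions = [predict(weights, bias, x) for x, _ in examples]
print(predictions)
```

The early passes get several answers wrong; each wrong answer moves the weights a little, and after a few passes the predictions match the labels.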

Now watch the video below

I see the click at the end of the step as a classifier

As you see, the robot has a long way to go!

I even think it's improving on each iteration!

That’s deep learning for you!

PS I am not sure that this was the original intent of the video by @SimoneGiertz but it's cool

Video link is lipstick robot

Creating an open methodology for Internet of Things (IoT) Analytics: Data science for Internet of Things



a) I am not referring to ‘standardization’ here. Rather the need for a methodology i.e. structured way to solve problems(Think of it like Kaggle meets #IoT analytics)

b)  Added reference to PFA (Portable Format for Analytics). Thanks to Gregory Piatetsky-Shapiro @kdnuggets for the feedback

We often encounter this problem in my teaching of Data Science for Internet of Things:

There is no specific methodology to solve Data Science for IoT  (IoT Analytics) problems.

This leads to some initial questions:

  • Should there be a distinct methodology to solve Data Science problems for IoT?
  • Are IoT problems for Data Science unique enough to warrant a specific approach?
  • What existing methodologies should we draw upon?

On one hand, a Data Science for IoT problem is a typical Data Science problem. On the other hand, there are some unique considerations to IoT, for example the use of Hardware, High Data volumes, the use of CEP (Complex event processing), the impact of verticals (like automotive), the impact of streaming data etc.

Background and inspiration

Some initial background:

Data mining has well known methodologies such as CRISP-DM. Hilary Mason and others have also proposed specific methodologies for Data Science. Kaggle problems have a specific approach to solving them. Techniques like PFA (Portable Format for Analytics) provide a way of formalizing and moving Analytics models.

All these strategies also apply to IoT. IoT itself has methodologies like Ignite IoT – but these do not cover IoT analytics in detail.

A methodology for IoT analytics (Data Science for IoT) should cover the unique aspects of each step in Data Science. For example, it is more than the choice of the model family. The choice of the model family (ANN, SVM, Trees, etc) is only one of the many choices to make. Others include:

a) Choice of the model structure – optimisation methodology (CV, Bootstrap, etc)

b)  Choice of the model parameter optimisation algorithm (joint gradients vs. conjugate gradients )

c)  Preprocessing of the data (centring, reduction, functional reduction, log-transform, etc.)

d)  How to deal with missing data (case deletion, imputation, etc.)

e)  How to detect and deal with suspect data (distance-based outlier detection, density-based, etc.)

f)  How to choose relevant features (filters, wrappers, embedded method ?)

g)  How to measure prediction performances (mean square error, mean absolute error, misclassification rate, lift, precision/recall, etc.)

Source: Methodology and standards for data analysis with machine learning tools, Damien François
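To make a few of these choices concrete, here is a minimal sketch in plain Python covering items (c), (d), (e) and (g). The sensor readings are made up, and the specific options chosen (mean imputation, a 2 standard deviation cut-off, a constant median baseline) are assumptions for illustration, not recommendations from the source:

```python
import statistics

# Hypothetical sensor readings; None marks a missing value.
readings = [20.1, 19.8, None, 20.5, 35.0, 20.2, None, 19.9]

# (d) Missing data: impute with the mean of the observed values.
observed = [r for r in readings if r is not None]
mean_observed = statistics.mean(observed)
imputed = [mean_observed if r is None else r for r in readings]

# (c) Preprocessing: centre and scale to zero mean and unit variance.
mu = statistics.mean(imputed)
sigma = statistics.pstdev(imputed)
standardised = [(r - mu) / sigma for r in imputed]

# (e) Suspect data: a simple distance-based rule that flags points
# more than 2 standard deviations from the mean.
suspects = [i for i, z in enumerate(standardised) if abs(z) > 2]

# (g) Prediction performance: score a constant "always predict the median"
# baseline with mean square error and mean absolute error.
baseline = statistics.median(imputed)
errors = [r - baseline for r in imputed]
mse = statistics.mean(e * e for e in errors)
mae = statistics.mean(abs(e) for e in errors)

print(suspects, round(mse, 2), round(mae, 2))
```

Each step could be swapped for another option on the list (case deletion instead of imputation, a density-based outlier rule, and so on), which is why a methodology needs to make these choices explicit.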

The methodology could also cover  -

Exploratory analysis of data

Hypothesis testing (“Given a sample and an apparent effect, what is the probability of seeing such an effect by chance?” )

and other ideas ..
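The hypothesis testing question quoted above can be answered directly by simulation. Below is a minimal sketch of a permutation test in plain Python; the two groups of machine temperatures are hypothetical, and the function name and parameters are illustrative assumptions:

```python
import random

def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one appears when the group labels are shuffled at random."""
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    return hits / trials

# Hypothetical temperature readings from two machines.
machine_a = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2]
machine_b = [72.3, 72.0, 71.8, 72.5, 72.1, 71.9]
p = permutation_p_value(machine_a, machine_b)
print(p)
```

A small value of `p` says that the apparent effect would rarely be seen by chance alone if both machines behaved the same way.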

An Open methodology for IoT analytics problems

Building on the above, we need an Open, end-to-end,  step by step methodology to solve IoT Analytics/Data Science for IoT problems

In addition, the methodology would need to consider the unique aspects of IoT. For example:

a)      Complex event processing especially using Apache Spark for CEP

b)      Deep learning (because we consider Cameras as sensors)

c)      Anomaly Detection: Consider Anomaly detection (a typical IoT analytics scenario). There are many considerations: What is the triggering event? How much has the machine deviated from the plan? What is the root cause of the bottleneck? Are there any external factors affecting the system performance? How do I know that I should trust IoT data? Is there a recommended plan of action? How is the Data visualized? Does the Data have missing elements? How do we detect failure in other processes? (Anomaly detection adapted from Dr Vinay Mehendiratta)
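As one possible way to approach the "how much has the machine deviated" question, here is a minimal sketch that flags readings far from a trailing-window average. The vibration data, the window size and the threshold are all assumptions chosen for illustration:

```python
import statistics

def find_anomalies(readings, window=5, threshold=3.0):
    """Flag readings that deviate from the mean of the previous `window`
    readings by more than `threshold` standard deviations of that window."""
    anomalies = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu = statistics.mean(recent)
        sigma = statistics.stdev(recent)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            anomalies.append((i, readings[i]))
    return anomalies

# Hypothetical vibration readings with one obvious spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.0, 4.8, 1.1, 0.9, 1.0]
print(find_anomalies(vibration))
```

This only answers the triggering-event question. Root cause, trust in the data and the plan of action still need domain knowledge on top of it.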

In addition, IoT vertical domains have special considerations: Smart Grid, Smart cities, Smart energy, Automotive, Smart factory, Mobile, Wearables, Smart home etc.

For example:

Modelling energy prices,

Classifying step using machine learning,

Bus routing using mobile phone data,

Linear and non-linear regression models to predict global temperature and weather prediction


Creating an Open methodology

Currently, this is an evolving thought process being developed as a part of the Data Science for IoT course. We intend to create it as an open methodology – starting with the question: What is common across these IoT analytics problems and how can we adapt existing Data Science techniques  to solve IoT analytics problems?

Over the next few weeks, we are conducting a survey and developing the methodology

If you are interested in participating and knowing more, please sign up to our mailing list and download our papers or contact me at ajit.jaokar at 

My blog featured in 4 Top 50/100 lists for IoT / Big Data / Data Science last year

An interesting year in social media last year, and a nice way to start the new one.

I always find this interesting since I write about a very niche space (Data Science for IoT) and it's more mathematical / technical than my previous work in Mobile.
These are also great lists, with some very clued-up people who are well worth following.


What is the best way for getting started in Statistics for Programmers/Data Science?


I am often asked this question: What’s the best way for getting started in Statistics for Programmers?

At the Data Science for IoT course – and also in my teaching at Oxford University – I have used the following approach.

Comments welcome:

Firstly, the interest in Statistics for Programmers is a fairly recent phenomenon.

This interest is based on the uptake of Data Science – a hot profession now.

Here’s how most people approach the problem

They pick up an old High School statistics text book, either their own from younger days or a standard book.

These books are often decades old.

They start with page One .. and work linearly through a few pages ..

They quickly realize why they disliked stats earlier.

And that sentiment has not changed with the passage of time ..

But, here is a different approach

For Data Science, you do not need to master Statistics per se

You need to understand Statistical models.

A model is defined as a combination of  predictive algorithms (based on Statistics) and Data.

Data science is based on creating models that improve with experience / training.

In contrast, in the Data Science for IoT course – we start with problems (the Engineering approach).

I recommend three sources which I am using (if you have others, please let me know at ajit.jaokar at and I shall link them and refer back to you)

Start with Understanding the problem

See these two links by Brandon Rohrer (Microsoft Data Science):

Which algorithm family can answer my question and

Which questions can Data Science answer.

See also this post by Dr Vincent Granville @DataScienceCtrl

on 24 uses of Statistical modelling Part 1 and  2

These posts give you an idea of the problems that can be solved using Data science and stats (without going into the math itself initially).

Then read Allen Downey’s books

Allen Downey writes excellent books and they are all free under Creative Commons. You can download them at Green Tea Press, which has an excellent ethos. I especially recommend Think Stats, Think Bayes and Think Complexity (in that order).

To support the author, I would also encourage you to buy these books, especially Think Stats.

You can follow him on Twitter @allendowney

Having mastered this stage, start with code and small datasets.

I prefer UCI datasets and the Python scikit-learn library.

Sumit also works with the REPL approach and Paul uses Spark notebook in our course.

In any case, these are small sections of code run in a controlled environment, and they show you how the stats are implemented (libraries / APIs like scikit-learn are relatively easy to understand if you come from a Programming background).
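As an illustration of what such a small section of code can look like, here is a minimal sketch that computes summary statistics and a Pearson correlation using only the Python standard library. The dataset is made up and the function name is an assumption; in the course you would more likely load a UCI dataset and use a library:

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical small dataset: hours of operation vs. energy used.
hours = [1, 2, 3, 4, 5]
energy = [2.1, 3.9, 6.2, 7.8, 10.1]

print("mean energy:", statistics.mean(energy))
print("sample standard deviation:", statistics.stdev(energy))
print("correlation:", pearson_r(hours, energy))
```

Writing the correlation out by hand once makes it much easier to understand what a library call is doing when you later see it on a scatter plot.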

That's the path we are using in the Data Science for IoT course.

Any comments/feedback welcome on your approach to teach statistics (ajit.jaokar at

Image source: Scatter plots – wikipedia

Data Science for Internet of Things – practitioner course – March 2016

Now running in its third batch ..

Welcome to the world’s first course that helps you to become a Data Scientist for the Internet Of Things ..

For the latest batch of this course see  Data Science for Internet of things #DataScience #IoT – Aug – Sep 2016 start now in its fourth batch




The course starts on March 22, 2016.

Please contact [email protected] 

This niche, personalized course is suited for:

  • Developers who want to transition to a new role as Data Scientists
  • Entrepreneurs who want to launch new products covering IoT and analytics
  • Anyone interested in developing their career in IoT Analytics

Duration: The course starts from March 2016 and extends to July  2016. We work with you for the next six months after that on a specific project and to help transition your career to Data Science through our network. The extra time also allows you to catch up on specific modules in the course

Scope: Created by Data Science and IoT professionals, the course covers infrastructure (Hadoop – Spark), Programming / Modelling (Python/R/Time series) and Deep Learning (Theano, Deeplearning4j) within the context of the Internet of Things.

Internet of Things: We cover unique aspects of Data Science for IoT including Deep Learning, Complex event processing/sensor fusion and Streaming/Real time analytics


Offline (London):  £1,200 GBP + VAT
Online:  Yes. Please contact us at [email protected]


Contact  us at [email protected] to signup




  • The course aims to equip you to be a Data Scientist for the Internet of Things domain
  • You can transition your career to Data Science for IoT. This could mean a new job, role, project or a start-up idea
  • You are not alone: Toolkits and community support to start working on real Data science problems for IoT
  • You master specific skills: Spark, R, Python, Scala, IoT platforms, Data analysis, Deep Learning and SQL among others
  • The course content can be personalized (see below)
  • The Data Science principles can apply to other domains i.e. beyond IoT



(Note the modules and the sequence are subject to change)


An overview of Data Science

An overview of Data Science,  What is Data Science? What problems can be solved using Data science – Extracting meaning from Data – Statistical processes behind Data – Techniques to acquire data (ex APIs) – Handling large scale data – Big Data fundamentals


Data Science and IoT

The IoT ecosystem, Unique considerations for the IoT ecosystem – Addressing IoT problems in Data science (time series data, enterprise IoT edge computing, real-time processing, cognitive computing, image processing, introduction to deep learning algorithms, geospatial analysis for IoT/managing massive geographic scale, strategies for integration with hardware, sensor fusion)


The Apache Spark ecosystem

Apache Spark in detail including Scala, SQL, SparkR, MLlib and GraphX


The Data Science for IoT methodology

A specific approach to solve Data Science problems for IoT including strategy and development


Mathematical foundations of Machine learning

Here we formally cover the mathematics for Data science including Linear Algebra, Matrix algebra, Bayesian Statistics, Optimization techniques (Gradient descent) etc. We also cover Supervised algorithms, unsupervised algorithms (classification, regression, clustering, dimensionality reduction etc) as applicable to IoT datasets


Unique Elements for IoT

This module emphasises the following unique elements for IoT

  • Complex event processing (sensor fusion)
  • Deep Learning and
  • Real Time (Spark, Kafka etc)


FAQ: Summary of Benefits and Features


Impact on your work: Designed for developers/ICT contractors/Entrepreneurs who want to transition their career towards Data science roles with an emphasis on IoT
Typical profile: A developer who has skills in programming environments like Java, Ruby, Python, Oracle etc and wants to learn Data Science within the context of Internet of Things with the goal of becoming a Data Scientist for IoT
Community support? Yes. Also includes the Alumni network i.e. beyond the duration of the course at no extra cost.
Approach to Big Data: For Big Data, the course is focussed on Apache Spark, specifically Scala, SQL, MLlib, GraphX and others on HDFS
Approach to Programming: See scope below
Approach to Algorithms: See scope below
Is this a full data science course? Yes, we cover machine learning / Data science techniques which are applicable to any domain. Our focus is Internet of Things. The course is practitioner oriented i.e. not academic and is not affiliated to a university.
Investment: Offline (London): £1,200 GBP + VAT (if applicable). Online: Yes. Please contact us at [email protected]
Help with jobs/employment: Yes, we aim to transition your career. Hence, we are selective in the recruitment for the course. There are no guarantees, but a career transition is a key goal for us. We work with you over the duration of the course (including the Project) to get a new role in Data Science/IoT
Created by professionals: See our profiles below
Personalization: The course is based on a PLP (Personal Learning Plan) which allows you to customize for language, projects, domains, career goals, entrepreneurial goals etc. Examples include a focus on CEP/Sensor fusion, RNNs and Time series, Edge processing, SQL etc. There is no extra cost for this but we agree scope through the PLP before we start. If you are interested in this option, please let us know at [email protected]. If you want to see examples of our work and content, please see Spark SQL real time analytics by Sumit Pal (published on kdnuggets) and The evolution of Deep learning models by Ajit Jaokar
Duration: The course starts from March 2016 and extends to July 2016. We work with you for the next six months after that on a specific project and to help transition your career to Data Science through our network. The extra time also allows you to catch up on specific modules in the course
Projects: A significant part of the course is Project based. Projects are based on predictive analytics algorithms for IoT applications. Projects use our methodology, which is based on a formalized way of solving IoT analytics problems. Projects can be based in any of the Programming Languages we cover, i.e. R or Python, Spark (Scala) and SQL (distributed processing i.e. Big Data), and Theano and Deeplearning4j for Deep learning. If you want to work on a specific project (or if you want to explore some ideas deeper), you should indicate this in advance
Access to knowledge: We do not restrict access to knowledge by specialization. For example, if you choose to focus on sensor fusion, you will still have access to all material for Deep learning
Batch sizes: Limited to ensure personalized attention
Time per week: About 5 hours/week. You do not need to buy any additional materials
Certificate of completion: Yes, based on the quiz and projects.
Delivery of content: Via video. You do not have to be online at specific times


How is this approach different to the more traditional MOOCs?

Here’s how we differ from MOOCs

a)  We are not ‘Massive’ – this approach works for small groups with more focused and personalized attention. We will never have 1000s of participants

b)  We help in career leverage: We work actively with you for career leverage – ex you are a startup / you want to transition to a new job etc

c)  We are vendor agnostic

d)  We work actively with you to build your brand (Blogs/Open source/conferences etc)

e)  The course can be personalized to streams (ex with Deep learning, Complex event processing, Streaming etc)

f)  We teach the foundations of maths where applicable

g)  We work with a small number of platforms which provide current / in-demand skills – ex Apache Spark, R etc

h)  We are exclusively focused on IoT (although the concepts can apply to any other vertical)


Approach to Programming

The main Programming focus is on Python, R and Spark (Scala, SQL and R). We also use Deeplearning4j and Theano (for Deep learning). We will also use an IoT platform (like Thingworx) but we will emphasize IoT analytics. The participants need to be able to code or come from a development background (the Programming language itself does not matter).


What is your approach to working with Algorithms and Maths?

The course is based on modelling IoT based problems in the Python and R programming languages. We follow a context based learning approach, hence we relate the maths to specific R based IoT models. You will need an aptitude for maths. However, we cover the mathematical foundations necessary. These include: Linear Algebra including Matrix algebra, Bayesian Statistics, Optimization techniques (such as Gradient descent) etc.
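To give a flavour of how the maths relates to a model, here is a minimal sketch of gradient descent fitting a straight line by minimising mean squared error. It is plain Python with made-up data; the learning rate, step count and function name are assumptions for illustration, not course material:

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly stepping against the gradient
    of the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = (2 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_b = (2 / n) * sum(residuals)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

The two gradient lines are the partial derivatives of the error with respect to `w` and `b`, which is where the calculus and linear algebra from the foundations modules come in.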


What is the implication of an emphasis on IoT?

In 2015, IoT is emerging but its impact will be felt over the next five years. Today, we see IoT driven by Bluetooth 4.0 including iBeacons. Over the next five years, we will see IoT connectivity driven by the wide area network (with the deployment of 5G in 2020 and beyond). We will also see entirely new forms of connectivity (ex LoRa, Sigfox etc). Enterprises (Renewables, Telematics, Transport, Manufacturing, Energy, Utilities etc) will be the key drivers for IoT. On the consumer side, Retail and wearables will play a part. This tsunami of data will lead to an exponential demand for analytics since analytics is the key business model behind the data deluge. Most of this data will be Time series data but will also include other types of data. For example, our emphasis on IoT also includes Deep Learning since we treat video and images as sensors. IoT will lead to a Re-imagining of everyday objects.


Why is this course unique?

The course emphasizes some aspects that are unique to IoT (in comparison to traditional data science). These include: A greater emphasis on time series data, Edge computing, Real-time processing, Cognitive computing, In memory processing, Deep learning, Geospatial analysis for IoT, Managing massive geographic scale (ex for Smart cities), Telecoms datasets, Strategies for integration with hardware and Sensor fusion (Complex event processing). Note that we include video and images as sensors through cameras (hence the study of Deep learning).



Who is creating/teaching this course?

The course is created by futuretext and conducted by Ajit Jaokar, Dr Paul Katsande and Sumit Pal

Ajit Jaokar  – Based in London, Ajit’s research and consulting is based on Data Science and the Internet of Things. His work is based on his teaching at Oxford University and UPM (Technical University of Madrid) and covers IoT, Data Science, Smart cities and Telecoms.



Sumit Pal is a big data, visualisation and data science consultant. He is also a software architect and big data enthusiast and builds end-to-end data-driven analytic systems. Sumit has worked for Microsoft (SQL server development team), Oracle (OLAP development team) and Verizon (Big Data analytics team) in a career spanning 22 years. Currently, he works for multiple clients advising them on their data architectures and big data solutions and does hands on coding with Spark, Scala, Java and Python. Sumit is based in Boston.


Dr Paul Katsande is a technical architect based in London working with Apache Spark, Scala and Data Science. Paul’s PhD research, from the University of Manchester, is based on image processing.



We have limited spaces. Please contact us at [email protected] if you want to take the next steps!







Weekly schedule


Week 0, Mar 15: Orientation, introductions, Personal learning plans, Platform signup
Week 1, Mar 21: Foundations: An analytics Driven Organization – IoT and Machine Learning – Data Science for IoT – Unique characteristics – Data Science for IoT – why now?
Mar 28: Machine Learning concepts, Deep Learning concepts
Apr 4: An introduction to IoT (Internet of Things)
Apr 11: IoT platforms – From sensor to Cloud
Apr 18: Concepts of Big Data part one
Apr 25: Concepts of Big Data part two
May 2: Market drivers for IoT
May 9: Choosing a model – what technique to use?
May 16: Use Cases and IoT datasets (these will continue throughout the course)
May 23: Time series and NoSQL databases
May 30: Streaming analytics part one
June 6: Streaming analytics part two
June 13: Deep learning part one
June 20: Deep learning part two
June 27: Machine learning algorithms part one
July 4: Machine learning algorithms part two
July 11: Mathematical foundations part one
July 18: Mathematical foundations part two
July to Dec 31: Project





Week 0, Mar 15: Orientation, introductions, Personal learning plans, Platform signup
Week 1, Mar 21
Mar 28
Apr 4: Intro to R, Installations, Basics of R
Apr 11
Apr 18: Data Frames in R & Tabular Data
Apr 25
May 2: Data Processing & Data Visualization in R
May 9
May 16: Scala basics
May 23
May 30: Spark Batch Processing I
June 6
June 13: Spark Batch Processing II
June 20
June 27: Spark SQL
July 4
July 11: Spark Streaming
July 18
July to Dec 31: Projects


 Contact  us at [email protected] to signup



IoT data analytics and visualization event – Palo alto – Feb 2016

As we do every year, we are supporting this great event. The IoT data analytics and visualization event in Palo Alto is now a must attend event for IoT professionals.

Use the code ‘DATA15’, which provides a 15% discount to attend the event.

Have a look at the conference and the speakers IoT data analytics and visualization event – Palo alto – Feb 2016