Meet me at AI-Europe in London (Uber, Nvidia, Kayak, UBS, Bell Labs + others speaking)

I am at AI-Europe next week. It should be a great event, with Uber, Nvidia, Kayak, UBS, Bell Labs and others speaking.

I am very much looking forward to the Nvidia talk (I work with Nvidia for the Data Science for Internet of Things course which I teach at Oxford University).

I am also looking forward to the following talks. I believe the event is almost full, with only a few places left. See more at AI-Europe.

  • Opening speech – De-mystifying AI: Terry Jones, Founding Chairman, Kayak
  • AI as a game-changer for every industry – disruption now and perspectives for 2025: Robin Bordoli, Chief Executive Officer, CrowdFlower
  • Deploying deep learning everywhere – cutting-edge research teams, hyper-scale data centers, enterprises using AI: Serge Palaric, Vice President EMEA Embedded & OEMs, Nvidia
  • Contact centers – how artificial intelligence is revolutionizing the customer experience: Dr Nicola J. Millard, Head of Customer Insight & Futures, BT Global Services
  • Banking – why UBS is interested in AI and other fintech innovations: Annika Schröder, Director, UBS Group Innovation, UBS AG
  • Health – the value of integrating deep learning (use case: applying deep learning in devices to diagnose cancer): Carlos Jaime, Head of Health & Medical Equipment Division, Samsung Electronics France
  • Bringing machine learning to every corner of your business: Luming Wang, Head of Deep Learning, Uber
  • Augmented reality: Danny Lopez, COO, Blippar
  • Virtual assistants – their impact on the Internet and society (why AI-based digital assistants will help revolutionize the Internet and place technology at the service of humans): Julien Hobeika, Juliedesk
  • Image analysis – research and its applications in the real world: Miriam Redi, Bell Labs

 

 

 

Implementing Enterprise AI course

 

Implementing Enterprise AI is a unique, limited-enrolment course focused on AI Engineering / AI for the Enterprise.

The course is being launched for the first time and has limited spaces.

Created in partnership with H2O.ai, the course uses open source technology to work with AI use cases. Successful participants will receive a certificate of completion and validation of their project from H2O.ai.

 

The course covers:

  • Design of Enterprise AI
  • Technology foundations of Enterprise AI systems
  • Specific AI use cases
  • Development of AI services
  • Deployment and Business models

 

The course targets developers and architects who want to transition their careers to Enterprise AI. It correlates new AI ideas with familiar concepts like ERP and Data Warehousing, which makes the transition easier. The course is based on a logical concept called an ‘Enterprise AI layer’. This AI layer is focused on solving domain-specific problems for an Enterprise. We could see such a layer as an extension to the Data Warehouse or the ERP system (an Intelligent Data Warehouse / Cognitive ERP system). Thus, the approach provides tangible and practical benefits for the Enterprise with a clear business model. The implementation/development for the course is done using the H2O APIs for R, Python and Spark.
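To make this concrete, below is a minimal sketch of an H2O workflow in Python, the style of code used throughout the course. The CSV file and the 'churned' column are hypothetical placeholders, not course material.

```python
# Minimal H2O sketch (Python). The dataset path and column names are
# hypothetical placeholders for a typical churn-classification use case.
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator

h2o.init()  # starts (or connects to) a local H2O cluster

frame = h2o.import_file("churn.csv")            # hypothetical labeled dataset
frame["churned"] = frame["churned"].asfactor()  # treat the target as categorical

train, valid = frame.split_frame(ratios=[0.8], seed=42)

model = H2OGradientBoostingEstimator(ntrees=50, max_depth=5, seed=42)
model.train(x=[c for c in frame.columns if c != "churned"],
            y="churned",
            training_frame=train,
            validation_frame=valid)

print(model.auc(valid=True))  # validation AUC for the binary classifier
```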

 

 

The course covers the following Enterprise AI Use Cases

 

  • Healthcare
  • Insurance
  • Adtech
  • Fraud detection
  • Anomaly detection
  • Churn, classification
  • Customer analytics
  • Natural Language Processing, Bots and Virtual Assistants

 

The course comprises three parts

 

Section One: Implementing Enterprise AI

  • Introduction
  • Machine learning
  • Neural networks and deep networks
  • NLP
  • Reinforcement learning
  • Bots
  • Implementation in H2O of the use cases above

 

Section Two: Deploying Enterprise AI
Here, we cover the actual deployment issues for Enterprise AI, including:

  • Acquiring Data and Training the Algorithm
  • Processing and hardware considerations
  • Business models
  • High Performance Computing – scaling an AI system
  • Costing an AI system
  • Creating a competitive advantage from AI
  • Industry Barriers for AI

 

Section Three: Projects  

An Enterprise AI project, created in teams, for the AI use cases above.

 

Duration:

Starting Jan 2017. Approximately six months (three months for the content and up to three months for the Project).

The course includes a certificate of completion and validation of the project from H2O.ai. (Projects will be created in teams.)

 

Course Logistics:

Offered online and offline (London and Berlin)

When:  Jan 2017

Duration: Approximately six months (including Project)

Fees: please contact us

Contact: [email protected]


 

 

 

 

Please contact us to sign up or to learn more: [email protected]

Enterprise AI Data Scientist: Implementing Enterprise AI Course

Overview

Introduction

Enterprise AI Data Scientist is a niche course that targets developers who want to transition their careers towards Enterprise AI.

The course covers:

  • Design of Enterprise AI
  • Technology foundations of Enterprise AI systems
  • Specific AI use cases
  • Development of AI services
  • Deployment and Business models

The course targets developers and architects who want to transition their careers to AI. It correlates new AI ideas with familiar concepts like ERP and Data Warehousing, which makes the transition easier.

 

According to Deloitte, by the end of 2016 “more than 80 of the world’s 100 largest enterprise software companies by revenues will have integrated cognitive technologies into their products”. Gartner also predicts that 40 percent of new investment made by enterprises will be in predictive analytics by 2020. AI is moving fast into the Enterprise, and AI developments can create value for the Enterprise.

The Enterprise AI Layer

The course is based on a logical concept called an ‘Enterprise AI layer’. This AI layer is focused on solving relatively mundane, domain-specific problems for an Enterprise. While this is not as ‘sexy’ as the original vision of AI, it provides tangible and practical benefits to companies. We could see such a layer as an extension to the Data Warehouse or the ERP system (an Intelligent Data Warehouse / Cognitive ERP system). Thus, the approach provides tangible and practical benefits for the Enterprise with a clear business model. For instance, an organization could transcribe call centre agents’ interactions with customers to create a more intelligent workflow, bots etc using Deep Learning algorithms.

 

So, if we imagine such a conceptual AI layer for the enterprise, what does it mean in terms of new services that can be offered by an Enterprise? Here are some examples:

  • Bots: a great example of using AI to automate repetitive tasks like scheduling meetings. Bots are often the starting point of engagement for AI, especially in Retail and Financial services
  • Inferring from textual/voice narrative: security applications to detect suspicious behaviour, algorithms that can draw connections between how patients describe their symptoms etc
  • Detecting patterns from vast amounts of data: using log files to predict future failures, predicting cybersecurity attacks etc
  • Creating a knowledge base from large datasets: for example, an AI program that can read all of Wikipedia or GitHub
  • Creating content at scale: using robots to replace writers or even to compose pop songs
  • Predicting future workflows: using existing patterns to predict future workflows
  • Mass personalization: in advertising
  • Video and image analytics: collision avoidance for drones, autonomous vehicles, agricultural crop health analysis etc

 

These applications provide competitive advantage, differentiation, customer loyalty and mass personalization for any Enterprise. They have simple business models (for example, deployment as premium features, new products or cost reduction).

 

Course Outline

AI – A conceptual Overview

In this section, we cover the basics of AI and Deep Learning. We start with machine learning concepts and relate how Deep Learning/AI fits with them. We explore the workings of algorithms and the various technologies underpinning AI. AI enables computers to do some things better than humans, especially when it comes to finding insights from large amounts of unstructured or semi-structured data. Technologies like Machine Learning, Natural Language Processing (NLP), Speech Recognition and Computer Vision drive the AI layer. More specifically, AI applies to an algorithm which is learning on its own. We explore the design and principles behind these algorithms.

 

Understanding the Enterprise AI Technology Landscape

In this section, we focus on various implementations of Machine Learning and Deep Learning, including: linear models (GLM), ensembles (e.g. Random Forest), clustering (k-means), deep neural networks (autoencoders, CNNs, RNNs) and dimensionality reduction (PCA). We also cover the various Deep Learning libraries, i.e. TensorFlow, Caffe, MXNet and Theano, and discuss ancillary technologies like Natural Language Processing and Computer Vision.
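As a rough illustration of these model families, here is a short sketch using scikit-learn as a convenient stand-in (the course itself implements them with the H2O APIs); the data is synthetic.

```python
# Illustrative sketch of the model families above, on synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression   # a GLM
from sklearn.ensemble import RandomForestClassifier   # an ensemble
from sklearn.cluster import KMeans                    # clustering
from sklearn.decomposition import PCA                 # dimensionality reduction

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

glm = LogisticRegression().fit(X, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
X_2d = PCA(n_components=2).fit_transform(X)           # project to 2 dimensions

print(glm.score(X, y), forest.score(X, y), X_2d.shape)
```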

Enterprise AI Use Cases

Here, we discuss the following use cases

  • Healthcare
  • Insurance
  • Adtech
  • Fraud detection
  • Anomaly detection
  • Churn, classification
  • Customer analytics
  • Natural Language Processing, Bots and Virtual Assistants

Implementing Enterprise AI

Building on the above, we discuss the implementations of the use cases.

Deploying Enterprise AI

Here, we cover the actual deployment issues for Enterprise AI including

  • Acquiring Data and Training the Algorithm
  • Processing and hardware considerations
  • Business Models
  • High Performance Computing – scaling an AI system
  • Costing an AI system
  • Creating a competitive advantage from AI
  • Industry Barriers for AI

 

Course Logistics

Where:  London

When:  Jan 2017

Duration: Approximately six months

Online?: Yes, please contact us.

Contact: [email protected]

Fees: please contact us

The AI layer for the Enterprise and the role of IoT

Introduction 

According to Deloitte, by the end of 2016 “more than 80 of the world’s 100 largest enterprise software companies by revenues will have integrated cognitive technologies into their products”. Gartner also predicts that 40 percent of new investment made by enterprises will be in predictive analytics by 2020. AI is moving fast into the Enterprise, and AI developments can create value for the Enterprise. This value can be captured/visualized by considering an ‘Enterprise AI layer’. This AI layer is focused on solving relatively mundane problems which are domain specific. While this is not as ‘sexy’ as the original vision of AI, it provides tangible benefits to companies.

 

In this brief article, we propose a logical concept called the AI layer for the Enterprise. We could see such a layer as an extension to the Data Warehouse or the ERP system. This has tangible and practical benefits for the Enterprise with a clear business model. The AI layer could also incorporate IoT datasets and unite the disparate ecosystem. The Enterprise AI layer theme is a key part of the Data Science for Internet of Things course. Only a few places remain for this course!

 

Enterprise AI – an Intelligent Data Warehouse/ERP system?

AI enables computers to do some things better than humans, especially when it comes to finding insights from large amounts of unstructured or semi-structured data. Technologies like Machine Learning, Natural Language Processing (NLP), Speech Recognition and Computer Vision drive the AI layer. More specifically, AI applies to an algorithm which is learning on its own.

 

To understand this, we have to ask ourselves: How do we train a Big Data algorithm?  

There are two ways:

  • Start with the Rules and apply them to Data (Top down) OR
  • Start with the data and find the rules from the Data (Bottom up)

 

The top-down approach involves writing enough rules for all possible circumstances. But this approach is obviously limited by the number of rules and by its finite rule base. The bottom-up approach applies in two cases. Firstly, when rules can be derived from instances of positive and negative examples (SPAM / NO SPAM). This is traditional machine learning, where the algorithm can be trained. But the more extreme case is where there are no examples to train the algorithm.
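As a toy illustration of the bottom-up case, the sketch below learns its ‘rules’ from a handful of labeled examples; the messages and labels are invented.

```python
# Bottom-up learning: rules are derived from labeled SPAM / NO SPAM examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = ["win a free prize now", "meeting at 10am tomorrow",
            "free money click here", "project update attached"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)      # bag-of-words features
classifier = MultinomialNB().fit(X, labels) # the 'rules' are learned, not written

test = vectorizer.transform(["claim your free prize"])
print(classifier.predict(test))             # -> [1], i.e. spam
```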

 

What do we mean by ‘no examples’?

 

a) There is no schema

b) Linearity (sequence) and hierarchy are not known

c) The output is not known (non-deterministic)

d) The problem domain is not finite

 

Hence, this is not an easy problem to solve. However, there is a payoff in the enterprise if AI algorithms can be created to learn and self-train manual, repetitive tasks – especially when the tasks involve both structured and unstructured data.

 

How can we visualize the AI layer?

One simple way is to think of it as an ‘Intelligent Data Warehouse’, i.e. an extension to either the Data Warehouse or the ERP system.

 

For instance, an organization could transcribe call centre agents’ interactions with customers to create a more intelligent workflow, bots etc using Deep Learning algorithms.

Enterprise AI layer – what it means to the Enterprise

So, if we imagine such a conceptual AI layer for the enterprise, what does it mean in terms of new services that can be offered? Here are some examples:

  • Bots: a great example of using AI to automate repetitive tasks like scheduling meetings. Bots are often the starting point of engagement for AI, especially in Retail and Financial services
  • Inferring from textual/voice narrative: security applications to detect suspicious behaviour, algorithms that can draw connections between how patients describe their symptoms etc
  • Detecting patterns from vast amounts of data: using log files to predict future failures, predicting cybersecurity attacks etc
  • Creating a knowledge base from large datasets: for example, an AI program that can read all of Wikipedia or GitHub
  • Creating content at scale: using robots to replace writers or even to compose pop songs
  • Predicting future workflows: using existing patterns to predict future workflows
  • Mass personalization: in advertising
  • Video and image analytics: collision avoidance for drones, autonomous vehicles, agricultural crop health analysis etc

 

These applications provide competitive advantage, differentiation, customer loyalty and mass personalization. They have simple business models (for example, deployment as premium features, new products or cost reduction).

 

The Enterprise AI layer and IoT

 

So, the final question is: What does the Enterprise layer mean for IoT?

 

IoT has tremendous potential but faces an inherent problem. Currently, IoT is implemented in verticals/silos, and these silos do not talk to each other. To realize the full potential of IoT, an over-arching layer above the individual verticals could ‘connect the dots’. For those of us coming from the Telco industry, these ideas are not new: the winners of the mobile/Telco ecosystem were the iPhone and Android, which succeeded in doing exactly that.

 

Firstly, the AI layer could help in deriving actionable insights from the billions of data points which come from IoT devices across verticals. This is the obvious benefit, as IoT data from various verticals can act as an input to the AI layer. Deep Learning algorithms play an important role in IoT analytics because machine data is sparse and/or has a temporal element to it. Devices may behave differently under different conditions. Hence, capturing all scenarios for the data pre-processing/training stage of an algorithm is difficult. Deep Learning algorithms can help to mitigate these risks by enabling algorithms to learn on their own. This concept of machines learning on their own can be extended to ‘machines teaching other machines’. The idea is not so far-fetched and is already happening: a Fanuc robot teaches itself to perform a task overnight through observation and reinforcement learning. After eight hours or so it reaches 90 percent accuracy or above, which is almost the same as if an expert had programmed it. The process can be accelerated if several robots work in parallel and then share what they have learned. This form of distributed learning is called cloud robotics.

 

We can extend the idea of ‘machines teaching other machines’ more generically within the Enterprise. Any entity in an enterprise can train other ‘peer’ entities in the Enterprise. That could be buildings learning from other buildings – or planes, or oil rigs. We see early examples of this approach in Salesforce.com and Einstein. Longer term, Reinforcement Learning is the key technology that drives the IoT and AI layer for the Enterprise, but initially any technology that implements self-learning algorithms would help with this task.

Conclusion

In this brief article, we proposed a logical concept called the AI layer for the Enterprise. We could see such a layer as an extension to the Data Warehouse or the ERP system. This has tangible and practical benefits for the Enterprise with a clear business model. The AI layer could also incorporate IoT datasets and unite the disparate ecosystem. This will not be easy, but it is worth it because the payoffs for creating such an AI layer around the Enterprise are huge! The Enterprise AI layer theme is a key part of the Data Science for Internet of Things course. Only a few places remain for this course!

Data Science for Internet of Things course – Strategic foundation for decision makers

To sign up or learn more, email [email protected]. The course starts in Sep 2016.

We have had a great response to the Data Science for Internet of Things course. That course takes a technological focus, aiming to enable you to become a Data Scientist for the Internet of Things. I have also had many requests for a strategic version of the course for decision makers.

Today, we launch a special edition of the course, only for decision makers.

The course is based on an open problem solving methodology for IoT analytics which we are developing within the course.

 Why do we need a methodology for Data Science for IoT?

 

IoT will create huge volumes of data, making the discovery of insights more critical. Often, the analytics process will need to be automated. By establishing a formal process for extracting knowledge from IoT applications by IoT vertical, we capture best practice.

This saves implementation time and cost. The methodology is more than Data Mining (i.e. the application of algorithms); rather, it leans more towards KDDM (Knowledge Discovery and Data Mining) principles. It is thus concerned with the entire end-to-end knowledge extraction process for IoT analytics.

This includes developing scalable algorithms that can be used to analyze massive datasets, interpreting and visualizing results and modelling the engagement between humans and the machine. The main motivation for Knowledge Discovery models is to ensure that the end product will be useful to the user.

Thus, the methodology includes aspects of IoT analytics such as validity, novelty, usefulness and understandability of the results (by IoT vertical). The methodology builds on a series of interdependent steps with milestones. The steps often include loops and iterations, and cover all the processes end to end (including KPIs, business case and project management). We explore Data Science for IoT analytics at multiple levels, including the process level, workflow level and systems level.

The concept of a KDDM process model was discussed in the 1990s by Anand, Brachman, Fayyad, Piatetsky-Shapiro and others. In a nutshell, we build upon these ideas and apply them to IoT analytics. We also create open source code for this methodology.

As a decision maker, by joining the course, you have early and on-going access to both the methodology and the open source code.

Please contact us to sign up or to learn more: [email protected]

Testimonials for our current course

 Jean Jacques Bernand – Paris – France

“Great course with many interactions, either group or one to one that helps in the learning. In addition, tailored curriculum to the need of each student and interaction with companies involved in this field makes it even more impactful.

As for myself, it allowed me to go into topics of interests that help me in reshaping my career.”

Johnny Johnson, AT&T – USA

“This DSIOT course is a great way to get up-to-speed.  The tools and methodologies for managing devices, wrangling and fusing data, and being able to explain it are taking form fast; Ajit Jaokar is a good fit.  For me, his patience and vision keep this busy corporate family man coming back.”

Yongkang Gao, General Electric, UK.

“I especially thank Ajit for his help on my personal project of the course — recommending proper tools and introducing mentors to me, which significantly reduced my pain in the beginning stage.”

Karthik Padmanabhan, Manager – Global Data Insight and Analytics (GDIA) – Ford Motor Pvt Ltd.

“I am delighted to provide this testimonial to Ajit Jaokar who has extended outstanding support and guidance as my mentor during the entire program on Data science for IoT. Ajit is a world renowned professional in the niche area of applying the Data science principles in creating IoT apps. Talking about the program, it has a lot of breadth and depth covering some of the cutting edge topics in the industry such as Sensor Fusion, Deep Learning oriented towards the Internet of things domain. The topics such as Statistics, Machine Learning, IoT Platforms, Big Data and more speak about the complexity of the program. This is the first of its kind program in the world to provide Data Science training especially on the IoT domain and I feel fortunate to be part of the batch comprising of participants from different countries and skill sets. Overall this journey has transformed me into a mature and confident professional in this new space and I am grateful to Ajit and his team. My wish is to see this program accepted as a gold standard in the industry in the coming years”.

Peter Marriott – UK – www.catalystcomputing.co.uk

Attending the Data Science for IoT course has really helped me in demystifying the tools and practices behind machine learning and has allowed me to move from an awareness of machine learning to practical application.

Yair Meidan Israel – https://il.linkedin.com/in/yairmeidandatamining

“As a PhD student with an academic and practical experience in analytics, the DSIOT course is the perfect means by which I extend my expertise to the domain of IoT. It gradually elaborates on IoT concepts in general, and IoT analytics in particular. I recommend it to any person interested in entering that field. Thanks Ajit!”

Parinya Hiranpanthaporn, Data Architect and Advanced Analytics professional Bangkok

“Good content, Good instructor and Good networking. This course totally answers what I should know about Data Science for Internet of Things.”

 

Sibanjan Das – Bangalore

Ajit helped me to focus and set goals for my career, which is extremely valuable. He stands by my side for every initiative I take and helps me navigate through every difficult situation I face. A true leader, a technology specialist, a good friend and a great mentor. Cheers!!!

Manuel Betancurt – Mobile developer / Electronic Engineer. – Australia

I have had the opportunity to partake in the Data Science for the IoT course taught by Ajit Jaokar. He has crafted a collection of instructional videos, code samples, projects and social interaction with him and other students of this deep knowledge.

Ajit gives an awesome introduction and description of all the tools of the trade for a data scientist getting into the IoT. Even when I really come from a software engineering background, I have found the course totally accessible and useful. The support given by Ajit to make my IoT product a data science driven reality has been invaluable. Providing direction on how to achieve my data analysis goals and even helping me to publish the results of my investigation.

The knowledge demonstrated on this course in a mathematical and computer science level has been truly exciting and encouraging. This course was the key for me to connect the little data to the big data.

Barend Botha – London and South Africa – http://www.sevensymbols.co.uk

This is a great course for anyone wanting to move from a development background into Data Science with specific focus on IoT. The course is unique in that it allows you to learn the theory, skills and technologies required while working on solving a specific problem of your choice, one that plays to your past strengths and interests. From my experience care is taken to give participants one to one guidance in their projects, and there is also within the course the opportunity to network and share interesting content and ideas in this growing field. Highly recommended!

- Barend Botha

Jamie Weisbrod – San Diego - https://www.linkedin.com/in/jamie-weisbrod-3630053

Currently there is a plethora of online courses and degrees available in data science/big data. What attracted me to joining the futuretext class “Data Science for IoT” is Ajit Jaokar. My main concern in choosing a course was how to leverage skills that I already possessed as a computer engineer. Ajit took the time to discuss how I could personalize the course for my interests.

I am currently in the midst of the basic coursework but already I have been able to network with students all over the world who are working on interesting projects. Ajit inspires a lot of people at all ages as he is also teaching young people Data science using space exploration.

 Robert Westwood – UK – Catalyst computing
“Ajit brings to the course years of experience in the industry and a great breadth of knowledge of the companies, people and research in the Data Science/IoT arena.”

Overall, the syllabus covers the following themes in 6 months

Note that the schedule is personalized and flexible for the strategic course, i.e. we discuss and personalize your schedule at the start of the course.

  • Principles
  • Problem solving with Data Science: an overall process for solving Data Science problems (agnostic of language), covering aspects such as exploratory data analysis
  • IoT analytics (includes analysis for each vertical within IoT; this will be ongoing throughout the course, including in the methodology)
  • Foundations of R: the basics of one programming language (R) and how to implement Data Science algorithms in R
  • Time Series – which forms the basis of most IoT data (code in R)
  • Spark and NoSQL databases: code in Scala and implementation in Cassandra
  • Deep Learning
  • Data Science for IoT Methodology
  • Maths and Stats (this will also be ongoing but will be a core module)

We also have (from day one) what we call foundation projects, where you work in groups on projects for which you already have code etc, so you apply the concepts in the context of a real situation.

 

Data Science for Internet of Things: A coaching approach

In the Data Science for Internet of Things course I take a coaching approach. I have alluded to this in the post about foundation projects and constructivism.

Coaching has a questionable reputation – with some justification.

But here, we are talking about high-performance coaching strategies.

For example, consider the approach of a book like The Talent Code.

The author explores the world’s greatest talent hotbeds: tiny places that produce huge amounts of talent, e.g. a small gym in Moscow that produces a large number of gold medalists in athletics. He found that there’s a pattern common to all of them: methods of training, motivation, and coaching. They also place an emphasis on hard skills.

So, what does this mean for participants in context of foundation projects?

a) Start with what you know (ideally)

b) Work collaboratively

c) Push your limits (you can choose something different)

d) Each group for a project will have one or more members who are knowledgeable

e) Your outcomes should be specific

f) You can see the big picture through the methodology for problem solving with Data Science for Internet of Things

g) Your contribution should be measurable

h) Your contribution should be based on acquiring a specific skill

i) Foundation projects have a quiz

From my perspective – as tutor / coach

  • I need to understand what the participants already know (baseline)
  • Provide measurable feedback
  • Extend your capabilities/push limits
  • Ensure you acquire definite skills
  • Keep you motivated
  • Keep your learning at the right pace
  • Foster a sense of community
  • Provide alternative mentors in the community
  • Use newer methods of learning, e.g. concept maps
  • Create great conversations
  • Allow room for unplanned expansion

I think these techniques applied online are new – and there is so much for all of us to learn.

If you are interested in the Data Science for Internet of Things course, please email us at info at futuretext.com

Data Science for IoT – the role of foundation projects (constructivist learning)

 

In the Data Science for Internet of Things course, I use some elements of constructivism through the use of foundation projects.

Foundation projects allow the participant to choose a learning context which is most familiar to them, based on their existing experience.
Foundation projects are different from the Capstone project for each participant.
This form of context-based learning is not familiar to most people, hence some notes:
1) Context-based learning is based loosely on constructivism.
A concise description: constructivism is a pedagogy/learning theory which advocates that people construct their own understanding and knowledge of the world through experiencing things and reflecting on those experiences. The teacher makes sure she understands the students’ preexisting conceptions, and guides the activity to address them and then build on them.
Adapted from a quote by Ausubel, one of the pioneers of this approach to education: “The most important single factor influencing learning is what the learner already knows. Ascertain this and teach him accordingly.”
In Holland and Germany, this form of science education is known by various names, e.g. concept-context learning (pdf).
What it means for learning in the Data Science for IoT course:
1) We follow two modes of learning in parallel – instructivist (via the video-based modules) and constructivist (via the foundation projects).
2) For the foundation projects, the participants choose a context most familiar to them from their prior experience (e.g. healthcare, renewables, Industrial IoT etc).
The downside of applying constructivist methods to learning is that they take a relatively long time – hence the longer duration of the course.
For the current batch, the foundation projects and project leaders are:
  • Wearables: led by Quang Nam Tran (London)
  • Renewables: led by Vaijayanti Vadiraj (Bangalore)
  • Python for Data Science: temporarily led by me
  • Big Data – Spark and Cassandra for IoT: temporarily led by me, looking to hand over to Trenton Potgieter (Austin)
  • Deep Learning with Nvidia: led by Jean Jacques Bernard (Paris) and Yongkang Gao (UK)
  • Data visualization with R: Barend Botha (London)
  • Predix (Industrial IoT): temporarily led by me, looking to hand over
  • ETL/Pentaho
  • Deep Learning and Machine Learning with H2O: led by Sibanjan Das (Bangalore)
  • Remote monitoring of elderly/patient care/healthcare: Manuel Betancurt (Sydney)

More details about the course:  Data Science for Internet of Things course

Image: Jean Piaget – the founder of Constructivism

I am listed at No. 19 among the top 50 authorities on Twitter for #IoT

Nice to be listed here amongst some great company.

top 50 authorities on twitter for #iot

Young Data Scientist: Data visualizations of our Ardusat/ASE Space experiment using Python

These are visualizations of the live data from our satellite experiment for the Ardusat/ASE challenge, which we won last year.

They will be part of a book called Young Data Scientist.

Created using the Python libraries json, pandas, matplotlib, statsmodels and numpy.

We use linear regression and logistic regression to detect cloud presence. This will be released as part of the Young Data Scientist book (Countdown Institute).
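For readers curious about the mechanics, here is a hedged sketch of the logistic regression step using statsmodels (one of the libraries listed above). The luminosity readings and the cloud rule are synthetic stand-ins, not the real Ardusat data.

```python
# Sketch: logistic regression for cloud presence from a luminosity reading.
# Synthetic data only; the real experiment used live Ardusat sensor data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
luminosity = rng.uniform(0, 100, 200)        # synthetic sensor readings
cloud = (luminosity > 60).astype(int)        # toy rule: brighter -> cloud
cloud ^= rng.random(200) < 0.1               # flip 10% of labels as noise

X = sm.add_constant(luminosity)              # intercept + predictor
model = sm.Logit(cloud, X).fit(disp=0)       # fit the logistic regression

print(model.params)                                  # fitted coefficients
print(model.predict(sm.add_constant([30.0, 80.0])))  # cloud probabilities
```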

I even got a mapping of the route of the satellite (equatorial orbit). For some background, see:

Using Space Exploration to teach Young People about Data Science

Please email me at ajit.jaokar at futuretext.com with the subject ‘Young Data Scientist’ if you want to know more, as we launch the book/initiative.

A methodology for solving problems with Data Science for Internet of Things

 

Introduction

This (long!) blog is based on my forthcoming book:  Data Science for Internet of Things.

It is also the basis for the Data Science for Internet of Things course that I teach.

I welcome your comments.

Please email me at ajit.jaokar at futuretext.com – email me also for a pdf version if you are interested in joining the course.

Here, we start off with the question: at which points could you apply analytics to the IoT ecosystem, and what are the implications? We then extend this to a broader question: could we formulate a methodology to solve Data Science for IoT problems? I have illustrated my thinking through a number of companies/examples. I personally work with an open source strategy (based on R, Spark and Python), but the methodology applies to any implementation. We are currently working with a range of implementations including AWS, Azure, GE Predix, Nvidia etc. Thus, the discussion is vendor agnostic.

I also mention some trends I am following, such as Apache NiFi.

The Internet of Things and the flow of Data

As we move towards a world of 50 billion connected devices, Data Science for IoT (IoT analytics) helps to create new services and business models. IoT analytics is the application of data science models to IoT datasets. The flow of data starts with the deployment of sensors. Sensors detect events or changes in quantities and provide a corresponding output in the form of a signal. Historically, sensors have been used in domains such as manufacturing. Now their deployment is becoming pervasive through ordinary objects like wearables. Sensors are also being deployed through new devices like robots and self-driving cars. This widespread deployment of sensors has led to the Internet of Things.

Features of a typical wireless sensor node are described in this paper (wireless embedded sensor architecture). Typically, data arising from sensors is in time series format and is often geotagged. This means there are two forms of analytics for IoT: time series and spatial analytics. Time series analytics typically leads to insights like anomaly detection, so classifiers are commonly used in IoT analytics. But by looking at historical trends, streaming, and combining data from multiple events (sensor fusion), we can get new insights. And more use cases for IoT keep emerging, such as Augmented Reality (think Pokemon Go + IoT).
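As a minimal example of the anomaly-detection idea on a sensor time series, here is a rolling z-score sketch in pandas; the temperature stream and the threshold are invented.

```python
# Rolling z-score anomaly detection on a synthetic temperature stream.
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", periods=500, freq="min")
values = np.random.normal(21.0, 0.5, 500)
values[250] = 35.0                          # inject one anomalous spike
series = pd.Series(values, index=idx)

rolling_mean = series.rolling(window=60).mean()
rolling_std = series.rolling(window=60).std()
zscore = (series - rolling_mean) / rolling_std

anomalies = series[zscore.abs() > 4]        # far from recent behaviour
print(anomalies)
```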

Meanwhile, sensors themselves continue to evolve. Sensors have shrunk due to technologies like MEMS, and their communications protocols have improved through new technologies like LoRa. These protocols lead to new forms of communication for IoT, such as Device to Device, Device to Server, or Server to Server. Whichever way we look at it, IoT devices create a large amount of data. Typically, the goal of IoT analytics is to analyse the data as close to the event as possible. We see this requirement in many ‘Smart City’ applications such as transportation, energy grids, utilities like water, street lighting, parking etc.

IoT data transformation techniques

Once data is captured through the sensor, there are a few analytics techniques that can be applied to the data. Some of these are unique to IoT. For instance, not all data may be sent to the Cloud/Lake. We could perform temporal or spatial analysis. Considering the volume of data, some may be discarded at source or summarized at the Edge. Data could also be aggregated, and aggregate analytics could be applied to the IoT data aggregates at the ‘Edge’. For example, if you want to detect failure of a component, you could find spikes in values for that component over a recent span (thereby potentially predicting failure). You could also correlate data in multiple IoT streams. Typically, in stream processing, we are trying to find out what happened now (as opposed to what happened in the past), so the response should be near real-time. Sensor data could also be ‘cleaned’ at the Edge: missing values in sensor data could be filled in (imputing values), sensor data could be combined to infer an event (complex event processing), and data could be normalized. We could handle different data formats or multiple communication protocols, manage thresholds, and normalize data across sensors, time, devices etc.
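A small pandas sketch of those cleaning steps, with invented sensor names and values:

```python
# Impute missing readings, normalize across sensors, and manage thresholds.
import numpy as np
import pandas as pd

readings = pd.DataFrame({
    "sensor_a": [20.1, np.nan, 20.4, 21.0, np.nan, 20.8],
    "sensor_b": [101.0, 102.5, np.nan, 104.0, 103.2, 250.0],
})

imputed = readings.interpolate()                          # fill missing values
normalized = (imputed - imputed.mean()) / imputed.std()   # z-score per sensor
alerts = imputed[imputed["sensor_b"] > 200]               # threshold crossings

print(imputed, normalized, alerts, sep="\n\n")
```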

 

 

Applying IoT Analytics to the Flow of Data

Overview

Here, we address the possible locations and types of analytics that could be applied to IoT datasets.


 

Some initial thoughts:

  • IoT data arises from sensors and ultimately resides in the Cloud.
  • We use the concept of a ‘Data Lake’ to refer to a repository of data.
  • We consider four possible avenues for IoT analytics: ‘Analytics at the Edge’, ‘Streaming Analytics’, NoSQL databases and ‘IoT analytics at the Data Lake’.
  • For Streaming Analytics, we could build an offline model and apply it to a stream.
  • If we consider cameras as sensors, Deep Learning techniques could be applied to image and video datasets (for example, CNNs).
  • Even when IoT data volumes are high, not all scenarios need data to be distributed. It is very much possible to run analytics on a single node using a non-distributed architecture with Python or R.
  • Feedback mechanisms are a key part of IoT analytics. Feedback is part of multiple IoT analytics modalities, e.g. Edge, Streaming etc.
  • CEP (Complex Event Processing) can be applied at multiple points, as we see in the diagram.

 

We now describe various analytics techniques which could apply to IoT datasets.

Complex event processing

Complex Event Processing (CEP) can be used at multiple points in IoT analytics (e.g. Edge, Stream, Cloud etc).

In general, event processing is a method of tracking and analyzing streams of data and deriving a conclusion from them. Complex Event Processing, or CEP, is event processing that combines data from multiple sources to infer events or patterns that suggest more complicated circumstances. The goal of CEP is to identify meaningful events (such as opportunities or threats) and respond to them as quickly as possible.

In CEP, the data is in motion. In contrast, a traditional query (e.g. in an RDBMS) acts on static data. Thus, CEP is mainly about stream processing, but the algorithms underlying CEP can also be applied to historical data.

CEP relies on a number of techniques applied to events, including pattern detection, abstraction, filtering, aggregation and transformation. CEP algorithms model event hierarchies and detect relationships (such as causality, membership or timing) between events. They create an abstraction of event-driven processes. Thus, CEP engines typically act as event correlation engines: they analyze a mass of events, pinpoint the most significant ones, and trigger actions.

Most CEP solutions and concepts can be classified into two main categories: aggregation-oriented CEP and detection-oriented CEP. An aggregation-oriented CEP solution is focused on executing online algorithms as a response to event data entering the system – for example, continuously calculating an average based on data in the inbound events. Detection-oriented CEP is focused on detecting combinations of events called event patterns or situations – for example, detecting a situation by looking for a specific sequence of events. For IoT, CEP techniques are concerned with deriving a higher-order value/abstraction from discrete sensor readings.
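A toy detection-oriented sketch in Python; the event names and the pattern are invented for illustration.

```python
# Detection-oriented CEP: watch a stream for a specific event sequence
# inside a sliding window.
from collections import deque

PATTERN = ("door_open", "motion", "door_open")   # the 'situation' to detect

def detect(stream, window_size=5):
    window = deque(maxlen=window_size)
    for event in stream:
        window.append(event)
        # fire when the most recent events end with the pattern
        if tuple(window)[-len(PATTERN):] == PATTERN:
            yield list(window)

events = ["motion", "door_open", "motion", "door_open", "temp_read"]
for match in detect(events):
    print("situation detected:", match)
```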

CEP uses techniques like Bayesian networks, neural networks, Dempster-Shafer methods, Kalman filters etc. Some more background: Developing a complex event processing architecture for IoT.

Streaming analytics

Real-time systems differ in the way they perform analytics. Specifically, real-time systems perform analytics on short time windows for data streams. Hence, the scope of real-time analytics is a ‘window’, which typically comprises the last few time slots. Making predictions on real-time data streams involves building an offline model and applying it to a stream. Models incorporate one or more machine learning algorithms which are trained using the training data. Models are first built offline based on historical data (spam, credit card fraud etc). Once built, the model can be validated against a real-time system to find deviations in the real-time stream data. Deviations beyond a certain threshold are tagged as anomalies.
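A minimal sketch of this offline/online split; here the ‘model’ is just a mean and standard deviation learned offline from synthetic historical data.

```python
# Offline: fit a simple model on historical data.
# Online: score a live stream and tag large deviations as anomalies.
import numpy as np

historical = np.random.normal(50.0, 2.0, 10000)   # offline training data
mean, std = historical.mean(), historical.std()   # the offline 'model'

THRESHOLD = 4.0                                   # deviation threshold (in stds)

def score_stream(stream):
    for value in stream:
        deviation = abs(value - mean) / std
        if deviation > THRESHOLD:
            yield value, deviation                # tagged as an anomaly

for value, dev in score_stream([49.7, 51.2, 75.0, 50.3]):
    print("anomaly: %.1f (%.1f std devs)" % (value, dev))
```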

IoT ecosystems can create many logs depending on the status of IoT devices. By collecting these logs for a period of time and analyzing the sequence of event patterns, a model to predict a fault can be built, including the probability of failure for the sequence. This failure-prediction model is then applied to the stream (online). A technique like the Hidden Markov Model can be used for detecting failure patterns based on the observed sequence. Complex Event Processing can be used to combine events over a time frame (e.g. the last one minute) and correlate patterns to detect the failure pattern.
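A hedged sketch of the HMM idea using the hmmlearn library; the ‘logs’ below are synthetic numeric sequences, not real device logs.

```python
# Fit an HMM on sequences logged during normal operation, then score new
# windows; a sharp drop in log-likelihood suggests an unfamiliar pattern.
import numpy as np
from hmmlearn import hmm

normal_logs = np.random.normal(0.0, 1.0, (1000, 1))   # healthy sequence
model = hmm.GaussianHMM(n_components=3, n_iter=50, random_state=0)
model.fit(normal_logs)

healthy_window = np.random.normal(0.0, 1.0, (50, 1))
faulty_window = np.random.normal(5.0, 3.0, (50, 1))   # drifting readings

print(model.score(healthy_window))  # higher log-likelihood: looks normal
print(model.score(faulty_window))   # much lower: flag as possible failure
```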

Typically, streaming systems could be implemented in Kafka and Spark.

 

Some interesting links on streaming I am tracking:

Newer versions of Kafka designed for IoT use cases

Data Science Central: stream processing and streaming analytics – how it works

IoT 101 – everything you need to know to start your IoT project – Part One

IoT 101 – everything you need to know to start your IoT project – Part Two

 

Edge Processing

Many vendors like Cisco and Intel are proponents of Edge Processing (also called Edge Computing). The main idea behind Edge Computing is to push processing away from the core and towards the edge of the network. For IoT, that means pushing processing towards the sensors or a gateway. This enables data to be initially processed at the Edge device, possibly enabling smaller datasets to be sent to the core (a sketch of this reduction follows the list below). Devices at the Edge may not be continuously connected to the network; hence, these devices may need a copy of the master data/reference data for processing in an offline format. Edge devices may also include other features like:

  • Apply rules and workflow against that data
  • Take action as needed
  • Filter and cleanse the data
  • Store local data for local use
  • Enhance security
  • Provide governance admin controls
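A small sketch of that edge-side reduction, assuming a gateway that batches raw one-per-second readings into per-minute summaries (field names are invented):

```python
# Edge-side reduction: aggregate raw readings locally so only a compact
# summary travels to the core/cloud.
import numpy as np
import pandas as pd

# One minute of raw 1 Hz readings captured at the gateway
raw = pd.DataFrame({
    "ts": pd.date_range("2016-09-01 10:00", periods=60, freq="s"),
    "temp": np.random.normal(21.0, 0.3, 60),
})

summary = raw.set_index("ts")["temp"].resample("1min").agg(["mean", "min", "max"])
payload = summary.to_json()   # 60 readings reduced to one summary record
print(payload)
```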

IoT analytics techniques applied at the Data Lake

Data Lakes

The concept of a Data Lake is similar to that of a Data Warehouse or a Data Mart. In this context, we see a Data Lake as a repository for data from different IoT sources. A Data Lake is typically driven by the Hadoop platform. This means data in a Data Lake is preserved in its raw format; unlike in a Data Warehouse, data in a Data Lake is not pre-categorised. From an analytics perspective, Data Lakes are relevant in the following ways:

  • We could monitor the stream of data arriving in the lake for specific events, or correlate different streams. Both of these tasks use Complex Event Processing (CEP). CEP could also be applied to data once it is stored in the lake, to extract broad, historical perspectives.
  • Similarly, Deep Learning and other techniques could be applied to IoT datasets in the Data Lake when the data is ‘at rest’. We describe these below.

ETL (Extract Transform and Load)

Companies like Pentaho are applying ETL techniques to IoT data.

Deep learning

Some Deep Learning techniques could apply to IoT datasets. If you consider images and video as sensor data, then we could apply various convolutional neural network techniques to this data.
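A minimal Keras sketch of such a CNN; the input shape, class count and dummy frames are placeholders.

```python
# Tiny CNN for image 'sensor' data (e.g. camera frames). Placeholder shapes.
import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(16, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),   # e.g. obstacle / no obstacle
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

frames = np.random.rand(8, 64, 64, 3)        # dummy camera frames
labels = np.random.randint(0, 2, 8)
model.fit(frames, labels, epochs=1, verbose=0)
```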

It gets more interesting when we consider RNNs (Recurrent Neural Networks) and Reinforcement Learning. For example:

  • Reinforcement learning and time series – Brandon Rohrer
  • How to turn your house robot into a robot – Answering the challenge: a new reinforcement learning robot

Over time, we will see far more complex options – for example, for self-driving cars and the use of Recurrent Neural Networks (Mobileye).

Some more interesting links for Deep Learning and IoT:

Optimization

Systems-level optimization and process-level optimization for IoT is another complex area where we are doing work. Some links for this:

 

 Visualization

Visualization is necessary for analytics in general, and IoT analytics is no exception.

Here are some links

NoSQL databases

NoSQL databases today offer a great way to implement IoT analytics (a minimal Cassandra sketch follows the links below). For instance:

Apache Cassandra for IoT

MongoDB and IoT tutorial
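As a sketch of what an IoT time-series layout can look like in Cassandra via the Python driver (cassandra-driver); the keyspace and table names are illustrative, and a local cluster is assumed.

```python
# Time-series layout in Cassandra: partition by sensor, cluster by time.
from datetime import datetime
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()   # assumes a local Cassandra node

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS iot
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS iot.readings (
        sensor_id text, ts timestamp, value double,
        PRIMARY KEY (sensor_id, ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
""")

session.execute(
    "INSERT INTO iot.readings (sensor_id, ts, value) VALUES (%s, %s, %s)",
    ("sensor-1", datetime.utcnow(), 21.4))

for row in session.execute(
        "SELECT * FROM iot.readings WHERE sensor_id = %s LIMIT 10", ("sensor-1",)):
    print(row.sensor_id, row.ts, row.value)
```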

 

Other IoT analytics techniques

In this section, I list some IoT technologies where we could implement analytics.

 

A Methodology to solve Data Science for IoT problems

We started off with the question: at which points could you apply analytics to the IoT ecosystem, and what are the implications? But behind this work is a broader question: could we formulate a methodology to solve Data Science for IoT problems? I am exploring this question as part of my teaching, both online and at Oxford University, along with Jean-Jacques Bernard.

Here is more on our thinking:

  • CRISP-DM is a data mining process methodology used in analytics. More on CRISP-DM HERE and HERE (pdf documents).
  • From a business perspective (top down), we can extend CRISP-DM to incorporate an understanding of the IoT domain, i.e. add domain-specific features. This includes understanding the business impact, handling high volumes of IoT data, understanding the nature of data coming from various IoT devices etc.
    • From an implementation perspective (bottom up), once we have an understanding of the data and the business processes, for each IoT vertical we first find the analytics (what is being measured, optimized etc), then find the data needed for those analytics, then provide examples of the implementation using code. Extending CRISP-DM to an implementation methodology, we could have process (workflow), templates, code, use cases, data etc.
    • For implementation, we are looking to initially use open source R and Spark and the h2o.ai API.

 

Conclusion

We started off with the question: at which points could you apply analytics to the IoT ecosystem, and what are the implications? We extended this to a broader question: could we formulate a methodology to solve Data Science for IoT problems? The above is comprehensive but not absolute. For example, you can implement deep learning algorithms on mobile devices (Qualcomm Snapdragon machine learning development kit for mobile devices). So, even as I write it, I can think of exceptions!

 

This article is part of my forthcoming book on Data Science for IoT and also of the courses I teach.

I welcome your comments. Please email me at ajit.jaokar at futuretext.com – email me also for a pdf version if you are interested. If you want to be a part of my course, please see the testimonials at Data Science for Internet of Things Course.