Testing IoT at Scale Using Realistic Data, Part II

This installment continues the conversation on how to comprehensively test a large IoT system, not just the sum of its parts. KnowThings.io introduces a novel solution that uses machine learning to perform quality testing on a massive scale while dramatically reducing the engineering workload…

Editor’s Note: This is Part II of a two-part series on testing IoT on a massive scale. See also Part I of Testing IoT at Scale Using Realistic Data.

The Internet of Things (IoT) is growing into big business, and with it comes masses of sensors, all gathering data for a more significant cause. One of the problems developers have, however, is how to realistically test the whole system, in all its potential chaotic glory. How do you road-test managing communications with thousands of IoT sensors to a cloud? And even if you manage to fake that kind of traffic, what nuances are you missing at that kind of scale?

Embedded Systems Engineering’s Editor-in-Chief, Lynnette Reese, sat down with KnowThings.io, an ambitious, smart startup accelerator project within CA Technologies that created a tool for IoT developers to realistically test IoT applications, from a small to an immense scale by leveraging machine learning. They currently call it the Self-learning IoT Virtualizer. KnowThings.io’s CEO, Anand Kameswaran, talked with ESE about its mission to make realistic IoT simulation and testing effective and easy, including cloud interaction, internet foibles, massive numbers of IoT sensors and connections, and IoT chaos in general.

Lynnette Reese (LR), Embedded Systems Engineering: Why not just create a script that fakes inputs from a large number of devices? More IoT just creates more traffic, right?

Anand Kameswaran (AK), KnowThings.io: That sounds good on the face of it, but IoT doesn’t work that way past the first layer. There’s much more to a live, inter-integrated IoT ecosystem than simulating sensors spinning out data. We wanted a way to test an entire IoT network, interacting with a cloud and potential latencies, against as much chaos as real-world scenarios can throw at an engineer who’s trying to make sure that an automated 10-plate juggling act scales to 10,000 plates, including a database, cloud, and multiple IP addresses.

Faking 10,000 sensors can be done manually using a computer to spit out data. However, not only does that approach take time and involve a potential learning curve, but the data also comes from a single source and does not vary naturally, as it would when a machine-learning algorithm simulates it realistically. The ability to generate realistic data scenarios for our customers is one of our unique value propositions. This includes factoring in latency, environmental factors, and the correlation between them and the IoT data, as well as replicating real data in a way that can be expected to stay accurate over an extended period. In addition, the ability to generate a large amount of data (for example, one year’s worth of data) within a brief period is very valuable for customers who want to use a huge quantity of data to qualify and make decisions on their predictive analytic systems under test.
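
To make the difference between a flat script and correlated data more concrete, here is a minimal Python sketch. It is not the KnowThings.io method and uses no machine learning; it only illustrates the ingredients mentioned above: many sensors, a shared environmental factor, readings that correlate with it, and per-message latency. Every name and constant (the Reading class, the simulate function, the daily temperature curve, the linear humidity relation, the log-normal latency) is an assumption chosen for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: int
    t_seconds: int       # simulated clock, not wall-clock time
    temperature_c: float
    humidity_pct: float
    latency_ms: float    # simulated network delay for this message


def simulate(num_sensors, hours, step_s=3600, seed=0):
    """Generate one reading per sensor per simulated time step."""
    rng = random.Random(seed)
    # Each hypothetical sensor gets a fixed calibration offset.
    offsets = [rng.gauss(0.0, 0.5) for _ in range(num_sensors)]
    readings = []
    for step in range(hours * 3600 // step_s):
        t = step * step_s
        # Shared ambient temperature: coolest at midnight, warmest at midday.
        ambient = 20.0 + 5.0 * math.sin(2 * math.pi * (t / 86400.0 - 0.25))
        for sensor_id, offset in enumerate(offsets):
            temp = ambient + offset + rng.gauss(0.0, 0.2)
            # Assumed relation: humidity falls as temperature rises.
            humidity = 60.0 - 2.0 * (temp - 20.0) + rng.gauss(0.0, 2.0)
            humidity = max(0.0, min(100.0, humidity))
            # Log-normal latency: median of about 50 ms with a long tail.
            latency = rng.lognormvariate(math.log(50.0), 0.5)
            readings.append(Reading(sensor_id, t, temp, humidity, latency))
    return readings


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


data = simulate(num_sensors=100, hours=24)
r = pearson([d.temperature_c for d in data], [d.humidity_pct for d in data])
print(len(data), "readings; temperature/humidity correlation:", round(r, 2))
```

Because the clock is simulated, a longer span costs only compute time: raising hours to 8760 would produce a year of hourly readings without waiting a year. A hand-written model like this still has to be designed and maintained case by case, which is the manual effort the interview describes.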

The IoT Virtualizer saves time and helps you find issues that you didn’t know you would or could have at scale. It allows IoT developers to build robust code by testing from the ground up and helps testers find out what they don’t know. Someone said, “There are things we know that we know. And there are known unknowns, which just means that there are things that we now know we don’t know. But there are also unknown unknowns.” And this is what KnowThings provides, a look into testing what we do not know that we don’t know.

LR: I think that was Donald Rumsfeld, talking about national security.

AK: Yes, that sounds right. But it’s true. You cannot test for what you don’t know can possibly happen at higher levels, after everything is online and interconnected. Abstractions aside, it’s just smart to test thoroughly, and simulation and machine-learning are our core competencies.

LR: Speaking of security, how does the self-learning IoT Virtualizer help with security?

AK: We take security seriously and are working on that, with plans to add the ability to deal with encrypted data streams in our next major release. But first we had to get the tool working solidly for as many IoT industries and scenarios as possible. Another interesting angle for security is testing the whole system and how the entire system in action reacts to hacking attempts. If a node is overwhelmed, will the security you have in place work the same as you planned when it had only a couple of requests a second? Our tool creates realistic, dynamic scenarios in which to test security hypotheses. In addition, working with our customers, we found some creative ways to detect phone home and back-door security exploits that we are hoping to market in the future.

LR: So, I’m hearing that it’s just another way of prototyping with a more realistic, real-world scenario simulation using machine learning.

AK: Large-scale mass data input is difficult to realistically simulate without the IoT Virtualizer, because we incorporated machine learning to replicate the underlying physical behavior. Building on that, the Virtualizer creates a simulation of what large IoT systems might experience in the real world. And this is how IoT developers can test their IoT systems realistically and thoroughly. Once the tool creates a template and gets a basic idea of the overall interactions, it fills in the gaps for you. That is where some of the time savings come from. The manual way of doing simulations requires you to build every case that you want to test for, and that becomes very time-consuming. Our commercial version, due later this year, is planned to let the software handle network latency and congestion by editing the model. The timing of device interactions can be adjusted to replicate devices functioning in different regions of the world, for example.

Figure 1: FunkNFresh Farm uses aquaponics. The fish live in the same water that is fed to hydroponically grown vegetables.

LR: I’m having a hard time visualizing what the Virtualizer is virtualizing, so to speak. Can you give me an example of an application where the Virtualizer would make a real difference for a developer or quality/test engineer?

AK: One example of a use for this first product is in farming or large-scale agriculture. There are so many sensors that are being deployed to monitor the moisture, temperature, humidity and various other factors. The data is collected and developers look for anomalies. But developers do not have the ability to set up 10,000 actual sensors and have them working as they would in the real world. That’s what we can offer.

LR: I recall researching for this interview and reading your blog on the hydroponics farm, FunkNFresh Farms. Can you expand on that IoT example?

AK: Sure, but it’s an aquaponics farm, not just hydroponics. Aquaponics is when you raise fish in a pond or tank. The fish live in the same water that is fed to hydroponically grown vegetables. The plants get nourishment from the fish waste and also purify the water, keeping the fish healthy. Whereas high technology and instrumentation are not exactly required for aquaponics, this is my wife’s company, so I was recruited to make it all work, and of course I had to make the greenhouse profitable.

LR: So, you have personal IoT design and development experience, then?

AK: Oh, yes. In fact, I spent several years contributing to open source automation servers and other similar projects before the current generation of smart home solutions. As far as the greenhouse project, it was a lot of fun, and the farm is a successful project that’s selling produce. Aquaponics requires a controlled mini ecosystem but brings year-round crop production. The IoT part of it includes optimization of greenhouse operations through instrumentation, data collection, and automated actions. So, the greenhouse is an IoT device, which measures and reports on water temperature, ambient lighting, and circulating water pH.

Figure 2: The aquaponics farm has an air and a water pump. Both pumps are monitored simultaneously by a microphone.

LR: This is connected to the web?

AK: Yes, and before you point out that this is just automation and that IoT involves data analysis, I am working on that through audio analysis. I am putting a microphone in the greenhouse that picks up audio of both the water and air pump. In a two-for-one status check, the system checks both pump run statuses at the same time by capturing the audio and doing spectrum analysis on the .wav files. I can determine if the pumps are running, and if they are under stress or load. While this is very much a work in progress, I am happy that it’s moving in the right direction.

LR: Clever.

AK: Thanks. It is progressing well, and it’s a cheaper solution than individually monitoring each pump. The analysis is done on a remote computer. Over time we will have enough files to not only detect a pump that has shut off, but also predict a pump that is about to fail. IoT not only alerts on status, but also allows me to do optimizations at a lower cost, with fewer parts and less complexity than a straight automation route.

Figure 3: The .wav files from the microphone are sent via internet to a remote computer for an audio analysis to determine if a pump has failed (and if so, which one), and whether a pump is under an unexpected load.
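
As a rough illustration of the two-for-one check, the Python sketch below tests one audio buffer for energy at two frequencies using the Goertzel algorithm, a standard way to evaluate a single DFT bin. It is not the farm's actual analysis. The pump frequencies (120 Hz and 300 Hz), the detection threshold, and the assumption that each pump shows up as a single steady tone are all hypothetical, and the audio is synthesized here rather than read from a .wav file so the example stays self-contained.

```python
import math
import random


def goertzel_power(samples, sample_rate, target_hz):
    """Squared DFT magnitude at the bin nearest target_hz (Goertzel)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def tone_amplitude(samples, sample_rate, target_hz):
    """Estimated amplitude of a sinusoid at target_hz."""
    power = max(goertzel_power(samples, sample_rate, target_hz), 0.0)
    return 2.0 * math.sqrt(power) / len(samples)


def pump_status(samples, sample_rate, pump_hz, threshold=0.1):
    """Map each pump name to True if its tone exceeds the threshold."""
    return {
        name: tone_amplitude(samples, sample_rate, hz) > threshold
        for name, hz in pump_hz.items()
    }


def synth(sample_rate, seconds, tones, seed=0):
    """Synthetic microphone signal: a sum of (hz, amplitude) tones plus noise."""
    rng = random.Random(seed)
    n = int(sample_rate * seconds)
    return [
        sum(a * math.sin(2.0 * math.pi * hz * i / sample_rate) for hz, a in tones)
        + rng.gauss(0.0, 0.05)
        for i in range(n)
    ]


RATE = 8000  # samples per second (assumed)
PUMP_HZ = {"water": 120.0, "air": 300.0}  # hypothetical signature tones

both = synth(RATE, 1.0, [(120.0, 0.5), (300.0, 0.5)])
water_only = synth(RATE, 1.0, [(120.0, 0.5)])
print(pump_status(both, RATE, PUMP_HZ))
print(pump_status(water_only, RATE, PUMP_HZ))
```

Real pump audio is broadband rather than a clean tone, so a practical version would likely compare whole spectra or band energies over time. That is also where signs of stress or load, as opposed to simple on/off status, would have to be found.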

LR: How did the Self-Learning IoT Virtualizer come into being?

AK: We started out with machine learning and genome sequencing on a massive scale. Fast-forward to three or four years later, and we are applying it to IoT devices, for example, smartwatches with heart monitors and such. We are training on the data that transmits back and forth, so when your smartwatch is telling your phone, “your pulse rate is now at 58 bpm,” we can learn quickly what that means and be able to reproduce that without an engineer having to serve as an interpreter. The machine learning does all of that interpretation for us. This means that if I am a developer, at the end of the day I don’t really care how those bytes are structured. I just want to run my tests, build my code, and configure the system. Machine learning makes it so that the very time-consuming tasks are made faster, for example, configuring the physical system or the simulator by replicating the scenarios.
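
The idea of learning a message layout from traffic, rather than from a protocol document, can be sketched with a deliberately simple stand-in. The Python below is a plain heuristic, not machine learning and not the Virtualizer's algorithm: it compares captured fixed-length messages, treats byte positions that never change as a template, records the observed range at positions that vary, and then emits new messages within that range. The four-byte frame format and the function names are invented for this example.

```python
import random


def infer_template(messages):
    """Constant byte value per position, or None where the byte varies."""
    length = len(messages[0])
    if any(len(m) != length for m in messages):
        raise ValueError("this sketch only handles fixed-length messages")
    template = []
    for i in range(length):
        values = {m[i] for m in messages}
        template.append(values.pop() if len(values) == 1 else None)
    return template


def observed_ranges(messages, template):
    """(min, max) of the observed byte values at each varying position."""
    return {
        i: (min(m[i] for m in messages), max(m[i] for m in messages))
        for i, fixed in enumerate(template)
        if fixed is None
    }


def generate(template, ranges, rng):
    """Build a new message: fixed bytes copied, varying bytes sampled."""
    out = bytearray()
    for i, fixed in enumerate(template):
        if fixed is not None:
            out.append(fixed)
        else:
            lo, hi = ranges[i]
            out.append(rng.randint(lo, hi))  # may be a value never captured
    return bytes(out)


# Hypothetical captured frames: header 0xAA 0x01, pulse rate in bpm, trailer 0x55.
captured = [
    bytes([0xAA, 0x01, 58, 0x55]),
    bytes([0xAA, 0x01, 61, 0x55]),
    bytes([0xAA, 0x01, 72, 0x55]),
]

template = infer_template(captured)
ranges = observed_ranges(captured, template)
new_message = generate(template, ranges, random.Random(0))
print(template, ranges, new_message.hex())
```

Nothing in this code knows that the third byte is a pulse rate, which is the point of the passage above: a test can replay plausible traffic without an engineer decoding the bytes first. A learned model would go much further, for example by handling variable-length messages and request/response sequences.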

LR: I can see how that would save time for engineers because they don’t have to stop to decode what’s going on. The Self-Learning IoT Virtualizer carries that load.

AK: Yes, and the engineer is relieved of the task of getting his or her physical IoT device environment set up just to verify application logic, because the realistic data is generated instead. By plugging in the virtualizer, which utilizes machine learning, you can take what would be two to three weeks of painstaking work and accomplish it in about five minutes. You do not need to be an expert in every device or every piece of kit you’re playing with; you just need to know how they all come together as a whole. Now I can use a tool to create the realistic data that I need in about five minutes, which otherwise could have taken up to a month to produce and would need subsequent care and feeding throughout the life cycle of a product.

LR: What kind of QA testing have you already done?

AK: We’re on the third version of the tool and have tested with several real-world customers who have very large IoT networks. It took one of our customers about 80 hours to create their own simulation for their particular IoT scenario. KnowThings.io was able to create an adaptive virtual device for them using our Self-Learning IoT Virtualizer in just five minutes. We call them ‘adaptive’ virtual devices because they can learn how to simulate behavior that they have not yet seen but that is still possible. Adaptive virtual devices are useful for testing because you aren’t forced to think of all of the ways your device could behave. In the end you can test with a blend of what you actually observed, what was generated through the machine learning, and whatever else you want to add. This grants complete coverage to test and build your solution.

LR: What kinds of clouds do you work with? Can the Virtualizer capture the quirks of the various clouds from different providers?

AK: The IoT Virtualizer is cloud-agnostic. It doesn’t matter what cloud you have chosen, because we interface deeper down, in the network layer.

LR: What IoT protocols can the IoT Virtualizer support?

AK: Our KnowThings product currently supports TCP/IP, REST, and CoAP over TCP. We are working on integrating several other protocols, such as Zigbee, LoRa, Modbus, and Bluetooth. We are open to suggestions.

LR: IoT can be as simple as a group of sensors. Can I virtualize a group of sensors?

AK: Yes, you can virtualize sensors on a very large scale if you like.

LR: How can developers get their hands on this tool?

AK: We are in the early adopter stage right now and offering a role in beta testing. There’s an opportunity for us to partner with those customers that are part of the early adoption program. Not only will partners shape what everything should look like, but also help in developing best practices in a very challenging development environment. We want to know about the real challenges IoT is facing and concentrate on solving the problems that IoT developers care about.

Anyone interested in trying it out and contributing suggestions to improving the tool can download the community edition pre-release, or sign up for the early adopter program at . Commercial product launch is in mid-summer.

For more information, go to the KnowThings.io Self-Learning IoT Virtualizer FAQ online.
