Fast Track IoT with MongoDB

While working with clients who are starting down their Internet of Things adoption path, I’ve noticed some anti-patterns that are delaying, and in many cases derailing, these efforts. I believe the main issue is the conflation of IoT with Hadoop: discussions of processing machine data lend themselves so easily to the Volume, Velocity and Variety paradigm that is now synonymous with Big Data and, by extension, Hadoop. People also tend to use Big Data and NoSQL interchangeably, which is simply inaccurate. These false equivalences have led many IoT initiatives to be labeled a “science project”. Ironically, my suggestion for avoiding that label is to actually follow the structure of a science project.

To set the stage for our science project, first break down the problem set. The question is “what is the most effective mechanism with which to extract meaningful, actionable intelligence from incoming machine data?” Typically, Enterprise Data Warehouses have collected, transformed and aggregated data from operational systems using relational databases. This is a well-known, widely used, very effective pattern for a majority of use cases. Devices, however, naturally generate data differently than people do, and JSON has become a de facto standard format for exchanging machine-generated data. The hypothesis is that a NoSQL solution could be more effective than a traditional solution, whatever “traditional” means in your organization (.NET/SQL Server, Java/Oracle, Ruby/MySQL, etc.).

Finally, you will need a set of metrics with which to compare MongoDB against a proposed solution built on your current environment and/or against Hadoop. As long as each metric can be rigidly defined, you can produce a fair and accurate comparative analysis. For data projects, the following list is fairly standard.

  1. Resiliency
  2. Stability
  3. Adaptability of Data Model
  4. Data Migration and Movement
  5. Performance
  6. Configuration Flexibility
  7. Administrator Functionality
  8. Enterprise-Grade Support Options
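One way to keep the comparison honest is to fix the scoring rules before any candidate is evaluated. The sketch below is only an illustration of that idea: the function names, the 1 to 5 scale, the weights and the candidate labels are all assumptions of mine, and the numbers are placeholders rather than measurements of any product.

```python
# Hypothetical scoring sketch. The metric names come from the list above;
# everything else (scale, weights, candidate names, scores) is a placeholder.
METRICS = [
    "Resiliency",
    "Stability",
    "Adaptability of Data Model",
    "Data Migration and Movement",
    "Performance",
    "Configuration Flexibility",
    "Administrator Functionality",
    "Enterprise-Grade Support Options",
]


def weighted_score(scores, weights):
    """Return the weighted average of the per-metric scores."""
    missing = [m for m in METRICS if m not in scores or m not in weights]
    if missing:
        raise ValueError(f"every metric needs a score and a weight; missing: {missing}")
    total_weight = sum(weights[m] for m in METRICS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[m] * weights[m] for m in METRICS) / total_weight


def rank(candidates, weights):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, weighted_score(s, weights)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Equal weights and made-up scores, purely to show the mechanics.
    weights = {m: 1 for m in METRICS}
    candidate_a = {m: 3 for m in METRICS}
    candidate_b = dict(candidate_a, Performance=5)
    for name, score in rank({"candidate_a": candidate_a, "candidate_b": candidate_b}, weights):
        print(name, score)
```

Agreeing on the weights with your stakeholders up front matters more than the arithmetic, because it stops the result from being tuned after the fact.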

Let’s take a quick look at the potential candidates for an IoT system.

Like any good science project, we want to limit the number of variables that are being changed, optimally to a single variable. Hadoop provides a robust ecosystem of tools to support analytics processing on petabyte-scale data. A platform this powerful requires specialized expertise on both the development and operational sides of the house. While there are tools that make life easier for both sides, it’s still not easy. MongoDB is a NoSQL database, not a platform. Quite simply, MongoDB is a tool; Hadoop is a suite of tools. With MongoDB, your developers will need to understand a JSON document model, but this is also true of common tools such as RESTful web services. So while your current developers will almost certainly never have written a MapReduce job or a Pig script, they will almost certainly have worked with JSON.

Data from devices typically has a flexible schema, or a schema that is out of your control, so a schema-on-read database makes sense. Data from devices typically arrives as JSON, so a database that stores JSON-like documents natively makes sense. Developers will be most effective in the shortest amount of time if you allow them to keep their programming language of choice. MongoDB satisfies all of these. Even if you have Hadoop in place, the value of sensor and machine data tends to lie in near-real-time analysis, at which MongoDB excels. While Hadoop is best known for batch analytics on large datasets, MongoDB’s Aggregation Framework provides a robust alternative to MapReduce. And if your data volume grows enough to overwhelm the Aggregation Framework, the MongoDB Connector for Hadoop can move that data into Hadoop for archiving and batch processing.
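To make the schema and aggregation points concrete, here is a minimal sketch. The device names, field names and readings are invented for illustration. The pipeline is written as plain Python data, in the shape one would pass to a driver call such as PyMongo's `collection.aggregate(...)`, and the helper function is my own pure-Python mirror of what those stages compute, so the example runs without a MongoDB server.

```python
import json
from collections import defaultdict

# Hypothetical readings. Note that the documents do not share the same fields,
# which is the "flexible schema" situation described above.
RAW_READINGS = [
    '{"device_id": "pump-1", "ts": 1, "temp_c": 20.0}',
    '{"device_id": "pump-1", "ts": 2, "temp_c": 22.0, "vibration": 0.3}',
    '{"device_id": "fan-7", "ts": 1, "rpm": 1200}',
    '{"device_id": "fan-7", "ts": 2, "temp_c": 30.0, "rpm": 1180}',
]

# An Aggregation Framework pipeline expressed as data: keep documents that
# carry a temperature, then average it per device and sort by device id.
PIPELINE = [
    {"$match": {"temp_c": {"$exists": True}}},
    {"$group": {
        "_id": "$device_id",
        "avg_temp_c": {"$avg": "$temp_c"},
        "readings": {"$sum": 1},
    }},
    {"$sort": {"_id": 1}},
]


def average_temp_by_device(raw_readings):
    """Pure-Python mirror of PIPELINE, for illustration only."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in raw_readings:
        doc = json.loads(line)      # no schema is declared up front
        if "temp_c" not in doc:     # mirrors the $match stage
            continue
        sums[doc["device_id"]] += doc["temp_c"]
        counts[doc["device_id"]] += 1
    return [
        {"_id": dev, "avg_temp_c": sums[dev] / counts[dev], "readings": counts[dev]}
        for dev in sorted(sums)     # mirrors the $sort stage
    ]


if __name__ == "__main__":
    for row in average_temp_by_device(RAW_READINGS):
        print(row)
```

Against a real deployment, the same `PIPELINE` list would be handed to the driver and the database would do the grouping. The point of the sketch is that neither the inserts nor the query required a table definition when the fan started reporting temperature.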

A good proof of concept, like a good science experiment, can’t get out of the lab based on results alone. These results point to the adoption of a new technology, and senior management is likely going to want more than an enthusiastic developer’s perspective. The issues that I usually hear from senior management about their failure to adopt any new technology include:

  1. Lack of people with the right skills
  2. Rapidly changing technology solutions
  3. Lack of time, too busy with current departmental jobs
  4. No clear business imperative

I think I first wrote this list on a 5250 in DCF/SCRIPT, but it still has merit. The most effective way to implement a change that will go to Production is to make that change as small as possible, but no smaller. Since MongoDB is only one tool and introduces only one new idea (document-oriented storage), it’s fairly straightforward to retrain your existing staff. If you do need to go the staff augmentation route, MongoDB is the most popular NoSQL database, so finding experienced people should be straightforward. Hadoop is a platform, so it presents many tools changing at different rates, which makes it difficult to keep up. The same is not true of a single product. Time is a flexible tool in a developer’s hands; you’ll be surprised at how much free time there is for a MongoDB project. Finally, the business has requirements it doesn’t know how to articulate, since you haven’t yet enabled it with all of this new, fast, cheap data. Pick low-hanging fruit, like uptime, and push it to your most forward-thinking line-of-business partners.

It has been my experience that many enterprise Big Data or NoSQL projects fail to make it to Production not because the technologies do not work but because the right tool is not used for the right job. At this point, there has been sufficient adoption across so many types of businesses, successfully implementing a huge assortment of use cases, that we are no longer in Gartner’s “hype” zone with Big Data and NoSQL. The problems of adoption are now likely to be organizational rather than technical, which in theory should make it easier to deploy these new technologies successfully. However, success is never guaranteed and you need to choose your tools, and battles, wisely. Particularly for IoT projects, MongoDB stands out as the most reasonable path to NoSQL success.

