NIST framework aids big data tool creation for multiple platforms

The National Institute of Standards and Technology has developed a framework to support the creation of tools that can be used in any computing environment.

NIST is a physical sciences laboratory in the Department of Commerce with a mission to support the nation’s innovation and industrial competitiveness.

To improve approaches for analyzing very large quantities of data, NIST computer scientists have released specifications, aimed at healthcare entities and other industries, for building more widely useful technical tools for data-intensive analysis.

The agency has published a final version of the NIST Big Data Interoperability Framework, the result of a collaborative effort that involved more than 800 experts from various industries.

The new NIST framework guides developers on how to deploy software tools that can analyze data using any type of computing platform, from a single laptop to the most powerful cloud-based environment. The framework further enables analysts to move work from one platform to another and substitute a more advanced algorithm without retooling the computing environment.
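
The idea of swapping platforms and algorithms independently can be illustrated with a minimal sketch. The code below is not taken from the NIST framework; the `Platform` interface, the two platform classes, and the `analyze` function are hypothetical names chosen for illustration, and a local thread pool merely stands in for a larger computing environment.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Protocol


class Platform(Protocol):
    """Hypothetical interface: any backend that can apply a function to data."""

    def map(self, fn: Callable[[float], float], data: Iterable[float]) -> list: ...


class LaptopPlatform:
    """Runs the work in-process, one item at a time."""

    def map(self, fn, data):
        return [fn(x) for x in data]


class ThreadPoolPlatform:
    """Stand-in for a bigger environment; spreads work over local threads."""

    def __init__(self, workers: int = 4):
        self.workers = workers

    def map(self, fn, data):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(fn, data))


def analyze(platform: Platform, algorithm: Callable[[float], float], data):
    """Analysis code depends only on the interface, not on a concrete platform."""
    return sum(platform.map(algorithm, data))


def square(x):
    return x * x


def cube(x):
    return x * x * x


readings = [1.0, 2.0, 3.0]

# Same analysis, different platform: no change to analyze().
on_laptop = analyze(LaptopPlatform(), square, readings)
on_pool = analyze(ThreadPoolPlatform(), square, readings)

# Same platform, different algorithm: no change to the platform.
with_cube = analyze(LaptopPlatform(), cube, readings)
```

Because `analyze` sees only the shared interface, either side can be replaced without touching the other, which is the kind of decoupling the framework is described as encouraging.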

NIST wants to enable data scientists to do effective work using whatever platform they choose, says Wo Chang, a NIST computer scientist. “This framework is a reference for how to create an agnostic environment for tool creation,” he adds. “If software vendors use the framework’s guidelines when developing analytical tools, then analysts’ results can flow uninterruptedly, even as goals change and technology advances.”

The framework will help data scientists who are trying to extract meaning from ever-growing and varied datasets while navigating a shifting technology ecosystem.

For example, interoperability is increasingly important as huge amounts of data pour in from growing numbers of platforms, including tiny sensors and devices being linked into the Internet of Things.

With the rapid growth in tool availability, data scientists can scale work from a desktop to distributed cloud environments, but this comes at a cost, according to NIST. Tools may have to be rebuilt from scratch using a different computer language or algorithm, costing staff time and possibly critical insights.

The NIST framework is an attempt to address these problems. It includes consensus definitions and taxonomies to ensure developers are on the same page when they discuss plans for new tools, as well as guidance on the data security and privacy protections that tools should have.

“The reference architecture interface specification will enable vendors to build flexible environments that any tool can operate in,” notes Chang. “Before there was no specification on how to create interoperable solutions. Now, they will know how.”
