Friday, November 1, 2019

MIT ‘SuperCloud’ analyzes the internet

©Mark Ollig




Massachusetts Institute of Technology (MIT) researchers are using their SuperCloud for capturing and analyzing web traffic from around the world.

Over the years, we have seen more clouds cover the sky of the internet; lately, it has been very cloudy.

We have an ever-growing and impressive-sounding list of computing clouds, which includes Google Cloud Platform, Amazon Web Services, Apple iCloud, Microsoft Azure, IBM Cloud, and Oracle Cloud.

There are hybrid clouds, personal clouds, private clouds, and clouds with other names.

Now, we have the MIT SuperCloud.

Will there ever be a Bits & Bytes Cloud? Stay tuned.

Cloud computing, by any other name, is a hub of internet-connected computers used for storing and accessing data and software programs.

But, I digress.

Starting in 2015, MIT collected a massive dataset containing more than 50 billion packets of global internet traffic.

MIT's analytical study model included web giants with large supernode computing hubs, such as Google and Facebook, along with thousands of other websites sprinkled across the internet.

MIT collected the internet data points that connect us with online services through our computer programs, smartphone apps, and other devices, using anonymization techniques to protect the privacy of those using the services.
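To give a rough idea of how that kind of anonymization can work, here is a small Python sketch of my own (an illustration, not MIT's actual method) that replaces each IP address with a salted, one-way hash before anything is stored:

import hashlib

# Hypothetical salt value; a real collection system would keep this secret.
SALT = b"example-salt"

def anonymize_ip(ip_address: str) -> str:
    # A one-way salted hash: the original IP address cannot be read back out.
    return hashlib.sha256(SALT + ip_address.encode()).hexdigest()[:16]

print(anonymize_ip("192.0.2.1"))  # the same input always maps to the same token

Because the same address always maps to the same token, traffic patterns can still be studied without exposing who generated them.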

MIT’s SuperCloud neural network collected the global internet traffic data used in their mammoth data-modeling analysis project.

Some 10,000 processors, within computers located at the MIT Lincoln Laboratory Supercomputing Center and its other locations, handled the data.

Using the MIT supercomputing system and advanced software, researchers modeled how web traffic network connections affect each other throughout the world.

The information becomes a measurement tool for internet research, growth, and security.

Benefits include better distribution of computing resources to ensure optimum data flow and avoidance of internet traffic blockages.

This data is also used in cybersecurity, revealing IP addresses used for perverse or illicit purposes, spam websites, originators of distributed denial-of-service (DDoS) attacks, and other cyber threats.

The data MIT researchers collected revealed sporadic, isolated links within the internet that appear to affect web traffic at some core computing nodes.

MIT’s research is similar to measuring and learning about the cosmic microwave background of the universe and the phenomena within it.

“We built an accurate model for measuring the background of the virtual universe of the internet,” stated Jeremy Kepner, a researcher at the MIT Lincoln Laboratory Supercomputing Center.

To sort all the collected data into meaningful analytical models, MIT used software called Dynamic Distributed Dimensional Data Model (D4M).

D4M uses averaging techniques to efficiently compute and sort the enormous amalgam of internet “hypersparse data” and turn it into meaningful graphs.

Capturing the full magnitude of those data graphs is nearly impossible using traditional computer traffic study models.

“You can’t touch that data without access to a supercomputer,” Kepner said.

Researchers decoded and distributed 50 billion digital packets of global internet data among MIT’s SuperCloud computing processors.

The processors compressed this data and produced graph models containing rows and columns of interactions between originating internet sources and the destinations to which the data traveled.
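To picture what such a graph model looks like, here is a small Python sketch of my own (a toy simplification, not MIT's D4M software) that tallies packets into a table whose rows are originating sources and whose columns are destinations:

from collections import Counter

# Hypothetical packet records, reduced to (source, destination) pairs.
packets = [
    ("host-a", "host-b"),
    ("host-a", "host-b"),
    ("host-c", "host-d"),
]

# Count the traffic for each source/destination pair; most possible pairs
# never occur at all, which is why the full matrix is called "hypersparse."
traffic = Counter(packets)

for (source, destination), count in traffic.items():
    print(f"{source} -> {destination}: {count} packets")

At MIT's scale, of course, those counts are spread across thousands of processors rather than one small script.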

MIT researchers are collaborating with the scientific community to find the next application for their SuperCloud modeling capabilities.

Come to think of it, I do have something of an internet cloud with this weblog.