Google TPUs open up on cloud; LinkedIn intros Hadoop Dynamometer

Jack Vaughan

In big data news, we find Google TPUs, or Tensor Processing Units, offered as a cloud service, while LinkedIn is open sourcing a Hadoop test simulator called Dynamometer.

Big data keeps getting bigger. More importantly, its sheer scale is setting the tone for innovation.

This trend is seen in two new releases: Dynamometer, a large-scale Hadoop test system from LinkedIn, and Google TPUs, or Tensor Processing Units, available as a service on the Google Cloud Platform. Both are discussed in this edition of the Talking Data podcast.

Machine learning and deep learning are both generally seen as means to turbocharge predictive analytics. Google, with its massive data centers and eager army of technologists, has been at the forefront of the technology, with TensorFlow as a particular showcase. TensorFlow is an open source framework the search giant has fashioned for the highly iterative task of neural processing on massive data sets.

Google TPUs represent a specialized hardware approach to such neural processing. The hardware is proprietary to Google and, as is discussed in the podcast, is largely out of reach for typical IT shops. In February, the company announced that Google TPUs would be available in beta on the Google Cloud Platform. The TPUs reside four to a board, and boards can be connected as pods via an ultra-fast, dedicated network.

Also discussed in the podcast is Dynamometer, which LinkedIn open sourced this month in an effort to improve testing of Hadoop Distributed File System (HDFS) clusters. Such testing has become a challenge as cluster sizes have climbed into the thousands of nodes.

Appearing in this edition of Talking Data is Mike Matchett, analyst and founder of the Small World Big Data consultancy. According to Matchett, high-performance Hadoop testing is difficult if teams must gather data and configure setups that match the production implementation node for node. The LinkedIn approach, he indicated, takes a novel tack, pairing physical HDFS NameNodes with simulated DataNodes.

To learn more, listen to the podcast.