HDFS Get Started

Bring yourself up to speed with our introductory content

  • Hadoop Distributed File System (HDFS)

    The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications.

  • Hadoop Distributed File System options for big data

    Because big data can scale to petabytes of capacity, organizations are looking to manage it in ways that are easier and less expensive than traditional scale-out NAS. Object storage and software-defined storage are frequently mentioned as big data tools. Both can add intelligence required for analyzing data and take advantage of low-cost disk storage.

    An object storage system handles files differently than a traditional file system. Servers use unique identifiers to find objects, which use metadata in a far more detailed way than file systems do. The unique identifiers mean objects can be geographically dispersed because they can be retrieved without the storage system knowing their physical location. That makes objects a good choice for large data stores or data stored in a cloud.

    Software-defined storage has many forms and use cases, but it applies to big data when used to pool and manage data across off-the-shelf commodity hardware. Because the management and analytics happen in software appliances, the hardware can be cheap, deep disk without bells and whistles.

    Perhaps the best-known option is the Apache Hadoop Distributed File System (HDFS), a Java-based file system designed for use in Hadoop clusters. HDFS currently scales to 200 petabytes and can support single Hadoop clusters of 4,000 nodes. It combines performance, large scale and low cost, which is atypical of most enterprise arrays: they cannot deliver all three simultaneously.

    In this chapter of "Tools to Tackle Big Data Troubles," we look at some core HDFS features, three HDFS commercial distributions and other Hadoop storage-related tools and their related applications.

  • Storage for big data and IoT is no small detail

    Capturing and capitalizing on vast amounts of data about customers and products can help a business adapt and even thrive. But implementing big data or IoT means creating or adjusting IT resources to handle the burden. With these emerging IT workload types, storage takes on a critically important role. But can a single storage system do the job? A business will need to determine the types of data its IoT and big data projects will collect. Gathering many tiny data files that arrive simultaneously, for instance, will not require the same type of storage that collecting fewer, larger files will. Object storage for big data and IoT may be the answer in certain situations. Other conditions might call for a network file system, Fibre Channel or other types of resources. Making the right decisions when it comes to storage will be an important factor in determining whether an IoT or big data initiative succeeds or fails.

Problem Solve HDFS Issues

We’ve gathered up expert advice and tips from professionals like you so that the answers you need are always available.

  • Hadoop data analysis: Common concerns with the HDFS platform

    Using HDFS technology as a data analysis platform may be insufficient for your storage needs. Explore the challenges storage administrators may encounter and find out how to address them.

  • IoT applications make advances, but hurdles lie ahead

    Data is an upside and a downside to the Internet of Things. Many companies are eager to make IoT products or add IoT capabilities to their devices, and some don't go beyond that. But taking IoT from cool toy to useful tool means doing something with all the data IoT applications produce.

    In the cover story of this issue of Business Information, executive editor Craig Stedman shares stories from companies that are implementing IoT applications and capturing the data they create. Businesses that have made the decision to invest in the IoT describe the changes they made to their organizational structure and technology infrastructure to be ready for the onslaught of data from connected devices. For example, one company using IoT-enabled equipment, Rockwell Automation Inc., now uses two databases to store all the incoming information.

    Manufacturing companies such as Rockwell had a bit of a jump on the IoT. In another feature, executive editor David Essex writes about how sensors laid the foundation for IoT applications. But that doesn't mean adopting full-blown IoT is easy for manufacturers. "It can be hard to get wireless connectivity into manufacturing facilities that are laden with concrete walls and heavy iron pipes and machinery," writes Essex. One thing is certain: IoT capabilities are going to be an investment for any company, and it's one that more and more are willing to make.

    Also in this issue, Essex talks with Phil Crannage, core systems director at British Gas, about a project he's leading to move the U.K. energy provider to smart meters by 2020. Our look at an emerging technology or term -- What's the Buzz? -- tackles the hype and reality of data storytelling. And Stedman returns with a column on the diminishment of MapReduce.

  • Planning, skills needed to navigate Hadoop data lakes

    In the business intelligence and analytics world, data lakes are their own region -- one in which today's multifarious forms of information can be stored in their native forms until used -- and cheaply at that. But these vast storage repositories, which are based on open source Apache Hadoop, are not for those seeking rest and recreation. They take serious work -- and often sought-after skills -- to build and maintain.

    In this guide, SearchDataManagement peers across several types of data lakes to discover how different organizations today are implementing them. First, editor Craig Stedman talks to three companies that have taken the dive -- and learns about the challenges and benefits presented to each. Next, reporter Jack Vaughan tells one executive's story -- how a Hadoop-based system opened new doors for his company. Finally, Vaughan quizzes Forrester analyst Mike Gualtieri on whether data lakes can compete with data warehouses to quench organizations' thirst for storing and analyzing business data.