Browse Definitions :
Definition

SequenceFile

A SequenceFile is a flat, binary file type that serves as a container for data to be used in Apache Hadoop distributed computing projects. SequenceFiles are used extensively with MapReduce.

Since Hadoop functions best with larger files, SequenceFiles are used to store and compress files that are smaller than the optimum size for operating efficiently with Hadoop, which can help reduce required disk space capacity and I/O  requirements.

SequenceFiles serve as a container for a sequence of files. Keys are listed for reference and values, and the contents of the file are referenced in a given key. SequenceFiles support a Writer, a Reader and a Sorter class for respective functions in relation to keys. As an example, a SequenceFile might contain a massive number of log files for a server where the key would be a timestamp and the value would be the entire log file. Normally, the small text files would be very inefficient in Hadoop. After packaging into SequenceFiles, however, they can be used effectively.

Beyond packaging files into a manageable size for Hadoop, SequenceFiles support compression of the keys, the values or both.  When both are compressed, the file keys and values are collected into blocks and separately compressed. The type of compression chosen determines the file format.

This was last updated in December 2017

Continue Reading About SequenceFile

Networking
  • Network as a Service (NaaS)

    Network as a service, or NaaS, is a business model for delivering enterprise WAN services virtually on a subscription basis.

  • network configuration management (NCM)

    Network configuration management is the process of organizing and maintaining information about all of the components in a ...

  • presentation layer

    The presentation layer resides at Layer 6 of the Open Systems Interconnection (OSI) communications model and ensures that ...

Security
  • crypto wallet (cryptocurrency wallet)

    A crypto wallet (cryptocurrency wallet) is software or hardware that enables users to store and use cryptocurrency.

  • zero-day (computer)

    A zero-day is a security flaw in software, hardware or firmware that is unknown to the party or parties responsible for patching ...

  • backdoor (computing)

    A backdoor attack is a means to access a computer system or encrypted data that bypasses the system's customary security ...

CIO
HRSoftware
  • team collaboration

    Team collaboration is a communication and project management approach that emphasizes teamwork, innovative thinking and equal ...

  • employee self-service (ESS)

    Employee self-service (ESS) is a widely used human resources technology that enables employees to perform many job-related ...

  • learning experience platform (LXP)

    A learning experience platform (LXP) is an AI-driven peer learning experience platform delivered using software as a service (...

Customer Experience
  • marketing stack

    A marketing stack, also called a marketing technology stack, is a collection of technologies used by marketers to perform, ...

  • social media influence

    Social media influence is a marketing term that describes an individual's ability to affect other people's thinking in a social ...

  • headless commerce (headless e-commerce)

    Headless commerce, also called headless e-commerce, is a platform architecture that decouples the front end of an e-commerce ...

Close