Apache Kafka

Apache Kafka is a distributed publish-subscribe messaging system that receives data from disparate source systems and makes the data available to target systems in real time. Kafka is written in Scala and Java and is often associated with real-time event stream processing for big data.

Like other message broker systems, Kafka facilitates asynchronous data exchange between processes, applications and servers. Unlike other messaging systems, however, Kafka has very low overhead because it neither tracks consumer behavior nor deletes messages once they have been read. Instead, Kafka retains all messages for a set amount of time and makes each consumer responsible for tracking which messages it has read.
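The retention model described above can be illustrated with a small in-memory sketch. This is a toy model, not Kafka's API: the names `RetainedLog`, `Consumer`, `retention_seconds` and `position` are hypothetical and chosen only for illustration. It shows a log that expires messages by age alone, while each consumer keeps its own read position.

```python
class RetainedLog:
    """Toy append-only log that expires records by age (not Kafka's API)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._records = []      # list of (offset, timestamp, value)
        self._next_offset = 0

    def append(self, value, now):
        """Store a value and return the offset assigned to it."""
        offset = self._next_offset
        self._records.append((offset, now, value))
        self._next_offset += 1
        return offset

    def expire(self, now):
        """Drop records older than the retention window, read or not."""
        cutoff = now - self.retention_seconds
        self._records = [r for r in self._records if r[1] >= cutoff]

    def read_from(self, offset):
        """Return (offset, value) pairs at or after the given offset."""
        return [(o, v) for (o, _, v) in self._records if o >= offset]


class Consumer:
    """Tracks its own position; the log knows nothing about it."""

    def __init__(self, log):
        self.log = log
        self.position = 0       # next offset this consumer wants

    def poll(self):
        batch = self.log.read_from(self.position)
        if batch:
            self.position = batch[-1][0] + 1
        return [v for _, v in batch]


log = RetainedLog(retention_seconds=60)
log.append("a", now=0)
log.append("b", now=10)

fast, slow = Consumer(log), Consumer(log)
first_batch = fast.poll()       # fast reads "a" and "b"; slow has read nothing

log.append("c", now=70)
log.expire(now=70)              # "a" is now older than 60 seconds and is dropped
```

In this sketch the log never records who has read what, so adding a consumer costs the log nothing. The trade-off is that a consumer that falls behind the retention window, like `slow` here, simply misses the expired message.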

Kafka software runs on one or more servers, and each node in a Kafka cluster is called a broker. Kafka uses Apache ZooKeeper to manage the cluster, while the brokers' job is to help producer applications write data to topics and consumer applications read from those topics. Topics are divided into partitions to make them more manageable, and Kafka guarantees strong ordering within each partition (though not across the partitions of a topic). Because messages are written into a partition in a particular order and are read in the same order, each partition essentially becomes a commit log that can function as a single source of truth (SSoT) for a distributed system’s events.
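The relationship between topics, partitions and ordering can be sketched in a few lines. This is a simplified in-memory model, not Kafka's client API: the `Topic` class and its methods are hypothetical, and the CRC32 hash is an assumption made for illustration (Kafka's own clients choose partitions differently). It shows why messages that share a key keep their relative order: they always land in the same append-only partition.

```python
import zlib


class Topic:
    """Toy topic split into append-only partitions (not Kafka's API)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: a stable hash of the key picks the partition.
        # CRC32 is used here only because it is deterministic across runs.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def send(self, key, value):
        """Append to the key's partition; return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition, offset=0):
        """Read a partition in write order, starting at the given offset."""
        return self.partitions[partition][offset:]


topic = Topic("events", num_partitions=3)
for i in range(5):
    topic.send("user-1", f"u1-{i}")
    topic.send("user-2", f"u2-{i}")

# All "user-1" events sit in one partition, in the order they were sent.
p = topic.partition_for("user-1")
user1_events = [v for k, v in topic.read(p) if k == "user-1"]
```

Reading a single partition from offset 0 replays its events exactly as they were written, which is the commit-log property described above. Nothing in this model orders events relative to each other when they sit in different partitions.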

Kafka’s code base, which was originally developed at LinkedIn to provide a mechanism for parallel load in Hadoop systems, became an open source project under the Apache Software Foundation in 2011. In 2014, the developers at LinkedIn who created Kafka started a company called Confluent to facilitate Kafka deployments and support enterprise-level Kafka-as-a-service products. Version 5.0 of the Confluent Platform, which was commercially released in 2018, improves the handling of application client failover for disaster recovery (DR) and reduces reliance on the Java programming language for data streaming analytics applications.

This was last updated in March 2019
