streaming data architecture

A streaming data architecture is an information technology framework that focuses on processing data in motion and treats extract-transform-load (ETL) batch processing as just one more event in a continuous stream of events. This type of architecture has three basic components: an aggregator that gathers event streams and batch files from a variety of data sources, a broker that makes data available for consumption, and an analytics engine that analyzes the data, correlates values and blends streams together.
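As a rough illustration of how the three components fit together, here is a minimal in-memory sketch in Python. The `Broker` class, the `aggregate` and `analyze` functions, the `orders` topic and the event fields are all hypothetical names chosen for this example; a production system would use a real message broker and stream processor instead.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker (hypothetical): holds events per topic until consumed."""

    def __init__(self):
        self._topics = defaultdict(deque)

    def publish(self, topic, event):
        self._topics[topic].append(event)

    def consume(self, topic):
        # Drain the topic in arrival order.
        queue = self._topics[topic]
        while queue:
            yield queue.popleft()


def aggregate(broker, topic, live_events, batch_file_rows):
    """Aggregator: rows from a batch file are published like any other event."""
    for event in list(live_events) + list(batch_file_rows):
        broker.publish(topic, event)


def analyze(broker, topic):
    """Analytics engine: blends all sources into one total per source."""
    totals = defaultdict(int)
    for event in broker.consume(topic):
        totals[event["source"]] += event["amount"]
    return dict(totals)


broker = Broker()
aggregate(
    broker,
    "orders",
    live_events=[{"source": "web", "amount": 5}, {"source": "app", "amount": 3}],
    batch_file_rows=[{"source": "web", "amount": 10}],
)
totals = analyze(broker, "orders")
print(totals)
```

The point of the sketch is only that the batch rows and the live events travel through the same path; the analytics step does not need to know which was which.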

The system that receives and sends data streams and executes the application and real-time analytics logic is called the stream processor. Because a streaming data architecture supports the concept of event sourcing, it reduces the need for developers to create and maintain shared databases. Instead, all changes to an application’s state are stored as a sequence of event stream processing (ESP) triggers that can be replayed to reconstruct the state or queried when necessary. Upon receiving an event, the stream processor reacts in real time or near real time and triggers an action, such as remembering the event for future reference.
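Event sourcing can be sketched in a few lines of Python. The example below assumes a hypothetical account balance as the application state, with made-up event kinds `deposited` and `withdrawn`; the state is never stored directly and is instead rebuilt by replaying the event log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical event kinds: "deposited" or "withdrawn"
    amount: int


def apply(balance, event):
    """Return the new state after a single event; unknown kinds are ignored."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    return balance


def replay(events, initial=0):
    """Reconstruct state by folding over the event log in order."""
    balance = initial
    for event in events:
        balance = apply(balance, event)
    return balance


# The append-only log is the source of truth.
log = [
    Event("deposited", 100),
    Event("withdrawn", 30),
    Event("deposited", 5),
]

current = replay(log)        # state after all events
earlier = replay(log[:2])    # state as of the second event
print(current, earlier)
```

Because the log is append-only, replaying a prefix of it yields the state at any earlier point, which is what makes the history queryable.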

The growing popularity of streaming data architectures reflects a shift in the development of services and products from a monolithic architecture to a decentralized one built with microservices. This type of architecture is usually more flexible and scalable than a classic database-centric application architecture because it co-locates data processing with storage to lower application response times (latency) and improve throughput. Another advantage of a streaming data architecture is that it takes the time at which an event occurs into account, which makes it easier for an application’s state and processing to be partitioned and distributed across many instances.
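To make the role of event time and partitioning more concrete, the following Python sketch groups events into fixed one-minute windows based on when each event occurred, and assigns each key to a partition with a stable hash. The window size, the partition count, and the names `partition_for`, `window_start` and `count_by_window` are assumptions made for illustration, not part of any particular platform.

```python
import zlib
from collections import defaultdict

WINDOW_SECONDS = 60   # assumed window size
NUM_PARTITIONS = 4    # assumed number of processing instances


def partition_for(key):
    """Stable hash so every event for a key lands on the same instance."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def window_start(event_time):
    """Start of the fixed window that contains this event timestamp."""
    return event_time - (event_time % WINDOW_SECONDS)


def count_by_window(events):
    """Count events per (key, window), using event time rather than arrival order."""
    counts = defaultdict(int)
    for key, event_time in events:
        counts[(key, window_start(event_time))] += 1
    return dict(counts)


# (key, event time in seconds); note the third event arrives out of order.
events = [("user-a", 125), ("user-b", 61), ("user-a", 59), ("user-a", 130)]
counts = count_by_window(events)
print(counts)
```

The late event with timestamp 59 is still counted in the window starting at 0, because the window is chosen from the event's own timestamp. Since each key maps to one partition, each instance can keep the counts for its keys locally.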

Streaming data architectures let developers build applications that use both bounded and unbounded data in new ways. For example, Alibaba’s search infrastructure team uses a streaming data architecture powered by Apache Flink to update product detail and inventory information in real time. Netflix also uses Flink to support its recommendation engines, and ING, the global bank based in the Netherlands, uses the architecture to prevent identity theft and provide better fraud protection. Other platforms that can accommodate both stream and batch processing include Apache Spark, Apache Storm, Google Cloud Dataflow and Amazon Kinesis.

This was last updated in October 2018
