Date: 2022-12-21

Dish Network's advertising placement subsidiary has found that when it comes to observability, less is more.

Dish Media, based in New York, replaced multiple IT monitoring and security monitoring tools, including the open source version of Elasticsearch, with just one service over the last two years: the SaaS edition of Elastic Inc.'s commercial version of Elastic Stack. Dish Media now funnels all its operational, security and business systems data into Elastic Cloud, yielding faster incident response, proactive anomaly detection and quicker answers to business queries than it could achieve with multiple analytics tools and pools of data.

The Elastic Stack consists of Elasticsearch for storage and indexing of data; Logstash for data collection and processing; Kibana for data visualization via dashboards; and Beats for data collection and transformation. The Elastic Cloud managed service now handles upgrades, patches and day-to-day maintenance work on Dish Media's Elasticsearch clusters.

Those clusters contain billions of records generated by Dish Media's targeted advertising systems, both on-premises and in the AWS public cloud, which collect data on Dish Network's 7.5 million subscribers. Dish Media collects, on average, about 10 billion records per day into this system, generated by 25 million device endpoints. The company's systems also scrub and mask personally identifiable information within these data sets before they analyze subscriber preferences and use them to place ads.
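As a rough illustration of what scrubbing and masking can look like before analysis, here is a minimal Python sketch. It is not Dish Media's implementation: the field names, the salt handling and the choice of a truncated SHA-256 digest are all assumptions made for the example.

```python
import hashlib
import re

# Hypothetical field names; the article does not describe Dish Media's schema.
PII_FIELDS = {"subscriber_name", "email", "street_address"}

# Simple pattern for email addresses embedded in free-text fields.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a salted SHA-256 digest (truncated to 16 hex chars).

    The same input and salt always give the same token, so records can
    still be grouped per subscriber without exposing the raw value.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def mask_record(record: dict, salt: str) -> dict:
    """Return a copy of the record with PII fields masked.

    Known PII fields are pseudonymized; email addresses found in any
    other string field are redacted. Non-string values pass through.
    """
    masked = {}
    for key, value in record.items():
        if key in PII_FIELDS and isinstance(value, str):
            masked[key] = pseudonymize(value, salt)
        elif isinstance(value, str):
            masked[key] = EMAIL_PATTERN.sub("[REDACTED]", value)
        else:
            masked[key] = value
    return masked
```

Pseudonymizing rather than deleting the fields is a design choice for this sketch: it keeps per-subscriber preferences analyzable while the raw identifiers stay out of the analytics store.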

"That way, you're not watching diaper commercials when you don't have any kids in the house," said John Haskell, head of engineering at Dish Media, in a presentation at AWS re:Invent 2022. "There's tons of data that comes back, and we will then have to … correlate it and re-analyze it all over again for the next campaign."

Combining a consolidated pool of observability data for multiple IT disciplines with a managed service reduced toil for the company's DevOps engineers, Haskell said in an interview this month.

"Now I can have developers focusing on feature sets and new data indexes and looking again at the innovation side, versus just maintaining an Elastic cluster," Haskell said. "It's allowed us to do flexible reporting and analytics, which means that we're able to now get answers [for business managers] really fast."

In the past, business managers might have to wait as long as two weeks for an answer if their questions required searching the company's back-end SQL databases, as well as an Amazon Redshift data warehouse, application performance monitoring tools and other repositories. During that time, the company's engineers would locate the relevant data sets, correlate data between them, wait for queries to finish running on the back end, and possibly do custom dashboard development work with tools such as open source Grafana to create a report.

Now, the entire body of corporate data can be queried in Elastic Cloud in seconds, Haskell said.

"Sometimes we can get answers within the same call [with a business manager]," he said. "And we can change the query instantly [to] apply additional filters … and look at it from many different perspectives."
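To make the idea of adding filters on the fly concrete, the following Python sketch builds an Elasticsearch Query DSL body as a plain dictionary and appends clauses to its `bool.filter` list. The field names (`@timestamp`, `region`) and values are hypothetical, and the helper `add_filter` is invented for this example; the article does not show Dish Media's actual queries.

```python
import copy


def add_filter(query: dict, clause: dict) -> dict:
    """Return a new query body with one more clause in bool.filter.

    Assumes the query is either empty or already a bool query.
    The original query is left unchanged so earlier views stay reusable.
    """
    updated = copy.deepcopy(query)
    bool_part = updated.setdefault("query", {}).setdefault("bool", {})
    bool_part.setdefault("filter", []).append(clause)
    return updated


# Start with the last seven days of data (hypothetical timestamp field).
base = {
    "query": {
        "bool": {
            "filter": [{"range": {"@timestamp": {"gte": "now-7d/d"}}}]
        }
    }
}

# Narrow the same question to one region (hypothetical field and value).
by_region = add_filter(base, {"term": {"region": "northeast"}})
```

Each call returns a separate query body, so an analyst could keep several filtered views of the same question side by side during a call.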

From reactive to proactive incident response

As at other enterprise IT organizations undertaking digital transformation projects, sprawling distributed systems and explosive data growth have required Dish Media to make major changes to traditional practices to keep up. Ingesting data into the Elastic Cloud back end required Dish Media to set up data transformation pipelines triggered using the Amazon MQ message queuing service, along with custom Python scripts.

The reward for Dish Media's drastic consolidation onto Elastic Cloud, where it can apply Elastic's AI-based anomaly detection features to a wide variety of operational and business data, has been proactive incident response for both IT performance issues and security vulnerability management, Haskell said.

"Let's say a customer has a failing hard drive on a set-top box," he said. "Normally, hard drives take a couple of weeks to actually fail out, but we can see it almost instantly now and be proactive, call up the customer and get those boxes replaced before they even know they have an issue."

Because Dish Media's data is also analyzed within Elastic's SIEM, the platform provides similar proactive threat detection for common vulnerabilities and exposures (CVEs) in its network.

"We can now stay ahead of CVEs in real time," Haskell said. "When a new CVE hits the threat bulletins, we will instantly be notified that it's there and if there's a patch available."

For now, Dish Media's IT team responds to these alerts with a manually managed patching process, but the company is considering using Elastic's auto-remediation AIOps feature for this purpose, he said.
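For readers curious about the ingestion step mentioned earlier in this section, here is a minimal Python sketch of one transformation a custom script might perform: turning raw JSON messages into a newline-delimited body in the format the Elasticsearch bulk API accepts. This is an assumption about the general shape of such a script, not Dish Media's code. The function name `to_bulk_body` and the index name are hypothetical, and reading messages from Amazon MQ and sending the request are deliberately left out.

```python
import json


def to_bulk_body(messages, index_name):
    """Convert raw JSON message strings into an NDJSON bulk request body.

    Each valid message becomes two lines: an action line naming the
    target index, then the document itself. Malformed messages and
    non-object payloads are skipped in this sketch; a production
    pipeline would more likely route them to a dead-letter queue.
    """
    lines = []
    for raw in messages:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if not isinstance(doc, dict):
            continue
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical messages as they might arrive from a queue.
body = to_bulk_body(['{"device_id": 7, "status": "ok"}', "not json"], "device-events")
```

Batching documents this way reduces the number of requests needed at high record volumes, which is the usual reason to prefer bulk indexing over one request per document.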