Dish Media swaps observability sprawl for Elastic Stack SaaS

With fewer tools and data repositories to wrangle and Elastic cluster management outsourced to SaaS, Dish Media's ops teams reduced toil and achieved proactive incident response.

Dish Network's advertising placement subsidiary has found that when it comes to observability, less is more.

Dish Media, based in New York, replaced multiple IT monitoring and security monitoring tools -- including the open source version of Elasticsearch -- with just one service over the last two years: the SaaS edition of Elastic Inc.'s commercial version of Elastic Stack. Dish now funnels all its operational, security and business systems data into Elastic Cloud, yielding faster incident response, proactive anomaly detection and quicker responses to business queries than it got from the multiple analytics tools and pools of data it used before.

The Elastic Stack consists of Elasticsearch for storage and indexing of data; Logstash for data collection and processing; Kibana for data visualization via dashboards; and Beats for data collection and transformation. The Elastic Cloud managed service now handles upgrades, patches and day-to-day maintenance work on Dish Media's Elasticsearch clusters.

Those clusters contain billions of records generated by Dish Media's targeted advertising systems, both on-premises and in the AWS public cloud, which collect data on Dish Network's 7.5 million subscribers. Dish Media collects, on average, about 10 billion records per day into this system, generated by 25 million device endpoints. The company's systems also scrub and mask personally identifiable information within these data sets before analyzing subscriber preferences and using them to place ads.
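As a rough illustration of what scrubbing and masking can look like before records are analyzed, here is a minimal Python sketch. The article does not describe Dish Media's schema or masking rules, so the field names (`subscriber_name`, `email`, `street_address`, `subscriber_id`), the salt and the helper functions are all hypothetical.

```python
import hashlib
import re

# Hypothetical field names; the article does not describe the real schema.
PII_FIELDS = {"subscriber_name", "email", "street_address"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value: str, salt: str) -> str:
    """Replace an identifier with a truncated, salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def scrub_record(record: dict, salt: str = "example-salt") -> dict:
    """Return a copy of the record with PII dropped or masked."""
    clean = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            continue  # drop direct identifiers entirely
        if key == "subscriber_id":
            # keep records joinable without exposing the raw ID
            clean[key] = pseudonymize(str(value), salt)
        elif isinstance(value, str):
            # mask email addresses that leak into free-text fields
            clean[key] = EMAIL_RE.sub("[redacted-email]", value)
        else:
            clean[key] = value
    return clean


record = {
    "subscriber_id": 42,
    "email": "a@b.com",
    "note": "contact jane@example.com",
    "channel": 7,
}
scrubbed = scrub_record(record)
```

In this sketch, direct identifiers are removed, the subscriber ID is replaced with a stable pseudonym so viewing records can still be correlated, and free-text fields are swept for stray email addresses.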

"That way, you're not watching diaper commercials when you don't have any kids in the house," said John Haskell, head of engineering at Dish Media, in a presentation at AWS re:Invent 2022. "There's tons of data that comes back, and we will then have to … correlate it and re-analyze it all over again for the next campaign."

The combination of a consolidated pool of observability data for multiple IT disciplines and a managed service has reduced toil for the company's DevOps engineers, Haskell said in an interview this month.

"Now I can have developers focusing on feature sets and new data indexes and looking again at the innovation side, versus just maintaining an Elastic cluster," Haskell said. "It's allowed us to do flexible reporting and analytics, which means that we're able to now get answers [for business managers] really fast."

In the past, business managers might have to wait up to two weeks for an answer if their questions required searching the company's back-end SQL databases as well as an Amazon Redshift data warehouse, application performance monitoring tools and other repositories. During that time, the company's engineers located the relevant data sets, correlated data between them, waited for queries to finish running on the back end, and possibly did custom dashboard development work with tools such as open source Grafana to create a report.

Now the entire body of corporate data can be queried in Elastic Cloud in seconds, Haskell said.

"Sometimes we can get answers within the same call [with a business manager]," he said. "And we can change the query instantly [to] apply additional filters … and look at it from many different perspectives."
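The kind of on-the-fly narrowing Haskell describes maps naturally onto Elasticsearch's Query DSL, where filter clauses can be appended to a `bool` query. The Python sketch below only builds the request bodies; the field names (`campaign_id`, `region`, `@timestamp`) and values are hypothetical and do not come from Dish Media.

```python
import copy
import json


def base_query(campaign_id: str) -> dict:
    """A bool query with a single term filter (Elasticsearch Query DSL)."""
    return {"query": {"bool": {"filter": [{"term": {"campaign_id": campaign_id}}]}}}


def add_filter(query: dict, clause: dict) -> dict:
    """Return a new query with one more filter clause; the input is unchanged."""
    narrowed = copy.deepcopy(query)
    narrowed["query"]["bool"]["filter"].append(clause)
    return narrowed


# Start broad, then narrow the same question from different angles.
q1 = base_query("spring-promo")
q2 = add_filter(q1, {"term": {"region": "northeast"}})
q3 = add_filter(q2, {"range": {"@timestamp": {"gte": "now-7d/d"}}})

print(json.dumps(q3, indent=2))
```

Each call returns a new request body, so an analyst could keep the broader query around and compare results as filters are added or swapped.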

From reactive to proactive incident response

Like other enterprise IT pros undertaking digital transformation projects, Dish Media's engineers have had to make major changes to traditional practices to keep up with sprawling distributed systems and explosive data growth. Ingesting data into the Elastic Cloud back end required Dish Media to set up data transformation pipelines triggered using the Amazon MQ message queuing service along with custom Python scripts.
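Dish Media's scripts are not public, but the general shape of such a transformation step can be sketched in Python: parse a queued message, normalize it, and batch the results into the newline-delimited format that Elasticsearch's `_bulk` API accepts. The message layout (`ts`, `device`, `type`), the `ad-events` index name and both helper functions are assumptions made for illustration; consuming from the queue and sending the request are left out.

```python
import json
from datetime import datetime, timezone


def transform(raw: bytes) -> dict:
    """Parse one queued message and normalize it into an indexable document."""
    event = json.loads(raw)
    return {
        "@timestamp": datetime.fromtimestamp(event["ts"], tz=timezone.utc).isoformat(),
        "device_id": str(event["device"]),
        "event_type": event["type"].strip().lower(),
    }


def to_bulk_body(docs: list, index: str) -> str:
    """Build a newline-delimited body: one action line, then one document line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical message payload, as it might arrive from a queue.
messages = [b'{"ts": 1670000000, "device": 101, "type": " AD_VIEW "}']
body = to_bulk_body([transform(m) for m in messages], index="ad-events")
print(body)
```

Batching documents this way keeps the number of requests low, which matters at volumes in the billions of records per day.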

The reward for Dish Media's drastic consolidation onto Elastic Cloud -- where it can apply Elastic's AI-based anomaly detection features to a wide variety of operational and business data -- has been proactive incident response for both IT performance issues and security vulnerability management, Haskell said.

"Let's say a customer has a failing hard drive on a set-top box," he said. "Normally hard drives take a couple of weeks to actually fail out. But we can see it almost instantly now and be proactive, call up the customer and get those boxes replaced before they even know they have an issue."
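To give a sense of how a failing drive might stand out in device telemetry, the Python sketch below flags values that deviate sharply from a trailing window. This is a simple z-score check, not Elastic's machine learning anomaly detection, and the read-error counts, window size and threshold are invented for the example.

```python
from statistics import mean, stdev


def find_anomalies(values: list, window: int = 5, threshold: float = 3.0) -> list:
    """Return indexes whose value deviates from the trailing window's mean
    by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily read-error counts reported by one set-top box.
read_errors = [2, 3, 2, 4, 3, 2, 3, 40, 55, 70]
anomalies = find_anomalies(read_errors)
print(anomalies)
```

A check this naive catches the first spike but then treats the elevated values as the new normal, because the spike inflates the window's spread. Production anomaly detection models the baseline more carefully.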

Because Dish Media's data is also analyzed within Elastic's SIEM, the platform provides similar proactive threat detection for common vulnerabilities and exposures (CVEs) in the company's network.

"We can now stay ahead of CVEs in real time," Haskell said. "When a new CVE hits the threat bulletins, we will instantly be notified that it's there and if there's a patch available."

For now, Dish Media's IT team responds to these alerts with a manually managed patching process. But the company is considering using Elastic's auto-remediation AIOps feature for this purpose, he said.

Elastic Cloud roadmap to deepen ties with AWS

Elastic Inc. hasn't always had the coziest relationship with AWS, especially in the years since AWS launched its own managed Elasticsearch service in 2016, prompting a trademark infringement lawsuit from Elastic. Eventually AWS launched its own Open Distro for Elasticsearch. Earlier this year it dropped the Elasticsearch name from the service to resolve the suit.

[Figure: The four core products of the Elastic Stack, which Dish Media adopted as a managed service in place of multiple observability tools.]

In the meantime, Dish Media has become an AWS partner for 5G wireless services and is a heavy consumer of its public cloud and AWS Outposts hybrid cloud products. Haskell said he hopes Elastic Cloud will add integration with the newly introduced Amazon DataZone, a data analytics portal with built-in governance.

"It could help with data classification [in Elastic Cloud] as well," he said. "I want to follow up with Elastic and AWS and see if there will be some integration between their two services."

Elastic does plan to evaluate such an integration, said Sajai Krishnan, general manager of observability at the vendor, in an interview this month. Krishnan joined Elastic in May after a long tenure at VMware, under Elastic's new CEO Ashutosh Kulkarni, who took over the company in January. Under this new executive regime, which added a new chief product officer in August, it appears Elastic now looks to start a new chapter in its relationship with AWS.

"DataZone is brand new, so we are going to look at it. I don't yet have a roadmap commitment. But every quarter there is a [review of] Amazon integrations and features that we do," Krishnan said. "Amazon is a major partner."

Further customization support for dashboards within Elastic Cloud's Kibana Lens UI -- another item on Haskell's wishlist -- is in the works, Krishnan said, though he declined to specify a release date. In addition, the company disclosed plans in October to create a stateless version of Elasticsearch to boost its scalability and performance.

"Offloading index storage into an external service will also allow us to re-architect Elasticsearch by separating indexing and search responsibilities," according to a company blog post. "Instead of having primary and replica instances handling both workloads, we intend to have an indexing tier and a search tier. Separating these workloads will allow them to be scaled independently and hardware selection to be more targeted for the respective use cases."

Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached on Twitter @PariseauTT.
