Log management discipline saves ERP company $3M

IT ops pros at a SaaS provider made a set of simple changes to log management over the last two years and reaped substantial cost benefits.

ERP SaaS provider Infor slashed its log management bill from Sumo Logic by 50% last year after its service reliability architect team moved to data tiering and set ingest budgets for app teams.

The log management cost rationalization began two years ago after an engineer at Infor took over as the leader of the cross-functional team, which coaches application teams on best practices such as elasticity, scalability and reliability, similar to a site reliability engineering team. His predecessor had assumed responsibility for managing the Sumo Logic log management system the company had used for 10 years, but the new lead decided to change that approach.

"[Sumo Logic] was introduced … as a free-for-all, anything goes, everybody-has-superhuman-powers [program]. … So there were more and more people doing more and more advanced things and having more and more advanced questions. And [my predecessor] took it upon himself to help," said Iwan Eising, service reliability architect team lead at Infor. "When I took over his role, … he told me, 'Oh, yeah, by the way, you also have to manage Sumo.' And I was like, 'OK, that's not going to happen.'"

Eising's team still maintains the Sumo Logic license, providing advice and support. However, it delegated day-to-day management of the platform to app teams, establishing a new set of 10 standard roles with predefined sets of permissions to access the log management platform. Sumo Logic users also received training on those permissions and why they were added.

"Having those restrictions in place made it easier for people to do their job because they got the training that they needed for their role. [And] it also made it a lot easier [for us] to manage the whole population," Eising said.

Ownership, accountability yield log management savings

Each production DevOps team now has a "Sumo owner" who manages that team's ingest budget, a policy enforced through a Sumo Logic API that limits the number of gigabytes each team can send to the log management back end in a 24-hour timeframe. Eising's team also developed a dashboard using the API that shows each Sumo owner the dollar value of their daily log ingestion.
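The budget-and-dashboard idea described above can be illustrated with a minimal Python sketch. It does not call the Sumo Logic API; the team names, budgets, ingest volumes and the per-gigabyte price are all hypothetical values chosen for illustration, since the article does not give Infor's actual figures.

```python
from dataclasses import dataclass

# Hypothetical price; the article does not state Infor's per-gigabyte rate.
ASSUMED_PRICE_PER_GB_USD = 2.0


@dataclass
class TeamIngest:
    team: str
    daily_budget_gb: float  # cap on gigabytes per 24-hour window
    ingested_gb: float      # volume sent in the current window


def daily_cost_usd(ingested_gb, price_per_gb=ASSUMED_PRICE_PER_GB_USD):
    """Convert a day's ingest volume into a dollar figure."""
    return round(ingested_gb * price_per_gb, 2)


def budget_report(teams):
    """Build one dashboard row per team: cost, budget usage, overage flag."""
    rows = []
    for t in teams:
        rows.append({
            "team": t.team,
            "cost_usd": daily_cost_usd(t.ingested_gb),
            "used_pct": round(100 * t.ingested_gb / t.daily_budget_gb, 1),
            "over_budget": t.ingested_gb > t.daily_budget_gb,
        })
    return rows


# Made-up sample data for two teams.
report = budget_report([
    TeamIngest("billing", daily_budget_gb=50.0, ingested_gb=42.5),
    TeamIngest("inventory", daily_budget_gb=20.0, ingested_gb=26.0),
])
for row in report:
    print(row)
```

In this sketch the "inventory" team is flagged as over budget, which is the kind of signal a Sumo owner could act on the same day.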

By 2023, these changes would yield a 50% savings on Sumo Logic storage costs, which Eising estimated at $3 million.

Iwan Eising, service reliability architect team lead, Infor

"You can have a discussion with an individual who's accountable, and you make things visible [along with] the tools to do something about it," Eising said. "Because we can show, on a daily basis, the effects of making changes, teams and Sumo owners get a good, responsive financial system."

Infor's savings also came from fine-tuning where ingested logs are stored. The company now uses a Sumo data tiering feature, first made generally available in 2020, under which it pays a fee per query per gigabyte for logs accessed less frequently. Frequently accessed data, by contrast, incurs a higher per-gigabyte fee and is kept on a higher-performing continuous tier of storage, where it is used to generate alerts.

"Actionable logs go to the continuous tier because you want to act on them immediately," Eising said. "Anything else, you want to have that information available because it has contextual information when you have an error, but you're never going to look at it actively. … [It] turns out 90% to 95% of all the logs are actually non-actionable."
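The split Eising describes, with actionable logs on the continuous tier and everything else on a cheaper tier, can be sketched as a simple routing rule. This is an illustrative Python example only: the rule for what counts as "actionable" (error-level records or records flagged as feeding an alert), the `alerting` field, the "infrequent" tier label and the sample records are assumptions, not Infor's actual criteria or Sumo Logic configuration.

```python
from collections import Counter

# Assumption: only these severity levels are treated as actionable.
ACTIONABLE_LEVELS = {"ERROR", "CRITICAL"}


def choose_tier(record):
    """Return the storage tier label for one log record.

    A record goes to the continuous tier if its level is actionable or it
    carries the hypothetical 'alerting' flag; otherwise it goes to the
    cheaper tier used for rarely queried context.
    """
    level = record.get("level", "").upper()
    if level in ACTIONABLE_LEVELS or record.get("alerting", False):
        return "continuous"
    return "infrequent"


# Made-up sample records.
sample = [
    {"level": "INFO", "msg": "request served"},
    {"level": "DEBUG", "msg": "cache hit"},
    {"level": "ERROR", "msg": "payment failed"},
    {"level": "INFO", "msg": "queue depth high", "alerting": True},
    {"level": "info", "msg": "user login"},
]

tiers = Counter(choose_tier(r) for r in sample)
print(dict(tiers))
```

A real rule set would depend on which logs each team's alerts actually query, which is why the article's approach puts that decision in the hands of each team's Sumo owner.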

In addition to savings on storage, data tiering has helped teams respond to issues faster by narrowing down what data underpins alerts, filtering out noise, he said.

Log data cleanup sets stage for observability

Next, Infor plans to ingest metrics and tracing data into Sumo as a means of achieving observability, further pinpointing the most critical information about system performance.

"It's a conscious decision to stay with Sumo so that we have our trace information and our log information close together within one product," Eising said.

As for future log management, Eising said he plans to consider machine learning algorithms to sift through huge repositories of data as well as generative AI tools that could make system data more accessible to non-technical users.

Before it was acquired by a private equity firm last year, Sumo Logic strengthened its security information and event management (SIEM) product with acquisitions. Eising said he believes security and observability will eventually merge.

As such systems develop, keeping costs under control will likely require new approaches to log data management, since which logs are actionable will depend on the context, Eising predicted.

"[In traditional SIEM products], you proactively define what information is considered security information," he said. "All of that needs to change. Everything should be considered observable data."

As log management gives way to observability, vendors such as Sumo and AWS are moving to data lakes that accommodate high volumes of data more affordably. In Sumo's case, its security data lake can ingest data from its AWS counterpart for further analysis. But Eising said it would be more efficient for Sumo to access data directly within the Amazon Security Lake rather than requiring it be ingested into Sumo's back end first.

According to a Sumo Logic spokesperson, "Sumo Logic ingests all that log, events, metrics, traces and other data from various components and services across our customer's organization and unifies this telemetry in one place, the Sumo Logic log analytics platform. Our customers are dealing with diverse data everywhere, and we want to make it easy for them to gather insights from their data quickly."

Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached on Twitter @PariseauTT.
