

Anticipate the unexpected with application capacity planning

Disasters and crises that disrupt daily routines are inevitable. Follow these app capacity guidelines to prepare IT teams for such events and maintain business capabilities.

As IT organizations face crises at a time when the internet is almost ubiquitous, they must build on the lessons they learn to handle application capacity planning more easily.

Although it might be too late to deal with the root cause of many app capacity issues in the midst of a disaster, IT teams can still ensure everything is in place to manage the next inevitable crisis -- even while they work from home, using communications and collaboration technology to act as a cohesive unit.

Prepare for remote monitoring

The first thing to account for in application capacity planning for crises is a shortage of engineers with access to data center equipment -- whether that equipment resides in an owned facility, sits in a shared colocation facility or is part of cloud-based infrastructure. Establish monitoring processes for the overall application platform, along with automated remediation capabilities to resolve as many problems as possible without human intervention. Do not rely on the availability of humans close to the equipment, as a crisis could enforce a unilateral work-from-home mandate -- and thus eliminate that availability. Remote monitoring and management tools, such as Kaseya VSA, WhatsUp Gold and ConnectWise Automate, fill this role.
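As a rough illustration of the monitor-then-remediate pattern described above, the following Python sketch tries an automated fix a limited number of times before it escalates to a person. It is not how any of the named products work internally; the `Check` class, the `run_checks` function and the `escalate` callback are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    """One monitored component (all names here are hypothetical)."""
    name: str
    is_healthy: Callable[[], bool]   # probe, e.g. an HTTP health check
    remediate: Callable[[], None]    # automated fix, e.g. a service restart
    max_attempts: int = 2            # assumption: retry twice before paging


def run_checks(checks: List[Check], escalate: Callable[[str], None]) -> List[str]:
    """Run every check once and return the names that needed a human."""
    escalated = []
    for check in checks:
        if check.is_healthy():
            continue
        for _ in range(check.max_attempts):
            check.remediate()
            if check.is_healthy():
                break  # automated fix worked, no human needed
        else:
            # Every attempt failed: hand over to a remote engineer.
            escalate(check.name)
            escalated.append(check.name)
    return escalated
```

In practice, a scheduler would call `run_checks` at a fixed interval, and `escalate` would open a service desk ticket or page whoever is on call.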

At the software level, remote access is already in place for the majority of organizations. However, monitoring is still necessary, and tools such as Dynatrace -- generally combined with an automation system like Puppet or Jenkins to carry out remediation -- ensure IT teams quickly identify issues and resolve them automatically. Service desk tools, such as ServiceNow and Vivantio, offer a range of either direct remediation services or tight integrations with external systems that automate remediation with little to no human input.

SolarWinds and BMC also offer a set of integrated software that provides a broad raft of remote monitoring, remediation and support capabilities.

Anticipate the need for manual effort

Even with automated remediation in place, administrators will still need to log in remotely to manage app capacity issues they cannot automate. Nearly all monitoring and management tools offer such capabilities. However, opening the platform to a much larger number of administrators and other staff working remotely requires advanced security. Do not share admin usernames and passwords. Apply two-factor authentication wherever possible, and enforce full audits of access and actions. Enable automated rollback on full or partial failure, so that recovery doesn't require the administrator to log back in.
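The rollback and audit points can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any specific tool: `apply_with_rollback` and the `(name, apply, undo)` step format are hypothetical, and the sketch assumes that a step that raises an error leaves nothing of its own behind to undo.

```python
from typing import Callable, List, Tuple

# Hypothetical step format: (name, apply function, undo function).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def apply_with_rollback(steps: List[Step], audit: List[str]) -> bool:
    """Apply steps in order; on failure, undo the completed ones in reverse.

    Every action is appended to `audit` so there is a full record of
    what was changed and what was reverted.
    """
    done: List[Tuple[str, Callable[[], None]]] = []
    for name, apply, undo in steps:
        try:
            apply()
            audit.append(f"applied {name}")
            done.append((name, undo))
        except Exception as exc:
            audit.append(f"failed {name}: {exc}")
            # Roll back without waiting for the administrator to return.
            for prev_name, prev_undo in reversed(done):
                prev_undo()
                audit.append(f"rolled back {prev_name}")
            return False
    return True
```

Because the rollback runs inside the same call that made the change, a dropped home connection partway through does not leave the platform half-configured.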

Plan for user access issues

When users work from home, expect their access to systems to vary more than it does in the office -- both at the device and connectivity level. Any organization that does not fully support a plethora of device types, from small-format cellphones and tablets to full-sized PC screens, will be at a disadvantage when attracting and keeping customers. Additionally, resource-heavy sites will lose potential customers whose connectivity is low-bandwidth, carrier-throttled or highly shared. For emergency application capacity planning, ensure that sites are resource-friendly, or have sufficient built-in intelligence to decide which type of front end to present to the user.
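One way to picture that built-in intelligence is a small decision function that picks a front end from what the site knows about the client. The Python sketch below is illustrative only: the `ClientHints` fields, the three front-end names and both thresholds are assumptions made for this example, and how the signals are collected is out of scope.

```python
from dataclasses import dataclass


@dataclass
class ClientHints:
    """Signals about the user's device and connection (hypothetical fields)."""
    viewport_width_px: int
    downlink_mbps: float
    save_data: bool = False   # user has asked for reduced data usage


def choose_front_end(
    hints: ClientHints,
    low_bandwidth_mbps: float = 1.5,   # assumed threshold, tune per site
    small_screen_px: int = 600,        # assumed threshold, tune per site
) -> str:
    """Return 'lite', 'mobile' or 'full' for the given client."""
    # Connectivity comes first: a large screen on a throttled link
    # still needs the lightweight pages.
    if hints.save_data or hints.downlink_mbps < low_bandwidth_mbps:
        return "lite"
    if hints.viewport_width_px < small_screen_px:
        return "mobile"
    return "full"
```

For example, `choose_front_end(ClientHints(1920, 0.8))` returns `"lite"`, because the slow connection outweighs the large screen.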

Applications must be able to manage usage surges. Ensure that resources can be flexibly provided: Virtualize and automate the app to ensure that users do not suddenly experience performance issues due to unavailable resources.
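The arithmetic behind flexible resource provision is simple enough to sketch. The Python function below estimates how many application instances a given request rate needs. The function name, the 70% target utilization and the instance limits are assumptions for illustration; real figures should come from load testing.

```python
import math


def instances_needed(
    requests_per_sec: float,
    per_instance_rps: float,
    target_utilization: float = 0.7,  # assumption: keep 30% headroom
    min_instances: int = 2,           # assumption: never run a single copy
    max_instances: int = 50,          # assumption: budget or quota ceiling
) -> int:
    """Estimate the instance count for a load, clamped to the limits."""
    if per_instance_rps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("per_instance_rps must be > 0 and utilization in (0, 1]")
    # Each instance is only loaded to the target utilization, so its
    # usable capacity is lower than its measured maximum.
    usable_rps = per_instance_rps * target_utilization
    raw = math.ceil(requests_per_sec / usable_rps)
    return max(min_instances, min(max_instances, raw))
```

With these defaults, a surge to 900 requests per second against instances that each handle 100 gives `instances_needed(900, 100) == 13`. When the result hits `max_instances`, demand has outgrown the plan and the ceiling itself needs review.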

Meet business expectations

At a business level, ensure that the user has a good transactional experience. For example, it is a waste of time for a retail application to show items as being in stock when they aren't, or to let a customer fill an online shopping cart only to find that the items can't be delivered for an extended period of time. Set expectations as far upfront as possible -- it is better to lose a customer for a single transaction through honesty than to lose them entirely by hiding the truth until after they've spent a great deal of time.

With remote chat, organizations can have employees work from home and still interact with and guide users through problems. Frequently asked questions (FAQs) pages lower the volume of requests for human assistance and prevent issues with automated messages in both webchat and telephone interactions. Intelligent FAQs and chatbots that use live data such as stock levels, delivery availability and return status make life easier for both the organization and the customer. This extra functionality requires extra resources, though, so account for this in application capacity plans.
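One common way to limit the extra load from live-data lookups is a short-lived cache in front of the back-end system. The Python sketch below is a minimal example of that idea and is not taken from any chatbot product; `TTLCache`, its `fetch` callback and the 30-second lifetime are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache lookups for a short time to cap back-end load (hypothetical)."""

    def __init__(
        self,
        fetch: Callable[[str], Any],         # e.g. a stock-level query
        ttl_seconds: float = 30.0,           # assumption: 30 s of staleness is OK
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                    # fresh enough, skip the back end
        value = self._fetch(key)             # expired or missing: query again
        self._entries[key] = (now, value)
        return value
```

The trade-off is that answers can be out of date by up to the cache lifetime, so a shorter lifetime suits fast-moving data such as stock levels, at the cost of more back-end queries to include in the capacity plan.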
