A mission-critical application is a software program or suite of related programs that must function continuously in order for a business or segment of a business to be successful. If a mission-critical application experiences even brief downtime, the negative consequences are likely to be financial. In addition to lost productivity, a mission-critical app's failure to function may also damage the business's reputation. Examples of mission-critical applications vary from industry to industry. For example, an automatic vehicle locator (AVL) app might be mission-critical for an ambulance company, whereas a plumbing business that uses the same software might characterize it as important but not essential.
When deploying mission-critical software, information technology (IT) administrators must determine exactly what support is necessary to ensure an application's ability to function under sub-optimal circumstances. For example, if a server handles transactional data, it should have multiple redundant power supplies to keep the server running in the event of a power outage. Mission-critical applications generally call for N+1 redundancy at a minimum, with higher levels depending on the company's budget and data center physical infrastructure. Help desk support that is available 24/7 also helps administrators keep mission-critical applications available, as do frequent, automated backups that protect applications from corruption or deletion.
IT administrators often tier their disaster recovery plans to prioritize the restoration of mission-critical applications. They sometimes also choose not to update mission-critical applications as frequently as lower priority applications, in order to reduce the risk of introducing changes that might cause problems. Although many companies today use cloud storage to provide redundancy for their mission-critical apps, the question of whether to actually host mission-critical applications in the cloud is still quite controversial. The choice depends on many variables, including regulatory compliance requirements and trust in the cloud provider's ability to provide security and meet service level agreements (SLAs).
Characteristics of a mission-critical application
An application is mission-critical when it is essential to the organization's operation. Mission-critical applications should not experience any downtime during periods when end users are likely to use them.
There are many possible mission-critical IT services, and the importance of a given system varies from organization to organization.
The architects, developers, testers, and IT operations and support teams supporting a mission-critical application must value stability and availability. Efforts to ensure continuous operations include redundant copies of the application and of the IT systems and data center infrastructure on which it runs; hot backups; duplicate staging and production environments for thorough testing; and other measures.
An organization might choose not to update mission-critical applications as frequently as lower priority applications to reduce the risk of changes causing problems.
Mission-critical vs. business-critical
Labeling applications and workloads makes it easier to prioritize each element of an application for updates, troubleshooting and maintenance. There is no standard, agreed-upon list of criticality tiers. An enterprise and a startup might categorize applications completely differently. Similarly, companies in one industry vertical might have more applications categorized as mission-critical than companies in another. However an organization categorizes its applications, common labels include mission-critical, business-critical and low priority.
In contrast to a mission-critical application, a business-critical application is important to the company's operation, but the overall organization can still function at a basic level in the event of its failure. A business-critical application outage hinders productivity or user experience, but not to the same degree of severity as the failure of a mission-critical application, and users can turn to alternatives as IT restores service.
If a business-critical application were to fail for an extended period, the organization would suffer serious financial damage. However, a business-critical system can be down for a short time, such as a matter of hours, without seriously damaging the business or its revenue.
Noncritical, also called low-priority, applications can remain unusable or at low performance after a failure for days or weeks, with only a minor effect on the IT environment or business. A rarely used business application might be considered a noncritical application, as could one that simply makes other tasks easier to accomplish.
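The three labels can be illustrated with a minimal Python sketch that assigns a tier based on how long an outage the business can absorb. The cutoffs used here (under 1 hour and under 24 hours) and the application names are assumptions chosen for illustration only; as noted above, there is no agreed-upon tier list, and each organization would set its own thresholds.

```python
from enum import Enum


class Tier(Enum):
    MISSION_CRITICAL = 1
    BUSINESS_CRITICAL = 2
    NONCRITICAL = 3


def classify(tolerable_downtime_hours: float) -> Tier:
    """Map the longest outage the business can absorb to a tier.

    The cutoffs (1 hour, 24 hours) are illustrative assumptions,
    not an industry standard.
    """
    if tolerable_downtime_hours < 1:
        return Tier.MISSION_CRITICAL
    if tolerable_downtime_hours < 24:
        return Tier.BUSINESS_CRITICAL
    return Tier.NONCRITICAL


# Hypothetical inventory: application name -> tolerable downtime in hours.
inventory = {
    "ambulance-avl": 0,
    "payroll": 8,
    "office-wiki": 72,
}

tiers = {name: classify(hours) for name, hours in inventory.items()}
for name, tier in tiers.items():
    print(f"{name}: {tier.name}")
```

Keeping the thresholds in one function makes the policy easy to review and change when the business reassesses which applications it cannot do without.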
Mission-critical IT setups
When deploying mission-critical IT, businesses should evaluate their technical requirements for resilience and application availability, as well as their requirements for return on investment and cost. For example, if a server supports day-to-day operations, it should have multiple redundant power supplies and backups. If a data center loses power, a backup power supply can keep all the mission-critical systems online with minimal interruption. Depending on the company's budget and data center physical infrastructure, this could mean N+1 or even higher redundancy.
In N+1, N is the number of components needed to carry the normal workload, and the +1 represents one additional backup component. The number grows as more backup components are added to the system -- N+2, N+3 and so on. For example, a server running a mission-critical application can be paired with an additional server, configured to the same specifications, that takes over the workload if the primary server fails. Redundant copies of an application can provide the same kind of protection for mission-critical services.
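The arithmetic behind the N+1 notation can be sketched in a few lines of Python. The function names and the example figures (a workload that needs four components, with five installed) are hypothetical and serve only to show how the spare count relates to the number of failures a system can tolerate.

```python
def redundancy_label(required: int, installed: int) -> str:
    """Describe installed capacity relative to the N components the load needs."""
    if required < 1:
        raise ValueError("required must be at least 1")
    spare = installed - required
    if spare < 0:
        return "under-provisioned"
    if spare == 0:
        return "N"
    return f"N+{spare}"


def survives(required: int, installed: int, failures: int) -> bool:
    """Return True if enough components remain after the given failures."""
    return installed - failures >= required


# Hypothetical example: the workload needs 4 components and 5 are installed.
print(redundancy_label(4, 5))  # N+1
print(survives(4, 5, 1))       # True: one failure still leaves 4
print(survives(4, 5, 2))       # False: two failures leave only 3
```

In this sketch, an N+1 design tolerates exactly one component failure; tolerating two simultaneous failures would require N+2.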
Frequent and automated backups ensure any mission-critical application software's configuration and updates are preserved in the event of a service degradation or outage, and they protect applications from corruption or deletion. IT organizations can tier disaster recovery plans to prioritize the restoration of mission-critical applications.
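A tiered disaster recovery plan can be pictured as a restore queue sorted by criticality. The following is a minimal sketch, not a real disaster recovery tool: the `App` class, the `PRIORITY` mapping and the application names are all hypothetical, and the tier labels simply reuse the ones described earlier in this article.

```python
from dataclasses import dataclass

# Lower number = restored earlier. The numeric ranks are an assumption
# made for this sketch.
PRIORITY = {
    "mission-critical": 0,
    "business-critical": 1,
    "noncritical": 2,
}


@dataclass(frozen=True)
class App:
    name: str
    tier: str


def restore_order(apps: list[App]) -> list[App]:
    """Sort applications so higher-priority tiers are restored first.

    sorted() is stable, so applications in the same tier keep the
    order in which they were listed.
    """
    return sorted(apps, key=lambda app: PRIORITY[app.tier])


# Hypothetical application inventory.
apps = [
    App("office-wiki", "noncritical"),
    App("payroll", "business-critical"),
    App("order-processing", "mission-critical"),
    App("dispatch", "mission-critical"),
]

for app in restore_order(apps):
    print(f"{app.tier}: {app.name}")
```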
Help desk support is another way to ensure the availability of mission-critical computing and applications. The designations 24/7 and 9-to-5 refer to the hours during which the help desk is staffed: 24/7 means around-the-clock coverage every day of the week, while 9-to-5 means coverage during typical office hours. Mission-critical hardware or software should have 24/7 help desk support.
Mission-critical applications on public cloud
The IT industry is embroiled in a debate about the merits and dangers of hosting mission-critical applications in the public cloud. The choice depends on many variables, including regulatory compliance, security, performance and availability. The regulatory variables include government obligations or laws that restrict where applications and data can be hosted and stored.
Public cloud technology has matured in the areas of security, performance and availability since its inception. As public cloud providers, such as AWS, have grown larger and more experienced, many organizations have come to see security as less of an obstacle, although it remains a factor to evaluate.
In some cases, an organization can save money by relying on security services from a public cloud vendor rather than investing in specialized tools and staff. Similarly, some organizations prefer to control the entire IT infrastructure for mission-critical applications to ensure the availability of resources for optimal performance, while others turn to cloud providers and specify the capacity and scalability required, leaving the vendor to manage the infrastructure and resources.
Availability is another major factor for organizations thinking of moving their mission-critical applications to the cloud. Availability depends on the ability of the cloud provider to keep its service up and running.
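When comparing providers, it can help to translate an availability percentage into the amount of downtime it permits. The short Python sketch below does that arithmetic; the availability targets shown are generic examples, not figures taken from any particular provider's service level agreement.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime allowed by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Generic example targets, not quoted from any real SLA.
for target in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(target)
    print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

For example, 99.9% availability still allows roughly 8.76 hours of downtime per year, which may or may not be acceptable for an application that the business cannot operate without.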
Some industry experts argue that public cloud providers are better at maintaining infrastructure uptime than individual IT organizations running data centers. However, if a public cloud provider's services become unavailable, customer organizations cannot resolve the issue themselves and must wait for the provider to restore service.
In 2013, Amazon Simple Storage Service (S3) experienced degraded performance for around four hours. Affected websites saw loading times rise to more than 1,000% of normal. The region in which S3 slowed down, US-East-1, was a major hub for customer data and client usage. Larger companies that hosted business-critical and mission-critical applications on S3 lost around a million dollars during the slowdown.
The decision whether to host a mission-critical application in the public cloud should be made on a case-by-case basis, depending on the size of the organization, the perceived risks and the specifics of the application.