Downtime and outages are inevitable. That makes the IT service desk the most visible and critically judged part of IT operations.
No IT service provider, from the wealthiest web companies to local businesses, guarantees 100% uptime. What makes an IT service desk effective is what it does when things break.
To build a successful IT service desk, start with workable SLAs. Set up channels for communicating with users about problems and outages, and track metrics to gauge whether the service desk is improving or needs work. For all three tasks, solicit feedback from service desk customers and business stakeholders.
An IT department exists to serve business needs. The feedback from internal users -- or external customers, if that's who your organization supports -- is the primary indicator of how the service desk has handled an issue or outage.
A service-level agreement (SLA) formalizes business expectations for IT performance. It might set incentives or penalties related to these expectations. The SLA should be in place and understood by all parties before any outage occurs. Over time, the SLA might need to change to match shifting business demands or system realities. Organizations can also set different SLAs, such as one for mission-critical applications and another for lower-priority systems.
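To make tiered SLAs concrete, it can help to translate an uptime target into the downtime it allows. The following Python sketch does that arithmetic. The tier names and percentages are hypothetical placeholders, not recommended targets, and a real SLA would define its own measurement period and exclusions.

```python
from datetime import timedelta


def downtime_budget(uptime_percent: float, period: timedelta) -> timedelta:
    """Return the maximum downtime an uptime target allows over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period * (1 - uptime_percent / 100)


# Hypothetical SLA tiers (assumed values, for illustration only).
SLA_TIERS = {
    "mission_critical": 99.9,
    "lower_priority": 99.0,
}

if __name__ == "__main__":
    month = timedelta(days=30)
    for tier, target in SLA_TIERS.items():
        print(f"{tier}: {target}% uptime allows {downtime_budget(target, month)} of downtime per 30 days")
```

Putting the budget in hours and minutes gives business stakeholders something tangible to react to when the SLA is negotiated or revised.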
Customers, internal or external, need an easy way to provide feedback to the service desk. Offer a feedback form on a per-ticket basis, or as a general option available at any time. User satisfaction with the service you're providing eclipses how well IT handles everything else.
Be alert to negative perception, as it can lead to shadow IT and even service desk outsourcing.
Direct and indirect communication
In the event of an outage or service disruption, communication should be your first focus, even before technicians analyze statistics or attempt a fix. If you get the message out quickly when something's broken, fewer calls and tickets will come in. Service desk contact will still increase during a major incident, but the spike will be smaller than it would be if users didn't know what was going on.
Some businesses set up a website to show outages. Be sure to update it frequently and keep the UI simple. Support teams can use a traffic light system -- green is good; red is bad -- to display service issues, but a page that just shows problems is easy to understand at a glance. Microsoft has a good example of a technical issues status page for Microsoft 365 Service administrators.
For some businesses, emails, SMS text messages or instant messages work to alert affected people to the status of an outage, provide any additional information and give an estimated time of restoration.
Once an outage is over, communicate any cleanup actions required, such as workstation reboots. Provide an incident summary of what happened. Even if this kind of post-mortem is not required, it's a good habit: It keeps your team accountable, knowledgeable, transparent and striving to improve the experience for all.
Communication is a two-way street. Encourage customers to give feedback -- both negative and positive -- to the support team on current or planned outage messaging. Is it clear and timely? What is missing?
There are many metrics to track for the IT service desk. Common ones include the average call time, how long it takes to assign a ticket and the time from when a problem is reported to when the incident is resolved.
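As a rough illustration, two of these metrics can be computed from ticket timestamps. The Python sketch below assumes a hypothetical `Ticket` record with `reported_at`, `assigned_at` and `resolved_at` fields. Real ticketing tools expose this data under their own field names and usually report these figures directly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Iterable, Optional


@dataclass
class Ticket:
    """Hypothetical ticket record; field names are assumptions."""
    reported_at: datetime
    assigned_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def mean_duration(durations: Iterable[timedelta]) -> Optional[timedelta]:
    """Average a series of durations; return None if there are none."""
    seconds = [d.total_seconds() for d in durations]
    return timedelta(seconds=mean(seconds)) if seconds else None


def time_to_assign(tickets: Iterable[Ticket]) -> Optional[timedelta]:
    """Mean time from report to assignment, ignoring unassigned tickets."""
    return mean_duration(
        t.assigned_at - t.reported_at for t in tickets if t.assigned_at
    )


def time_to_resolve(tickets: Iterable[Ticket]) -> Optional[timedelta]:
    """Mean time from report to resolution, ignoring open tickets."""
    return mean_duration(
        t.resolved_at - t.reported_at for t in tickets if t.resolved_at
    )


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    sample = [
        Ticket(start, start + timedelta(minutes=5), start + timedelta(minutes=60)),
        Ticket(start, start + timedelta(minutes=15), start + timedelta(minutes=120)),
        Ticket(start),  # still unassigned and unresolved
    ]
    print(time_to_assign(sample))   # 0:10:00
    print(time_to_resolve(sample))  # 1:30:00
```

Open tickets are skipped rather than counted as zero, so an unresolved backlog does not make the averages look better than they are. It is worth reporting the count of open tickets alongside these figures.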
KPIs based on these statistics aren't overly helpful for a single outage. They are more useful over time, as a way to measure whether changes to IT service desk operations have led to improvements. When you make changes to processes and practices, see if the IT support staff notice a difference. Compare how the team performs from one crisis to the next. Look for discrepancies and places to improve. While overall stats are helpful, numbers are never the full story and should serve merely as guidance.
During an outage or incident, focus on KPIs around answer times and response times. The user experience is improved when staff can move through calls quickly and politely. Monitor calls in the queue and wait times, and make more staff members available if needed.
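The queue check described above can be expressed as a simple rule. In this Python sketch, the thresholds (a five-minute maximum wait and 10 calls in the queue) are assumed values for illustration, and the list of queue-entry times stands in for whatever the phone or ticketing system actually reports.

```python
from datetime import datetime, timedelta
from typing import Sequence


def needs_more_staff(
    queued_since: Sequence[datetime],
    now: datetime,
    max_wait: timedelta = timedelta(minutes=5),  # assumed threshold
    max_queue: int = 10,                         # assumed threshold
) -> bool:
    """Return True if the queue is too long or the oldest call has waited too long.

    queued_since holds the time each waiting call entered the queue.
    """
    if not queued_since:
        return False
    longest_wait = now - min(queued_since)
    return len(queued_since) > max_queue or longest_wait > max_wait


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 9, 10)
    queue = [now - timedelta(minutes=7), now - timedelta(minutes=1)]
    if needs_more_staff(queue, now):
        print("Bring in additional service desk staff")
```

A rule like this only flags the need. Who gets called in should be decided ahead of time, so that the technicians working on the fix are not the ones pulled onto the phones.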
While outage communication is paramount, avoid two bad practices in this situation. First, do not pull IT team members off the fix to answer calls. Even if calls in the queue and wait times grow, this is temporary, and the technicians need to work to resolve the problem. Second, don't overlook tickets logged in other ways that might also be urgent.
Communicate around these metrics as well. In a longer outage, advise customers about longer wait times. Provide options on how to contact the service desk, and advise on urgent requests.
All hands on deck
It can be tricky to know whether the IT service desk is handling outages and disruptions well. Normal metrics go out the window in an emergency, as call times stretch, unanswered tickets pile up and support technicians work frantically to triage the situation.
In an outage, the IT service desk must:
- scope and identify the issue;
- troubleshoot and work around it;
- identify the root cause;
- implement remediations; and
- avoid future recurrences.
The whole time, communication to the users is crucial.