Incident Management is an IT Service Management (ITSM) process. It focuses on returning your organization’s services to normal performance as quickly as possible, ideally with little to no negative impact on your core business. This means that resolving an incident sometimes relies on a temporary workaround, with the root cause of the incident identified afterward.
What is an incident?
An incident is a single event where one of your organization’s services isn’t performing as desired. This includes internal services: for instance, a broken printer or a PC that doesn’t boot properly. According to ITIL principles, callers or service desk employees log an incident after it’s been reported. Open incidents are monitored until they’re resolved and/or closed. In some ITSM tools, you can use standard solutions to quickly resolve recurring incidents.
Incident vs Change/Problem
An incident concerns a brief disruption to one of your organization’s (IT) services. A change or a problem is broader in scope. In ITSM, a change covers activities such as replacing someone’s workstation (a simple change) or replacing a whole department’s workstations (an extensive change).
In ITSM, a problem is used to register recurring disruptions to your IT infrastructure. For instance, if one printer breaks down every week, it’s no longer efficient to fix it every week. In that case, it’s better to register a problem in your ITSM tool and find the underlying cause.
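As a rough illustration of how recurring incidents might be flagged for problem management, the sketch below groups incidents by asset and reports any asset with several incidents in a short period. The function name `problem_candidates`, the asset labels, and the thresholds (three incidents within 30 days) are assumptions made for this example, not values prescribed by ITIL or by any ITSM tool.

```python
from collections import defaultdict
from datetime import date, timedelta


def problem_candidates(incidents, min_count=3, window_days=30):
    """Return assets with at least min_count incidents within window_days.

    incidents is an iterable of (asset, report_date) pairs. Both thresholds
    are illustrative assumptions; tune them to your own organization.
    """
    by_asset = defaultdict(list)
    for asset, day in incidents:
        by_asset[asset].append(day)

    window = timedelta(days=window_days)
    flagged = set()
    for asset, days in by_asset.items():
        days.sort()
        # Check every run of min_count consecutive incidents for this asset.
        for i in range(len(days) - min_count + 1):
            if days[i + min_count - 1] - days[i] <= window:
                flagged.add(asset)
                break
    return flagged


# Hypothetical incident history: one printer fails every week.
history = [
    ("printer-2f", date(2024, 1, 1)),
    ("printer-2f", date(2024, 1, 8)),
    ("printer-2f", date(2024, 1, 15)),
    ("laptop-17", date(2024, 1, 3)),
]
candidates = problem_candidates(history)
```

Here only the weekly-failing printer is flagged, which matches the case where registering a problem is more efficient than fixing the same incident again.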
Process
Incident identification
The first step in the life of an incident is incident identification. Incidents are reported through whatever channels the organization allows. Sources of incident reports include walk-ups, self-service, phone calls, emails, support chats, and automated notices from tools such as network monitoring software or system scanning utilities. The service desk then decides whether the issue is truly an incident or a request. Requests are categorized and handled differently from incidents, and they fall under request fulfillment.
Incident logging
Once an issue is identified as an incident, the service desk logs it as a ticket. The ticket should include information such as the user’s name and contact information, the incident description, and the date and time of the incident report (for SLA adherence). The logging process can also include categorization, prioritization, and the steps the service desk completes.
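The ticket contents listed above could be modeled roughly as follows. This is a minimal sketch: the class name `IncidentTicket` and all field names are hypothetical and do not correspond to the schema of any particular ITSM tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class IncidentTicket:
    """Hypothetical ticket record; field names are illustrative only."""
    caller_name: str
    caller_contact: str
    description: str
    # Time of the report, stored in UTC so SLA clocks are unambiguous.
    reported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    # Filled in later, during categorization and prioritization.
    category: Optional[str] = None
    subcategory: Optional[str] = None
    priority: Optional[str] = None
    # Notes on the steps the service desk has completed so far.
    steps_taken: List[str] = field(default_factory=list)


ticket = IncidentTicket(
    caller_name="A. Example",
    caller_contact="a.example@example.com",
    description="Printer on floor 2 does not print",
)
ticket.steps_taken.append("Asked caller to power-cycle the printer")
```

Leaving category and priority empty at creation reflects that these can be assigned during logging or in a later step.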
Incident categorization
Incident categorization is a vital step in the incident management process. Categorization involves assigning a category and at least one subcategory to the incident. This action serves several purposes. First, it allows the service desk to sort and model incidents based on their categories and subcategories.
Second, it allows some issues to be automatically prioritized. For example, an incident might be categorized as “network” with a subcategory of “network outage”. In some organizations, an incident categorized this way would be considered high priority and would require a major incident response.
The third purpose is to provide accurate incident tracking. When incidents are categorized, patterns emerge. It’s easy to quantify how often certain incidents come up and point to trends that require training or problem management. For example, it’s much easier to sell the CFO on new hardware when the data supports the decision.
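The second and third purposes can be sketched in a few lines. The rule table `AUTO_HIGH_PRIORITY` and both function names are assumptions for illustration; a real service desk would maintain its own category scheme and rules.

```python
from collections import Counter

# Hypothetical rule table: (category, subcategory) pairs that some
# organizations would automatically treat as high priority.
AUTO_HIGH_PRIORITY = {("network", "network outage")}


def auto_priority(category, subcategory):
    """Return "high" for pairs in the rule table, else None (decide manually)."""
    key = (category.lower(), subcategory.lower())
    return "high" if key in AUTO_HIGH_PRIORITY else None


def count_by_category(incidents):
    """Tally incidents per (category, subcategory) pair to surface trends."""
    return Counter((c.lower(), s.lower()) for c, s in incidents)


# Hypothetical log of categorized incidents.
logged = [
    ("Network", "Network outage"),
    ("Hardware", "Printer"),
    ("Hardware", "Printer"),
    ("Hardware", "Printer"),
]
counts = count_by_category(logged)
most_common = counts.most_common(1)[0]
```

A tally like `counts` is the kind of data that can back up a request for training, problem management, or new hardware.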
Incident prioritization
Incident prioritization is important for SLA response adherence. An incident’s priority is determined by its impact on users and the business, and by its urgency. Urgency is how quickly a resolution is required; impact is the extent of the damage the incident may cause.
- Low-priority incidents are those that do not interrupt users or the business and can be worked around. Services to users and customers can be maintained.
- Medium-priority incidents affect a few staff and interrupt work to some degree. Customers may be slightly affected or inconvenienced.
- High-priority incidents affect a large number of users or customers, interrupt business, and affect service delivery. These incidents almost always have a financial impact.
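One common way to combine impact and urgency is a small lookup or scoring rule. The sketch below uses an additive score; the function name `priority`, the three-level scale, and the cut-off values are assumptions chosen for this example, since organizations define their own priority matrices.

```python
# Assumed three-level scale for both impact and urgency.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def priority(impact, urgency):
    """Combine impact and urgency into a priority (assumed scoring rule)."""
    score = LEVELS[impact] + LEVELS[urgency]
    if score >= 5:
        return "high"    # e.g. high impact with medium or high urgency
    if score >= 3:
        return "medium"
    return "low"         # only when both impact and urgency are low


p_outage = priority("high", "high")
p_minor = priority("low", "low")
p_mixed = priority("low", "high")
```

With this rule, an incident that is urgent but affects almost no one lands in the medium band rather than the high band; a different organization might reasonably choose otherwise.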
Incident response
Once the incident has been identified, categorized, prioritized, and logged, the service desk can handle and resolve it.
Incident resolution involves five steps:
- Initial diagnosis. This occurs when the user describes the problem and answers troubleshooting questions.
- Incident escalation. This happens when an incident requires advanced support, such as sending an on-site technician or getting assistance from certified support staff. Most incidents should be resolved by first-tier support staff and should not reach the escalation step.
- Investigation & diagnosis. These activities take place during troubleshooting, once the initial hypothesis about the incident is confirmed as correct. Once the incident is diagnosed, staff can apply a solution, such as changing software settings, applying a software patch, or ordering new hardware.
- Resolution & recovery. This is when the service desk confirms that the user’s service has been restored to the required SLA level.
- Incident closure. At this point, the incident is considered closed and the incident process ends.
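The five steps above can be sketched as a small state machine in which escalation is optional. The names `Step`, `ALLOWED`, and `advance`, and the exact set of permitted transitions, are assumptions for illustration; the text only fixes the order of the steps and that most incidents skip escalation.

```python
from enum import Enum


class Step(Enum):
    INITIAL_DIAGNOSIS = "initial diagnosis"
    ESCALATION = "incident escalation"
    INVESTIGATION = "investigation & diagnosis"
    RESOLUTION = "resolution & recovery"
    CLOSURE = "incident closure"


# Assumed transitions: escalation may be skipped when first-tier
# support can handle the incident on its own.
ALLOWED = {
    Step.INITIAL_DIAGNOSIS: {Step.ESCALATION, Step.INVESTIGATION},
    Step.ESCALATION: {Step.INVESTIGATION},
    Step.INVESTIGATION: {Step.RESOLUTION},
    Step.RESOLUTION: {Step.CLOSURE},
    Step.CLOSURE: set(),  # closure ends the incident process
}


def advance(current, target):
    """Move to the target step, rejecting transitions that skip a step."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"cannot move from {current.value!r} to {target.value!r}"
        )
    return target


# A first-tier resolution that never needs escalation.
step = Step.INITIAL_DIAGNOSIS
for nxt in (Step.INVESTIGATION, Step.RESOLUTION, Step.CLOSURE):
    step = advance(step, nxt)
```

Rejecting a jump straight from initial diagnosis to closure keeps a ticket from being closed before the service desk has confirmed that service is restored.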