Auditing incident and problem management processes

I'm looking for key risks and key controls to address when auditing incident and problem management processes.
Thanks,
Bas

Comments

RE: Auditing incident and problem management processes

Hi Bas, a few risks and controls. Better late than never!

Incident Management:
Key risks:
1) The IT Support team is not aware of which applications or supporting infrastructure underpin critical business processes
   
Control A: All technology components that underpin the service are categorised as Configuration Items (CIs) and are maintained within a Configuration Management Database (CMDB)

2) The level of responsiveness to reported Incidents is inconsistent and affects business productivity

Control A: Incidents are recorded in an Integrated Service Management (ISM) system (Remedy 7, HP Service Manager, Service-Now, Prolin Smart Client, etc.) and identify the user affected and, if possible, the service and configuration item affected.

Control B: Incident records are prioritised based on Impact and Urgency, which together determine Priority. The higher the Priority, the more quickly the IT Support team should restore the service to normal operating conditions. Normal conditions may be defined in a Service Level Agreement. Be on the lookout for fire-fighting behaviour, though: this is where high-priority issues are dealt with immediately while medium- to low-priority calls take in excess of 20 days to be resolved. A key report to analyse here is the "Aged Calls" report, which lists the Incident records that are older than 30 days and still not resolved.
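If the ISM tool lets you export the incident records, the prioritisation and "Aged Calls" checks in Control B can be re-performed independently. The following is only a rough sketch: the 3x3 matrix values, the field names and the `aged_calls` helper are all assumptions made for illustration, not something taken from a particular tool or standard. Your organisation's own matrix and export layout should replace them.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical matrix: (impact, urgency) -> priority, where 1 is the highest.
PRIORITY_MATRIX = {
    ("high", "high"): 1, ("high", "medium"): 2, ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3, ("low", "medium"): 4, ("low", "low"): 5,
}


@dataclass
class Incident:
    ref: str                          # incident reference (assumed field name)
    impact: str                       # "high", "medium" or "low"
    urgency: str                      # "high", "medium" or "low"
    opened: date
    resolved: Optional[date] = None   # None while the incident is still open

    @property
    def priority(self) -> int:
        """Priority derived from Impact and Urgency via the matrix."""
        return PRIORITY_MATRIX[(self.impact, self.urgency)]


def aged_calls(incidents: List[Incident], as_of: date,
               threshold_days: int = 30) -> List[Incident]:
    """Unresolved incidents open for more than threshold_days, oldest first."""
    aged = [
        i for i in incidents
        if i.resolved is None and (as_of - i.opened).days > threshold_days
    ]
    return sorted(aged, key=lambda i: i.opened)
```

Grouping the result by `priority` is one way to look for the fire-fighting pattern: if the aged list consists almost entirely of medium- and low-priority records, that supports the observation.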

Problem Management
Key risks:
1) Recurring Incidents are not being identified as Problems or recorded within an ISM system (as above)

2) Recurring Incidents are not related or linked to Problem records (therefore hard to determine impact and resolution)

3) A Problem record exists but little to no root cause analysis (RCA) is being performed. (Analyse a few RCA reports and determine the level of effort and the approach taken to identify causes, not symptoms.)

4) Root causes are identified (thus becoming a known error) but do not result in a subsequent Change record (RFC) being raised. (Sometimes the cost of change may be excessive, so the business "puts up" with the problem, but generally you should be able to see the Incident records which led to the Problem record being raised, the subsequent RCA report and then the Change record. Make sure the CI listed in the Change record is relevant to the CI identified in the RCA report.)
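The four Problem Management risks above amount to a traceability test: Incident records, then the Problem record, then the RCA report, then the Change record. A minimal sketch of that test on exported data might look like the following. The record layouts, the `min_count` threshold for "recurring" and the exact-match comparison of CIs are assumptions for illustration only; in practice "relevant to" needs auditor judgement, and a missing Change may be a legitimate, accepted risk.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class Change:
    ref: str
    ci: str                                   # CI listed in the Change record


@dataclass
class Problem:
    ref: str
    incident_refs: List[str] = field(default_factory=list)
    rca_ci: Optional[str] = None              # CI named in the RCA report; None if no RCA
    change: Optional[Change] = None           # None if no Change record was raised


def unlinked_recurring(incidents: Iterable[Tuple[str, str]],
                       problems: List[Problem],
                       min_count: int = 3) -> List[str]:
    """Risks 1 and 2: incidents on a CI with at least min_count incidents
    that are not linked to any Problem record.

    incidents is an iterable of (incident_ref, ci) pairs.
    """
    incidents = list(incidents)
    linked = {ref for p in problems for ref in p.incident_refs}
    per_ci = Counter(ci for _, ci in incidents)
    return sorted(
        ref for ref, ci in incidents
        if per_ci[ci] >= min_count and ref not in linked
    )


def trace_findings(problem: Problem) -> List[str]:
    """Risks 3 and 4: gaps in the Incident -> Problem -> RCA -> Change trail."""
    findings = []
    if not problem.incident_refs:
        findings.append("no linked incidents")
    if problem.rca_ci is None:
        findings.append("no root cause analysis")
    elif problem.change is None:
        findings.append("root cause identified but no change raised")
    elif problem.change.ci != problem.rca_ci:
        findings.append("change CI differs from RCA CI")
    return findings
```

Each finding is a prompt for follow-up rather than a conclusion; for example, "root cause identified but no change raised" should be checked against any documented decision to accept the known error.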

Cheers, Matt
Matt_CampbellLively at 11/21/2011 5:48:31 PM

