Showing posts with label note taking. Show all posts

Thursday, 13 September 2007

A glimpse of a slick, professional team

I consider note taking to be a key behaviour of members of a world class service management desk. Note taking while investigating issues creates an audit trail that gives the engineer working on an issue, as well as others in the team, a trace of how something has been investigated. It allows others to be included in the investigation and to make contributions.

Without the key behaviour of note taking, the service management desk becomes prone to common problems that frustrate stakeholders of less well managed service desks:
  • Lack of visibility of issues raised by end users
  • Engineers progressing issues in isolation and difficulty in tracking the progress they have made on issues
  • Difficulty in different engineers picking up and progressing issues worked on by other engineers
  • Difficulty in having an engineer's work peer reviewed or reviewed retrospectively
  • Over-reliance on specific engineers for specific tasks
I am currently building a service desk team, with many members of the team inherited from elsewhere. Their adoption of the note taking practice has been slow, but today I started to see a glimpse of the kind of slick, professional team we are working towards: able to pass issues between engineers easily, with clear visibility to anyone interested of the technical investigation done, and with confidence in the capability of the team rather than of individuals.

Specifically, the glimpse that I "saw" was this:
  • Engineer 1 completed some work to build a new server to a very particular specification. He had recorded the details of his investigation on the ticket that was raised, #12. At first glance, the notes on the ticket seem excessive and as though not much thought had gone into them. Notes are rarely excessive, though, and the key is that notes are always there, not necessarily their quality. The build of the server took almost 3 weeks to complete, with Engineer 1 working on other things in between.
  • Recently, almost 3 months later, a similar request, #354, came in for a machine of the same specification to be built. In the past, the engineer picking up the issue would have had to reinvestigate and re-determine how to build such a machine. In fact, the task might well have fallen to the engineer who had worked on the original issue, as that engineer might remember some of the details of what they had done 3 months previously.
  • However, because there are sufficient details on #12, a new engineer (Engineer 2) was able to pick up the new ticket, #354, and complete the work for the new server. I'm sure he sought clarification from Engineer 1 on some things, but there is enough in #12 for him to work confidently on this new, similar issue on his own. He was also able to complete the work for ticket #354 quicker than the time taken to complete #12 – days rather than weeks. This is because he did not have to repeat the investigation done for #12.
  • This alone I thought was a great improvement in working practices... But it gets better! Engineer 2 was away today and a further request was made on #354 by the user who logged the issue. In the past, this might have had to wait for Engineer 2 to return to work because no one would have been quite sure of what had been done. However, Engineer 2 had also made notes on #354 as he progressed the issue, meaning a third engineer, Engineer 3, could respond to this and progress the issue further.
We still have some way to go before stories like this are true of every incident that we deal with. However, I think it is encouraging that we are now starting to see the note taking behaviour being adopted, along with its benefits.

Wednesday, 29 August 2007

Cultural Change: The most fundamental task. The most difficult task?

It is one thing to talk about "best practices". It is quite another to have them implemented and working effectively within a team or organisation. Since the end of June, when I took a team on to shape and mould into an Operations team, perhaps the most striking problem has been some members' sheer resistance to adopting new working practices. This has to be a problem in any organisation attempting to improve.

In particular, I want the team to make notes on their incident investigations. There are a multitude of reasons for this, such as allowing other engineers to review and continue the work if necessary and allowing the notes to be reviewed retrospectively if similar incidents occur in future.

My first and default method for getting the engineers to follow this practice was to tell the team quite simply what my expectations were and what we should be doing with respect to taking notes. This was enough for one of the team of 5 to take it all on board and start working as expected.

I then worked through some problems and showed how this could be of benefit. No further engineers were swayed to this new way of working.

The next step was to organise a "training". To this, I invited the users of the Operations team - project managers, developers and others who would be using the services of the Operations team. I asked them to tell me what they thought would make a great Operations team. They came up with suggestions like "knowledge sharing in the team" and "clear idea of where an investigation is and the process". I then went through how I would investigate a problem using this note taking working practice and how this satisfied their requirements. Almost all the users liked what they saw and approved.

The training had an interesting effect - one of the engineers requested to change teams soon afterwards, leaving a team of 4. The others became more convinced of the usefulness, but after an initial stab at trying the new method, soon reverted to their old ways.

After a major incident, a retrospective was held and some of the same themes re-emerged: the need to share knowledge, the need for more logging/notes on the investigation. These themes came from the team members themselves, yet still behaviours have not changed and the working practices have not been adopted.

During and after other incidents, users have sent emails relating to these same points and themes. The team members have seen these mails, yet still continue to work in the same way.

Changing working practices and creating a working environment where this is possible has now become the major and most fundamental issue. Everything else, such as what counts as "best practice" or which IT Service processes to use, is a secondary issue.

Thursday, 23 August 2007

Giving customers visibility of issue progression - Skype example

A week ago there was a massive outage at Skype - none of their 9 million users could use their service for 2+ days. You can imagine that if Skype is one of the central ways in which you speak with your friends, you would have been very frustrated - the frustration you would feel with an outage to your mobile phone network for a few days.

What is interesting is that they used their blog to keep their users updated on progress: http://heartbeat.skype.com/

If you look at the entries for the month at
http://heartbeat.skype.com/2007/08/
you can see the entries they made throughout the incident to keep their users posted on what was happening. I've copied edited-down snippets here, and I really recommend going through these updates pretending to be one of the frustrated Skype users wanting their service working. I have further comments below.

Problems with Skype login
By Joosep on August 16, 2007.
UPDATED 14:02 GMT: Some of you may be having problems logging into Skype. Our engineering team has determined that it's a software issue. We expect this to be resolved within 12 to 24 hours...

Thanks for your support
By Villu Arak on August 16, 2007.
We'd like to thank everyone who has taken the time to send us their thoughts...

The latest on the Skype sign-on issue
By Villu Arak on August 16, 2007.
... we wanted to dispel some of the concerns ... The Skype system has not crashed or been victim of a cyber attack...

Further on the sign-on issue
By Villu Arak on August 17, 2007.
...We feel that we are on the right track to bring back services to normal. (Updated at 2:15am GMT)

Where we are at 0400 GMT
By Sten on August 17, 2007.
...We're fixing issues in our networking software and monitoring the clients getting online with increased success...

Looking slightly better at 0700 GMT
By Sten on August 17, 2007.
...even though it is too early to call out anything definite yet we are now seeing signs of improvement in our sign-on performance...

Where we are at 1100 GMT
By Villu Arak on August 17, 2007.
...We're on the road to recovery. Skype is stabilizing... Neither Wednesday's planned maintenance of our web-based payment services nor any form of attack was related to the current sign-on issues in any way.

Update at midnight GMT
By Villu Arak on August 18, 2007.
...Skype presence and chat may still take a few more hours to be fully operational....

The words we've all been waiting for
By Villu Arak on August 18, 2007.
Take a deep breath. Skype is back to normal.

What happened on August 16
By Villu Arak on August 20, 2007.
...The disruption was triggered by a massive restart of our users' computers across the globe within a very short timeframe as they re-booted after receiving a routine set of patches through Windows Update...

Now, as you were waiting for your Skype to start working again... How did reading those updates make you feel, compared to not having anywhere to look to see what was going on? What if they had updated the blog every few minutes as they worked on the problem, rather than every few hours, and added technical detail which you may or may not understand - would that have made you feel more or less happy that the problem was being investigated and resolved? Then compare that with the service from, for example, a bureaucratic government organisation, or even your lawyer during the process of buying/selling a house. There is no place to go and see what is happening with your issue, and you feel you are banging your head against the wall - constantly chasing for updates through phone calls or other means.

This Skype example gives a glimpse of what is possible through using a ticketing/bug tracking system when engineers working on the problem update those tickets with notes. The dramatic increase in visibility of an issue being progressed gives greater confidence to customers, reducing their anxiety.
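To make the idea concrete, here is a minimal sketch in Python of a ticket that carries timestamped notes. It is not taken from any real ticketing system: the `Ticket` and `Note` classes, their fields and the sample note text are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Note:
    """One entry in a ticket's audit trail (hypothetical structure)."""
    author: str
    text: str
    at: datetime


@dataclass
class Ticket:
    """A ticket that accumulates notes as engineers work on it (hypothetical)."""
    number: int
    summary: str
    notes: list[Note] = field(default_factory=list)

    def add_note(self, author: str, text: str) -> None:
        # Every note is timestamped so the order of the investigation is preserved.
        self.notes.append(Note(author, text, datetime.now(timezone.utc)))

    def history(self) -> list[str]:
        # What a customer or another engineer would read to see progress.
        return [f"{n.at:%Y-%m-%d %H:%M} {n.author}: {n.text}" for n in self.notes]

    def contributors(self) -> list[str]:
        # Engineers who have worked on the ticket, in order of first note.
        seen: list[str] = []
        for n in self.notes:
            if n.author not in seen:
                seen.append(n.author)
        return seen


# Illustrative usage with made-up note text.
ticket = Ticket(354, "Build server to the same specification as #12")
ticket.add_note("Engineer 2", "Followed the build steps recorded on #12.")
ticket.add_note("Engineer 3", "Picked up the user's follow-up request.")

for line in ticket.history():
    print(line)
```

The point of the sketch is only that each note records who did what and when, so anyone reading the history can pick the issue up or check on its progress.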