Friday 17 June 2011

Fault Localisation and Search Satisfaction

" #softwaretesting #testing "

A recent meet-up with Christin Wiedemann, Oscar Cosmo and Daniel Berggren to talk testing was a very enjoyable and thought-provoking session. It reminded me of a problem that I'd recently been involved with, where I'd observed Search Satisfaction at work.

Background
A tricky performance problem existed and it was getting more and more "stakeholder" attention. The problem description (bug report) was quite detailed about the network surrounding the part of the system under test (SUT) shown in the picture. The bug report described a fairly major problem in the SUT, and the investigation had been ongoing for some time when I eventually came into the loop.

The ongoing investigation and fault localisation was made more difficult by the fact that several different problems were being observed in the whole network simultaneously (multiple problems in each network element) and so rooting out the real problem was quite tricky.

Several rounds of extracting different log and trace information to localise the fault, and of creating patch builds, had been made. Then some form of inertia set in - little or no progress.

For some reason I was involved in the loop.

Search Satisfaction
When reading "How Doctors Think", ref [1], I encountered the notion of search satisfaction. This is where a diagnosis is made based on the observed data (or really a subset of the complete symptoms), no further testing or searching is done. The search is satisfied and the diagnosis made.

When I read the bug report for the problem above, and all the associated notes, I found lots of evidence pointing to the SUT. There was a lot of information about faults in the SUT and actions that had been taken to correct them and re-test. However, there was something missing - for me. The main problem/issue didn't seem to be related to the faults that had been corrected.

The description I read in the bug report would justify some of the associated faults being corrected (in an effort to localise the main problem), but not really the most serious fault. For that, more information on the surrounding network elements was needed - I suspected an interaction problem, and to understand it I needed information on the behaviour of the rest of the network.

I raised these issues with the people directly involved. They took another look, now with a different perspective, and after some further rounds of localisation (now involving a wider search) the issue was found: a fault in another network element, which was masked by the interaction between several elements.

So, in this example the faults being worked on were driven partly by the data available and partly by what could be fixed. The actions on those faults were correct. The search for fixes was satisfied by some of the symptoms that could be seen. But did those symptoms explain the severity of the main problem? Sometimes that's an important question - I can observe symptoms X, Y and Z, but do they explain the main problem?

Search Satisfaction - How?
Search satisfaction is not just about fulfilling a search and not wanting to continue searching. We learn to see patterns, get comfortable with patterns and recognize patterns. This has been demonstrated in different studies.

In one case, ref [2], it was found that American and Chinese subjects focus on different aspects of a picture. These habits of perception are thought to be influenced by the environment and society that they live in - people learn what they're used to and that becomes the default habit.

Another study, ref [3], described an experiment in which kittens learned to perceive their environments in different ways (by being deprived of either horizontal or vertical lines in their visual environment) and then had problems when they needed to use the perception they'd been deprived of.

Another favourite perspective of mine - which I use when thinking about open-ended tests - is the example in this picture from Hubble, here. Read the comment. Sometimes if you think you won't see anything, you might not see anything - so sometimes you need to keep looking!

So, when seeing a pattern in what one observes when testing, that pattern can be something that has been learnt. It fits the observation. But is it the only explanation, or is there a different observation (perspective) that might contradict the hypothesis?

Observation
In the example that I observed there were two factors that kept the search "satisfied" and constrained. These were a fixed focus and constant stakeholder attention.

Too focussed:
De-Focus and Re-Focus techniques can help avoid this problem. Sometimes you have to step back and observe what's going on around the periphery to then make some different sense of the detail. This is easier said than done, and sometimes it needs a fresh pair of eyes to trigger it.

Stakeholder "attention":
This can actually induce search satisfaction. Someone wants an answer (or diagnosis) and you have an answer staring right at you. This is more difficult to avoid, but a first step towards avoiding it is realizing that you might be operating with limited data.

Lessons from Science
Don't stick to confirmatory approaches. Think about disconfirmatory evidence - how could my theory be proved wrong or incomplete? Do I have enough information? Can I think of a way in which it might not be a sufficient (or good enough) theory?

References
  1. How Doctors Think (Groopman, Houghton Mifflin 2007)
  2. Cultural variation in eye movements during scene perception (Chua, Boland and Nisbett, PNAS 2005)
  3. Development of the brain depends on the visual environment (Blakemore and Cooper, Nature 1970)

Friday 3 June 2011

Carnival of Testers #22

" #softwaretesting #testing "

Diverse is the word for the writing in May...

Learning & Insights

  • Pete Walen highlighted parallels between deduction and test investigation (a la House), here.
  • Some relevant learnings for testers from Ralph van Roosmalen, here.
  • Albert Gareev made a triple post of his interview with Michael Bolton, good reading here, here and here.
  • THE happy tester, Sigge Birgisson, wrote about his optimistic and positive approach to testing, here.
  • Binary disease, and some of the issues it causes testers, was the subject of a post by Rikard Edgren.
  • Some interesting perspectives on memetics in testing were posted by Peter Haworth-Langford.
  • Rob Lambert was prolific in May, with this post on sharing information and communicating just one of many examples.
  • I loved the dashboards and representation ideas in Trish Khoo's post, here!
  • Christin Wiedemann illustrated that sometimes if it tastes "off" maybe it is "off", here.


New Pens

  • New writing this month from David Greenlees, with a piece on the value of conferences, both to yourself and your employer.
  • Claire Moss started her first blog with an account of her exploratory approach to STAREAST, here.


Conferences

  • Continuing with STAREAST, here is a good overview of Lisa Crispin's experience and learnings on the STC.
  • WAT2 got some good coverage from Markus Gärtner, here, Alan Page, here, and Marlena Compton, here.
  • Markus Gärtner also provided some insights from the Problem Solving Leadership course, here.

This is just a small sample of some of the good writing this past month.

Until the next time...