Reading Archives - JitGo Jitesh Gosai Tester, Speaker, Presenter

August – Toread

31st August

📻 How Can You Stop Comparing Yourself With Other People? If you manage people then this podcast is worth a listen. Having a better understanding of why we compare ourselves to others (social creatures living in hierarchical structures) and what issues it can cause (de-motivation, decreased self-esteem and confidence) can help you stop doing it, but also help you keep your reports from falling into the trap. They also cover some biases that can lead to it such as causal inference and narrative fallacy.

💪 How Resilience Works The post calls out three traits for resilience: 1. A grasp of reality 2. Life has a purpose (for you) and 3. An ability to improvise. I’ve not seen resilience called out like this before but there are some good anecdotal stories in there and in broad strokes I agree with it. But like a lot of things with the mind it’s easier said than done. Especially when you’re in the thick of things going wrong. (Book to add to the reading list: Man’s Search for Meaning)

🍏 How Apple controls the App Store and therefore the end users How Ben explains the App Store Integration in stages is really interesting and key to understanding how Apple has so much control over developers and users. This is a long read but worth it to understand Apple’s almost unbelievable control of developers and users. If you want to access Apple’s users then you almost have no choice but to do as they say, otherwise they can revoke your certificates and cut you off in an instant. The thing is this integration is so complicated that most people are either not going to understand it or not going to take the time to figure it out. This is very different to how Microsoft controlled Windows.

17th August

🗺 Things Jobs said I’m no Steve Jobs fan (in the literal sense of the word) but no one can deny he helped create some incredible products. Every so often I read these quotes from him and depending on what’s going on in my work life they take on a different meaning. But one that always stays with me is this one: “You can’t connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future. You have to trust in something–your gut, destiny, life, karma, whatever. This approach has never let me down, and it has made all the difference in my life.”

🏌️‍♀️The Beginner’s Guide to Deliberate Practice What is it? Deliberate practice requires focused attention and is conducted with the specific goal of improving performance, while regular practice might include mindless repetitions. But it’s not that simple: deliberate practice requires that you break down the task into small sub-sections and practice each one till you get better. This is easy if the domain you’re trying to learn is well known, but if it doesn’t have any existing training that you can make use of, or you don’t have access to trainers that can help (e.g. mentors or coaches), then you might struggle. I believe this is why it is always a good idea to learn from multiple sources when skilling up in something new, so that you are pushed out of your comfort zone in different ways. If you’re learning something just from one source then keep in mind that it might be one sided…

10th August

🎼 What software teams can learn from music masterclasses Via twitter from ‪@katrinaclokie.‬ Feedback is by far one of the best ways you can learn and Helen makes a great point in that software teams can learn a lot from music masterclasses and studio classes too. Both are great ways to get feedback from more established artists, peers and teachers. But also from peers in different disciplines who can give a viewpoint that your own peer group might not be able to. Another key point Helen makes is that giving and receiving feedback is a skill and as such needs to be practiced to really help people. I don’t think we do this enough in software teams and when we do it’s not always the best. There is a lot we can learn from artistic masterclasses as an industry which I guess reflects the maturity of their professions and the relative youth of ours.

🚽 Code Coverage Best Practices This post from Google’s Testing on the Toilet series makes some great points on how code (or test) coverage can be a useful metric for teams to use. The biggest one being about how it highlights code that isn’t covered by tests. This is the perfect opportunity for teams to discuss if it should or shouldn’t be. Also the advice on using per-commit coverage to inform conversation topics in code reviews is a really good idea. But as the article points out, going straight in with “We should use code coverage!” is probably not going to get you very far. Most engineering teams have been burnt pretty badly by it in the past, with developers just trying to hit numbers or it being used to measure their effectiveness. Both of which lead to the wrong incentives of number gaming rather than productive conversation starters on what are good and bad tests for your context.

🐦🧵 Everything you needed to know about 2+2=5 Kareem makes a great point that it’s all about context. If you’re thinking just about raw numbers then 2+2=4, but if the context was, say, a male cat and a female cat, then given some time it could quite easily be 1+1=8. Numbers are an abstraction of the underlying reality, therefore context matters when you’re looking at numbers. One to ponder the next time you’re looking at statistics 🤔

📹 What is white privilege? Via BBC Bitesize from psychologist John Amaechi. This short 3 minute video does a really good job of explaining what privilege is and what white privilege in particular means. It’s not that white people have it easy or struggle any less than people of other races. It’s that their struggles are not going to be about their race, whereas race can be an additional limiting factor for people in the BAME community. In short white privilege means your skin colour will not be used against you.

At the intersection of software, technology and people
Things I’ve been reading this week that I’ve found interesting or intriguing.

How Falling Behind Can Get You Ahead

📹 15 minute TEDx talk from Manchester 2020: How falling behind can get you ahead. Some highlights:

“Jack of all trades, master of none,” the saying goes. But it is culturally telling that we have chopped off the ending: “…but oftentimes better than master of one.”
In a society hyperfocused on headstarts, we are told to choose our paths early, focus narrowly, and start racking up our 10,000 hours of deliberate practice. But a mountain of research shows that, among people who end up fulfilled and successful, early specialization is the exception, not the rule.
Winding paths and mental meandering can be sources of power, not disadvantages, but we rarely hear those stories. David is trying to change this.

Specialising may not be such a great thing, and the talk references an interesting theory: Kind and Wicked Learning environments.

Good explanation here: https://www.driverlesscrocodile.com/books-and-recommendations/david-epstein-on-kind-and-wicked-learning-environments/ from the speaker of the TED talk
Actual paper from Robin Hogarth et al: The two settings of kind and wicked learning environments

Kind Learning environments

Kind environments give lots of feedback as you progress, which aids deliberate learning
The rules of the system don’t change either, so what it is today is the same tomorrow
Golf and chess are such environments

Wicked learning environments

Mixed levels of feedback as you progress
Rules of the system keep changing
I think software engineering may be a wicked environment
But need to do more research

The biases called out in the paper could be really helpful when thinking about the decisions that affect learning in these environments. I’m also certain I’ve been influenced by survivorship, censorship, selection and the ‘hot stove’ biases.

July – Toread

What is this?

Things I’ve been reading this week that I’ve found interesting or intriguing. Sharing because I thought you might like them too. Most of the links will revolve at the intersection between software, technology and people – with the occasional testing slant. I aim to update them weekly, with some commentary on my thoughts and findings. Feedback always welcome 😁

📬 Latest post: what do testers do next if the risks mitigated by manual testing can be reduced through other means? Is it about moving more towards creating a quality culture and if so what do you need to know?

📝 My notes on Kind and Wicked learning environments and how they affect your ability to pick up skill.

Some more notes on a really interesting idea from Eugene Wei on the Invisible asymptote. See July 13th below for more or head over to my notes on the article that pull out some of the bits I found interesting.

31st July

❓ Four-Level Training Evaluation Model some useful ideas on what to look for when trying to get feedback on your training or other presentations. Another question that comes to mind: Is the training for the learners or for you to accomplish/be recognised for something… 🤔

💭 10 signs you’re an over thinker While thinking is obviously a good thing overthinking isn’t. But how do you know when you’re doing the good type of thinking? Simple rule: overthinking is focusing on the problems (by either ruminating about the past or worrying about the future). Good thinking is problem solving by focusing on the solutions and self-reflective thinking is looking at situations from a different perspective and finding new insights.

👷‍♀️ 3 things that motivates us to work From Dan Pink’s RSA lecture based on his book Drive. The three things being autonomy, mastery and purpose. Autonomy is about being self directed over what and how you do something. Mastery is having the ability to get better at something that challenges us and making a contribution. Purpose is the reason for being or why are we doing the thing we do. The interesting thing is this is about individual motivation to work. Does it still apply when working in teams as we do in software?

27th July

A model of what could happen if you dropped the ‘In Test’ column…

👷‍♀️ From ‘In Testing’ to ‘In Progress’ columns on team boards: This has a very narrow focus on just the dev and test relationship. This model helps illustrate how improving their relationship, and getting them to actively collaborate to improve confidence that the code changes work as intended, is going to start having an effect on work in progress (WIP). As @johncutlefish shows, high WIP can lead to a whole host of other problems. The grey lines are what it was previously with the ‘In Testing’ column broken out into its own section.

👯‍♂️ Don’t Mock Types You Don’t Own This happens more often than you realise and leads to lots of other problems, the main one being you now have to maintain a mock of a service you don’t own or fully understand. Therefore you’re testing against your assumptions of the service you’re mocking. This could lead to a false sense of confidence that everything will work when you go to production. Ideally you want to be using a stub with little to no logic, i.e. little to no assumptions, and any that are made are obvious to other developers. Contract testing, and consumer driven contract testing in particular, can help here. The other issue is people use the word mock to mean a whole host of other types of test doubles (fakes, stubs, spies etc.), which leads to more confusion, so check what they mean when they say mock before assuming you’re talking about the same thing.
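
A minimal sketch of the idea, assuming a hypothetical payment service: rather than mocking the third-party client directly, the code depends on a small interface the team owns, and tests use a stub with no logic. `PaymentGateway`, `StubGateway` and `checkout` are made-up names for illustration and don't come from the linked post.

```python
from dataclasses import dataclass
from typing import Protocol


class PaymentGateway(Protocol):
    """An interface we own (hypothetical). A real adapter would wrap the
    third-party client behind this; tests never touch that client."""

    def charge(self, amount_pence: int) -> bool: ...


@dataclass
class StubGateway:
    """A stub with no logic: it returns whatever it was told to return,
    so the only assumption is visible in the test itself."""

    result: bool

    def charge(self, amount_pence: int) -> bool:
        return self.result


def checkout(gateway: PaymentGateway, amount_pence: int) -> str:
    """Code under test depends only on the interface we own."""
    return "paid" if gateway.charge(amount_pence) else "declined"


print(checkout(StubGateway(result=True), 500))
print(checkout(StubGateway(result=False), 500))
```

The real adapter still needs its own check against the actual service, which is where contract tests come in.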

🎓 Accountability vs Responsibility This has been really useful when thinking about who is accountable within teams for tasks and who is responsible. I found others (myself included!) mix these up. Accountability cannot be shared and means you are answerable for your actions, whereas responsibility can be shared and means you must respond when someone questions your actions. Having these distinctions can be really helpful in making sure people understand what they are accountable and responsible for. The comments are worth a read too…

20th July

😱 Programming is not a craft from Dan North in 2011, and I have to say I agree with his take; even after all this time it still stands. I think this really sums it up: “Non-programmers don’t care about the aesthetics of software in the same way non-plumbers don’t care about the aesthetics of plumbing – they just want their information in the right place or their hot water to work”. By putting programming at the centre (by treating it as a traditional craft) and not the value you are delivering, you risk building what you want and not what users want/need/care about/value. That’s not to say that the code can be shoddy, far from it. Just like the plumbing it needs to work, but does it need to be gold plated with silver fixings?

🐦 Learning How to Learn thread from Jez Humble calling out a book: Learning how to learn: A guide for kids and teens. The book aims to help you talk to younger people about how to learn. It covers a really interesting topic, the focused and diffuse modes of learning, that I hadn’t come across before. There is also a free Coursera course by the author on the mental tools of learning covered in the book.

📻 How to make your own luck (podcast) The frame with which you look at the world (people, events, things that happen etc.) is going to have a big impact on the opportunities that you’re going to find. So what frame are you using when making decisions? The world is a wicked learning environment (slow feedback, hard to tell which variable caused the outcome) while poker can be a kind learning environment (fast feedback, low number of variables, easier to identify mistakes and learn from them). Poker therefore helps you understand your decision making more easily, which you can then possibly translate over to the real world.

You can find more about wicked and kind learning environments from How Falling Behind Can Get You Ahead:
Kind Learning environments

Kind environments give lots of feedback as you progress, which aids deliberate learning
The rules of the system don’t change either, so what it is today is the same tomorrow
Golf, chess and poker are such environments

Wicked learning environments

Mixed levels of feedback as you progress
Rules of the system keep changing
I think software engineering may be a wicked environment

13th July

🤩 Invisible asymptote (AKA The Invisible glass ceiling of testing) Excellent (and long) read from Eugene Wei and a must read for anyone working in product and software development in general. Brilliantly articulates that all products have an invisible glass ceiling, and that recognising your total addressable market can help you understand when you’re going to hit it and actually do something about it.

Why should testers care?

This is a great way to understand how your product owners might be thinking (or should be if they are not). In terms of product quality this could be one of the lenses through which you look at your products to understand what is valuable to product owners. It’s also a great way to start understanding what value your product is potentially bringing to your users and which cohorts it is and isn’t addressing. My notes on the article pull out some of the bits I found interesting.

In terms of analysis this hits two of the three domains that testers should have a grasp of: business and users. From that angle we can help the third domain (teams) understand how this affects them.

Remember testing doesn’t always look like testing

🔈 How do you handle criticism Getting feedback is by far the best way to get better but not all feedback is equal. You need to filter out the valuable parts from the things that sting the ego. One way to get better at receiving feedback is to rate yourself on how you respond to it. 5 being excellent and 1 being poor. Did you respond positively and thank them (4 out of 5) or did you try and talk them out of their opinion (2 out of 5). This will help you get better at hearing feedback but also more likely to do something about it.

👩‍💻 Develop your culture like its software Interesting post from 2017 from the ex-engineering manager of The New York Times. They used a Google Doc to make it collaborative and to start iterating on it. Culture is something that either just happens and evolves in a direction out of your control or you try and be deliberate about it. My preference is towards deliberate because then if it starts heading in a direction you don’t want you’re in a position to do something about it. Otherwise you find out when something hits the headlines. At which point it’s too late to do something…

🏚 Extreme testing Cool video of what IBM do to make sure their mainframes can handle earthquakes. Makes you wonder what type of testing AWS/Azure/GCP do for all their server farms

6th July

👩‍🏫 Professionalism is not enough via Ten things I’ve learned by Milton Glaser

…when you are doing something in a recurring way to diminish risk or doing it in the same way as you have done it before, it is clear why professionalism is not enough. After all, what is required in our field, more than anything else, is continuous transgression. Professionalism does not allow for that because transgression has to encompass the possibility of failure and if you are professional your instinct is not to fail, it is to repeat success. So professionalism as a lifetime aspiration is a limited goal.

🥾 New employee bootcamp Really interesting approach to getting people (product owners in this case) up to speed quickly and productive within their work. I really like the concept of “put your own gas mask on first before helping others” in terms of helping them figure out their own career paths. What would this look like for onboarding new testers in a team?

🧫 What is culture? I was doing some research on this and it turns out (unsurprisingly) that it’s not that easy a question to answer, but the Centre for Applied Linguistics at the University of Warwick (UK) has some really good resources. In particular this doc, which tries to answer that very question in a way that is approachable and can actually help you understand what it is. They break it down into 12 key characteristics but I think this explanation from Spencer-Oatey (2008) does a pretty good job:

“Culture is a fuzzy set of basic assumptions and values, orientations to life, beliefs, policies, procedures and behavioural conventions that are shared by a group of people, and that influence (but do not determine) each member’s behaviour and his/her interpretations of the ‘meaning’ of other people’s behaviour”

https://blackswanfarming.com/value-a-framework-for-thinking/

🤑 What is value? Interesting way of thinking about what value means. In this model there are two focusing areas: revenue and costs. How does something sustain revenue, increase revenue, avoid cost and/or reduce cost. By applying a monetary number to these you can then discuss them in a way that everyone understands and can hopefully agree on. The other reason for relating this back to a number is having a discussion on what assumptions people are making about those numbers.
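
A minimal sketch of what putting numbers on those four areas might look like. The option names and every figure are made up purely for illustration; the point is only that once each estimate is written down, the assumptions behind it become something a team can discuss.

```python
from dataclasses import dataclass


@dataclass
class ValueEstimate:
    """Estimated yearly value of one piece of work, split into the four
    areas from the model. All figures are assumptions to be challenged."""

    increase_revenue: float = 0.0
    sustain_revenue: float = 0.0
    reduce_cost: float = 0.0
    avoid_cost: float = 0.0

    def total(self) -> float:
        return (self.increase_revenue + self.sustain_revenue
                + self.reduce_cost + self.avoid_cost)


# Hypothetical options with made-up numbers.
options = {
    "faster checkout": ValueEstimate(increase_revenue=40_000, reduce_cost=5_000),
    "fix flaky deploys": ValueEstimate(sustain_revenue=10_000, avoid_cost=25_000),
}

# List the options from highest to lowest estimated value.
for name, estimate in sorted(options.items(), key=lambda kv: kv[1].total(), reverse=True):
    print(f"{name}: {estimate.total():,.0f}")
```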

Thanks to Duncan Nisbet for his intriguing blog series on cost of delay vs cost of poor quality, which linked me to the above post. In Duncan’s post he does a really good job of showing why trying to answer that question is really difficult, and he is setting up a framework to try to do just that. I’m looking forward to seeing how this works out!

June – Toread

29th June

🎓 Why Your Organization Isn’t Learning All It Should In your everyday work do you have time to look at the cause of small issues or do you work around them with hacks and fixes? Can you bring them up with other colleagues or are they brushed off as trivial, or worse, are you labelled as a complainer? As detailed in How to learn from failure and Nobody ever gets credit for fixing problems that never happen, it’s usually these small preventable problems that build up into big complex problems, so having time to get at the 2nd order problems (root causes) is really important. Also research suggests that once you start fixing these issues there is a reinforcing effect that encourages others to raise problems but also start to address them too, slowly improving the whole system.

🖥 Brief history of Apple’s OS X If your primary work machine is a Mac then having a basic understanding of its history helps you understand why it works the way it does. If it’s not your main machine then this short history lesson can also help you understand what Apple did that was different to iOS and to some extent Windows.

🔬 First and second order problem solving From the article: research on problem solving makes a distinction between fixing problems (first-order solutions) and diagnosing and altering root causes to prevent recurrence (second-order solutions). First-order problem solving allows work to continue but does nothing to prevent a similar problem from occurring. Workers exhibit first-order problem solving when they do not expend any more energy on a problem after obtaining the missing input needed to complete a task. Second-order problem solving, in contrast, investigates and seeks to change underlying causes of a problem.

👩‍🏫 XConf Online Usually a free to attend conference held by ThoughtWorks at different locations throughout Europe with one being in Manchester. But COVID means it’s all online so anyone from anywhere can attend this year. It’s usually a multi track conference but this time they’ve just gone with one track with all of the talks coming from ThoughtWorks employees.

In previous years they’ve never really had any talks on testing, but this year they had three covering what a unit is for a mainframe system, testing things in production and mutation testing. The interesting thing was none of these talks were given by testers but by developers. Is this a sign of things to come 🤔?

All talks should hopefully be made available soon so I’d highly recommend checking them out. Personally I got a lot out of the Redefining the unit and Your test coverage is a lie (mutation testing) talks for which you can find summaries in my notes.
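
For anyone who hasn’t come across mutation testing, here is a hand-rolled sketch of the core idea (it is not taken from the talk and uses no mutation testing tool): a small change, a “mutant”, is made to the code and the tests are re-run. If they still pass, the tests executed that line without really checking it. The function and suite names are hypothetical.

```python
def is_adult(age: int) -> bool:
    """Original code."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Hand-written mutant: '>=' changed to '>'."""
    return age > 18


def weak_suite(fn) -> bool:
    """Executes every line of fn (full line coverage) but never
    checks the boundary value."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Adds the boundary case that tells the mutant apart."""
    return weak_suite(fn) and fn(18) is True


for name, suite in [("weak", weak_suite), ("strong", strong_suite)]:
    passes_original = suite(is_adult)
    mutant_killed = not suite(is_adult_mutant)
    print(f"{name}: passes original={passes_original}, kills mutant={mutant_killed}")
```

Both suites pass against the original code, but only the strong one fails against the mutant, which is the gap that a coverage number alone won’t show.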

22nd June

🗞 75 years of US advertising – This will help you create a deeper understanding of the advertising market and how the trends in online businesses and consumer behaviour (think end users) are shifting power from the traditional advertising conglomerates (print/TV/radio) to the online ones (Google, Facebook, Amazon etc). It’s not a simple lift and shift but a decline in businesses that did mass marketing and a rise in businesses that need targeted advertising.

🕵️‍♀️ Snopes on white privilege in America – Snopes was something I used a lot when I was in my late teens/twenties to disprove things that used to be flying about on the Wild West of the early internet days. This time they pull apart an argument that white privilege doesn’t exist in America. This is a long but worthwhile read on what white privilege is and how it’s affected black Americans’ lives.

🍏 Apple begins to enforce in-app purchases (IAP) of *any* good – Why should testers care? Understanding the what of this is useful in knowing how it could affect the apps you work on and other apps in the market.

🤖 Ironies of automation The two ironies

Designers of automation believe that manual operators are unreliable and a source of problems, and therefore that automating the manual task removes the unreliability. The irony here is that the errors the designer unintentionally creates in the automation are the major source of unreliability in the process.
Designers who attempt to automate the manual operator away still leave the manual operator in the process to handle the tasks that couldn’t be automated. The problem here is that the tasks left for the manual operator can be quite arbitrary, with little thought for how they will actually carry them out, resulting in a new set of issues.

This is not to say that automation is bad but some of the intention (removing people) may not be for the right reasons (improving the process). I believe if the focus was to improve the process then the risk from the two ironies could be greatly reduced. One of the best ways to achieve this is to actually include the manual operator as part of the improvement process. Essentially have a user centric approach to improving systems.

15th June

🎓 SQ3R learning technique 30s reading time
Not come across this one before but it could be useful for helping you recall things you’ve read. I particularly liked attaching meaning to the content and looking at it from different viewpoints. Personally I’ve found I’ve done this unknowingly with some things and it really does help with recall. Plus trying to recall it again at a later date and applying that knowledge to different situations has helped further embed that information.

🗣 Asking Powerful questions 1 min reading time
The Powerful questions pyramid could be a helpful tool for asking more open questions that get people thinking.

📑 Making job descriptions more accessible 3 mins reading time
Stop reusing job descriptions and start tailoring them to the candidates you want. We’re not doing ourselves any favours by sending out the same old thing every time.

5th June

📰 The Origin of Product Discovery or what is Product Discovery? … true collaborations between engineers, designers and product managers. Not a dictatorship run by a Product Manager. But what does true collaboration in this context actually mean? Which leads me onto the scales of collaboration…

⚖️ Scales of collaboration While thinking about how people and teams collaborate it occurred to me that there must be a scale from not at all to full collaboration. A quick search returned something called the levels of collaboration. This could be helpful for teams to 1) make them aware of how well they are working together and 2) work out what they would need to improve to be able to get better at it. Maybe we should stop saying we need to collaborate more and start saying what that actually means? Which leads me onto some advice…

☀️🧴It was that time of year again (🎂) so what better than some advice: Learn how to learn from those you disagree with, or even offend you. See if you can find the truth in what they believe. Taken from 68 bits of (someone else’s) unsolicited advice. Need some life advice? Then this list is a good place to start, and there’s a video too. But I think I’ll always remember Mary Schmich’s column, Advice, like youth, is probably wasted on the young, better known as the lyrics to the Baz Luhrmann song Everybody’s free (to wear sunscreen)

🤔 Long read of the week comes from Joep Schuurkes: Nobody ever gets credit for fixing problems that never happen: Creating and sustaining process Improvement. The work harder and work smarter balancing loops in this model do a great job of explaining what “work smarter, not harder” actually means. But the reinvestment reinforcing and shortcuts balancing loops really get at what goes wrong with most process improvement initiatives in software organisations: it’s difficult to attribute improved performance to process improvements the way you can with working harder. So we trap ourselves into working ever harder till one day we just can’t do it anymore 🤯

Agile Manchester 2020: Testers Edition

I’ve always found tester representation at agile conferences to be lacking. It’s a bit like “it doesn’t have the word test in the title so it’s not for me”. Personally I’ve always found a treasure trove of information in talks that are directly or indirectly related to software testing. Remember testing doesn’t always look like testing: you sometimes need to change the frame with which you look at things to get the best out of them.

Below are the talks that I attended with a brief summary of the talk and what it could mean for testers.

Want the full unadulterated notes? Then see my personal notes.

Fighting Code Rot with Continuous Improvement

by @garyfleming Slides http://bit.ly/fight-code-rot

Summary of the talk Good talk covering all the basics of keeping your system up to date and why. Well delivered and really useful for less experienced team members and a good recap for “they should know better” members.

For Testers For testers, understanding what needs to be updated and when can help them understand how that change could affect end users. Be proactive: what do the release notes say for X, how do we use system Y? Building this knowledge takes time but can be really valuable in the long run. Start small and work your way up. Developers can help you, but try and help them by having specific questions.

Agile metrics for predicting the future

Summary of the talk: Forecasts will always beat estimates for non-deterministic projects (think all software projects), as they help you understand what could happen with a confidence rating. You’ve probably already got most of the data you need: knowing your lead times and throughput, plus some spreadsheets, can help with this.
The key thing to remember is you need to talk to your stakeholders and make sure they understand what the numbers mean and how it affects them. Don’t just give them the spreadsheets and expect them to understand.
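
One common way to turn throughput data into a forecast with a confidence rating is a small Monte Carlo simulation. The sketch below is not from the talk: the weekly throughput history and backlog size are made-up numbers, and the function names are hypothetical. It repeatedly replays randomly chosen past weeks until the backlog is done, then reads off percentiles.

```python
import math
import random


def forecast_weeks(weekly_throughput, backlog_size, runs=5000, seed=1):
    """Simulate how many weeks it takes to finish backlog_size items by
    resampling past weekly throughput. Returns the sorted week counts."""
    if max(weekly_throughput) <= 0:
        raise ValueError("need at least one week with items finished")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for _ in range(runs):
        remaining, weeks = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput)
            weeks += 1
        results.append(weeks)
    return sorted(results)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = math.ceil(len(sorted_values) * pct / 100)
    return sorted_values[max(rank, 1) - 1]


# Hypothetical data: items finished in each of the last 8 weeks.
history = [3, 5, 2, 6, 4, 4, 1, 5]
results = forecast_weeks(history, backlog_size=40)

for pct in (50, 85, 95):
    print(f"{pct}% of simulated runs finished within {percentile(results, pct)} weeks")
```

The output is a range with a confidence level attached rather than a single date, which is the conversation to have with stakeholders.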

For Testers: If quality means value to someone then value to people working in delivery roles is greater predictability in delivering our software systems. Understanding what these metrics are, how they are used and what affects them will not only enable you to have more productive conversations with delivery but also understand why they are important to that group. This will help you articulate risk better within your teams as you will be able to tailor the message to that specific audience. This will not only help you help them understand risk better but increase your value beyond just being the person that finds all the issues. This elevates you from being just a tester towards being a test analyst.

Crucial conversations in agile teams

How making it safe to talk about almost anything unlocks continuous improvement by Chris Smith. Slides: https://www.slideshare.net/chris_smith1976/crucial-conversations-for-agile-teams-agile-manchester-virtual-may-2020

Summary of the talk Really interesting talk by @cj_smithy at #agilemanc focusing around the book Crucial Conversations but bringing in ideas from The Chimp Paradox, The Five Dysfunctions of a Team, Radical Candor and generative cultures by Ron Westrum. Conversations within agile software teams are incredibly important (remember individuals and interactions over processes and tools) so being better skilled at them is a great thing.

For testers We have conversations with team members all the time. Whether that is to find out information or to inform others about what’s going on, it’s a core part of our skill set. Getting better at communicating verbally should be part of our personal development as testers. The resources mentioned in this talk would go a long way to help you build up that skill and help keep it sharp.

Leading an agile organisation through hyper growth

By Patrick Kua

Summary of the talk The company Patrick was CTO for went through some really fast growth over a very short period. The model he used was simple and acknowledged that it wouldn’t work forever, so they kept iterating and scaling the organisation. Useful to benchmark your company against to see where you are in the growth of your organisation.

For testers Understanding how a company develops from startup to enterprise is really helpful in seeing what types of problems you’re likely to face. This can help you help your team understand how quality is likely to be affected when scaling and what they can do to mitigate it.

Improve your agile coaching skills with a Training from the BACK of the Room

By Sabine Khan

Summary of the talk: It’s called back of the room because you’re not presenting from slide decks; instead you get the participants to stand and present. Hence coaching from the back of the room.
Really interesting approach to learning and using coaching skills in teaching others. The 6 learning principles are really quite easy to apply and can make almost any session interactive. This combined with the 4C’s gives you a framework to turn any learning topic into something more than just sit and listen.

For testers: If you need to help team members understand what exploratory testing is, why not do something interactive instead of just another slide deck? Teach them through doing and the key points might actually stick.

Culture + Code ≠ Delivery

By Vimla Appadoo @thatgirlvim

Summary of the talk: Vim makes a great point: delivering through code and a good culture isn’t enough, as this misses how that delivery affects users, especially if the system has biases built in unintentionally. She says we need communication as well. Being able to link together all the parts (the culture of the organisation, systems, processes and people, plus the code) can help us deliver the right things.

For testers: Testers help raise the awareness of quality within systems, but for them to be able to do that effectively they need to take into account the team, the business and users as well. The key skill to be able to do this is communication and helping to link together all the parts so that each understands the others. This is not to say that testers are the key to culture but on a smaller team scale they can have huge influence over what direction that culture moves in. Do the teams care about how their systems affect users or wait to see what happens?

How to learn from failure

Reading time 13 minutes

Below are my personal notes from Amy Edmondson’s excellent article Strategies for learning from failure. It’s a long read but I highly recommend it over my notes as it goes into a lot more detail than I have covered.

Summary

Not all failures are the same and categorisation of failures can make a big difference in enabling learning from them.

Why should testers care?

Considering we deal with software failure all the time we have a tendency to forget the human cost of failures. Especially in terms of how that failure occurred (the team), how that failure affects the users and the outcome for the business. This article is a great introduction to how we can learn from failure first and then how we could enable our teams and business to learn from them by reframing errors as different types of failure.

[Organisations] that catch, correct, and learn from failure before others do will succeed
Amy Edmondson

Amy classifies failure into three categories:

Preventable
Complex
Intelligent

But we have a tendency to view all failures as one type. In software testing we group them into different levels of risk but generally all failures are errors. Which means something isn’t right and should be avoided. We’ve started to try and learn from them but the need for interdisciplinary teams to do so is a cost that is often too high to pay so doesn’t happen very often. I think if we focused our efforts on investigating complex failures we could use the learnings to start minimising preventable issues and stop some of them happening altogether.

How should we respond to failure?

Some people believe that responding constructively to failures could give rise to an anything-goes attitude. They think that if people aren’t blamed for failures, then what else will make them try as hard as possible to do their best work? But blame tends to make people avoid failure and in some cases cover it up.

What we actually need is a culture that makes it safe to admit and report on failure (so we can learn from them) which coexists with high standards for performance (to make use of that learning to get better).

The blame game

If people see failure as something to be avoided you end up in the blame game. Which has a spectrum of reasons for failure from blameworthy to praiseworthy:

🤔 Notice how things that are blameworthy are about individuals but praiseworthy are all about the things.

I wonder how many times people don’t blame others but themselves for the failure and hence keep quiet or downplay issues when they occur?

To embrace failure we need to classify it better than the catch-all term that failure encourages. Amy Edmondson suggests these three categories: preventable, complex and intelligent failures.

Preventable

These are usually found in routine tasks that are well defined and the outcomes are well understood
Preventable failures tend to occur when we deviate from this routine
In software engineering certain routine tasks can and should be automated. Such as build processes and specific types of checks
If they do need to be performed manually then task lists and checklists are well suited to these types of tasks
- Note: exploratory testing falls under intelligent failures
Failures which result from these types of tasks can usually be mitigated through better understanding of the work we do, how we do it but most importantly why
When we spot these types of failures (deviation from the routine) we should immediately address them
This is in part about stopping errors from being passed down the process and building quality in

Complex failures

Many systems we work in are complex and too big for any one person and in most cases even groups of people to fully understand
This means complex systems can be unpredictable and ambiguous and fail in ways we could not have anticipated
The way in which complex failures occur can in some cases be traced to things all happening in just the right way
But assuming failures will never occur can be counterproductive and we should build into the process ways to handle what happens when things go wrong
When complex failures do occur we should recognise them as such and investigate them in a praiseworthy way to understand all the components that led to the failure and identify if any of the smaller issues that resulted in the failure can be made preventable
- For example
- Most accidents in hospitals result from a series of small failures that went unnoticed and unfortunately lined up in just the wrong way.

Intelligent failures

Named by the Duke University professor of management Sim Sitkin as intelligent failures
These are the failures that occur during experimentation
They help you understand what works and what doesn’t
- And importantly quickly
These are situations where the answers are not knowable in advance
The only way you can find out is to actually do it
Exploratory testing is all about raising awareness of intelligent failures
As Amy Edmondson calls them they are failures at the frontier
- Situations that haven’t happened before
- Or maybe won’t happen again
For software engineering this is a lot of the work that we are doing
- Hence agile software development so we can adapt to the changing environment
- To do things in a way that helps you learn from your work
- We should be producing lots of intelligent failures that help us learn about the system we’re building, the people that use it and the domain in which it is used
- Exploratory testing is all about exploring a system and seeing in what ways it can fail to better understand how it works

Small experiments over Big Bang experiments

At the frontier, the right kind of experimentation produces good failures quickly. Managers who practice it can avoid the unintelligent failure of conducting experiments at a larger scale than necessary.

Trial and failure?

“Trial and error” is a common term for the kind of experimentation needed in these settings, but it is a misnomer, because “error” implies that there was a “right” outcome in the first place.

Tolerance of failure

We need to be able to accept complex and intelligent failures and understand that doing so does not mean mediocrity. Tolerance is actually something that we need in order to be able to learn from these types of failures. The problem with failure is that there is almost always an emotional element to it and so needs leadership to enable the learning that needs to happen.

How do you learn from failure?

Leaders should insist that their organizations develop a clear understanding of what happened—not of “who did it”—when things go wrong.

This requires consistently:

reporting failures, small and large;
systematically analysing them; and
proactively searching for opportunities to experiment.

Anyone working on experimental work needs to clearly know that the faster we fail the faster we will succeed but most people don’t understand this subtle but important concept.

The quicker things fail the quicker you can pivot or try another idea that can succeed
But the longer that failure takes the longer you are executing on an idea that will not help your objective
What is the opportunity cost of working on one thing and not the other?

Some people may approach experimental work as if it’s well defined and understood, like production-line work where you need to produce the same thing over and over.

For example, statistical process control, which uses data analysis to assess unwarranted variances, is not good for catching and correcting random invisible glitches such as software bugs.

In a typical software team this would be predefined test cases or automated checks

There are three main ways to learn from failure: detection, analysis, and experimentation.

Detection

We need to detect and make issues visible earlier on in our processes before they become bigger issues later on

Don’t shoot the messenger

Unfortunately a lot of people are reluctant to raise issues early on in the process for all manner of reasons. The biggest culprit being people unwilling to take interpersonal risks in raising issues.

One of the best ways to combat this is for management to lead by example: not only encourage the raising of issues earlier on in the process, no matter how small, but also applaud the people that do and have a system in place to make something happen about it.

Another issue is a human tendency to not admit failure due to the stigma attached to it: “it failed therefore I’ve failed”. Therefore people keep going, hoping that things will get better when they should have admitted failure, or worse, they haven’t realised they’ve failed due to inadequate measures or goals when starting out.

Changing the stigma around failure is one way to improve the situation, for example by holding failure parties to encourage the reporting of failures and help people look at the situation in another way.

Example of how other organisations detect errors

Through speaking up supported by management, from Amy Edmondson:

In researching errors and other failures in hospitals, I discovered substantial differences across patient-care units in nurses’ willingness to speak up about them. It turned out that the behavior of midlevel managers—how they responded to failures and whether they encouraged open discussion of them, welcomed questions, and displayed humility and curiosity—was the cause. I have seen the same pattern in a wide range of organizations.

Building quality in

The idea of the andon cord from the Toyota production system is doing just this; noticing small deviations in process and correcting them there and then to constantly improve the system.

For software engineering this is all about building quality into the process instead of inspecting it at the end. Inspecting at the end is almost too late to make a difference due to the increased cost in time and cognitive load to make the change. This usually ends in discussions such as /users are never going to notice X/, /no one is ever going to do Y/ or /let’s see if it’s going to become a problem first/.

Analysis

Once failures have been detected it is important to not just look at the symptoms of the problem and move on but to dig into the root cause of the issues.

Unfortunately we tend to not want to do this as it can be painful to admit that something went wrong, especially if we are the cause of it, and it can negatively affect our self-esteem and confidence. There is also an element of interpersonal risk associated with admitting failure that can add to people not wanting to look at issues too deeply. “What if people think I’m incompetent?”

Culture is another aspect that needs to be in place for inquiry into failure to occur. Digging into failures needs:

inquiry and openness, patience, and a tolerance for causal ambiguity

But a lot of organisational cultures are geared towards actions and results, not the reflection needed for learning from failure.

We are also highly susceptible to the fundamental attribution error. This is where we downplay our responsibility and blame external factors when we fail and do the opposite when others do.

Amy’s research back in 2010 showed that failure analysis is often limited and ineffective. Sadly I think this is still the case for a lot of organisations.

Analysing complex failures is difficult as they tend to occur across teams and departments and due to the reasons listed above most people only focus on the symptoms rather than getting at the underlying causes of the failures. Therefore it’s best to use multidisciplinary teams to carry out the investigation with the support of management that you are looking at what happened not what someone did or didn’t do.

From the NASA Columbia disaster

A team of leading physicists, engineers, aviation experts, naval leaders, and even astronauts devoted months to an analysis of the Columbia disaster.
They conclusively established not only the first-order cause: (symptom)
- a piece of foam had hit the shuttle’s leading edge during launch—but also
second-order causes: (underlying reason)
- A rigid hierarchy and schedule-obsessed culture at NASA made it especially difficult for engineers to speak up about anything but the most rock-solid concerns.

Experimentation

A critical activity for effective learning is strategically producing failures—in the right places, at the right times—through systematic experimentation.

For scientists
* 70% of experiments will fail
* They recognise that failure is not optional but a part of the process
* And that failure holds valuable information that they need to extract and learn from /before the competition/ 🤔

In contrast when product companies design new products they plan for success. So they set up the pilot for optimal conditions that work instead of representative ones that they can actually learn from. Therefore the pilot only produces information about what does work, not what doesn’t.

From Amy Edmondson:

A small and extremely successful suburban pilot had lulled Telco executives into a misguided confidence.
The problem was that the pilot did not resemble real service conditions: It was staffed with unusually personable, expert service reps and took place in a community of educated, tech-savvy customers.
But DSL was a brand-new technology and, unlike traditional telephony, had to interface with customers’ highly variable home computers and technical skills.
This added complexity and unpredictability to the service-delivery challenge in ways that Telco had not fully appreciated before the launch.
A more useful pilot at Telco would have tested the technology with limited support, unsophisticated customers, and old computers.
It would have been designed to discover everything that could go wrong—instead of proving that under the best of conditions everything would go right.
Of course, the managers in charge would have to have understood that they were going to be rewarded not for success but, rather, for producing intelligent failures as quickly as possible.
What incentives are you setting up for your employees? The things you reward are the things you will get.

What makes exceptional organisations?

Exceptional organisations are those that go beyond detecting and analysing failures and try to generate intelligent ones for the express purpose of learning and innovating.

Can you think of any organisation that purposely injects failures into their systems to see how they behave? Hint: they named the tool after monkeys 🐒 and in the process created a whole new discipline: chaos engineering. These experiments don’t have to be that big either:

[you] don’t have to do dramatic experiments with large budgets. Often a small pilot, a dry run of a new technique, or a simulation will suffice.
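
As a rough illustration of how small such an experiment can be, here is a minimal Python sketch that injects failures into a single dependency and counts how often the caller degrades gracefully. Everything in it (FlakyService, fetch_price, price_with_fallback, run_experiment) is a hypothetical name made up for this example. It is not how any real chaos engineering tool works, just a simulation of the idea.

```python
import random


class FlakyService:
    """Wraps a callable and makes a fraction of calls fail on purpose (hypothetical helper)."""

    def __init__(self, func, failure_rate, seed=0):
        self.func = func
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so the experiment is repeatable

    def __call__(self, *args, **kwargs):
        # Inject a failure before the real call, with the configured probability
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected failure")
        return self.func(*args, **kwargs)


def fetch_price(item):
    # Stand-in for a real dependency; the data is invented for the example
    return {"apple": 30, "pear": 45}[item]


def price_with_fallback(service, item, default=None):
    # The behaviour under test: does the caller cope when the dependency fails?
    try:
        return service(item)
    except ConnectionError:
        return default


def run_experiment(failure_rate, calls=100):
    service = FlakyService(fetch_price, failure_rate)
    results = [price_with_fallback(service, "apple") for _ in range(calls)]
    return {
        "served": sum(r is not None for r in results),
        "degraded": results.count(None),
    }


if __name__ == "__main__":
    print(run_experiment(failure_rate=0.3))
```

The point of a sketch like this is the question it lets you ask cheaply: what does the rest of the system do when this call fails, and is that what we expected?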

recognise the inevitability of failure in today’s complex work organizations. Those that catch, correct, and learn from failure before others do will succeed
Amy Edmondson

How to break the rules?

Dan North gave a really interesting talk at last year’s GOTO Conference called How to break the rules.

He takes Eliyahu Goldratt’s 4 steps of how technology is adopted, which Goldratt presented in a series of lectures on why people didn’t take on the theory of constraints (TOC) ideas from his book The Goal.

Eliyahu stated that organisations need to go through 4 steps before a new technology can be successfully adopted:

What is the power of the technology
- What does it do for you?
What limitation does the technology diminish?
- What will it make better
What rules enable us to manage that limitation presently?
- And how much are we wedded to those rules
What new rules will we need?
- For the technology to succeed

Dan then takes these 4 steps and applies them to real companies that either succeeded or failed to take advantage of new technologies. The interesting part is, for the companies that failed, what rules (step 3) they needed to break to make use of the new technology.

I’ve made my rough notes on the talk available below but I highly recommend watching it.

My notes from the talk:

Talks about The Goal (book) and Beyond the Goal – lectures by Goldratt

Series of lectures about 20 years after The Goal was released
He attempts to explain why people didn’t apply his theory of constraints successfully, if at all.

First two lectures are: How to adopt Technology?

What is technology: The application of knowledge

For it to be adopted then answer these questions:

What is the power of the technology
- What does it do for you?
What limitation does the tech diminish?
- What will it make better
What rules enable us to manage that limitation?
- How much are we wedded to those rules
What new rules will we need?
- For the technology to succeed

He then goes on to apply these questions to real companies.

MRP – first application of computers in a business situation
- Calculated cost of materials for manufacture at DuPont
- Which allowed them to ship faster than others
- But when the competition tried it they couldn’t compete as to take advantage of the system you need to change the rest of your business
Goes on to apply to ERP and
Cloud computing (orgs moving to it)
But also Continuous delivery

Interesting: Dell became as big as it did because it could work in smaller batches than the competition. All the other companies offered you a fixed machine. Dell changed this by going online only and allowing you to customise your PC, and could ship faster than the competition.

Swedish saying: When talking to a farmer, use farmer’s words

Commenting on how we collaborate across divisions
If you want to collaborate successfully you need to either be speaking the same language or use words the other understands

A lot of organisations are failing with Agile/Kanban/Scrum etc because they still have all the existing rules in place (see point 3 earlier).

To be able to adopt the new technology you need to move to the new rules (point 4) otherwise you’re still doing the same thing.

Used the example of Amazon not doing cost accounting and moving to throughput accounting and flow of value

Cost accounting is what each department costs
Throughput accounting is what is being produced by the department and what value that provides

The problem with step 3 is that it eventually becomes the culture of the organisation and trying to unpick it then is really hard.

So How to break the rules?

Understand the power of technology
Recognise the limitation the technology will diminish
Identify the existing rules we use to manage the limitation
Identify and implement the new rules

Summary

Another great talk from Dan, and it goes to show that a lot of our problems in software development have already been issues in other industries.

Not only that, they have been solved, but we tend to overlook the solutions as they don’t look like our industry. What has manufacturing got to do with writing software? In all, don’t take my word for it: go watch the talk and start having a look at Eliyahu Goldratt’s body of work. He was onto something…

How to keep up to date with the testing industry?

I recently spoke at UKStar 2018 and the team there asked me a few questions on how I keep up to date in Testing and the industry as a whole which I’ve reproduced here.

What is your favourite testing book/blog? Why is this your favourite?

Two books come to mind. The first is Thinking, Fast and Slow by Daniel Kahneman. This book is about how we make decisions and the fact that we are not really all that good at doing it, let alone realising what we are basing our choices on.

Secondly is the Freakonomics set of books by Steven Levitt and Stephen Dubner. Freakonomics is a great way to see how you can go about answering questions that seem impossible at first; but by using some more unconventional methods, you can answer almost anything.

What I enjoy about these books is that neither of them is about testing or software development but the questions that they both aim to answer are so relevant to the work we do.

How do you keep up to date with the software testing industry?

Whenever I’m asked this question it reminds me of what Martin Fowler always says to engineers when they ask him how they can become better developers. His response is always “understand the business that you work in”. So rather than focusing on testing I take a broader view and look at our industry as a whole.

I tend to follow a number of bloggers whose insight into our industry I have found invaluable.

Ben Thompson’s Stratechery blog has some excellent in-depth analysis on the software industry as a whole and how some of the larger organisations (Google, Amazon, Facebook, Apple etc) operate.

Benedict Evans’s weekly newsletter and Charles Arthur’s daily Overspill site are also great for finding technology focused articles from around the web that you probably would have missed, or never even come across, always with some illuminating commentary.

Twitter is a great source of information too. I tend to build up my feed by following people both inside and outside of the tech industry to get a broad range of ideas and influence.

What is the biggest misconception about testing that you’ve heard?

That automating all your testing will speed up the delivery of your products. A lot of teams still see testing as that bottleneck at the end of development stopping the release. So if you automate all your testing you will be able to ship faster.

What they almost always fail to see is that their releases are either too big, so the test team can’t isolate and feed back on the risky areas, or they still see testers as simply running checklists against the product. Hence thinking that they can be replaced with automation. Testers always do so much more than this but their work has had a history of being boiled down to “Have all the tests passed?”.