Test Automation: Don’t report the bugs it catches

Reading time: 3 minutes

Don’t report the bugs your test automation catches. Report the reduction in uncertainty that the system works.

When you report the bugs, you send the signal that test automation is there to catch bugs. But that’s not what it’s for. Test automation is there to tell you whether your system is still behaving as you intended it to.

What are automated tests for?

Each automated test should cover some isolated aspect of the behaviour of the system. Collectively, these tests tell you that when you make a change to the system, it still behaves as you want it to. What automated tests do is reduce your uncertainty that the system still behaves as you expect.

Framing test automation as reducing uncertainty

Framing test automation as reducing uncertainty helps emphasise that there are always things we don’t know. Framing it as increasing certainty, by contrast, can give the impression that we know more than we do.

Framing testing as increasing certainty
Framing testing as reducing uncertainty

What happens when a test passes or fails

When an automated test passes, it sends a signal that this specific behaviour still exists, reducing some of your uncertainty about whether the changes you made have affected it.

When a test fails, it signals that this expected behaviour didn’t occur, but that’s all. It doesn’t tell you whether the failure is a bug or simply a consequence of the change you made to the system. Someone still needs to investigate the failure to tell you that.

So what we should report is to what extent our uncertainty has been reduced by these tests. But how do we do that?

How to frame test automation as reducing uncertainty

A good place to start is to help people understand what behaviour is covered by the tests. For instance, you could categorise the behaviour of your system into three buckets: primary, secondary and tertiary.

Primary could be things that are core to your product’s existence. For a streaming service, this could be video playback, playback controls and sign-up. Tests in this bucket must pass before a release can be made.

Secondary could be behaviour that supports the primary behaviours: if it didn’t exist it would be annoying at most, but the core features would still function. For example, searching for new content or advanced playback controls (think variable playback speeds). Tests in this bucket can fail without rendering the application unusable, and issues discovered here can be fixed with a patch release.

Tertiary behaviours could be experiments, new features that haven’t yet been proven out or other less frequently used features that are not considered core. Tests in this bucket can also fail and don’t have to be fixed with patch releases.

But be careful of accessibility behaviours falling into the secondary and tertiary buckets. They might not serve your biggest group of users, but those features are critical for others to be able to use your systems.

Defining these categories is a team exercise with all the main stakeholders, as it is key that they have a joint understanding of what the categories mean and what behaviours fall into them.

Then, when you report that your primary and secondary tests are passing, you signal that the core and supporting features are behaving as expected. This reduces the team’s uncertainty that the system behaves as intended, and you can then decide what you want to do next.
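The bucketing scheme above can be wired into a simple release gate. The sketch below is purely illustrative: the test names, bucket labels and result format are assumptions, and in practice you might tag tests using framework features such as pytest markers instead.

```python
# Sketch: gating a release on behaviour buckets. Only a failing
# "primary" test blocks the release; secondary/tertiary failures
# are reported for follow-up (e.g. a patch release).

def release_blocked(results):
    """results: iterable of (test_name, bucket, passed) tuples."""
    return any(bucket == "primary" and not passed
               for _, bucket, passed in results)

results = [
    ("test_video_playback", "primary", True),
    ("test_sign_up", "primary", True),
    ("test_search_new_content", "secondary", False),  # fix in a patch
    ("test_experimental_ui", "tertiary", False),      # can wait
]

print(release_blocked(results))  # False: no primary test failed
```

Reporting then becomes a statement about uncertainty ("all primary behaviours verified, one secondary behaviour unverified") rather than a bug count.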

Exploratory and Automated testing: Using the right techniques in the wrong contexts

Reading time 2 minutes

Exploratory testing is about testing in an unpredictable context and therefore detecting unpredictable failures in our software. Automated testing is about testing in a predictable context and therefore detecting predictable failures. The mistake we make with automation is that we try to apply it to the wrong context. You can’t use testing methods developed for a predictable context in an unpredictable environment.

While there is nothing physically stopping you, neither practice is particularly efficient if used in the wrong context. Exploratory testing in a predictable environment would just confirm what you already knew, only slower and less consistently on repetition. Automated testing in an unpredictable environment would lead to false negatives.

It’s not a one-size-fits-all solution either, as we work in both contexts: predictable when initially developing the software, and unpredictable once it is running in the live environment.

The only way to replace exploratory testing with automation is to make the test environment predictable. But that would mean you are now detecting predictable issues, which negates the outcome you were looking for: detecting unpredictable or complex failures.

Testing in unpredictable contexts

The best way to detect unpredictable failures is to use methodologies that can operate in an unpredictable environment. 

One of the best-known methods is exploratory testing (sometimes called manual testing), but there are other techniques too:

  • Monitoring of the live environment, which is good for issues we can predict in an unpredictable environment.
  • Observability, using logs, graphs and other telemetry to see how the system is behaving in the live environment. This is helpful for issues we can’t predict and need to debug in the live environment.
  • Phased rollout of features, using techniques such as feature toggles, blue/green deployments and canary releasing. This is useful for limiting the impact of unintended issues in an unpredictable environment. Basically, anything that allows you to slowly enable a feature for subsets of users.

Using monitoring and observability in conjunction with phased rollouts can greatly improve your ability to understand and limit how new code behaves in unpredictable environments. 
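As a sketch of the phased-rollout idea, here is a minimal percentage-based feature toggle. The feature name, user ID and hashing scheme are illustrative assumptions, not the API of any particular toggle library.

```python
# Minimal percentage-based feature toggle: hash user + feature into a
# stable 0-99 bucket, so the same user keeps the same answer as the
# rollout percentage is widened (10% -> 50% -> 100%).
import hashlib

def feature_enabled(feature: str, user_id: str, rollout_percent: int) -> bool:
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent

# Widening the rollout only ever turns the feature on for more users,
# never off for users who already had it.
for percent in (0, 10, 50, 100):
    print(percent, feature_enabled("new-player", "user-42", percent))
```

The deterministic bucketing is the important design choice: it keeps each user’s experience consistent while monitoring and observability tell you how the partially rolled-out feature is behaving.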

Testing in predictable contexts

This is not to say automated testing has no value: it can help detect smaller predictable issues which, if left unchecked, could develop into larger unknown failures that only occur with the right mix of other smaller issues. Some issues may be within our control (software we develop) and some outside of it (other people’s software). For software in our control (a predictable environment), automated testing is almost a perfect match. For software outside of our control (an unpredictable environment), contract testing, exploratory testing, monitoring and observability, and phased rollouts are preferable.

Control and isolation

Next time you’re looking at testing techniques, think about how much control (and therefore isolation) you have over your test environment. The greater the level of control, the more automation you should consider; the less control you have, the more you should consider exploratory testing coupled with monitoring, observability and phased rollouts.

Testing techniques

The following diagram will help you see how different testing techniques stack up against each other. This is by no means an exhaustive list, and it only compares the techniques on speed of feedback, value of feedback and testing environment. So the next time you get into a discussion about testing, you could use these characteristics as a good way to frame it.

Testing techniques plotted on a speed, value and environment axis

Are there other testing techniques that should be plotted on the chart?

Do you agree with the axes? Is there another, more important characteristic of testing that should be captured?

How would you plot the testing techniques?

My biggest takeaways from AgileTD 2020: The future of testers isn’t in automation or testing

I was lucky enough to speak at AgileTD this year and also attend some of the talks. These are my main takeaways from the conference based on the talks that I was able to make.

My confirmation bias sense is tingling with this but…  

The future of testers is not in automation or testing. Both will still play a part, but not as big a part as helping teams build quality in.

Most teams see testing as either bug hunting or just another cost centre that needs eliminating. Therefore testers (at all levels) need to get much better at communicating the value of testing.

As testers we need to start shifting our skill set from doing the testing to advocating for testing within teams.

The skills we need to develop will take time to build as it’s not a matter of just attending training but having hands on experience of using and applying the skills.   

Otherwise testers risk becoming irrelevant, as teams begin to form without the need for testers if (or when) the next shift happens.

What is working in our favour is the slow shift to adopt new development ideas such as those expressed in Accelerate. But also teams figuring out how to really collaborate and not just cooperate. Think of the dance of passing tickets around that happens in a lot of development teams.

Which talks should you take the time to watch?

So which of these talks further led me to believe the above? Let me break it down:

The future of testers is not in automation or testing

That is not to say it will go away, but it will not be the main objective of our roles.

Is not automation: Automation Addiction by Huib Schoots and Paul Holland (Day 1)

  • A lot of people’s addiction to automation appears to come from automation tool manufacturers’ marketing (promising the world) and the sunk cost fallacy (making it hard for people to stop once they’ve started). I’d also add job specs asking for automation with no rationale as to why it’s wanted
  • It is good for some things, generally things where we know how they should behave, and especially when we can isolate them from the UI.
    • UIs can behave in unpredictable ways, so they are not always the best place to put automation that needs to be consistent and reliable
  • So what do you do?
  • Focus on teams and start small: 
    • (Focus on) exploratory testing,
    • (Start small with) a good test strategy that includes what is and is not to be tested
  • Automation should be focused and isolated

Is not testing: Let it go by Nicola Sedgwick (Day 3)

  • We as testers need to let go of testing and start focusing on how we help teams understand what quality is and how they build it in
  • Nicola does this by being a Quality Coach and using Quality Engineers embedded in teams to help them mitigate the risks
  • This was a great talk and something lots of others have been advocating.
  • I think we still need to better define the Quality Coach and Quality Engineer roles but we have to start somewhere
  • I’ve written a little about what testers could do next
  • You can also learn more about Quality Engineering from my TestBash Manchester talk (paywalled)

Also see

  • Testing is not the goal! By Rob Meaney (See below for more)
  • Beyond the bugs by Rick Tracy (See below for more)

Communicating the value of testing

How to pitch and value testing properly in the age of DevOps by Bjorn Boisschot (Day 1)

  • A fairly simple and effective approach to getting the test team behind a testing vision that they can then use to describe what it is that they do.
  • Rather than the typical dry approach of traditional testing (test scripts, reports, bugs), which all reinforce the bug-hunter view of testing
  • This gets testing focused more on what the organisation is trying to do (think company mission statement) and orients the testing vision towards that
  • This helps others understand that it’s more than just bug hunting: it’s about helping make decisions on the quality of the products and how they affect end users
  • His approach was to create a testing mission based on the company vision statement. With a focus on the why of testing and not the what or the how (see Simon Sinek: Start with why). From there they created a number of goals that would help them achieve that mission. Then they used the goal, question, metrics technique to make it measurable.
  • For some in the org this approach made testing much more accessible and greatly improved their view of it.
    • But for others, well, they still didn’t care 

Beyond the bugs by Rick Tracy (Day 3)

  • As senior members of the test team we need to help our testers understand what value they bring to teams. Then give them the tools (verbal and written communication skills) to make their value relatable to other roles. Otherwise they are very likely to be seen as bug hunters and a cost that can be eliminated.  
  • Really fascinating talk where he showed how everyone outside of testing views our roles (bug hunters that cost money). He then showed how we need to cover three main arguments for others to see the value we bring: conceptual (does it make logical sense to them), practical (how can they and others use it) and monetary (what does it cost and what’s the ROI).
  • He then applied these three arguments to different testing scenarios from doing no testing at all to shifting testing as far left as possible and doing it earlier and earlier in the process. Through this he showed how the initial investment in testing increased but would dramatically decrease the costs later on in the process due to issues being found earlier and therefore easier and cheaper to fix.
  • This all reinforced the idea that testers do much more than add costs to projects and find bugs, such as:
    • Manually finding issues that would otherwise affect users 
    • Testing earlier on in the process before code is written by testing requirements and designs to prevent issues entering into the systems earlier
    • Raising levels of team understanding of the product, the processes they use and potential issues that could be introduced
    • Highlighting types of risk that could affect the team and acting as a form of insurance against that risk
    • Making testing relatable to non testers 
    • Providing sources of information for innovation and improvements within the teams ways of working 
  • But all of this doesn’t just happen. You have to invest in your testers (and they in themselves!) for them to be able to do this: their technical skills, improving their awareness, understanding risk, alignment with the org, etc. IMO: if you keep testers ‘dumb’ and just bug hunting, then that is all you will get
  • He then linked these investments to potential measures so you can see if your investment is paying off, and so testers can see their own improvement. Cycle and lead times were two areas that came up quite often
  • These measures were then linked to business value, the two main ones being faster time to market and improved customer trust in the product.

The skills we need to develop

These skills are not limited to just these talks, but they are great examples of what those skills are

How to keep your agility as a tester by Ard Kramer (Day 1)

  • Great talk about how he uses the four virtues of stoicism to be a better tester. I actually think this would help a lot of people within development teams, so if you’ve not heard of it before I recommend checking it out.
  • This looks like a good resource: https://iep.utm.edu/stoiceth/. The talk focused on just the four virtues of wisdom, courage, justice and moderation

Also see 

  • Extreme learning situations as testers (Day 3)
  • How to keep testers motivated by Federico Toledo (Day 3)
  • Beyond the bugs by Rick Tracy (See above)
  • Testing is not the goal! By Rob Meaney (See below)
  • Introducing psychological safety in a tribe (See below)
  • Growing Quality from Culture in Testing Times by Tom Young (See below)
  • Faster Delivery teams? Kill the Test column by Jit Gosai (See below)

Adopt new development ideas

Testing is not the goal! By Rob Meaney (Day 2)

  • From testability to operability to observability: his journey learning these techniques and how teams have been able to make use of them.
  • I think one of the really interesting points he made was understanding where your team is in their development life cycle.
    • Are they just starting out, or are they an established team and product?
    • Where you are in this cycle will affect the level of testability, operability and observability you need.
    • All three are about managing complexity, and when you are starting out complexity isn’t the problem: product-market fit is.

Also see

  • Faster Delivery teams? Kill the Test column by Jit Gosai (See below)

How to really collaborate and not just cooperate

Growing Quality from Culture in Testing Times by Tom Young (Day 1)

  • Great story from Tom Young on how the BBC News mobile team has grown over the years and how focusing on their team culture has been one of the best ways to build quality into their product. All the way through the talk, Tom shouted out how the whole team helps deliver their product

Faster Delivery teams? Kill the Test column by Jit Gosai (Day 2)

Introducing psychological safety in a tribe by Gitte Klitgaard and Morgan Ahlström (Day 3)

  • Hearing how other people have tried to address psychological safety in organisations was very interesting. There was a lot in the talk that I recognised from Amy Edmondson’s work (Teaming and The Fearless Organization). They didn’t use Amy’s definition of what psychological safety is, but from what I’ve seen all the definitions are almost the same. Simply put: are people willing to take interpersonal risks within group settings? If so, they have psychological safety; if not, they lack it.
  • The thing that stood out for me was that all these types of initiative take time and constant work. They are not things where you run a workshop, hand out a few questionnaires and you have safety.
  • Also, psychological safety is a very personal thing, so what one person feels is not the same as what another feels in the same team.
  • There is also a lot of misconception around psychological safety: people assume that psychologically safe environments no longer have any conflict and are all about everyone being comfortable. This is not the case.
    • PS environments are about being able to share your thoughts and ideas without the worry that it could be used against you in some way.
    • The main reason for PS is to establish environments conducive to learning from each other – which is what is needed for the knowledge work that we do 
    • But to learn effectively you need some level of discomfort
    • Too much discomfort and it can tip into fear, which causes the fight-or-flight response
      • and you’re not learning anything other than self-protection
      • The best way to protect yourself? Don’t say anything that could lead to a situation that causes conflict… 
    • So PS environments are about people being able to work through conflict productively that can lead to new insights and ideas

There were many, many more talks at the conference (perhaps too many) that I wasn’t able to make, and that’s not including the workshops, so it is worth looking through the programme and seeing what stands out for you.

Think I missed a talk that should be in the list above? Let me know in the comments.

What is it about that particular talk that makes you think it should be included?

What above do you disagree with?

Building a quality culture: Is it quality assurance or quality awareness?

5 minute read

If you ask testers what QA stands for, most are likely to say Quality Assurance, which is typically described as providing confidence that some quality criteria will be fulfilled. Some people in the testing community believe that it actually stands for Quality Awareness*. The thinking goes: how can testers assure the quality of something they never built in the first place? All they can do is make the team aware of its quality. I agree with both explanations and believe that they are different sides of the same quality coin.

Read on to see what this means for testers and how it’s useful to consider both in teams.

*I’ve also heard Quality Advocate and, my favourite, Question Asker; both fit this model of Quality Awareness.

Firstly some definitions:

What is Quality?

Quality is value to someone – Gerald Weinberg

But who is that someone and what does value mean?

From the Lenses of quality on who that someone is:

…that someone could be their Products Owners (PO), the organisation they work for, the team they work with and their end users and all these groups of people could have very different views on what value means to them and even contradicting in some cases.

https://www.jitgo.uk/lenses-of-quality/

Again from the same post on what value is:

Each of these groups of people view quality with a different lens therefore see the same system differently to one another. We as testers should help our teams to see quality through these different lenses by helping them identify these groups and what their measures of quality are.

https://www.jitgo.uk/lenses-of-quality/

Simply put, value will depend on the viewpoint of the person. Identify the viewpoint through which that person sees the product/system and you’re halfway to working out what’s valuable to them. If you take this a step further and see what incentives drive that person’s viewpoint, you might identify what’s valuable to them too. But that’s not as easy as it sounds.

For any given team there are multiple members and stakeholders, so there are likely to be a number of unique and overlapping quality attributes too. Identifying the key ones for each cohort of people is a valuable exercise for any team. It might even help explain why some people are never happy no matter what you deliver.

What is Quality Assurance?

Firstly, what does assurance mean? From the Oxford dictionary:

a statement that something will certainly be true or will certainly happen, particularly when there has been doubt about it

One way to think about assurance is that it’s a promise that an outcome will happen, so as to give others confidence. In this case the outcome is quality and, as mentioned earlier, quality means value to someone. Therefore QA, or Quality Assurance, means providing stakeholders with confidence in the quality of the product. Thus it is about confidence that an outcome will happen, not a guarantee: you are saying that best efforts will be made, and this is an assurance of making that happen.

What is Quality Awareness?

What does awareness mean? From the Oxford dictionary:

knowing something; knowing that something exists and is important

interest in and concern about a particular situation or area of interest

This would lead QA, or Quality Awareness, to be a person who is interested in quality (value), understands its importance and has knowledge about the quality of a product or system. To take this a step further, someone who works in Quality Awareness understands that quality is value to someone, knows who those people are, what viewpoints they hold and possibly what incentives drive those views. They are then able to apply this knowledge to the system and subsequently increase their team’s awareness of overall system quality. Essentially, Quality Awareness is about building awareness in a team of what quality is, how it is affected and who it matters to.

Quality Awareness sits at the intersection of the team that produces the system, the domain in which the system operates and the users of the system.

Two sides of the same coin

Both are focused on quality, but one is about improving the team’s understanding of what quality means for their stakeholders, while the other is focused on maintaining (and hopefully improving) the quality of the product.

In this scenario, Quality Awareness can improve Quality Assurance by giving it the metrics by which quality is assessed. In this model, Quality Assurance can actually fulfil its job of providing confidence that the quality of the product is being upheld. Why? Because it takes into account who the stakeholders are and what is valuable to them, then converts that value into a measurable metric which the engineering team can either:

  1. assess themselves against, to make sure they are doing what they said they would do, or
  2. provide to the stakeholders to improve their confidence that the engineering team is not only doing its job but maintaining, and perhaps even improving, quality.
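As an illustration of turning a stakeholder value into a measurable metric, here is a small sketch. The value (“video playback starts quickly”), the 2-second threshold and the 90% target are all made-up assumptions, and the result is a proxy measure of the value, not the value itself.

```python
# Proxy metric sketch: the stakeholder value "video playback starts
# quickly" becomes "at least 90% of playback starts complete within
# 2000 ms" -- threshold and target are illustrative assumptions.

def target_met(start_times_ms, threshold_ms=2000, target_ratio=0.9):
    within = sum(1 for ms in start_times_ms if ms <= threshold_ms)
    return within / len(start_times_ms) >= target_ratio

samples = [850, 1200, 950, 3100, 700, 1100, 900, 1500, 800, 1000]
print(target_met(samples))  # True: 9 of 10 starts are under 2 seconds
```

The team can assess themselves against this check, or report it to stakeholders as evidence that the value they care about is being upheld.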

The thing to keep in mind is that not all quality values can be converted into an easy-to-measure metric. You can use proxy measures to give you an idea, but some measures are inherently subjective. On top of this, the systems we build are interdependent with other systems that are out of our control and can affect the quality of our systems. Therefore techniques such as exploratory testing can be very beneficial, as they help build a fuller awareness of what quality means for your product.

Back to Quality Assurance?

Does this mean we should go back to using QA again and naming ourselves the QA Team? No. We’ve come a long way in some areas of our industry, and going back to QA teams might bring back all the old problems: test-team silos, testers as gatekeepers and “why didn’t we catch that bug?”

How is this helpful?

Where this can be useful is in giving teams another lens through which to look at their testing approach and ask: is this heading towards building assurance of quality, or is it about raising awareness of quality? By separating activities into these two camps you can see the value each is actually going to bring and whether it’s worth the investment. It might also help clear up who should be doing what, and when.

Using this model of awareness and assurance could be helpful for testers trying to figure out what they want to do with their careers. Do you want to learn more about building team confidence in quality through test automation (Quality Assurance), or about building a quality culture within teams (Quality Awareness)?

How to document Unit Testing

Whenever you talk about unit testing with teams, they never tell you what it means to them. They go straight to “of course we do” and show you hundreds of passing tests. The interesting thing is that by calling it unit testing, everyone thinks they are talking about the same thing. But when you start digging into how they understand it, you begin to see that everyone talks about it and understands it differently.

What do the unit tests test?

A selection of responses to the question what do the unit tests test

A unit means different things to different people, but we never stop and ask: what do a unit and unit testing mean to you? Why? Well, that could be risky, as you’re potentially questioning someone’s ability. Which probably says more about psychological safety in your team, but that’s a topic for another day.

A Unit means different things to different people

So what should you call it then? Well, maybe as a stopgap just call it what it is: a test that checks code, a code test. Now I know what you’re thinking: “that’s way too generic!” Which is kind of the point, because when you do that the first thing people ask is “What’s a code test?” Now you can start the discussion without anyone feeling that you’re questioning their ability.

What are Code tests?

How do you build a team understanding of what it is?

One of the best and easiest ways is to get the team together and pose them three questions:

Three questions to ask teams about unit testing
  • What does a unit mean to you in unit testing?
  • What characteristics make a good unit test?
  • What characteristics make a bad unit test?

Hand out sticky notes or use whatever online tool your team prefers (Miro is a pretty good online collaborative whiteboard). Then ask each question, one at a time. If you can do it in person, a big room with lots of wall space is best, as it allows people to talk to one another during the idea-generating stage. Allowing them to talk is advantageous, as people will build on top of each other’s ideas, but this may not be practical for distributed teams.

Building a Team Understanding of Code Tests

Once everyone has had a chance to contribute, group and theme the responses. Then, as a team, look through them and see if there are any contradictions or if anyone strongly disagrees with the groupings. If there are, this is a perfect time to build the team’s understanding of what code testing is.


If you’re looking for some inspiration, then watching Ian Cooper: TDD, Where Did It All Go Wrong and J.B. Rainsberger: Integrated Tests Are a Scam as a group are good places to start. Both these talks are quite old now, so more up-to-date versions may be available.

You may find that you need to run the sticky-note exercise again to build consensus, but essentially you want the group’s agreement on what a unit is and what the good and bad characteristics of a test are. This will give you a high-level understanding of what a code test is.
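To make the “characteristics” discussion concrete, here is an illustrative example; the discount rule and test names are made up, not from any real code base. The tests each check one observable behaviour, have intent-revealing names, and depend on nothing outside the test, which are the kinds of characteristics a team might agree are good.

```python
# Illustrative "code tests" showing characteristics a team might agree
# are good: one observable behaviour per test, intent-revealing names,
# no databases, clocks or ordering dependencies.
import unittest

def discounted_price(price: float, is_member: bool) -> float:
    """Members get 10% off; everyone else pays full price."""
    return round(price * 0.9, 2) if is_member else price

class DiscountedPriceTests(unittest.TestCase):
    def test_members_get_ten_percent_off(self):
        self.assertEqual(discounted_price(100.0, is_member=True), 90.0)

    def test_non_members_pay_full_price(self):
        self.assertEqual(discounted_price(100.0, is_member=False), 100.0)

# A "bad" counterpart would assert on internals (e.g. that a private
# helper was called) or share mutable state between tests.
```

Whether your team’s agreed characteristics match these is exactly what the sticky-note exercise is meant to surface.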

What do you do once you have group agreement?

So you’ve got a high-level understanding, but you need to turn that group understanding into something more solid. Something that gives them:

  • Alignment with each others understanding
  • Autonomy with how they actually implement code level tests
  • But also Accountability, so it is not only their responsibility to do it but to do it well
autonomy, alignment and accountability

You could just say “look at the code for examples”, but as we’ve seen, this isn’t always the best way, as the intent behind the code may not be clear to everyone who reads it.

Ideally it would be something lightweight, but not so light that it’s too open to interpretation (e.g. sticky notes), and not so heavy that no one ever reads it (e.g. a 10,000-word essay hidden in a wiki).

Lightweight documentation

We need to document it in a way that is quick and easy to read and therefore remember.

The best way to demonstrate this is through an example. Now, this example isn’t describing code testing (you need to have that discussion with your team first), but it has all the elements we are looking for.

Example principle

The title is short and to the point, which makes it easier to remember, but it also acts as a super-short summary of the principle itself.

The first paragraph describes what it is about. The language used is really easy to understand too. It takes no effort to read and comprehend. This allows the reader to spend more time understanding the content rather than trying to decipher the words used.

The second and third paragraphs detail good and bad behaviours respectively. Finally they have a list of links that show where they have demonstrated this behaviour.

The great thing about this structure is that each part builds on the previous one. The title is built on by the description, the good and bad behaviours build on the description, and the links give concrete examples of those behaviours so the reader can see them in action, or even add their own.

Back to the sticky notes

What To Do With The Sticky Notes

You might have worked this out already, but those sticky notes will map onto this simple title/description/good/bad framework quite easily. The “what does a unit mean to you” responses would be used to write the description of what unit testing is. The key points from the good and bad characteristics would make up the good and bad behaviour descriptions. Finally, all those unit tests you have should be used to demonstrate where those good and bad behaviours show up in your code base. You’re on your own for coming up with a snappy title.

Autonomy, Alignment and Accountability

You’ve got your lightweight documentation, but how does it relate to creating team autonomy, building alignment between developers and making them accountable for their actions?

Building Alignment through a common language

The description is all about what a unit is and gives a common language for the team to use when talking about code testing. This helps to build alignment between team members.

Creating Autonomy through why not how

The good behaviours say nothing about how to write good code tests, just what makes a good test within this team. Hence the focus on characteristics during the sticky-note session. The good behaviours coupled with the bad act as guard rails: more of what we do want and less of what we don’t. This preserves developer autonomy, as they still have to work out how to actually do it. If they are unsure, they have links to where the team has actually implemented tests that demonstrate this behaviour, or they can always speak to the other developers.

Autonomy & Alignment enables Accountability

By documenting the principle using easy-to-comprehend language to build a common team vocabulary, and by describing behaviours instead of instructions to create autonomy, you reinforce that the development team is accountable for enabling the principle. Not only that, it makes it much easier for people to find more information and lowers the barrier to approaching the subject in the first place.

One of the great things about documenting things is that you can point at that thing and say you don’t agree, which is a lot easier than pointing at a person and saying the same thing.

How Does This Map Onto Alignment Autonomy Accountability?

In Summary

By following this model you can begin to create a shared understanding of what unit testing means to the team and a unified language so that they can talk about it. It also lowers the barrier to understanding the approach for others, which really helps to improve the overall team confidence in what unit testing does and doesn’t give them.

Documenting your team’s understanding of unit testing using this lightweight model means that when people eventually leave, that knowledge doesn’t leave with them or slowly erode from the team’s memory. Another benefit is that as new members join the team they can use it to build up their understanding of how the team approaches unit testing.

There is a risk that the information becomes outdated, but you could use new joiners as motivation for the team to revisit old principles and see if they are still valid or need updating. You never know: by including the new joiner in this process they may add something you hadn’t considered before, and it gives them an opportunity to start positively contributing to the team. At a minimum it kick-starts the conversation again and allows the team to revisit old assumptions and behaviours.

You could also use this model to document other principles that the team would like to work by, all while maintaining their individual autonomy, alignment with one another and emphasising that it’s up to them to make it happen.

Now you can see if it still makes sense calling them code tests, unit tests or something else altogether.

What is Contract Testing?

And Consumer-driven contract testing

This is a follow on from Contract testing: Why do it

First some quick definitions:

Consumer
Someone (a dev team, for instance) that makes use of a third-party component or a combination of components (a system). They consume the service provided by the component/system.

Producer
The people (a dev team) who build the component or system and make it available for others to use.

Test double
To keep the tests fast you will be using a test double of the producer in the majority of your tests. More specifically a stub: something very simple that responds how you tell it to.

Remember: don’t mock what you don’t own.

Avoid using mocks for contract tests: if you attempt to mock the behaviour of your producers you’ll be creating another job for yourself. Always think of the producer as a black box, so don’t make assumptions about how its internals work. That is not your responsibility. A stub should be simple and easy to understand, and will generally just reply with a canned response.

What is a Contract test?

Contract tests are automated code-level tests written from the viewpoint of the consumer. They check that the producer exists, responds to a given request and responds in the format expected by the consumer. A simple rule could be:

  • For every unique call you make to the producer write a contract test
  • If output from a producer is going to cause a unique behaviour change in you (the consumer) then write a contract test e.g. an error condition would fall into this category

They shouldn’t go further than this and begin to check that the response contains all the correct data, or the behaviour of the producer. That’s the job of the producer, not the consumer. The producer is a black box to the consumer: simply input and output. Whatever transformations happen to the data inside the producer are unknown.

If you do test that a response contains the correct data then only test very specific types of data, specifically ones that, if they were not returned, would cause problems that your system should handle gracefully.

How will Contract testing help?

Focused
If the tests follow the guidance above they will focus on just the boundary at which the consumer and producer interact. Therefore if they fail you know exactly where the problem is, and also what the issue is, as they cover only a small area of interaction. This will allow you to quickly identify if the problem is with your integration of the producer or some other part of your code.

Fast
These tests will be written at the code level, usually with a native unit testing framework for the language you are working with. The vast majority of the tests will also execute against a stub to keep them fast; running them against the actual producer would be slower. Also, because each test is so focused on the interaction boundary, each will run well under a second, allowing the whole suite to execute in a matter of seconds.

Reliable
Circle of control / Circle of influence
Due to the simplicity of the tests the number of false positives is very low, and they will only fail if something has changed within the test, your interaction with the test double, or the test double itself. Everything is now within your circle of control, therefore any brittleness can be remedied quickly and easily.

Automated documentation
You now have tests that document your usage of the dependency. They are also executable, so they will stay up to date with every change you or your dependency makes.

Running the tests

These tests can now easily be kept as part of the main suite of tests within the code base and run through the development pipeline as usual. Any change to the code base will result in the whole suite of contract tests running and letting the dev team know if there are any issues.

Occasionally you will also want to run the contract tests against the real dependency, separately from the main build pipeline, just to let you know if the contract has changed and that your test double is still a true stand-in for the real thing.

New version of the dependency

Now when a new version of the dependency is released you can run the contract tests against it and check for breaking changes. If no issues are detected then perhaps do some light exploratory testing of the changes detailed in the release notes.

If running the contract tests does detect an issue then it should be quick and easy to pinpoint where the issue is (you or them) and what the necessary mitigation steps should be (fix in your code or reject the release). All this while keeping your build pipeline running and your code base shippable.

If an issue is detected in the live environment then it’s going to be easy to know what changed and how to fix it. Which could be either fixing forward or backing out the change.

Confidence for the Consumer team

Contract tests allow the consuming team to move to a new version of a producer much quicker and with greater confidence than before. If something in the release notes still looks risky then your test team can carry out focused exploratory regression testing and, if possible, put the update behind a feature flag for a controlled release to your end users.

What is Consumer-driven Contract Testing?

The thing with contract tests is that they are very much in the consumer domain. If the producer makes regular releases which result in the contract tests failing often, then on one hand at least you know before taking the actual update, but on the other you still can’t take it without workarounds or additional releases from the producer. This may lead you to thinking about a new supplier. Why even bother with all the pain of writing contract tests when you knew this already?

What if you could help your producer see that each update is going to cause you issues before they even make a release? What if they told you prior to making the release that they need to introduce a breaking change, or better yet that the current API will be deprecated after a certain date/version, allowing you to move to the new API in your own time? What if you could work with your producers collaboratively so that they get what they want (easy and quick uptake of new versions) and you get what you need (new bug fixes/features, improved confidence in each update working as intended, less time testing)? This is where Consumer-driven Contract testing can help, and where the benefits of Contract testing really start to show.

Benefits of Consumer-driven Contract Testing

As mentioned earlier, the contract tests sit in your circle of control; that is, everything in this domain is in your direct control. The producer, however, is out of your control but can be in your circle of influence.

  • Note: the level of influence you have over your dependency will depend on your overall relationship. If they are within the same organisation then things may be easier. Outside of your org but a supplier you have a financial contract with will probably require some contract negotiation, so not impossible but still some effort. No financial contract and just something you use through an open-source licence, then contract testing is likely all the relationship you will have.

One of the ways to start moving your producers into your circle of influence is to start a dialogue with them around your contract tests. These tests will show the producer exactly how you integrate their service and the types of response you expect from them. Also due to the simplicity of the tests and test doubles it should be easy for them to understand without your intervention (another good reason to keep them focused and simple).

Showing them the tests is a good place to start (it’s just code, that’s what we are all working with, none of that touchy-feely stuff about relationships), but a better way to progress the relationship, sorry, chat would be to see if they could run the contract tests as part of their development pipeline. Perhaps every time they plan to make a release, or better yet on every commit (another reason to keep the tests fast and reliable).

This way they not only see how you use them, but they get an early warning if any change in their code is likely to cause their consumers problems. They can then see if they really need to make that change, or how they can mitigate the impact to their consumers. If they do need to make it, they can start a dialogue with their consumers and begin to migrate them onto a new API. This all helps to improve the relationship between consumers and producers, facilitated by some simple tests. Who knew testing could build stronger relationships between development teams?

Who owns the Contract tests?

Just in case it’s not clear, the responsibility for writing the contract tests in the first place is always with the consumers. Only they know how they plan to use and integrate the producers. The producers can always offer best practice and how they intend consumers to use their services, but it’s up to the consumers to decide if they plan to use the service the way it was intended.

Contract tests only become Consumer-driven once they are executed by the producers. Until then they are just Contract tests, and even then just in name. If they test anything more than what was outlined earlier they become something else entirely.

New problems to solve

Figuring out how to share the tests, run them, make the results visible and let consumers and producers know about breaking changes is a whole host of other issues that need to be resolved. The web testing frameworks have already made some progress in this area, but I don’t know of any tools that facilitate this between internal teams other than having access to each other’s build infrastructure and source code repos.

Don’t use contract tests to do functional testing

Contract tests need to be quick and simple to understand and therefore only test at the boundary. If you go further than this they will become more complicated and harder for other teams to understand.

It’s not the producer team’s job to understand how you use their service, but giving them some insight into how you integrate it could be beneficial to both teams. There is nothing stopping you from writing more integrated tests, but don’t expect your producer to run these. This is your responsibility, and the feedback from it would be more beneficial to you than them. Besides, you don’t want them thinking you’re trying to fob your testing off onto them.

If you do test further than what was described above, don’t call them contract tests, otherwise you’ll cause more confusion. Be specific and call them what they are.

Contract Testing, Why do it?

I’ve been thinking a lot about contract testing lately and trying to explain why it’s a good idea.  I thought I’d start by getting my initial arguments for it down and go from there.


Got an opinion then let me know.

Note: This is a first draft (published 24/10/19) and I’ll (hopefully) revisit it again soon but in the meanwhile here is me thinking out in the open… 

Update 18/11/19: Added more details on what contract testing actually is.

The “Contract Testing Chat”: The Reality, The Problem, The Possible Solution?

Aim: To encourage dev teams to use contract testing to manage their integration of dependencies

The Reality 

Within any of the systems we produce there are components from external teams, as this allows us to focus on what is important to us and let the other teams take care of things that are not our core competency.

In an ideal situation we would probably make everything ourselves so we have complete control but that would require significant amounts of Time, Money and Skills.

Time – To train your existing staff or recruit people that have the skills, and then allow them to actually build the components/systems.

Money – To hire the people and all the necessary resources they need to do the job.

Skills – That the person needs to be able to do the job

Some organisations can throw money at the situation and recruit the best in the industry and do everything in-house. Think of large organisations with deep pockets and large global brands.

Others (like us) don’t have this luxury and have to rely on external teams and component makers to make up for the parts we choose not to focus on. This leaves us with a dilemma. 


The Problem 

Do we simply trust that these external components (dependencies) will work as we hope and that the teams maintaining them will let us know when things change? What most teams do is integrate the component and put it through a couple of rounds of exploratory testing just to be sure things still work as intended. If an issue is found then it’s a matter of understanding what the problem is, where it could actually be, and whose responsibility it is to fix it.

This strategy works quite well once the initial issues have been ironed out. 
Eventually though a new version of the component is released and you need to decide if you test everything again or trust the release notes and just do focused regression testing.

Possible solutions


Focused Regression Testing If you just do focused regression testing and an issue is found in the live environment, then trust in the dependency maintainer, and possibly the development team integrating the component, is diminished. The general response to this is to do a full regression of the integrated component every time.

Full Regression Testing Full regression testing usually takes more time, money and skills, so teams only integrate newer versions of the dependencies if they really have to. Generally when it contains something that they need, e.g. a new feature or bug fix that affects the team directly.

But because of the large gap between integrating the previous and latest dependency there are likely to be even more changes than the team anticipated, so not only does a full regression now have to happen but there are likely to be more issues found, leading to even longer lead times in integrating the component. The blame game normally starts about now, see below.

Automated end-to-end testing Some teams try to address this problem with automated end-to-end UI testing. Why? Well, that’s what the Testers are doing during regression testing, right? Just checking the functionality of the system and finding all the issues. So if we can automate this then we can not only find these issues faster but repeatedly, freeing up the Testers to do other things. It almost looks like you address the time, money and skills question in one initial up-front cost of building out the automation, see UI Automation, what is it good for?

Unfortunately these tests only find what you program them to find, and not only that, the more components the end-to-end tests run through, the greater the chance of failure from false positives. If a test does find an issue then you need to work out where the problem actually is: the test or the code. If it’s the code then another developer needs to investigate where that issue is, and you’re heading towards the blame game and backlog issues described below.

The Blame game 

The blame game is when the dependency maintainer blames the integrating team for not taking updates often enough and for attempting to integrate the component in a way that they didn’t intend. On top of that, any issues now found by the integrating team need to go into the dependency maintainer’s backlog to be prioritised, as they have other competing work to be getting on with. It’s not like they are the only team integrating their component. Meanwhile the integrating team is blaming the maintainers for sneaking in features and bug fixes that they never asked for and holding up their development process.

Last resort solution? 

Find another supplier Once the blame game starts it usually leads the integrating team down one of two paths. Find another, more responsive supplier. Perhaps paying a team external to the company might solve their dependency problems. This is throwing money at the problem (see money, time and skills from earlier), or they…

Build it in-house This is all about taking back control and making it the team’s responsibility to build the component. No more having to worry about another team’s backlog or building things that have no relevance to your team. This is the skills part of the money, time and skills from earlier.

Both of the above solutions are a breakdown of the relationship between maintainers and integrators, or more commonly your dev team and those PITAs over in <insert location/team/department name here> 😆

Is there anything else we can try that could help with all the issues above and prevent the relationship breakdown? We ended up down this path because we needed to address the time, money and skills costs we couldn’t afford as a team, but all the options above result in one of the core costs having to be paid.

A possible solution? 

Contract testing and Consumer-Driven Contract testing 
Contract testing helps address the time cost by letting an external team maintain a dependency. This also addresses the money question, as the responsibility to fund that team essentially becomes someone else’s problem, along with the skills issue. All the dev team needs to do is integrate the dependency. So how is Contract testing actually going to help?

See What is Contract testing for more details.

The unintended consequences of automated UI tests

Whenever I see people talking about automated testing I always wonder what type of testing they actually mean. Eventually someone will mention the framework they are using, and all too often it’s a UI-based automation tool that allows tests to be written end-to-end (A-E2E-UI).
They are usually very good at articulating what they think these tests will give them: fast automated tests that they no longer need to run manually, amongst other reasons.

But what they fail to look at is the types of behaviours these A-E2E-UI tests encourage and discourage within teams. 

They have a tendency to encourage  

  • Writing more integrated tests with the full stack rather than isolated tests
    • Isolated behaviour tests (e.g. unit, integration, contract tests etc) run faster and help pinpoint where issues could be
    • A-E2E-UI tests will just indicate that a specific user journey is not working. While useful from an end-user perspective, someone still needs to investigate why. This can lead to just re-running the test to see if it’s an intermittent error, which is only made worse by tests giving false positives, something full-stack tests are more prone to because they have more moving parts
  • Testing becomes someone else’s responsibility
    • This is more apparent when the A-E2E-UI tests are done by somebody else in the team and not the pair developing the code
    • Notice ‘pair’ if you’re not a one-person development army then why are you working alone? 
      • Pairs tend to produce better code of higher quality with instant feedback from a real person 
      • It might be slower at first but it’s worth it to go faster later 
      • This is really important for established businesses with paying customers 
      • A research paper called The Costs and Benefits of Pair Programming backs this up but it’s nearly 20 years old now so if you know of anything more recent let me know in the comments.
  • Pushing testing towards the end of the development life cycle 
    • The only way A-E2E-UI tests work is through a fully integrated system, therefore testing gets pushed later into the development cycle
    • You could use test doubles for parts, but then it is no longer an end-to-end test.
  • Slower feedback loops for development teams 
    • Due to testing being pushed to the later stages of development, developers go longer without feedback on how their work is progressing
    • This problem is increased further when the A-E2E-UI tools are not familiar to the developers who subsequently wait for the development pipeline to run their tests instead of doing it locally
  • Duplication of testing 
    • As the A-E2E-UI test suites get bigger and bigger it becomes harder and harder to see what is and isn’t covered by automation
    • This leads to teams starting to test things at other levels (code and most likely exploratory testing), which all adds to the development time

These are just some of the behaviours I’ve observed A-E2E-UI tests encourage, but they also discourage other behaviours which may be desirable.

They can discourage development teams from

  • Building testability into the design of the systems 
    • Why would you if you know you can “easily” test something end-to-end with an automation tool?
  • Maintainability of the code base
    • By limiting the opportunities to build a more testable design you decrease the maintainability of the code through tests
    • If you need to make a change it’s harder to see what the change in the code affects
    • By having more fine grained tests you can pinpoint where issues exist
    • A-E2E-UI tests just indicate that a journey has broken and how it could affect the end users
    • Not where the problem was actually introduced  
  • Building quality at the source 
    • You are deferring testing towards the end of the development pipeline, when everything has been integrated, instead of when you are actively developing the code.
    • Are you really going to go back and add in the tests especially if you know an end-to-end test is going to cover it?
  • The responsibility to test your work 
    • With the “safety net” of the A-E2E-UI tests you send the message that it’s ok if something slips through development
    • If it affects anything the A-E2E-UI tests will catch it
    • What we should be encouraging is that it’s the developer’s responsibility to build AND test their work
    • They should be confident that once they have finished that piece of code it can be shipped
    • The A-E2E-UI tests should act as another layer to build on your team’s confidence that nothing catastrophic will impact the end users. Think of them as a canary in the coal mine. If it stops chirping then something is really wrong…
  • More granular feedback loops
    • By having A-E2E-UI tests you’re less likely to write unit and integration tests which give you fast feedback on how that part of the code behaves 
    • Remember code level tests should be testing behaviour not implementation details 

If A-E2E-UI tests cause undesirable behaviours in teams, should we stop writing them? While they are valuable for demonstrating end-user journeys, we shouldn’t be putting so much of our confidence that our system works as intended into them. They should be another layer which helps build the team’s confidence that the system hangs together.

If we put the vast majority of our effort and confidence into these automated end-to-end tests then we risk losing one of a team’s greatest abilities: building testability into the design of our systems. But just like the automated UI tests, building in testability takes conscious effort. This will take time, patience and experience for the whole team to understand and benefit from.

UI Automation, what is it good for? 

TL;DR: What automation at the UI level does and doesn’t give you.
UPDATE: I originally wrote this back in March 2015, lost it in my drafts and found it again recently, so I thought I’d get it out there. Don’t agree? Then let me know in the comments.

Automation fallacy

Every time I speak with different teams and organisations a theme constantly comes up: UI automation and how it’s going to solve all their problems. The thinking goes that if we can automate more of our tests – read test scripts – then the Testers no longer have to check that item anymore. This then frees them up to do more interesting things like exploratory testing, or means the Tester can be done away with altogether.

There is also a notion that automating all the regression checks will drop the regression test cycle from days to hours. This then supposedly allows the team to move faster and release quicker than before.

What everyone seems to miss is that automated checks are generally built to check one thing and will tell you if that thing is still there or behaving as the script has been programmed to expect. If anything happens that wasn’t programmed into the check then it fails or stops dead, relying on someone having to then go and look at what went wrong.

A Tester, on the other hand, can look for workarounds, work out what may have caused the issue or go find other issues based on the information they’ve just learned.

So should we give up on UI automation and accept that we’ve got to do rounds and rounds of regression testing and hire more Testers? Well no. What we need to do is ask ourselves:

Why are we automating?

It looks like a simple question and most people (including myself in the past) would be able to give you a list of answers, but what we forget to ask is: by automating this check, what does it tell me when it passes or fails? If it passes does that now mean I no longer have to check that feature or scenario again? If it fails what does that tell me? That I have to check that scenario manually?

When a check fails what do we expect the team to do? Stop everything and investigate the issue? Carry on as normal and hope someone else will check it? Ignore the issue altogether? Who is responsible for checking the issue? Developers, Testers, dedicated automation engineers?

There are a lot of reasons that people give as to why they want to automate their testing such as

  • Reduce test/regression testing
    • The reason for regression testing is to check that the changes you’ve made to your code base haven’t broken anything existing.
    • Unless you have automated all your UI checks/regression suites, automation is not going to help you as much as you think it will
  • Spot issues/bugs faster
    • Automation doesn’t find new bugs, it only tells you that the check you’ve scripted has broken in some way. You need to tell the script that if action A doesn’t produce result B then fail with an error message. What normally happens is the check fails in a way you didn’t anticipate. Don’t forget, if you knew beforehand how something would break you would probably have put in a fix. That’s why they are called defects: something behaving in a way you didn’t want/anticipate
  • Free up Testers
    • Potentially, but only if they trust the automation
  • Consistently check a feature the same way
    • This is one thing an automated check is very good at
  • Something that is laborious or difficult to setup and check
    • Another good candidate for automation. We use it to do policy testing of our apps, as it’s time consuming and error-prone to test manually
  • We’re doing Behaviour Driven Development (BDD)
    • BDD is not about automating but more about collaborating to understand and create features. The automation is just one small part of it, and even then it’s not about testing the UI but the business logic, which could be tested at the unit level
    • If you ever hear a development team saying ’The BDD tests are failing’ then it’s a good indicator that they are probably using BDD incorrectly
  • To release faster
    • Again because you need to do less testing, see Reduce test/regression testing
  • It’s a part of continuous integration/delivery so we have to
    • No thought has gone into what you are automating other than it’s what people say you have to do
  • Test manager or some other higher up tells you to
    • Someone thinks that just telling a development team to automate their testing will help them, see above
  • People within the development team or key stakeholders don’t trust the developers’ work
    • The test team are being used as a safety net to check the developers’ work, which tends to become a self-fulfilling prophecy for developers who start using the test team as exactly that

What does an Automated Check actually do?

Let’s start with an example from a mobile app, though it could very easily be any platform of your choice:

A simple automated scenario could be: when the home page is loaded and I’ve selected an option, then I expect to see items X, Y and Z.

Things this scenario will need to do are:

  • Start the application
  • Wait for it to load up
  • Select a menu option
  • Wait for the new screen to load
  • Then check that the expected items are on screen

[Animated gif of an example automated check]

So you can run this check over and over and know that as long as the sequence doesn’t change, and the items you are checking for are there, the check will pass. What it isn’t going to tell you, though, is:

  • Formatting issues with any of the screens loading up
  • Pages starting to take longer to load
  • The ordering of the menu options has changed
  • There are new menu options
  • The items you are checking can be seen by the automation framework but nothing is actually visible on screen
  • There are new items on screen that the check is not looking for

All of the above could also be scripted into the check, but that would likely take quite a bit of effort, and you can’t always predict how an app will misbehave, so you can’t script for everything.

This is where a real Tester has the advantage. You don’t need to tell a Tester to look for these things; they will do so without being prompted, and generally a lot faster than an automated check. They can also tell you if something doesn’t feel right or doesn’t perform in a way that would be acceptable to end users, which can be very hard to quantify and therefore to automate. They can also take what they’ve just learned and apply it to what else they can discover. An automated test can’t do this, at least not with the tools we are using at the moment.

Where a Tester can’t match an automated check (or will find it very hard to) is in checking the same thing in the same way, consistently and quickly. As long as there are no physical moving parts, an automated check can normally carry out the scenario above in seconds, delayed only by waiting for things to install or load.

So should we stop automating our checks?

Before any team starts to think about automating their testing via the UI they should first, as a team, ask themselves:

Why are we automating?

It sounds like a simple question but as I explained earlier people tend to have differing views on what the automation is actually going to do for them. By talking about why they want to automate they are more likely to come up with solutions that will actually address the problems.

One of the main benefits I’ve seen from automation, especially at the UI level, is faster feedback that the app:

  • Can actually be installed on a real device/displayed in a browser
  • Can start without crashing
  • Can reach any endpoints that it relies on
  • Still performs its core feature, the one thing it is designed to do for your users, e.g.
    • BBC iPlayer: video can actually be played
    • Google Maps: directions to a destination can be provided
    • Amazon: products can be bought
    • Facebook: the feed shows you what your friends and family are doing

Doing this manually every time a build is made would take some time; it is also very tedious and, in my experience, just doesn’t happen. What tends to happen instead is that developers wait and see what comes back when the Testers finally do test the app, which could be some time after the change was made.

The longer this feedback loop is, the harder issues are to fix, due to the overhead of understanding what went wrong and which change caused the issue. This is exacerbated when working with legacy code, especially code not written by the developer making the change.

By automating just the core journey, the development team know very quickly that whatever was last committed hasn’t caused a catastrophic failure and that the app’s core feature is still functioning. If there is a failure, you can back out the change (or ideally fix it) and get back to a working state. This helps the whole team know that the app works, and improves their overall confidence that installing a build will actually be worth their time. There is nothing more frustrating, especially in mobile development, than getting a build, finding the device you want to test on and installing it, only to find it can’t carry out its main job for the user, or worse, crashes on start.

When things fail this easily and obviously, it does nothing to instil confidence in the development team, more so when your key stakeholders find the issues. Catching these failures automatically also allows you to start using your Testers for what they are really good at, testing, and not just for checking the developers’ work.

Core Journey

We use the concept of PUMA to decide what our core journeys are and ultimately what we should and shouldn’t automate. A general rule of thumb is: if it’s not a core journey, can it be covered by a unit/integration test that doesn’t invoke the UI? If it can’t, then why would automating it help? Who would do it? How often does it need to run, and how quickly do we need feedback that it’s broken? Could we monitor the app’s stats to check it is still working rather than automating it? If it does break, how badly would your users be affected, and how bad would the perception be? Could it be controlled by a feature toggle that allows it to be switched off in the live environment?

So the next time someone asks “why don’t you just automate your testing?”, ask them “Why are we automating?” You might realise that the problem they perceive can easily be addressed by one simple automated check rather than hundreds of automated UI checks.

The Do’s and Don’ts of Mobile UI Automation

(…from my experience)

Originally posted on Medium

My name is Jitesh ‘Jit’ Gosai and I’m a Senior Developer in Test (DiT) at the BBC working in Mobile Platforms, Digital.

I originally wrote this post well over a year ago (2014!) and recently came across it again, so I thought I might as well get it out there. Most of the points still stand, so it should be useful for anyone getting into automation or already on the road to doing so. Got any other tips? Then let me know in the comments or on Twitter (@JitGo).

During our journey of automating our mobile user interface (UI) testing for iPlayer, we’ve learnt many things (good and bad) that I would like to share with you to help you with your work.
Below, I have compiled some do’s and don’ts to help you and hopefully save you from making the same mistakes as we did.

Android: adb (Android Debug Bridge) is your friend

Learn all the little things you can do with adb, from simply listing connected devices to grabbing screenshots. adb will allow you to do quite a lot with a connected device, so make sure you are comfortable with it. My suggestion is to begin with the Google developer docs and then start searching around for things you would like to do; for example, waiting for a connected device to have started, restarting devices, or sending simple commands to control the UI (don’t try to automate your tests this way, you’ll be looking at a lot of pain).
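Because our harness is written in Ruby, one lightweight option is to shell out to adb from Ruby; the device serial and sub-commands below are only examples:

```ruby
# Build an adb command targeting a specific device.
def adb_command(serial, *args)
  (["adb", "-s", serial] + args).join(" ")
end

# Run the command and return its stdout.
def adb(serial, *args)
  `#{adb_command(serial, *args)}`
end

# Examples of the kinds of things adb can do for you:
#   adb("emulator-5554", "wait-for-device")   # block until the device is up
#   adb("emulator-5554", "shell", "screencap", "-p", "/sdcard/shot.png")
#   adb("emulator-5554", "reboot")            # restart a wedged device
```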

Do learn to control your devices remotely

Learning to control your devices remotely will save you a lot of effort in both time and flow, especially if you have devices connected to a build server that is not physically nearby (in our case, locked in a server room).
Being able to remotely view the device screen and restart the device is very useful and can save you countless journeys when your device ends up in a state you can’t identify. It is also great for returning your device to the home screen, and therefore a known state, before resuming testing.

Do ditch the Android SDK emulator

We tried to use this but found it too slow and unreliable, randomly crashing or disconnecting itself from adb. Intel HAXM was faster but proved too flaky and would also crash intermittently. Maybe it was something to do with our setup, but we went with connecting a real device to run the jobs instead, which proved to be a lot more reliable. Genymotion is another option, free for small teams, so it could work for you.

Do be patient!

When first starting out, don’t be tempted to keep playing with your test/build environment for little improvements. Let it settle and run. Know exactly what needs fixing and do just that: the essentials, not the nice-to-haves. Then, once you have stability, start to slowly add the things that may improve speed, but with an eye on reliability.

Do be available for pairing

Pair with devs and build tests together. This gets devs familiar with how to write the tests, encourages them to write their own, and stops you being the bottleneck when tests break. Also show the UI tests to your Testers to get them familiar with what can and can’t be automated. I’ve found that walking Testers through an automated UI test vastly improves their knowledge of how the tests work and of the things they can miss.

Do test one thing and one thing only

Don’t be tempted to cover lots of things in one test as it can be harder to tell what has broken when they fail and harder to debug.

Don’t use (or use very sparingly) canned steps

When using step definitions for automating tests it is very tempting to keep reusing steps you’ve already written, which at first works well. It can even allow non-technical members of the team to automate tests, but it can result in very verbose scenarios, which are harder to read and more difficult to amend at a later date. We always create methods for interactions with the app, e.g. go to home or go to channels in the case of iPlayer, and then use these in our step definitions. This way, if the path to the home or channels pages changes, you just update the method and all tests get the change; there is no need to update every test. It also allows you to write tests faster in the long run.
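As a sketch, those interaction methods might look like this; the method names, element ids and `driver` object are all illustrative, not iPlayer’s real code:

```ruby
# Reusable interaction methods; step definitions call these instead of
# chaining canned steps. `driver` stands in for your framework's client.
module Navigation
  def go_to_home(driver)
    driver.tap("nav_home")
  end

  def go_to_channels(driver)
    go_to_home(driver)          # if the route changes, update it here once
    driver.tap("nav_channels")
  end
end
```

A step definition then becomes a one-liner that calls `go_to_channels`, and every scenario picks up route changes for free.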

Do use the Page object pattern or similar

This will vastly improve the readability, maintainability and reusability of your tests and teach anyone who will be working with the code how to use it effectively.
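A minimal sketch of the pattern, with made-up element ids and a hypothetical `driver` interface:

```ruby
# One class per screen; tests talk to pages, never to raw element ids.
class ChannelsPage
  def initialize(driver)
    @driver = driver
  end

  def displayed?
    @driver.visible?("channels_title")
  end
end

class HomePage
  def initialize(driver)
    @driver = driver
  end

  def open_channels
    @driver.tap("nav_channels")
    ChannelsPage.new(@driver)   # navigation returns the next page object
  end
end
```

Returning the next page object from each navigation method keeps tests reading as a journey: `HomePage.new(driver).open_channels.displayed?`.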

Do push your test framework (Calabash, Appium etc.) as far down the stack as possible

Don’t litter your test code with your framework’s commands; instead, delegate them to a module that you access via an interface.
This way, if a command changes in a new version you only need to update one place, and if you decide to switch frameworks you can do so far more easily.
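One way to sketch this in Ruby: a single adapter class holds every framework call, so tests and page objects never touch the framework directly. The `touch`/`query` call shapes below are invented for illustration:

```ruby
# A single adapter that is the only place framework calls appear.
# Tests and page objects use only `tap` and `visible?`.
class Driver
  def initialize(backend)
    @backend = backend            # e.g. a Calabash or Appium client (assumed)
  end

  def tap(id)
    @backend.touch(marked: id)    # framework-specific call, isolated here
  end

  def visible?(id)
    @backend.query(marked: id).any?
  end
end
```

If a framework command changes, or you swap the backend entirely, only this class needs editing.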

Don’t fall in love (with your tools!)

They are just that, tools, to help you with your task. If you find that a tool is causing you more problems than it solves, and no matter how much searching you do no one can help, then dump it and switch to something else (if there is one).

Don’t automate everything under the sun

Automate the areas that are going to give the most valuable feedback, i.e. that something has broken. We’ve found that UI tests are most useful in the areas the devs are actively working on, so focus your UI tests there.

Do have the need for speed

Keep your tests as fast as possible. Running iOS tests through the simulator is faster than on devices, so we tend to stick with the simulator. For Android we use devices, as they are more stable. So you need to weigh up what you need; stability always trumps speed. With that said, keep your tests as fast as possible: don’t use sleeps in your tests (or avoid them at all costs). Use waits that repeatedly check for something to be on or off screen before proceeding or raising an error.
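A minimal polling wait in Ruby; the timeout and interval defaults are just values you might pick:

```ruby
# Repeatedly evaluate the block until it returns truthy, instead of
# sleeping for a fixed time; raise if it never does within the timeout.
def wait_for(timeout: 10, interval: 0.25)
  deadline = Time.now + timeout
  until (result = yield)
    raise "Condition not met within #{timeout}s" if Time.now > deadline
    sleep interval
  end
  result
end

# Usage (with a hypothetical app object):
#   wait_for { app.visible?("play_button") }
```

Unlike a fixed sleep, this returns the instant the condition is met, so tests only ever pay for as much waiting as they actually need.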

Do stub it out!

Stub out the data your app uses. We use Charles Proxy with some Ruby scripts to automate launching, closing and loading in config files. The main reason we took this approach was that devs and Testers were already familiar with it, so the learning curve was easier; it was a quick solution until a more appropriate one could be developed/re-purposed, such as REST-assured or WireMock.

Do use Metrics

Stats, stats and more stats. Use data to back up your ideas and collect as much as you can:
test execution time, pass/fail rates, number of tests, number of runs, and chart them. We use Dashing as it’s quick to set up and offers a very nice visual way to display data.

[Dashboard showing stats on TV stands]

Dashboards displayed by the app development team. Top left: app statistics via Grafana, and app usage. Bottom left: a Dashing board showing test status and code metrics using custom bubble graphs. Far right: current build status.

I’ve found the Rickshaw widget to be very useful. Some teams have also been experimenting with AtlasBoard, which looks promising.

Do learn your tools

If you are using Ruby (as we are) then learn to use a REPL such as Pry. This will save you countless hours when debugging or even creating tests. Watch this video by Conrad Irwin for a great introduction to Pry and REPL-driven development.

Do have multiple tests around any given area of functionality

This way, if one test fails, you know it could be flaky; but if all the tests in that area fail, you know straight away that something is wrong.

Do read these posts!

Read this excellent post by Gojko Adzic if you are attempting to automate testing at the UI level.
If you are on the journey of UI automation, or plan to start, then I also highly recommend reading this post by Dan North, The Siren Song of Automated UI Testing. It gives a balanced view of automated UI testing, making clear that it’s no silver bullet for removing manual testing altogether or for replacing other (arguably better) forms of automated unit/component testing.

Got any tips on automated testing? Then let us know in the comments; you never know, it may make it into the list above.