Exploratory and Automated testing: Using the right techniques in the wrong contexts

Exploratory testing is about testing in an unpredictable context and therefore detecting unpredictable failures in our software. Automated testing is about testing in a predictable context and therefore detecting predictable failures. The mistake we make with automation is that we try to apply it to the wrong context. You can’t use testing methods developed for a predictable context in an unpredictable environment.

While there is nothing physically stopping you, neither practice is particularly efficient if used in the wrong context. Exploratory testing in a predictable environment would just confirm what you already knew, only more slowly and less consistently when the testing is repeated. Automated testing in an unpredictable environment would lead to false negatives.

It’s not a one-size-fits-all solution either, as we work in both contexts: predictable when initially developing the software and unpredictable once it is running in the live environment.

The only way you can replace exploratory testing with automation is to make the test environment predictable. But that would then mean you are trying to detect predictable issues. This negates the outcome you were looking for, which is to detect unpredictable or complex failures.

Testing in unpredictable contexts

The best way to detect unpredictable failures is to use methodologies that can operate in an unpredictable environment. 

One of the best-known methods is exploratory testing (sometimes called manual testing), but there are other techniques too:

  • Monitoring of the live environment, which is good for issues we can predict in an unpredictable environment.
  • Observability: using logs, graphs and other telemetry to see how the system is behaving in the live environment. This is helpful for issues we can’t predict and need to debug in the live environment.
  • Phased rollout of features using techniques such as feature toggles, blue/green deployments and canary releasing. This is useful for limiting the impact of unintended issues in an unpredictable environment. Basically anything that allows you to slowly enable a feature for subsets of users.

Using monitoring and observability in conjunction with phased rollouts can greatly improve your ability to understand and limit how new code behaves in unpredictable environments. 
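
To make the "slowly enable a feature for subsets of users" idea concrete, here is a minimal sketch of a percentage-based feature toggle. The function names and the "new-checkout" feature are made up for illustration, and a real rollout would normally sit behind a feature-flag service rather than a hand-rolled hash.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, feature: str, percentage: int) -> bool:
    """Return True if the feature is switched on for this user.

    percentage is the share of users (0 to 100) who should see the feature.
    """
    return rollout_bucket(user_id, feature) < percentage


if __name__ == "__main__":
    # Hypothetical users and feature name, purely for demonstration.
    users = [f"user-{n}" for n in range(1000)]
    for percentage in (0, 10, 50, 100):
        enabled = sum(is_enabled(u, "new-checkout", percentage) for u in users)
        print(f"{percentage}% rollout: {enabled} of {len(users)} users enabled")
```

Because the bucket is derived from the user id, a user who sees the feature at 10% still sees it at 50%, which keeps what you observe in monitoring consistent as the rollout widens.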

Testing in predictable contexts

This is not to say automated testing has no value, as it can help detect smaller predictable issues which, if left unchecked, could develop into larger unknown failures that only occur with the right mix of other smaller issues. Some issues may be within our control (software we develop) and some outside of our control (other people’s software). For software in our control (a predictable environment) automated testing is almost a perfect match. For software outside of our control (an unpredictable environment) contract testing, exploratory testing, monitoring and observability, and phased rollouts of software are preferable.

Control and isolation

Next time you’re looking at testing techniques, think about how much control (and therefore isolation) you have over your test environment. The greater the level of control, the more automation you should consider; the less control you have, the more you should consider exploratory testing coupled with monitoring, observability and phased rollouts.

Testing techniques

The following diagram will help you see how different testing techniques stack up against each other. This is by no means an exhaustive list and only compares them on the basis of speed of feedback, value of feedback and testing environment. So the next time you get into a discussion about testing you could use these characteristics as a good way to frame that discussion.

Testing techniques plotted on a speed, value and environment axis

Are there other testing techniques that should be plotted on the chart?

Do you agree with the axes? Are there other, more important characteristics of testing that should be captured?

How would you plot the testing techniques?

What is Contract Testing?

And Consumer-driven contract testing

This is a follow-on from Contract testing: Why do it

First some quick definitions:

Consumer
Someone (a dev team, for instance) who makes use of a third-party component or a combination of components (a system). They consume the service provided by the component/system.

Producer
The people (a dev team) who build the component or system and make it available for others to use.

Test double
To keep the tests fast you will be using a test double of the producer in the majority of your tests, more specifically a stub that is very simple and responds how you tell it to.

Remember: don’t mock what you don’t own.

Avoid using mocks for contract tests, otherwise you’ll be creating another job for yourself by attempting to mock the behaviour of your producers. Always think of the producer as a black box, so don’t make assumptions about how its internals work. That is not your responsibility. A stub should be simple, it should be easy to see how it works, and it will generally just respond with a simple response.
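
As a rough illustration of how little a stub needs to do, here is a sketch in Python. The price service, its `get_price` call and the response fields are all invented for the example; the point is only that the stub replays what the test hands it and knows nothing about the producer's internals.

```python
class StubPriceService:
    """Hypothetical stand-in for a producer that returns product prices.

    It has no logic of its own: it returns whatever canned response
    the test gave it, and records nothing about how it was called.
    """

    def __init__(self, canned_response: dict):
        self._canned_response = canned_response

    def get_price(self, product_id: str) -> dict:
        # The argument is accepted but ignored: the stub only replays the canned response.
        return self._canned_response


def format_price(service, product_id: str) -> str:
    """Consumer code under test: turns the producer's response into display text."""
    response = service.get_price(product_id)
    return f"{response['amount']:.2f} {response['currency']}"
```

A test would build the stub with the response it wants, for example `StubPriceService({"amount": 9.5, "currency": "GBP"})`, and pass it to `format_price`.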

What is a Contract test?

Contract tests are automated code-level tests written from the viewpoint of the consumer. They check that the producer exists, responds to a given request and responds in the format expected by the consumer. A simple rule could be:

  • For every unique call you make to the producer, write a contract test
  • If output from a producer is going to cause a unique behaviour change in you (the consumer), then write a contract test, e.g. an error condition would fall into this category

They wouldn’t go further than this and begin to check that the response contains all the correct data or check the behaviour of the producer. That’s the job of the producer, not the consumer. The producer will be a black box to the consumer, simply input and output. Whatever transformations happen to the data on the inside of the producer are unknown.

If you do test that a response contains the correct data then I would only test very specific types of data, specifically ones that would cause problems for your system if they were not returned and which your system should handle gracefully in response.
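
Following the two rules above, a contract test suite might look something like the sketch below, written with Python's built-in unittest. The user service, its `get_user` call and the response fields are assumptions made for the example, not a real API.

```python
import unittest


class StubUserService:
    """Hypothetical stub of the producer; returns canned (status, body) pairs."""

    def get_user(self, user_id: str):
        if user_id == "missing":
            return 404, {"error": "not found"}
        return 200, {"id": user_id, "name": "Ada"}


class UserServiceContractTest(unittest.TestCase):
    # The producer the contract is checked against; here it is the stub.
    producer = StubUserService()

    def test_known_user_has_the_fields_we_read(self):
        status, body = self.producer.get_user("42")
        self.assertEqual(status, 200)
        # Check the shape only: field presence and type, not the values.
        self.assertIsInstance(body["id"], str)
        self.assertIsInstance(body["name"], str)

    def test_unknown_user_returns_not_found(self):
        # A 404 changes the consumer's behaviour, so it gets its own test.
        status, body = self.producer.get_user("missing")
        self.assertEqual(status, 404)
        self.assertIn("error", body)
```

There is one test per unique call and one per response that changes the consumer's behaviour, and nothing that checks whether the producer returned the right user. The suite can be run with `python -m unittest`.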

How will Contract testing help?

Focused
If the tests follow the guidance above they will focus on just the boundary at which the consumer and producer interact. Therefore if they fail you know not only exactly where the problem is but also what the issue is, as they cover only a small area of interaction. This will allow you to quickly identify if the problem is with your integration of the producer or some other part of your code.

Fast
These tests will be written at the code level, usually with a native unit testing framework of the language you are working with. The vast majority of the tests will also execute against a stub to keep them fast. If you ran them against the actual producer then they could run slower. Also, because each test is so focused on the interaction boundary, each will run in well under a second, allowing the whole suite of tests to execute in a matter of seconds.

Reliable
Circle of control / Circle of influence
Due to the simplicity of the tests the number of false positives is very low, and they will only fail if something has changed within the test, your interaction with the test double or the test double itself. Everything is now within your circle of control, therefore any brittleness can be remedied quickly and easily.

Automated documentation
You now have executable tests that document your usage of the dependency, so they will stay up to date with every change you or your dependency makes.

Running the tests

These tests can now be easily kept as part of the main suite of tests within the code base and run through the development pipeline as usual. Any change to the code base would result in the whole suite of contract tests running and letting the dev team know if there were any issues.

Occasionally you would also want to run the contract tests against the real dependency separately from the main build pipeline just to let you know if the contract had changed and that your test double is still a true stand-in for the real thing.
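
One way to run the same contract against either the stub or the real dependency is to choose the target from the environment. The sketch below assumes a made-up stock service, a made-up `/stock/<sku>` URL layout and a made-up `CONTRACT_TARGET_URL` variable; none of these come from a real tool.

```python
import json
import os
import urllib.request


class StubStockService:
    """Hypothetical stub used in the main build pipeline."""

    def get_stock(self, sku: str) -> dict:
        return {"sku": sku, "available": 3}


class HttpStockService:
    """Client for the real producer; the URL layout is assumed for illustration."""

    def __init__(self, base_url: str):
        self._base_url = base_url.rstrip("/")

    def get_stock(self, sku: str) -> dict:
        with urllib.request.urlopen(f"{self._base_url}/stock/{sku}", timeout=5) as response:
            return json.load(response)


def make_producer(environ=os.environ):
    """Pick the real producer when CONTRACT_TARGET_URL is set, otherwise the stub."""
    base_url = environ.get("CONTRACT_TARGET_URL")
    return HttpStockService(base_url) if base_url else StubStockService()


def check_stock_contract(producer) -> None:
    """The contract itself: the same assertions run against either implementation."""
    body = producer.get_stock("sku-1")
    assert isinstance(body["sku"], str)
    assert isinstance(body["available"], int)
```

The main pipeline would call `check_stock_contract(make_producer())` with the variable unset, while a separate scheduled job sets it to point at the real dependency. If that job fails while the main build passes, the stub has drifted from the real thing.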

New version of the dependency

Now when a new version of the dependency is released you can run the contract tests against it and check to see if there are any breaking changes. If no issues are detected then some light exploratory testing of the changes detailed in the release notes may be all that is needed.

If running the contract tests does detect an issue then it should be quick and easy to pinpoint where the issue is (you or them) and what the necessary mitigation steps should be (fix in your code or reject the release). All this while keeping your build pipeline running and your code base shippable.

If an issue is detected in the live environment then it’s going to be easy to know what changed and how to fix it, either by fixing forward or by backing out the change.

Confidence for the Consumer team

Contract tests allow the consuming team to move to a new version of a producer much quicker and with greater confidence than before. If something in the release notes still looks risky then your test team can carry out focused exploratory regression testing and, if possible, put the update behind a feature flag for a controlled release to your end users.

What is Consumer-driven Contract Testing?

The thing with contract tests is that they are very much in the consumer domain. If the producer is making regular releases that often cause the contract tests to fail, then on the one hand at least you know before taking the actual update, but on the other you still can’t take it without workarounds or additional new releases from the producer. This may lead you to think about a new supplier. Why even bother with the pain of writing contract tests when you knew this already?

What if you could help your producer see that each update is going to cause you issues before they even made a release? What if they told you prior to making the release that they need to introduce a breaking change or, better yet, that the current API will be deprecated after a certain date/version, allowing you to move to the new API in your own time? What if you could work with your producers collaboratively, so that they get what they want (easy and quick uptake of new versions) and you get what you need (new bug fixes/features, improved confidence of each update working as intended, less time testing)? This is where Consumer-driven Contract testing can help and really starts to show the benefits of Contract testing.

Benefits of Consumer-driven Contract Testing

As mentioned earlier, the Contract Tests sit in your circle of control. That is, everything in this domain is in your direct control. The producer however is out of your control but can be in your circle of influence.

  • Note: the level of influence you have over your dependency will depend on your overall relationship. If they are within the same organisation then things may be easier. If they are outside your org but are a supplier that you have a financial contract with, then it will probably require some contract negotiation, so not impossible but still some effort. If there is no financial contract and it is just something you use through an open source license, then contract testing is likely all you will have as a relationship.

One of the ways to start moving your producers into your circle of influence is to start a dialogue with them around your contract tests. These tests will show the producer exactly how you integrate their service and the types of response you expect from them. Also due to the simplicity of the tests and test doubles it should be easy for them to understand without your intervention (another good reason to keep them focused and simple).

Showing them the tests is a good place to start (it’s just code, that’s what we are all working with, none of that touchy, feely stuff about relationships) but a better way to progress the relationship, sorry, chat would be to see if they could run the contract tests as a part of their development pipeline. Perhaps every time they plan to make a release or, better yet, on every commit (another reason to keep the tests fast and reliable).

This way they not only see how you use them, but they get an early warning if any changes in their code are likely to cause their consumers any problems. They can then see if they really need to make that change or see how they can mitigate the impact on their consumers. If they need to do it they can start a dialogue with their consumers and start to migrate them onto a new API. This all helps to improve the relationship between consumers and producers, facilitated with some simple tests. Who knew testing could build stronger relationships between development teams?
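
To picture what the producer's side of this could look like, here is a rough sketch in which consumers share their expectations as plain data and the producer checks its own code against them on every commit. The order lookup, the consumer names and the contract format are all invented for the example; dedicated tooling would normally handle the sharing.

```python
def handle_get_order(order_id: str):
    """Hypothetical producer code: returns a (status, body) pair for an order lookup."""
    if not order_id.isdigit():
        return 400, {"error": "order id must be numeric"}
    return 200, {"id": order_id, "status": "shipped", "total": 12.5}


# Expectations shared by two hypothetical consumer teams.
CONSUMER_CONTRACTS = [
    {"consumer": "web-shop", "order_id": "7", "status": 200, "fields": {"id": str, "status": str}},
    {"consumer": "web-shop", "order_id": "abc", "status": 400, "fields": {"error": str}},
    {"consumer": "billing", "order_id": "7", "status": 200, "fields": {"id": str, "total": float}},
]


def broken_contracts(handler, contracts):
    """Return a description of every consumer expectation the handler no longer meets."""
    failures = []
    for contract in contracts:
        status, body = handler(contract["order_id"])
        if status != contract["status"]:
            failures.append(
                f"{contract['consumer']}: expected status {contract['status']}, got {status}"
            )
            continue
        for field, expected_type in contract["fields"].items():
            if not isinstance(body.get(field), expected_type):
                failures.append(
                    f"{contract['consumer']}: field '{field}' missing or not {expected_type.__name__}"
                )
    return failures
```

If the producer renamed `total`, the result would name the billing team as the consumer affected, which is exactly the early warning that lets the conversation start before a release goes out.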

Who owns the Contract tests?

Just in case it’s not clear, the responsibility to write the contract tests in the first place is always with the consumers. It’s only them that know how they plan to use and integrate the producers. The producers can always offer best practice and explain how they intend consumers to use their services, but it’s up to the consumers to decide if they plan to use the service the way it was intended.

Contract tests only become Consumer-driven once they are executed by the producers. Until then they are just Contract tests and even then just in name. If they test anything more than what was outlined earlier they become something else entirely.

New problems to solve

Figuring out how to share the tests, run them, make the results visible and let consumers and producers know about breaking changes is a whole host of other issues that need to be resolved. The web testing frameworks have already made some progress in this area but I don’t know of any tools that facilitate this between internal teams other than having access to each other’s build infrastructure and source code repos.

Don’t use contract tests to do functional testing

Contract tests need to be quick and simple to understand and therefore only test at the boundary. If you go further than this they will become more complicated and harder for other teams to understand.

It’s not the producer team’s job to understand how you use their service, but giving them some insight into how you integrate it could be beneficial to both teams. There is nothing stopping you from writing more integrated tests but don’t expect your producer to run these. This is your responsibility and the feedback from this would be more beneficial to you than them. Besides, you don’t want them thinking you’re trying to fob your testing onto them.

If you do test further than what was described above, don’t call the results contract tests, otherwise you’ll cause more confusion. Be specific and call them what they are.

Contract Testing, Why do it?

I’ve been thinking a lot about contract testing lately and trying to explain why it’s a good idea.  I thought I’d start by getting my initial arguments for it down and go from there.


Got an opinion? Then let me know.

Note: This is a first draft (published 24/10/19) and I’ll (hopefully) revisit it again soon but in the meanwhile here is me thinking out in the open… 

Update 18/11/19: Added more details on what contract testing actually is.

The “Contract Testing Chat”: The Reality, The Problem, The Possible Solution?

Aim: To encourage dev teams to use contract testing to manage their integration of dependencies

The Reality 

Within any of the systems we produce there are components from external teams, as this allows us to focus on what is important to us and let the other teams take care of things that are not our core competency.

In an ideal situation we would probably make everything ourselves so we have complete control but that would require significant amounts of Time, Money and Skills.

Time – To train your existing staff or recruit the people that have the skills and then allow them to actually build the component/systems.

Money – To hire the people and all the necessary resources they need to do the job.

Skills – That the person needs to be able to do the job.

Some organisations can throw money at the situation and recruit the best in the industry and do everything in-house. Think of large organisations with deep pockets and large global brands.

Others (like us) don’t have this luxury and have to rely on external teams and component makers to make up for the parts we choose not to focus on. This leaves us with a dilemma. 


The Problem 

Do we just simply trust that these external components (dependencies) will work as we hope and the teams maintaining them will let us know when things change? What most teams do is integrate the component and put it through a couple of rounds of exploratory testing just to be sure things still work as we intended. If an issue is found then it’s a matter of understanding what the problem is, where the problem could actually be and whose responsibility it is to fix it.

This strategy works quite well once the initial issues have been ironed out. 
Eventually though a new version of the component is released and you need to decide if you test everything again or trust the release notes and just do focused regression testing.

Possible solutions


Focused Regression Testing
If you just do focused regression testing and an issue is found in the live environment, then trust in the dependency maintainer, and possibly in the development team integrating the component, is diminished. The general response to this is to do a full regression of the integrated component every time.

Full Regression Testing
Full regression testing usually takes more time, money and skills, so teams only integrate newer versions of the dependencies if they really have to, generally when a version contains something that they need, e.g. a new feature or a bug fix that affects the team directly.

But because of the large gap between integrating the previous and the latest version of the dependency, there are likely to be even more changes than the team anticipated. Not only does a full regression now have to happen, but more issues are likely to be found, leading to even longer lead times in integrating the component. The blame game normally starts about now, see below.

Automated end-to-end testing
Some teams try to address this problem with automated end-to-end UI testing. Why? Well, that’s what the Testers are doing during regression testing, right? Just checking the functionality of the system and finding all the issues. So if we can automate this then we can not only find these issues faster but do so repeatedly, freeing up the Testers to do other things. It almost looks like you address the time, money and skills question in one initial up-front cost of building out the automation (see UI Automation, what is it good for?).

Unfortunately these tests only find what you program them to find and, not only that, the more components the end-to-end tests run through, the greater the chance of failure from false positives. If a test does find an issue then you need to work out where the problem actually is: the test or the code. If it’s the code then another developer needs to investigate where that issue is, and you’re heading for the blame game and the backlog issue.

The Blame game 

The blame game is when the dependency maintainer blames the integrating team for not taking updates often enough and for attempting to integrate the component in a way that they didn’t intend. On top of that, any issues now found by the integrating team need to go into the dependency maintainer’s backlog to be prioritised, as they have other competing work to be getting on with. It’s not like the integrating team is the only one integrating their component. Meanwhile the integrating team is blaming the maintainers for sneaking in features and bug fixes that they never asked for and holding up their development process.

Last resort solution? 

Find another supplier
Once the blame game starts, it usually leads the integrating team down one of two paths. The first is to find another, more responsive supplier. Perhaps paying a team external to the company might solve their dependency problems. This is throwing money at the problem (see time, money and skills from earlier). Or they…

Build it in-house
This is all about taking back control and making it the team’s responsibility to build the component. No more having to worry about another team’s backlog or building things that have no relevance to your team. This is the skills part of the time, money and skills from earlier.

Both of the above solutions are a breakdown of the relationship between maintainers and integrators or, more commonly, your dev team and those PITAs over in <insert location/team/department name here> 😆

Is there anything else we can try that could help with all the issues above and prevent the relationship breakdown? We ended up down this path as we needed to address the time, money and skills costs we couldn’t afford as a team, but all the options above result in one of the core costs having to be paid.

A possible solution? 

Contract testing and Consumer-Driven Contract testing 
Contract testing helps address the time cost by allowing an external team to maintain a dependency. This also addresses the money question, as the responsibility to fund that team essentially becomes someone else’s problem, along with the skills issue. All the dev team needs to do is integrate the dependency. So how is Contract testing going to actually help?

See What is Contract testing for more details.