The unintended consequences of automated UI tests

Whenever I see people talking about automated testing, I always wonder what type of testing they actually mean. Eventually someone will mention the framework they are using, and all too often it’s a UI-based automation tool that allows tests to be written end-to-end (A-E2E-UI).
They are usually very good at articulating what they think these tests will give them: fast automated tests that they no longer need to run manually, amongst other benefits.

But what they fail to look at is the types of behaviours these A-E2E-UI tests encourage and discourage within teams. 

They have a tendency to encourage  

  • Writing more integrated tests against the full stack rather than isolated tests 
    • Isolated behaviour tests (e.g. unit, integration, contract tests etc.) run faster and help pinpoint where issues could be
    • An A-E2E-UI test will just indicate that a specific user journey is not working. While useful from an end-user perspective, someone still needs to investigate why. This can lead to people just re-running it to see if it’s an intermittent error, which is only made worse by tests giving false negatives, something full-stack tests are more likely to do because they have more moving parts 
  • Testing becomes someone else’s responsibility 
    • This is more apparent when the A-E2E-UI tests are done by somebody else in the team and not the pair developing the code 
    • Notice ‘pair’: if you’re not a one-person development army, then why are you working alone? 
      • Pairs tend to produce higher quality code, with instant feedback from a real person 
      • It might be slower at first but it’s worth it to go faster later 
      • This is really important for established businesses with paying customers 
      • A research paper called The Costs and Benefits of Pair Programming backs this up, but it’s nearly 20 years old now, so if you know of anything more recent let me know in the comments.
  • Pushing testing towards the end of the development life cycle 
    • The only way A-E2E-UI tests work is through a fully integrated system, therefore testing gets pushed later into the development cycle 
    • You could use test doubles for parts, but then that is not an end-to-end test.
  • Slower feedback loops for development teams 
    • Because testing is pushed to the later stages of development, developers go longer without feedback on how their work is progressing 
    • This problem gets worse when the A-E2E-UI tools are not familiar to the developers, who then wait for the development pipeline to run their tests instead of running them locally
  • Duplication of testing 
    • As the A-E2E-UI test suites get bigger and bigger, it becomes harder and harder to see what is and isn’t covered by automation 
    • This leads to teams starting to test things at other levels (code and most likely exploratory testing), which all adds to the development time 
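
To make the contrast with a full-stack test concrete, here is a minimal sketch of an isolated test that uses a test double. Everything in it (`CheckoutService`, `FakePaymentGateway`, the amounts) is hypothetical and made up for illustration; it isn’t taken from any real system or framework. It is not an end-to-end test, but it runs in milliseconds and, when it fails, points straight at the code that broke.

```python
class FakePaymentGateway:
    """Test double standing in for a real payment provider (hypothetical)."""

    def __init__(self, succeed=True):
        self.succeed = succeed
        self.charged = []  # records every amount we were asked to charge

    def charge(self, amount_pence):
        self.charged.append(amount_pence)
        return self.succeed


class CheckoutService:
    """Hypothetical unit under test; the gateway is injected so it can be swapped."""

    def __init__(self, gateway):
        self.gateway = gateway

    def pay(self, amount_pence):
        if amount_pence <= 0:
            raise ValueError("amount must be positive")
        return "paid" if self.gateway.charge(amount_pence) else "declined"


def test_successful_charge_marks_order_paid():
    gateway = FakePaymentGateway(succeed=True)
    assert CheckoutService(gateway).pay(1500) == "paid"
    assert gateway.charged == [1500]


def test_declined_charge_is_reported():
    gateway = FakePaymentGateway(succeed=False)
    assert CheckoutService(gateway).pay(1500) == "declined"


def test_non_positive_amount_is_rejected():
    try:
        CheckoutService(FakePaymentGateway()).pay(0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a zero amount")
```

The test functions use plain `assert` statements, so they can be collected by a runner such as pytest or simply called directly.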

These are just some of the behaviours I’ve observed A-E2E-UI tests encourage, but they also discourage other behaviours which may be desirable. 

They can discourage development teams from

  • Building testability into the design of the systems 
    • Why would you if you know you can “easily” test something end-to-end with an automation tool? 
  • Maintainability of the code base
    • By limiting the opportunities to build a more testable design, you decrease the maintainability of the code through tests 
    • If you need to make a change, it’s harder to see what the change in the code affects
    • By having more fine-grained tests you can pinpoint where issues exist
    • A-E2E-UI tests just indicate that a journey has broken and how it could affect the end users, not where the problem was actually introduced
  • Building quality at the source 
    • You are deferring testing towards the end of the development pipeline, when everything has been integrated, instead of doing it while you are actively developing the code.
    • Are you really going to go back and add in the tests, especially if you know an end-to-end test is going to cover it?
  • The responsibility to test your work 
    • With the “safety net” of the A-E2E-UI tests you send the message that it’s ok if something slips through development 
    • If it affects anything the A-E2E-UI tests will catch it
    • What we should be encouraging is that it’s the developers’ responsibility to build AND test their work
    • They should be confident that once they have finished that piece of code it can be shipped 
    • The A-E2E-UI tests should act as another layer to build on your team’s confidence that nothing catastrophic will impact the end users. Think of them as a canary in the coal mine. If it stops chirping then something is really wrong…
  • More granular feedback loops
    • By having A-E2E-UI tests you’re less likely to write unit and integration tests, which give you fast feedback on how that part of the code behaves 
    • Remember, code-level tests should be testing behaviour, not implementation details 
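
As a rough illustration of testing behaviour rather than implementation details, here is a small sketch. The `Basket` class and its prices are hypothetical, invented purely for this example. The test only goes through the public methods and checks the total a user would see, so the internal storage can change without the test needing to.

```python
class Basket:
    """Hypothetical class; how items are stored is an implementation detail."""

    def __init__(self):
        self._items = {}  # name -> (unit price, quantity); free to change later

    def add(self, name, unit_price_pence, quantity=1):
        price, count = self._items.get(name, (unit_price_pence, 0))
        self._items[name] = (price, count + quantity)

    def total_pence(self):
        return sum(price * count for price, count in self._items.values())


def test_total_reflects_everything_added():
    basket = Basket()
    basket.add("tea", 250)
    basket.add("tea", 250)
    basket.add("mug", 600, quantity=2)
    # Assert on observable behaviour only: 2 x 250 + 2 x 600 = 1700.
    assert basket.total_pence() == 1700
    # A brittle alternative would assert on basket._items directly and
    # break as soon as the storage changes, even if the totals stay right.
```

If `_items` later became a list or a database row, this test would still pass as long as the totals were right, which is the kind of fast, stable feedback an A-E2E-UI test can’t give you.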

If A-E2E-UI tests cause undesirable behaviours in teams, should we stop writing them? While they are valuable at demonstrating end-user journeys, we shouldn’t be putting so much of our confidence that our system works as intended into them. They should be another layer which helps build the team’s confidence that the system hangs together. 

If we put the vast majority of our effort and confidence into these automated end-to-end tests then we risk losing one of the team’s greatest abilities: building testability into the design of our systems. But just like the automated UI tests, building in testability takes conscious effort. This will take time, patience and experience for the whole team to understand and benefit from.

In test column

Update 4/1/20: I’ve also written about How to move away from the In Test Column

A colleague recently asked me why I think the In Test column is a bad idea. I had a quick search around and couldn’t find anything specifically on the topic, so I decided to put down some of my ideas on why we shouldn’t have one and should instead opt for a more generic In Progress column.

What’s a board?

Most delivery teams have boards of some sort to make the work they are doing more visible and show progress towards some goal. The two most common ones I see are physical boards, with sticky notes and avatars representing individuals in the team, and digital ones, usually Jira or Trello.

The usual configuration of these boards is Backlog, In Dev, In Test, Done, with tickets moving from left to right as the work progresses through each column. This helps the team visualise where work is currently up to and makes it less abstract.

This all seems sensible, as you can monitor the number of tickets in each column and create whatever stats you need to report on. But in my experience, boards that break in-progress work down into Dev and Test roles encourage certain anti-patterns.

Inner team silos

Boards like this have a tendency to encourage invisible silos within teams. The Backlog column becomes the responsibility, and overall ownership, of the Product Owners (PO), Business Analysts (BA) and Project Managers (PM).

The Dev and Test columns belong to their respective disciplines, and things only get into Done when Test say they are done.

This sets the disciplines up in their silos, only doing work that is in their column and handing it off to the next column, rather than jumping into different columns when they can help or pairing on tasks.

Pushing testing towards the end of the delivery pipeline

With the In Test column coming towards the end of the board and normally the one just before Done it starts pushing testing towards the end of the delivery pipeline rather than continuously throughout development. It also starts putting Test in the position of releasing the software rather than it being a team decision.

It almost sets up Test to be the last line of defence before a release.

More work in progress

Another consequence is that Devs start picking up new work from the backlog while the Test team are verifying the previous ticket. If Test find an issue, the ticket has to either go back into the Backlog or straight back into Dev.

Now the team has to decide: do we let the Devs switch context and fix the issue so Test can carry on? Or do the Devs finish their existing ticket and make Test wait, or let Test context switch to pick up something new from the Test column?

Context switching for either side isn’t a good thing, as there is always the usual spin-up time to get familiar with the ticket, which can lead to people taking shortcuts to get the work done quicker.

Testing becomes the bottleneck

The only constraint on the Devs doing more work (tickets) is what can be pulled in from the backlog and the amount they can do. Given the nature of our work (building software solutions), teams always have more Developers than Testers. So teams have a tendency to use the Dev teams at 100% capacity, churning out more and more tickets.

As the Devs begin to get through more work, tickets begin to pile up in the Test column waiting for Testers to become free. This naturally leads to Test becoming the perceived bottleneck in the system as this is where the tickets are being held up.
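
A quick back-of-the-envelope sketch shows how fast that pile grows. The numbers below are made up for illustration and are not measurements from any real team; `simulate_board` is a hypothetical helper that just tracks how many tickets are left waiting in the In Test column at the end of each day.

```python
def simulate_board(days, dev_tickets_per_day, test_tickets_per_day):
    """Return the size of the In Test queue at the end of each day.

    All rates are made-up inputs, not measurements.
    """
    waiting_in_test = 0
    history = []
    for _ in range(days):
        # Devs hand their finished tickets over to the In Test column.
        waiting_in_test += dev_tickets_per_day
        # Test clears as many as it can, never more than are waiting.
        waiting_in_test -= min(waiting_in_test, test_tickets_per_day)
        history.append(waiting_in_test)
    return history


# Hypothetical team: Devs finish 3 tickets a day, Test can verify 2 a day.
print(simulate_board(days=5, dev_tickets_per_day=3, test_tickets_per_day=2))
# prints [1, 2, 3, 4, 5]
```

Even with Test working flat out, the queue grows by one ticket every day, so the column looks like the bottleneck although the real cause is the mismatch between how much work is started and how much can be finished.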

How do you fix the bottleneck?

Three options come to mind. The two that are usually discussed are

1) Add more testers

This is what generally happens if the team or company can hire them or borrow test resource from somewhere else, but for most this isn’t really sustainable or even an option.

2) Automate more of the Test teams work

This usually leads to more automated UI tests which is surely all that the Testers are doing… right?

See UI Automation, what is it good for? (AKA the Automation fallacy) on why automating at this level isn’t going to give you what you want.

The third, which is rarely discussed, is

3) Remove the In Test (and Dev) columns

This may sound strange, as it looks like all you’re doing is hiding the work, so let me explain.

You switch to having an In Progress area instead. This is pretty much any work the team is currently working on, and by this I mean actually doing, not something they were doing and are now waiting on some information for. This should be almost anything: meetings, training, coding, spike work, testing, absolutely anything any of your team members could be doing that day that takes longer than, say, 1 hour, or whatever makes sense for your team to have on the board.

Now, instead of just adding a Dev (or Devs if they pair) to a ticket, also include a Tester as soon as the ticket gets pulled in. As the Devs are getting up to speed with the ticket the Testers should be too, forming a 2-3 way pair/group (I’ll refer to this as a pair for the duration of the post). This pair will stay with the ticket all the way to Done. No handoffs, no queuing up work in other columns. It is their responsibility to see the work to Done, preferably to production, with Testers providing there-and-then feedback to the Devs on how the work is proceeding and helping them to bolster any automated testing, e.g. unit or integration testing.

If you find that the Devs are taking longer on a ticket then it’s probably a sign that the ticket is too big and needs to be broken up into smaller, independently deliverable parts.

Having just an In Progress column and pairing Dev & Test from the start will foster a more collaborative approach to getting the work actually done in a team. It also stops more work being pulled in while other tickets are still in progress, reduces the amount of context switching for all and starts addressing the issue of Testing becoming the bottleneck.

Remember, testing isn’t something that happens once the Devs have finished; it’s something that should be happening continuously.

Update 4/1/20: Looking for details on how to do this? Then check out my post on How to move away from the In Test Column