Tuesday, March 13, 2018

Testing Tour Stop #3: Pair Debugging with Thiago

My testing tour continues! Today I had the pleasure of pairing with Thiago Amanajás again. In our last session we worked on automating core scenarios of his team's product. In the meantime, Thiago had implemented further cases, but now some of the tests were failing. So we decided to investigate and fix them first.

Setting the Stage

The retrospectives of my previous two sessions showed that collaboration styles matter a lot. I resolved to clarify some things right at the beginning of the session. I asked Thiago if he would be up for strong-style pairing, with one of us as the driver on the keyboard taking care of the implementation, and the other as the navigator leading the way to the solution. And he was! We decided to switch roles every four minutes. Furthermore, we agreed to try out a mob programming timer tool showing us directly on screen when it's time to take turns, so that neither of us would need to take care of stopping and restarting a timer while pairing.

Debugging automation. Exploring? Infrastructure issues!

As agreed, we started by having a look at the failing tests. After running the first failing scenario, which repeated the same test for all available domains, we discovered that one of the test systems was not available and thus caused the test to fail. We asked a developer about it and found out that the system was up and running but not reachable from the office, and that this was a known issue. Now that's interesting, we thought: how were they supposed to test this domain then? Later on we learned that the system was not accessible via wifi; via a LAN connection it was.

Well, for now we excluded the affected domain so we could see if something else was going on here. And by doing so, we found a logic error in the automation code. Great! Fixed it, ran the test again, and - yet another failure, a different domain was complaining now. Okay, let's see. The code compared a given URL with the found URL and reported that they differed. Interesting. We dived deeper and saw that many elements were identified using complicated xpath locators. Hm, in my experience they are not only hard to read, but also tend to break when the implementation changes, and are not known to be the fastest either. So I asked: why not use IDs? And found that the application code offered hardly any IDs. Testability could really be improved here! For now, we introduced IDs where we could and returned to our failing test.
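The fragility of structural xpath locators can be demonstrated without a browser. The following sketch uses Python's standard library XML parser (which supports a limited XPath subset) on a simplified, hypothetical page: after a refactoring wraps the button in an extra div, the structural locator breaks while the id-based lookup still works.

```python
import xml.etree.ElementTree as ET

# Two versions of the same (hypothetical, simplified) page: in the second,
# a refactoring added a wrapper <div> around the button.
before = "<html><body><form><button id='submit'>Go</button></form></body></html>"
after = "<html><body><form><div><button id='submit'>Go</button></div></form></body></html>"

# A structural xpath locator encodes the exact element hierarchy ...
structural_xpath = "./body/form/button"
# ... while an attribute-based locator relies only on the stable id.
by_id_xpath = ".//button[@id='submit']"

for markup in (before, after):
    root = ET.fromstring(markup)
    print("structural locator found:", root.find(structural_xpath) is not None,
          "| id locator found:", root.find(by_id_xpath) is not None)
```

In Selenium the same idea applies: a lookup via By.ID survives markup reshuffling that breaks a hierarchy-based By.XPATH locator, which is one reason adding IDs improves testability.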

We improved the log output to see the actual difference in URLs, ran the test again, and - it failed. We checked and realized that this domain was not accessible either! But wait, we didn't see the log output we had added. So we decided to run the test again. It failed again. But not where we expected it to fail! It failed for a domain which had just worked some minutes ago. What was going on here? Our freshly improved URL logging now showed us a really unexpected redirection. We checked the site and found it was marked as insecure due to an invalid certificate. Strange, that hadn't happened before! We looked at the dates and found that serendipity had helped us uncover an issue. When we started the test, the certificate was still valid; just when the test reached that domain, the certificate turned invalid - we had hit the exact timestamp like a bull's eye!

This made us check the certificates of other domains as well. And we found lots of interesting inconsistencies. Domains running on a certificate for a different domain. A domain with a soon-to-expire certificate. A domain with a certificate only valid for five days! Here we struggled with the localization of Chrome. We used the Portuguese (Brazil) version of the browser, and the date labels in the certificate details said "Inválido Antes de" (invalid before this date) and "Inválido Após" (invalid after that date). This didn't feel intuitive to us. Why state when it's invalid instead of when it's valid? So we checked the German version and saw that the first label was translated differently, as "certificate is valid from this date on". Interesting how different localization approaches can help or impede our understanding.
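Those two labels correspond to the X.509 fields usually called "Not Before" and "Not After", which together define the certificate's validity window. A minimal sketch of the check we did by hand, using Python's ssl module to parse the date format certificates use (the dates, the current time, and the two-week warning threshold are made-up examples):

```python
import ssl

def cert_status(not_before: str, not_after: str, now: float) -> str:
    """Classify a certificate's validity window at a given point in time.

    The date strings use the format found in certificates, e.g.
    "Mar 13 10:00:00 2018 GMT", which ssl.cert_time_to_seconds can parse.
    """
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        return "not yet valid"
    if now > end:
        return "expired"
    if end - now < 14 * 24 * 3600:  # arbitrary threshold: warn two weeks ahead
        return "expiring soon"
    return "valid"

# Hypothetical certificate that is only valid for five days, as in the session:
print(cert_status("Mar 12 00:00:00 2018 GMT",
                  "Mar 17 00:00:00 2018 GMT",
                  ssl.cert_time_to_seconds("Mar 13 14:00:00 2018 GMT")))
# → expiring soon
```

Running such a check regularly over all test domains would have caught the certificate that expired mid-test-run before it caused a confusing failure.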

In the end, we summarized our findings. After clarifying some issues with his team, Thiago resolved to follow up on three things.
  1. Upgrade the infrastructure platform to handle more than ten certificates for the test domains and thus avoid the observed issues
  2. Make the test systems for the affected domains accessible via internal wifi
  3. Create ID attributes for the frontend elements to increase testability

How was it this time?

At the end of our 90-minute timebox we again did a short retrospective. We both liked the strong-style pairing a lot, also with the frequent rotation. We had the feeling we shared and learned more this way, especially compared to our last session.

What really annoyed us, however, was the mob timer we chose to try. The setup was easy and the popup indicating the rotation was gentle and not disturbing; however, we had to restart the timer again for each and every rotation. We changed roles pretty smoothly from the beginning, just grabbing the keyboard and continuing quite naturally as soon as the popup was displayed and the bell rang. Therefore, having to actively go to the timer to confirm the next round was quite distracting. We would have loved the timer to notify us but simply continue counting down the clock, so we could stay focused on the task at hand. For my next session I'll definitely look for another timer with that feature.
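The timer behavior we wished for is simple to sketch: notify at every rotation, but keep the clock running without waiting for confirmation. A minimal, hypothetical version in Python (the interval and callback are placeholders; a real tool would play a sound or show a popup):

```python
import threading
import time

def start_rotation_timer(interval, rounds, notify):
    """Run the rotation clock on a background thread: it notifies at every
    turn but keeps counting down automatically, no confirmation needed."""
    def clock():
        for turn in range(1, rounds + 1):
            time.sleep(interval)
            notify(turn)  # e.g. ring a bell / show a popup; the clock keeps going
    t = threading.Thread(target=clock, daemon=True)
    t.start()
    return t

# Demo with a tiny interval; a real pairing session would use 4 minutes (240 s).
rotations = []
timer = start_rotation_timer(interval=0.01, rounds=3, notify=rotations.append)
timer.join()
print(rotations)  # → [1, 2, 3]
```

Because the clock lives on its own thread, the pair can grab the keyboard and carry on the moment the notification fires, exactly the hands-off rotation we were missing.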

All in all, we deviated from our path quite a bit, as is so often the case when you start testing something and that something behaves in a completely different way than you would have expected. So it needs debugging. And then you stumble across the next issue. And the next. You really wonder why it behaves like this, or shows those values. You end up at a completely different place than intended, but having identified essential problems that need to be taken care of. Exactly what we did, all the while having lots of fun together! It was awesome.

This session showed me again how broad knowledge can be invaluable for a tester, just as much as really specific core expertise. Being "all over the place" enables you to combine the tools you came across and the skills you learned. Like writing automation to increase your regression safety net. Like using automation to explore strange system behaviors. Like applying infrastructure knowledge to identify the real issues. We cannot know it all, but every single additional piece of knowledge helps us with testing.

Want to join me on my testing tour?

After experiencing some issues when scheduling my next sessions, with lots of messages back and forth, I decided to give Calendly a try, as I had heard a lot of good things about it. The free account offers one event type and lets you integrate the tool with one calendar like Google or Office 365 to show your availability, as well as define additional rules for scheduling. Really important for our international community: it also solves the time zone confusion, as everyone sees the times in their local time.

So in case you like the idea of hands-on practicing testing together and learning from each other, please don't hesitate and schedule a pairing session with me! Looking forward to it :-)
