I recently lost about two-and-a-half days to unit/integration tests. At Mighty Networks, we are pretty proud of our test coverage, and we make writing tests part of the development process. Developers are required to write tests for every feature they implement, but in the past few days I’ve seen that this policy needs to be applied flexibly.
A few months ago we wrote an extensive integration with the iTunes Store. Because we allow people to sell subscriptions through our app, the integration had to be fairly complex. Apple’s developer APIs are notoriously crappy, so this required a full team effort. One developer wrote a series of tests for our Apple integration.
In theory, the tests are very thorough. But getting real data for testing is virtually impossible. So the developer faked up a JSON file, then wrote a preprocessor to generate fake data in a format that looked like Apple’s format. Then he wrote tests.
You may see where this is going already.
The tests he wrote essentially tested his preprocessor. Rather than testing the actual methods used in the integration with Apple, the tests looked at the values generated by the preprocessor. Essentially, by writing a clever object to fake Apple data, he removed the actual integration from the tests.
The tests looked correct. They seemed to show that our Apple code worked. But really they were mostly testing the test code itself.
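To make the difference concrete, here is a minimal Ruby sketch of the two kinds of check. It is not our actual code: `FakeReceiptBuilder`, `SubscriptionParser`, and the JSON field names are hypothetical stand-ins I made up for illustration. The first check only proves that the fake-data builder echoes its input; the second uses the fake data purely as input and asserts on what the production code does with it.

```ruby
require "json"

# Hypothetical stand-in for the test helper that builds Apple-like data.
# The field names are illustrative, not a verified copy of Apple's format.
class FakeReceiptBuilder
  def self.build(product_id:, expires_at:)
    JSON.generate(
      "product_id" => product_id,
      "expires_date_ms" => (expires_at.to_f * 1000).to_i.to_s
    )
  end
end

# Hypothetical stand-in for the production code that should be under test.
class SubscriptionParser
  def self.active?(receipt_json, now: Time.now)
    receipt = JSON.parse(receipt_json)
    expires_at = Time.at(Integer(receipt.fetch("expires_date_ms")) / 1000.0)
    expires_at > now
  end
end

now = Time.at(1_700_000_000)
fake = FakeReceiptBuilder.build(product_id: "monthly", expires_at: now + 3600)

# Anti-pattern: this passes as long as the builder echoes its input back.
# SubscriptionParser is never called, so the integration is untested.
builder_only_check = JSON.parse(fake)["product_id"] == "monthly"

# Better: the fake data is only the input; the assertion is about
# what the production code decides.
integration_check = SubscriptionParser.active?(fake, now: now)
```

Both checks come out green here, which is exactly the trap: only the second one would turn red if the parsing logic broke.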
So when I modified a related system and added a few tests, I suddenly saw a massive cascade of failures all over the place. The failures were of different types, too. Sometimes there was a null value, or an unexpected ID, or an error seemingly from Apple.
It took me a day to figure out that the Apple integration itself wasn’t failing. Rather, the preprocessor wasn’t set up to actually work with the rest of the system. Then I took another day-and-a-half to pull out the worst part of the system and replace the non-tests.
I don’t blame the developer who wrote the tests. It’s a common mistake, and we have all made it at least once.
In part I blame Rails, because it encourages black-and-white thinking about software development.
The developer followed the rule that he needed to write tests for every new feature. When he integrated with Apple, he diligently wrote his tests.
The problem arose when he realized that he couldn’t run production code to get real data. He didn’t know how to write a test for the algorithm, so he wrote code that generates Apple-like data, then tested that.
The developer failed to see that writing tests is a guideline, rather than a rule. In this case, it is very difficult to test every part of the integration. It’s acceptable to write tests of the core process, without testing specific return values and specific pieces of data. The tests gave the impression of working code, and full test coverage. But they hid a few problems with the integration by testing for specific values, rather than algorithmic correctness.
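Here is one way to read “testing the core process” in Ruby, as a rough sketch rather than a prescription. The `latest_expiry` method and the `expires_date_ms` field are hypothetical examples, not code from our integration. Instead of comparing against one hard-coded value from a fixture file, the check asserts a property that must hold for any input: the result is one of the inputs, and no input exceeds it.

```ruby
# Hypothetical core step: pick the latest expiry from a list of receipt hashes.
def latest_expiry(receipts)
  receipts.map { |r| Integer(r.fetch("expires_date_ms")) }.max
end

# Check a property of the algorithm instead of one specific fixture value:
# the result must be one of the inputs, and no input may be larger.
def latest_expiry_holds?(receipts)
  result = latest_expiry(receipts)
  inputs = receipts.map { |r| Integer(r.fetch("expires_date_ms")) }
  inputs.include?(result) && inputs.all? { |ms| ms <= result }
end

rng = Random.new(42) # fixed seed so any failure is reproducible
samples = Array.new(50) do
  Array.new(rng.rand(1..5)) { { "expires_date_ms" => rng.rand(1..10**13).to_s } }
end

# True only if the property holds for every generated sample.
all_hold = samples.all? { |receipts| latest_expiry_holds?(receipts) }
```

A check like this says nothing about whether Apple’s real responses look like the generated hashes, so it complements, rather than replaces, whatever limited testing against real data is possible.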
So what can we do?
Senior developers need to encourage junior developers to talk about problems that arise when following “the rules.” Senior developers need to encourage an environment where it’s okay to admit that a portion of the code just can’t be tested, or at least that it can’t be tested in the same way as most of the code. Senior developers need to encourage critical thinking and analysis in situations where a strict interpretation of the rules may not lead to the best results for the development process.