hinkley 5 years ago

> Parallelize

This was my theory early in my journey, mostly before CI/CD had become a thing that even a few people were doing. My enjoyment of concurrency didn't survive contact with my peers and subordinates. I was frequently stuck working on certain types of problems, because either I had caused them or there were few other people who could understand them, and not all of those people were patient.

It's really easy to design a concurrent system that only 20% of the people on your project will properly understand. You are doing deep violence to your team if you are creating pervasive parts of the system that most of them can't understand.

In a CI pipeline the first concurrency problem is representational: How do I present the progress of 4 tasks happening in parallel when one of them is failing?

The status and behavior of the CI pipeline need to be obvious to all. If the CI tool has an answer for this problem (like parallel tasks or jobs), then use it if it saves you time. Otherwise, I don't know of a good way to report on the progress of simultaneous tasks, and it's better not to try than to try and fail. Race conditions in your build can sometimes take hundreds of runs to become obvious, and by then it's difficult to roll things back.
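
For a sense of the fan-out-and-report shape in question, here is a minimal sketch (hypothetical stage names and make targets; any real pipeline would differ): run the tasks in parallel, capture their output rather than interleaving it, and print one consolidated report with failures called out at the end.

    import concurrent.futures
    import subprocess

    # Hypothetical pipeline stages; substitute whatever your build actually runs.
    TASKS = {
        "lint": ["make", "lint"],
        "unit-tests": ["make", "test"],
        "build": ["make", "build"],
        "docs": ["make", "docs"],
    }

    def run_task(name, cmd):
        # Capture output so four tasks don't interleave their logs on the console.
        proc = subprocess.run(cmd, capture_output=True, text=True)
        return name, proc.returncode, proc.stdout + proc.stderr

    with concurrent.futures.ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda item: run_task(*item), TASKS.items()))

    # One consolidated report: every task listed, failures printed last and loudly.
    for name, code, _ in results:
        print(f"{'PASS' if code == 0 else 'FAIL'}  {name}")
    for name, code, output in results:
        if code != 0:
            print(f"\n--- {name} failed (exit {code}) ---\n{output}")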

  • petegoo 5 years ago

    We developed our own representation of this in Slack. It's just a tool to optimise throughput, but you're right: you need to be able to represent it and surface failures early. For us these are separate builds, chained together with fan-out and fan-in.

    Parallelizing within a test suite / build is a whole other thing and yes, there be dragons.

vi1rus 5 years ago

As a DevOps guy, I find the biggest hurdles are dev education and stubborn management.

Right now 90% of the end to end tests could have been run during unit testing. Instead they are run after a full code deployment. This adds an extra hour of testing. :(

  • hinkley 5 years ago

    End to end tests seem to be a crutch.

    Of the people I've observed learning testing, the ones that do e2e tests early pick up habits that they can't seem to unlearn. And the existence of the e2e tests seems to block prioritization of architectural changes to make unit and functional tests more effective.

    And the frameworks are never what I would call reliable. You can do work to remove race conditions from them, but it takes tremendous discipline (if a tool is wrong by default, that to me means it is using the wrong metaphor).
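
    As a rough illustration of the kind of fix involved (a hypothetical Selenium sketch; the URL and element id are made up): the wrong-by-default instinct is a fixed sleep, which races against however long the page takes, while an explicit wait pins the test to an observable condition.

        from selenium import webdriver
        from selenium.webdriver.common.by import By
        from selenium.webdriver.support.ui import WebDriverWait
        from selenium.webdriver.support import expected_conditions as EC

        driver = webdriver.Chrome()
        driver.get("https://example.test/checkout")  # made-up URL

        # Flaky by default: a fixed sleep is only a guess at how long the page needs.
        #   time.sleep(2); driver.find_element(By.ID, "buy-now").click()

        # Disciplined version: block on an observable condition, not a guess.
        button = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.ID, "buy-now"))
        )
        button.click()
        driver.quit()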

    These days I try to keep people focused on unit tests until they run out of runway.

    • mikekchar 5 years ago

      > End to end tests seem to be a crutch.

      Generally I agree. If you have good unit tests that fail when behaviour changes, your end to end tests do not need to be run all the time (or, often, ever).

      The program as a whole operates in certain ways. Ideally, its operation is governed in a deterministic way by the behaviour of its parts. End to end tests usually can't give you full test coverage because the number of permutations of behaviour is too high to specify. However, the behaviour of single units can be exercised fully in most cases -- especially if you write the code to be easy to test (factor out branch points and loops, expose state so that you can test for changes in state, etc.). If the behaviour of the units doesn't change, the behaviour of the final program also won't change. This means that you only have to test the behaviour of the full program in relation to the change of behaviour of the units. Often this can be done with manual testing while the units are being modified.
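
      A tiny hypothetical example of that "factor out branch points" idea: once a decision lives in a pure function, its whole behaviour space is small enough to enumerate in unit tests, so the surrounding program only needs occasional end to end checks.

          # A decision point factored out of a larger workflow. Pure, so its whole
          # behaviour space (light/heavy x standard/express) can be covered exhaustively.
          def shipping_cost(weight_kg: float, express: bool) -> float:
              base = 5.0 if weight_kg <= 1.0 else 5.0 + 2.0 * (weight_kg - 1.0)
              return base * 2 if express else base

          # Unit tests over every branch permutation; no end-to-end run required.
          def test_shipping_cost():
              assert shipping_cost(0.5, False) == 5.0    # light, standard
              assert shipping_cost(0.5, True) == 10.0    # light, express
              assert shipping_cost(3.0, False) == 9.0    # heavy, standard
              assert shipping_cost(3.0, True) == 18.0    # heavy, express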

      In other words, individual stories need to have acceptance tests (which can be manual or automated), but they usually only have to run when merging into the trunk branch. You don't usually have to run them again on deployment (or on any subsequent branch merge -- as that branch will have its own acceptance tests).

      Some end to end tests might be a good idea depending on the architecture of your system, or the cost of failure. For example, if you have several disparate systems that are coordinating (e.g. lots of micro services), then it's pretty hard to do acceptance testing when you are doing development. It's one of the massive downsides to doing development that way. Additionally, it's usually a good idea to have end to end tests if a user's safety could be compromised. Or possibly making sure your "Buy Now" button works in various different scenarios. Relying on end to end tests for your entire application is usually a losing proposition because of the burden to get acceptable test coverage and fast development turnaround.

    • claytoneast 5 years ago

      Do you have any good books you'd recommend on what you feel is the proper approach to testing? I feel that I reach for feature/e2e tests first, when perhaps I really should be building up a solid base of unit specs before moving on. I'm always a little unsure which specs I should have vs. which are unnecessary.

      • teeray 5 years ago

        There are some really fantastic testing resources in the Ruby community. The talk that most influenced my approach to building unit tests (even though I write Go these days) was one of Jim Weirich’s: https://youtu.be/983zk0eqYLY

        I’ve found that his zero-knowledge approach gives me a suite of tests that have high signal when they fail.

        Sandi Metz has also spoken extensively on testing, and I particularly like her advice on using mocks in tests appropriately. This talk of hers on the subject comes to mind: https://youtu.be/URSWYvyc42M

      • hinkley 5 years ago

        Testing is hard. Something I repeat often to frustrated testers. At least once a quarter I wonder if maybe something this hard should be accomplished another way. Like maybe Bertrand Meyer was right 30 years ago about contracts and pre/post conditions.

        For years I felt like a terrible tester on a team of even worse testers. I still catch myself using antipatterns I tell other people not to use.

        Honestly, the two biggest things I know are to try to write pure functions, and to separate deciding from doing (these are not mutually exclusive), something Meyer also talked about. That will open up a lot of your code to simple tests with few mocks.
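
        A minimal sketch of what that split can look like (hypothetical names, not from any particular codebase): the "deciding" half is a pure function that needs no mocks, and the "doing" half is a thin shell around the I/O.

            # Deciding: a pure function that only computes what should happen.
            def should_retry(attempt: int, status_code: int, max_attempts: int = 3) -> bool:
                return status_code >= 500 and attempt < max_attempts

            # Doing: a thin shell that performs the I/O, kept too small to need much testing.
            # http_get is injected, e.g. requests.get or a stub.
            def fetch_with_retries(url, http_get, max_attempts: int = 3):
                for attempt in range(1, max_attempts + 1):
                    response = http_get(url)
                    if not should_retry(attempt, response.status_code, max_attempts):
                        break
                return response

            # The decision logic is testable with no mocks, network, or framework.
            def test_should_retry():
                assert should_retry(1, 503) is True     # server error, attempts left
                assert should_retry(3, 503) is False    # out of attempts
                assert should_retry(1, 404) is False    # client error: don't retry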

        Book advice is going to sound like a non sequitur. I find bad habits in many of the testing books I’ve read, as if the author hasn’t root-caused common problems that pop up on real teams. You can’t blame human frailty for your testing problems. Testing is a monument to human fallibility. Own it.

        You can fix a lot of classes of problems with Refactoring by Fowler. There’s a second edition due out in a few months. I hope it maintains the value judgements and thought processes of the original.

MattPearce 5 years ago

Neat to see an article from Pete on here - I worked with him at Pushpay; he's brilliant. The CI/CD pipeline the SREs built there was a joy to work with (and that's quite a compliment for a CI/CD pipeline!)

  • kornish 5 years ago

    Out of interest, what was the stack and what were some of the aspects that made the pipeline so great to work with?

    • MattPearce 5 years ago

      They're a Microsoft shop, so it was C#, SQL Server, RabbitMQ, etc., running on AWS.

      What I liked the most about the pipeline:

      - Speed - we spent a lot of time (as the post says) optimising the process of getting changes into production, and making it as streamlined as possible

      - Safety - automated testing caught a huge percentage of the issues, meaning we were able to fix them earlier and avoid the turnaround time of finding out later in the process. Tests included visual diffs of many of the pages, approval tests to check contracts and routes didn't change (sketched below), etc.

      - Transparency - while there are obviously differing opinions on ChatOps, it was great to be able to scroll back through Slack history on the shipping channel and see a complete record of the pipeline for a particular deploy, seeing the execution of the automated steps interwoven with the conversations of the team members working on it. It was also great being able to see the shipping queue at all times so you could take a look and judge how long it would take to get a change through, and could negotiate with others if you needed to jump ahead of them etc.

      - Focus on having everyone involved - everybody was involved in reviewing, merging, etc. The aim with a new hire was to get them to complete a change and deploy it to production themselves within their first week. If you were the first person on a "carriage" it was your responsibility to "drive" it and to judge the risk factor, whether to allow other specific changes in the carriage, etc. This meant everyone spent a lot of time thinking about how to reduce risks in their PRs (smaller PRs, more tests, always feature flagging, etc) which was much healthier (IMHO) than having one or two people in the team being responsible for merging or deploying etc and having all the responsibility.

      Some of it is cultural as well - Just Culture (blameless postmortems, etc.), being brutally honest (radical candor), and being willing to continually refactor processes.
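
      A rough, language-agnostic sketch of the approval-test idea mentioned above (their actual stack was C#; the helper, fixture, and file names here are hypothetical): snapshot the route table and fail whenever it drifts from the approved copy.

          from pathlib import Path

          # Hypothetical: however your framework enumerates its registered routes.
          def current_routes(app) -> str:
              return "\n".join(sorted(app.registered_routes))

          # "app" is assumed to be a test fixture providing the application under test.
          def test_routes_are_approved(app):
              approved = Path("approved_routes.txt")
              received = current_routes(app)
              if not approved.exists() or approved.read_text() != received:
                  # Leave the received snapshot on disk so a human can diff it and,
                  # if the change is intentional, promote it to the approved file.
                  Path("received_routes.txt").write_text(received)
                  raise AssertionError("Route table changed; diff received_routes.txt "
                                       "against approved_routes.txt")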

  • rjbwork 5 years ago

    I worked with him on a HuBot clone a number of years back. Seems like a good guy.

  • petegoo 5 years ago

    Thanks Buddy :)