We live in a world of walls, unfortunately, and some people would like to build even more of them. Whatever you think about that, the walls between software developers and IT operations staff don’t do anybody any favours.
Looking over the wall
It is impossible to begin to learn that which one thinks one already knows.
If you’re a developer, have you ever wondered why ops seem so antagonistic? Here’s why: they’re fed up with your buggy software that doesn’t work in production, and your apparent lack of interest in fixing it.
If you’re an ops person, how do you think you’re seen by developers? The answer is, they think you’re grumpy, unhelpful, resistant to trying new things, and unresponsive to requests for changes. Oh, and you’re a real buzzkill about security.
I’m going to tell you something now which will shock you rigid. The fact is, those folk in the other team aren’t idiots, and they don’t hate you. They’re smart, motivated, and professional, and they’re focused on doing their jobs. But you’re not making it any easier for them. Here are some ideas on how to change that.
People who make music together cannot be enemies… at least while the music lasts.
First, empathise. Understand a bit more about what your colleagues in the other team do, what they care about, and why it matters to them. Second, collaborate. When you work closely with someone, you get a great insight into what it’s like to do their job.
Software developers, get more involved in how your stuff is deployed and run in production. “Throwing code over the wall” won’t fly any more. Your ops friends will help you get a development environment that mirrors production. You can use Vagrant boxes or cloud instances built by the same automation that builds production. No more “it works on my laptop”; when something breaks in production, you will have an identical environment to troubleshoot it.
Ops professionals, you already write and maintain software that runs your infrastructure, so make sure you’re using the same workflows and tools as your friends in development. Get them to do code reviews for you. Their entire working lives are focused on good software engineering practice; there’s a lot you can learn from them.
When there’s a new application or service to be deployed, involve the developers from day one. The software that configures the servers, installs the dependencies, and manages deployments is as much part of the application as the source code itself.
Do pairing. The best way to collaborate with someone is to pair program with them (or pair sysadmin, depending on the task). That means you’re both sitting at the same screen and keyboard, talking about what you’re doing and working by consensus. You might be coding, troubleshooting a problem, or anything else that’s part of your normal work: you’re just doing it together. If a disagreement comes up about what to do, talk it out or take it to a whiteboard. Involve other people if you think they can help. If you don’t have the information you need to solve something, find the person who does, and pair with them to solve it.
Re-thinking your work
I am a man of fixed and unbending principles, the first of which is to be flexible at all times.
New or experimental projects often need a lot of flexibility. If IT can’t offer this to developers, they’ll have to go around IT to get the job done, and that doesn’t spell collaboration. If developers need your help to get virtual machines running, make it so; if they need the ability to spin up cloud instances to test things, make sure they have it. Re-think your priorities as an operations engineer. It might seem like answering questions, helping people, and working with developers is taking time away from your real work. Guess again. That is your real work!
Developers, you might think your job ends with a git push. But software that doesn’t work in the real world is a waste of bits. You need to understand where your code runs in production, how it gets there, how the servers are built, how the cloud provisioning works, what happens when your stuff breaks, and how to fix it. You might think learning about Linux command lines, TCP/IP, and network latency is a waste of your time. Actually, it’s making you a better developer. If you think it’s not your job to know this stuff, you misunderstand what your job is.
The truth is there was never a neat line between dev and ops. The overlap is precisely where things get interesting. Lots of important work simply can’t be done without having a foot in both worlds, and the way to do that is for dev and ops to share their particular fu. If deploys are fragile and often result in unplanned downtime, work on that together. Building a safe, reliable, easy-to-use deployment system is right in the centre of the Venn diagram between dev and ops. If you get that right, much else will follow. If releases pass tests, but fail intermittently in production due to weird edge cases, you’ll need to work together to debug that. If performance is a problem, it takes dev and ops collaboration to fix it.
Closing the loop
Show me a completely smooth operation and I’ll show you someone who’s covering mistakes. Real boats rock.
—Frank Herbert, ‘Chapterhouse: Dune’
Finally, monitoring connects it all together. Monitoring tells ops that the services are up, and it tells devs how the software is performing. Good automated monitoring checks don’t just test that a webserver is responding: they match text strings that prove it’s working; they fetch multiple URLs that exercise different parts of the system; they do queries which verify the whole stack. If the system uses login sessions, the monitoring checks log in and behave like users: searching, filling forms, uploading content. The developers know what needs to be tested, and the ops team know how to write checks that test it. Good monitoring demands empathy and collaboration from dev and ops, and it closes the loop between those who write the software and those who run the software.
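To make that a little more concrete, here is a minimal sketch of a check that fetches several URLs and looks for a text string proving each page really works, rather than just confirming the webserver answered. It is an illustration, not a recommendation of any particular monitoring tool: the names `Check`, `http_fetch`, and `run_checks` are made up for this example, and the fetcher is passed in as a parameter so that the check logic can be exercised without touching a real site.

```python
from typing import Callable, Iterable, List, NamedTuple
from urllib.request import urlopen


class Check(NamedTuple):
    """One monitoring check: a URL plus text that proves the page works."""
    url: str
    expected_text: str


def http_fetch(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL over HTTP and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def run_checks(
    checks: Iterable[Check],
    fetch: Callable[[str], str] = http_fetch,
) -> List[str]:
    """Run every check; return failure messages (empty list means all passed)."""
    failures = []
    for check in checks:
        try:
            body = fetch(check.url)
        except Exception as exc:
            # Timeouts, connection errors, and HTTP error statuses all
            # count as a failed check.
            failures.append(f"{check.url}: fetch failed ({exc})")
            continue
        if check.expected_text not in body:
            # The server answered, but not with what a working page shows.
            failures.append(
                f"{check.url}: missing text {check.expected_text!r}"
            )
    return failures
```

The developers would supply the list of URLs and expected strings, since they know what a healthy page looks like; the ops team would decide how often `run_checks` runs and where the failure messages go.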
“But we already write unit tests!” Great, but monitoring is different. Unit tests demonstrate that your code works in theory. Monitoring tells you whether it’s working in practice. Unit tests show what happens in the failure modes you’ve thought of. The real world will throw you failure modes you won’t believe. Unit tests are essential, monitoring is essential, but they’re not the same.
When monitoring detects that a service is down, that alert needs to go to the person who wrote the service. This is one area where developers can be surprisingly resistant to change. Some people have got used to the idea that their responsibility ends once the code ships. But that’s not the case. If software is breaking in production, fixing it needs to take priority over new features, and that means developers need to get that information directly: ops don’t want to spend their time nagging devs about bugs, and it creates a tension between the teams which is unnecessary and unhelpful.
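One simple way to picture this is an ownership map that the alerting system consults before it pages anyone. The sketch below is hypothetical: `route_alert`, the service names, and the people in the usage example are invented for illustration, and a real alerting tool will have its own way of expressing the same idea.

```python
from typing import Dict, List


def route_alert(
    service: str,
    owners: Dict[str, List[str]],
    fallback: List[str],
) -> List[str]:
    """Return who should be notified when `service` goes down.

    Alerts go to the people who wrote the service; only services with
    no recorded owner fall through to the fallback list.
    """
    recipients = owners.get(service)
    return list(recipients) if recipients else list(fallback)


# Hypothetical ownership map: service name -> the developers who wrote it.
OWNERS = {
    "checkout": ["alice"],
    "search": ["bob", "carol"],
}

print(route_alert("search", OWNERS, fallback=["ops-oncall"]))
print(route_alert("legacy-cron", OWNERS, fallback=["ops-oncall"]))
```

The design choice worth noticing is the fallback: ops only hear about a service first when nobody has claimed it, which is itself a useful prompt to go and find out who should own it.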
A good way to get developers interested and involved in operations is to set up a highly-visible dashboard screen, showing current system status and uptime. If all is well, the board is green. When there’s a problem, that should be visible to all developers, and when there’s an outage, developers should be getting paged. It’s amazing how being on-call for your own stuff concentrates the mind… on fixing it.
Tearing down the wall
Outside ideas of right doing and wrong doing there is a field. I’ll meet you there.
When devs and ops collaborate, good stuff happens, so start breaking down the walls and coming out of your boxes. The devs learn about how to deploy, run, and monitor services at scale. Ops learn good coding practices, the power of pair programming, and how to build software as a team. We all learn how to be better at our jobs, how to be less defensive, and how to be more empathetic. Don’t wait for the management memo. Just start today. We’re going to build a wonderful bridge, folks. I’ll meet you on the other side.
I should point out that none of the ideas in this piece originate with me; many others have advanced them, possibly more effectively, in talks, tweets, presentations, and books over the last few years. I’m thinking particularly of Jennifer Davis, Katherine Daniels, Gene Kim, Jez Humble, John Willis, and of course Patrick Debois. You should absolutely follow them all on Twitter (and buy their books!)
In answering the question: ‘What does the red spectrum tell us about quasars?’, there are various words that need to be defined. What is a spectrum? What is a red one? Why is it red? And why is it so frequently linked with quasars? …What the hell is a quasar?
— Rimmer, “Red Dwarf”
What is Devops?
Everyone knows what they think it means, but they all have something different in mind. Not surprisingly, many of the conversations about Devops end up being arguments at cross-purposes.
Things people think Devops means:
- Developers who can install Linux
- Sysadmins who can write code
- Connecting operations to business
- “The Cloud”
- Infrastructure as code
- Agile operations
- Cross-pollination of skills between coders and systems hackers
- Continuous integration and deployment
- A weekly hostage exchange programme between development and operations teams
- A vague, warm, fuzzy sense of co-operation. Can’t we all just get along?
- A pragmatic recognition that we need to work together to solve common problems, in order to achieve common goals
Some people see it as a revolution in the way we do operations and development. Other people talk it down as something that good sysadmins were doing anyway, and that the technical proletariat have only lately caught on to. For many small organisations, what they’re doing must be Devops because there’s only one person doing all these jobs anyway. For big organisations at the other end of the scale, where the team that manages switches must fill out an online request form to communicate with the team that makes firewall changes, Devops-style collaboration seems like an impossible dream.
The dream of Devops
The dream is that one day we as a technical people will rise up and live out the true meaning of our creed:
“We hold these truths to be self-evident, that all geeks are created equal”.
To the progressive optimist, the idea of programmers and sysadmins batting for the same team seems obviously desirable. To the conservative cynic, it seems obviously doomed. Both are half right. The short history of computing has seen the rise of two very different cultures: the people who program the machines, and the people who keep the machines running sweetly.
From spliffs to submarines
My first IT job was in the kind of small and supremely relaxed Web software company characteristic of the dotcom boom. Developers wandered into work when they felt like it and left when the pizza ran out. They were artists, a creative élite. Many liked to code high, smoking spliffs at their desk, and the biggest clouds of smoke came from under the door of the CTO’s office. It was like a cross between the movie ‘The Social Network’ and being on tour with the Grateful Dead.
Fast-forward to a few years later, with suitable cinematic effects, and I am working for a large and respected US enterprise hosting firm, staffed chiefly by ex-military types, specifically veterans of the submarine service. Everything runs by the numbers. There is a procedure for everything, and everyone knows their place. If you break military discipline, you can expect a short, sharp shower of rebuke. Innovation and original thought are frowned upon, and spliffs are about as welcome as they would be on a nuclear submarine. Light up a reefer in the data centre, and Halon asphyxiation will be the least of your problems.
The world has changed
In the software studio, we were sculpting something creative, unique, and original, and we were pampered artists. In the server mines, we were running a large and expensive commercial operation, and we were what the hosting industry sarcastically calls ‘intelligent hands’.
The world has changed. People will no longer cut you cheques in exchange for Internet dreams and hash-scented promises of code that will ship ‘real soon now’. Virtualization of the hosting industry, combined with ruthlessly dwindling profit margins, has largely eliminated semi-skilled data centre jobs that require only the ability to rack servers at 4am and survive on bad vending machine food.
Anyone with a pinch of Rails-fu can hack together a social web app with a big colourful icon, push it to Heroku before lunch, and spend the rest of the day wondering how to monetize it. Suddenly you don’t need to be a big company to make software, or to put it in front of millions of people. The software business has moved from the old days of big Everest expeditions with hundreds of Sherpas and tons of supplies, to an Alpine-style business model: small teams, moving fast, with minimal support and overheads, are the quickest to the summit.
Inside the tent, peering out
So now the programmers, the builders of castles in the air, and the operators, who make sure the castles don’t fall down, find themselves all in the same tent at 29,000 feet wondering whose job it was, metaphorically, to bring the tin opener. Coders whose idea of deployment is ‘well, it works on my Mac – I’m off snowboarding’ now have to talk to Bastard Operators From Hell who previously regarded themselves as exclusive guardians of the secret flame of Unix. Technical people everywhere are starting to find that ‘they’ aren’t so very different from ‘us’ after all.
Talking together is a great start, and working together is even better. Thinking together is best of all. The discipline of pairing, where two people share a screen and pass a keyboard back and forth, coding, designing, building, automating, configuring, scripting, testing, fixing, deploying, monitoring, releasing, is probably the most efficient way to share knowledge short of a Vulcan mind-meld.
When two tribes… have lunch
If you’re really doing Devops, integration shouldn’t begin and end with the org chart. Exchange programmes merely reinforce the notion that the other team is a foreign country. We should stop thinking of Dev and Ops as rival tribes, and pool our resources to defeat the real enemy (Marketing).
When coders and sysadmins work together routinely to get their jobs done, and when teams include multi-skilled people who like to learn as well as teach, and when people no longer say ‘That’s not our problem – talk to the sysadmins’, or ‘Don’t blame us – the programmers screwed up’, and when management starts to recognise that joined-up thinking equals good business, then we will at last be able to say that we truly live in a State of Devops.
This article originally appeared as a guest post on the Agile Web Operations blog.
It’s Sysadmin Appreciation Day, so here is my personal list of the most interesting and influential sysadmins and devops folk that I know (in alphabetical order, not order of merit). If you are on Twitter, you need to be following these people, and all of them are also excellent bloggers as well as awesome sysops / devministrators / tech gods.