The ultimate software development tool

Best practices on project management, issue tracking and support

Author: Hamzah Tariq

Using Forensic Psychology to Spot Problems in Your Code – Interview with Adam Tornhill


We interviewed Adam Tornhill, a software architect who combines degrees in engineering and psychology to get a different perspective on software. We discuss his book ‘Your Code as a Crime Scene’, in which he describes using forensic psychology techniques to identify high-risk code, defects and bad design. He describes how his techniques are useful by themselves, but can also be used to make other practices like code reviews and unit testing even more effective.

Adam’s Code Maat software is on GitHub.

Content and Timings

  • Introduction (0:00)
  • About Adam (0:31)
  • Detecting Problem Code (1:43)
  • Improving the Software Architecture (3:00)
  • Social Biases and Working in Groups (4:02)
  • Working with Unit Testing and Code Reviews (7:20)
  • Tools and Resources (8:32)

Transcript

Introduction

Derrick:
Adam Tornhill is a Programmer and Software Architect who combines degrees in engineering and psychology to get a different perspective on software. He’s the author of “Lisp for the Web”, “Patterns in C”, as well as “Your Code as a Crime Scene”, in which he describes using forensic techniques to identify defects and bad design in code.

Adam, thank you so much for taking the time to join us today. I really look forward to hearing what you have to say. Why don’t you say a bit about yourself?

About Adam

Adam:
Oh yeah, sure. It’s a pleasure to be here. I’m Adam Tornhill and I’m from Sweden, which is why I have this wonderful accent. I’ve been a programmer for a long time. I’ve been doing this for almost two decades now, and I still love what I do.

Derrick:
I wanted to touch on one of your books, “Your Code as a Crime Scene” where you apply forensic psychology techniques to software development. What made you think to apply techniques from what would seem to be such an unrelated field?

Complexity metrics, the old kind, they just didn’t cut it

Adam:
It actually came as a bit of a surprise to me as well. What happened was that some years ago I was in the middle of my psychology studies, working towards my master’s degree, when I got hired into architectural roles where I had to prioritize technical debt and identify problematic areas. I found it terribly hard to do that, because complexity metrics, the old kind, just didn’t cut it.

When I tried to talk to the different team members and developers, I found out that nobody seemed to have a holistic picture. It was virtually impossible to get that picture out of a large code base.

At the same time, I took a course in criminal psychology. I was just struck by the similarities in the mindset to what we needed to have in the software business. That’s how it all started. I started to think, how can we apply this stuff to software?

Detecting Problem Code

Derrick:
I wanted to jump into a couple of the techniques you cover. How can forensics psychology help us to detect problematic code?

Adam:
The first kind of technique that I would like to introduce, and that’s actually where I started, is a technique I call a Hotspot Analysis. It’s based on a forensic concept called Geographical Offender Profiling.

I find it really fascinating. What forensic psychologists do is basically try to spot patterns in the spatial movement of criminals, to detect their home bases, so we know which area to inspect.

I thought, what if we could do the same for software? That would be pretty cool. If we could take a large code base and somehow spot the spatial movement, not of criminals, but of programmers, then we could identify the parts of the code that really matter.

That’s where it started. I began to think about where I could find that information, and I realized it was right there in front of us, in the version control system. A version control system basically records every interaction we have with the code.

So I mined source code repositories and looked for code with high change frequencies, then I overlaid that with a complexity analysis. That allows us to identify the most complicated code that is also the code we have to work with often, which is a hotspot. So that’s where it all started, and that’s the starting point for the rest of the analyses, basically.
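The hotspot idea, change frequency from the version-control log multiplied by a complexity proxy, can be sketched in a few lines. This is only an illustration, not Adam's Code Maat: the commit data and file names are made up, and lines of code stands in for the complexity measure.

```python
from collections import Counter

def hotspots(change_log, complexity):
    """Rank files by change frequency (mined from the version
    control log) multiplied by a complexity proxy such as
    lines of code; the biggest score is the top hotspot."""
    revisions = Counter(path for _, path in change_log)
    return sorted(
        ((path, revisions[path] * complexity.get(path, 0))
         for path in revisions),
        key=lambda item: item[1],
        reverse=True,
    )

# Hypothetical sample data: (commit id, file touched) pairs,
# plus a lines-of-code measure per file.
log = [("c1", "core.py"), ("c2", "core.py"), ("c3", "core.py"),
       ("c2", "util.py"), ("c3", "readme.md")]
loc = {"core.py": 400, "util.py": 120, "readme.md": 30}

print(hotspots(log, loc)[0])  # core.py scores 3 changes * 400 LOC
```

In a real analysis the `log` pairs would come from something like `git log --name-only`, but the ranking step stays this simple.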

Improving the Software Architecture

Derrick:
You also say that you can use your techniques to help improve the architecture of applications. How is that so?

Adam:
First we have to consider what an architecture actually is. A fundamental idea in my book is that software is a living entity. It constantly changes, evolves and sometimes degrades.

So I came to see architecture more as a set of guiding principles that help that evolution happen in a successful way. Just to give you a concrete example, consider microservices; they seem to be all the rage right now.

That means that tomorrow’s legacy applications will be microservices. In a typical microservice architecture, each microservice should be cohesive and independent, so that’s basically one architectural principle.

What you do is try to measure and identify violations of that principle. Again, I look at the evolution of the codebase in the source code repository, and I try to identify, in that case, multiple services that tend to change at the same time, because that’s a violation of your architectural principles. That’s one way to do it.

Social Biases and Working in Groups

If you want to truly improve software, we need to look beyond technique and look at the social side

Derrick:
You also cover how “most studies on groups of people working together find that they perform below their potential”. Why is this and what are some of the problems teams can experience?

Adam:
Social Psychologists have basically known for decades that teamwork comes with a cost. The theory that I come to in my book is called Process Loss. The idea behind process loss, it’s pretty much as how a mechanical machine cannot operate at 100% efficiency all the time, so neither can software teams.

That loss often results in the communication and coordination overhead or perhaps motivation loss. The interesting thing is when you talk to software teams about this, they are aware of the problems. Quite often we mis-attribute it to a technical issue, when in reality we have a social problem.

In one team I worked, the developers, they noticed that they had a lot of bugs in certain parts of the codebase. They thought it was a technical problem. What happened in reality was they were having excess parallel development in those parts of the codebase.

There were multiple programmers working in the same parts of the code all the time. So merely all they did, they said that, they complained a lot about potential merge conflicts. They were really scared to do merges of different feature branches.

When we started to look into it, we identified that what actually happened was that, due to the way they were working, they didn’t really have a merge problem. What they had was basically a problem that the architecture just couldn’t support their way of working.

I think that’s really important to understand that if you want to truly improve software, we need to look beyond technique and look at the social side.

Derrick:
What can we do to avoid such problems?

Adam:
First thing I recommend is always to use something I call Social Hacks. Basically it takes the social situation and tries to tweak some aspect of it, to make social biases less likely because social biases, they are a huge reason why we get process loss.

One of the simple techniques that I recommend is, in every team, assign someone the role of the devil’s advocate. The role of the devil’s advocate is just to take the opposite stance in every discussion to question every decision made. That helps you reduce a bunch of biases because you’re guaranteed that someone will speak up. It also has the nice benefit of making teams more risk averse. There are a bunch of social stuff we can do and I also present a number of analyses that they can apply to investigate and mine social metrics from their codebases.

Derrick:
So what sort of results have you seen from those applying your techniques?

Adam:
I’ve seen that most people seem to use it for their original purpose, which was to identify the code that matters the most for maintenance. I’ve also seen some interesting uses mostly from managers. Managers seem to love the social aspects of the analysis because it makes it possible for them to suddenly reason about something that they couldn’t measure before.

Another interesting area, is a couple of years ago, I worked with a really good test team. These testers they used to do a bit of exploratory testing at the end of each iteration. So what I did was basically I generated a heat map over the complete source code repository, so we can see where most of the development activity was, and we would start to communicate with the testers. They knew where they should focus a little bit of extra energy. That helped us to identify a lot of bugs much earlier.

Working with Unit Testing and Code Reviews

We’re not so much writing code. What we do most of the time is actually making a modification to existing code

Derrick:
How do you see current development practices like unit testing and code reviews, working with those that you’ve described?

Adam:
Techniques that I present, they don’t replace any current practices, save maybe wild guesses and panic near the deadline. But otherwise, they are there to complement what you already do today.

For example, take code reviews. I often use a hot spot analysis to identify code most in need of review. To prioritize reviews as well. Unit testing is interesting, because I actually have a complete chapter in my book about building a safety net around tests. Automated tests are terribly hard to get right in practice. It’s actually something you can do, by again, measuring the evolution of the codebase, looking at application code and testing them both together.

Derrick:
Ultimately you say developers should optimize for understanding in their code, why is this so important?

Adam:
It’s important because we programmers, we’re not so much writing code. What we do most of the time is actually making a modification to existing code.

If you look at the research behind those numbers you will see that the majority of that time is spent trying to understand what the code does in the first place. If you optimize for understanding, we optimize for the most important phase of coding.

Tools and Resources

Derrick:
What are some tools that people use to mine source code data, to use your techniques?

Adam:
When I started out there weren’t any tools really, there were a bunch of academic tools. They were powerful, but the licences were quite limited, you weren’t allowed to use them on commercial projects, they also focused on one single aspect.

I actually had to write my own set of tools. I actually open-sourced all my tools, because I want the readers of my book to have the possibility to try the techniques and stuff.

If you’re interested in this kind of stuff, you can just go to my GitHub account, Adamtornhill, and download the tools and play around with them. I think we’re going to see a lot of stuff happening in this area now.

Derrick:
Beyond your book, are there any resources you can recommend for those interested in learning more about writing maintainable code and improving their code design?

Adam:
My favorite book when it comes to software design has to be ‘The Structure and Interpretation of Computer Programs’. It’s just brilliant and it had a tremendous influence on how I approached software design.

When it comes to coding itself, I would say Kent Beck’s ‘Smalltalk: Best Practice Patterns’. It’s a brilliant book and it takes us way beyond Smalltalk. It’s actually something I recommend for everyone.

And the final book I would like to mention is ‘Working Effectively with Legacy Code’ by Michael Feathers. It’s one of the few books that takes this evolutionary perspective on software, so it’s a great book by a great author.

Derrick:
Adam, thank you so much for joining us today.

Adam:
Thanks so much for having me here, it was a true pleasure.

Software Development Metrics – Interview with David Nicolete.

 

We’ve interviewed Dave Nicolette, a consultant specializing in improving software development and delivery methods and author of ‘Software Development Metrics’. We dive into what factors to consider when selecting metrics, examples of useful metrics for Waterfall and Agile development teams, as well as common mistakes in applying metrics. He writes about software development and delivery on his blog.

Content and Timings

  • Introduction (0:00)
  • About David (0:21)
  • Factors When Selecting Metrics (3:22)
  • Metrics for Agile Teams (6:37)
  • Metrics for Hybrid Waterfall-Agile Teams (7:37)
  • Optimizing Kanban Processes (8:43)
  • Rolling-up Metrics for Development Management (10:15)
  • Common Mistakes with Metrics (11:47)
  • Recommended Resources (14:30)

Transcript

Introduction

Derrick:
David is a consultant specializing in improving software development and delivery methods. With more than 30 years experience working in IT, his career has spanned both technical and management roles. He regularly speaks at conferences, and is the author of ‘Software Development Metrics’. David, thank you so much for taking your time out of your day to join us. Why don’t you say a bit about yourself?

About David

David:
I’ve been involved with software for a while and I still enjoy it, though I’ve been working as a team coach and organizational coach, technical coach, for a few years. I enjoy that.

Derrick:
Picking up on the book ‘Software Development Metrics’, what made you want to write the book?

David:
It’s an interesting question, because I don’t actually find metrics to be an interesting topic, but I think it is a necessary thing. It’s part of the necessary overhead for delivering. If we can detect emerging delivery risks early, then we can deal with them. If we don’t discover them until late, then we’re just blindsided and projects can fail. It’s important to measure the right things and be sure we’re on track.

Secondly, I think it’s important to measure improvement efforts, because otherwise we know we’re changing things and we know whether they feel good, but we don’t know if they’re real improvements if we can’t quantify that. I noticed several years ago that a lot of managers and team leads and people like that didn’t really know what to measure. I started to take an interest in that, and I started to give some presentations about it, and I was very surprised at the response because quite often it would be standing room only and people wouldn’t want to leave at the end of the session. They had more and more questions. It was as if people really had a thirst for figuring out what to measure and how. I looked at some of the books that were out there and websites that were out there, and they tended to be either theoretical or optimistic.

Derrick:
Metrics for measuring and monitoring software development have been around for decades, but a lot of people still don’t use them effectively. Why do you think that is?

David:
I often see a pattern that when people adopted new process or method, unfamiliar one, they try to use the metrics that are recommended with that process. There are a couple of issues that I see. One is that they may only be using the process in name only, or they’re trying to use it but they’re not used to it yet, and the metrics don’t quite work because they’re not quite doing the process right.

The other issue is that people tend to use the measurements that they’re accustomed to. They’ve always measured in a certain way, now they’re adopting a new process. They keep measuring the same things as before, but now they’re doing the work in a different way. There’s a mismatch between the way the work flows and the way it’s being measured. They have numbers and they rely on the numbers, but the numbers are not telling truth because they don’t line up with the way the work actually flows.

Factors When Selecting Software Development Metrics

Derrick:
Picking the right metrics is paramount. What are some of the factors that we should consider when selecting metrics?

David:
Look at the way work actually flows in your organization and measure that. I came up with a model for that in the course of developing this material which I would look at three factors to try to judge which metrics are appropriate. The first factor is the approach to delivery. The basic idea there is if you try to identify all the risks in advance, all the costs, you identify all the tasks, lay out a master plan, and you follow that plan. That’s what I’m calling traditional.

What I call adaptive is a little different. You define a business capability, you’ve got your customer needs, and you set a direction for moving toward that, and you steer the work according to feedback from your customer, from your stakeholders. You don’t start with a comprehensive plan, you start with a direction and an idea of how to proceed. Then you solicit feedback frequently so you can make course corrections. That’s the first factor I would look at: traditional versus adaptive, and that I think has the biggest impact on which metrics will work.

The second factor to look at is the process model. I don’t have to tell you that there are a million different processes and nobody does anything in a pure way, but if you boil it down I think there’s basically four reference models we can consider, or process models. One is linear, you can imagine what that is. The canonical steps that go through from requirements through support. The next one would be iterative, in which you revisit the requirements multiple times and do something with them. The third one that I identify is time-boxed. It’s really popular nowadays with processes like Scrum and so on. The fourth one is continuous flow. This is becoming popular now with the Kanban method, and it’s also being adapted into Scrum teams quite a lot. We’re really interested in keeping the work moving smoothly.

Now a real process is going to be a hybrid of these, but it’s been my observation that any real process will lean more toward one of those models than the others, and that’ll give us some hints about what kind of metrics will fit that situation. The third thing probably has the least impact on is whether you’re doing discrete projects or continuous delivery. What some people call a continuous beta, or some people just don’t have projects. You have teams organized around product lines or value streams, and they continually support them, call it a stream. Between those two there are some differences in what you can measure. Well I look at those three factors, and based on that you can come up with a pretty good starter set of things to measure, and then you can adapt as you go from there.

Metrics for Agile Teams

Derrick:
Let’s take a couple of example scenarios. If we have an agile team working in short sprints who are struggling to ship a new product, what kind of metrics should they consider to identify areas for improvement?

David:
If they’re using Scrum basically correctly, they could probably depend on the canonical metrics that go with that, like velocity and your burn chart. You might look for hangover, incomplete work at the end of the sprint. You might look for a lot of variation in story size, when you finish a story in one day and the next story takes eight days. When it comes to metrics as such they could use, as I said, velocity and so on, and you can always use lean-based metrics because they’re not really dependent on the process model. What they might consider is looking at cycle times. They could look at the mean cycle times as well as the variation in cycle times and get some hints about where to go for root—cause analysis. Metrics don’t tell you what’s wrong, but they can raise a flag.

Metrics for Hybrid Waterfall-Agile Teams

Derrick:
What about a hybrid waterfall agile team working on a long term project, wanting to know what it’s possible to deliver by a certain date?

David:
To know what’s possible to deliver you can use the usual things like a burn chart, burn up or burn down as you prefer, to see according to their demonstrated delivery, their velocity, I’ll call it that, how much scope they can deliver by a given date. Conversely you could see by what date approximately they could deliver a given amount of scope. It depends on what’s flexible. In this kind of a project, usually neither is flexible, but at least you can get an early warning of delivery risk. If it looks like the trend line is way out of bounds with the plan, well now you’ve got a problem.

One thing that might surprise some people is the idea that agile methods can be used with traditional development. We need to decouple the word “agile” from “adaptive,” because quite often it is used in a traditional context.

Optimizing Kanban Processes

Derrick:
What are some metrics relevant to those working in a bug queue? Say they’re wanting to optimize their working practices to stay on top of incoming bugs.

David:
For that I usually like to use little metrics, mainly cycle time, because you want to be somewhat predictable in your service time so when a bug report comes in people have an idea of when they can expect to see it fixed. How do you do that? Well, you can use empirical information from past performance with fixing bugs, and your mean cycle time will give you approximately how long it takes to fix one.

I like to use the Kanban method for these kind of teams because it defines classes of service. You’ll find that every type of bug doesn’t take the same amount of time to fix. Based on your history, pick out different categories. You can identify the characteristics of particular kinds of bug reports that tend to fall together, and you can track cycle times differently for each of those classes of service. If someone calls in and says, “Well we got this problem.” “Well that looks like that’s in category two. Whatever that means, that typically takes us between four hours and eight hours.” That can give them a little bit of warm fuzzy feeling about when they’re going to see it fixed. I think that using empirical data and tracking cycle time, is the simplest, most practical way toward that workflow.

Rolling-up Metrics for Development Management

Derrick:
What about a CTO who wants to monitor how teams are performing, and ensure code is of high quality? How can metrics be rolled-up for those in more senior positions?

David:
How the teams are performing, you need measurements that are comparable across teams and comparable across projects. The lean-based metrics needed to compare across teams and projects and across development and support, those kinds of things. If you’re tracking throughput, cycle time, there’s another one I haven’t mentioned that I wanted to: process cycle efficiency. If you track those, those roll up nice. Some other metrics don’t roll up so well. Some of the agile metrics, particularly velocity is really different for each team. Percentage of scope complete, that may not roll up very well either.

The other question about code quality, I think that if we can let that be a team responsibility then they can use metrics, but they don’t need to report outward from the team. Usually things that they can get out of static code analysis tools will help them spot potential quality issues, but I wouldn’t share very detailed things like static code analysis kind of stuff and code coverage outside the team, because then team members will feel like they’re going to get judged on that and they’ll start gaming the numbers thinking they’re going to be needed those. Those kind of metrics are really for the team’s own use.

Common Mistakes with Software Development Metrics

Derrick:
What are some mistakes you often see people make when applying software development metrics?

David:
People seem to make a couple of mistakes over and over. The one I think I mentioned earlier, people apply metrics that won’t fit the context. Maybe a company wants to ‘go agile’, and so they start tracking agile metrics, so whatever is recommended. The metrics that are recommended are safe or something like that, but they haven’t really fully adopted these new methods. They’re still in the transition, and so the numbers don’t mean what they’re supposed to mean. For instance, they may track velocity, but they might not have feature teams. They might have component teams, and the work items that those teams complete are not vertical slices of functionality. Whatever they’re tracking as velocity isn’t really velocity. You start getting surprised by delivery issues.

You can also have the opposite situation where teams are working in an agile way, but management is still clinging to traditional metrics. They will demand that the teams report percentage of still complete to date, but you’re doing adaptive development so you don’t have 100 percentage scope defined. You have a direction. You may be 50% complete this week based on what you know of the scope. Next week you might be 40% complete because you’ve learned something, and then the management says “Well what’s going on? You’re going backwards,” and they don’t understand what the numbers mean. What I see happen in that case is it drives the teams away from adaptive development, and causes them to try to get more requirements defined upfront.

The second mistake I see a lot is that people either overlook or underestimate the effects of measurement on behavior. We can be measuring something for some objective reason, but it causes people to behave differently because they’re afraid they’re going to get their performance review will be bad because they didn’t meet their numbers. I think we have to be very conscious of that. We don’t want to drive undesired behaviors because we’re measuring things a certain way. That not only breaks morale, but it also makes the measurements kind of useless too when they’re not real.

Those are really the two main mistakes: that they don’t match up the metrics with their actual process, or they neglect the behavioral effect of the metric.

Recommended Resources

Derrick:
What are some resources you can recommend for those interested in learning more about managing projects and improving processes?

David:
I like this, an older book called ‘Software By Numbers’. That’s a classic, but one of David Anderson’s earlier books is called ‘Agile Management From Software Engineering’. That has a lot of really good information to apply economic thinking to different kinds of process models. He covers things like feature driven development, extreme programming. Another guy whose work I like is Don Reinertsen. He combines really deep expertise in statistics with deep expertise in economics and applies that to software, and can demonstrate mathematically why we don’t want to over-allocate teams. How it slows down the work if you load everybody up to 100% actually slows things down. What’s counter intuitive to a lot of managers is if you load your teams to 70% capacity they’ll actually deliver better throughput, but it’s very hard for a lot of managers to see somebody not busy. It’s really hard for them to get there.

Derrick:
Really appreciate your time today Dave, some great stuff here. Thank you.

David:
Well I enjoyed the conversation, thanks.

10X Programmer and other Myths in Software Engineering


We’ve interviewed Laurent Bossavit, a consultant and Director at Institut Agile in Paris, France. We discuss his book ’The Leprechauns of Software Engineering’, which debunks myths common in Software Engineering. He explains how folklore turns into fact and what to do about it. More specifically we hear about findings of his research into the primary sources of theories like the 10X Programmer, the Exponential Defect Cost Curve and the Software Crisis.

Content and Timings

  • Introduction (0:00)
  • About Laurent (0:22)
  • The 10X Programmer (1:52)
  • Exponential Defect Cost Curve (5:57)
  • Software Crisis (8:15)
  • Reaction to His Findings (11:05)
  • Why Myths Matter (14:44)

Transcript

Introduction

Derrick:
Laurent Bossavit is a consultant and Director at Institut Agile in Paris, France. An active member of the agile community he co-authored the first French book on Extreme Programming. He is also the author of “The Leprechauns of Software Engineering”. Laurent, thank you so much for joining us today, can you share a little bit about yourself.

About Laurent

Laurent:
I am a freelance consultant working in Paris, I have this great privilege. My background is as a developer. I try to learn a little from anything that I do, so after a bit over 20 years of that I’ve amassed a fair amount of insight, I think, I hope.

Derrick:
Your book, “The Leprechauns of Software Engineering”, questions many claims that are entrenched as facts and widely accepted in the software engineering profession, what made you want to write this book?

Laurent:
I didn’t wake up one morning and think to myself, ‘I’m going to write a debunkers book on software engineering’ but it actually was the other way around. I was looking for empirical evidence anything that could serve as proof for agile practices. And while I looked at this I was also looking at evidence for other things which are in some cases, were, related to agile practices for instance the economics of defects and just stuff that I was curious about like the 10X programmers thing. So, basically, because I was really immersed in the literature and I’ve always been kind of curious about things in general, I went looking for old articles, for primary sources, and basically all of a sudden I found myself writing a book.

The 10X Programmer

Derrick:
So, let’s dig into a few of the the examples of engineering folklore that you’ve examined, and tell us what you found. The first one you’ve already mentioned is the 10X programmer. So this is the notion that there is a 10 fold difference between productivity and quality of work between different programmers with the same amount of experience. Is this fact or fiction?

Laurent:
It’s actually one that I would love to be true if I could somehow become or if I should find myself as a 10X programmer. Maybe I would have an argument for selling myself for ten times the price of cheaper programmers. When I looked into it, what was advanced as evidence for those claims, what I found was not really what I had expected, what you think would be the case for something people say, and what you think is supported by tens of scientific studies and research into software engineering. In fact what I found when I actually investigated, all the citations that people give in support for that claim, was that in many cases the research was done on very small groups and not extremely representative, the research was old so this whole set of evidence was done in the seventies, on programs like Fortran or COBOL and in some cases on non-interactive programming, so systems where the program was input, you get results of the compiling the next day. The original study, the one cited as the first was actually one of those, it was designed initially not to investigate productivity differences but to investigate the difference between online and offline programming conditions.

So how much of that is still relevant today is debatable. How much we understand about the concept of productivity itself is also debatable. And also many of the papers and books that were pointed to were not properly scientific papers. They were opinion pieces or books like Peopleware, which I have a lot of respect for but it’s not exactly academic. The other thing was that some of these papers did not actually bring any original evidence in support of the notion that some programmers are 10X better than others, they were actually saying, “it is well known and supported by ‘this and that’ paper” and when I looked at that the original paper they were referencing, they were in turn saying rather than referencing their own evidence, saying things like “everybody knows since the seventies that some programmers are ten times more than others” and very often after chasing after all the references of old papers, you ended up back at the original paper. So a lot of the evidence was also double counted. So my conclusion was, and this was the original leprechaun, my conclusion was that the claim was not actually supported. I’m not actually coming out and saying that its false, because what would that mean? Some people have taken me to task for saying that all programmers are the same, and that’s obviously stupid, so I can not have been saying that. What I’ve been saying is that the data is not actually there, so we do not have any strong proof of the actual claim.

Exponential Defect Cost Curve

Derrick:
There is another folklore item called the “exponential defect cost curve”. This is the claim that if it costs one dollar to fix a bug during the requirements stage, it will cost ten times as much to fix in code, one hundred times in testing, and one thousand times in production. Right or wrong?

Laurent:
That one is even more clear cut. When you look at the data and try to find out what exactly was measured, because those are actual dollars and cents, right? At some point there should be a ledger or some kind of accounting document that originates the claim. So I went looking for the books that people pointed me to, and typically found that rather than saying “we did the measurements on this or that project”, the books or articles said “this is something everybody knows”, with a reference to this or that other article or book. So I kept digging, always following the pointers back to what I thought was the primary source. In many cases I was really astonished to find that at some point along the chain, someone had basically just made evidence up. I could not find any solid proof that someone had measured something and come up with those fantastic costs. Sometimes you come across figures like fourteen hundred and three dollars on average per bug, but what does that even mean? Is that nineteen-nineties dollars? These claims have been repeated using exactly the same numbers for at least three decades now. You can find some empirical data in Barry Boehm’s books, and he’s often cited as the originator of the claim, but the original data is much less convincing than the derived citations make it sound.

The Software Crisis

Derrick:
There is a third piece of folklore, called “The Software Crisis”, which is common in mainstream media reporting on large IT projects. These are studies that highlight high failure rates in software projects, suggesting that all such projects are doomed to fail. Are they wrong?

Laurent:
This is a softer claim, right, so there are no hard figures, although some people try. One of the ways you see the software crisis exemplified is by someone claiming that software bugs cost the U.S. economy so many billions, hundreds of billions of dollars per year. On the more subjective side, what’s historically interesting is that the very notion of a software crisis was introduced to justify the creation of a group for researching software engineering. The initial act was the convening of the NATO Conference on Software Engineering back in 1968, which is when the term was actually coined, and one of the tropes, if you will, used to justify interest in the discipline was the existence of a software crisis. But we’ve now been living with this for over forty years, and things are not going so badly, right? When you show people a dancing bear, one wonders not whether the bear dances well, but that it dances at all. To me, technology is like that: it makes amazing things possible, it doesn’t always do them very well, but it’s amazing that it does them at all. So I think the crisis is very much over-exploited, very overblown, but where I really get onto firmer ground is when people try to attach numbers to it. Typically those are things like a study that supposedly found that bugs were costing the U.S. sixty billion dollars per year. When you actually take a scalpel to that study, when you read it very closely and try to understand what methodology was followed and exactly how they went about their calculations, what you find out is that they basically picked up the phone, interviewed a very small sample of developers, and asked them for their opinion, which is not credible at all.

Reaction to His Findings

Derrick:
What is a typical reaction to your findings debunking these long held claims?

Laurent:
Well, somewhat cynically, it varies between “Why does that matter?” and a kind of violent denial. Oddly enough, I haven’t quite figured out what makes people so invested in one viewpoint or the other. There’s a small but significant faction of people who tell me, “oh, that’s an eye opener”, and would like to know more, but some people respond with protectiveness when they see, for instance, the 10X claim being attacked. I’m not quite sure I understand that.

Derrick:
So how do myths like these come about?

Laurent:
In some cases you can actually witness the birth of a leprechaun. It’s kind of exciting. Some of them come about from misunderstandings. I found out in one case, for instance, that an industry speaker gave a talk at a conference and apparently he was misunderstood. People repeated what they thought they had heard, and one thing led to another. After some iterations of this telephone game, a few people, including some people I know personally, were claiming in the nineties that the waterfall methodology was causing 75% failure rates in defence projects. It was all a misunderstanding: when I went back and looked at the original sources, the speaker was actually referring to a paper from the seventies about a sample of nine projects, not an industry-wide study. So I think that was an honest mistake that just snowballed. In some cases people are just making things up; that’s one way to convince people. And one problem is that it takes a lot more energy to debunk a claim than it takes to make something up, so if enough people play that little game, some of that stuff is going to sneak past. I think the software profession amplifies the problem by offering fertile ground: we tend to be very fashion driven, so we enthusiastically jump onto bandwagons, and that makes it easy for some people to invite others to jump, to propagate these claims. So I think we should be more critical. There has been a movement towards evidence-based software engineering, which I think is in some ways misguided, but to my way of thinking it’s good news that people are starting to think, maybe we shouldn’t be so gullible.

Why Myths Matter

Derrick:
Even if the claims are in fact wrong, why does it matter?

Laurent:
To me, the key activity in what we do is not typing code, it’s not design, it’s not even listening to customers, although that comes close. The key activity that we perform is learning. We are very much a learning-centered profession, so to speak, because the very act of programming, looked at from a certain perspective, is about capturing knowledge of the world, about businesses, about other things that exist out there in the real world or are already virtual, and encoding that knowledge in executable form. So learning is one of the primary things that we do already, and I don’t think we can be good at that if we are bad at learning in the more usual sense. But these claims, which are easy to remember and easy to trot out, act as curiosity stoppers. They prevent us from learning further and trying to get at the reality of what actually goes on in a software development project, what determines whether a software project is a success or a failure. I think we should actually find answers to these questions. It is possible to know more than we do right now, and I am excited every time I learn something that makes more sense to me, that gives me a better grip on developing software.

Derrick:
Are there any tell-tale signs that we can look out for to help stop ourselves from accepting such myths?

Laurent:
Numbers, statistics, and, strangely, citations. I know that citations are a staple of academic and scientific writing, but when you find someone in the software engineering profession whipping out citations at the drop of a hat, you should take that as a warning sign. There is more, but we would have to devote an hour or so to that.

Derrick:
Thank you so much for joining us today, it has been great.

Laurent:
Thanks for having me. Bye!

A Guide to Developer Mentoring

In this interview with Rachel Ober, Senior Developer at Paperless Post, we discuss developer mentoring. Rachel teaches us the lessons learned from mentoring developers at Paperless Post, General Assembly, Turing School and beyond. These cover how to get started, tips on building successful mentor-mentee relationships, the benefits of mentoring as well as common mistakes. She writes more about teaching and mentoring on her blog.

Transcript

Introduction

Derrick:
Rachel Ober is a Senior Developer at Paperless Post, and an experienced mentor. She’s a Ruby On Rails instructor for General Assembly, Co-organizer of the Write/Speak/Code conferences, and Founder of the New York chapter of RailsBridge. Rachel, thank you so much for joining us today. Do you have a bit to share about yourself?

Rachel:
Sure. I work as a Senior front-end developer at Paperless Post, with a team of five other front-end developers building the different pretty things that you see on our website. I feel that it’s really important to build your team around what you want to see in the work that you do.


“Be fearless and say, ‘Hey, I need help on this!’”


The Benefits of Mentoring

Derrick:
I kind of wanted to focus our conversation a bit around mentoring. You seem to have a lot of experience with that, so what are some of the benefits you’ve seen of mentorship?

Rachel:
The benefits of mentorship are a two-way street between the mentee and the mentor. For the mentee, the biggest benefit is confidence in their work: they have somebody they can talk to, or just sling questions back and forth with. They can trust the mentor relationship to be realistic, not just the mentor telling the mentee what they want to hear. As a mentee, you don’t feel that you’re trying to impress this other person; it’s a much more open relationship. For the mentor, for me, it’s a gigantic reminder of where I’ve come from. I’ve been working with Ruby on Rails since, I think, 2005. Wow, that’s like a decade. Every time I meet somebody who’s just learning, or is a few years behind me in their career, I learn new ways of thinking about problems. I think mentoring is really a great relationship for both people.

Derrick:
How should people go about getting started with mentoring within a company?

Rachel:
As an individual, it’s great to talk to people and first gauge who would be interested in participating in something like that. At Paperless Post we are pretty evenly split between our Engineering department and the other people in our company; we have people who design cards, people who are in charge of marketing. So as an individual it would be good to weigh whether you’re looking to offer mentoring to the entire company or just to the Engineering department, and to figure out what your expectations and goals are for your mentoring program. At that point you might get a group of people together to start some type of pilot program.

I would even ask HR for their opinion on whether an official mentoring program at your company would be something they would support. Having upper management behind it is going to make sure this actually gets integrated into your company culture; they will support you and make sure you’re following up. Maybe it can even be integrated into your review process that happens once or twice a year.


“I would advise against having your mentor be your manager”


Mentoring Myths

Derrick:
What are some myths that stop people from getting into mentoring?

Rachel:
The biggest one is that people think they don’t have anything to offer. I think it’s probably tied into Impostor Syndrome: they think they would hurt the other person, or that they don’t have any accomplishments to share that would make them a role model. That is definitely not true. My most successful relationship as a mentor was when I admitted my weaknesses to my mentee, saying, “Hey, I had this issue right out of college. I worked at a place for a year and it was a really bad relationship on both ends. It wasn’t the right hire, it wasn’t the right fit.” She was then able to admit to me one of her biggest challenges, and after that point in our relationship I saw a great change in her. Because I had admitted to this great failure in my life, or what I felt was a personal failure, she could open up, admit that she was having difficulties in certain areas, regain her confidence, move forward, and do an excellent job.

I think especially people who are just learning to program have this idea … It’s kind of strange, because when I was learning to be a developer, the stereotype was that you spend all day in front of the computer. Now it seems like it’s turned into something glamorous, because you have this ability to change your life and earn a lot of money by becoming a software developer. In my first class at General Assembly, somebody asked me, “What does your daily life look like at Paperless Post?” I said, “Seriously, I spend most of my day trying to fix bugs, and most of the time it doesn’t work.” I think I blew his mind.

It was a very interesting experience. But as a mentor, just explaining what your daily life is like and how you interact with other people on your team is really fantastic advice for somebody who is either thinking about becoming a developer, or about to take that step after graduating college or finishing one of these code boot camps. Just sharing your experience is very valuable. They want to know how you go about interviews, and obviously if you currently have a job you’ve been through a couple of interviews; nobody just gives you a job, usually. That type of advice is invaluable for somebody who has no point of reference.

Tips for new Mentors

Derrick:
Any additional tips that you can give for new mentors about the types of things that they should be doing with their mentees to help them learn?

Rachel:
The tip I give to people who are thinking about getting into mentoring, or who are looking for mentors, is to start the relationship by figuring out what its goals are: how often you’re going to meet, how long the meetings are, whether you’re going to give the mentee some type of assignment, and how involved the relationship will be. I’ve had different relationships. Some are based on just talking: talking through issues and the more social aspects of becoming a developer, or navigating a job or a learning environment. Other relationships have gone very deep into code, working through problems and learning how to break down problems.

Figuring that out early really puts everything out on the table: “This is what I’m having issues with. This is what I’d like to improve on.” You talk about each other, talk about yourselves. It’s very hard, at least for me, to start giving advice if I don’t know the mentee’s motivations: what they want to learn, how they want to see themselves a couple of months after we’ve been working together.


“Having a whole network of people will really help you share your success in achieving your goals”


Successful Mentor-Mentee Relationships

Derrick:
What do you think are the essential elements of a successful mentor-mentee relationship?

Rachel:
I think the essential elements are definitely meeting regularly; I don’t think having meetings on the fly is really helpful for either person. As a mentee you want to make sure that you’re checking in and setting goals, making sure those goals are met, and that you have an accountability partner confirming that you are actually fulfilling the things you set out to do. If you are a mentor and you see that your mentee is not fulfilling these commitments, then you have to have a really good heart-to-heart and say, “Hey, I’m putting in this time. I’m volunteering, I’m not getting paid for this, and I chose to help you because I really believe that you can do some amazing things.”

Sometimes guilting them helps. If they’re part of a code school you can contact their teacher or the administration and say, “Hey, we set up this relationship and I wanted to check in. Have you noticed anything on their side?” You don’t want to be meddling, but you also want to set yourself up for success, both as a mentee and as a mentor. Keep assessing the relationship, making sure that it’s working. Also, if you are a mentor, be knowledgeable about what they are learning or what their work entails, and anticipate questions they may have. Be yourself, admit whenever you don’t know the answer, and point them to other people to talk to. Being realistic about yourself and where you’ve been, and passing that on to somebody else, is really the crux of a relationship like that.

Finding Mentors and Mentees

Derrick:
Where can people find mentors, or people to mentor?

Rachel:
That is a very interesting problem. I don’t think it’s easy to find a mentor, because there’s a certain level of trust and understanding involved. For me, I have always mentored other people. I’ve found mentees by volunteering for code schools. I’ve mentored students from the Turing School out in Colorado, which may sound odd since I’m in New York City, but I’ve been fairly successful working with people over Google Hangouts, Screenhero, and Slack, and just leaving myself available over text message and phone. I’ve also mentored people through organizations I work with. I would call that mentorship even if you haven’t said exactly what the relationship is, or don’t have a steady schedule. And there are the classes I’ve taught at General Assembly: people ask to meet regularly, they really want advice on where to go next. They become voracious and want to learn as much as they can.

For me, I’ve been searching for my own mentor for a while. I’m looking for somebody who is five to ten years ahead of me in their career. It took me a little while to formulate what I was looking for in the relationship. As a woman, it was important for me to find a woman who was doing the type of thing I wanted to be doing. I had to really extrapolate and think about what I wanted to be doing in five to ten years, and be honest with myself: where am I going to be in my family life, where am I going to be in the country? Assess the things that are important to you, then see if there is somebody out there you really admire and just ask them. And you don’t have to limit this to one person either.

Having a whole network of people will really help you share your success in achieving your goals, because you have more accountability partners and more opinions. There are a couple of online services as well. For women there is a site called Glassbreakers, which links people up via the LinkedIn network. I’ve had a couple of introductions there, and I actually met somebody in person who happened to be on a business trip from London. We had dinner and it was amazing how well we connected. Be fearless and say, “Hey, I need help on this.” Admit that you need help, and these natural relationships will form.

Derrick:
Is there something in that relationship that says your mentor should or should not be your manager?

Rachel:
I would advise against having your mentor be your manager. Here we have Technical Managers who are in charge of making sure our projects are on time, making sure that everybody’s being productive, and basically leading the projects and the teams. They’re also in charge of writing reviews for the team members. We then have other people available, such as myself, who are focused on making sure the other employees are happy and doing the type of work they want, and that if there’s something they need to get off their chest, it’s a safe environment, because they’re talking to somebody who isn’t reporting on them. They’re kind of the neutral Switzerland. They’re not necessarily involved in the review process for you and your team, but they’re there to focus on your happiness and the growth of your career.

I think whenever you’re the manager you constantly have to balance the best interests of the company and the person, whereas an independent party really is concentrating on the person. It doesn’t mean that the relationship with your manager is bad; it just means that maybe you also need a mentor who’s not in a manager role.

Derrick:
Rachel, I think this was a really great conversation. I hope a lot of people take the next step to either find a mentor or become a mentor.

Rachel:
Great, thank you for having me.

The Problems with Open Source (and How to Fix Them) – Interview with Justin Searls

In this interview with Justin Searls, Co-Founder at development agency Test Double, we discuss issues in open source software and what we can do about them. Justin raises problems for both consumers and contributors, from managing hundreds of dependencies to security issues and burnt-out maintainers. We dive into the efforts being made to address these and what you can do too.

Justin discusses issues like this and software development more generally on his blog.

Content and Timings

  • Introduction (0:00)
  • Open Source convenience, long-term fragility (1:33)
  • Who is Auditing Open Source Code? (5:17)
  • The Rollercoaster Ride of Maintaining Open Source Projects (7:38)
  • Recruiting More and Better Contributors (11:18)
  • Improving Communication in Open Source (12:42)
  • How Can We Help? (14:35)

Transcript

Introduction

Derrick:
Justin Searls is co-founder of Test Double, a software development agency based in Columbus, Ohio. Through his work, Justin uses and contributes to a number of open source tools, and also speaks at conferences about a range of software development topics, including the talk The Social Coding Contract, which highlights some issues he sees in open source projects.

Justin, thank you so much for taking the time to join us today, I really appreciate it. Do you have anything more to add about yourself?

Justin:
There’s a lot of brokenness around us, and I think the talk that you referenced, The Social Coding Contract, is really just putting a lens on a lot of brokenness in open source. But I’m not just here to shout at the clouds and complain about stuff. I think that by building awareness we can make it better.


“In an open source team you have like a dozen people who all want to be point guard”


Derrick:
So your talk raises a number of issues in open source development, both for consumers dealing with dependency issues and for maintainers suffering burnout. What led you to want to raise these issues?

Justin:
I think that when you really distill things down to their core essence, the answer to that question is that our industry has organized itself around, essentially, a lie, and that lie is that faster equals better. Anything we can do to build an app faster, to ship to production faster; devs that sling code faster than slower devs, or are 10X better. All of how technology is sold, described, and glorified in our culture is about how fast it is, how fast people get stuff done. But the overall attention span is so brief that we just tend not to focus on the problems.

Derrick:
Let’s dive into a couple of these issues. You say that those building with open source optimize for convenience, but often at the cost of long-term fragility. What do you mean by that, and has this impacted your projects?

Justin:
By fragility there … What I don’t mean is that there is some sort of cabal of open source developers trying to make a mess for you. What I really mean is that when you ship something into production these days, odds are you’re running a little tiny layer: 10 percent or less of the code that gets executed in production is stuff that you wrote. Most of it is a mountain of application dependencies that you stand on top of, some pulled in directly, some transitively via those other dependencies. It’s a lot of stuff that, frankly, we don’t understand really well. And that’s fine for getting started, because obviously you’ve got to be competitive, and if you can get a prototype out the door quickly, you can get really fast feedback. You know?

There’s a lot to be gained from it, we just need to move cautiously and, after it’s been 3 months, after it’s been 6 months, actively look at how these dependencies have been serving us. Have they been a pain to use? Have any died or gone out of maintenance? What’s the influence they’ve had on the design of our code? Are we really just writing cookie-cutter code to satisfy all the APIs we depend on, or are we growing a domain model that fits the problem our application is trying to solve like a glove?

Those are the sorts of things that I would call fragility baked into this process of looking for something to give us a quick start. Another facet of this that’s really interesting to me comes from the Ruby community, where Ruby on Rails was really huge, or became huge, almost a decade ago now. That was one big monolithic framework, and it was very difficult for me starting out, because I would push something up and then I had N+1 queries everywhere. All sorts of pain from not knowing how this magical little thing worked. I could remediate that on one project, and it would take a long time, and everyone paid the price of learning Rails, but on subsequent projects we could at least reuse that knowledge.
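The N+1 query problem he mentions can be sketched in plain Ruby. This is a simulation with a hypothetical `FakeDB` class, not actual ActiveRecord: fetching comments one post at a time issues one query for the posts plus one per post, while a batched ("eager") fetch issues a constant two.

```ruby
# Simulated database that counts how many queries it receives.
# A plain-Ruby sketch of the N+1 problem; FakeDB is a made-up stand-in,
# not a real ORM.
class FakeDB
  attr_reader :query_count

  def initialize
    @posts = [{ id: 1 }, { id: 2 }, { id: 3 }]
    @comments = { 1 => ["a"], 2 => ["b"], 3 => ["c"] }
    @query_count = 0
  end

  def all_posts
    @query_count += 1 # one query for the posts
    @posts
  end

  def comments_for(post_id)
    @query_count += 1 # one query per post -- this is the "+N"
    @comments[post_id]
  end

  def comments_for_all(post_ids)
    @query_count += 1 # one batched query, as with eager loading
    @comments.select { |id, _| post_ids.include?(id) }
  end
end

# N+1 style: 1 query for the posts, then 1 per post => 4 queries for 3 posts.
db = FakeDB.new
db.all_posts.each { |post| db.comments_for(post[:id]) }
naive_queries = db.query_count

# Eager-loading style: 2 queries total, regardless of post count.
db2 = FakeDB.new
posts = db2.all_posts
db2.comments_for_all(posts.map { |p| p[:id] })
eager_queries = db2.query_count

puts "naive: #{naive_queries}, eager: #{eager_queries}" # => naive: 4, eager: 2
```

In real Rails the batched form corresponds to eager loading associations (e.g. `Post.includes(:comments)`), which is the usual remediation for the pain he describes.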

But nowadays I think the trend is so anti-framework, and so pro-modularization, with tiny libraries that all do their own thing, that we’ve all effectively become framework maintainers. Not that everyone’s inventing their own, but we’re curators now of a manifest of, say, thirty dependencies, and no one else in the world will have all thirty dependencies at exactly the same versions that you do. Which means that the onus is now on you to make sure they all work together correctly, and if there is any interplay between them that doesn’t, it is up to each project team tasked with building an application to also troubleshoot two potentially divergent dependencies that are stepping on one another.
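The "manifest of thirty dependencies" he describes is literally a file in most ecosystems. A hypothetical Ruby Gemfile sketch (the gems and versions below are illustrative, not from the interview) shows the curation problem: every constraint you write is a compatibility decision your team now owns.

```ruby
# Hypothetical Gemfile sketch -- gem names and version constraints are
# examples only, chosen to illustrate different kinds of commitment.
source "https://rubygems.org"

gem "rails", "~> 4.2"   # pessimistic constraint: any 4.2.x patch release
gem "pg", "~> 0.18"     # must stay compatible with the Rails version above
gem "sidekiq", ">= 3.0" # loose constraint: a future major may break interplay

# Bundler's generated Gemfile.lock pins exact versions, including
# transitive dependencies, so no two projects are likely to share the
# same full resolved set -- which is Justin's point about uniqueness.
```

When two of these disagree about a shared transitive dependency, the resolver surfaces the conflict, but deciding which constraint to relax is the curation work that falls on the team.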

Derrick:
It’s definitely an interesting perspective; I hadn’t thought of myself as a curator of a framework.

Justin:
Yep. And it’s not that small is bad and big monolithic stuff is good. It’s just that these are the sorts of costs, responsibilities, and roles we should be thinking about when maintaining something built that way.

After five or six years of working with a lot of open source technology projects, you start to see a lot of patterns, just basic pattern recognition. Like, oh, we’re finding that writing adapters for this third-party thing is really hard, or its API is seeping all over our project. How can we guard ourselves, push that out, build some scar tissue between us and that dependency, because it has been found to be problematic?

All those good habits tend to grow organically, but they can only happen if people are aware that all these dependencies leaking all over their stuff lead to design problems long term.

Derrick:
We often approach popular open source libraries with an assumption of security, because anyone can read them, and with trust, because they have a number of respected contributors. But you don’t buy that. Why?

Justin:
The concept that you’re alluding to was introduced in the book The Cathedral and the Bazaar, like fifteen years ago, maybe longer. But that was written in a time when, when most people thought about open source, they were thinking about really big operating systems, like Windows versus Unix: a total black box versus one, two, or three major projects that had tens of thousands of developers looking at them really closely. Over the last fifteen years, everything’s inverted. Now GitHub hosts hundreds of thousands of open source projects.

I think at some point we probably eclipsed the point where there were literally more libraries being used than eyeballs looking at the actual source code of other people’s open source. So in theory that would work, but the problem is that there are more projects than there are eyeballs.

The other thing that I think affects this is that the open source stack we stand on gets a little taller every year. Every year we find some new common solution to a well-known problem, and we stand on it. Fifteen or twenty years ago, when people were starting to form thoughts about what open source would mean for security, something like OpenSSL, for example, was a pretty cool library you would pick up to do your networking securely. But now it’s such a given that everything depends on OpenSSL that, you know, it’s mature, it’s mostly settled down, there aren’t a lot of maintainers, and it’s not a sexy thing to invest in from a marketing perspective if you’re Facebook or Google. So it mostly just gets no attention.

Even if you are a really big company, and you understand that OpenSSL working correctly is critical to your business, you know that it’s like a Mexican standoff: every company in the world depends on it. So it’s certainly very important that someone audit the security of a fundamental thing like that, but it’s nobody’s responsibility, and no one feels that accountability. I think that’s where we’ve really fallen on our face, where something like Shellshock happens to Bash, a program almost every developer uses, and lots of different internet-of-things devices use, but zero people feel like it’s their job to spend the weekend reading up on Bash to make sure that it is secure.

Derrick:
You describe the life of an open source project maintainer as something of a roller coaster ride that all too often ends with the maintainer getting burnt out. Why do you think this is so commonly the case?

Justin:
On the one hand, we kind of glamorize prolific open source authors. There are maybe hundreds of thousands of open source projects out there, but there are a handful of people where it’s like, hey, when I’m doing some Node stuff I’m going to look for a library that TJ Holowaychuk published, because he’s got a lot of them and his other ones were pretty good, so I’ll use this one too. Of course, he up and left Node one day, which was problematic for anyone who lived in his stack. We don’t do this intentionally, we don’t seek to live in this celebrity culture; it’s just that when there are so many options out there, we need all the tribal markers for quality we can find. That person wrote a good thing, and I like their approach to something else, so I’m going to use all their stuff. And that’s how that sort of power accrues.

Now, the second phenomenon, which I think is really interesting, is that as personality drives so much open source adoption, the asymmetric relationships just fail to scale. An illustration I was thinking of: imagine that you found a golden lemon that could squeeze an infinite amount of juice, so you started an open source lemonade stand. You squeeze lemonade from that lemon all day long, for anyone who wants it. And the early adopters of your lemonade stand are gonna be like, whoa, you're brilliant, this is amazing. Of course they're gonna get sick of the lemonade and move on eventually. But once the word gets out, even though you'll have gotten that initial fame and excitement from all the positive attention, pretty soon the line's gonna be so long that the upper bound on how much lemonade comes out is your own physical, painful labor. And at that point you're at a crossroads — and this is analogous to an open source maintainer: I don't get any distinct joy out of this anymore, I'm mostly just doing work for people for free, and they don't really appreciate it because they're used to it now.

So do I continue out of some misplaced sense of duty, or do I just quit and leave people in the lurch? I think that's how a lot of open source projects slowly atrophy and die.

Derrick:
How do you think we should go about fixing this?

Justin:
I think the successful, long-term sustainable open source projects are ones where a community formed around them in proportion to their growth and success. And I think all parties involved need to, A, recognize and acknowledge that, and B, accept some amount of responsibility rather than treating open source like a corporate welfare program. So if I'm a maintainer — and I acknowledge this now, after writing this talk and thinking about it a lot — okay, this project is at a hundred stars: I really shouldn't be the only owner on this repo, or this npm library, or this Ruby gem. Let's pull in a couple of other owners, because other people are joining. Oh, this project is at a thousand stars: let's look at a code of conduct, a governance model, some kind of mission statement for what this project's about, the core tenets.

You know oh it’s at ten thousand stars, like we should probably start to have like a technical committee so I get out of the role of dictator for life. Because it’s now way bigger than what I meant for it to be. So it’s a bit of humility right, like seeding control gradually, to avoid that burn out. But then from the perspective of users, they need to be actively seeking, and accepting like, hey you know I should be contributing back to this, and I should work with my employer to make sure that I have the time to contribute back to the projects that we use, that are interesting to me. So that they have the support that they need, so that they don’t just wither on the vine and die even though hundreds or thousands of applications might continue to depend on them.


“There is never going to be a day where you feel like your thing is ready enough for public consumption… so please just publish”


Derrick:
What else can project maintainers do to recruit more, and better, contributors, and to reduce friction for those interested in contributing?

Justin:
I really like the idea of an imagined application that was like a Match.com for putting together people who have open source projects that need additional contributors — maybe they have a lot of issues, maybe, like I said, they're getting more popular and they recognize this need — with people who are looking to work on open source. Maybe they want the additional visibility or prominence that comes from having a lot of open source stuff. Maybe they just want something to program on. Maybe they just want to improve their skills.

Whatever the reason, it would be really cool if an application, say, used OAuth, or a store of your various credentials, to be like: hey, you use this library and that library's author is looking for people. Or: hey, you write a lot of Node.js, and this person's got a Node.js project with this many stars, and they're desperate for a maintainer. Something to connect these people, because I think if you look at the network graph of people who are publishing open source, it's too small for them to solve it by simply tweeting out, or going to a user group and saying, hey, will someone please maintain this. Because usually, by the time they know to ask for help, it's late enough that the project is no longer appealing — it's probably mostly settled down, and the only work left to do is to put up with all of the random people who basically want free tech support.

Those are some ideas, but short of that it's hard; it's better to try to solve that early than to solve it late.

Derrick:
So for problematic projects at work, we use techniques like improving documentation, retrospectives, stand-ups, and pair programming to overcome some of these problems, yet you rarely see them applied in open source projects. Do you think there is a role for these and other ways of improving communication between maintainers, contributors, and users?

Justin:
So, comparing a product team in a business versus an open source team: it's almost like in a product team you'll have role players who specialize in one area or another. In an open source team you have a dozen people who all want to be point guard. They don't necessarily have any incentive or organization to be a role player that slots into just one particular aspect of the project — unless there is somebody, or some group of people, smart enough and experienced enough to be able to, like you just did, carve out all of these individual responsibilities: if we don't do these things, problems are gonna happen. We have to individually justify these activities, because we don't have some boss, and we don't typically have a paycheck coming. We need to justify for ourselves why these activities are important.

Then some of it is just shit work compared to other stuff. Everyone wants to be the genius who gets to code a thing and gets to be famous for building it. But look at somebody who is a great writer, a great teacher and instructor: if they wrote the documentation pages, that would be fantastic for the community. But that doesn't necessarily get the same accolades or glory. So it's incumbent on everyone else on the team to lift that person up, to point out how great they are and how they're helping.

That’s what I see on really successful open source teams. Is giving people a reason to fill those roles that isn’t just about money. Because once you put money into it, it might as well be a starupified or productified open source, in kinda you know name but not spirit only project.

Derrick:
What do you think of initiatives like the TODO Group and Ruby Together, and what more do you think companies who use open source components can be doing?

Justin:
I’m friends with Brandon Keepers at Github, who’s done a lot of there work through Todo group, and it sounds like in many ways, its goal is to solve a lot of the systemic problems, or at least address as a bunch of companies realizing like hey none of us individually have an individual interest in being the one to own these shared concerns, like security audits. But we understand whenever there is one we all have a huge business risk to that.

So I love the idea of them coming together, putting a little bit of funding behind it, and seeing where it goes. I don't think it's gonna turn into this thing where it's developing a lot of standards and process that other people are gonna feel beholden to. They might publish some examples, like an example code of conduct that you could use, or something like that to help people get started.

But I think its biggest point of value is gonna be: how do we solve some of these interweaving problems that no individual person feels responsible for? When you think of organizations like Ruby Together … I don't know very much about Ruby Together per se, but I've had conversations with other people about things like Gittip, or other efforts to put money into commonly used projects as support. On one hand there are sponsorships — hey, we're RubyGems, we need hosting — where there's a real, clear cost. Where it gets a little muddy is this: I think a lot of people have great intrinsic motivators for working on open source, and then you put a bounty on it — hey, if you get this feature done — or you do individual sponsorships — hey, as a company we want this feature in this open source thing, so we're just gonna fund two months of the developer's time.

A lot of those are really interesting, and some of them might scale, and some of them might work. But it's important to know that there's still this impedance mismatch. The reason anyone starts writing an open source thing isn't to get paid; it's because it scratches an itch, or it's something they found interesting. There's a real risk that once you start throwing money at it — whether at the individual or at specific features and issues — you're probably going to change the incentive structure and alter the course of where that project goes.

So I’m a little bit more cautious around the argument that open source work is labor too, and we need to be paying directly for people’s open source labor, through some kind of like patron model. Another example of that that stands out to me, is I’ve got a friend who is a very prolific open source developer, and he could totally start a Kickstarter tomorrow and raise two hundred thousand dollars a year, if he wanted. To just do open source, and he won’t do it because he’s a smart guy, and he know that as soon as he gets however many five dollar donations it takes to get up to that number. Those are gonna act like customers, like if they don’t get the feature that they want or they don’t get their bug fixed, they’re all gonna come knocking down. Then it’s that lemonade stand problem again. Of not only do they expect the free lemonade but it was five dollar lemonade, and they paid for it and they want a refund.

So all of these things sound like solutions, but I don't think any of them are ever going to solve the problem, because code that's written for an egalitarian reason, or a non-financially motivated reason, can't be sustained just by throwing dollars at it.


“Our industry has organized itself around, essentially a lie… that faster equals better”


Derrick:
Can you recommend any resources to those wanting to learn more about running sustainable open source projects, or those wanting to contribute to open source more effectively?

Justin:
So first, when I meet people at conferences or user groups, a lot of them will come up to me, and it's like a hallway confession almost: hey man, I like all your open source stuff and I would totally open source stuff too — I've got this thing that I've been working on for like five years, but it's not ready yet, it's not ready to be open sourced yet. I just want to say, you know … okay, bye. So clearly they felt the urge to come and talk to me about their not-yet-open-source thing. They have this kind of intention towards contributing.

What I always tell them is that there is never going to be a day when you feel like your thing is ready enough for public consumption, where you just flick it from private to public and open the floodgates. First of all, I probably have like 200 or 300 repositories between Test Double and my personal account. That ready-or-not calculation is expensive, so I just default to open sourcing absolutely everything.

You’ve never heard of 99% of my repositories, I guarantee it, because most of them weren’t that useful, maybe didn’t go great, maybe it just didn’t get viral, and a couple others did. The only reason that those ones did get successful, is because I was defaulting to open sourcing everything that I do that doesn’t have like a strict business model in front of it.

So that’s what I tell people about getting started and open sourcing. The first place is like if you just publish everything openly, that’s going to look real good when you have all this stuff to show, even if it’s not perfect.

Oh and don’t worry about people looking at your code and judging you, because a lot of people say they read open source, but like we discussed earlier about security, no one actually reads open source. Like if there’s a bug or something they look at it, but no one’s going to read your code. They’re going to see that you have thirty repositories, and that presumably they work if they have a little green badge at the top.

So please, just publish. The other bit of advice, about resources for successful, sustainable, long-term projects — once something's been successful, where do you look? I don't think that book's been written; I don't think there's a starter pack for how to build a community, at least not in the open source world. But what I would look at is the non-vendor-backed, very successful projects. The two most relevant to my personal technical experience are Ruby on Rails and, even more so, Ember.js. When you look at Ember, yes, there are a handful of agencies that kinda back it and formally sponsor it — obviously Tilde is the Ember company — but they're not Oracle, they're not Facebook or Google. They're a smaller group of developers who just care a lot about making a ten-year framework, and putting in place all of the structure they need to build the community around it. Now look at it: there are Ember meetups in, I think, thirty different cities; we have one here in Columbus.

There’s a very strong affinity and a strong sense of ownership by even people who just use Ember. So they’ve done a great job of leveling people up, encouraging contribution, and encouraging engagement really at every level. So look at everything that they’re doing.

Derrick:
Justin, thank you so much for joining us today.

Justin:
Thanks so much for having me, I really appreciate it.

Anyone watching this who wants to get in touch with me directly, please don't be shy — just justin at testdouble dot com. I'd be happy to talk to you.

Going Beyond Code to Become a Better Programmer – Interview with Pete Goodliffe

In this interview with Pete Goodliffe, author of ‘Becoming a Better Programmer’, we dive into issues that go beyond code and separate the good from the great developers. We cover things like attitude, communication skills, managing complexity and what you can do to learn more and keep your skills up to date.

Transcript

Introduction

Derrick:
Pete Goodliffe is a programmer and software development writer, perhaps best known for his software development book, ‘Code Craft’. He speaks regularly at conferences on software development topics and recently published ‘Becoming a Better Programmer: A Handbook for People Who Care About Code’. Pete, thank you so much for joining us. Would you like to share a bit about yourself?

About Pete

Pete:
Sure. I’m a geek. I’m a developer. I like to say I’m a conscientious coder, making the world better one line of code at a time. I am a regular magazine columnist. I’ve been writing a column for more than 15 years now, which is probably some kind of record, I suspect. My day job, I’m a coder. I work in a really awesome team making fun stuff. I’m a musician and I get the opportunity to write code for musical instruments.

A Great Developer’s Attitude

Derrick:
In the book, Becoming a Better Programmer, you say the real difference between adequate programmers and great programmers is attitude. What are the characteristics of a great programming attitude?

Pete:
The standout difference between the really good coders that I’ve worked with and the guys that aren’t so great, is attitude. It’s not a hand-wavey thing. I’ve worked with guys who know technology and who know the idioms, and how to do all this stuff. If they don’t have the right attitude they’re just not effective programmers and they’re not great guys to work with. The kind of stuff I’m talking about here, is humility. You don’t want to work with guys who think they know it all but don’t. Being humble is the key thing.

It doesn't mean that's an excuse to not know stuff; it just means not believing that you're better than you are. Specifically, being in a state of constant learning, which I guess ties in with humility: constantly looking for new stuff, absorbing new knowledge, wanting to learn from other people, having the desire to do the best you can. It doesn't necessarily mean being a perfectionist and wanting to make everything perfect before you ship; it's doing the best you can in the time you have, with the resources you have. That kind of attitude really carries through to great code and great products, rather than sloppy work.


“I’ll mis-quote Sartre and say ‘Hell is other people’s code’”


Write Less Code

Derrick:
You’re an advocate for writing less code. Why is this and how can programmers actively work on coding more concisely?

Pete:
Yeah, less code. It seems kind of counter-intuitive for a coder, but I think most old hands know what I'm talking about. It's a case of working smarter, not harder. It's entirely possible to write thousands of lines of code and achieve nothing with them. Think about it: no unnecessary logic, don't write stuff that doesn't need to be said, don't write verbose code. Sometimes you can stretch out a boolean expression into a massive if statement which just hides what is being said.

Then there are pointless comments — we've all seen that, haven't we: code with an essay of a comment, and then the function body is really small. Somewhere in the essay there's some important information, but you didn't need the fifteen paragraphs of comments. No pointless comments; the code just does what it does. It's the writing of simple, but not simplistic, code. If you don't write enough code, it doesn't do what it's supposed to do — but avoid all those points of needless generality: don't make abstract interfaces and deep hierarchies if you don't need to extend them.
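As a small, hypothetical sketch of the verbosity Pete describes (the function name and rule are invented for illustration), here is the same check written as a stretched-out if statement and as a single boolean expression:

```python
# Hypothetical example: an eligibility check written two ways.

def is_eligible_verbose(age, has_account):
    # Boolean logic stretched into a nested if/else that hides the intent.
    if age >= 18:
        if has_account:
            return True
        else:
            return False
    else:
        return False

def is_eligible(age, has_account):
    # The same rule said once, directly.
    return age >= 18 and has_account

# Both versions agree, but only one reads at a glance.
assert is_eligible_verbose(20, True) == is_eligible(20, True)
assert is_eligible_verbose(17, True) == is_eligible(17, True)
```

The second version says what it means in one line; the first forces the reader to mentally collapse the branches back into the boolean it started as.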

Communicating Effectively

Derrick:
For most professional programmers, development is a social activity in which communication is key, how can programmers begin to communicate more effectively?

Pete:
Code is communication. You're communicating not just with the computer; you are communicating with the other people reading the code. Even if you are working by yourself, you are communicating with yourself in two years' time, when you pick up the same section of code. It's a skill, it's something you learn, and it's something you consciously practice. I don't know of courses — university courses or practical courses — that really focus on what is quite an important skill for programmers. It's something you need to consciously practice, consciously work on, and be consciously aware of.

It's true some people come out of the gate better placed than others. Some people can talk well; some people are shy and retiring, but that doesn't necessarily mean you are stuck like that, and it doesn't necessarily make you a bad communicator. Some people communicate better in different media, and it's worth bearing that in mind. Some people are really great on email: they can write concise, clear descriptions; they can follow a line of argument, writing and explaining something really well. Other people struggle to put it together in words. It's about learning how you communicate well, playing to your own strengths, and then picking the right medium.

Handling Complexity

Derrick:
What are some ways developers can approach managing complexity in software development?

Pete:
A key thing to understand is the difference between necessary complexity and unnecessary complexity. The reason people pay us to write software — unless we're doing it for fun — is because there's a complicated problem that needs to be solved. There is some necessary level of complexity in software engineering that we have to embrace and understand. The problem is the unnecessary complexity. You can take a problem you need to solve and add a little bit of extra complexity when you wrap it up in a class design, and higher up it reveals itself in the architecture.

One thing I know about well-crafted software — software that has the necessary complexity but none of the unnecessary complexity — is that when I look at it, it looks obvious. That is the key hallmark of excellent code. You look at it and you think, "They've been working on that for a while," and, "That's clearly right." You know it wasn't simple to write, but when you look at it, the solution is simple; the shape is simple. All I can say is, that's what we should strive for.


“I have learned the most in my career when I have been around excellent people”


Derrick:
What are some things developers can do to tackle messy or bad code bases?

Pete:
The most important thing, when you get into a codebase, is to ask people. I see so many developers who just won't swallow their pride and say, "I don't quite know what this is doing, but I know Fred over there does. I'll just go talk to Fred about it." Often those little bits of insight give you a super-fast route through something intractable; a little explanation takes you in there. I'll mis-quote Sartre and say "Hell is other people's code." We all kind of go into that — I do this, and I really struggle with it. I pick up some code, I look at it: that's a bit dodgy, isn't it? "I really wouldn't do it like that. What were they thinking? They must be idiots!"

Then my code — I think it's great; I understand it perfectly. But then somebody else picks it up, and they'll make that same judgement call on my code. What I think is their terrible hack is actually some pragmatic thing they did for a very good reason. When you're reading messy code, looking at messy code, enter with humility. Nobody really goes out of their way to write badly. Nobody goes out of their way to write messy code, in general — I can't say I've found anyone who really tried to ruin a project. Approach code with that attitude.

The retrospective prime directive — I can't remember exactly how it goes — is that we truly believe everyone did the best they could, to the best of their abilities, at the time, given what they knew, and so on. This stops you from making judgement calls, stops you from saying, "Oh, I'm going to rip this whole thing out and start again." It teaches you the humility to look a little deeper first.


“The standout difference between the really good coders… and the guys that aren’t so great, is attitude”


Keeping Your Skills Up to Date

Derrick:
The stuff that we work with is constantly changing and it’s all too easy to find yourself becoming something of a coding dinosaur. What can programmers do to ensure that they keep learning and developing their skills?

Pete:
If we value our careers and our skill sets, then this is something you should really be caring about. This is something I find really challenging for myself right now as well. The things I’m focusing on and learning right now are not necessarily coding-related stuff. I’m challenged with learning management, some high-level decision and tactical thinking on a project, rather than dipping into low-level coding stuff. Which is also really fun, but it does mean that I’m pulled away from thinking about the lower-level technical stuff. I want to challenge myself, not get stale and not become that coding dinosaur.

It's interesting, because I know the tools that I know well, and I do use them regularly, but it is really easy to become stale if I'm not pushing the envelope on my technical skills. The biggest takeaway, I guess, is passion. If you don't want to become a coding dinosaur, you probably won't, because you care about it enough that you will learn, you will read, you will spend time, you will watch webcasts, whatever. If you don't have the desire to learn, if you don't have the motivation to do it, that's when you stagnate. That's when you become a dinosaur.

Recommended Resources

Derrick:
What are some resources you can recommend for those seeking to become better programmers?

Pete:
The biggest thing for me, which I have done personally and continue to do, is to sit at the feet of great coders. I have learned the most in my career when I have been around excellent people who I can learn from, whose skills can rub off on me. I have moved jobs — I have moved myself physically — to be able to work with those people. If you have the liberty to do that, do it. It's also joining in those conversations: replying on Twitter, blogging yourself, joining the local user groups, and all that good stuff.

You want to become a better programmer? Again, it's the wanting to be a better programmer, and just stoking that passion. I'm enthusiastic; I love this stuff. If you have that enthusiasm, that passion for programming, it shows in the code that you write.

Derrick:
Really appreciate your passion today. We can definitely see it, and we hope the viewers enjoy it.

Pete:
Excellent. Thank you. Cool.

Working Effectively with Unit Tests: Unit test best practices (Interview with Jay Fields)

In this interview with Jay Fields, Senior Software Engineer at DRW Trading, we discuss his approach to writing maintainable unit tests, described in his book ‘Working Effectively with Unit Tests’. We cover unit test best practices: how to write tests that are maintainable and can be used by all team members, when to use TDD, the limits of DRY within tests, and how to approach adding tests to untested codebases.

For further reading on unit test best practices, check out Jay’s blog where he writes about software development.

Introduction

Derrick:
Jay Fields is the author of Working Effectively with Unit Tests, a co-author of Refactoring: Ruby Edition, and a software engineer at DRW Trading. He has a passion for discovering and maturing innovative solutions. He has worked as both a full-time employee and a consultant for many years. The two environments are very different; however, a constant in Jay's career has been finding ways to deliver more with less. Jay, thank you so much for joining us today. We really appreciate it. Can you share a bit about yourself?

About Jay

Jay:
Thanks for having me. My career has not really been focused in a specific area. Every job I've ever taken has been in a domain I didn't know at all, and in a programming language I didn't really know very well. It started with joining ThoughtWorks: being a consultant was a new thing for me, and I was supposed to join and work on C# but ended up in the Ruby world very quickly. I did that for about five years, and then went over to DRW Trading to do finance — again, something I'd never done — and to do Java, something I had no experience with. That worked okay, and then I quickly found myself working with Clojure. It's been interesting always learning new things.

Derrick:
Picking up on the book Working Effectively with Unit Tests, most developers now see unit testing as a necessity in software projects. What made you want to write a book about it?

Jay:
I think it is a necessity in pretty much every project these days, but the problem is, I think, really a lack of literature beyond the intro books. You have the intro books, which are great, and I guess we have the xUnit Test Patterns book, which is nice enough as a reference. It's not very opinionated, and that's fine — we need books like that also. But if I were to say, "I prefer this style of testing, and it's very similar to Michael Feathers' approach," there's no literature out there that really shows that. Or, "I prefer Martin Fowler's style of testing" — there's no literature out there for that either. I really don't know of any books that say, "Let's build upon the simple idea of unit testing, and let's show how we can tie things together." You can see some of that in conference discussions, but you really don't see extensive writing about it. You see it in blog posts, and that's actually how my book started: it was a bunch of blog posts that I had accumulated over 10 years that didn't really come together. I thought, if I were to say to someone, "Oh yeah, just trawl my blog for 10-year-old posts," they're not really going to learn a lot. If I could put something together that reads nicely, that's concise, people could see what it looks like to pull all the ideas together.

Writing Unit Tests in a Team

Derrick:
In the book you say, “Any fool can write a test that helps them today. Good programmers write tests that help the entire team in the future.” How do you go about writing such tests?

Jay:
It's really tough. I don't see a lot written about this either, and I think that's a shame. I think you first have to start out asking yourself why. You go read an article on unit testing, and you go, "Wow! That's amazing! This will give me confidence to write software." You go about doing it, and it's great, because it does give you confidence about the software you're writing. But the first step a lot of developers don't take is thinking, "Okay, this is great for writing. It's great for knowing that what I've just written works, but if I come back to this test in a month, am I going to even understand what I'm looking at?" If you ask yourself that, I think you start to write different tests. Then once you evolve past that, you really need to ask yourself, "If someone comes to this test for the first time, goes to a line of code in this test and reads it, how long is it going to take for them to be productive with that test?" Are they going to look at that test and say, "I have no idea what's going on here"? Or is it going to be obvious that you're calling this piece of the domain that hopefully everyone on the team knows — and if they don't, it follows a pattern they can pick up pretty easily? I think it's a lot about establishing patterns within your tests that are focused on team value and on maintenance of existing tests, instead of focused on getting you to your immediate goal.
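As a hedged illustration of the kind of test Jay describes — one a first-time reader can be productive with immediately — here is a sketch in Python. The `Order` class and its discount rule are invented for this example; the point is that every value the assertion depends on is visible at the call site, and the test name states the domain rule in plain language:

```python
# Hypothetical domain class, invented for illustration.
class Order:
    def __init__(self, total):
        self.total = total

    def discount(self):
        # Invented rule: orders over 100 earn a discount of total // 10.
        return self.total // 10 if self.total > 100 else 0

# The test name states the rule; the body shows input and expectation
# together, so a newcomer reading one line understands the behavior.
def test_orders_over_100_earn_a_discount():
    assert Order(total=200).discount() == 20

def test_small_orders_earn_no_discount():
    assert Order(total=50).discount() == 0

test_orders_over_100_earn_a_discount()
test_small_orders_earn_no_discount()
```

Nothing here is hidden in shared fixtures or helper modules, so a teammate landing on a failing assertion can reason about it without leaving the test.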

DRY as an Anti-Pattern

Derrick:
You also mentioned that applying DRY, or don’t repeat yourself, for a subset of tests is an anti-pattern. Why is this?

Jay:
It's not necessarily an anti-pattern; I think it's a tradeoff, and I think people don't recognize that nearly enough. Say you're a programmer on the team. You didn't write the test. The test is now failing. You go to that test, you look at it, and you go, "I don't know what's going on here. This is not helpful at all. I see some field that's magically being initialized; I don't know why it's being initialized." If you're an experienced programmer, then hopefully you know to go look for a setup method. But imagine you have some junior guy, just graduated, a fantastic programmer — he's just not really familiar with xUnit frameworks. Maybe he doesn't know that he needs to look for a setup method, so he's basically stuck; he can't even help you at that point without asking someone else for help. DRY's great if you can apply it on a local scale within a test. It's fantastic if you can apply it on a global scale, across the whole suite, so that everybody's familiar with whatever you're doing — that helps the team too. But if you're saying that this group of tests in this file behaves differently from those tests over there, you're starting to confuse people. You're taking away some maintainability, and that's fine — maybe you work by yourself, so no one's confused because you did it — but recognize that tradeoff.
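The two styles Jay contrasts can be sketched with Python's `unittest` (its `setUp` method plays the role of the xUnit setup method he mentions). The `Account` class is invented for this illustration:

```python
import unittest

# Hypothetical domain class, invented for illustration.
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount

# DRY style: `self.account` appears "magically". A reader landing on a
# failing assertion has to know to go hunting for setUp to learn what
# the field is and how it was initialized.
class WithdrawTestDry(unittest.TestCase):
    def setUp(self):
        self.account = Account(balance=100)

    def test_withdraw_reduces_balance(self):
        self.account.withdraw(30)
        self.assertEqual(self.account.balance, 70)

# Explicit style: a little duplication across tests, but each test reads
# top to bottom with no hidden state.
class WithdrawTestExplicit(unittest.TestCase):
    def test_withdraw_reduces_balance(self):
        account = Account(balance=100)
        account.withdraw(30)
        self.assertEqual(account.balance, 70)
```

Both suites pass; the difference is purely how much context a maintainer must carry in their head when one of them fails.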

When to use TDD

Derrick:
You’re a proponent of the selective use of Test-Driven Development. What type of scenarios are helped by TDD, and when should a developer not apply such techniques?

Jay:
It’s a personal thing. For me, I think I can definitely give you the answer, but I would say everybody needs to try TDD, just try it all the time. Try to do it 100% of the time, and I think you’ll very likely find that it’s extremely helpful for some scenarios, and not for others. Maybe that’ll differ by person, so everyone should give it a try. For me personally, I’ve found that when I know what I want to do, if I have a pretty mature idea of what needs to happen, then TDD is fantastic because I can write out what I expect, and then I can make the code work, and I have confidence that it worked okay. The opposite scenario where I find it less helpful is when I’m not quite sure what I want, so writing out what I want is going to be hard and probably wrong. I find myself in what I think is kind of a wasted cycle of writing the wrong thing, making the code do the wrong thing, realizing it’s wrong, writing the new thing that I expect which is probably also wrong, making code do that, and then repeating that over and over, and asking myself, “Why do I keep writing these tests that are not helpful? I should just brainstorm, or play around with the code a little bit, and then see what the test could look like once I have a good idea of the direction it’s going.”

Don’t Strive for 100% Test Coverage

Derrick:
You say you’re suspicious of software projects approaching 100% test coverage. Why is this, and why is 100% coverage not necessarily a goal we should all strive for?

Jay:
Earlier on I thought 100% was a good idea, because I think a lot of people did. I remember when Relevance used to write in their contracts that they would do 100%, and I thought, “Man, that’s really great for them and their clients.” Then you start to realize you need to test things like, say you’re writing in C#, an automatically generated getter and setter. Do you really want to test that? I think all of us trust that C# is not going to break that functionality in the core language, but if you put that field in there, then you have to test it if you want to get 100% coverage. I think there will be cases where you would actually want to do that. Let’s say you’re using some library, and you don’t really upgrade that library very often, and even though you trust it, maybe you’re writing for NASA or maybe you’re writing for a hospital system – something where if it goes wrong, it’s catastrophic. Then you probably want to write those tests. But if you’re building some web 2.0 start-up, not even sure if the company is going to be around in a month, and you’re writing a Rails app, do you really want to test the way Rails internals work? Because you have to, if you want 100% coverage. If you start testing the way Rails internals work, you may never get the product out there. You’ll have a great test suite for when your company runs out of money.
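Jay’s C# auto-property example has a direct Ruby analogue. This hypothetical sketch (the `Customer` class is invented for illustration) shows the kind of test that 100% coverage demands but that only exercises the language itself:

```ruby
# attr_accessor is code the language generates for you, much like a
# C# auto-property: Ruby writes the getter and setter.
class Customer
  attr_accessor :name
end

# A "test" like this only proves that attr_accessor works -- something
# the Ruby implementation already guarantees -- yet chasing 100% line
# coverage forces these generated methods to be exercised somewhere.
customer = Customer.new
customer.name = "Ada"
raise "accessor broken" unless customer.name == "Ada"
```

The coverage number goes up, but no application logic has been verified.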

Adding Tests to an Untested Codebase

Derrick:
So that’s what’s wrong with too many tests. What about a code base with no tests? How can you approach getting test coverage on an untested code base?

Jay:
I think the focus for me is really return on investment. Do you really need to test everything equally when the business value is not the same? Let’s say for instance, you’re an insurance company. You want to sell insurance policies. So you need a customer’s address, and you need a social security number, probably, to look them up, some type of unique key, and after that, you just want to charge them. Maybe you need their billing details, but you don’t really care if you got their name wrong, you don’t really care if you got their age wrong. There are so many things that aren’t really important to you. As long as you can keep sending them bills and keep getting paid and find the customer when you need to, the rest of the stuff is not as important. It’s nice. Whenever you send them the bill, you want to make sure the name is correct, but it’s not necessary for your software to continue working.

When I’m writing tests, I focus first on the things that are mission critical. If the software can’t succeed without a function working correctly, or a method working correctly, then you probably need some tests around that. After that you start to do tradeoffs, basically looking at it and figuring out, “Well, I want to get the name right. If I get the name wrong, what’s the cost?” Well, if I get the name wrong, I’m not sure there’s much of a cost other than maybe an annoyed customer who calls up and says, “Can you fix my name?” So, you maybe have some call center support. I’m guessing that the call center is going to be cheaper than the developer time. So do we want to write a test with a regex for someone’s name, and now we need UTF support, and now we need to support integers because someone put an integer in their name? You get into this scenario where you are maintaining the code and the tests, and maintaining the tests is all that stands between you and a call center call. It’s probably not a good tradeoff. I just look at the return on investment of the tests. Every amount of code needs to be maintained, so if I have too many tests, then I need to delete some because I’m spending too much time maintaining tests that aren’t helping me. If I have not enough tests, then it’s very simple: just start to write some more. I tend to do that whenever a bug comes in for something that’s critical. Hopefully, you caught it before then, but occasionally they get into production, and you write tests around that. I always think to myself, whenever the bug comes in, what was the real impact here?
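The name-regex spiral Jay describes might look like this minimal sketch (the pattern and `valid_name?` helper are hypothetical, invented for illustration): each unusual-but-real customer name forces another revision of a test that protects nothing critical.

```ruby
# Hypothetical name validator: it started as a simple ASCII regex, then
# grew to handle spaces, apostrophes, hyphens, and Unicode letters.
NAME_PATTERN = /\A[[:alpha:] .'-]+\z/

def valid_name?(name)
  name.to_s.match?(NAME_PATTERN)
end

# Each of these once broke an earlier version of the pattern:
raise unless valid_name?("Renée Müller")  # Unicode letters
raise unless valid_name?("O'Brien")       # apostrophe
# A name containing digits still fails -- is fixing that worth more
# test maintenance than the occasional call-center correction?
raise if valid_name?("X Æ A-12")
```

Every revision here is test and validator maintenance spent avoiding, at worst, a customer phoning in a correction, which is exactly the return-on-investment comparison Jay suggests making explicitly.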

Common Mistakes with Unit Tests

Derrick:
What are some of the common mistakes you see people making when writing unit tests?

Jay:
I think the biggest one is just not considering the rest of the team, to be honest. It’s really easy to do TDD. You write a test, and then you write the associated code, and you just stop there. Or, you do that, and then you apply your standard software development patterns, so you say, “How can I DRY this up? How can I apply all the other rules that have been drilled into me that I need to do with production code?” What’s really important now is understanding the maintenance side: what will help you write the code doesn’t necessarily help you maintain the code. What I find myself often doing, actually, is writing the test until I can develop the code, so I know that everything works as I expect, then deleting that test and writing a different test that I know will help me maintain it. I think the largest mistake people make is they don’t think about the most junior member of the team. Think about this very talented junior member on your team who joined not that long ago. Are they going to be able to look at this test and figure out where to go from there? If the answer’s no, then that might not be the best test you could write for the code.

Derrick:
Beyond your book, can you recommend some resources for developers interested in learning more about writing effective tests?

Jay:
I think that there are great books that help you get started. I think the Art of Unit Testing is a great book to help you get started. There’s the xUnit Patterns book that’s really good. The problem is, I really don’t think there’s much after that. At least I haven’t found much. Kevlin Henney has done some great presentations about test-driven development. I personally really like Martin Fowler’s writing and Michael Feathers’ writing and Brian Marick, but I don’t know of any books. I really think that there’s room for some new books. I think that, hopefully, people will write some more because unit testing’s only going to be more important. It’s not going away, it’s not like people think this is a bad idea. Everybody thinks this is a great idea. They just want to know how to do it better.

Derrick:
Jay, thank you so much for joining us today. It was a pleasure.

Jay:
Yeah, it was great! Thank you very much for your time.

 

For further reading on unit test best practices and software development, check out Jay Fields’ blog.

How to build a code review culture: An Interview with Derek Prior

We’ve interviewed Derek Prior, a Developer at Thoughtbot and host of The Bikeshed Podcast. We discuss how to build a code review culture, diving into the benefits of code reviews, the essential elements to make them effective and how to handle conflict if it arises.

Introduction

Derrick:

Derek Prior is a developer for Thoughtbot in Boston. He co-hosts a web development podcast called, “The Bike Shed”. He speaks about development practices at conferences, including the talk, “Cultivating a Code Review Culture”. Derek, thank you so much for taking time to join us today. Do you have a bit to share about yourself?

Derek:

I’ve been speaking at conferences and meet-ups lately about code reviews, like you said. It’s something that I’ve had a lot of experience with over the last 10 years or so.

Code Reviews Vs. Pair Programming

Derrick:

Some people mean different things when they use the term “code review”, like “pair programming”. What do you mean by code reviews?

Derek:

There’s this term, and it’s not my term, called “modern code review”, and that’s what I’m talking about. The definition of that is basically asynchronous, tool-driven, generally lightweight reviews. For most of the people I work with, that means a GitHub pull request. That’s typically what I’m talking about. Pair programming is great, and has a lot of the same benefits. But even when I’m doing pair programming, I still like to have another person review the code afterwards.

The reason for that is in pair programming, you’re kind of building up a solution with your pair, as you go. I really like to see at the end, if that holds up for another developer who might have to work on this next week kind of thing. Like, does this make sense to somebody who wasn’t there all along the way.

“It’s much more helpful to focus on the cultural benefits of code reviews”

Benefits of Code Reviews

Derrick:

Why are code reviews important?

Derek:

Everybody’s natural reaction, when they hear about this, is to say that code reviews catch bugs. That is true. Code that’s been reviewed is going to have fewer bugs than code that isn’t reviewed. But I think that puts too much importance on the “finding bugs” part. Microsoft Research actually did a study of this in 2013, where they found that people consistently said the number one benefit of code reviews is finding defects in code.

But then when they looked at actual data collected in their code review tools, and talked to people after they did code reviews, what they actually found was that people got way more benefit, way more cultural benefits, out of sharing knowledge with each other, or keeping up with what everybody’s doing, and knowing what’s going on in that other part of the code base. Finding a really interesting alternative solution to a problem that they hadn’t thought of, just by getting somebody else’s viewpoint.

Those are the much more important things. I feel like they’re much more interesting than finding defects. Finding defects is actually, frankly, really hard. A lot of times I’ll talk to people about code reviews, and they’ll say, “Well we did code reviews, but we still had all these bugs. They weren’t helping us catch the bugs”, and yeah, they’re not going to catch all the bugs. By saying, ‘The chief thing we get out of code reviews is “defects finding”’, what we’re really doing is setting ourselves up for those situations where people say, “But you did a code review on this, and there’s still a bug.” Right?

It’s hard to find all the bugs, because when you’re doing one of these lightweight code reviews, you’re really just looking at a slice of a change. You’re looking at the diff, and to know exactly how that’s really going to impact your system, you have to know the entire system. You can catch edge cases where, “Oh, you didn’t check to see if this was nil”, or whatever the case may be. Knowing exactly where this is going to screw up your data, is hard to know, without knowing the whole system. Code review is great for some defect tracking, or defect finding, but it’s not a panacea. Instead, I think it’s much more helpful to focus on the cultural benefits of code reviews.

Derrick:

How should you work code reviews into your workflow?

Derek:

What we typically do at Thoughtbot is, when I finish up a PR, I will paste it into Slack, or whatever. We’ll paste that in and say, “Can I get a review for this please?” If it’s not urgent, that’s basically all that happens, right? If it’s urgent, I will ping somebody directly. Ping a couple people directly and say, “Hey, this is a production fix, can you take a look at this?” Generally, that’s enough, with the way we work, to get your code reviewed within the next few hours. Which is, generally sufficient.

You can move on to something else. You can review somebody else’s code in that time. You can kind of trade, if somebody else is coming up and finishing something up. You can be, “Oh yeah, I’ll take a look at that. You take a look at this one for me.” That kind of thing. We take a really, lightweight approach to it.

When you think about it, there’s a lot of natural breaks throughout your day that you can work these in to right? For me, I come in in the morning, “are there any PRs outstanding I can look at?” Then I get started. Then, right before lunch, or in the afternoon, if I’m going to take a coffee walk around Boston Common or something, I’ll look then. Either before or after. There’s plenty of times I find, that I can just work these in naturally. There’s no need to schedule them.

I’ve worked with teams that try to schedule them, or try to say, “This person is going to be the one that’s going to be chiefly doing code reviews this week”, and I’ve never seen that work particularly well. I just try to say, “Keep it lightweight, friendly”, things like that. People are really surprised, or skeptical, to hear that this works. The biggest thing that makes this work is keeping small, discrete pull requests, which are going to be much more easily reviewed, and providing some excellent context. That’s the “why” you’re making this change, not necessarily the “what”. Ultimately, the big secret to this is that most of these code reviews only take me five minutes. It’s not a big commitment.

What to cover in a Code Review

Derrick:

What type of things should a code review cover?

Derek:

It’s really important that each person involved in the code review, doesn’t operate from a central checklist, but rather just looks at things that interest them about a change right? Or interests you overall. Maybe you just got finished reading this great book on design patterns. You’re looking for a way to apply this one design pattern. That’s what you’re going to look for. Great. Fine. That’s valuable. You’re going to teach somebody something.

Or maybe you’re really interested in web security, and you’re going to take a look at things from that perspective. Or accessibility, you’re going to take a look at things from that perspective. That’s where having a good blend, on your team, of people who are interested in different things, really pays off, in code review. We’re all going to look at code reviews, and try and see if there’s an obvious bug. I said, not stressing the bug finding, is a big key to this.

But yes, you’re going to look for them, and you’re going to comment on them when you find them. What I really like when people do, is just sort of take their slant on how they think software development should be. Just kind of look at it with their lenses and see, “Oh, did you consider doing this alternative thing over here?” I typically look at … I’m always harping on naming. Right? Part of what I think makes code review so great, is that it’s a great place to have a technical discussion about your actual software. Rather than in the abstract, like when we go to conferences and things like that.

I harp on naming a lot to say, “Do these names give us a good basis for a conversation about this? Are they descriptive?” That’s really important to me. I look for test coverage, to see … Like I said, I’m not hunting for bugs throughout the entire system … But I am looking to see, “Is this adequately covered by tests?” I’m not going to look at every single test, but I’m just going to kind of get an idea of what kind of tests I would expect to see in this change. I’d expect to see a feature spec. Or I’d expect to see a unit test. That kind of thing. Just make sure that those are there. Stuff like that.

Like I said earlier, the biggest thing is just everybody kind of, bringing whatever it is that they’re interested in … Whatever they’re an expert at … To the table, so that we can all learn from each other.

“Having conflict in your code reviews is actually really beneficial”

Code Review Author and Reviewer Best Practice

Derrick:

As a code author, what should you do to get the most from a code review?

Derek:

The number one thing is context. When you’re submitting a pull request, you’ve been working on this thing for four hours, or eight hours, or two days, or sometimes even a week, or whatever the case may be. You have a lot of context built up in your head. Some things seem really obvious to you at this point. But, to the person reviewing it, they weren’t there for you. They don’t have that context. What you really need to do is think about… It really helps if you’ve been making several small commits along the way, where you’re describing what was in your head.

But as you prepare the change… If you’re going to use a GitHub pull request, and you’re going to submit your pull request to GitHub, make sure you give a nice description of, a summary, of everything … What I really like to see is, not what … A lot of people will summarize what they did? That’s kind of important, the bigger the change gets, so I know what to expect when I’m looking through the change? But it’s not the only thing I want to see, right?

I can figure out what a change does by looking at the code. Unless it’s totally obfuscated. What’s really interesting is why. Why are we changing it? Why is this the best solution? What other solutions did you consider? What problems did you run into? Is there an area of the code you’re really unsure about, that you’d really like some extra eyes on? That kind of thing. Those are going to help you get a much better review, by setting up everybody else to be on the same page as you.

Derrick:

When you’re reviewing code, how do you do it without being overly critical or needlessly pedantic?

Derek:

This happens a lot. Code reviews that aren’t done particularly well, or are overly critical, can lead to resentment among the team. The big thing there is negativity bias, which you need to be aware of. If I’m talking to you, like I am in this conversation, and I give you some technical feedback, and I say, “Oh, why didn’t you use this pattern here”, whatever, you’re going to perceive that in one way. But if I say that same exact thing written down, it comes off more harsh. You’re going to perceive that more negatively.

It’s much more subject to your particular mood. I can’t influence it with the way I say something. It’s just a fact that, the same feedback written, is going to be perceived more negatively. One excellent way to get around that, is to give feedback in a manner that’s more of a conversation. What I like to do is ask questions. At Thoughtbot, we call this “asking questions, rather than making demands”, right?

Instead of saying, “Extract the service object here, to reduce some of this duplication”, I would say, “Hey, what do you think about extracting a service object here, to eliminate this duplication?” Right? They’re very similar comments, but now I’m opening it up to a conversation, by asking you a question. Like, “Oh, what do you think …?” Sometimes I’ll even say something like, “What do you think about doing it this way?”, and I’ll provide a code sample. Then afterwards, if I don’t really feel particularly strongly about it, I’ll be like, “No, I’m not really sure though, what do you think?” Just to kind of say, “I can see how this would go either way.” “What do you think?”

Clarifying how strongly you feel about a piece of feedback is also something that can help, as is making sure that you’re addressing people’s feedback. That’s basically how to avoid being overly critical: just asking nice, friendly questions that open up a conversation. The pedantic part, I guess, would be when you start arguing in circles. When you’re not coming to a good conclusion, right? What I’d say there is just make sure you’re focusing on high value things. Don’t go edge case hunting everywhere.

There’s a lot of times you can find edge cases that just aren’t going to come up. Or, maybe they are and they’re important. But really, what you want to be focusing on is the higher value stuff that we talked about earlier. Try and focus on those, to stay away from these pedantic arguments that you can get into in code review.

“The big secret is that most of these code reviews only take me five minutes”

Handling Conflict in Code Reviews

Derrick:

Inevitably, with discussions between developers in code, disagreements are going to occur, from time to time. How should you handle this type of conflict?

Derek:

I think the first thing to recognize is that having conflict in your code reviews like this is actually really beneficial. Right? As long as they’re the right types of conflict and nobody feels bad about them afterwards, that kind of thing … Because if you’re agreeing with your teammates all the time, and nothing interesting is happening in your code reviews, you basically have a monoculture, and those are dangerous in their own right.

Like I alluded to before, you want everybody bringing their own experiences, and their own expertise, and sometimes those are going to clash with each other. My experience was, doing things this way led to these problems, and your experience was, doing things this way was actually really beneficial. We’re going to have conflict about that. The key is that handling those conflicts properly is how everybody on the team is going to learn.

What I suggest, when you find yourself going back and forth, a couple comments about something, and you don’t seem to be getting anywhere … Nobody’s yielding, right? What I try to do is take a step back and be like, “How strongly do I feel about this?” Do I feel like, “This is a quality issue, that if we go out with this code, like it is today, it’s going to be a serious problem immediately?” Or is this kind of like, “It’s not really the approach I would have taken, but, maybe it will work. Or it’s a reasonable solution.” That kind of thing right?

Often times, that’s actually what it is. Once we get beyond a certain level of quality, what we’re talking about is trade-offs, all the time. That’s basically all we’re talking about when we’re doing computer programming. Like I said, once we get past that initial quality bar, it’s the trade-offs involved. If you’re arguing about what is essentially a trade-off, maybe it’s time to be like, “Okay. You know what? I would prefer to do it this way, but I can see why you’re doing it that way, and why don’t we just revisit this, if we need to, when we have more information?”, right?

Sometimes, you’ll argue about code, for a while, and then it goes into production. Then you don’t touch it for six months. Which means, it was fine. Even if you didn’t think it was the greatest looking code in the world, like it was written, and it doesn’t have bugs that are obvious. It doesn’t need to be continually improved so, whatever, it’s fine.

If you can table the arguments you’re having until you have more information, and then kind of revisit them. Be like, “Oh, okay. I see now we have this bug, or we have this additional feature we need. I can see how that thing you said earlier, would be a better approach here.” That kind of thing. Waiting until you have more information to, really dig in again, is probably a good thing to do.

Recommended Resources

Derrick:

Can you recommend any resources for those wanting to improve the effectiveness of their code reviews?

Derek:

We have guides. If you go to github.com/thoughtbot/guides, there’s a bunch of guides in there. Some of them are about style, and protocol, and things like that. But there’s also a code review guide in there. Whether or not you agree with everything in there, I think it’s a good example of something that I’d like to see more teams do: write down, and agree to, what you think code reviews should be and how they should operate, and then open pull requests against that when you want to change it.

Every piece of feedback, should it be addressed? Yes, I think so. At least to explain, “Okay, I see what you said there, but I’m going to go in this direction instead.” That kind of thing. I’d like to see more people take a look at that, and then adapt it for their team. Other things that can really help with code reviews are style guides, which we also have in that repo, but there are also many community style guides. Just avoiding those arguments about, “Oh, I’d really like to see you call …” Like in Ruby, we have map and collect as two names for the same method, right?

Let’s be consistent. Let’s pick map, or let’s pick collect. Just write that down somewhere, making that decision once, rather than arguing about it on every pull request. That’s basically it. I’d like to see more people get on board with documenting their experiences and their expectations out of these things.
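Derek’s example is easy to demonstrate: in Ruby, `map` and `collect` are aliases on `Enumerable`, so the choice between them is purely stylistic — exactly the kind of decision worth recording once in a style guide rather than re-litigating per pull request.

```ruby
numbers = [1, 2, 3]

# map and collect are aliases: identical behavior, two names.
doubled_with_map     = numbers.map     { |n| n * 2 }
doubled_with_collect = numbers.collect { |n| n * 2 }

raise unless doubled_with_map == doubled_with_collect  # both are [2, 4, 6]
```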

Derrick:

Derek, thanks so much for taking time and joining us today.

Derek:

Yeah, thanks for having me. It was good fun.

——–

It’s extremely important to learn how to build a code review culture. Code reviews are great for code quality as they reduce errors, help deliver better code and encourage teams to be more compassionate towards one another.

Refactoring to a happier development team

 

In this interview with Coraline Ada Ehmke, Lead Software Engineer at Instructure, we discuss data-driven refactoring and developer happiness teams. Coraline gives some great advice on the kinds of tests we should write for refactoring, tools to use and metrics to monitor, to make sure our refactoring is effective. We also learn about the role of refactoring in the Developer Happiness team at Instructure. You can read more from Coraline on her site.

Content and Timings

  • Introduction (0:00)
  • Useful Tests for Refactoring (0:57)
  • Data-driven Refactoring Metrics (4:13)
  • Winning Management Over to Refactoring (6:34)
  • The Developer Happiness Team (8:36)
  • Recommended Resources (13:58)

Transcript

Introduction

Derrick:
Coraline Ada Ehmke is Lead Engineer at Instructure. She is the creator of the Contributor Covenant, a code of conduct for open source projects, and Founder of OS4W, an organization encouraging greater diversity in open source projects. She speaks regularly at conferences about software development, including the talk ‘Data-Driven Refactoring’. Coraline, thank you so much for joining us today. Do you have any more to share about yourself?

Coraline:
I want to say I’ve been doing web development for about 20 years now, which is forever. I actually wrote my first website before there was a graphical browser to view it in. I started out in Perl, went into ASP for a little while, did some Java, and then discovered Ruby in 2007, and I haven’t looked back. All that experience has led me to be very opinionated about what makes good software. I have some very strong opinions that I’m always flexible about changing, but I have a sense of what good software looks like, and that sort of drives me in my daily work.

“A lot of problems come with code that’s just good enough”

Useful Tests for Refactoring

Derrick:
I want to dig into your experience with refactoring a bit. You say that without tests, you shouldn’t refactor. Why are tests so important to refactoring?

Coraline:
I think it was Michael Feathers who once said that ‘if you’re refactoring without testing, you’re just changing shit’. Basically, the importance of testing is you want to challenge and document what your assumptions are about the way that a piece of code works currently, which can be really difficult, because you might look at a method or a class name and assume its role based on its name, but that name could have drifted over time. You need to ensure that you’re documenting what you think it does and then testing to see that it does what you think it does. Testing can also help you stay within the guardrails through refactoring efforts. It’s hard to know where to stop sometimes, so tests can help you identify when you’re doing too much. You want to make sure you’re doing the minimum rework with the maximum results.

Derrick:
Yeah, that sounds great. It’s so easy with refactoring to just continue down that rabbit hole. What kind of tests are useful when refactoring?

Coraline:
It’s good to have good test coverage to start with, if it’s possible. If you don’t have good test coverage at the start, it’s good to build up some unit tests and integration tests just to make sure you’re staying on the path. There are a couple kinds of tests that I’m particularly interested in when it comes to refactoring. The first is boundary testing, which is basically using test cases to test the extremes of input. The maximum values, the minimum values, things that are just inside or outside of boundaries, typical values, and error values as well.

I like to do generative testing for this. If I have a method that takes an argument, I will create an enumerable with lots of different kinds of arguments in it, and see what passes and what fails. Those failures are really important because when you’re refactoring, you can’t know for sure that somewhere in the code base something is waiting for that method to fail or raise an exception. Your refactoring should not change the signature of that method such that a given piece of input stops causing a failure. You want to be really careful with that. Generative testing can help a lot with that, so there’s that with boundary testing.
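Coraline’s enumerable-of-arguments approach might look like this minimal sketch (the `parse_age` method and its contract are hypothetical, invented for illustration): drive one method with typical, boundary, and error inputs, and record which ones raise, because callers may depend on exactly those failures.

```ruby
# Hypothetical method under refactoring: its failure modes are part of
# its de facto contract.
def parse_age(input)
  age = Integer(input)                 # raises ArgumentError / TypeError
  raise ArgumentError, "out of range" unless (0..150).cover?(age)
  age
end

# Generative/boundary-style probe: typical values, boundary values, and
# error values, all driven from one enumerable of inputs.
inputs = ["0", "1", "42", "150", "151", "-1", "abc", nil]

results = inputs.map do |input|
  begin
    [input, parse_age(input)]
  rescue ArgumentError, TypeError => e
    [input, e.class]                   # record *which* exception is raised
  end
end

# A refactoring must not silently change this map of successes and
# failures -- that would change the method's observable signature.
```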

Also, I like attribute testing, which is an outside-in way of testing. You check that after your code runs the object that’s being tested possesses or does not possess certain qualities. This can be really useful for understanding when you changed how an object is serialized or the attributes of an object without getting into the implementation details.
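An attribute test in this outside-in style might look like the following sketch (the `Invoice` class and its serialization are hypothetical): assert on the qualities the serialized object does and does not possess, without touching implementation details.

```ruby
require "json"

# Hypothetical class whose serialized shape we want to pin down.
class Invoice
  def initialize(id:, total:)
    @id = id
    @total = total
  end

  def to_json(*)
    { id: @id, total: @total }.to_json
  end
end

serialized = JSON.parse(Invoice.new(id: 7, total: 99.5).to_json)

# Attribute testing: check which attributes the result exposes, not how
# to_json builds them -- the implementation is free to change underneath.
raise unless serialized.key?("id")
raise unless serialized.key?("total")
raise if serialized.key?("internal_state")
```

Because the assertions never mention how `to_json` is implemented, the test survives a refactoring of the serialization code while still catching an accidental change to the object’s external shape.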

The most important tests, though, for refactoring are tests that you throw away. We’re hesitant. We love deleting code, but we hate deleting tests, because it gives us this feeling of insecurity. The sort of test that you do during refactoring doesn’t lend itself well to maintaining code over time. You’re going to be a lot more granular in your tests. You’re going to be testing things that are outside of the purview of normal integration or unit tests, and it’s important that you feel comfortable throwing those away when you’re done with a refactoring effort.

Derrick:
Yeah. The importance of being able to throw away stuff really gives you that perspective on what you’re trying to do.

Coraline:
And it lets you work incrementally, too. You can start by challenging your assumption about how a piece of code works. Once you feel confident, you can set up some boundary tests to see how it responds to strange conditions. Once you feel confident with that, then you’re probably confident enough to do some refactoring and make sure that you’re not introducing regressions.

Data-driven Refactoring Metrics

Derrick:
An important aspect of your approach to refactoring is data and measuring the impact of refactoring efforts over time. What are some useful metrics to track?

Coraline:
I want to emphasize that “over time” is really, really important, because if you’re not collecting data on your starting point and at checkpoints along the way, you don’t know whether you’re actually making your code base better or worse. It is possible, through refactoring, to make your code base worse. One of the things I like to look at at the start is test execution time. If your test suite takes 45 minutes to run in CI, you’re not doing TDD. It’s impossible, because the feedback loop for your tests has to be as short as possible. Refactoring tests is actually a great way to get started on bringing that test execution time down.

Another one that I’ll probably talk about later is the feature-to-bug ratio. If you look at your sprint planning, figure out how much time you’re spending on new features versus how much time you’re spending on bugs. You can actually use some tools to track down what areas of your code base are generating the most bugs, and combine that with churn to see what’s changing the most. Those are interesting.
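The churn half of that combination can be sketched very simply: count how often each file is touched across commits, then cross-reference with files known to generate bugs. The commit lists and file names below are invented stand-ins for what `git log --name-only` would give you.

```ruby
# Churn sketch: count file touches per commit, then intersect the
# high-churn files with known bug sources. Data below is invented.

commits = [
  %w[app/models/order.rb app/models/user.rb],
  %w[app/models/order.rb],
  %w[app/models/order.rb lib/billing.rb],
  %w[app/models/user.rb]
]

churn = Hash.new(0)
commits.each { |files| files.each { |f| churn[f] += 1 } }

buggy = ["app/models/order.rb"]
hotspots = churn.select { |file, count| count > 1 && buggy.include?(file) }
```

Files that are both high-churn and bug-prone are the refactoring targets with the highest payoff.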

I like to look at code quality metrics. A lot of people use Code Climate for tracking code quality, which I don’t really like that much, because it assigns letter grades to code. If a class has an ‘F’, you might say, “Oh, it needs to be rewritten,” or if a class has an ‘A’, you might think, “Oh, it’s perfect. No changes are necessary.” Code Climate doesn’t really give you the change over time. It’ll notify you if a given class changes its grade, which is great, but you can’t track that over time. I like some other code quality tools. For complexity, there are tools that compute assignment-branch-condition (ABC) metrics to see how complex a piece of code is. I also like to look at coupling, because coupling is often a symptom of a need for refactoring, especially in a legacy application.

Derrick:
What mistakes do you see people making with refactoring?

Coraline:
Basically trying to do too much at once. It’s really easy to look at a piece of code and immediately make a value judgment, saying, “This is bad code,” which is really unfair to the programmers who came before, and saying, “I’m just going to rewrite the whole thing.” We have this instinct to burn it all down. Trying to change too much at one time is really a recipe for disaster. You want to keep the surface area of your changes as small as possible and work as incrementally as possible. Those things take discipline, and I think it’s hard for a lot of people to achieve that without actively policing themselves.

“The most important tests, though, for refactoring are tests that you throw away”

Winning Management Over to Refactoring

Derrick:
Often just getting the time to actually spend on refactoring can be a challenge. What are some benefits of refactoring that can win management over?

Coraline:
That’s a big, important question. I think in a healthy development organization, the development team, the engineers themselves, are stakeholders. They have some input into your backlog. If there’s technical debt that needs to be addressed, that can be prioritized. Not all of us live in a world where we have that kind of influence over the backlog. Some of the ways you can convince management that refactoring is a good idea… I talked before about the bug-to-feature ratio. Stakeholders want new features. New features keep products healthy and appealing to end users and to investors. Anything that stands in the way of a new feature getting out is necessarily a business problem.

If you look at your feature-to-bug-fix ratio and see that it’s poor, one way you can sell refactoring is to say, “Through this refactoring, we’re going to improve this ratio and be able to do more feature work.” You can also look at how long it takes to implement a new feature, and talk about how refactoring a piece of code, especially code that changes very often, can make it faster and easier to add new features to that area of the code base.
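The ratio itself is simple arithmetic over sprint totals; the story-point numbers below are invented for illustration.

```ruby
# Feature-to-bug-fix ratio from one sprint's story points (invented numbers).
feature_points = 21.0
bug_fix_points = 14.0

ratio = (feature_points / bug_fix_points).round(2)
# A ratio trending down across sprints is the signal that refactoring
# time could win some of that feature capacity back.
```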

In the end, if none of that works, just cheat: pad your estimates so that you automatically build in some time for reducing technical debt. Unfortunately I’ve had to use that more often than I would have liked, but it is a valuable tactic to keep in your toolbox.

Derrick:
Yeah, definitely. I still really like… You mentioned this earlier, but the feature-to-bug-fix ratio is really important.

Coraline:
It really is a great health check for how the team overall is doing and what the code looks like. If people are doing a lot of bug fixes, you’re probably going to have some unhappy engineers, because they want to work on new and shiny things. That has an impact on how happy your developers are, which is really critical to how much work they get done and how good your product is.

The Developer Happiness Team

Derrick:
Definitely. Let’s talk about happiness. You previously led what was known as the Developer Happiness or Refactoring team at Instructure. Tell us a little bit about the role and mission of that team.

Coraline:
It was one of my favorite jobs, honestly. We were charged with increasing the happiness and productivity of the engineering team. Within our charter was identifying processes or parts of the code base or even people who were standing in the way of developers being as productive as they could possibly be. Being very data driven in my approach, the first thing I wanted to do was get a sense of just how gnarly the code base was.

We ended up writing a bunch of code analysis tools. Again, we were using Code Climate, but I wasn’t really satisfied with its ability to track change over time. So we wrote a few tools. I wrote one called Fukuzatsu, which is an ABC complexity measurement tool. There was already a very similar tool out there called Flog, written by Ryan Davis, but it’s an opinionated tool: it punishes you for things like metaprogramming, and it favors frameworks like ActiveRecord while punishing you for using alternative ORMs, just because of the bias that was built in. I don’t want opinions with my data. I’m an engineer. I’m a professional. I’m capable of forming my own opinions. I just wanted raw data to work with, and that’s what Fukuzatsu gave me.
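For reference, the unweighted ABC score underlying tools like this is computed from a method’s counts of assignments, branches (method calls), and conditions; the counts below are illustrative, since real tools derive them from the parsed AST.

```ruby
# The ABC metric: score = sqrt(assignments² + branches² + conditions²).
# Counts here are illustrative; real tools extract them from the AST.

def abc_score(assignments, branches, conditions)
  Math.sqrt(assignments**2 + branches**2 + conditions**2).round(1)
end

abc_score(3, 4, 2) # a method with 3 assignments, 4 calls, 2 conditionals
```

Reporting this raw number, rather than a weighted or letter-graded variant, is the “data without opinions” distinction being drawn above.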

Another tool that we ended up building as part of this process at the very beginning was something called Society, which basically makes a social graph of the relationships between your classes. You get a nice circular diagram with afferent and efferent coupling displayed in different colors, and you can see the links between different classes. That can help you identify service boundaries, for example, because if you have one class that’s a hub for a lot of inbound or outbound connections, you might say, “That’s a good place for a service.”
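The two coupling directions are easy to sketch from a class-dependency map: efferent coupling counts a class’s outbound edges, afferent coupling its inbound ones. The class names and edges here are invented for illustration.

```ruby
# Afferent (inbound) vs efferent (outbound) coupling from an invented
# class-dependency map.

depends_on = {
  "OrdersController" => ["Order", "Payment"],
  "Invoice"          => ["Order"],
  "Order"            => ["Payment"],
  "Payment"          => []
}

efferent = depends_on.transform_values(&:size)

afferent = Hash.new(0)
depends_on.each_value { |deps| deps.each { |d| afferent[d] += 1 } }

# A class with many inbound edges and few outbound ones, like Payment
# here, is the kind of hub that suggests a service boundary.
```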

A lot of the work we did early on was building tools. We built a mega-tool that collected data from all these other tools and presented it in a dashboard format, with change shown over time. You could drill into complexity, coupling, code quality, code smells, or various other metrics, and see how they were changing at the commit level. With that done, we identified which areas were causing a lot of bugs and which areas of the code were really complex or changing a lot.

The next step was creating a team of refactoring ambassadors. Rather than taking on the refactoring work ourselves and handing it over to the team that owned the code, our goal was to send in someone who’s really good at refactoring to work with that team to refactor the problematic code, which I think was pretty valuable in terms of ownership and continued success, and also training and teaching and helping those teams level up.

I think a lot of problems come from code that’s just good enough. I think we’ve all seen pull-request bombs, where someone had a bug to fix or a feature to implement and went down a path that maybe was not optimal. By the time you actually saw it in code review, it was too late to suggest an alternative approach, because so much work had been put into it; you knew you would just crush the spirit of whoever sent the PR. We end up with this lowest-common-denominator code. People also tend to copy and paste the approach they’ve seen taken elsewhere in a class or elsewhere in the code base. Demonstrating that there’s a better way, a better pattern to follow, is really important to maintaining code quality. That was the overall mission of the team.

Derrick:
Is that kind of focused team something you’d recommend others try? What kind of impact did it have on factors like code quality and developer happiness?

Coraline:
I think people appreciated that we were paying attention to engineering’s needs. We’d send out surveys to gauge how satisfied people were with the code that they were writing, with CI, with the test suite, with the overall process. We tracked it over time. I think engineers want to be listened to outside of the scope of just, “What are you building right now?”, or “What are you fixing right now?” They feel empowered when their needs are being addressed and when people are asking them questions about how they’re feeling and the work that they’re doing.

In terms of it being a full-time team, I think that really depends on the size of your engineering organization. Instructure has about 100 engineers, so it made sense to have a dedicated team doing that for a while. I think you could have an individual who’s charged with doing that as their primary job, and that would probably help, or have a team that’s a virtual team where you’re rotating people through on a regular basis.

There are different ways to approach it. I think the important thing is just gauging the health of your development organization and gauging the happiness of your engineers. That is going to have a significant impact on the quality of your code in the end.

Derrick:
You talked about all those tools for code quality, and they sound really great. How did you measure developer happiness? Was it talking to people, or…

Coraline:
Talking to people. We had a survey that we created that asked a bunch of questions about, “What do you think are the impediments to getting your job done? How good do you feel about the feature work that you’ve been doing? Are we spending too much time fixing bugs?” We published the anonymized results of those surveys to our internal wiki so people could go back and reference them, and we could use that to track how well we were doing as a team, as well. Again, just listening to people makes them feel better about things and gives them some hope that maybe change is coming.

“I think engineers want to be listened to outside of the scope of just, ‘What are you building right now?’”

Recommended Resources

Derrick:
Can you recommend any other resources for those wanting to learn more about effective refactoring?

Coraline:
I would suggest really getting to know what your testing tools are as a first step, and looking at some alternative ways of testing. Generative testing, for example, gets a bad rap, but I think it’s really a helpful technique to use when you’re doing refactoring. Looking at what people in other languages are doing in terms of their philosophical approach to testing is really important, and making sure that you’re really up to speed on what those different tools are so that you can be more effective in your work.

In terms of tooling, I would recommend a tool for Ruby called Reek, which is a code smell identifier that can help you isolate what areas you want to focus on. You can create an encyclopedia of code smells for your code base and refer back to that, and organize your work such that, “Oh, we’re going to address all the code smells that are of this kind,” and do that across the board. Really look at the tools that are available for your language in terms of code quality.
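The “encyclopedia of code smells” idea can be sketched as grouping detected smells by kind, so a whole category can be addressed across the board in one pass. The entries below are invented; a tool like Reek would supply the real ones.

```ruby
# Sketch: group detected code smells by kind so one category can be
# addressed across the code base in a single pass. Entries are invented.

smells = [
  { file: "app/models/order.rb", kind: "FeatureEnvy" },
  { file: "app/models/user.rb",  kind: "TooManyStatements" },
  { file: "lib/billing.rb",      kind: "FeatureEnvy" }
]

by_kind = smells.group_by { |s| s[:kind] }
```

Working through `by_kind` one key at a time keeps each refactoring pass narrow and reviewable.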

There are a few books that I recommend. The first is ‘Working Effectively with Legacy Code’ by Michael Feathers. The examples are written in Java, but the lessons that it teaches are applicable to any language. That’s a really good place to start. I work in Ruby, so I would also recommend ‘Refactoring: Ruby Edition’ by Jay Fields and Shane Harvie, and another book called ‘Rails Anti-Patterns’ by Chad Pytel and Tammer Saleh.

Derrick:
Those are fantastic resources. Coraline, thank you so much for joining us today.

Coraline:
Great, it was a lot of fun. Thank you for inviting me.


Originally published on www.fogcreek.com in October 2015