
Code Review Best Practices

As developers, we all know that code reviews are a good thing in theory. They should help us:

  • Find bugs and security issues early
  • Improve the readability of our code
  • Provide a safety net to ensure all tasks are fully completed

The reality is that code reviews can frequently be an uncomfortable experience for everyone involved, leading to reviews that are combative, ineffective, or even worse, simply not happening.

Here is a quick guide to help you to create an effective code review process.

WHY do we do code reviews?

The first question to answer when reviewing your code review process is: what’s the purpose of our code reviews? When you ask this question, you soon realise that there are many reasons to perform code reviews. You might even find that everyone in the team has a different idea of why they’re reviewing code. They might think they’re reviewing:

  1. To find bugs
  2. To check for potential performance or security problems
  3. To ensure readable code
  4. To verify the functionality does what was required
  5. To make sure the design is sound
  6. To share knowledge of the features that have been implemented and the updated design
  7. To check the code meets standards
  8. …or one of hundreds of other reasons

If everyone in the team has a different “why”, they’re looking for different things in the code. This can lead to a number of anti-patterns:

  • Code reviews take a long time as every reviewer will find a different set of problems that need to be addressed
  • Reviewers become demotivated as every review throws up different types of problems depending upon who reviews it
  • Reviews can ping-pong between the reviewer and the code author as each new iteration exposes a different set of problems

Having a single purpose for your code reviews ensures that everyone taking part in the review, whether they’re a code author or a reviewer, knows the reason for the review and can focus their efforts on making sure their suggestions fit that reason.

WHAT are we looking for?

Only when we understand why we’re doing the review can we figure out what we want to look for during it. As we’ve already started to see, there are a huge number of different things we could be looking for, so we need to narrow down the specific things we really care about.

For example, if we’ve decided the main purpose of our reviews is to ensure the code is readable and understandable, we’ll spend less time worrying about a design that has already been implemented, and more time focusing on whether we understand the methods and whether the functionality is in a place that makes sense. The nice side effect of this particular choice is that more readable code makes it easier to spot bugs or incorrect logic. Simpler code often performs better too.

We should always automate as much as possible, so human code reviewers should never have to worry about the following (a rough automation sketch follows this list):

  • Formatting & style checks
  • Test coverage
  • Whether performance meets specific requirements
  • Common security problems
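
As a rough illustration of what “automate as much as possible” can look like, here is a minimal pre-merge check script. It assumes a Python project using black, flake8, pytest-cov and bandit; the tools, thresholds and paths are placeholders for whatever your own stack uses.

```python
"""Minimal sketch: run the automatable checks so humans never review for them.
Tool choices (black, flake8, pytest-cov, bandit) are illustrative assumptions."""
import subprocess
import sys

CHECKS = [
    ["black", "--check", "."],                     # formatting & style
    ["flake8", "."],                               # lint / style rules
    ["pytest", "--cov=.", "--cov-fail-under=80"],  # tests + coverage threshold
    ["bandit", "-r", "."],                         # common security problems
]

def main() -> int:
    for cmd in CHECKS:
        print("running:", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            print("check failed:", cmd[0])
            return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Performance checks against specific requirements would slot into the same list once you have a benchmark tool that can fail the build.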

In fact, what a human reviewer should be focusing on might be fairly simple after all – is the code “usable”? Is it:

  • Readable
  • Maintainable
  • Extensible

These are checks that cannot be automated. And these are the features of code that matter most to developers in the long run.

Developers are not the only people who matter, though; ultimately the code has a job to do. Our business cares about: does the code do what it’s supposed to do? And is there an automated test or set of tests to prove it?

Finally, does it meet so-called non-functional requirements? It is important to consider things like regulatory requirements (e.g. auditing) or user needs (like documentation) as part of these checks.

WHO is involved in code reviews?

With a clear purpose and a set of things to look for, it’s much simpler to decide who should be involved in the review. We need to decide:

Who reviews the code? It’s tempting to assume that it should be one or more senior or experienced developers. But if the focus is something like making sure the code is easy to understand, juniors might be the right people to review – if an inexperienced developer can understand what’s happening in the code, it’s probably easy for everyone to understand. If the focus of the review is sharing knowledge, then you probably want everyone to review the code. For reviews with other purposes, you may have a pool of reviewers from which a couple are picked at random for each review.

Who signs off the review? If we have more than one reviewer, it’s important to understand who ultimately is responsible for saying the review is complete. This could be a single person, a specific set of people, a quorum of reviewers, specified experts for particular areas of the code, or the review could even be stopped by a single veto. In teams with high levels of trust, the code author might be the one to decide when enough feedback is enough and the code has been updated to adequately reflect concerns that were raised.

Who resolves differences of opinion? Reviews may have more than one reviewer.  If different reviewers have conflicting advice, how does the author resolve this? Is it down to the author to decide? Or is there a lead or expert who can arbitrate and decide the best course? It’s important to understand how conflicts are resolved during a code review.

WHEN?

When has two important components:

When do we review? Traditional code reviews happen when all the code is complete and ready to go to production. Often code will not be merged to trunk/master until a review is complete, as in the pull request model. This is not the only approach. If a code review is for knowledge sharing, the review could happen after the code has been merged (or the code could be committed straight to master). If the code review is an incremental review that is supposed to help evolve the design of the code, reviews will be happening during implementation. Once we know why we do reviews, what we’re looking for, and who takes part, we can more easily decide when the best time is to perform the review.

When is the review complete? Not understanding when a review is complete is a major factor that can lead to reviews dragging on indefinitely. There’s nothing more demotivating than a review that never ends: the developer feels like they’ve been working on the same thing forever and it’s still not in production. Guidelines for deciding when the review is complete will depend upon who is taking part in the review and when the review is taking place:

  • With knowledge sharing reviews, it could be signed off once everyone has had a chance to look at the code
  • With gateway reviews, usually a single nominated senior (the gatekeeper) says it’s complete when all their points are addressed
  • Other types of reviews may have a set of criteria that need to be passed before the review is complete.  For example:
    • All comments have been addressed by fixes in the code
    • All comments have either led to code changes, or to tickets in the issue tracker (e.g. creating tickets for new features or design changes; adding additional information to upcoming feature tickets; or creating tech-debt tickets)
    • Comments flagged as showstoppers have all been addressed in some way; comments that were observations or lessons to learn from in the future do not need to have been “fixed”

WHERE do we review?

Code reviews don’t have to happen inside a code review tool. Pair programming is a form of code review. A review could be as simple as pulling aside a colleague and walking through your code with them. Reviews might be done by checking out a branch and making comments in a document, an email, or a chat channel. Or code reviews might happen via a GitHub pull request or a piece of code review software.

Summary

There are lots of things to consider when doing a code review, and if we worried about all of them for every code review, it would be nearly impossible for any code to pass the review process. The best way to implement a code review process that works for us is to consider:

  • Why are we doing reviews? Reviewers have an easier job with a clearly defined purpose, and code authors will have fewer nasty surprises from the review process
  • What are we looking for? When we have a purpose, we can create a more focused set of things to check when reviewing code
  • Who is involved? Who does the reviews, who is responsible for resolving conflicts of opinion, and who ultimately decides if the code is good to go?
  • When do we review, and when is the review complete? Reviews could happen iteratively while working on the code, or at the end of the process. A review could go on forever if we don’t have clear guidance on when the code is finally good to go.
  • Where do we review? Code reviews don’t need a specific tool; a review could be as simple as walking a colleague through our code at our desk.

Once these questions are answered we should be able to create a code review process which works well for the team. Remember, the goal of a review should be to get the code into production, not to prove how clever we are.

Refactoring to a happier development team


In this interview with Coraline Ada Ehmke, Lead Software Engineer at Instructure, we discuss data-driven refactoring and developer happiness teams. Coraline gives some great advice on the kinds of tests we should write for refactoring, tools to use and metrics to monitor, to make sure our refactoring is effective. We also learn about the role of refactoring in the Developer Happiness team at Instructure. You can read more from Coraline on her site.

Content and Timings

  • Introduction (0:00)
  • Useful Tests for Refactoring (0:57)
  • Data-driven Refactoring Metrics (4:13)
  • Winning Management Over to Refactoring (6:34)
  • The Developer Happiness Team (8:36)
  • Recommended Resources (13:58)

Transcript

Introduction

Derrick:
Coraline Ada Ehmke is Lead Engineer at Instructure. She is the creator of the Contributor Covenant, a code of conduct for open source projects, and Founder of OS4W, an organization encouraging greater diversity in open source projects. She speaks regularly at conferences about software development, including the talk ‘Data-Driven Refactoring’. Coraline, thank you so much for joining us today. Do you have any more to share about yourself?

Coraline:
I want to say I’ve been doing web development for about 20 years now, which is forever. I actually wrote my first website before there was a graphical browser to view it in. I started out in Perl, went into ASP for a little while, did some Java, and then discovered Ruby in 2007, and I haven’t looked back. All that experience has led me to be very opinionated about what makes good software. I have some very strong opinions that I’m always flexible about changing, but I have a sense of what good software looks like, and that sort of drives me in my daily work.

“A lot of problems come with code that’s just good enough”

Useful Tests for Refactoring

Derrick:
I want to dig into your experience with refactoring a bit. You say that without tests, you shouldn’t refactor. Why are tests so important to refactoring?

Coraline:
I think it was Michael Feathers who once said that ‘if you’re refactoring without testing, you’re just changing shit’. Basically, the importance of testing is you want to challenge and document what your assumptions are about the way that a piece of code works currently, which can be really difficult, because you might look at a method or a class name and assume its role based on its name, but that name could have drifted over time. You need to ensure that you’re documenting what you think it does and then testing to see that it does what you think it does. Testing can also help you stay within the guardrails through refactoring efforts. It’s hard to know where to stop sometimes, so tests can help you identify when you’re doing too much. You want to make sure you’re doing the minimum rework with the maximum results.

Derrick:
Yeah, that sounds great. It’s so easy with refactoring to just continue down that rabbit hole. What kind of tests are useful when refactoring?

Coraline:
It’s good to have good test coverage to start with, if it’s possible. If you don’t have good test coverage at the start, it’s good to build up some unit tests and integration tests just to make sure you’re staying on the path. There are a couple kinds of tests that I’m particularly interested in when it comes to refactoring. The first is boundary testing, which is basically using test cases to test the extremes of input. The maximum values, the minimum values, things that are just inside or outside of boundaries, typical values, and error values as well.

I like to do generative testing for this. If I have a method that takes an argument, I will create an enumerable with lots of different kinds of arguments in it, and see what passes and what fails. Those failures are really important because when you’re refactoring, you can’t know for sure that somewhere in the code base something is waiting for that method to fail or raise an exception. Your refactoring should not change the signature of that method such that a given piece of input stops causing a failure. You want to be really careful with that. Generative testing can help a lot with that, so there’s that with boundary testing.
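
As a rough, editorial illustration of the boundary and generative testing Coraline describes (not her code), here is a minimal sketch assuming Python, pytest and the hypothesis library; `parse_quantity` and its limits are hypothetical.

```python
"""Sketch of boundary + generative tests used to pin down existing behaviour
before refactoring. `parse_quantity` and its 0..1_000_000 range are made up."""
import pytest
from hypothesis import given, strategies as st

from mylib import parse_quantity  # hypothetical code under refactoring


# Boundary tests: extremes, values just inside/outside the limits, error values.
@pytest.mark.parametrize("value", [0, 1, 999_999, 1_000_000])
def test_values_inside_the_boundaries_are_accepted(value):
    assert parse_quantity(value) == value


@pytest.mark.parametrize("value", [-1, 1_000_001, None, "ten"])
def test_values_outside_the_boundaries_still_fail(value):
    # Callers may rely on this failure mode, so the refactor must not change it.
    with pytest.raises(ValueError):
        parse_quantity(value)


# Generative test: throw many generated inputs at the method to check that
# the refactor doesn't quietly change which inputs pass and which fail.
@given(st.one_of(st.integers(), st.text(), st.none()))
def test_contract_is_stable_for_arbitrary_input(value):
    try:
        result = parse_quantity(value)
    except ValueError:
        return  # failing loudly is part of the existing contract
    assert isinstance(result, int)
```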

Also, I like attribute testing, which is an outside-in way of testing. You check that after your code runs the object that’s being tested possesses or does not possess certain qualities. This can be really useful for understanding when you changed how an object is serialized or the attributes of an object without getting into the implementation details.
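
And a tiny sketch of that outside-in “attribute testing” idea: assert on the qualities the resulting object should or should not have, without touching implementation details. `User` and `to_dict` are hypothetical names.

```python
from myapp.models import User  # hypothetical model being refactored

def test_serialised_user_exposes_only_the_expected_attributes():
    user = User(name="Ada", email="ada@example.com")
    data = user.to_dict()
    # The serialised object should possess these attributes...
    assert {"name", "email", "created_at"} <= set(data)
    # ...and should not possess ones it must never leak.
    assert "password_hash" not in data
```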

The most important tests, though, for refactoring are tests that you throw away. We’re hesitant. We love deleting code, but we hate deleting tests, because it gives us this feeling of insecurity. The sort of test that you do during refactoring doesn’t lend itself well to maintaining code over time. You’re going to be a lot more granular in your tests. You’re going to be testing things that are outside of the purview of normal integration or unit tests, and it’s important that you feel comfortable throwing those away when you’re done with a refactoring effort.

Derrick:
Yeah. The importance of being able to throw away stuff really gives you that perspective on what you’re trying to do.

Coraline:
And it lets you work incrementally, too. You can start by challenging your assumption about how a piece of code works. Once you feel confident, you can set up some boundary tests to see how it responds to strange conditions. Once you feel confident with that, then you’re probably confident enough to do some refactoring and make sure that you’re not introducing regressions.

Data-driven Refactoring Metrics

Derrick:
An important aspect of your approach to refactoring is data and measuring the impact of refactoring efforts over time. What are some useful metrics to track?

Coraline:
I want to emphasize that over time is really, really important, because if you’re not collecting data on your starting point and checkpoints along the way, you don’t know if you’re actually making your code base better or worse. It is possible through refactoring to make your code base worse. Some of the things I like to look at at the start are test execution times. If your test suite takes 45 minutes to run in CI, you’re not doing TDD. It’s impossible, because the feedback loop for your test has to be as short as possible. Refactoring tests is actually a great way to get started in trying to adjust that test execution time.

Another one that I’ll probably talk about later is the feature-to-bug ratio. If you look at your sprint planning, figure out how much time you’re spending on new features versus how much time you’re spending on bugs. You can actually use some tools to track down what areas of your code base are generating the most bugs, and combine that with churn to see what’s changing the most. Those are interesting.
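
One way to get at the churn side of that is sketched below: a small script that counts how often each file has been touched in recent git history, ready to cross-reference with where bugs are being reported. The 90-day window and output size are arbitrary choices, not recommendations.

```python
"""Rough sketch: per-file churn from git history. Window and limits are illustrative."""
import subprocess
from collections import Counter

def churn(since: str = "90 days ago", top: int = 10) -> list[tuple[str, int]]:
    # List the files touched by each commit since the cut-off date.
    log = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    counts = Counter(line for line in log.splitlines() if line.strip())
    return counts.most_common(top)

if __name__ == "__main__":
    for path, changes in churn():
        print(f"{changes:4d}  {path}")
```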

I like to look at code quality metrics. A lot of people use Code Climate for tracking code quality, which I don’t really like that much, because it assigns letter grades to code. If a class has an ‘F’, you might say, “Oh, it needs to be rewritten,” or if a class has an ‘A’, you might think, “Oh, it’s perfect. No changes are necessary.” Code Climate doesn’t really give you the change over time. It’ll give you a notification if a given class changes its grade. That’s great, but you can’t track that over time. I like some other code quality tools. Looking at complexity, there are tools that compute assignment-branch-condition (ABC) metrics to see how complex a piece of code is. I also like to look at coupling, because coupling is often a symptom of a need for refactoring, especially in a legacy application.

Derrick:
What mistakes do you see people making with refactoring?

Coraline:
Basically trying to do too much at once. It’s really easy to look at a piece of code and immediately make a value judgment, saying, “This is bad code,” which is really unfair to the programmers who came before, and saying, “I’m just going to rewrite the whole thing.” We have this instinct to burn it all down. Trying to change too much at one time is really a recipe for disaster. You want to keep the surface area of your changes as small as possible and work as incrementally as possible. Those things take discipline, and I think it’s hard for a lot of people to achieve that without actively policing themselves.

“The most important tests, though, for refactoring are tests that you throw away”

Winning Management Over to Refactoring

Derrick:
Often just getting the time to actually spend on refactoring can be a challenge. What are some benefits of refactoring that can win management over?

Coraline:
That’s a big, important question. I think in a healthy development organization, the development team, the engineers themselves, are stakeholders. They have some input into your backlog. If there’s technical debt that needs to be addressed, that can be prioritized. Not all of us live in a world where we have that kind of influence over the backlog. Some of the ways you can convince management that refactoring is a good idea… I talked before about the bug-to-feature ratio. Stakeholders want new features. New features keep products healthy and appealing to end users and to investors. Anything that stands in the way of a new feature getting out is necessarily a business problem.

If you look at your feature-to-bug-fix ratio and see that it’s poor, one way you can sell it is to say, “through this refactoring, we’re going to improve this ratio and be able to do more feature work.” You can also look at how long it takes to implement a new feature and talk about how refactoring a piece of code, especially if it’s code that changes very often, can make it easier to add new features to a particular area of the code base faster and easier.

In the end, if none of that works, just cheat. Just lie and pad your estimates, so you automatically build in some time for reducing technical debt with all of your estimates, which unfortunately I’ve had to use more often than I would have liked to. It is a valuable tactic to keep in your toolbox.

Derrick:
Yeah, definitely. I still really like… You mentioned this earlier, but the feature-to-bug-fix ratio is really important.

Coraline:
It really is a great health check for how the team overall is doing, what the code looks like and the health of the team, as well. If people are doing a lot of bug fixes, you’re probably going to have some unhappy engineers, because they want to work on new and shiny things. That can have an impact on how happy your developers are, which is really critical to how much work they’re getting done and how good your product is.

The Developer Happiness Team

Derrick:
Definitely. Let’s talk about happiness. You previously led what was known as the Developer Happiness or Refactoring team at Instructure. Tell us a little bit about the role and mission of that team.

Coraline:
It was one of my favorite jobs, honestly. We were charged with increasing the happiness and productivity of the engineering team. Within our charter was identifying processes or parts of the code base or even people who were standing in the way of developers being as productive as they could possibly be. Being very data driven in my approach, the first thing I wanted to do was get a sense of just how gnarly the code base was.

We ended up writing a bunch of code analysis tools. Again, we were using Code Climate, but I wasn’t really satisfied with its tracking ability over time. So we wrote a few tools. I wrote one called Fukuzatsu, which is an ABC complexity measurement tool. There was already a tool out there that was very similar called Flog that was written by Ryan Davis, but it’s an opinionated tool. It punishes you for things like metaprogramming, and it also favors frameworks like ActiveRecord and punishes you for using alternative ORMs, simply because of the bias that was built in. I wanted something that… I don’t want opinions with my data. I’m an engineer. I’m a professional. I’m capable of forming my own opinions. I just wanted raw data to work with, and that’s what Fukuzatsu gave me.

Another tool that we ended up building as part of this process at the very beginning was something called Society, which basically makes a social graph of the relationships between your classes. You get this nice circular diagram with afferent and efferent coupling displayed in different colors, and you see the links between different classes. That can help you identify service boundaries, for example, because if you have one class that’s a trunk of a lot of inbound or outbound connections, you might say, “That’s a good place for a service.”

A lot of the work we did early on was building tools. We built a mega-tool that collected data from all these other tools and presented them in a dashboard format with change shown over time. You could drill into complexity. You could drill into coupling. You could drill into code quality or code smells, or various other metrics, and see how they were changing at the commit level. That being done, we identified which areas were causing a lot of bugs, which areas of the code are really complex or really changing a lot.

The next step was creating a team of refactoring ambassadors. Rather than taking on the refactoring work ourselves and handing it over to the team that owned the code, our goal was to send in someone who’s really good at refactoring to work with that team to refactor the problematic code, which I think was pretty valuable in terms of ownership and continued success, and also training and teaching and helping those teams level up.

I think a lot of problems come with code that’s just good enough. I think we’ve all seen pull request bombs where someone had a bug to fix or had a feature to implement, and they went down a path that maybe was not optimal. By the time you actually saw it to do a code review, it was too late to suggest an alternative approach, because so much work had been put into it. You knew that you would just crush the spirit of whoever sent the PR. We end up with this lowest-common-denominator code. People also tend to copy and paste the approach that they’ve seen taken elsewhere in a class or elsewhere in a code base. Demonstrating to them that there’s a better way and there’s a better pattern you can follow is really important to maintaining code quality. That was the overall mission of the team.

Derrick:
Is that kind of focused team something you’d recommend others try? What kind of impact did it have on factors like code quality and developer happiness?

Coraline:
I think people appreciated that we were paying attention to engineering’s needs. We’d send out surveys to gauge how satisfied people were with the code that they were writing, with CI, with the test suite, with the overall process. We tracked it over time. I think engineers want to be listened to outside of the scope of just, “What are you building right now?”, or “What are you fixing right now?” They feel empowered when their needs are being addressed and when people are asking them questions about how they’re feeling and the work that they’re doing.

In terms of it being a full-time team, I think that really depends on the size of your engineering organization. Instructure has about 100 engineers, so it made sense to have a dedicated team doing that for a while. I think you could have an individual who’s charged with doing that as their primary job, and that would probably help, or have a team that’s a virtual team where you’re rotating people through on a regular basis.

There are different ways to approach it. I think the important thing is just gauging the health of your development organization and gauging the happiness of your engineers. That is going to have a significant impact on the quality of your code in the end.

Derrick:
You talked about all those tools for code quality, and they sound really great. How did you measure developer happiness? Was it talking to people, or…

Coraline:
Talking to people. We had a survey that we created that asked a bunch of questions about, “What do you think are the impediments to getting your job done? How good do you feel about the feature work that you’ve been doing? Are we spending too much time fixing bugs?” We published the anonymized results of those surveys to our internal wiki so people could go back and reference them, and we could use that to track how well we were doing as a team, as well. Again, just listening to people makes them feel better about things and gives them some hope that maybe change is coming.

“I think engineers want to be listened to outside of the scope of just, ‘What are you building right now?’”

Recommended Resources

Derrick:
Can you recommend any other resources for those wanting to learn more about effective refactoring?

Coraline:
I would suggest really getting to know what your testing tools are as a first step, and looking at some alternative ways of testing. Generative testing, for example, gets a bad rap, but I think it’s really a helpful technique to use when you’re doing refactoring. Looking at what people in other languages are doing in terms of their philosophical approach to testing is really important, and making sure that you’re really up to speed on what those different tools are so that you can be more effective in your work.

In terms of tooling, I would recommend a tool for Ruby called Reek, which is a code smell identifier that can help you isolate what areas you want to focus on. You can create an encyclopedia of code smells for your code base and refer back to that, and organize your work such that, “Oh, we’re going to address all the code smells that are of this kind,” and do that across the board. Really look at the tools that are available for your language in terms of code quality.

There are a few books that I recommend. The first is ‘Working Effectively with Legacy Code’ by Michael Feathers. The examples are written in Java, but the lessons that it teaches are applicable to any language. That’s a really good place to start. I work in Ruby, so I would also recommend ‘Refactoring: Ruby Edition’ by Jay Fields and Shane Harvie, and another book called ‘Rails Anti-Patterns’ by Chad Pytel and Tammer Saleh.

Derrick:
Those are fantastic resources. Coraline, thank you so much for joining us today.

Coraline:
Great, it was a lot of fun. Thank you for inviting me.


Originally published on www.fogcreek.com in October 2015

9 Effective Code Review Tips

9 Code Review Tips

For everyone:

  • Review the right things, let tools do the rest

You don’t need to argue over code style and formatting issues. There are plenty of tools which can consistently highlight those matters. Ensuring that the code is correct, understandable and maintainable is what’s important. Sure, style and formatting form part of that but you should let the tool be the one to point out those things.

  • Everyone should code review

Some people are better at it than others. The more experienced may well spot more bugs, and that’s important. But what’s more crucial is maintaining a positive attitude to code review in general and that means avoiding any ‘Us vs. Them’ attitude or making code review burdensome for someone.

  • Review all code

No code is too short or too simple. If you review everything, then nothing gets missed. What’s more, that makes it a part of the process, a habit and not an afterthought.

  • Adopt a positive attitude

This is just as important for reviewers as for submitters. Code reviews are not the time to get all alpha and exert your coding prowess. Nor do you need to get defensive. Go into it with a positive attitude of constructive criticism and you can build trust around the process.

For reviewers:

  • Code review often and for short sessions

The effectiveness of your reviews decreases after about an hour, so putting off reviews and doing them in one almighty session doesn’t help anybody. Set aside time throughout the day, around breaks, so you don’t disrupt your own flow and reviewing becomes a habit. Your colleagues will thank you for it: waiting for a review can be frustrating, and they can resolve issues quicker whilst the code is still fresh in their heads.

  • It’s OK to say “It’s all good”

Don’t get picky; you don’t have to find an issue in every review.

  • Use a checklist

Code review checklists ensure consistency – they make sure everyone is covering what’s important and avoid common mistakes.

For submitters:

  • Keep the code short

Beyond 200 lines, the effectiveness of a review drops significantly. By the time you’re at more than 400, it becomes almost pointless.

  • Provide context

Link to any related tickets or the spec. There are code review tools that can help with that. Provide short but useful commit messages and plenty of comments throughout your code. It’ll help the reviewer and you’ll get fewer issues coming back.

A Guide To Open Sourcing Your Project at Work

Congratulations! You’ve written something at work that is amazing and you want to share it with the world! This guide covers three key areas that you should consider before making the leap: Why, when and how to do it.

Why Should I Open-Source My Work Project?

Open-sourcing your project at work can be a great idea. It can:

Help you build a developer-friendly brand

  • From companies with a developer-focused product, like Stripe and Twilio, to those with APIs, like Facebook, Google and Square, open-sourcing your code can be a good way to build your company’s relationship with developers.

Allow you to give back to the community

  • Just think of all the libraries and software you use on a daily basis that make use of open-source code. Adding your own is a good way of paying it forward so that others can benefit from your contribution. We’ve open sourced a number of libraries and even whole products.

Help you to recruit

  • Take Yahoo and LinkedIn, for example. They’ve found that, through their commitment to open-source projects (like Hadoop and Kafka), they’ve been able to encourage developers to join them who otherwise might not have.

Gain more contributors than your project ever would have in-house

  • Take Square’s Dagger, a dependency injector for Android and Java, for example. Since its release, many developers have contributed to it, including those at Google. In fact, Google developers have contributed more than Square’s own developers.

When Should I Open-Source My Work Project?

There are two conditions that you would want to meet before open-sourcing your project. You want to make sure that:

It won’t hurt your business

  • It may be an impressive, complicated bit of code that would be useful for other products beyond your own. Yet if that development is your secret sauce, then giving it away would be bad for the business. Likewise, if your library is an integral part of what makes your product unique or even what makes it possible, then you might want to keep it in-house.

Your code is helpful to others

  • Consider whether anyone else would actually want what you’ve created. Is it so uniquely tied to your workflow or infrastructure that it wouldn’t be useful for others? As a rule of thumb: if making it suitable for general consumption would make it less useful for yourself, then it’s probably not worth the effort.

OK, so you’ve met those two requirements. Now let’s move on to the mechanics of open-sourcing some code.

How Do I Open-Source My Work Project?

Step 1: Audit your code for security leaks

  • Chances are higher than you might like to admit that you or a colleague have left some passwords, usernames, IP addresses, machine names, personal contact information or other security hazards somewhere in your code. Keep in mind that this applies not only to your final master code but also to all the changesets you’ve had in the past.

For that reason, we recommend you do two things:

1. Make a brand-new repository
  • Chop off all the history of the code up to that point. There will be a new history and it saves you having to audit all the historical versions of your code. Plus, no one needs to know that it took you two weeks to wrap your head around C++11 lambda syntax.
2. Audit the code for security problems
  • This will take a lot less time than you think. Look especially at test suites and any places that are near connection points to other systems; a rough scanning sketch follows this list.
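
A very rough sketch of that first-pass audit, assuming a Python one-off script; the patterns are illustrative only and are no substitute for a careful read or a dedicated secret-scanning tool.

```python
"""Sketch: grep the working tree for strings that look like credentials,
hard-coded IPs, etc., before publishing. Patterns are illustrative only."""
import pathlib
import re

SUSPECT = re.compile(
    r"(password|passwd|secret|api[_-]?key|token)\s*[:=]\s*['\"][^'\"]+['\"]"
    r"|\b\d{1,3}(\.\d{1,3}){3}\b",   # hard-coded IP addresses
    re.IGNORECASE,
)
EXTENSIONS = {".py", ".rb", ".js", ".cfg", ".yml", ".yaml", ".json", ".xml"}

for path in pathlib.Path(".").rglob("*"):
    if path.is_file() and path.suffix in EXTENSIONS:
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if SUSPECT.search(line):
                print(f"{path}:{lineno}: {line.strip()}")
```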

Step 2: Strip your code of profanity and immature pot-shots

  • While you’re in there, also rip out anything inappropriate that makes you sound more like a teenager than a professional. This doesn’t mean you can’t have any humor in your source code. But it does mean that jokes made at the expense of your competitor, a customer or the decrepit browser you’re forced to support might not be appropriate.
  • If in doubt, think about whether you’d feel comfortable reading your code aloud to those beyond your team.

Step 3: Make sure your code adheres to best-practice naming and formatting

  • You’ll want your open-source code to be an example of your best work. Make sure you are using good, standardized naming conventions and formatting. Use tools like pyflakes/pep8, jslint, gofmt, ReSharper and others to help.
  • Also, keep in mind that if you’ve been wanting to do the One True Naming Standardization for your project, now’s a good time. Once you open-source your code, there will be a lot of inertia to avoid breaking changes. Get those done before you release. It’ll also make it easier for other contributors to get started with your code.

Step 4: Document it

  • You don’t have to write ninety pages of info docs but you should at least have a nice Markdown-formatted README.md in your root directory that explains what your software is, how to use it, and (if applicable) how to build it.
  • If you’re releasing a library, you should also make sure your code has docstrings/JavaDoc/whatever so that you can generate API documentation.

Step 5: License your code

  • You may want to get some proper legal advice on this. But before releasing your code, you should pick a license. Unless you have a compelling reason to do otherwise, the MIT license will probably suffice. It’s short, sweet, well-understood, liberal and makes integrating third-party changes back into your own products headache-free. But if you’re contributing to the code that you want to include in a project that already has its own license, you might want to use that license instead. Here’s a useful overview of license types for more info.
  • You’ll want to put a LICENSE file in your repository and have a copyright notice somewhere prominent — either in that file or in the README. Such as ‘(C) 20XX Your Name. All rights reserved.’

Step 6: Name your library or tool

  • Pick a name. Make sure it’s not offensive and doesn’t share a name with existing libraries or trademarked products.

Step 7: Push your code

  • Put it on GitHub: create your own organization and repository, then push your code.
  • Keep in mind that some communities have secondary systems that you should consider utilizing as well. If you’re writing .NET, another one might be CodePlex; if it’s Ubuntu-specific, then a Bazaar mirror on Launchpad, etc.

Step 8: Publish your package in the appropriate package archive

  • If you’re publishing a library, submit it to the appropriate package manager. For .NET, that would be NuGet; for Python, it’s PyPI; for Perl, it’s CPAN; for Ruby, it’s RubyGems; for Node, it’s NPM; and so on. Also, make sure that someone else at your company, such as a sysadmin, has the ability to continue maintaining the library under the unfortunate circumstance that you get hit by a bus.

Step 9: Announce your code

  • You’re all good, time to announce it! You’ll want to blog and tweet it out. You should also consider posting it to /r/programming on Reddit, Hacker News, etc.

And that’s it! You’re all done!

…well, nearly.

Step 10: Don’t forget about your code

  • Just because you’ve published it doesn’t mean you’re done. You’ve unleashed a new-born into the world; you need to take care of it. Monitor pull requests and bug reports on your new project. If you realize that keeping your project going is overwhelming, then a hearty congratulations! But remember that it is your responsibility to at least find an extra or substitute maintainer. It’s okay if your project ultimately forks, but it’s best not to let that happen just because you dropped the ball incorporating freely submitted improvements to your code.

That’s it. For real this time. So go out, contribute, and have fun!

Stop More Bugs With This Code Review Checklist!

Checklists are a great tool in code reviews — they ensure that reviews are consistently performed throughout your team. They’re also a handy way to ensure that common issues are identified and resolved.

Research by the Software Engineering Institute suggests that programmers make 15–20 common mistakes. So by adding such mistakes to a checklist, you can make sure that you spot them whenever they occur and help drive them out over time.

To get you started with a checklist, here’s a list of typical items:

Code Review Checklist

General

  • Does the code work? Does it perform its intended function, is the logic correct, etc.?
  • Is all the code easily understood?
  • Does it conform to your agreed coding conventions? These will usually cover the location of braces, variable and function names, line length, indentations, formatting, and comments.
  • Is there any redundant or duplicate code?
  • Is the code as modular as possible?
  • Can any global variables be replaced?
  • Is there any commented out code?
  • Do loops have a set length and correct termination conditions?
  • Can any of the code be replaced with library functions?
  • Can any logging or debugging code be removed?

Security

  • Are all data inputs checked (for the correct type, length, format, and range) and encoded? (A brief validation sketch follows this list.)
  • Where third-party utilities are used, are returned errors being caught?
  • Are output values checked and encoded?
  • Are invalid parameter values handled?
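
As a brief illustration of the first item above, here is a minimal sketch of checking an input for type, length, format and range before use; the field names and limits are made up.

```python
"""Sketch: validate type, length, format and range of inputs before use.
Field names and limits are illustrative assumptions."""
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_signup(payload: dict) -> dict:
    email = payload.get("email")
    age = payload.get("age")
    # Type and format checks
    if not isinstance(email, str) or not EMAIL_RE.match(email):
        raise ValueError("invalid email")
    # Length check
    if len(email) > 254:
        raise ValueError("email too long")
    # Type and range checks
    if not isinstance(age, int) or not (13 <= age <= 120):
        raise ValueError("invalid age")
    return {"email": email, "age": age}
```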

Documentation

  • Do comments exist and describe the intent of the code?
  • Are all functions commented?
  • Is any unusual behavior or edge-case handling described?
  • Is the use and function of third-party libraries documented?
  • Are data structures and units of measurement explained?
  • Is there any incomplete code? If so, should it be removed or flagged with a suitable marker like ‘TODO’?

Testing

  • Is the code testable? i.e. it shouldn’t add too many dependencies or hide them, objects should be easy to initialize, and test frameworks can use its methods.
  • Do tests exist and are they comprehensive? i.e. do they meet at least your agreed level of code coverage?
  • Do unit tests actually test that the code is performing the intended functionality?
  • Are arrays checked for ‘out-of-bound’ errors?
  • Could any test code be replaced with the use of an existing API?

You’ll also want to add to this checklist any language-specific issues that can cause problems.

The checklist is deliberately not exhaustive of all issues that can arise. You don’t want a checklist which is so long no-one ever uses it. It’s better to just cover the common issues.

Optimize Your Checklist

Using the checklist as a starting point, you should optimize it for your specific use-case. A great way to do this is to get your team to note the issues that arise during code reviews for a short period of time. With this data, you’ll be able to identify your team’s common mistakes, which you can then build into a custom checklist. Make sure to remove any items that don’t come up (you may wish to keep rarely occurring, yet critical items such as security-related issues).

Get Buy-in and Keep It Up To Date

As a general rule, any items on the checklist should be specific and if possible, something you can make a binary decision about. This helps to avoid inconsistency in judgments. It is also a good idea to share the list with your team and get their approval on the content. Make sure to review the checklist periodically too, to check that each item is still relevant.

Armed with a great checklist, you can raise the number of defects you detect during code reviews. This will help you to drive up coding standards and avoid inconsistent code review quality.