SE Radio 554: Adam Tornhill on Behavioral Code Analysis

Adam Tornhill, founder and CTO of CodeScene, joins host Giovanni Asproni to speak about behavioral code analysis. Behavioral code analysis is a set of practical techniques aimed at identifying patterns in how a development organization interacts with the codebase they’re building. It can be used to prioritize technical debt to maximize return on investment; to identify communication and team-coordination bottlenecks in code; to drive refactorings guided by data from how the system evolves; and to detect code quality problems before they become maintenance issues. The episode starts with a broad description of the techniques, providing some examples from real projects, and ends with suggestions on how to get started with applying them. During the conversation, Adam and Giovanni touch on a set of related topics, including the applicability of the techniques to legacy, green-, and brown-field projects; ethical and privacy implications; and the importance of context when judging code quality.

Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Giovanni Asproni 00:00:16 Welcome to Software Engineering Radio. I’m your host Giovanni Asproni, and today we’ll be discussing behavioral code analysis with Adam Tornhill. Adam is a programmer who combines degrees in engineering and psychology. He’s the founder of CodeScene, where he designs code analysis tools that empower teams to build great software. He’s the author of several books: Software Design X-Rays: Fix Technical Debt with Behavioral Code Analysis, the bestselling Your Code as a Crime Scene, Lisp for the Web, and Patterns in C. He’s also a public speaker. So welcome Adam, and is there anything I missed that you’d like to add?

Adam Tornhill 00:00:51 So thank you very much for that introduction. No, I think that pretty much sums it up. I’ve been working with software development for 25 years, and I still enjoy it very much.

Giovanni Asproni 00:01:01 Good. So well, let’s start then with the first question, which will be basically the definition. So, can you tell us, what is behavioral code analysis?

Adam Tornhill 00:01:10 Sure. So behavioral code analysis is the idea that you approach code analysis from the people side. So, in the behavioral code analysis, the code itself is an important piece, but it’s even more important to understand how the organization and developers behind the code have worked to create it. That’s where the true information is.

Giovanni Asproni 00:01:27 Okay. Can you tell us what kind of problems does it help to solve? Maybe giving also some examples from real projects so as people can understand a bit what they can make of it.

Adam Tornhill 00:01:37 Yeah, definitely. So, there are a number of additional information points that behavioral code analysis adds. If we look at just the code itself, or a static snapshot of the code, we will never be able to understand the dynamics of what actually happens when you build a system. And that simply means that we won’t be able to differentiate technical debt with a low interest rate, technical debt that’s in stable code, from technical debt that’s in code that’s much more volatile and actually affects the organization. Another big advantage of behavioral code analysis is that since we take this people perspective on code, we’re also able to do a lot of interesting organizational and social analysis.

Giovanni Asproni 00:02:16 Okay. Can you give us maybe a small example of some problem you helped an organization address with this? Just a short one if you’ve got one.

Adam Tornhill 00:02:25 Yeah, there are a couple of interesting problems that behavioral code analysis can help with. I like to think that one of the most important use cases for behavioral code analysis is to not only identify but also prioritize technical debt. The reason you need a behavioral code analysis is because technical debt, I mean the tragedy behind it is that there’s so much of it out there and you pick up a code base at random and it will have tons of application code debt. Most organizations simply cannot fix all of that debt at once. You need to use your time wisely. And by running a behavioral code analysis called a hotspot analysis, you can basically identify which technical debt represents the largest risk, where most of our waste is occurring, and what parts of the code become bottlenecks for the development teams. And the way you do this is basically by looking at the behavioral pattern of which parts of the code we work with as developers, and how often we work with those parts of the code. So, by identifying those patterns we can identify hotspots, and hotspots are simply complicated code that we have to work with often. And these are great candidates when starting to pay down technical debt.
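
The hotspot idea described here (complicated code that is worked on often) can be illustrated with a minimal sketch. This is not the analysis Adam's tooling performs; it assumes that lines of code is an acceptable stand-in for complexity and that multiplying it by change frequency gives a usable ranking. The function names and the sample data are hypothetical.

```python
from collections import Counter


def change_frequencies(commits):
    """Count how many commits touched each file.

    `commits` is a list of lists of file paths, one inner list per commit.
    """
    freq = Counter()
    for files in commits:
        freq.update(set(files))  # count a file at most once per commit
    return freq


def hotspots(commits, complexity, top=3):
    """Rank files by change frequency times a complexity proxy.

    `complexity` maps a file path to a number such as lines of code.
    The product is an assumed scoring rule, chosen only for illustration.
    """
    freq = change_frequencies(commits)
    scored = [(path, freq[path] * complexity.get(path, 0)) for path in freq]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]


# Hypothetical data: four commits and a lines-of-code count per file.
example_commits = [["a.py", "b.py"], ["a.py"], ["a.py", "c.py"], ["b.py"]]
example_complexity = {"a.py": 100, "b.py": 400, "c.py": 50}
print(hotspots(example_commits, example_complexity, top=2))
```

A file that is large but never changes scores zero here, which mirrors the point about low-interest debt in stable code.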

Giovanni Asproni 00:03:31 Okay. So, if I’m understanding correctly here you are saying, well as we know, lots of organizations have lots of technical debt in their systems, but behavioral code analysis will help us prioritize where to work and avoid addressing technical debt in parts of the code that actually are not going to change much. And so, addressing that particular technical debt would be expensive but not necessarily a good idea. Am I understanding correctly?

Adam Tornhill 00:03:57 Exactly. That’s a pretty accurate summary. The reason why this is important to prioritize is because if you pick up your average enterprise code base, you could probably spend two, three years just refactoring and paying down technical debt. And if you would do that, I mean, you wouldn’t be able to add a single feature. You would basically be out of business, you would be Netscape, right? And you really don’t want that. If you have technical debt or low-quality code and it’s in stable parts of the code (code that hasn’t been changed in months, even years), then that’s technical debt that you need to be aware of because it could very well be a long-term risk, but it’s probably not urgent right now. So, what a behavioral code analysis lets you do is to identify the parts of the code where you spend most of your time. By focusing your refactoring on those parts of the code, you’re more or less guaranteed that your work has a real impact, a positive influence moving forward.

Giovanni Asproni 00:04:53 Okay, that’s great. Now all this is good, but can you tell us in practice the core idea? You know, where does the data come from? Where do you get the information to do this analysis? What kind of tools do you get information from?

Adam Tornhill 00:05:05 Yeah, I should have covered that because that’s of course core to get this going. So, with the behavioral code analysis, we obviously need behavioral data of how we as developers have interacted with the code we’re building. And this sounds like some kind of surveillance, some kind of scary thing, but it’s not because all organizations already have all the data that they need. What I’m talking about is version control, systems like Git, because of the Git history. I’m so fascinated by version control because I think that historically we have used version control like Git more or less as an overly complicated backup system and then occasionally as a collaboration tool. But when doing so, we have also built up this absolutely wonderful data source over how the system evolved, which developers worked in which part, how often we worked there, and what happened to the code when we did so. So, behavioral code analysis is about analyzing version control history. That’s the most important piece of information. Then there is the code itself, of course, because we need to differentiate good from bad. And a third data source that I’ve been using more and more over the past years is product management tools like Jira and Azure DevOps.
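
To make the version-control data source concrete, here is a minimal sketch of turning Git history into records of who changed which files and when. It assumes the log is produced with `git log --name-only --date=short --pretty=format:--%h--%ad--%an`; the `--` separator layout and the function names are choices made for this illustration, not something the episode specifies.

```python
import subprocess

# Header layout assumed for this sketch: --<short hash>--<date>--<author name>
LOG_FORMAT = "--%h--%ad--%an"


def read_git_log(repo_path):
    """Run `git log` in `repo_path` and return its raw text output."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only", "--date=short",
         f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_git_log(text):
    """Parse log text into a list of commits with rev, date, author, files.

    Assumes no file path starts with "--", which would be mistaken
    for a commit header line.
    """
    commits = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("--"):
            _, rev, date, author = line.split("--", 3)
            current = {"rev": rev, "date": date, "author": author, "files": []}
            commits.append(current)
        elif line and current is not None:
            current["files"].append(line)
    return commits


# Hypothetical log text, parsed without touching a real repository.
sample_log = (
    "--a1b2c3--2023-01-10--Alice\n"
    "src/app.py\n"
    "src/db.py\n"
    "\n"
    "--d4e5f6--2023-01-11--Bob\n"
    "src/app.py\n"
)
for commit in parse_git_log(sample_log):
    print(commit["date"], commit["author"], commit["files"])
```

Keeping the parser separate from the `git` call means the same records could also be built from an exported log file.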

Giovanni Asproni 00:06:11 Oh okay. So, you basically do some kind of crosscheck analysis from the source code itself and the repository as well as the tools, the project management or product management tools. I guess you can do with this — tell me if I’m guessing correctly — you can also do tracking like of bugs where bugs are in the code and this kind of things in the analysis. Is that correct, by using these tools like Jira?

Adam Tornhill 00:06:36 Exactly, that’s absolutely right. When I started out with behavioral code analysis, I was mostly relying on version control and code. But what very often happens is that you more or less need to make the business case for larger improvements to the technical leaders, to the management. They might be people who are not that familiar with looking at source code, right? So that’s why I started to use more and more product management data because it can help you connect the dots. So not only can you show that okay, we have this complicated piece of code where we spend most of our time working, you can also make it relevant to the business by showing that, look, we actually had 20 defects in this code just over the past six months. And the nice thing with that is that then you can also visualize the return on investment if you pay down the technical debt. So, the expectation would be that if that part of the code is made clearer and easier to understand, you would also significantly reduce the number of defects. So, this would be a very real win that you could show and celebrate.

Giovanni Asproni 00:07:33 Okay, that’s good. Basically, it’s also a good tool — well, say a good way, doing behavioral code analysis, to actually connect the technical needs with the business needs as well, I guess. So, in terms of you want to address technical debt in a part of the code base, then you can actually justify those actions from a business perspective as well.

Adam Tornhill 00:07:57 Exactly. And I think that’s super important, not only now but also moving forward, for many organizations because if you look at the research that has been done over let’s say the past five years, we have actually learned a lot about technical debt and how it impacts organizations. And if you look at that research you will see it’s a pretty depressing read because what it shows is basically that the average organization wastes somewhere between 23 and 42% of developers’ time dealing with technical debt and bad code. The reason this happens is simply because code itself lacks visibility; it’s a very abstract thing. So even if you as a potential business manager know that okay, this is the waste that happens in the industry on average, it’s very hard to relate that to your situation, to your code base. So, I like to think that adding this behavioral data on top of it and this more business-oriented data can really, really help you understand, okay, what’s our situation? What’s our level of technical debt? What are the consequences of it? And by doing so and by being able to visualize code at the business level, we can finally make sure that everyone involved (not only we as developers but also product people and managers) has the shared situational awareness. And I think that’s critical to tackle a large problem like technical debt.

Giovanni Asproni 00:09:11 Okay. Now a question that might sound a bit simplistic to you, but I think will help our listeners a bit. So, most developers have probably used static code analysis tools, and these tools tell you you’ve got 52 years of technical debt, the code quality is so and so, cyclomatic complexity in some files, but I guess not everybody has heard about behavioral code analysis just yet. So, what are the differences between the two? How is behavioral code analysis different from static code analysis?

Adam Tornhill 00:09:41 That’s a great question. So let me try to clarify that. The reason I started to work with behavioral code analysis was very much a reaction to static analysis. And don’t get me wrong on this, I do think that static analysis definitely has its place. Static analysis is a really, really good feedback loop when writing code. However, what static analysis is not good at is to prioritize technical debt or prioritize issues. And the reason I say this is because with static analysis, what you can do very easily with these tools is that you can identify bad code. Static analysis is really good at that. But what static analysis can never tell you is the impact of that bad code. Is it technical debt? Is it waste? There’s no way to tell, right? And that’s why you end up with this long, long list of like 5,000 critical issues.

Adam Tornhill 00:10:28 With too much information, it’s simply no longer actionable. So, the main advantage of a behavioral code analysis is that yeah, it can also identify bad code, but as opposed to static analysis, you do get the priorities based on how you actually work with the code. The second big, big difference is that static analysis is limited to the code itself. And one of the great tragedies of software design is that the organization that builds the code is invisible in the code itself. And with the behavioral code analysis you can shine a light on that whole dimension, right? So, behavioral code analysis also lets you look into the people side of things, you can see how the different teams are organized, how they work with the code, and that makes it possible to evaluate things like architectural fit, which is really, really interesting.

Giovanni Asproni 00:11:10 Okay, that’s quite interesting. So it sounds like they are somehow related, but are actually more complementary because behavioral code analysis goes more in depth into some properties of the code base and the system itself, as I understand it. Is that correct?

Adam Tornhill 00:11:25 Yeah, I mean I’m obviously biased, but I do like to think that behavioral code analysis gives a more holistic picture of software development. I think that’s really important because getting the people side of software wrong has probably killed more projects than even Visual Basic, right? So it’s really, really important to understand that dimension too.

Giovanni Asproni 00:11:44 So now let’s go into a bit more detail to better understand the behavioral code analysis aspect. So we talked about organizational information in various dimensions around people and the code, but what kind of information can be inferred from the behavioral code analysis? What kind of properties of the code base, I don’t know, can we infer architectural properties for example, or just the code? Can we infer, I don’t know, the fitness for purpose of the team structure or not? Can it help also with these aspects? I mean, can you give us a bit more detail on the kind of information that we concretely can find?

Adam Tornhill 00:12:23 Sure. I’d be happy to walk you through a couple of stories that I’ve seen happen in the wild.

Giovanni Asproni 00:12:28 That would be great.

Adam Tornhill 00:12:29 Yeah. You can do a lot of different things. So behavioral code analysis is not one thing; it’s a lot of different techniques. You won’t use all of them at once, right? It depends a little bit on your context and the problem you’re trying to solve. Couple of years ago, to give you one example on what you can do, I worked with an organization that had a pretty large system, couple of million lines of code, had a significant development organization, I think 50-60 developers. They had used component-based architecture for a long, long time and that architecture had served them well, and consequently, they also had component-based teams so that each team had ownership of a particular component. And the advantage with that organizational setup is obviously that you have a very clear ownership and a clear area of responsibility. The drawback is that you get ridiculously long lead times because if you have a feature that kind of crosses over architectural boundaries, then you need to coordinate between multiple teams.

Adam Tornhill 00:13:21 So what they did was that they decided, let’s shift around and let’s do feature teams. So, they simply took their existing organization and split them up into I think six or seven different feature teams and let them loose on the code base. And of course, what happened now was that everything simply slowed down. There were tons of merge conflicts, the lead times got even worse, and no one was really happy with this situation. And that’s so typical of these types of organizational issues that you kind of see the symptoms, but the root causes are very, very hard to communicate. So, what we did was that we used behavioral code analysis to simply analyze and visualize what the code looks like, what the different areas are, and where in the code the different teams are working. And what you could very clearly show was that basically every single component became a gigantic coordination bottleneck because it had continuous contributions from five or even six different teams. And that helps you kind of understand that okay, we do have a mismatch between the way we’re organized and what our architecture supports. Does that make sense to you?
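
The coordination bottleneck described in this story can be sketched as a count of distinct teams contributing to each component. This is a minimal illustration under stated assumptions: a component is taken to be the top-level directory of a file path, the author-to-team mapping is supplied by hand, and the threshold of three teams is arbitrary. None of these choices come from the episode.

```python
from collections import defaultdict


def component_of(path):
    """Treat the top-level directory as the component (an assumption)."""
    return path.split("/", 1)[0]


def teams_per_component(commits, team_of):
    """Map each component to the sorted list of teams that committed to it.

    `commits` is a list of dicts with "author" and "files" keys.
    `team_of` maps an author name to a team name.
    """
    teams = defaultdict(set)
    for commit in commits:
        team = team_of.get(commit["author"], "unknown")
        for path in commit["files"]:
            teams[component_of(path)].add(team)
    return {component: sorted(names) for component, names in teams.items()}


def coordination_bottlenecks(commits, team_of, min_teams=3):
    """Return components touched by at least `min_teams` different teams."""
    per_component = teams_per_component(commits, team_of)
    return sorted(c for c, names in per_component.items() if len(names) >= min_teams)


# Hypothetical history: three teams all work in "billing".
example_team_of = {"alice": "checkout", "bob": "search", "carol": "accounts"}
example_commits = [
    {"author": "alice", "files": ["billing/invoice.py", "cart/basket.py"]},
    {"author": "bob", "files": ["billing/tax.py"]},
    {"author": "carol", "files": ["billing/invoice.py"]},
]
print(coordination_bottlenecks(example_commits, example_team_of))
```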

Giovanni Asproni 00:14:24 Yes. And how did you solve that one? Because you said you had component-based kind of system and organization, then you went to feature teams and then did you find a third way to — actually different way to organize the teams?

Adam Tornhill 00:14:35 The thing is that very often you do need to reorganize the teams, but reorganizing the teams alone without also casting an eye at the architecture will never ever work, right? So, the proper solution in this case was to kind of just roll back, get back to the component teams because yes, there were a ton of drawbacks, but at least the drawbacks were well understood. Right? From there, you simply had to evolve the architecture to support feature teams. So instead of having these technical responsibilities where every single component represented a technical section, they moved towards a more feature-oriented architecture so that you could have more isolated pieces where each team could work with larger autonomy without having to step on the toes of another team and force painful coordination meetings. So, teams and architecture always go hand in hand.

Giovanni Asproni 00:15:27 It’s actually quite a great and important point because I’ve seen organizations that do this kind of reorganization: feature teams are the way to go, let’s do that, and then they have those sorts of problems. But they are never able to kind of roll back or even justify what they are trying to achieve. Here it seems that using behavioral code analysis you can say, well, we tried this. Actually, looking at what is happening in the code, it is a bad idea. So, with the current architecture, the best we can do is still component teams. And so, if I understand correctly what you say, well, this is the best we can do at the moment, if you want to go to feature teams, you have to refactor the architecture of the system as well, and basically the behavioral code analysis tools are telling you this, am I correct?

Adam Tornhill 00:16:12 Yeah, that’s pretty accurate. The nice thing is that once you decide to rearchitect your architecture, then you can actually get a bit of support from behavioral code analysis as well because there is this other analysis called change coupling, which is again something that you cannot see in the code itself because change coupling is about things that tend to change at the same time. And that sounds really, really vague. But what I mean by that is simply that when you make a commit, you always modify two different files as part of the same commit. One could be the client and the other one could be the server, for example. And so what happens is that if you detect these implicit dependencies in the system and you visualize them, then it’s very easy to see that okay, if I decide to take this particular feature from a component and I pull out that feature, I try to encapsulate it, then I can see using change coupling what else follows along when I start to pull in this component. And that is a technique I’ve been using a lot over the years to kind of make an architecture a little bit more granular.
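
Change coupling as described here (files that tend to be modified in the same commit) can be sketched by counting file pairs per commit. The coupling percentage below, shared commits divided by the average number of revisions of the two files, is one possible definition assumed for this illustration; the function name, the `min_shared` cutoff, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits, min_shared=2):
    """Find file pairs that change together.

    `commits` is a list of lists of file paths, one inner list per commit.
    Returns tuples (file_a, file_b, shared_commits, coupling_percent),
    strongest coupling first.
    """
    revisions = Counter()
    shared = Counter()
    for files in commits:
        unique = sorted(set(files))
        revisions.update(unique)
        shared.update(combinations(unique, 2))  # every pair in this commit

    results = []
    for (a, b), count in shared.items():
        if count < min_shared:
            continue  # ignore pairs that co-changed only by coincidence
        average_revs = (revisions[a] + revisions[b]) / 2
        results.append((a, b, count, round(100 * count / average_revs)))
    results.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return results


# Hypothetical history: the client and the server usually change together.
example_commits = [
    ["client.js", "server.py"],
    ["client.js", "server.py"],
    ["client.js", "server.py", "README.md"],
    ["server.py"],
]
for a, b, count, percent in change_coupling(example_commits):
    print(f"{a} <-> {b}: {count} shared commits, {percent}% coupling")
```

Very large commits (for example, a mass rename) generate many pairs, so a real analysis would probably want to skip or cap them.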

Giovanni Asproni 00:17:16 That is actually quite good as well. So can really help you drive the architecture — well, your efforts to change the architecture — in a way that is convenient and not arbitrary, if you like, or not guided by simply gut feels.

Adam Tornhill 00:17:29 Yeah, I mean to me, the ultimate test of an architecture is always: does the architecture support the way we work with the system, the way we evolve the system? And what you see, quite often you actually find that teams are fighting against their own architecture.

Giovanni Asproni 00:17:44 Okay. That is quite good. And now another question, so I went through also your company page to get some information about behavioral code analysis, and one thing that caught my attention was also the fact that says behavioral code analysis can be used to measure knowledge gaps among other things. So, what kind of knowledge and what kind of gaps are we talking about here? Kind of gaps in skills, gaps in what?

Adam Tornhill 00:18:11 Oh, this is one of my favorite topics. A knowledge gap analysis is an example of one of those other social analyses. It’s about measuring the code familiarity of the current team. To give you an example of why this is important, a number of years ago I worked with a large organization. The team I worked with, they were responsible for two different code bases, and we did an analysis of the code bases. More specifically, we were looking for technical debt. So, we did a hotspot analysis, investigated a hotspot, did some actions, planned some ADs. And towards the end of the day the team told me that in addition to these two code bases, we actually had one more code base. So, I said, okay, perfect, let’s have a look at that one too. And everyone kind of just looked at me in a funny way, and then suddenly someone said that we don’t really have to look at it because we know that this code base is a true mess.

Adam Tornhill 00:18:59 So I said, okay, that’s super interesting. Now we really have to look at it. So, we did. And it turned out that objectively this third code base was no worse than the other two. And we actually had spent some time comparing code, looking at code samples to convince the team that there’s no objective difference in quality between these code bases. So, why was one code base perceived to be so much harder to understand? Well, it turned out that the reason was that this code base, the third code base, was of course developed by a different team in a different part of the organization that has since been disbanded. And this team simply inherited the code. So, the reason that they perceived it as complicated was because they lacked familiarity with both the domain and the code. And this is what I mean with knowledge and the code familiarity.

Adam Tornhill 00:19:41 Because in a behavioral code analysis you can actually measure the familiarity and visualize this. You can show, okay, where are the knowledge gaps as seen in the code? And this is important because if we address the problem with lack of familiarity as a technical problem and start to refactor the code, we’re going to waste a lot of time solving a problem that possibly doesn’t have to be solved. What we need to do instead is to give the team a chance to plan their onboarding, get the proper onboarding, and become familiar with the code in their own pace.
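
One rough way to picture the familiarity measurement mentioned here is the share of each file's commits that were made by people still on the team. This is a minimal sketch, not the metric Adam's tooling uses: it assumes commit counts are an acceptable proxy for familiarity, and the 25% threshold, function names, and sample data are hypothetical.

```python
from collections import defaultdict


def familiarity(commits, current_team):
    """Return, per file, the fraction of its commits made by the current team.

    `commits` is a list of (author, files) tuples.
    `current_team` is a set of author names.
    """
    total = defaultdict(int)
    by_team = defaultdict(int)
    for author, files in commits:
        for path in set(files):
            total[path] += 1
            if author in current_team:
                by_team[path] += 1
    return {path: by_team[path] / total[path] for path in total}


def knowledge_gaps(commits, current_team, threshold=0.25):
    """List files where the current team wrote less than `threshold` of the commits."""
    scores = familiarity(commits, current_team)
    return sorted(path for path, score in scores.items() if score < threshold)


# Hypothetical history: "legacy.py" was written entirely by people who have left.
example_team = {"alice", "bob"}
example_commits = [
    ("alice", ["billing.py"]),
    ("carol", ["legacy.py"]),
    ("carol", ["legacy.py", "billing.py"]),
    ("dave", ["legacy.py"]),
]
print(knowledge_gaps(example_commits, example_team))
```

A file flagged this way is a candidate for onboarding time rather than refactoring, which is the distinction drawn in the story about the inherited code base.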

Giovanni Asproni 00:20:11 That’s interesting. And avoids the conversations around that bad code is code that I didn’t write. It’s more kind of, it’s more objective — more than I simply don’t like that. I would have written that differently.

Adam Tornhill 00:20:23 Exactly, exactly. That’s quite common.

Giovanni Asproni 00:20:26 Okay, so we mentioned legacy code a few times here now. And you often mention behavioral code analysis of course in the context of legacy code. And this is, I mean it’s a kind of natural thing, isn’t it? Most companies have lots of legacy code to deal with that still delivers value and they have to manage it. But the first thing I would like to ask about legacy code is there are many definitions about legacy code; what is yours?

Adam Tornhill 00:20:49 So, my definition of legacy code is actually very close to what you just mentioned. So, the definition I tend to use is that legacy code is code that somehow lacks in quality, and it’s code that you didn’t write yourself, because I do find that this people side I mentioned is so much more important. That’s more or less what determines if something is legacy code or the next cool thing.

Giovanni Asproni 00:21:10 Okay. That is quite interesting. I know that there was also another definition — I think from Mike Feathers and others — that is basically legacy code is code with no tests. But now in your experience, have you come across code that actually has tests, is well tested, but the developers still have problems with it and think that it’s actually badly written anyway?

Adam Tornhill 00:21:30 Yes, I have. That’s the main reason why I don’t use that definition myself because, and I have to share this with you, I’ve been analyzing so many code bases over the years. I’ve probably analyzed 300 to 400 code bases over the past decade, and some of the worst technical debt I’ve ever found tends to be in the automated tests. So, to me, having a large test suite is absolutely no guarantee that the system is maintainable. It could actually mean the opposite.

Giovanni Asproni 00:21:55 So, basically if the tests are not written with high-quality code, the tests themselves are not really going to be very helpful to the team, or at least not helpful in terms of going faster or getting better understanding of the code base?

Adam Tornhill 00:22:11 No, I actually think that if the tests themselves lack in quality, they are going to hold back your overall efforts. They’re going to make the situation worse for you.

Giovanni Asproni 00:22:19 So it’s actually worse than that. It’s simply if you have bad tests, you probably are in the same position as if there were, actually, there were no tests whatsoever, kind of. Am I understanding correctly?

Adam Tornhill 00:22:30 Yeah, that’s correct. And I mean, I’ve seen it all. I think there are multiple reasons this happens in test code. I think that first of all, test automation is, I mean it’s an old idea but it’s only been mainstream for what, like 10, 15 years, or something like that. So I think that we as a community, we’re still learning what good tests look like, and consequently we tend to do things in that test automation code that we would never ever accept in application code. And then you have the other driver that some organizations tend to enforce different code coverage numbers. What you very often do is you simply try to meet that number, right? If it makes sense or not, that doesn’t really matter. You try to meet it. So, you see a lot of tests that don’t really test anything except that the application doesn’t crash. And I do think that’s a pretty low bar.

Giovanni Asproni 00:23:17 Yeah, yeah. Now a bit of a small change here. Now, we talk about legacy code quite a bit, but can we use behavioral code analysis also for development of green-field and brown-field systems?

Adam Tornhill 00:23:32 Yes, we can. I mean brown-field systems is a perfect use case. It’s really also one of the sweet spots. For green-field projects, you can use behavioral code analysis. You need to allow a little bit of time for building up the behavioral data, but in my experience, this goes surprisingly quick. So, if you have a team with say 10, 15 people, then you will have enough data in just three to four weeks to start to do a behavioral code analysis because the patterns you detect in the behavioral code analysis, they’re so strong. So, they’re there from the very beginning.

Giovanni Asproni 00:24:02 Ah, that’s interesting. So basically, it is a kind of the way teams work somehow the patterns, they interact with the code and they do things start to appear from the very beginning of the project and they tend to stay stable. Is this, am I understanding correctly?

Adam Tornhill 00:24:20 Yes, they do. So if you start to look at the green-field projects, what you tend to see is that the parts that start to look problematic after, let’s say, the first months of development, very often if you return a year later you see that they have just degraded further in quality, right? And the earlier we can detect these problems, the better. So that’s one reason why I recommend green-field projects to use behavioral code analysis in their day-to-day work. But there’s another reason too because one thing that tends to happen with the green-field projects is that they tend to get a deadline. And I’ve seen so many projects that get off to a good start. Very often, these are legacy-replacement projects, and this time around the organization wants to do everything right. Right? So, you know how messy it was to work with the legacy code. Now we are going to do the perfect replacement system. And it usually starts out well and then this deadline happens. And deadlines do tend to bring out the worst in us. And very, very often, the whole code quality takes a massive hit. And if you don’t have a behavioral code analysis in place, then you really lack that safety net, right? Then you lack the way of communicating inside a team that, okay, this is what’s going on, this is what’s happening, this is what we need to focus on.

Giovanni Asproni 00:25:29 Okay, this is quite good because now natural follow up question is how can you include behavioral code analysis within the software development lifecycle? You mentioned your suggestion for teams to use this in the day-to-day work. Okay. How can they do that? I can imagine, you know, integration with continuous integration or continuous delivery pipelines, maybe IDE integration. Can you give us an idea on how this should happen?

Adam Tornhill 00:25:56 Yeah, I like to think that there are a couple of different use cases. One of the most important is, of course, to integrate it into your CI/CD or your pull requests if you’re working with that so that each time you decide to merge something to your main branch, you know that this is a clean merge, that you’re not introducing any new problems, you’re not degrading the quality of any hotspots. I think that’s super important. The best way to manage technical debt is, I mean, step zero is always put the bar on what’s already there: don’t take on more debt. These integrations help me with that. The other thing that I recommend is to use behavioral code analysis as a communication tool. And what that means is that you can use these techniques, the visualizations of code, you can use them as part of your planning meeting and as part of your retrospectives. And in particular, I’m a big fan of using this data in retrospectives because once you get to a retrospective, you have everything fresh in your head, you know which features you worked on, you know which parts were painful to work on. And seeing that visualized in the context of hotspots, it’s a really, really good way of driving decisions on what to fix and what technical debt to pay down in the next iteration. So that would be my recommendation.
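
The "don't take on more debt" check on a merge could look roughly like the following sketch. It assumes a list of known hotspots and a numeric complexity measure for each file before and after the proposed change; how those numbers are obtained is left open, and all names here are hypothetical rather than part of any real CI product.

```python
def degraded_hotspots(hotspot_paths, before, after):
    """Return the hotspots whose complexity measure grew in the proposed change.

    `before` and `after` map file paths to a complexity number
    (for example lines of code); missing files count as 0.
    """
    return sorted(
        path for path in hotspot_paths
        if after.get(path, 0) > before.get(path, 0)
    )


def quality_gate(hotspot_paths, before, after):
    """Return an exit code for a pipeline step: 0 to pass, 1 to fail."""
    worse = degraded_hotspots(hotspot_paths, before, after)
    for path in worse:
        print(f"hotspot degraded: {path} "
              f"({before.get(path, 0)} -> {after.get(path, 0)})")
    return 1 if worse else 0


# Hypothetical measurements for a pull request touching two hotspots.
example_hotspots = ["core/engine.py", "core/api.py"]
example_before = {"core/engine.py": 120, "core/api.py": 80}
example_after = {"core/engine.py": 135, "core/api.py": 78}
print("exit code:", quality_gate(example_hotspots, example_before, example_after))
```

A pipeline step could pass the returned value to `sys.exit` so that a degraded hotspot blocks the merge.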

Giovanni Asproni 00:27:01 Are there any IDE integrations, any tools to integrate behavioral code analysis tools with the delivery pipelines? Basically, tools that simplify the life of developers, as far as you know, or they need to do this integration by themselves with the existing tooling?

Adam Tornhill 00:27:19 There are tools. Again, I’m a bit biased because it’s my startup that is building those tools. But that’s one thing we build at CodeScene. We build the integrations with code review tools and with IDEs. And the IDE integration is something I’m looking forward to, in particular. We are going to launch our IDE integration during February, so pretty soon.

Giovanni Asproni 00:27:39 Okay. The next question is one that I’m particularly interested in. So, tools like Git, version-control systems, or even project management systems like Jira, usually assign something to a single person. But some teams, maybe not most yet, use pair programming, or ensemble programming, also called mob programming. These are techniques where several people, sometimes the entire team, work together on a particular feature and so the commits or the tickets should be assigned to all of them. How does behavioral code analysis work in that context? Have you come across these kinds of environments, and what have you done to account for this?

Adam Tornhill 00:28:25 Yeah, that’s right. We actually came across it pretty early on because we have a couple of customers that are using those practices, of course. Yes, you can handle that. And the way you do it is, I mean the tooling obviously needs to know about you using pair programming, or ensemble programming. And what you do then is that as a developer you simply need to add those extra data fields, and there are different ways you can do that. You can have like a Co-authored-by tag in your commit message, or you can simply have a free text commit message in some structured format. And if you do that, then the tooling can pick that up and it can kind of distribute the credit and blame and praise across all members of the mob or the pair. So, it works pretty well. And I think practices like pair programming don’t impact hotspots, but they impact the knowledge metrics a lot. So, you do want to take that into account.
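
A minimal sketch of how tooling might pick up such tags, assuming Git's `Co-authored-by:` trailer convention and an even split of credit across the pair or mob. The even split, the function names, and the sample commits are illustrative assumptions, not a description of how CodeScene or any other tool actually weighs contributions.

```python
import re
from collections import defaultdict

# Matches trailer lines such as "Co-authored-by: Grace <grace@example.com>".
CO_AUTHOR = re.compile(
    r"^co-authored-by:\s*(.+?)\s*<[^>]*>\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def commit_authors(author, message):
    """Return the primary author plus any co-authors named in trailers."""
    names = [author]
    for name in CO_AUTHOR.findall(message):
        if name not in names:
            names.append(name)
    return names

def shared_credit(commits):
    """Give each commit one unit of credit, split evenly across its authors."""
    credit = defaultdict(float)
    for author, message in commits:
        authors = commit_authors(author, message)
        for name in authors:
            credit[name] += 1.0 / len(authors)
    return dict(credit)

# Hypothetical history: one paired commit and one solo commit.
commits = [
    ("Ada", "Fix parser\n\nCo-authored-by: Grace <grace@example.com>"),
    ("Ada", "Update docs"),
]
print(shared_credit(commits))
```

With this input, the paired commit contributes half a unit to each developer, so a knowledge metric built on these numbers no longer attributes the shared work to Ada alone.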

Giovanni Asproni 00:29:15 And does it also impact any other information? Because, for example, reading your books and listening to some of your talks, I think you found out that there is also some kind of power law distribution for who contributes more to a project in terms of developers, yeah? So, when you have, say, ensemble programming or pair programming, does this kind of distribution change a bit? Does the shape change?

Adam Tornhill 00:29:45 I mean, it would be super interesting to know, actually. I haven’t seen a data set on that. It would be really interesting to see a project that had been doing ensemble programming from the very start and look at what that contribution graph looks like. I mean, if you are diligent about it and that’s the only thing you do, yes, then I would expect the curve to change. The reason that you tend to get these power law shapes is also because on a project there are simply some people that stay on that project for years, right? So they naturally become the main contributors. I do think that within the IT industry we historically had a very high staff turnover, so I think it’s kind of natural that we gravitate towards those power law structures, but I’d be really interested to see that. I haven’t seen any good data sets on ensemble programming yet.

Giovanni Asproni 00:30:27 Thank you. And another question, this one of a different nature: not technical, but more ethical. So, as you mentioned, I think, briefly at the beginning, there is lots of data that comes out of these various repositories and various tools used for development. This data also contains a lot of information, or at least can help infer a lot of information, about individual developers. For example, the changes they make, their working hours, possibly the defects they introduce. Are there any kinds of abuses we need to be careful of? Have you come across any situations where you had to somehow say, no, I’m not going to do this because you want information that you shouldn’t be using in this way?

Adam Tornhill 00:31:16 Oh yeah, definitely. A couple of years ago, back in 2015 when I started CodeScene, the way I bootstrapped the company was by doing services, so I could kind of finance building up the tool. Back then I had so many people asking me, can you go into our organization and can you rank all of my programmers so I know who to reward and who to fire? I always said no to that. And the reason I said no was partly ethical. I simply think it’s the wrong thing to do. If you need to have that outside information as a manager, you are probably not as close to your team as you should be. And second, I also think it’s genuinely very, very unhelpful. It’s actually a dangerous thing to do because the moment you start to misuse these metrics to evaluate individuals, what’s going to happen is that everyone on that team is going to adapt to what they’re being measured on.

Adam Tornhill 00:32:05 They’re going to start to game the metrics and you’re going to lose a lot of valuable information. And in the process you’re going to destroy all the team dynamics. So, let’s assume that we take a ridiculous metric. Let’s say that we’re measuring productivity based on the number of commits you do as an individual. The moment you start to measure that and evaluate that on me, I’m going to do more commits. My commits won’t carry more meaning, but I’m going to do more of them. What’s also going to happen is that if you ask me for help, I’m going to decline to help you because I’m busy making commits. So, I think it’s really, really dangerous. And what I did personally was that I included a whole chapter in my latest book, Software Design X-Rays, where I warn about these potential pitfalls and advise against them and explain why you don’t want to do it. And what we did as a company with CodeScene was that we made a formal statement that we are never going to evaluate individuals. What we do is, we could evaluate the team. I think that’s fair to do because as a team you share goals, but never individuals. It’s simply the wrong thing to do.

Giovanni Asproni 00:33:03 Okay, thank you. Now another question, this time about behavioral code analysis and how we know that it works. So maybe a bit of a naughty question if you like, but many people that come up with techniques or tools and stuff say, use this, it will help you a lot, it will make your teams more efficient, deliver faster, with better quality. How do we know that behavioral code analysis actually works according to some objective criteria?

Adam Tornhill 00:33:31 Yeah, so this question is very close to my heart because when I wrote the first book about behavioral code analysis, Your Code as a Crime Scene, I stated already in the foreword that one thing I wanted to do was, well, let me put it another way. What I think is an issue, and that’s what I covered in Your Code as a Crime Scene, is that as an industry we have so many different opinions, and then there’s the whole academic research where we actually know what works and what doesn’t. There simply isn’t any good bridge between the academic research and the practicing programmer. So that’s one thing I wanted to add with Your Code as a Crime Scene. So, a lot of these techniques have actually been evaluated in the academic field. So, you know, for example, looking at the hotspot metric, we know that it has more predictive value than any properties of the code itself.

Adam Tornhill 00:34:15 So there is a growing body of evidence that these techniques actually work. Of course, I’m trying to do my contributions, as well. Six months ago, a colleague and I published a paper called “Code Red: The Business Impact of Code Quality,” where we actually looked at these metrics to see whether they actually correlate with something that means something to the business. I think it’s important because if you advise a company and the company makes potentially million-dollar decisions based on your information, it’s simply your responsibility as a vendor to make sure that what you do actually works. So, I hope I managed to answer the question to some degree.

Giovanni Asproni 00:34:51 Yeah, I think you did; you answered the question very well. I think it’s always a question to ask people that propose techniques or tools because, of course, nobody proposes anything saying use this because it will not work. Everybody will promote their tools, but to me it’s important to understand if there is any evidence behind that. And I think you answered very well. And then, a different kind of question: now, we talked about legacy code, code quality, how to judge quality, the impact it has on an organization, and all these good things, but then there is an interesting entry in your blog from 2019, “Why I Write Dirty Code: Code Quality in Context,” which I read and found quite interesting. So now, can you tell us, maybe very briefly, about the entry, and then how you use behavioral code analysis to decide the context?

Adam Tornhill 00:35:46 Yeah, I’m going to step into some dangerous ground now because I know this is an unpopular opinion. The thing is, that blog post was pretty much written as a little bit of a response to the boy scout rule, that you should always leave the code cleaner than you found it, because I simply don’t think that holds up as general advice. I’m very much against these categorical imperatives because I think context is always so important. So, the whole idea is that if you look at code and see what’s actually happening to code as a system evolves, you would see that some parts of the code are being changed much more frequently than other parts. This is information that you can use to your advantage because code quality matters in context. So, what that means is that if I’m modifying a piece of code where I know (and again, ‘know’ is based on measuring) that this is a part of the code where we do a lot of work, that it’s highly relevant to our roadmap, then yes, I do want to leave that code cleaner than I found it; it’s super important.

Adam Tornhill 00:36:42 But on the other hand, let’s say I fix an occasional bug in the long tail of change in the code base, right? So, I’m in a very stable part of the code that hasn’t been touched for years. When I make a modification to that part of the code, I probably don’t have a good mental model of how that code works. And if I start to refactor it, there’s always a risk I’m going to destroy something, and I might also waste time because chances are once that bug is fixed, I’m never going to return to that code again. So that simply means that when I work in that long tail code, what I do is I make the minimum effort I can to fix whatever I have to fix. And should I be wrong with that bet and actually have to return to that code again, then that’s the time to refactor it. So, I do use this temporal information to decide which code I refactor as I go along and which code is less important. That’s the idea.

Giovanni Asproni 00:37:31 So basically the code that does not change a lot, that changes only occasionally, you don’t really try to make it much better. You do the minimum amount of work to add the functionality or to fix a bug, whatever you need to do, but you keep it to a minimum.

Adam Tornhill 00:37:46 I do.

Giovanni Asproni 00:37:46 Yeah. Okay. So and then if it is code that changes very often, instead you are very careful with that bit.

Adam Tornhill 00:37:51 Yeah, that’s right. And that’s why I used this unnecessarily provocative title, Why I Write Dirty Code, because I might actually even make the code worse if I know that this is a stable part of the code, right? If the minimum change I can do involves adding a couple of extra if statements, then I do that, right? But I would never do that in code that I know is undergoing lots of heavy change.

Giovanni Asproni 00:38:14 Okay. Yeah, and this fits also in the context you talked about at the beginning where, basically, using behavioral code analysis to actually get a, if you like, a good return on investment on your time where you want to spend time to make the code better, where you don’t want instead to spend that much time. So, I guess this fits all this line of reasoning pretty well.

Adam Tornhill 00:38:36 Yeah, and I like to think that’s important because there is so little time and so much code.

Giovanni Asproni 00:38:42 Also, how can a developer make this context decision quickly? Because usually when they have to make changes and do something there is some kind of time pressure, and often, as we know, developers can go for a quick and dirty solution, but they might actually go for that in the wrong place. So how can they decide, hold on, here I really need to be careful, I mean in a kind of time-constrained manner?

Adam Tornhill 00:39:09 Yeah, so there are a couple of things you can do. What I personally do is I always have a hotspot analysis run on a daily basis automatically, right? So, I always have a hotspot map. If you look at that map a couple of times then you start to form that mental model, you start to think of the system in terms of hotspots: okay, these are volatile parts, these are stable parts. So that helps a lot. What you can do if you don’t have that information is something very simple. You can simply do a git blame in your code editor and you can check when was the last time that this particular function or part of the code was modified. And if it’s more than a year back, then you know that you’re looking at pretty stable code.
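
The one-year rule of thumb can be sketched roughly as follows. This is an approximation under stated assumptions: it looks at the last commit touching a whole file via `git log` rather than the per-function view a `git blame` in the editor gives, and the function names and the 365-day default are illustrative, not taken from any tool.

```python
import subprocess
from datetime import datetime, timedelta, timezone

def last_modified(path, repo="."):
    """Author date of the most recent commit touching path, or None."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "-1", "--format=%at", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # %at is the author date as a UNIX timestamp.
    return datetime.fromtimestamp(int(out), tz=timezone.utc) if out else None

def is_stable(last_change, now, threshold_days=365):
    """True when the code has not been modified within the threshold."""
    return now - last_change > timedelta(days=threshold_days)

# Demonstration with fixed dates, so no repository is needed here.
now = datetime(2023, 1, 1, tzinfo=timezone.utc)
print(is_stable(datetime(2021, 1, 1, tzinfo=timezone.utc), now))   # older than a year
print(is_stable(datetime(2022, 12, 1, tzinfo=timezone.utc), now))  # changed last month
```

Inside a real checkout, `is_stable(last_modified("some/file.py"), datetime.now(timezone.utc))` would apply the same check to an actual file; the path here is a placeholder.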

Giovanni Asproni 00:39:46 Okay. That is quite a useful rule of thumb, I guess, and a very quick thing to do for developers. Okay. Now something maybe a bit more advanced, less simple tools. So, I’ve seen references to machine learning and artificial intelligence in the context of behavioral code analysis. So, what kind of analysis and decisions do they enable?

Adam Tornhill 00:40:07 That’s an interesting area. It’s something I’ve been working with a lot over the past years. Let me try to give you one example where I think machine learning is genuinely useful. It’s a more advanced use case and it’s more contextual because it’s a typical use case for large organizations. But let’s say that you are the tech lead or maybe you’re the software architect in a large organization; one of your responsibilities is typically to review code, review pull requests and whatnot. And you might get tens of pull requests a day to review. Most of us simply cannot pay attention to all that level of detail throughout the whole day. So, after a while you’re going to slip, and you’re going to miss something really, really important. And of course there’s also the risk of burnout because you’re going to have this constant feeling of having to catch up with the new requests that come in for review.

Adam Tornhill 00:40:55 And at the same time, I do think that behavioral code analysis knows to some degree where the potential risks are. So, one thing I’ve been working on over the past years is to build machine learning algorithms that can look at behavioral data. Like, what are typical patterns of modifications? How do they compare to the typical low-risk modifications that we do? You can use machine learning pretty well to predict the risk that a particular change will introduce a defect, for example. And that makes it possible to simply flag certain pull requests as more risky so that you can focus your manual expertise towards where it’s likely to be needed the most.

Giovanni Asproni 00:41:31 Okay. So will it be some kind of tool to help reviewers, or are we envisioning also some kind of automated reviews? Like, somebody issues a pull request and the system goes, I run the behavioral code analysis with the machine learning, okay everything is good, I’ll accept it, or will there always be a human there?

Adam Tornhill 00:41:51 I think at the moment we do want to have a human in that loop, but that human is going to be able to prioritize their time in a much, much better way thanks to the machine. So, the way you can think about it, the way it actually looks, is that you open a pull request and the machine says that hey, this is a low-risk change, you just need to do a quick sanity check; or the system can say that this is an increased risk because, and then give a couple of reasons. And what I found so fascinating was that when we trained these machine learning algorithms, what we figured out was that again it was not only properties of the code that determined if a pull request was likely to have a defect or not. It was actually very much up to the social dimension as well. And to give you one example, let’s say that I go in and make a large sweeping change to the Linux kernel; that would be a gigantic risk because I have no idea how that actually works. Let’s say that Linus Torvalds makes exactly the same change as I did; his risk would be much, much lower because Linus actually understands the code, right? So this human element, it’s obviously sensitive, right? But if we really, really want to fix these hard issues and prioritize, then we need to take that social information into account.

Giovanni Asproni 00:43:01 Okay. So, the idea is basically to take this information into account to somehow qualify the risk in the change.

Adam Tornhill 00:43:09 Exactly.

Giovanni Asproni 00:43:10 So, if the system knows that it is a person that is actually knowledgeable about that part of the system, it flags it as a lower risk than if it is somebody else that is less knowledgeable, something like this?

Adam Tornhill 00:43:21 Yeah, that’s what we saw when we looked at what the algorithm actually did, because some of the vectors we used for the risk predictions are properties of the code and properties of the change sets, but then there’s the social dimension, because the more experienced you are in a particular area, the larger changes you can do with less risk.

Giovanni Asproni 00:43:40 Ok. And this is still an area under development, I guess?

Adam Tornhill 00:43:43 Yeah, so we actually launched that feature. It’s available, but of course there’s so much you can do with machine learning. We do try to find a good combination because you want a lot of things to be deterministic because you want to be able to reason backwards and understand, okay, why did we come up with this particular decision, right? So, I think it’s not something I would use widely, but when you have a lot of different data vectors with really complicated patterns, that’s where I think machine learning really shines. And this is one example of how you can use it with behavioral code analysis data.

Giovanni Asproni 00:44:13 That’s quite interesting. And now another one, so about the limit of the techniques, I mean what are the limits of behavioral code analysis? Are there any things that you would like to achieve but cannot be achieved, or cannot be achieved yet?

Adam Tornhill 00:44:27 There is actually one thing that’s been painful to me for a couple of years now because it’s something I was hoping that I would’ve been able to address many years ago, but I still haven’t gotten around to doing it. That is change coupling that we talked about earlier. Change coupling is super powerful in that it can reveal these implicit dependencies that you just cannot see in the code itself. However, change coupling is also bidirectional, meaning that you cannot tell in which direction the dependency goes. To me, this makes it a little bit less useful than it could be. What I would like to do is I would like to collect that information, and I think the only way to collect it is by integrating into IDEs and code editors because there we can actually figure out that, okay, I modified this file before modifying this file, and that gives you a direction on the change coupling. That’s one obvious limitation that I do hope to address in the near future.
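
To make the limitation concrete, here is a minimal sketch of change coupling computed from commit history alone, assuming each commit is reduced to the set of files it touched. The file names are made up, and the raw co-change count is a simplification of what real tools report. Because a commit only records which files changed together, the pair is stored unordered and no direction can be recovered.

```python
from collections import Counter
from itertools import combinations

def change_coupling(commits):
    """Count how often each pair of files appears in the same commit."""
    pairs = Counter()
    for files in commits:
        # Sorting makes (a, b) and (b, a) the same key: the pair has no direction.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical history: each inner list is the files touched by one commit.
commits = [
    ["api.py", "schema.sql"],
    ["api.py", "schema.sql", "README.md"],
    ["api.py"],
]
coupling = change_coupling(commits)
print(coupling.most_common(1))
```

Here `api.py` and `schema.sql` co-change in two commits, but nothing in the data says whether the API change forced the schema change or the other way around.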

Giovanni Asproni 00:45:19 Let’s see if I’m understanding correctly. So, you’re saying that change coupling, at the moment, is giving us correlation data but not causation — kind of, these things change together, but I don’t know which one I’m changing and then as a consequence the other will change as well.

Adam Tornhill 00:45:35 Exactly. And in order to figure that out you need to look at the code today, but I do think that by integrating more tightly with code editors, we should be able to figure out the direction of the dependencies.

Giovanni Asproni 00:45:46 Okay. And do you think also that maybe improvements in machine learning and artificial intelligence can actually help in solving these kinds of issues, or are they unnecessary for that?

Adam Tornhill 00:45:55 I’m actually not sure. I’m watching that area with a lot of excitement because there is a lot happening. I actually think that the advances in machine learning and with AI, I mean, we have techniques like ChatGPT now and GitHub Copilot, right? I like the direction. I think it’s still the early days, but I think what these techniques are doing is that they’re making it even more important to stay on top of your code because for a long, long time we’re going to have a hybrid where parts of the code are written by an AI, and yet have to be understood by a human. And if you’re in that situation, you do want to make sure that you know exactly what goes into your code and what doesn’t. So that’s where I think that behavioral code analysis over the decade to come will be even more important than it is today.

Giovanni Asproni 00:46:41 Now, questions about getting started. Let’s say we have a team interested in doing behavioral code analysis. How should they get started? So, say we want to analyze our code base. Are there any aspects they need to look at first? You know, some kind of typical things that you would recommend to look at first, or it depends on some context? If it depends on context, what is the context? So how do they get, how does a team get started with this?

Adam Tornhill 00:47:09 So if you talk about it from an analysis perspective, I always recommend that you start with the simplest analysis, and that’s the hotspot analysis. So, visualize your system, understand where your development hotspots are, and make sure that these development hotspots are healthy. That would be my starting point because getting the hotspots under control kind of creates the foundation for all other things you would like to do with that system. So that would be my recommended starting point. Once you have mastered that, I think that the next good step is to start to look at change coupling and figure out that, yeah, okay, even if the individual modules we have are simple enough to understand in isolation, is the system as a whole still easy to reason about? And that’s something change coupling can help you with. And then, in addition to that you have all these ad hoc use cases, like what if a long-term contributor leaves the company or leaves the team? Where are the knowledge gaps? Where do we need to focus our onboarding efforts? These kinds of analyses are a very good fit for behavioral code analysis too, but please start with hotspots and then change coupling. That would be my recommendation.
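
A rough sketch of a first hotspot analysis, under stated assumptions: change frequency is counted from the file names printed by `git log --format=format: --name-only`, lines of code stand in for complexity, and the score is simply their product. The scoring formula, the function names, and the sample data are illustrative, not the algorithm of any particular tool.

```python
from collections import Counter

def revisions_per_file(log_text):
    """Count commits per file from `git log --format=format: --name-only` output."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())

def hotspots(revisions, lines_of_code, top=3):
    """Rank files by revisions * size, a crude stand-in for a hotspot score."""
    scores = {path: n * lines_of_code.get(path, 0) for path, n in revisions.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Hypothetical log output: blank lines separate commits.
log_text = """
core/engine.py
util/strings.py

core/engine.py

core/engine.py
docs/intro.md
"""
# Hypothetical file sizes in lines of code.
loc = {"core/engine.py": 1200, "util/strings.py": 90, "docs/intro.md": 40}
print(hotspots(revisions_per_file(log_text), loc))
```

In this made-up history `core/engine.py` is both the largest file and the most frequently changed one, so it ranks first and would be the place to check code health before anything else.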

Giovanni Asproni 00:48:16 Okay, thank you. And also just to get started, I know that you created a free, open-source tool called Code Maat. Am I pronouncing it correctly?

Adam Tornhill 00:48:24 Yes, that’s right. Code Maat.

Giovanni Asproni 00:48:27 Okay. So with this tool, what kind of analysis can somebody perform on a code base?

Adam Tornhill 00:48:32 So with Code Maat you can perform the analyses I just talked about. You can run a hotspot analysis and you can run a change coupling analysis, and you can also do a couple of social analyses like figuring out what the knowledge distribution looks like in the code base, where the knowledge gaps are. You can do those kinds of things with Code Maat.
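
As an illustration of what a knowledge-distribution analysis might compute, the sketch below assumes the history has been reduced to (author, file) change records and uses each author's share of changes per file as a proxy for knowledge. This is not Code Maat's implementation; the proxy, the 50% majority threshold, and all names are assumptions for the example.

```python
from collections import Counter, defaultdict

def knowledge_map(changes):
    """Map each file to every author's share of the changes made to it."""
    per_file = defaultdict(Counter)
    for author, path in changes:
        per_file[path][author] += 1
    return {
        path: {author: n / sum(counts.values()) for author, n in counts.items()}
        for path, counts in per_file.items()
    }

def knowledge_gaps(shares, departed):
    """Files where the departing author made the majority of the changes."""
    return sorted(p for p, s in shares.items() if s.get(departed, 0) > 0.5)

# Hypothetical change records extracted from version control.
changes = [
    ("Ada", "a.py"), ("Ada", "a.py"), ("Grace", "a.py"),
    ("Grace", "b.py"),
]
shares = knowledge_map(changes)
print(knowledge_gaps(shares, "Ada"))
```

If Ada were to leave, `a.py` is where most of the change history sits with her, so that is where onboarding effort would be focused first.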

Giovanni Asproni 00:48:49 Okay. And does it depend on the version control system, particular version control system used by the team? I don’t know, is Git better than Subversion or Perforce or any other version control systems? I mean are there any limitations depending on the version control system you use?

Adam Tornhill 00:49:08 Yeah, there are some limitations. What I tried to do with Code Maat is make it work with all these version control systems, but certain features are only available for Git. So, I tried to handle that gracefully. If you request the more advanced analyses, then you need to have a Git repository, but you can always do a hotspot analysis no matter what version control system you’re using.

Giovanni Asproni 00:49:30 Okay. And these are limitations due to actually the data actually available in the repositories themselves. Am I correct?

Adam Tornhill 00:49:38 Yeah, exactly. I mean the main difference is that with systems like TFS or Subversion, you need to do a roundtrip to the server to figure out some pretty basic stuff like how large was this particular change, for example, in terms of lines of code? You could, of course, implement that too. I simply thought that that’s going to be ridiculously expensive from a runtime perspective, right? It’s going to take like five hours to run an analysis where you could actually do it in two seconds if the data was local like it is with Git. So that’s the main reason.

Giovanni Asproni 00:50:07 Okay. As far as you know, has this tool been integrated somehow by some teams in their own day-to-day activities? Like, I don’t know, integrated with continuous integration or continuous deployment tools, maybe running scripts from the command line? Do you know of any teams that actually use Code Maat in these ways?

Adam Tornhill 00:50:28 So, I know that Code Maat has been extremely popular over the years. I know that lots of teams are using it. The tool itself, Code Maat, is an open-source tool, but it’s not a good fit for things like CI/CD integration because it doesn’t have that concept of analyzing a small change set. But you could still benefit from it. I mean, what some teams do is simply that they make sure to run the tool once a day so that they always have an up-to-date view of what their code base looks like.

Giovanni Asproni 00:50:54 Okay. Are there any other tools that you would suggest for people that want to try these techniques? They just want to give it a go, maybe for a while, and don’t want to spend an arm and a leg. Are there any other tools they can use for that?

Adam Tornhill 00:51:08 I mean, there are a couple of options. What I did in Software Design X-Rays was that I wanted to illustrate that you can actually do a couple of these analyses from the command line using your Git client. If that’s your kind of thing, that’s definitely possible. It’s obviously pretty raw information and it’s a little bit limited, but I still think it’s useful for these really, really quick explorations. That’s one option. Then of course there’s a free community edition of CodeScene if you want to benefit from all the latest bells and whistles and analyses. So that’s also an option that you could check out if you’re interested.

Giovanni Asproni 00:51:40 The community edition can be used for what kind of projects?

Adam Tornhill 00:51:43 It can be used for any open-source project, for free, forever.

Giovanni Asproni 00:51:47 Okay. So, people that are working on open-source software can use that, can use the community edition. Is that correct?

Adam Tornhill 00:51:54 Yeah, that’s right.

Giovanni Asproni 00:51:54 Okay. Well, I think we’ve done quite a good job introducing behavioral code analysis, but if there was one thing you’d like a software engineer to remember from this interview, what would it be? If they think about behavioral code analysis, is there one thing that is the most important thing they should remember about it?

Adam Tornhill 00:52:15 So, the most important thing to me is to always make sure that we optimize for understanding. That’s where the big, big win is. And optimizing for understanding goes beyond just the source code. It’s also about making sure that the way we work fits with what the architecture actually supports. So, behavioral code analysis to me is code and people within a business context.

Giovanni Asproni 00:52:38 Okay, and is there anything else that we may have missed and you’d like to mention?

Adam Tornhill 00:52:43 So, I don’t think we missed anything. I think we covered a lot, but I do have some news that I could share and that is that I’m currently working on a second edition of Your Code as a Crime Scene. I want to bring it up to the 2023 state of the art. I hope that later this year you will be able to check out the new edition and read much more about behavioral code analysis.

Giovanni Asproni 00:53:03 Okay. And also, people can follow you on Twitter. Is there any other way they can get in touch if they’d like to talk to you, they’re interested in behavioral code analysis?

Adam Tornhill 00:53:13 Yeah, definitely. So, you can also connect on LinkedIn. I’m Adam Tornhill there. Twitter is probably the best channel, but feel free to drop me an email as well on [email protected] and I’d be more than happy to discuss these topics.

Giovanni Asproni 00:53:28 Okay Adam, so thank you for coming on the show. It’s been a real pleasure. And this is Giovanni Asproni for Software Engineering Radio. Thank you for listening.

[End of Audio]
