Episode 481: Ipek Ozkaya on Managing Technical Debt

Filed in Episodes on October 13, 2021

Ipek Ozkaya joined host Jeff Doolittle to discuss a book she co-authored entitled Managing Technical Debt. In the book, Ozkaya et al. describe nine principles of technical debt management to aid software companies in identifying, measuring, tracking and paying down technical debt. During the episode, she offers some unique perspectives on the definition of technical debt and shares helpful insights from both her research and personal experience.

This episode sponsored by NetApp.

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact content@computer.org and include the episode number and URL.

Jeff Doolittle 00:00:46 Welcome to Software Engineering Radio. I’m your host, Jeff Doolittle. I’m very excited to invite Ipek Ozkaya as our guest on the show today to discuss her book, Managing Technical Debt: Reducing Friction in Software Development. Ipek is the author of the book along with her co-authors Philippe Kruchten and Robert Nord. Ipek is a Technical Director at the Carnegie Mellon Software Engineering Institute. She develops techniques for improving software development efficiency and system evolution. She regularly works with government and industry organizations to improve their software architecture practices. She received a PhD in computational design from Carnegie Mellon University and is editor-in-chief of IEEE Software magazine, the publisher of Software Engineering Radio. Ipek, welcome to the show.

Ipek Ozkaya 00:01:31 Well, thanks for having me, Jeff.

Jeff Doolittle 00:01:33 Your book is titled Managing Technical Debt: Reducing Friction in Software Development. Why should listeners care about technical debt?

Ipek Ozkaya 00:01:43 Oh wow, why shouldn’t they? So, I think we revisit the problem of software maintenance and evolution and software quality over and over again. Every couple of years, there’s a major failure that happens that attracts all of our attention. Then we’re like, okay, we need to change the way we develop software. And then we forget about it. We (by “we” I mean the software engineering community and software organizations) really care about new functionality that’s developed. They do care about their defects. They do care about their security issues, vulnerabilities. Somehow, they don’t care as much about the fundamental design problems that actually bite them in the long run. And I think technical debt, as a concept, really puts that up at the front for the developers, software architects, as well as decision-makers and project managers. That’s reason one for them to care about it. The other reason is if they don’t care about technical debt, in the long run they will have to deal with it one way or another, and that might be too late. So, that also is another reason to care about technical debt.

Jeff Doolittle 00:02:50 So, ignore it at your own peril.

Ipek Ozkaya 00:02:52 Exactly. Ignore it at your own peril. And I think, I mean, it’s not a new concept. It was put out in 1992, and we’ve kind of gone around and we keep referencing and whatnot. However, what actually has been new, especially in the past couple of years, is that developers recognize it. Now they’re talking about, ‘okay, this is going to create more rework if we don’t address some of these fundamental design issues that have been introduced over time into the software.’ And I think that’s important for us to recognize as well.

Jeff Doolittle 00:03:24 You made an interesting connection now twice between technical debt and design, and I think that’s an important connection to make. So, maybe you can expand on that a little bit and as you do, help us understand what you mean when you say “technical debt,” because I sometimes wonder if we all mean the same thing when we use that term.

Ipek Ozkaya 00:03:41 So, when we talk about technical debt, what we’re referring to are the shortcuts that development teams take, sometimes for good reason. Maybe they haven’t implemented the software as decoupled as they wanted, maybe they did not incorporate an external system that they wanted to include, with the knowledge that as they keep developing the software, they will have to change it. It would be rework. So that’s the notion of technical debt and how it relates to architecture. All of these kinds of shortcuts are actually architectural in nature. They are about the structure and behavior of the system. Why do you take a shortcut? You do not want to decouple the different functionality and different modules, or you don’t want to upgrade the software that you have to upgrade. Those are the kinds of shortcuts that actually have impact in terms of the design and architectural decisions, and that is why technical debt is powerful.

Ipek Ozkaya 00:04:31 We already have concepts that refer to other aspects of the software development life cycle. We know how to talk about use cases and stories and new functionality and epics. We know how to talk about defects and bugs as we develop the software. We know how to talk about vulnerabilities. While we do talk about architecture and design, we don’t have these bite-sized concepts that help us manage them as we go through sprint boundaries, as we go through prioritizing a design rework versus adding new functionality. So that’s, I think, why it’s important to address it both from a definition perspective and by relating it to design and architecture.

Jeff Doolittle 00:05:12 Okay. So, the book starts with the word “managing.” Why manage technical debt? Wouldn’t it be better to just eliminate it altogether?

Ipek Ozkaya 00:05:20 First of all, I know we will refer to these, but one of the principles we have, as those who read the book will realize, is that every system has technical debt, and you’re not going to be able to eliminate technical debt, that’s for sure. So, managing is the right way to approach it, because if you don’t manage it, it’s going to accumulate. And also, managing provides the whole benefit. The reason technical debt is important, and why it’s powerful to relate it to the financial concept, is that there’s a benefit to it. When you take a shortcut, or when you decide not to implement something in the best way that might sustain you for the long term, you’re actually creating a benefit: delivering it faster, or being able to address some value proposition in the organization. So, that benefit needs to be managed, similar to how we manage our mortgage, right? We don’t just forget it and pay it in a lump sum in 30 years. We pay it in regular intervals and whatnot. So it’s really the management aspect of the debt that creates value, and the management aspect of the technical debt in the software will create value as well. So, that’s why it’s management.

Jeff Doolittle 00:06:25 Yeah. And it’s interesting that you mentioned, you know, finance as an analogy, and it has the word “debt” in it. So, that is interesting. Although, you know, 100 years ago, they didn’t have mortgages like we do now. If you wanted to buy a house, you put 50% down and made a balloon payment in five years. So, that’s a way to manage your debt, but that’s not what you’re describing. You’re talking about more like paying it down as you go, right?

Ipek Ozkaya 00:06:45 Paying it down as you go, or deciding not to pay it down as you go, because it might be in parts of the system that don’t change often, or it might be in parts of the system that you’re going to retire. And also taking it on knowing that you’re actually creating some value. I also very strongly believe some aspects of technical debt can actually help you. Like, for example, you’re going to learn something, and then you’re going to decide on the best design choice, and that actually has a lot of value. And until you rework that part of the system, that’s actually value added, because you’re collecting information.

Jeff Doolittle 00:07:20 So, when would you say that it becomes technical debt? Because with what you just described there, you know, in a process that I use, we start with a detailed design phase. And then the idea is you have to throw that away. You don’t get to keep that. The concept there is you have a set of requirements, we know requirements change, but we’re going to take a whack at it. And the result of that actually should be refined interfaces, you know, speaking of architecture, and refined contracts between, you know, how this new thing we’re building is going to interact with, or what it’s going to offer to, the other things that you’re going to build. So, I think I might’ve just answered the question myself. So, maybe, yeah. So, when would that actually become technical debt, if maybe you’re in a design phase and you’re not actually going to put it into production, for example?

Ipek Ozkaya 00:08:01 So, I think you bring up a very important and subtle point, because for example, if you’re going through a prototyping or design exploration phase in your software development, you might know that you’re not going to use this. You’re going to throw it away, but there’s always that 1% chance that it might actually stay in the system. So, one approach could be: well, this is a throwaway prototype that we’re doing, but if it stays, these are the ways it might become technical debt. It becomes technical debt when you start observing the negative consequences. What might a negative consequence be? Maybe you are trying it out, but you didn’t design for some of the availability, scalability, or other resource requirements, or you’re designing it with an underlying subsystem that you’re relying on, but you don’t have rights to it. It needs to be purchased.

Ipek Ozkaya 00:08:51 Then if it’s put into production, there are those boundaries. So, there are all these aspects of consequences. And when those consequences actually start to be observed, then it really becomes technical debt that has higher interest, as opposed to zero interest when you were prototyping. So, I think it truly is important. And that’s actually what we emphasize with my co-authors Robert Nord and Philippe Kruchten in the book. It’s understanding the consequences and managing the consequences. If you’re able to manage the consequences and it’s a 0% interest rate, then it’s okay to maybe keep the technical debt on for a while.

Jeff Doolittle 00:09:25 Yeah. And as we go, we’ll dig, I think, a little bit more into how you compare technical debt to the finance world. And I think it’s a really helpful analogy. But before we dive into the principles you referred to before, the nine principles of technical debt management: early in the introduction to the book you reference what Edsger Dijkstra in 1972 called the “software crisis.” So, I would imagine that from your and your co-authors’ perspective, you would say the industry maybe still is in crisis? And if so, how is that relevant to this subject of technical debt in particular?

Ipek Ozkaya 00:09:57 Oh, the software industry is in crisis. There’s one report after another that comes out. I don’t always pay too much attention to those, because there’s this added fear that happens and there are all sorts of business contexts. But I think the software industry is in crisis and has been in crisis. The reason being, first of all, we rely on it a lot. The complexity of the software has increased, and the complexity also comes from the interactions. A lot of software interacts with other software. There’s this ecosystem of software we’re creating. We have a crisis because we can’t create developers at the rate that we need, in terms of the skill sets that are required. There’s a crisis. The technology changes incredibly quickly and we don’t always have time to catch up. So, all of these things are what we call in the book a death by a thousand cuts.

Ipek Ozkaya 00:10:47 There’s a cut that happens every day in the software, and some organizations manage it better than others. We call it technical debt today, but we also appreciate the software maintenance, evolution, maintainability and sustainability perspectives. That crisis will not go away. We don’t have tools. We don’t have a vocabulary to talk about what might be accumulating over time in the software. And I think that’s why it’s important to relate technical debt to that complexity and the software crisis. We think things will actually fall into place in the right way, but they don’t. We still have some of the crisis, and I’m not going to name any organizations or any part. But I think every week we have a new incident with significant software that actually creates life-threatening risks, not only financial loss, so that’s not going to go away. And I think the power of technical debt, and of managing technical debt, is that it provides that vocabulary as well as the tool sets for organizations to maybe be honest with themselves.

Jeff Doolittle 00:11:50 Right, and maybe they can reduce the amount of crisis in their context. And if enough people start doing that, we may not fix the whole industry, but maybe the beginnings of that can start to come to fruition. Well, let’s start diving into these nine principles of technical debt management that you refer to in the book. So, the first one states that technical debt reifies an abstract concept. So first off, I don’t know if a lot of our listeners use the word “reify” often in their day-to-day conversation. So, speaking of abstract concepts, maybe you can start by defining “reify” for us and then explain how that describes the relationship between technical debt and abstract concepts.

Ipek Ozkaya 00:12:32 So, as an English-as-a-second-language speaker, I will admit that when that word became part of our principles, I personally had to go back and check. Okay, is that really the best word to use? But it is the best word to use, because what it means is that technical debt is an abstract concept, right? Especially in software, that’s why we keep asking: what is the definition? How do I get my head around it? Is everything that goes bad in my software technical debt? So, it takes that abstract concept, and in the book, what we’re aiming to do is make it concrete. And the way we make it concrete is we say, first of all, separate your causes from your technical debt. Yes, your manager who may not have made the right decision at the right time may have created a cause for technical debt, but that’s the cause. Yes, the documentation not existing might be a cause of technical debt.

Ipek Ozkaya 00:13:22 Yes, the requirements that were elicited might be a cause of technical debt, but those are the causes. The debt itself that you want to manage is actually something that you can define and put your hands around in the software system. You should be able to say, here’s the rework that needs to happen in the software. Here are the consequences that happen if we don’t do that rework, and those consequences are in terms of things being harder to add to the software, or additional defects that you might be introducing, and whatnot. So, that’s concrete, and this is what we’re really going after with that principle. So, if you are not able to get it to that level of concreteness, then maybe there are other concepts in the software engineering life cycle and the practices we have that might be better suited for the problem you’re dealing with.

Jeff Doolittle 00:14:05 Well, and it’s interesting too, you mentioned avoiding focusing so much on the cause, because that often doesn’t help you move forward unless you’re doing root cause analysis to avoid the same problem in the future, which makes sense. But if it’s turning into the blame game, it’s not really helping you. It’s better to just say, here’s what it is. And so “to reify” sounds like it means to make concrete. So, technical debt takes these abstract concepts about what’s wrong, how did we get here, why is this difficult, and why is it getting harder, and it makes them concrete. It gives us concrete concepts that we can discuss, analyze, and evaluate, and then make decisions and act on. Is that a good summary of that?

Ipek Ozkaya 00:14:48 I could not agree more. And furthermore, and that’s also what we emphasize in the book, you can concretely point to the actual place in the code, or in some of those software development artifacts, that you need to change for the technical debt to disappear.

Jeff Doolittle 00:15:05 Okay. So, it helps us connect these abstract nebulous things something’s wrong. And we have various ways of thinking about something being wrong. You mentioned before, you know, you have technical debt when you start seeing the cause right, something’s wrong. But now to be able to say, okay, I have a framework, I have a set of concepts that I can use to sort of make these things concrete and distinguish things like, you know, the root cause isn’t related to the actual debt itself. And I can start distinguishing those concepts. It really sounds like in some ways, this first principle is telling us the book is offering us a way to think about technical debt by defining it and giving us ways to sort of categorize it, analyze it and begin making strategies for addressing it.

Ipek Ozkaya 00:15:47 Correct, and especially when we’re addressing it: the way you might address causes will be completely different from how you actually pay down the technical debt that already exists. Like, for example, if developer turnover is creating all this confusion, and each new developer that comes on introduces a new piece of complexity or a mistake into the code, that’s a different problem. And the strategies to fix that are completely different from going into the code base and fixing the problems that they have introduced. So, the strategies will be different. So, I think decoupling them is important. So, don’t refer to developer turnover as technical debt; refer to the mistakes that they have made, and document them. And some of them may not be technical debt; some of them will be just defects or whatnot. So, I think it’s important from that aspect as well.

Jeff Doolittle 00:16:36 Okay. And we’ll dig a little bit more into that later, because you just made a distinction between technical debt and defects, and it will be good to dive a little bit more into that in a bit. But I think part of the distinction you’re making there is between the symptoms and the diagnosis. You know, the symptoms might be that we have technical debt and we have defects. We have these other problems, but what’s the diagnosis? Well, the diagnosis may be, oh, too much turnover. And maybe turnover is not so bad if your developers have similar training, and if you have a shared set of standards that they can adhere to. But in this case, as you’re describing, okay, we have technical debt. Why? Well, the why is the diagnosis. And the diagnosis in the case you were describing would be that there’s so much turnover. Knowing that is where root cause analysis would be useful. Okay, why do we have technical debt? You know, using the financial analogy from the book, to me it seems like a family that keeps going deeper and deeper and deeper into debt. So, just figuring out how to satisfy the debt is, at that point, not necessarily the only approach you want to take. You also want to say, why do we keep incurring debt? What behaviors are causing us to incur debt?

Ipek Ozkaya 00:17:43 Exactly. And that’s also very important. As I said, the way you manage the why is completely different from the way you ask, okay, how do I really tackle this problem that is happening in the code base now, right? You still have to manage the question, why do I keep getting into debt? That’s a different thing from, okay, I have the debt itself. Now I have to pay that down, as well as eliminate my ability to get more into debt, I guess, iteration after iteration, or one developer at a time, or whichever way it manifests itself in your organization.

Jeff Doolittle 00:18:17 Now, perhaps this is pushing the analogy too far, but you tell me: not all debt is bad. I mean, if you make an investment in something, you know, not even just a mortgage, but let’s say you buy a piece of investment property and you take on debt in order to do so. But on that investment property, you have a building and you have a tenant, and you’re getting more rent from the tenant than you’re paying on the debt for the property. So, in that case, the debt is, in a sense, good, because it’s allowing you to have this investment. Does that apply at all when we’re talking about technical debt?

Ipek Ozkaya 00:18:49 Of course, and actually that’s the power of it. Not all debt is bad. And of course, I do not want to be empowering everybody’s technical debt if I go and say on the SE Radio show that not all debt is bad, but I think it’s really the management aspect of it, because software development is really about trade-offs. And that’s where I actually relate it to architecture and design. These are all about trade-offs. And we make trade-offs based on all sorts of different priorities and business concerns, as well as other technical concerns. So, technical debt is about having the vocabulary for understanding those trade-offs and what the consequences might be when those trade-offs change, if we don’t manage it. So, if we’re comfortable with the notion that software development and architecting are about trade-offs, then if I’m making a trade-off, that might mean I might be taking on debt, because I’m optimizing one attribute over the other, and the other attribute might still turn out to be important.

Ipek Ozkaya 00:19:48 But now I have a vocabulary to say, you know what? I made this trade-off. I need to watch it carefully, because the priorities might change, and should the priorities change, I need to revisit the kinds of decisions I made. So, from that perspective, definitely, not all technical debt is bad. Unintentional debt is mostly bad, unmanaged debt is bad, but being able to understand the trade-offs, express them, and create value out of them is actually the power of, I think, the metaphor, as well as of any design activity, and technical debt puts a concept around it. Now, if you’re making this trade-off, you have a vocabulary to express it. Did we not have other vocabulary? Of course we did. We could still have put it in the issue tracker and said, okay, here’s an architectural decision I made, but somehow, I guess, the negative consequences didn’t hit people. It does happen in some teams, but in about 20 years of my software architecture research and process and development, as well as organizational work, I haven’t seen that as a common practice. I think the technical debt concept gives us an opportunity to maybe bring that to life.

Jeff Doolittle 00:20:58 Well, and you mentioned putting an architectural decision in an issue tracker, and I would encourage listeners to read about architectural decision records, which I’m sure you’re familiar with. And we can put a link to that in the show notes, because in my experience, if it ends up in the issue tracker, it’s going to go there to die and no one’s ever going to see it again when they need it. Whereas if you put your architectural decision records right in your source code, then they tend to stick around a little longer and have a chance to be useful. It doesn’t guarantee they will be. But anyway.

Ipek Ozkaya 00:21:27 Correct. And actually the same goes for technical debt as well. There’s work that has been done looking into how developers actually disclose their technical debt as part of their comments. What is the vocabulary they use? There are all sorts of typical ones: “fix me,” “this is a hack,” “to do.” And a significant portion of those are actually also hints of technical debt.
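The kind of comment vocabulary mentioned here can be searched for mechanically. The following is only a minimal sketch, not the method used in the research referred to above: the marker list, the `find_debt_comments` name, and the sample source are all assumptions made for illustration, and a hit is a hint that deserves a look, not proof of technical debt.

```python
import re

# Hypothetical marker list; real projects use richer vocabularies.
DEBT_MARKERS = ("TODO", "FIXME", "HACK", "XXX")

# A '#' or '//' comment opener, followed later on the line by a marker word.
# Naive on purpose: it does not understand string literals or block comments.
_PATTERN = re.compile(
    r"(?:#|//).*?\b(" + "|".join(DEBT_MARKERS) + r")\b(.*)",
    re.IGNORECASE,
)


def find_debt_comments(source):
    """Return (line_number, marker, note) for each comment that hints at debt."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = _PATTERN.search(line)
        if match:
            marker = match.group(1).upper()
            note = match.group(2).strip(" :-")
            hits.append((number, marker, note))
    return hits


# Made-up sample source, used only to exercise the function.
sample = """\
def total(prices):
    # HACK: assumes every price is already in USD
    return sum(prices)  # TODO handle currency conversion
"""

for line_number, marker, note in find_debt_comments(sample):
    print(f"line {line_number}: {marker} -> {note}")
```

Each hit still needs a human decision: some of these comments describe plain defects or routine to-dos, which matches the distinction drawn earlier between technical debt and other kinds of issues.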

Jeff Doolittle 00:21:47 Yeah, absolutely. Yeah. When you say slash-slash hack something, I reject that PR out of hand, but that’s a whole other side. Yeah. So, the next principle says, if you do not incur any form of interest, then you probably do not have actual technical debt. And that seems to make sense. If we’re talking about debt, then if you have a debt, you owe someone, and unless you’re from a country or a belief system that doesn’t allow interest, you’re going to be incurring interest. So, the book uses this analogy to try to help us. I want to say quantify, but I don’t know how explicitly we can quantify technical debt. But maybe you can go a little bit further for us in describing that analogy of principal and interest and how that pertains to managing technical debt.

Ipek Ozkaya 00:22:30 So the interest is about the consequences. The principle is: if you do not incur any form of interest, you probably do not have actual debt. The keyword there is “actual.” And you just asked me about technical debt not all being bad, that it could actually be a potential value-creating activity. That’s actually related to that. So, if I made a trade-off and I recorded it as a potential technical debt, at the moment I am not observing the consequences. And by consequences, again, maybe I am going through the actual development life cycle. Maybe I just want to ship the code so that I can actually observe how the users are reacting to it. At that phase of the system, there is no debt. It’s debt when you start seeing the consequences. So, really, that principle tries to position the positive aspects of technical debt. And that actually provides an opportunity. It’s not actual debt now; it’s potential debt, and in the future I might need to revisit it. So that’s really the differentiation. And also, when you’re reducing debt, that may not be the top priority that you start from in your rework, because at the moment you are not observing the consequences.

Jeff Doolittle 00:23:40 Okay. So, is it actually possible to have a system that’s not incurring any interest? I don’t think so, because otherwise I think that would contradict the next principle we’re going to talk about, which is that all systems have technical debt. So, you’re always incurring some interest?

Ipek Ozkaya 00:23:54 You’re always incurring some interest on the technical debt items. It’s just that for one particular design decision, you might say, okay, you know what? I’m taking this on, and we will need to revisit it by the next iteration. Right at that moment, that is actually not incurring interest. So, it’s potential debt. And in the next iteration you might say, you know what? This is just fine. So, you don’t manage it as technical debt anymore. Whereas on other items you might be observing consequences.
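The distinction between potential and actual debt can be sketched as a small data structure. This is an illustration only, not a model from the book: the `DebtItem` class, its field names, the day-based units, and the example items are all assumptions. Principal stands for the estimated rework, interest for the extra effort observed per iteration, and an item with no observed interest is treated as potential debt.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """One recorded trade-off (hypothetical structure, for illustration)."""
    description: str
    principal_days: float                      # estimated rework to remove the debt
    interest_days_per_iteration: float = 0.0   # extra effort observed each iteration

    @property
    def is_actual(self):
        # No observed consequences means the item is only potential debt.
        return self.interest_days_per_iteration > 0

    def interest_paid(self, iterations):
        # Total extra effort spent if the rework is put off for this long.
        return self.interest_days_per_iteration * iterations


def payoff_order(items):
    """Highest observed interest first; potential debt ends up last."""
    return sorted(items, key=lambda item: item.interest_days_per_iteration, reverse=True)


# Made-up example items.
prototype = DebtItem("prototype auth flow kept in the product", principal_days=5.0)
coupling = DebtItem("order code reads billing tables directly",
                    principal_days=8.0, interest_days_per_iteration=1.5)

for item in payoff_order([prototype, coupling]):
    kind = "actual" if item.is_actual else "potential"
    print(f"{kind}: {item.description}")
```

Real interest is rarely this easy to measure; the sketch only makes the vocabulary concrete so that items with observed consequences can be reviewed first and potential items revisited at the next iteration.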

Jeff Doolittle 00:24:22 It’s really helpful too. I think, you know, you mentioned the term has been around for a long time. I myself have not really thought about it in depth in relation to finance, but the more I think about it now, it seems like if you’re especially say in a startup, which I spent a lot of my career doing startups, so the idea is you’re going to incur debt or you’re not going to be able to get anything done. You know, you’re going to have to make some trade-offs. That’s just life, there’s always trade-offs. But you can also get into technical debt bankruptcy, and that’s called you failed, right? Too much technical debt, and eventually it’s just going to literally blow up.

Ipek Ozkaya 00:24:57 Well, there’s that. But Jeff, in the startup world, where I don’t have as much experience, it could be that you have a beautiful system; it’s just that the business need may not be there.

Jeff Doolittle 00:25:08 Well that’s right. Product market fit and those kinds of things. The only reason you fail, right?

Ipek Ozkaya 00:25:14 There are failures. Like, I mean, we went through this whole period of globally working from home, and a lot of the systems that supported us had to scale to limits that they probably didn’t even imagine. Some of them were able to, and others weren’t. So, I think we will observe that as business changes: do I have a product that did not incur technical debt, so that I can actually meet these new needs with ease? I’m sure some of these organizations had to go through very painful conversations internally and, I guess, kibitzing or reworking or solving the problems in a different place.

Jeff Doolittle 00:25:50 Well, and it’s tough too, because if your debt load is high, you know, before, I used that analogy of a property owner with a building that they rent out to someone. But if they lose their tenant and they can’t find another one, then where before they didn’t have too much debt, suddenly something has happened and they have way too much. So, I imagine that as you’re thinking about managing technical debt here, you could have it at a place that seems healthy for the meantime, but something around you could change and suddenly your debt load is too high. So, I would imagine part of managing technical debt is also wanting to keep it at a reasonable level. Again, I know we can’t explicitly quantify it, but some sense of: we are in control, we are managing this, and if things change around us, we have a sense that we won’t be overly leveraged.

Ipek Ozkaya 00:26:32 And the way we refer to it is that there’s an expected baseline, an ideal state, that the software development goes through, right? It might be based on the new functionality that you deliver, it might be based on the amount of, I don’t know, code that you were able to ship. Different organizations have different ways. Or it’s based on the new products that actually make the deadline. Whatever that ideal state of delivery is, that kind of defines your probable risk exposure. And then the technical debt question is: does the technical debt that you’re accumulating actually change that ideal baseline, in terms of how you’re able to respond to the needs of the organization with your software development tempo, and how does it change it? So, I think that helps from a risk exposure perspective. Different organizations have different exposures, and different organizations might be willing to take that on. Sometimes they have complete sprints or iterations dedicated to reworking the system to reduce that. There are different strategies, but it’s truly based on how you’re able to take the risk and manage it. And the reason technical debt has actually resonated with developers within the last decade, more so than when it was first coined by Cunningham in 1992, is that it allowed them to talk about it explicitly. Everybody knows that their software might have issues, but having the vocabulary to talk about them explicitly helped provide the opportunity to manage them as well.

Jeff Doolittle 00:27:59 So the next principle, I think we’ve addressed in some ways, principle three, all systems have technical debt. So maybe the obvious question is why did you need to explicitly call this out?

Ipek Ozkaya 00:28:10 Because we’ve observed mismanaged software systems over and over again, where the goal really focuses on the numbers rather than on the outcome of the software meeting its business, mission and quality goals. So, first of all, we don’t know how to actually even assess zero defects or zero technical debt, even with the tools that we have at our disposal. Second, the systems we develop are complex and they evolve in ways we do not, or cannot, anticipate. And that also is part of the overall item management construct. But most importantly, as I said, it’s about trade-offs, and there’s always a trade-off that happens in every single system. So every single system has some amount of technical debt. It could be well managed, with close to zero interest, and that’s the ideal, or it could be not well managed, where you’re actually really struggling with some very difficult quality issues as well as development issues. And that’s really the differentiator. We use a similar principle when we talk about software architecture and systems: every software system has an architecture. The difference is whether you are aware of it and whether you’ve designed it, versus it happening to you. This is the same with technical debt as well.

Jeff Doolittle 00:29:26 And you mentioned systems multiple times and that’s perfect because the next principle says technical debt must trace to the system. So maybe describe for us the importance of that principle and what you mean by, by tracing it to the system.

Ipek Ozkaya 00:29:39 So this goes back to the earlier conversation we already had: the confusion between what causes technical debt and what’s actually the rework that I need to do to be able to reduce it, and also getting confused in terms of the kitchen sink syndrome, where everything that goes bad in my system is technical debt. What we mean by that is, if we’re talking about a technical debt item, I should be able to concretely point, in the system, to the rework that I will need to do if I don’t manage that technical debt.

Jeff Doolittle 00:30:10 Okay. So rather than just saying, there’s that monolith over there, that’s a pile of technical debt, that’s pointing at the system. But when you say tracing to the system, you mean more specifically, these are the portions of the system that will need to be changed in order to address the monolith.

Ipek Ozkaya 00:30:27 Yeah. I think the example you give is a really good one. Turn my monolith into microservices. Great. Where do I start? First of all, is it even helpful to manage this as technical debt? You’re actually talking about a serious re-architecting effort, which probably has aspects of technical debt within it, but there’s a serious re-architecting going on. So that’s number one. And what are the specific things that need to be done? I’m sure within the monolith, there are parts of it that are hairier to resolve than others. So, differentiating those: some would be technical debt, others would be just routine development and re-architecting.

Jeff Doolittle 00:31:06 Sure. And of course, it’d be good probably now to dispel the rumor as well, that microservices have less technical debt than monoliths, because regardless of one or the other, they both can have their own amounts of complexity and technical debt. So, I didn’t want to imply one or the other was better from a technical debt standpoint, just using it as an example. Yeah, because with microservices, you could just say, there they all are, there’s the technical debt, and that would be just as bad as just pointing to the monolith and

Ipek Ozkaya 00:31:31 Dependency explosion happens everywhere. I think that’s probably it.

Jeff Doolittle 00:31:35 And honestly, you talk about trade-offs and that you focus on architecture and so much of architecture, I would distill down to encapsulation and dependency management. And if you get those things right, you’re going to probably do better than most.

Ipek Ozkaya 00:31:48 Exactly, and I’m very glad you picked up on that, Jeff, because whether a system developed with a microservice architecture versus a monolith versus service-oriented architecture versus whatever your most favorite architectural paradigm is has more or less technical debt is a very irrelevant question. That’s not the right question to ask. I’m following agile software development versus my historic waterfall or whatever your most favorite software development approach is, so I have less or more technical debt: irrelevant question. So, all of these, and that’s why it’s important that every single system has technical debt. It’s the development ecosystem and the activities, as well as design trade-offs, that introduce it to the system.

Jeff Doolittle 00:32:31 Well, I’m back to the first principle, and why I think this book is helpful in other ways is that it is making these abstract things concrete. But I think it’s not always abstract things being made concrete, it’s confusing things being clarified. And that’s, I think, the issue with monolith versus microservices or agile versus waterfall: often those words are, a lot of times, strawmen. They don’t even relate to the thing as it really was. And I think that’s true about both agile and waterfall in many cases. So, in a similar fashion, we’re saying, let’s actually define what technical debt is. And then we can actually deal with it once we actually know what it is and what it isn’t.

Ipek Ozkaya 00:33:08 That’s exactly why some of these principles are reinforced: get specific, get concrete, trace it to the system, don’t stay at the abstract level, because then you’ll have the tools to deal with whatever the consequences are or whatever the technical debt is that you’re incurring. And more importantly, you’ll also have the tools to manage some of the other issues that you might not have had the vocabulary to manage before. So that’s really the point of the principles as well as the book.

Jeff Doolittle 00:34:35 And that leads right into principle five: technical debt is not synonymous with bad quality. So, we talked before about how you can’t just point at the system and say, there’s all the debt. You have to trace it to, these are the parts of the system that are impacted by or need to be, you know, changed in order to satisfy this technical debt, these kinds of things. But now you’re also saying it’s not synonymous with bad quality. So why, why is that? Why wouldn’t we just say bad quality is technical debt? Why aren’t those the same thing?

Ipek Ozkaya 00:35:01 So this goes back to reinforcing: if we’re talking about technical debt, more often than not you’re talking about the structure and the behavior of the architecture of the system. So bad quality is bad, obviously, but we also have tools to deal with bad quality. We have quality assurance practices, defect management, quality conformance, all those static checkers, or whatever is your most favorite approach to manage that. Those often tend to be symptoms. Why are they symptoms? Because if you have bad quality, you have probably also not structured your software properly. It really hints at an underlying structure and behavior problem in the way the software is organized as well. So that’s number one. Number two, early on when we started working and trying to understand the concepts around it, there was always a tendency: okay, fine, I’m going to go to my defects. All my defects are now called technical debt.

Ipek Ozkaya 00:35:55 Technical debt is not all bad. I’m done. So, there’s that tendency to kind of confuse concepts with each other as well. So, we wanted to decouple it. The other tendency was, there are quite a number of quite powerful static code analyzers that will actually hint at some of these issues. Some of them, again, might be symptoms. But again: all right, I run my most favorite tool on my code base. It came up with 11,000 conformance issues, and it says the cost of fixing them is twice the cost of the development effort. I’m already bankrupt. So those wrong ways of measuring and mapping them actually do not help anybody. So, the reason for that principle is trying to decouple it. Of course, it’s one thing to make these very easy examples. There are some cases where things get hairy, but I think experienced developers and software engineers already have the appreciation to be able to decouple them and deal with them, in terms of whether it’s technical debt or whatever other issue they might actually be having.

Jeff Doolittle 00:37:01 And with that, it reminds me of the distinction between data, information, and knowledge. You know, a static analysis tool might give you data. It might give you information, but it doesn’t give you knowledge. And it seems to me that technical debt, in the way you’re describing it and the way the book is teaching us to think about it and manage it, is helping us move from that data to information, to actual knowledge, which is good because now it’s actionable and we can identify it. And we can have conversations where we actually mean the same things by the words that we’re using, which is helpful if you want to actually communicate.

Ipek Ozkaya 00:37:35 And you can even, God forbid I use the word, quantify it, because now I know what I need to change. And I can actually get concrete, not be overwhelmed by the thousands of issues that are put in front of me saying my software has bad quality, which I might actually already know.

Jeff Doolittle 00:37:53 Right. And the quantification of course, and I know you, you know, shudder maybe to say it, but this is more of a generalized quantification that, you know, this isn’t precision down to, you know, nanometers or something like that. But it’s a general sense of, you know, we can, we can sort of size it. We have a sense of the size. Right?

Ipek Ozkaya 00:38:09 I can prioritize it. I can decide whether I’m going to do it this sprint or that sprint. Then I know which developer will be able to handle it, in how many days, and whatever. So that’s the level of quantification.

Jeff Doolittle 00:38:25 So I know principle six is near and dear to both of our hearts as software architects. So, the principle says architecture technical debt has the highest cost of ownership. Can you tell us a little bit more about what you mean with that principle?

Ipek Ozkaya 00:38:38 Definitely. First of all, for the listeners, I should probably clarify: when I refer to technical debt, I am actually referring to the architecture and design construct. However, what has happened in the community is that people name technical debt depending on the artifact. For example, I do a code analysis and that’s how I’m talking about technical debt, so I call it my code debt. If I’m talking about my infrastructure, I call it my deployment debt or my security debt. I think that actually adds to the confusion. There are two reasons. One, when we’re talking about technical debt, we’re talking about rework and the trade-offs that it enforces. And also, when we said quality relates to some of the structure and behavior of the software, or when we say that when you have all sorts of complexity added to the software you incur technical debt, we’re always talking about these accumulated design aspects of it. So that’s important. And also, why it’s highest: because it becomes layers and layers of implementation on top of a wrong structure to start with. Reworking that actually involves a lot of accumulation. So, for both reasons, architecture has the highest cost of ownership. Technical debt is about architecture more than other things. You might be discovering technical debt through other artifacts, but in the end the resolution is often architectural. So that principle is trying to highlight that strong relationship between technical debt and architectural design choices.

Jeff Doolittle 00:40:08 So if we had cost on the vertical axis and we had poor design on the left of the X and good design going towards the right of the X axis, then we would basically see that the highest point on the curve would be poor design. And it would slope down as you got closer to improved and better design. Okay. So, if you want a high cost of ownership on your system, design it poorly.

Ipek Ozkaya 00:40:33 And we are not the first to actually point that out. There are all sorts of, even the flow and lean software methodologies actually point that out. It’s generally the backlog and how you’re pulling things, and what the rate of addressing some of these issues is, that also relates to how you’re bringing the costs down with good design.

Jeff Doolittle 00:40:53 Principle seven, all code matters. Really? I mean, my code matters more than others. I can hear people thinking, no, but seriously, like what kind of code is there that isn’t treated like, it doesn’t matter?

Ipek Ozkaya 00:41:06 I mean, with all those principles, we went through multiple iterations. There have actually been conversations that Philippe, Rod, and I had through them all. All code matters because resolving technical debt, or discovering it, is not just about the code. First of all, certainly you need to bring it back to the implementation. Yet, for example, the test code matters there as well, because we actually have examples where people, of course, tweak the test code to make the tests pass, but it actually was piling on technical debt. So, there’s that traceability between the implementation, the tests, and the infrastructure. Now, with the CI/CD and DevOps age we’re going through, there’s a lot of build and infrastructure code that actually can incur technical debt, which might hinder your ability to develop at the tempo you want to develop. So, surely technical debt maps to the rework that you need to do in the code, and it’s thought through in terms of the design concept. So that’s really what we’re trying to emphasize, but mostly it was our observation that we allow a lower bar on test and infrastructure code, which I think in the long run creates more negative outcomes than positive ones.

Jeff Doolittle 00:42:18 And to your previous points that involves architecture, meaning you have an architecture for your infrastructure code and you have an architecture for your test code. It just may not be a good one and that will show in pain. And when you see that pain, you can start to say, ah, there’s some debt here. There’s some technical debt here because the design is showing us pain.

Ipek Ozkaya 00:42:41 And the alignment also. So when I fix my code, have I actually aligned my tests and build infrastructure with the fixes I made? Sometimes there are all sorts of mismatches that happen with that alignment that we’ve observed actually created significant issues. And sometimes the time spent is not necessarily on fixing that, but on finding where the issues are, because of the misalignment that might be introduced in the process.

Jeff Doolittle 00:43:05 Right. And I think this principle helps too because, you know, we’ve talked a lot about systems and I think that’s part of the point here: a system is going to have a lot of code of different kinds and qualities. And sometimes some people might have a tendency to say, well, my code matters, or the code I understand matters, or the code in front of me matters, but we’re building a system. And so even though I might be developing, you know, a specific piece of functionality for the system in code, that doesn’t mean I can ignore deployment pipelines, because there might be an impact on, say, you know, some Helm chart out there. I mean, maybe I don’t even know everything about how Kubernetes runs, but I also can’t say it doesn’t matter because it’s someone else’s role or task or whatever. Right? That code still matters.

Ipek Ozkaya 00:43:50 Correct. And actually it also relates to when you have to interface with external systems. A typical example we use to demonstrate this is from an example system that we worked with. The system crashes, developers identify it and patch it. The system crashes again for the same reason, developers identify it and patch it. After quite a number of patches, I think the number was 28 or something, one of the developers said, you know what, something else is going on here. Well, the crash was injected into the system because of an API with an external supplier system. They never patched it at the source. Hence, they kept having to find what was going on. So, the fix is not difficult. Finding the root cause within the system is difficult, and all code matters because it’s not even code you own. It’s code that actually comes from an external system.

Jeff Doolittle 00:44:42 Well, and I think too, it relates to quality in an interesting way. You know, I think of the O-ring problem with the space shuttle, and my understanding, although I might have to verify this, is that the O-rings used on the space shuttle were also used all the way back in the Apollo program. And the only reason they didn’t discover it back then is because they didn’t have the rocket go through a period of freezing temperatures. And it was only when that occurred that we finally had the Columbia disaster. Was it the Challenger disaster? There was a Challenger disaster in the 80s. So, in a similar way, in that case, all components matter. I mean, a simple O-ring, a simple quote, unquote O-ring in a complex system led to catastrophe. And I think in a similar way here, if we say all code doesn’t matter, we might just be setting ourselves up for catastrophe in ways we cannot yet envision or imagine.

Ipek Ozkaya 00:45:30 And also we see this a lot with legacy software, because way back when, for example, to optimize for performance, not all of the security principles were implemented the way they should be. They were disabled for one reason or another, because we didn’t have as powerful processors and whatnot. But now we still have the same software, and we have more opportunities for attackers, and those areas become actual issues in the software system. Again, at the time, if we had the vocabulary, those implementers could have said, you know what? We’re disabling these things to optimize performance. Should our hardware and software change, please enable them again so that we don’t have to deal with some of the consequences. I think those subtleties become important as the software evolves as well.

Jeff Doolittle 00:46:18 And that’s a wonderful point because that expands the meaning of this to another level. It’s meta: all code matters, meaning all the code everywhere ever written matters, because you never know which of that code could create a problem for you, even if you’re not involved in the development of the system. When you think of disasters that are software related, literally you’re saying all code matters.

Ipek Ozkaya 00:46:43 And it goes back to the earlier conversation we had about the software crisis. Why do we have software in crisis? Because we go through these evolutions. At the time, we’re doing the right thing. In this example, disabling the checks was the right thing because we were trying to optimize for performance and latency, which is important for the software’s behavior. Yet now we have a different situation. If we have the vocabulary to express it, the new owners of the software will actually know what to check for, what to look for, and where things are located, because now there’s a vocabulary to be able to actually trace it. So it’s really about all code matters under collective ownership, not within the team, but I guess within the overall life cycle of the software system.

Jeff Doolittle 00:47:27 Well, let’s move on to principle eight, which says technical debt has no absolute measure, neither for principal nor interest. And of course, this is why previously, when we mentioned quantification, I know we both kind of backed away from that a little bit, but if it has no absolute measure, we spoke a little bit about principal and interest. But talk to us a little bit more now about how we can measure it, even if we don’t have absolute measures for it.

Ipek Ozkaya 00:47:52 So one of the things that early on a lot of the organizations got excited about is like, okay, great. Now I have the vocabulary, technical debt. I’m going to run a tool, press a button. It’s going to tell me the dollar amount. And I’m going to compare this with the other software, or I’m going to compare this with whatever I’m being delivered, and I’m going to decide whether I will pay or not. So, I am very sorry to disappoint everyone who’s looking for this: that is never going to happen. What measurement comes down to is, first of all, recording it. If we start recording it explicitly, then we’ll know how much of our system’s development effort is going into technical debt. But the other is being able to understand what the rework is: how am I going to rework each technical debt item, what does it mean, and what do the items that appear mean cumulatively in terms of the measurement? The principal, which is set at the time when you’re taking on technical debt, is actually even harder, because systems will evolve and resolving the debt might actually take a different architectural or design strategy

Ipek Ozkaya 00:48:57 than the one you’ve actually started with, so that’s not an apples-to-apples comparison. And if we bring back the financial metaphor: if I’m taking on $10,000 of debt, I’m paying back 10,000 plus its interest. In the software sense, if I’m making a design trade-off, picking one option over the other, then when I make a different design choice later, I may not actually be redoing it in the same way. I might, after all, have a completely different approach to it. So really, trying to put them in an apples-to-apples comparison is difficult. I think finally, we’re moving away from it. I see more and more development teams recognizing: let’s at least enumerate what we mean, what we are talking about when we’re talking about technical debt. So, we can measure the consequences. We can measure the rework. We can definitely measure some of the downstream impacts of technical debt, but comparing them between different systems, or trying to think that this particular system has a certain amount of technical debt, is not the right way to think about it.

Jeff Doolittle 00:49:56 And the book we can’t obviously go into all the details. That’s why listeners will need to get a copy of the book, but there are helpful diagrams and also helpful conversations about different sizes of companies or different sizes of systems and the kinds of technical debt they might incur. And my question there is, I think I recall, but, but tell me if I’m misremembering here that there’s even possibly different measurements for the different sizes of companies or different things you’re going to focus on. Is that correct?

Ipek Ozkaya 00:50:20 Of course. I mean, it’s fundamentally about how you’re looking into the consequences and the ideal development baseline, I guess, that you’re talking about. In one organization, the consequence of technical debt might actually show up as not being able to add new functionality fast enough. In another organization, the consequence of technical debt may show up as observing more and more defects, because there’s a relationship with some of these. In another organization, the system might actually be so complex that you’re not able to onboard developers. So, what the consequence actually maps to might be different. So, there are different metrics. The good news is these metrics, or these measurement techniques, are not foreign to software engineers. We already have these at our disposal. We are already doing it. We’re already using some of the DevOps pipelines and introducing tools to do some of these checks. It’s really stepping back and, as you said, asking what knowledge that information is providing us that would actually be worthwhile to communicate to the rest of the system developers, as well as other stakeholders: okay, here’s a technical debt item, and we need to talk about this at our sprint, as well as when we’re doing our planning for our resource allocation.

Jeff Doolittle 00:51:38 And the final principle is technical debt depends on the future evolution of the system. How can we depend on the future when we don’t know the future?

Ipek Ozkaya 00:51:49 That’s a good question. And I think it’s about accumulation of the consequences and that rework. So, when we decide whether we would like to actually reduce or rework this particular aspect of the system or not, we’re making a decision, even today without the technical debt concept, based on our anticipated use of the system in the future. So that anticipated use puts a stress, or not, on the particular system, with or without technical debt. And I assess it based on that anticipated use and rework, because if I don’t anticipate that the part of the system that has technical debt will change often, or the consequences are not observed, then I would not have any reason to spend resources on paying it down. So that’s really what it’s trying to emphasize. And that’s how we make our decisions when we’re making our other system prioritization decisions as well. So, it’s not just unique to technical debt, it’s actually the way our planning works.
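
The reasoning in this answer can be written down as a small decision rule. What follows is a minimal sketch, not a method from the book: the `DebtItem` class, its field names, and the break-even comparison are all assumptions made for illustration, and the numbers are a team’s own rough estimates for its own system, since principle eight rules out any absolute measure.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """A recorded technical debt item with team-local estimates (all hypothetical)."""
    name: str
    rework_days: float            # estimated effort to pay the debt down now
    extra_days_per_change: float  # estimated extra effort each time this area changes
    expected_changes: int         # anticipated changes to this area in the planning horizon

    def expected_extra_effort(self) -> float:
        # Effort the debt is expected to cost if left in place.
        return self.extra_days_per_change * self.expected_changes

    def worth_paying_down(self) -> bool:
        # Pay down only when the anticipated consequences outweigh the rework.
        return self.expected_extra_effort() > self.rework_days


items = [
    DebtItem("Billing module coupling", rework_days=10,
             extra_days_per_change=2, expected_changes=8),
    DebtItem("Legacy report generator", rework_days=15,
             extra_days_per_change=3, expected_changes=0),
]

# Items in areas that are not expected to change drop out of the list.
to_pay_down = [item.name for item in items if item.worth_paying_down()]
print(to_pay_down)
```

The second item is left alone even though it is costly per change, because no change to that part of the system is anticipated. The estimates only support a decision within one system and one planning horizon; they are not comparable across systems.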

Jeff Doolittle 00:52:51 Well, we’ve also talked a lot about design and architecture, and good design. So, you know, future evolution, wouldn’t that be constrained somewhat by a better design, which would then affect the impact of technical debt on you going forward? Meaning the evolution of the system will be constrained somewhat by the architecture and the design of the system.

Ipek Ozkaya 00:53:09 Of course, yes. Yes. And actually, the moment you put a design or a version one system out there, you’re already providing constraints on how you’re actually going to evolve it. And then the question is whether you’ll be able to live with those constraints that the current architecture poses, versus whether you need to relax these constraints, which is to rearchitect and rework the system.

Jeff Doolittle 00:53:31 Yeah. And I point listeners back to a previous episode with Juval Löwy on his book, Righting Software, where he talks about ways that you can define these components of your system, or these interfaces and contracts between the systems, in ways that actually can help you constrain the system. So that as it evolves, you’re not constantly evolving your architecture, because if you’re constantly evolving your architecture, your system is going to be unwieldy and complex. And you’re probably going to have a lot of technical debt.

Ipek Ozkaya 00:53:58 And that’s also, as you mentioned earlier, a good design actually reduces your risk of taking on technical debt because you’ve made some of these trade-offs early on. This doesn’t mean design up front. This doesn’t mean you don’t evolve your system. But it really is about understanding what some of the constraints are, understanding architectural drivers, and understanding which ones might change and which may not change.

Jeff Doolittle 00:54:25 Sure. Well, and you mentioned design up front. I’d clarify that with big design up front, which has also, I think, become a straw man and a boogeyman, and it gets bandied about just like YAGNI: you aren’t gonna need it. And what ends up happening is no design up front. And I think what we need is just enough or, you know, some design up front that’s enough, but definitely not no design as a response to too much design.

Ipek Ozkaya 00:54:45 Listeners may or may not agree with me on this. These are my personal, I guess, beliefs, and I very strongly stand behind them. When you’re talking about technical debt, you’re really talking about design and architecture trade-offs and their rework consequences. That’s number one. And number two is, assuming that you don’t know anything and things will emerge is really not looking into the problem carefully enough. There are always things that will not change as quickly as you anticipate. And there will always be some drivers that will help you make the design choices early on, and then you can iterate on them and continue to evolve the system. But assuming that you know nothing that would let you make architectural decisions is probably a naive perspective.

Jeff Doolittle 00:55:36 So if I’m a listener and I want to begin taking action based on what I’ve learned, and I want to get a copy of the book and have some next steps, my first question might be: this sounds like a lot of work, identifying technical debt and tracking technical debt and quantifying technical debt, even though it’s not absolute, but some quantification. So, what’s your response to listeners for, say, different sizes of system or different sizes of team, you know, who obviously would have different constraints in that regard? If they say this just seems too daunting, I’m not going to take the time, how would you respond to them?

Ipek Ozkaya 00:56:10 So I would appreciate if it’s daunting from the perspective of, okay, I have never heard of the concept, so how do I relate it to things I know and have some appreciation for? Because in a way, technical debt looks at concepts that we’ve already known (architectural trade-offs, design trade-offs, software maintainability, and rework) with different eyes, so that there’s a vocabulary. So that part I might appreciate, until they’ve done some homework. However, the rest I think we already do anyway. Developers already talk about technical debt. Developers already talk about aspects of their system that are hard to rework. They already know. Think of all these little notes that are introduced into the comments, “to do,” “this is bad,” with all sorts of vocabulary. So, there are actually already existing practices that we have. I think if they want to do one thing, that one thing is: start putting technical debt into whatever system you’re using to communicate tasks, user stories, defects, whatever you call them, and identify it with a clear label. That way you at least have: okay, are we overdoing it? Are we on the same page? Are we really identifying things that are going to cause us pain in the long run? That will create information for the team. And that’s really not a very daunting task. And after that, you can use whatever you’re using as part of your software project management, software release planning, and iteration planning practices. It’s really that easy from that perspective.
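
To make the labeling suggestion concrete, here is a minimal sketch of what recording debt with a clear label could look like. It assumes a generic in-memory backlog rather than any particular issue tracker: the `WorkItem` class, the `technical-debt` label name, and the sample items are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

TECH_DEBT_LABEL = "technical-debt"  # hypothetical label name; use whatever the team agrees on


@dataclass
class WorkItem:
    """One entry in the team's existing backlog (task, story, defect, or debt item)."""
    title: str
    effort_days: float
    labels: set = field(default_factory=set)


def debt_items(backlog):
    """Return only the items the team has explicitly labeled as technical debt."""
    return [item for item in backlog if TECH_DEBT_LABEL in item.labels]


def debt_effort_share(backlog):
    """Fraction of the recorded effort that goes into labeled technical debt."""
    total = sum(item.effort_days for item in backlog)
    if total == 0:
        return 0.0
    return sum(item.effort_days for item in debt_items(backlog)) / total


backlog = [
    WorkItem("Add export feature", 5.0, {"feature"}),
    WorkItem("Untangle billing/reporting dependency cycle", 3.0, {TECH_DEBT_LABEL}),
    WorkItem("Fix crash on empty input", 2.0, {"defect"}),
]

print([item.title for item in debt_items(backlog)])
print(f"{debt_effort_share(backlog):.0%} of recorded effort is technical debt")
```

Once items carry the label, the team’s existing planning practices can take over. The share of effort is a local signal for the team’s own conversation, not a figure to compare against other systems.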

Jeff Doolittle 00:57:50 Right, so as I was saying before, when we were discussing design, you want just enough design. I imagine it’s similar here with management. If you’re a small team versus a large team, your technical debt management principles might differ.

Ipek Ozkaya 00:58:01 And also, if you are a small team, it might be good enough just to talk to each other, versus if you are a large team and running a large organization, there’s more coordination and organization needed. And obviously then the consequences also snowball a lot quicker as well.

Jeff Doolittle 00:58:17 Right. But the book would still be helpful for the small team. I imagine from the standpoint of clarifying what they’re talking about, even if talking about is all they need to do.

Ipek Ozkaya 00:58:26 And also, I would assume a small team might be a startup, right? And if you really want to be a large team, what do you really need to be aware of along the way? So, I think that small teams might take all different flavors as well, depending on what kind of a domain you’re in. But yes, I think regardless of the team size and the project size, this is a fundamental concept that we need to teach our software engineers. And that’s also important to recognize.

Jeff Doolittle 00:58:56 Well, if people want to find out more about what you’re up to, where can they go?

Ipek Ozkaya 00:59:00 I work at the Carnegie Mellon University Software Engineering Institute. All of the practices that we develop are actually available; we share them. That’s why it’s exciting to be on this show, so that we can share some of this knowledge with listeners as well. Our website is www.sei.cmu.edu, and our podcasts, as well as some of the papers that we’ve written, including case studies, are available there, as well as the book, if anybody’s interested.

Jeff Doolittle 00:59:30 Well, thanks so much for joining me today on Software Engineering Radio, Ipek.

Ipek Ozkaya 00:59:35 Thanks for having me Jeff.

Jeff Doolittle 00:59:37 This is Jeff Doolittle for Software Engineering Radio. Thanks for listening.

[End of Audio]

 

 

SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
