Felienne talks with Peter Hilton on how to name things. The discussion covers: why naming is much harder than we think, why naming matters in programming and program comprehension, how to create good names, and recognize bad names, and how to improve your naming skills.
Venue: Felienne’s residence, Rotterdam
- To camelcase or under_score by Binkley et al, http://ieeexplore.ieee.org/document/5090039/
- Domain Driven Design by Eric Evans, https://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software-ebook/dp/B00794TAUG
- Code Complete by Steve McConnell, https://www.amazon.com/Code-Complete-Developer-Best-Practices-ebook/dp/B00JDMPOSY/
- Clean Code by Robert C Martin, https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882
- How do Range Names Hinder Novice Debugging Performance? by McKeever and McDaid, https://arxiv.org/ftp/arxiv/papers/1009/1009.2765.pdf
- How to Name Things: the solution to the hardest problem in programming, https://skillsmatter.com/skillscasts/5747-how-to-name-things-the-solution-to-the-hardest-problem-in-programming
- Peter’s blog, http://Hilton.org.uk
- Peter Hilton’s twitter: https://twitter.com/peterhilton
Transcript brought to you by innoQ
This is Software Engineering Radio, the podcast for professional developers, on the web at SE-Radio.net. SE-Radio brings you relevant and detailed discussions of software engineering topics at least once a month. SE-Radio is brought to you by IEEE Software Magazine, online at computer.org/software.
* * *
Felienne Hermans: [00:01:11.22] Hello, I’m Felienne Hermans for Software Engineering Radio, and I’m here today with Peter Hilton.
Peter is an independent software consultant and he’s currently working for Signavio in Berlin. If you’re a Scala dev, you might know him from his book “Play for Scala”, but I know him because two years ago I saw a fantastic talk by him about the topic of naming. Peter, you’re here to talk about naming. Welcome to the show!
Peter Hilton: [00:01:36.13] Thank you.
Felienne Hermans: [00:01:36.13] So they say there are only two hard things in computer science – cache invalidation and naming.
Peter Hilton: [00:01:42.14] That’s exactly right. It’s a famous quote by Phil Karlton, who said that there are only two hard things in computer science, indeed – cache invalidation and naming. This is funny, which is part of the reason why we all know this, and it is funny for a good reason – because it’s true. Cache invalidation is famously hard, but that’s okay because it sounds like it should be hard, so we expect that.
Naming we don’t expect to be hard, so that’s why the joke is even more funny, because it violates our expectation that hard things should sound like technical things, and yet naming turns out to be hard.
Felienne Hermans: [00:02:18.24] Why is naming hard?
Peter Hilton: [00:02:21.27] That’s not immediately obvious. It’s hard because it’s creative, we have to think of things, but that can’t be the whole reason because there’s a lot of creative thinking and design thinking in programming. A better explanation is that naming is hard because it’s communication, and it’s communicating with people, which is harder than communicating with computers.
Felienne Hermans: [00:02:44.10] That seems to be quite right. Let’s take a step back here. Why do names matter? Why is it so important to communicate with other humans? Programming is ultimately about communicating with a computer, isn’t it?
Peter Hilton: [00:02:58.25] Right, but we like our code to be beautiful, and names are a part of that. We want people to look at our code and say, “What a nice name”, in the same way that when you introduce yourself you want people to at least not be horrified by your own name. We want our code to be beautiful and clean, and we want that to include the names.
I don’t think that’s the most important reason why names matter. Names matter because, as we said, code is human communication as well as communicating with the computer, and the crucial role that names have is explaining what we mean, not just what the code does. If you’ve ever got into an argument with somebody, you’ll know that trying to explain what you mean can be very difficult.
Felienne Hermans: [00:03:40.19] Naming is also something creative. You have to really think about “What is the thing I’m programming here?” Is that part of why it’s so hard?
Peter Hilton: [00:03:50.20] Yes, absolutely. You have to choose between lots of options that seem maybe right, but are maybe not quite right. Anybody who’s ever named a child knows that this is not an easy thing to do, especially if you have to commit to it… Children don’t have a rename refactoring.
Felienne Hermans: [00:04:06.21] Yes, so take extra care in that situation. Talking about rename refactoring, I know this is one of the most used refactorings of all the refactoring tools. Do you think people often change their mind about names in a domain?
Peter Hilton: [00:04:24.01] They do change their mind, and they do use the rename refactoring in their tools, but I don’t think they use them often enough, which is ironic because it’s perhaps the safest refactoring there is. Maybe this is because renaming is harder than naming. Renaming is the harder thing because it includes naming; you have to come up with a new name, and that’s not necessarily easier than coming up with the first name.
Also, if you rename something that people have got used to, you’re going to have to go and talk to them about the new name and get them used to the change. So it’s naming plus hard things – including talking to people – and change.
Felienne Hermans: [00:04:59.13] Maybe the bigger a codebase is, the more people work on it, the more people are used to something, the harder it gets to change a name.
Peter Hilton: [00:05:06.06] Exactly. And people get used to names even if they’re bad names. You don’t even need the name anymore for its initial communication, suggesting what something means; it can be just an arbitrary word, but once you’ve got used to it, you know what it means. You can completely make up names. It could be anything, and we get used to them. If that wasn’t true, then we’d have real trouble with product names.
Felienne Hermans: [00:05:30.06] Yes, even if they change, sometimes people keep referring to them by their old name because it’s so hard to cope with the change.
Peter Hilton: [00:05:36.24] Yes, absolutely.
Felienne Hermans: [00:05:37.25] Talking about bad names, can you give a few examples of names that are really bad, that all our listeners should definitely avoid in the future?
Peter Hilton: [00:05:46.25] Right, there are some really obvious bad names, which are the obvious placeholder names. When you’re reading some code that somebody else has written and you come across names like “Data” or “Object” or “Thing”, you know that they weren’t even trying, you know that they just gave up. You’ve got a variable called “theThing”, and it could clearly be anything. You have these obvious bad names – like “foo”, as well, the placeholder. When you see this in the code, you know it’s a bad name. The person either gave up, was having a bad day, or it was former you, it was your own self, six months earlier and you just couldn’t be bothered. Now you’re paying for it, because you have no idea what you meant.
Felienne Hermans: [00:06:28.05] There must be the opposite, as well. What are some good names? Can you give us some advice on how to name things?
Peter Hilton: [00:06:35.07] Good names are more slippery. When a name is good, it’s good in a particular context. You can’t see the endless toil it took to come up with this name. Let’s say you’re working on a logistics system and you’ve finally figured out that this thing is a shipment and it’s not a consignment. “Shipment” could be a really good name because that’s what people call it. If the people in the warehouse call it a shipment, then it’s a shipment, and that is the correct name. Things sometimes have a right name, but you can’t really see from looking in the code, “Oh, this is a really good name”, and I can’t tell you “Shipment” is a good name because it’s only a good name if that’s the name it should be in that context.
[00:07:16.13] Good names are harder, and they also have that element… If you manage to have someone on your team who’s good at naming and writes this amazing code with these great names, it might just look too easy and you might not realize that they’re great names.
Felienne Hermans: [00:07:30.19] So it’s a skill that sometimes goes unnoticed.
Peter Hilton: [00:07:33.28] Yes, absolutely. It’s something you notice by its absence. You notice the bad names when you’re maintaining code. You tend not to notice the good names so much because you just understand the code.
Felienne Hermans: [00:07:43.23] Yes, it just goes very easy and you think “It’s good code” and it might be due to the names, but that isn’t as visible maybe as a lack of [unintelligible 00:07:48.21]
Peter Hilton: [00:07:51.01] Right, exactly.
Felienne Hermans: [00:07:52.00] Something is interesting in that story about shipments – the domain plays a big role, so is the advice here that you need to go out and talk more to the domain experts?
Peter Hilton: [00:08:05.16] Yes, exactly. Naming has a whole chapter in Domain-Driven Design – Eric Evans’ famous book – and it is a very important part of domain-driven design and an important part of a particular subject matter domain, because that’s where names often come from. Things have a name, and if you can write your code in terms of those names, you’re probably going to have better, more understandable code. These are the names that should mean something to everybody already.
Felienne Hermans: [00:08:33.04] To everyone in the domain. That, of course, doesn’t hold for all the developers in your team, that they all know the domain through and through.
Peter Hilton: [00:08:41.22] Right, that’s true. But at least if you have the right domain name, then you don’t necessarily have to explain it yourself just because you wrote the code. Potentially, anybody in the domain can explain to you what is the difference between a shipment and a consignment, or the fact that there isn’t one, perhaps, because “They’re synonyms, but in this company we use this one.”
[00:09:00.04] A domain name potentially appears in the dictionary somewhere, and crucially, it potentially appears in a dictionary that you didn’t have to write yourself.
Felienne Hermans: [00:09:09.02] Coming up with good names could not only improve your source code, but it could also improve communication between the developers team and the domain, because when you start talking about names, what you’re actually doing is trying to understand what the domain is doing. Am I getting that right?
Peter Hilton: [00:09:26.00] Yes, absolutely. As developers, we have to talk to other people about our code or about our software. Even if we don’t show them the code and expect them to say, “Hey, you’ve got the right name there”, we’re having a conversation with non-coders about the software we’re writing. This goes a lot better if we use the right language and if we use consistent language in all of our conversations: with the compiler, but also with the humans.
[00:09:49.19] This is another central idea in domain-driven design – this is the ubiquitous language. It’s not just using the right names, it’s using them all the time.
Felienne Hermans: [00:09:57.22] Yes, consistency is important. We’ve talked already a little bit about bad names, but you have a thing that you call “naming smells”, a few code smells that relate specifically to naming. Can you give a few cool examples?
Peter Hilton: [00:10:18.21] Right, exactly. Having bad naming is just one particular kind of code smell. I’ve mentioned a couple already. Meaningless names like “foo” and “Data” – the first name smell. “Foo” is clearly meaningless. “Data” is not meaningless, it’s true, but it’s too abstract to be remotely helpful.
Felienne Hermans: [00:10:39.27] Anything is data.
Peter Hilton: [00:10:40.24] Right, exactly. So that’s not really helping at all. As you get progressively more useful, you still have bad names, just perhaps less bad. You have vague language, vague verbs, most famously “get”. If you have a method name or function name whose first word is “get”, you’re just not being very imaginative or specific about what exactly this thing is doing.
Felienne Hermans: [00:11:06.04] But it’s getting stuff, what’s wrong with that?
Peter Hilton: [00:11:08.22] Well, it could only be worse if you said it was doing stuff. Doing stuff and getting stuff is what all functions do. That’s like calling a variable data – sure, it’s true, but it’s not very specific. There are much more interesting words that you could probably use instead, like “fetch” or “calculate” or “derive”, that are more specific and could be wrong. So yes, “get” might work for every function that returns something, but it doesn’t tell you what it’s doing. Is it just giving you a field value, is it doing a search in a database, is it calling out to an external system? Those kinds of differences typically matter.
[00:11:43.10] It’s worth mentioning that “get” is popularized in Java, particularly by the JavaBean standard, which sort of compensates for the lack of properties in the language, but has trained a whole generation of programmers to call all of their methods “get”, unless they’re called “set”. This is not very helpful, but it’s no different from using vague nouns as well in class names. If you’ve got a class that is some kind of manager, that’s just not very helpful or specific. Classes manage data, that’s what they do, so calling a class a manager is not as helpful as calling it a builder. What kind of management is it doing? Is it a builder, is it a calculator, is it a data access class? There are a lot of more specific ways to describe responsibilities.
[00:12:34.03] It should be obvious that manager is not very specific because if you’re introduced to a colleague and told that they’re a manager, do you have any idea what they do? Probably not. I’ll keep going, I guess…
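To make the point about vague verbs and vague nouns concrete, here is a minimal Python sketch. It is not from the conversation: `Order`, `OrderLine`, `OrderRepository` and all their methods are hypothetical names invented for illustration, and "repository" is only one of several possible replacements for "manager".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class OrderLine:
    unit_price_cents: int
    quantity: int


class Order:
    def __init__(self, lines: list[OrderLine]) -> None:
        self._lines = lines

    # Vague verb: "get" does not say whether this reads a field,
    # computes a value, or calls out to another system.
    def get_total(self) -> int:
        return self.calculate_total_cents()

    # Specific verb: the name says the value is computed, and in which unit.
    def calculate_total_cents(self) -> int:
        return sum(line.unit_price_cents * line.quantity for line in self._lines)


# Specific noun: instead of a catch-all "OrderManager", this class has one
# stated responsibility, which is storing orders and looking them up.
class OrderRepository:
    def __init__(self) -> None:
        self._orders_by_id: dict[str, Order] = {}

    def save(self, order_id: str, order: Order) -> None:
        self._orders_by_id[order_id] = order

    # "find" signals a lookup that may come back empty.
    def find_by_id(self, order_id: str) -> Order | None:
        return self._orders_by_id.get(order_id)
```

Both `get_total` and `calculate_total_cents` return the same number; the difference is only in how much the caller learns from the name before reading the body.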
Felienne Hermans: [00:12:46.14] Yeah, keep going, all of them. I’m like, “Oh, I did that, I did that, I did that too…” Hit me up with a few more, I’m sure I’m guilty of all of them.
Peter Hilton: [00:12:54.05] Okay, let’s go for a few more. Single-letter names. If you’re not sure whether a name is long enough and it’s one letter, it’s not long enough. One letter is not long enough to communicate anything. Most words are more than one letter, so it’s a special kind of abbreviation.
If it’s one letter, if it’s “A”, you don’t know whether that’s meaning it’s a placeholder or it’s the first letter, whether it’s an abbreviation for a word that begins with A, whether it’s some standard accepted meaning… This is very ambiguous, and the intent of the code is not clear.
[00:13:31.19] This is true of abbreviations in general. Abbreviations work when we know exactly what they mean, but they catch us out. We’re quite used to “char” meaning “character”. I remember once on a project years ago coding quite an elaborate bug based on my assumption that “char” meant “character”, and I think it was a database attribute or something, and it didn’t. It meant “characteristic”, which meant something quite specific in that domain and caused me to just make some wildly wrong assumption and code a quite significant bug. Abbreviations will always get you; they will always get you in the end.
[00:14:10.06] Those sorts of names were too short, but names can be too long as well. If you use overly simplistic language in names, you can end up with convoluted long names like “AppointmentList”. That might make sense technically because it might be a list object that contains appointment objects…
Felienne Hermans: [00:14:28.13] That seems like a very good name.
Peter Hilton: [00:14:30.18] It might seem quite reasonable, and it’s not as bad as the vague names; it’s true enough, but it’s not a great name. We’re now in the territory of average names. This is an average name, but a much better name for “AppointmentList” is “Calendar”, because we normally call that a calendar. Or if you have a “CompanyPerson”, we normally call those “employees”, unless it’s the owner of the company or a shareholder.
These generic average names, as well as being unwieldy, are the wrong names because they’re not what you would use in the domain, and they’re often not as specific as they could be. It’s very useful for names to be as specific as possible, because that gives them more meaning.
[00:15:16.20] Finally, to go back to the whole discussion about domain language names, you can have the wrong name. I worked on a logistics project where “shipment” and “consignment” meant the same thing. “Shipment” was the right name because that’s the one people used, but an order is not the same thing as a shipment. If you say “order” when you mean “shipment”, you just have the wrong name. That’s really gonna confuse people. That’s going to sound maybe reasonable in a sentence or look reasonable in code, but it’s just going to be wrong.
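The step from an average name to a domain name can be sketched in Python as follows. This is an illustration under assumptions rather than code from the conversation: the `Appointment` fields and the `schedule` and `appointments_on` methods are hypothetical, chosen only to show how the domain word "calendar" suggests behaviour that a bare "appointment list" does not.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass
class Appointment:
    title: str
    starts_at: datetime


# An "average" name would be AppointmentList: technically accurate, because
# it holds appointments, but not the word people use for it.
# The domain name is Calendar, and it invites domain-shaped operations.
@dataclass
class Calendar:
    appointments: list[Appointment] = field(default_factory=list)

    def schedule(self, appointment: Appointment) -> None:
        self.appointments.append(appointment)

    def appointments_on(self, day: date) -> list[Appointment]:
        return [
            appointment
            for appointment in self.appointments
            if appointment.starts_at.date() == day
        ]
```

A caller can then write `calendar.appointments_on(some_day)`, which reads the way someone would describe the task out loud.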
Felienne Hermans: [00:15:46.18] So again, understanding the domain is very important for naming.
Peter Hilton: [00:15:50.03] Yes.
Felienne Hermans: [00:15:50.24] I would like to go back to one of the things you said in a bad naming smell, which was the one-character variables, because there is a domain where one-character variables are totally fine, and that’s mathematics. They’ve just said, “V means Velocity.” We all agree that’s what it means, we don’t need to explain it everywhere. We want to have short and beautiful proofs with short variables, because that’s concise. Why are they different? Are they smarter than we are?
Peter Hilton: [00:16:18.05] I’m not sure I want to comment on whether mathematicians are smarter because I studied mathematics as an undergrad at university. I certainly thought I was smarter, but anyway… I also remember that mathematics textbooks did read very differently to code. As a programmer it is strange to look at mathematics and to see these formulas and proofs that are full of single-letter names, because that’s what you do in mathematics, you just use one letter for everything. In the end you’ll run out of letters, but that’s okay because you just use more alphabets; you’re not stuck with ASCII, like programmers largely still are.
[00:16:52.26] So you go for the Greek alphabet as well, and when you’ve run out of that, then you switch to the Hebrew alphabet. You just use more characters, but single-letter names. It’s all based on single-letter names, but it’s not entirely true that you rely on everybody already knowing what these mean. In some fields, for example mechanics, certain letters will always be used for the same thing. You’ll always use the same letter for velocity, or angular velocity. However, in general that’s not true for everything.
[00:17:25.02] What you have to do when you look at the book and you see this formula or proof with the single-letter names is to look at the page before that, and you have a whole page of explanation of what’s coming. Each name has a sentence or more explaining what it means. In programmer terms…
Felienne Hermans: [00:17:44.23] …they use comments.
Peter Hilton: [00:17:46.23] Right, exactly. In programmer terms, mathematics is written like the block comments are longer than the code itself. That’s not a popular approach in code.
Felienne Hermans: [00:17:55.16] Why do you think that is? It has worked for mathematics for centuries, so we could do that – pick single-letter identifiers, but then have lots of documentation on their meaning.
Peter Hilton: [00:18:05.15] Yes, we could do that. I’m not entirely sure why mathematicians do this and never worked out how to get rid of it, but as a programmer my experience has been that we much prefer writing code to writing English. If we can possibly get away without writing comments, that’s exactly what we’ll do.
Felienne Hermans: [00:18:24.19] That’s a bit of trickery, because if you really think about good names, you are writing English; they are just in your code, rather than somewhere else, but that is a very literary type of activity, the way you describe it.
Peter Hilton: [00:18:38.09] Yes, absolutely, and there is a definite link here. A lot of the time if you’ve got a not very good name, then the only way to make the code understandable is to add a comment to explain what it means. Often, that’s how you can tell you’ve come up with a good name – people can understand your code, and they don’t need the comments. It is sort of the same activity, but in the naming you’re abbreviating things.
It’s like coming up with a newspaper headline, or a tweet. If the headline is good enough, people know what the article is gonna be about. If you get it wrong, they’re misled. That happens too, sometimes deliberately… Hopefully not in code. If your name is not good enough, then you will have to continue the explanation.
[00:19:21.22] A name is like a summary. Writing a summary is often harder than writing the longer text, and that’s another example of that. The link with writing English or any other language is very strong.
Felienne Hermans: [00:19:33.05] Are you saying we don’t need documentation anymore if you’ve got fantastic names?
Peter Hilton: [00:19:40.25] Nice try. No, there are very few times when you don’t need any documentation. You often need a lot less than you think you need or used to need, and you can often have much less than that if your code is better. But we fundamentally still need documentation for things that we can’t explain in the code.
Talking about comments, we’ve touched on the difficulty of naming things and having the code be clear without the comments. The reality is that not all of our names are gonna be that good, and we’ll need some comments to explain what we mean. And by all means, improve the code and get rid of those comments, but realistically it turns out that naming is hard and therefore you’re probably going to still need some comments.
Felienne Hermans: [00:20:22.16] Is this the only reason we still need documentation, to battle our lack of good names? Can you give specific examples where documentation will always be adding to the value of the code, even with perfect, Peter Hilton qualified/certified names?
Peter Hilton: [00:20:38.27] Yes. Naming is just one of the reasons we need documentation. Many of the others, like bad names, are about bad code in general. Let’s suppose that the code, as you suggest, is absolutely perfect. You’ve removed most of the documentation and comments; what’s left is explaining why. Code itself can’t really explain why you wrote it, why it exists. Our programming languages are not that self-referential. You might see exactly what it means and what it does, but you won’t know why we have it, why did somebody write this code, why did we not skip writing this code. Explaining what it’s for or why is typically the last comment that you can’t remove in a class, and on a higher level it’s typically the kind of documentation you need for a computer system in general – you need to explain why it’s here and why these choices were made.
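The difference between what code does and why it exists can be shown with a small Python sketch. The business rule in the comment is invented for this illustration and does not come from the conversation; `next_dispatch_date` is likewise a hypothetical name.

```python
from datetime import date, timedelta


# Why this exists (hypothetical rule, invented for this sketch): the
# warehouse does not dispatch at weekends, so a shipment requested for a
# Saturday or Sunday leaves on the following Monday. The function name and
# body show what happens; only the comment can record why.
def next_dispatch_date(requested: date) -> date:
    days_past_friday = requested.weekday() - 4  # Monday is 0, Friday is 4
    if days_past_friday <= 0:
        return requested
    # Saturday (1 day past Friday) moves 2 days, Sunday (2 days past) moves 1.
    return requested + timedelta(days=3 - days_past_friday)
```

Good names carry the "what" here. Without the comment, a maintainer could still read the weekend logic but would have to ask someone why it is there.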
Felienne Hermans: [00:21:36.06] Yes, design decisions.
Peter Hilton: [00:21:37.17] Yes, exactly. You don’t need very much, but you typically want some kind of architectural overview that gives you a summary of what you have and explains why we have this. A summary is important as well, so that’s the other reason we need documentation. Everything may be explained in complete detail in the code, but it may be explained in complete detail in 100,000 lines of code, and that might take some time to read, so you might want some kind of one-page management summary.
[00:22:07.05] To use an example I’ve used earlier, if you haven’t heard of the book, and I encourage you to read War And Peace… You might ask me, “Well, what’s it about? Is it something I’m gonna like?” and if I said, “I mean, this is classic literature.” You can see it’s very long – it’s a big, thick, heavy book.
Felienne Hermans: [00:22:23.13] It seems legit.
Peter Hilton: [00:22:24.20] Right, but it’s so well-written, it’s entirely self-documenting. That’s not really a good answer, but we use the argument with our code. So we want that one-page introduction so the architectural overview, the functional overview, the business context that explains why we’re doing this thing in the first place.
Felienne Hermans: [00:22:40.16] Again, a little bit like a summary…
Peter Hilton: [00:22:43.12] Yes, exactly that. Explaining the why and giving the short introduction is the documentation that is left over. If you skip that, then you’re just making things hard for people.
Felienne Hermans: [00:22:54.04] That seems fair.
Felienne Hermans: [00:23:26.25] Can you give us some guidelines on how to make good names? We’ve already talked about the smells, but how can I get better at this?
Peter Hilton: [00:23:35.17] Well, if you’re listening to this podcast, then you’re probably halfway there. The strangest thing about the difficulty of naming is that as programmers we tend to give up. Maybe we’re too intimidated by this claim that it’s one of the two hard things and we just don’t try to come up with good names; we just accept that naming is hard and we accept we’re going to have bad names and we leave it there. But if you try at all, you can improve every name. Pick a name, work with the code a bit, and when you finish working with the code – for example, before you submit a pull request – think, “Okay, with these names I’ve created… Do I understand the code better than when I created this name? Can I come up with a better name?”
[00:24:18.11] Simply trying to come up with a better name is a big step in actually doing so. Provided that you actually do the rename refactoring, then that’s the most important step. The most important step is to try. The only reason that I’ve improved in this is that I’ve been trying to come up with better names for a long time.
Felienne Hermans: [00:24:37.14] I don’t think it’s actually true that people don’t pick good names because they think it’s very hard. I think it’s not a concern for many people. We’ve been educated quite a bit about code smells – “Don’t make your methods too long, don’t have 25 parameters in your function.” These things have sort of become normal behavior for developers; we’re not there yet in terms of naming. People don’t really see that bad naming code smell the way you see it. Obviously, “thing” and “data” are bad, but “AppointmentList” seems fine.
Peter Hilton: [00:25:14.13] I do agree. It’s not a priority, it’s maybe not the thing we’re most focused on, and it’s harder to see because it’s a softer code issue because it’s about the human communication. It compiles, so how bad can it be? Depending on your programming language, but that’s a topic for another podcast.
I do agree that a lot of the time we’re just happy if we get the code to work and we’re not thinking so much about maintenance or the human communication aspect of the code. It’s maybe a shame that not every software developer experiences the horror of a maintenance project where they have to figure out how to maintain and fix and improve code written by somebody else who’s not around. But even without that learning experience, hopefully it’s possible for programmers to feel inspired to try and come up with better names and to realize that this will make their code much better if they can achieve it.
Felienne Hermans: [00:26:15.24] Is this something you can typically do yourself, or would pairing be very good here? Because as you describe this, “Well, you look at your own code and you see if it makes sense…”, but it’s sometimes very hard because something has obtained meaning for me. If that variable has always been called “data”, I now know that thing as “data” and it’s totally fine for me. It has more meaning because that’s in my memory and I’ve worked with the code. Would pairing help? Maybe pairing with people that don’t know the code that well.
Peter Hilton: [00:26:49.18] Yes, pairing is actually much better than what I suggested earlier, that you would be disciplined enough to review your code for naming before you commit it. Realistically, that’s only going to happen if you decided as a team that this is a priority and you have some kind of checklist. Otherwise, let’s face it, you’re going to skip this.
Pairing works especially well because of the way that when you pair, you typically think about the code that you’re both looking at on two different levels.
[00:27:16.10] The person driving with the keyboard is looking at the line of code level, and the person sitting next to them, as they get gradually more bored or…
Felienne Hermans: [00:27:23.21] Confused…
Peter Hilton: [00:27:25.12] …confused, trying to engage with the code, often takes that step back and looks at the whole screen and tries to figure out what it all means. A very common question in pairing is “What is that? What is one of those? What does it mean?” Well, these are, in part, questions about naming. A better name would be the right answer to those questions.
I do fear it sometimes, as my experience with pair programming is that it’s not easy to make renaming happen during pairing. You’re often focused on something else. But pairing is probably the best way of dealing with this while coding.
Felienne Hermans: [00:27:58.07] Yes, and also what you’re saying is probably right – people won’t attribute that to a bad name, specifically. They might say, “This method is too long”, but they won’t say, “That’s a bad name.” But they might say, “I’m confused. What does this do?”, which is secretly the bad naming smell rearing its head, but it’s not very clear, it’s not vocalized as a bad name.
Peter Hilton: [00:28:20.09] Exactly. That’s a habit that’s worth acquiring, which is to identify difficulties that are probably caused by a bad name, or that may be fixed by improving a name.
Felienne Hermans: [00:28:31.23] We touched upon programming language a little bit and you said that’s a topic for another podcast, but I’m still curious if there are programming languages that encourage programmers to name better? Is there any language, or maybe IDE, that gives me some support?
Peter Hilton: [00:28:52.25] This is not something I’ve thought about that much, so I’m not sure. There are two things that are relevant here. One is the difference between functional programming and, for example, object-oriented programming. The other is the culture that comes along with a programming language. By functional programming I’m particularly thinking about strongly typed functional programming. The thing about strongly typed languages is that you have to name the types, as well as the things.
[00:29:22.29] In a way, this makes the problem harder because you have to name more stuff, but I think in practice it makes the problem easier overall. The problem is split in two. Sometimes you don’t really need to name the thing; just knowing its type is enough. Sometimes naming the thing is the second part of the name – you start with “What is it?” and then you name it. Strong typing I think helps overall.
[00:29:51.24] The link to functional programming is that in functional programming you often don’t have to name the thing; sometimes you’re really just coding with the types, and the types are more abstract and I’m guessing that’s a little bit easier to name. I’m unsure about that, but that’s a possibility. It’s certainly a different experience to coding in a language where everything is effectively a string or an object and you have to name every single thing and you have to include also whatever information is necessary to know what kind of thing it is.
Felienne Hermans: [00:30:24.04] And then you would have to code that in the name, whereas in a typed language you would code that in the type.
Peter Hilton: [00:30:30.21] Right, except you don’t have to code that in the name, so maybe you don’t bother.
Felienne Hermans: [00:30:33.19] Yes, you probably don’t do it. In a sense you’re saying that a typed language encourages better naming because you also have to name the type, and if everything is object, then that is something that doesn’t need to be named.
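The contrast discussed here, between coding the "kind of thing" into the name and coding it into the type, can be illustrated with a minimal Python sketch. The `Customer` and `Shipment` types and both functions are hypothetical names invented for this illustration; note also that Python type hints are not enforced at runtime, so this only approximates what a strongly typed language would give you.

```python
from dataclasses import dataclass
from datetime import date


# Everything is "effectively a string": each name has to say what kind of thing it is.
def format_label_untyped(customer_name_string, shipping_date_iso_string):
    return customer_name_string + " / " + shipping_date_iso_string


# Hypothetical domain types: once the type is named, the variable names can be shorter.
@dataclass(frozen=True)
class Customer:
    name: str


@dataclass(frozen=True)
class Shipment:
    customer: Customer
    ships_on: date


def format_label(shipment: Shipment) -> str:
    # The parameter type already tells the reader what a "shipment" contains.
    return f"{shipment.customer.name} / {shipment.ships_on.isoformat()}"


label = format_label(Shipment(Customer("Ada"), date(2017, 1, 31)))
```

In the typed version the naming problem is split in two: first "what is it?" (`Shipment`), then what this particular one is called.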
Felienne Hermans: [00:31:13.02] Why?
Peter Hilton: [00:31:14.09] Because I find the types easier to name, and that gets me halfway there. The other issue I’d mention is culture.
Felienne Hermans: [00:31:22.18] Yes, because some languages have conventions for naming.
Peter Hilton: [00:31:25.02] Right, and the conventions can be very strong in a programming language about what kind of code you write. If the convention is that everything is a single letter, then you can sort of ignore the naming problem, largely. I’m sure you can pretty much ignore the naming problem in Haskell because you’re really doing mathematics, and as long as you write those paragraphs of text that would be in the mathematics, I’m sure everything is just fine.
[00:31:45.27] In some languages, typically in certain Java communities, you’re expected to come up with long, descriptive names for classes, even if they don’t fit on one line in your IDE – that doesn’t matter.
Felienne Hermans: [00:31:56.09] They’re called “Super Container Factory”…
Peter Hilton: [00:31:59.04] Right, but then that’s another kind of difficulty – if you’re supposed to drop as many design pattern names as you can, despite the fact that nobody really knows what an abstract builder factory does, then that’s a different kind of a challenge. In Go, for example, they’re famously keen on short names. Idiomatic Go code tends to include a lot of single letter and abbreviated names. This probably makes naming quite easy and moves the difficulty somewhere else, like maintaining something that has Go code. Maybe that depends on your conventions for writing comments, for example.
[00:32:38.02] This cultural aspect of what kind of dialect or accent you write the code in is probably the most significant part of the difference.
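As a rough illustration of these two "accents", here is the same small function written twice in Python. Both function names and the euro-denominated example are assumptions made up for this sketch; neither version is presented as the better one.

```python
# "Short-name accent": terse identifiers, with the meaning carried by a comment.
def avg(xs):
    # xs: order totals in euros; returns the mean, or 0.0 when xs is empty
    return sum(xs) / len(xs) if xs else 0.0


# "Descriptive accent": the names carry the meaning instead of the comment.
def average_order_total(order_totals_in_euros):
    if not order_totals_in_euros:
        return 0.0
    return sum(order_totals_in_euros) / len(order_totals_in_euros)
```

The behavior is identical; what differs is where the information lives, in the comment or in the names.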
Felienne Hermans: [00:32:47.24] Yes, it’s really about “Where do we store the information?”, which is a deeper problem related to naming. Do we need to put the design patterns in the code, so that it all goes together, or do they go into separate documents? Do you have comments that explain the abbreviations, or does everything go in the code?
Peter Hilton: [00:33:07.00] Exactly. This is the discussion about thinking about what exactly you’re optimizing. Are you reducing key strokes in everything you write in the documentation and in the code, or are you just reducing documentation? Are you optimizing for programmer speed or maintainer speed? There are lots of different approaches.
Felienne Hermans: [00:33:30.25] Do we know anything about what’s better? Has there been any research about what’s more readable, what is a better strategy, where do you store the information, how does it not get outdated?
Peter Hilton: [00:33:44.11] I hope there is. Personally, I’ve had very little contact with academic computer science, so I just wouldn’t know. I’m not aware of anybody talking about this stuff.
Felienne Hermans: [00:33:54.13] Yes, I know there’s a paper, we will put it in the show notes. It compares the readability of identifier names written in CamelCase, with underscores, and in various other styles of the same name; it’s about how to do capitalization and splitting of the words. CamelCase was better than underscores for readability.
Peter Hilton: [00:34:15.18] Really? I always thought it was the other way around.
Felienne Hermans: [00:34:17.00] That’s what they found. I don’t know who it’s by, but I’ll make sure we put it in the show notes.
Peter Hilton: [00:34:22.16] I look forward to reading that.
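For readers who want to see the two styles side by side, the following Python sketch converts one identifier between camelCase and underscores. The helper names and the sample identifier are hypothetical, and the conversion handles only simple cases (it does not treat acronyms such as `HTTPServer` specially).

```python
import re


def camel_to_snake(identifier: str) -> str:
    # Put an underscore at each boundary between a lowercase letter or digit
    # and an uppercase letter, then lowercase the whole identifier.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", identifier).lower()


def snake_to_camel(identifier: str) -> str:
    # Keep the first word as-is and capitalize each following word.
    first, *rest = identifier.split("_")
    return first + "".join(word.capitalize() for word in rest)


camel = "shipmentArrivalDate"
snake = camel_to_snake(camel)
```

Here `snake` is `shipment_arrival_date`, and `snake_to_camel(snake)` gives back the original camelCase name.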
Felienne Hermans: [00:34:25.01] What else can we do to get better at naming? Are there books we can read that are specifically about naming? We already talked about the DDD book. Are there others?
Peter Hilton: [00:34:35.03] I’m glad you asked this, because when you first asked about how can we get better at naming, I simply said “Try.” But you can do more than that, you can take this further. Books are often the best way to get up to speed on what other people have learned. Sadly, I don’t know of any books that are only about naming. I wish there were books that are only about naming, because they’d be great.
Domain-Driven Design is not the only book that has a chapter on naming. You should definitely read that book – however tricky you find it – because it’s really worth it. The first programming book I read that has a chapter on names is Code Complete by Steve McConnell, and I was lucky to read that in my first years as a programmer, and it helped me a lot.
[00:35:21.08] It has a very serious chapter early in the book about the power of data names. This has a lot of sensible advice about explaining why this matters and why this is important. Most of the other classic coding books have some stuff about naming, but not quite enough. You might be forgiven for thinking that the author didn’t think it was very important. I hope that’s not the case; I hope they just preferred to write about other things. I’m thinking about books like Clean Code, for example, by Robert Martin. It mentions names, but not in great detail.
Felienne Hermans: [00:35:56.20] So there’s opportunity here for someone to write a book maybe about naming.
Peter Hilton: [00:36:01.06] Yes, please do. Meanwhile, in the absence of a lot of books about naming, you can just improve yourself by just improving your naming skills.
Felienne Hermans: [00:36:10.18] How?
Peter Hilton: [00:36:11.21] Earlier on we talked about the link between coding, naming, documentation and literary writing. Naming is one of the areas in which coding is writing, in the sense of writing English. You’re not writing a novel, but you’re kind of writing a poem. The hard thing about writing poetry, I’ve heard – I promise you, I’ve never done this – is it’s very important to choose exactly the right word at each stage in the poem. In fact, famous writers of novels have said, “You write, and then you re-write, and then you re-write, and you have to agonize over every single word.” Who knew that writing was hard? I thought only coding was hard, but presumably other people’s disciplines are hard, too.
[00:36:54.20] If you make that link well… If you have to agonize over the lines of code and choosing the right words the way Ernest Hemingway did, you realize that maybe getting better at writing shares something with getting better at naming. How do you get better at writing? How do you get better at vocabulary? Because this is the specific issue. Do you know more words?
Felienne Hermans: [00:37:16.22] We should all start to write novels?
Peter Hilton: [00:37:18.10] Writing novels – no. It takes too long, and who’s got the time? But you should definitely read novels. And it doesn’t matter if they’re novels, you should just read a lot. But read stuff that’s well written, read stuff that uses language well, so you could certainly do worse than read a lot of classic literature. Classic literature tends to be full of quite exciting vocabulary and use of language.
Felienne Hermans: [00:37:39.21] Like Bob Dylan.
Peter Hilton: [00:37:41.28] Right. This doesn’t have to be novels, this could be song lyrics. Song lyrics are poetry, and the words matter. They don’t just matter because they should rhyme; depending on your songwriter, they should also mean something. This is where there’s an interesting difference in how you approach this, or your experience if you’re a native English speaker or not. In a way, as a native English speaker, I can cheat because I know more words, but that’s partly because I’ve always played word games and I’m pretty good at Scrabble if I do say so myself. And that’s partly because I’ve always read a lot of novels. I finished one off this morning, by Alistair MacLean, a classic thriller writer. This is great writing.
[00:38:21.29] By reading a lot of good writing, you learn more. But if you’re a non-native speaker of English, on the other hand, you’re probably more used to the mechanics of learning a foreign language. You’re more used to the idea of actually learning words as a deliberate activity. This is something a lot of native English speakers have never had to think about. So learning more vocabulary is – as well as the pair programming – probably the most important thing you can do. How do you learn vocabulary? You read books, you play word games, you look at the lyrics of your favorite songs… And writing does help. Don’t write a novel, but maybe write blog posts.
Felienne Hermans: [00:39:00.05] You touched here upon learning a second language – I’m Dutch; probably you hear from my English that it’s not my first language… What if you work for a Dutch company which has a domain with domain words in Dutch. Do you put Dutch identifiers into the English-sounding code with ifs and fors? Or do we have to translate all the domain words so that it doesn’t sound funky?
Peter Hilton: [00:39:27.29] I’ve done a lot of software development projects here in the Netherlands, and it’s kind of a nice idea, but Dutch words are far too long and they just definitely wouldn’t fit on the screen. In practice it’s never really come up, I’ve never got to try that as a consistent thing, to use Dutch naming and code written by Dutch people. In practice we’re doing coding in English.
Felienne Hermans: [00:39:49.20] But that means you have to translate domain words, and it wouldn’t be “shipment”, because it would be “bestelling” in Dutch. So you’re not then speaking the language of the domain anymore.
Peter Hilton: [00:40:01.21] Right, this is tricky, and I’ve seen this cause problems a couple of times. Mostly there are standard translations within a domain, so you don’t have to figure out these translations yourself; there just is a correct translation. If it’s about shipments, you go to the Wikipedia page about this and then you just select Dutch and see what words they use. Wikipedia is an excellent resource for translating domain terms. It’s also a good resource for learning them in the first place.
[00:40:28.03] If you’ve been doing some other domain and then suddenly switch to global accounting, or HR, go and read the Wikipedia pages and see the language they use. That will help you learn them.
Felienne Hermans: [00:40:41.00] That’s a good trick.
Peter Hilton: [00:40:41.24] But there is a danger that teams that are mostly non-native speakers of the same language – I’ve worked in a team with mostly Dutch programmers – will pick a word that sounds reasonable but isn’t really a very good translation. It’s either not the term a native English speaker would use, it’s a bit of a Dutch-ism, or it’s a completely reasonable word, but it’s just not the equivalent domain term. Sometimes in another language the equivalent domain term sounds quite different, and there’s no literal translation. That’s what you’d need Wikipedia for.
[00:41:11.27] This came up in the Netherlands on a project where one of my colleagues asked me how to distinguish in English between two different words for edits to a text – this was software for a publisher. They were using two different Dutch words in the domain with the customer: one for a change that the author had made to a text, and one for a change that had been made in the context of a particular publication by an editor. So how do we translate these two words differently?
[00:41:43.19] We had lots of long, complicated names, but at least one of them we could pick a nice name for in English. A change to text made by the author, we could just call that an edit. Sometimes you come up with quite an easy name, but it was quite hard work to avoid picking Dutch-isms that would have looked like completely reasonable English, but the code would have been quite hard to read if you didn’t speak Dutch as well. That does happen a lot, as well.
Felienne Hermans: [00:42:08.29] So coding in a second language is an extra hard problem.
Peter Hilton: [00:42:12.07] Yes, but just to point out – the situation that I was describing is also increasingly typical; especially in this country, we’re coding with international teams who do not all speak the same language. This often means that when it comes to writing – and English, in particular – people have different backgrounds. We shouldn’t all expect to be brilliant at naming, in the same way we don’t all expect to be brilliant at user interface design. You wouldn’t have everybody on the team spend the same amount of time doing user interface design. You might work out that one person had a particular flair for it, and you would get them to do that. Maybe it’s the same with naming. Maybe you should work out who in the team is the best at it and get them to review the code, pair with them, or just go and ask them, “What would you call this?”
Felienne Hermans: [00:43:02.03] Yes, and in some cases that could be a product owner, or someone very related to the domain, to talk about the name. That would be an interesting idea.
Peter Hilton: [00:43:10.05] Yes, absolutely. In fact, ideally it would be.
Felienne Hermans: [00:43:12.18] Yes. We talked about using Wikipedia as a tool for naming… Are there other tools we can use to make this easier?
Peter Hilton: [00:43:21.11] Yes, it turns out that I’ve got this amazing program on my laptop which is the best tool when it comes to naming. Unsurprisingly, it’s preinstalled… It’s called Thesaurus. I’ve got quite a good one on paper, but that’s a bit clunkier and it’s too far away on the shelf. On my Mac, the built-in dictionary and thesaurus is great for getting inspiration about what are the options I have, what are the possible names I could use instead…
Felienne Hermans: [00:43:48.00] How do you go about that practically? You’re not going to read it from cover to cover…
Peter Hilton: [00:43:52.17] No, I look up a word. If I’m trying to name something, I’ve got a name and I think “Okay, is it shipment, or should it be something else?” If I look at the thesaurus, I look up “shipment” and I see alternatives. Often, a different word jumps out, “Oh, I’ve seen that one before. People talk about that.” Or I can look at each word and think, “Is that better? Does that sound more right in this context?” The thesaurus is much more useful than the dictionary for this. You have options.
Felienne Hermans: [00:44:21.03] Yes, so it’s looking up synonyms, that’s the activity.
Peter Hilton: [00:44:23.07] Yes, look up synonyms in your thesaurus. I do this a lot, it feels like cheating.
Felienne Hermans: [00:44:32.29] Yes, that’s a really good trick. I would have never come up with that.
Peter Hilton: [00:44:35.18] Yes, and if I had one vote for the next feature of my IDE, it would be “Okay, my IDE has a nice dictionary in it, and it does spell checking.” But there’s no thesaurus…
Felienne Hermans: [00:44:44.25] Oh… You do right-click, “Refactor To Synonym”.
Peter Hilton: [00:44:47.12] Yes, that’s what I’d love to have.
Felienne Hermans: [00:44:49.15] Well, that’s not hard… You’re a programmer. You’ve been in the business for 20 years and you’re still getting paid to write software, so why don’t you build that, Peter?
Peter Hilton: [00:44:58.02] Well, that only occurred to me about ten seconds ago.
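The "Refactor To Synonym" idea could be sketched roughly as follows. This is not an existing IDE feature: the `THESAURUS` dictionary is a hypothetical, hand-written stand-in for a real thesaurus, and `suggest_renames` is a name invented for this example. It assumes identifiers written with underscores.

```python
from typing import List

# Hypothetical, hand-written thesaurus entries for illustration only;
# a real tool would query an actual thesaurus.
THESAURUS = {
    "shipment": ["consignment", "delivery", "cargo"],
}


def suggest_renames(identifier: str) -> List[str]:
    """Return alternative identifiers, each swapping one word for a synonym."""
    words = identifier.split("_")
    suggestions = []
    for index, word in enumerate(words):
        for synonym in THESAURUS.get(word, []):
            renamed = words[:index] + [synonym] + words[index + 1:]
            suggestions.append("_".join(renamed))
    return suggestions


options = suggest_renames("shipment_date")
```

The tool only lists options; choosing the word that "sounds more right in this context" remains a human judgment.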
Felienne Hermans: [00:44:59.19] It’s a good idea. If you do build it, we will put a link to GitHub in the show notes. Anything else we need to talk about…? Oh yes, of course! Shall we talk a little bit about my pet peeve? People that know me know what my favorite hobby is, and that’s of course spreadsheets. Spreadsheets are coding; I always argue that they are the best way of programming, because they’re so easy and so many people can do it. How about naming in spreadsheets?
Peter Hilton: [00:45:28.08] I guess it’s not a coincidence that you mentioned that. Actually, it turns out that I first saw your spreadsheet presentation at the same conference where you saw my naming presentation. But it’s only recently I’ve figured out… You persuaded me then that spreadsheets are coding, and perhaps even functional programming and therefore cool, but somehow spreadsheets aren’t quite the same as normal coding, because they’re massively more popular. There are a lot more people who use spreadsheets than people who identify as programmers.
[00:45:58.23] Spreadsheets are genius, clearly; anything that popular… But one of the genius ideas in spreadsheets is to avoid the naming problem. In a kind of judo move, stepping out of the way, neatly avoiding the hardest problem in programming, spreadsheets mostly don’t have names and they have grid references. Instead of naming something… You don’t have to call it anything, it just is “C4”.
Felienne Hermans: [00:46:23.05] But that violates all your advice, because it’s very short, it’s an abbreviation, it doesn’t have domain meaning…
Peter Hilton: [00:46:29.28] Right, exactly. It’s like certain kinds of functional programming where everything is anonymous, so I would speculate that using grid references in spreadsheets makes them hard to understand. If I think back to trying to understand a spreadsheet that somebody else made, that is pretty tricky. And like code, you can put comments on the cells in spreadsheets, and like code, people don’t really do that very often.
Felienne Hermans: [00:46:58.11] They don’t do that, and there’s been extensive scientific research on cell references versus names, because in Excel you can give cells a name. There’s a whole Ph.D. dissertation about that topic, and the conclusion is if you start naming things, you mess with the grid system, and people get entirely confused, so it’s not better to use those names. Even though lots of people advocate it, if you try it on people, they really like the grid names. It’s sort of thought-provoking that that works well for lots of people. It’s clearly bad naming, by all your standards.
Peter Hilton: [00:47:31.01] Yes, it’s bad naming, but also I do see that naming is hard and removing that part of the problem from coding – since the spreadsheets are coding – makes them much more accessible, so I also suspect that the absence of naming is what makes it easy for a lot of people to get started with spreadsheets.
Felienne Hermans: [00:47:47.27] It’s a very interesting question how we could get that low threshold in code where we would also have the benefits of good naming.
Peter Hilton: [00:47:55.22] Yes, I consider that an unsolved problem in spreadsheets – how to introduce naming right up there with introducing strong typing, or something.
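To make the difference between grid references and names concrete, here is a toy Python model of a sheet. The cell values, the `names` mapping and the `value_of` helper are all assumptions for illustration; this is not how Excel implements named ranges.

```python
# A toy "sheet": values addressed by grid reference, as in a spreadsheet.
cells = {"B2": 3, "C2": 19.5}

# Formula written with grid references: short, but with no domain meaning.
total_by_reference = cells["B2"] * cells["C2"]

# Hypothetical named ranges: each domain name points at a grid reference.
names = {"quantity": "B2", "unit_price": "C2"}


def value_of(name):
    # Resolve a name to its cell, then read the cell's value.
    return cells[names[name]]


# The same formula written with names instead of references.
total_by_name = value_of("quantity") * value_of("unit_price")
```

Both formulas compute the same total. The named version reads like the domain, but it adds an indirection away from the grid, which is the trade-off raised in the conversation above.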
Felienne Hermans: [00:48:03.10] That’s it… Is there anything you want to share? Where can we read more? Do you have a blog or a website where we can follow all your naming adventures?
Peter Hilton: [00:48:12.09] I have a blog at Hilton.org.uk, that’s my website. I also have some presentations there, including the presentation you mentioned, “How To Name Things, The Hardest Problem In Programming”.
Felienne Hermans: [00:48:23.13] I really recommend watching it.
Peter Hilton: [00:48:24.28] There’s at least one video of that presentation, and the slides are online; the slides are on Slideshare. We should add those to the show notes, too.
Felienne Hermans: [00:48:31.29] We’ll link to all of those.
Peter Hilton: Just try it out. Just try and come up with good names, and maybe it will turn out that the person on your software development team who comes up with the good names maybe is you.
Felienne Hermans: [00:48:41.27] Great! Thanks a lot for this.
Peter Hilton: [00:48:44.03] Thank you, too.
* * *
Thanks for listening to SE Radio, an educational program brought to you by IEEE Software Magazine. For more information about the podcast, including other episodes, visit our website at se-radio.net.
To provide feedback, you can write comments on each episode on the website, or write a review on iTunes. Mention or message us on Twitter @seradio, or search for the Software Engineering Radio Group on LinkedIn, Google+ or Facebook. You can also e-mail us at firstname.lastname@example.org. This and all other episodes of SE Radio is licensed under the Creative Commons 2.5 license. Thanks again for your support!