SE Radio 439: JP Aumasson on Cryptography

JP Aumasson, author of Serious Cryptography, discusses cryptography, specifically how encryption and hashing work and underpin many security functions. Justin Beyer spoke with Aumasson about randomness and how it is the basis of cryptography. They also discussed how block and stream ciphers work, at a high level. They also discussed how hash functions are ubiquitous in the world of cryptography. To close out the episode, they discussed how cryptographic algorithms are proven secure and what attack models and security notions are in the world of cryptography.

Show Notes

Transcript

Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

SE Radio 00:00:00 This is Software Engineering Radio, the podcast for professional developers, on the [email protected]. SE Radio is brought to you by the IEEE Computer Society and by IEEE Software magazine, online at computer.org/software.

Justin Beyer 00:00:21 Hello, this is Justin Beyer for Software Engineering Radio, and today I’m speaking with JP Aumasson. JP is co-founder and CSO of Taurus. He is known for his work in cryptography, including the reference book Serious Cryptography, the widely used algorithms BLAKE2 and SipHash, and talks at leading industry conferences such as Black Hat and DEF CON. Welcome to the show, JP.

JP Aumasson 00:00:41 Thanks for the invitation.

Justin Beyer 00:00:42 Of course. So just to start off the episode, I want to lay kind of a foundation for what we’re going to be talking about today. In cryptography, there are generally two large areas that I hear discussed: encryption and hashing. Can you do a quick discussion on what they are, how they differ, and a little bit on where you might use one or the other?

JP Aumasson 00:01:04 Sure, let’s start the discussion. So I guess you refer to encryption and hashing, or signatures. I mean, today crypto is a very rich field with many different protocols, many different schemes, but as a software engineer, most of the time you’d be concerned with some form of symmetric encryption. And most of the time what you want to do is encrypt data or hash data. So maybe fundamentally the biggest difference is that encryption addresses the confidentiality of the data, or secrecy, whereas hashing is most often used to protect the integrity of the data, so to ensure that the data has not been modified accidentally or maliciously. And there are a few other differences depending on the use case.

Justin Beyer 00:01:44 Alright, awesome. So essentially we’re using encryption when we’re striving for things like confidentiality and hashing is generally used in the field of data integrity, and then there’s some overlap and crossover between the two with certain functions.

JP Aumasson 00:01:58 Yeah. So initially the point of hash functions was to protect integrity, but they ended up being like a Swiss army knife of crypto. You can find them pretty much everywhere, even in some random number generators, and you can even build encryption schemes based on hash functions.

Justin Beyer 00:02:12 Okay. So we are going to dive a little bit into hashing, how hash functions are created, and where that’s used with encryption, but just moving to a more fundamental concept in crypto: I usually hear randomness discussed as the base-level foundation of all things crypto. Can you talk about how we define and measure the randomness of data?

JP Aumasson 00:02:39 Maybe that’s the most important thing in crypto: randomness. When I give my crypto training, the first session is not about encryption. It’s not about symmetric or public key. It’s about randomness, because if you don’t have good randomness in crypto, then everything becomes predictable, therefore insecure. And the reason why is that all of cryptography tends to rely on some type of secret. Usually the secret is a key, a secret key or a private key, however you call it, but the secret key must be unknown to the attacker. And unknown means unpredictable, means random. At some point you have to generate a key using some sort of random generator, and all your cryptographic schemes and protocols will bootstrap from this core randomness that you had initially. So it’s extremely important that you have a good PRNG in crypto. And what’s also important, and what I tell my students, is what you mean when you talk of something random. If you look at a string of zeros and ones, 0, 1, 1, 0, 1, you cannot tell if something is random in itself. What is random is the process that you use to generate it.

Justin Beyer 00:03:44 Okay. And I usually hear when people start to discuss randomness, the term entropy thrown around, can you kind of discuss what that is, how that fits into the overall picture?

JP Aumasson 00:03:54 Yeah. Entropy is maybe one of the most confusing terms in crypto. You sometimes hear people take a string of bytes and say, oh, that’s my entropy, because what they want to say is that that’s the random number that they will use to get some unpredictability. But entropy comes from physics. It comes from the field of thermodynamics, and it’s just a way to quantify, to measure, the level of uncertainty or randomness. So most of the time you want to have an entropy level that is as high as the security level you are targeting. Just to simplify, let’s say you want to generate keys of 128 bits. Then you want a generator that has at least this level of entropy, because in principle, in theory, if you have a generator with an entropy of, I don’t know, n bits, then you can attack it by doing approximately two to the power n operations.
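As a rough illustration of measuring uncertainty in bits, here is a minimal Python sketch. It assumes the Shannon definition of entropy over a known probability distribution; the function names are hypothetical and not from the episode. It describes the generating process, not any particular output string.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def brute_force_guesses(bits):
    """Approximate worst-case guesses against a secret with `bits` of entropy."""
    return 2 ** bits

if __name__ == "__main__":
    print(entropy_bits([0.5, 0.5]))   # a fair coin
    print(entropy_bits([0.9, 0.1]))   # a biased coin carries less than one bit
    print(brute_force_guesses(128))   # search space for a 128-bit key
```

A biased process yields fewer bits of entropy than the length of its output suggests, which is why a key that is 128 bits long is only as strong as the generator behind it.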

Justin Beyer 00:04:45 Okay. So essentially you need the level of security as kind of a downstream effect with that.

JP Aumasson 00:04:52 Usually you don’t really have to worry about it. You’re not going to measure the entropy yourself. Actually, a system like Linux is doing it. They have an entropy counter somewhere in there, I don’t remember exactly where, but it’s going to be quite misleading, because the way it counts entropy is completely wrong mathematically. So you should not rely too much on what Linux tells you in terms of entropy.

Justin Beyer 00:05:15 Yeah. And you had mentioned earlier the concept of a pseudorandom number generator, or PRNG. Can you discuss how that works for something like Linux, where I usually hear terms like /dev/random and /dev/urandom, and that controversy?

JP Aumasson 00:05:30 Yeah. So you have different ways to generate randomness on your system. I think most operating systems will give you a random generator as part of the OS, as part of the kernel. If you look at Linux, it’s accessible using the /dev/random or /dev/urandom device files. And the way it works is that the kernel is running a system whereby you have a state, a pool of like four kilobytes, and it will collect data from a number of sources that behave in a fundamentally unpredictable way, for example system interrupts, activity of your keyboard or your mouse, or the network, and a bunch of logs. It will combine all this data and hash it, then do a bunch of cryptographic operations and put all this data in a big table. And when you want to extract some randomness, it will take some of the data in this big table and put it in some deterministic random bit generator, what NIST calls a DRBG. And from this, you will derive a potentially infinite stream of bits. And then again, you can access it using these device files, but the safest way to do it on Linux and OpenBSD and FreeBSD is to use the getrandom system call, which is maybe the safest solution.

Justin Beyer 00:06:44 So why would I want to use the syscall, for example, instead of going to /dev/random or /dev/urandom directly?

JP Aumasson 00:06:50 With the syscall, if you’re not writing C, you don’t have to worry about it, but if you use it in C, that’s the safest way, for the following reason. /dev/random will block, which means it will stop returning data if the Linux kernel thinks that you don’t have enough entropy, but most of the time it would be wrong; the entropy would be good enough. /dev/urandom will never block, and specifically, it would not block even if you just booted the system and you have not reached a state of enough entropy. And in that case, you can be in trouble, because if you don’t have enough entropy, like one second after booting the system, then your random data might be insecure. So what getrandom will do is wait until the OS has reached a good enough entropy level. And from that point on, it will never block, which is the right behavior.
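From Python, the same choice can be sketched as follows. This is a minimal sketch, not a recommendation from the episode: `random_bytes` is a hypothetical helper name, `os.getrandom` is only available on Linux, and the fallback to `os.urandom` on other platforms is an assumption of this example.

```python
import os

def random_bytes(n: int) -> bytes:
    """Return n bytes from the kernel generator.

    On Linux, os.getrandom() with default flags blocks only until the
    kernel pool has been initialised after boot, then never blocks again.
    It may return fewer bytes than requested, so we loop.
    """
    if not hasattr(os, "getrandom"):
        # Assumption: on non-Linux platforms fall back to os.urandom().
        return os.urandom(n)
    buf = b""
    while len(buf) < n:
        buf += os.getrandom(n - len(buf))
    return buf

if __name__ == "__main__":
    key = random_bytes(32)   # e.g. a 256-bit key
    print(key.hex())
```

In most application code the standard `secrets` module is a simpler way to reach the same kernel source.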

Justin Beyer 00:07:40 And you kind of hinted towards this earlier. What are some ways that PRNGs, or RNGs in general, are attacked? You mentioned predictability, but how would someone go about determining that one of these things is giving predictable output?

JP Aumasson 00:07:54 Yes. Sometimes it’s very difficult to know if you have good randomness, because what happens is that you run an application and you have to generate some random stuff. You don’t necessarily just return the random bytes that you get from a PRNG. Sometimes you want to generate a number in a specific format, or you want to generate, I don’t know, some random names or a random data structure. And you want to make sure that all the stuff you generate is distributed uniformly, which means that if you generate a random number between zero and four, so zero, one, two, three or four, you want each number to have the same probability of being returned by your program. And it’s not always trivial. For example, if you’re generating a random integer, in this example between zero and four, you should not just take a random byte and do a modulo reduction, because not all numbers would have the same chance of occurring.
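The bias described here can be counted directly, and one common fix, rejection sampling, is short to write. The following is a minimal Python sketch under the assumption that a single random byte is the raw source; `modulo_counts` and `uniform_below` are hypothetical names.

```python
import secrets
from collections import Counter

def modulo_counts(n: int) -> Counter:
    """How many of the 256 byte values map to each result of `byte % n`."""
    return Counter(b % n for b in range(256))

def uniform_below(n: int) -> int:
    """Uniform integer in [0, n) for 1 <= n <= 256, by rejection sampling."""
    if not 1 <= n <= 256:
        raise ValueError("n must be between 1 and 256")
    limit = 256 - (256 % n)        # largest multiple of n that fits in a byte
    while True:
        b = secrets.token_bytes(1)[0]
        if b < limit:              # reject the few values that cause the bias
            return b % n

if __name__ == "__main__":
    # For n = 5, result 0 is produced by 52 byte values, the others by 51.
    print(sorted(modulo_counts(5).items()))
    print(uniform_below(5))
```

Python’s own `secrets.randbelow(n)` already returns an unbiased result, so the hand-written loop is only there to show the idea.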

JP Aumasson 00:08:50 So what you want to do is use, for example, rejection sampling, one of the specific techniques that you can use to generate uniformly random numbers. And you can generalize this to generating random stuff, random objects. And maybe a second problem, one that I often found a few years ago when doing security audits. I think it’s not that much of a thing today, because developers know how this works, and you also have libraries using good, secure RNGs. This problem is that of using a weak PRNG. You have two types of pseudorandom number generators. You have cryptographic ones, which are secure, meaning that if you have a long stream of bytes and you know part of it, you cannot determine the rest. However, if you have a non-cryptographic PRNG, like the one in math libraries, like the one in the C standard library, rand, what will happen is that they would be very fast, but they would be completely predictable if you have part of the output. So let’s say it generates one kilobyte and you know the first part of it; then you would be able to determine the rest, with something like the Mersenne Twister. You want to make sure that you use a strong PRNG. So for example, in Rust, you want to use OsRng, which of course is cryptographic. And that was a very common mistake.
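To make the predictability concrete, here is a sketch of the well-known state-recovery attack on the Mersenne Twister, which backs Python’s `random` module. It assumes the attacker sees 624 consecutive 32-bit outputs; `untemper` and `clone_from_outputs` are hypothetical helper names, and the constants are those of the standard MT19937 output transform.

```python
import random

def untemper(y: int) -> int:
    """Invert the MT19937 output transform to recover one state word."""
    y ^= y >> 18
    y ^= (y << 15) & 0xEFC60000
    x = y
    for _ in range(4):                      # undo y ^= (y << 7) & 0x9D2C5680
        x = y ^ ((x << 7) & 0x9D2C5680)
    y = x
    x = y
    for _ in range(2):                      # undo y ^= y >> 11
        x = y ^ (x >> 11)
    return x

def clone_from_outputs(outputs):
    """Build a generator that continues the stream of 624 observed outputs."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    state = tuple(untemper(o) for o in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone

if __name__ == "__main__":
    victim = random.Random()                # seed unknown to the attacker
    seen = [victim.getrandbits(32) for _ in range(624)]
    clone = clone_from_outputs(seen)
    print(clone.getrandbits(32) == victim.getrandbits(32))
```

The same approach does not work against a cryptographic generator such as the one behind `secrets` or `os.urandom`, which is the distinction drawn above.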

Justin Beyer 00:10:11 So just changing directions a little bit here, now that we’ve laid the foundation of randomness and how that impacts crypto overall, I want to dive into encryption a little bit. Back in episode 378, there was a discussion about public key infrastructure and some discussion on TLS and how the protocol works overall. So we’re going to table that and not really dive into that portion of it. I want to talk more about the underlying functions of encryption and some of the different pieces of it. Generally the terms that I hear brought up the most in the world of encryption, at least at the base level, are ciphers and cipher suites. So can you define and explain what a cipher is and what its goals are?

JP Aumasson 00:10:54 Yeah. So cipher is a very general term to denote any algorithm or construction that gets you this notion of encryption, whereby you take a plaintext and you obtain a ciphertext. It also applies more generally to systems that include some type of authentication or signature. When we talk of cipher suites, that’s usually in the context of protocols such as TLS or SSH, where you can configure your system by defining the combination of ciphers, or more specifically the combination of cryptographic algorithms, that you want to use. So, for example, in the older versions of TLS, 1.1 and 1.2, you could say, for example, oh, I want to use RSA for the handshake, and then I want to support AES-CBC and RC4, and you would list your different combinations of algorithms, of cipher suites, in a specific order, whereby the ones at the top have priority over the others. But that’s done for historical reasons, from when you had systems supporting different types of algorithms. Today we tend to avoid asking the user to choose what ciphers they want, because there’s a chance that they pick the wrong combination of ciphers. So if you look at TLS 1.3, you have very few cipher suites, I think four or five, I’d say, and all of these are secure and enabled, and you can’t shoot yourself in the foot most of the time.
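One way to see how short the TLS 1.3 list is on a given machine is to ask Python’s `ssl` module. This is a minimal sketch; `tls13_cipher_suites` is a hypothetical helper, and the result depends on the local OpenSSL build, so no particular output is assumed here.

```python
import ssl

def tls13_cipher_suites():
    """Names of the TLS 1.3 suites enabled in a default client context."""
    ctx = ssl.create_default_context()
    return [c["name"] for c in ctx.get_ciphers() if c["protocol"] == "TLSv1.3"]

if __name__ == "__main__":
    for name in tls13_cipher_suites():
        print(name)
```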

Justin Beyer 00:12:28 So essentially, as technology has advanced and the world of crypto has moved to the forefront of these common web technologies, a lot of these things have been abstracted out, because again, and we’re going to discuss some of these insecure cipher suites later, you don’t want the developer selecting, you know, Triple DES or RC4 as their encryption algorithm, since we know it already has issues.

JP Aumasson 00:12:53 Even though sometimes, if you deal with some legacy system, you might not have a choice, if you want to communicate with some old device lying around your company. So there are some cases where you want to use them, but just be careful.

Justin Beyer 00:13:06 But just kind of changing subjects a little bit. So I usually hear ciphers discussed first, and then I usually hear that there’s the two big, giant groups of ciphers stream ciphers and block ciphers. Can you kind of discuss what the difference between the two of those things is how they kind of work and then we’ll dive into some examples after we kind of discussed that base?

JP Aumasson 00:13:27 Yeah, well, typically today, the good news is that as a developer, if you work on a web app or a normal mobile app, you don’t have to worry too much about block ciphers versus stream ciphers, but historically they were for two very different classes of applications. Stream ciphers, the way they work, don’t directly transform the message. You can see them as some kind of pseudorandom generator. What it means is that you take a secret key and another value that we call the nonce, a number used only once, and using these two values the stream cipher will generate a potentially infinite stream of pseudorandom bits, or pseudorandom bytes. And to encrypt, you will just take this pseudorandom keystream and XOR it with the plaintext. And decryption is the same, because then you can cancel out the keystream with the XOR operation. And the way it was designed, it was just a way to simulate or emulate the one-time pad, which was this system used during the Cold War whereby you had a suitcase with a very long string of bits, and you would use it only once to XOR the message with this long sequence.

JP Aumasson 00:14:37 Block ciphers are much closer to the intuitive notion of encryption, where you take a message and you transform the message using some key-dependent operations. And it’s called a block cipher because it’s actually defined as an algorithm that takes a block, typically of 16 or 32 bytes, and transforms it into another block of 16 or 32 bytes. But I mean, we’ll go back to this later, you can combine the two and create a stream cipher from a block cipher.
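The keystream-and-XOR idea can be sketched in a few lines of Python. This is a toy, not a vetted cipher: it assumes SHAKE-256 over a fixed-length key and nonce as the keystream generator, purely because it is in the standard library, and `keystream` and `xor_stream` are hypothetical names. A real application would use an established stream cipher from a maintained library.

```python
import hashlib
import secrets

KEY_LEN, NONCE_LEN = 32, 12

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream: n pseudorandom bytes derived from (key, nonce)."""
    if len(key) != KEY_LEN or len(nonce) != NONCE_LEN:
        raise ValueError("unexpected key or nonce length")
    return hashlib.shake_256(key + nonce).digest(n)

def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

if __name__ == "__main__":
    key = secrets.token_bytes(KEY_LEN)
    nonce = secrets.token_bytes(NONCE_LEN)     # must never repeat for a key
    ciphertext = xor_stream(key, nonce, b"attack at dawn")
    print(xor_stream(key, nonce, ciphertext))  # XOR again cancels the keystream
```

Encryption and decryption are the same function because XORing with the same keystream twice returns the original bytes.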

Justin Beyer 00:15:04 I mean, yeah, it’s kind of less common today that you’re going to see this hard delineation between the two, because in the protocols they eventually get mixed together. And as you mentioned, we’ll probably discuss this a little later in the episode, but I heard you mention stream ciphers being like PRNGs, and you said that you use a key and a nonce. So is that similar to how, with a traditional PRNG, you would initialize the state?

JP Aumasson 00:15:29 When you look at a PRNG, for example the random generator in the Linux system, it’s a combination of different components: the one that will look for random data from the environment, the system that will maintain a state, quite a big state of a few kilobytes, for example, and the last component, which is the one that is closest to a stream cipher. It’s what we usually call a deterministic random bit generator. And this system behaves very similarly to a stream cipher. It takes some input value that is called a seed in the context of PRNGs; in the context of stream ciphers you would call it a key. And then it generates a pseudorandom stream of bits. So you can really imagine having some PRNG system where one of these components, the last one, is very similar to a stream cipher. And if you look, for example, at the OpenBSD kernel PRNG, it relies on ChaCha20 to generate randomness, and ChaCha20 is actually a stream cipher.

Justin Beyer 00:16:35 Okay. So they’re actually using a stream cipher as part of their PRNG in OpenBSD.

JP Aumasson 00:16:42 Yeah. Interestingly, if you look at the core of ChaCha20, it is something that is very similar to a block cipher. It’s kind of a permutation, the same kind of thing you find in a block cipher.

Justin Beyer 00:16:54 And yeah, now that you’ve mentioned block ciphers, let’s dive into some examples of block ciphers, or at least the common block ciphers that developers would see being used. I usually hear AES thrown around as the most common one that everyone’s using, and then the legacy ones being, you know, Triple DES or DES, which is the Data Encryption Standard.

JP Aumasson 00:17:14 Okay. So most of the time you’re going to use AES, and sometimes you don’t even have to choose AES, because it’s going to be the default cipher in the library that you’re going to use. There used to be a time where you had to choose AES or Triple DES or something else. I think now most libraries have made it easier for you to use AES by default. And the reason why you want to use AES is not because it’s the best cipher of all time, it may not be. It’s just because it’s a standard and it’s supported pretty much everywhere. And in particular it’s supported in your CPU’s native instructions, which means that you have actual silicon dedicated to running AES in the chip, which will be much faster than a software version. So for example, it would typically run at a speed of a few gigabytes per second on a laptop, and in comparison, if you were to use AES without these instructions, it would be approximately 10 times slower. So that’s the reason why you use AES most of the time: because it’s fast and it’s safe.

Justin Beyer 00:18:15 Okay. So it being built into the CPU makes it a lot faster. And, you know, I mean, it is the Advanced Encryption Standard, so it has been standardized, and I know NIST has done documentation on it and other things like that. But just diving a little more into AES: I’ve heard a lot about the underlying functionality, without going super into the math here. It has MixColumns and some of these other functions. Would you be able to explain how that all fits together to create the encryption standard?

JP Aumasson 00:18:42 Yeah. Maybe instead of explaining specifically how AES works, because we would talk about finite fields and polynomials and things that might get listeners a bit lost in math: when you design a cipher like AES, and more generally when you design something like a hash function or any symmetric crypto scheme, you don’t design a big algorithm that does everything at once. You want to minimize the code size. You want to minimize the kind of stuff that has to be analyzed in terms of security. So you would design a small piece of transformation that we usually call a round, and you would iterate this. In AES, for example, you have 10 rounds, and each round can be broken down into four pieces, each of which achieves a different security property. So you have one piece for what we call diffusion, which will spread the differences all across the state of AES, which is a 16-byte value. And you have another that achieves what we call confusion, and corresponds to cryptographic strength by mixing bits in a very strong way, but without diffusing the changes across the state. And what you use for that is something called S-boxes, or substitution boxes, which are kind of tiny block ciphers that you use to transform eight bits into another eight bits inside of AES.
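The round structure can be illustrated with a toy substitution-permutation network. This is emphatically not AES: the 16-bit block, the arbitrary 4-bit substitution table, the bit shuffle, the round count, and the key schedule are all assumptions chosen to keep the sketch short. It only shows the pattern of key mixing, an S-box layer for confusion, and a bit permutation for diffusion, iterated over several rounds.

```python
import hashlib

# Arbitrary 4-bit substitution table (confusion layer) and its inverse.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV_SBOX = [SBOX.index(i) for i in range(16)]
ROUNDS = 4

def substitute(state: int, box) -> int:
    """Apply the 4-bit box to each of the four nibbles of a 16-bit state."""
    out = 0
    for i in range(4):
        out |= box[(state >> (4 * i)) & 0xF] << (4 * i)
    return out

def permute(state: int) -> int:
    """Diffusion layer: send bit i to position 4*i mod 15 (bit 15 stays).
    This particular shuffle is its own inverse."""
    out = 0
    for i in range(16):
        j = 15 if i == 15 else (4 * i) % 15
        out |= ((state >> i) & 1) << j
    return out

def round_keys(key: bytes):
    """Toy key schedule: ROUNDS + 1 16-bit subkeys taken from SHA-256(key)."""
    digest = hashlib.sha256(key).digest()
    return [int.from_bytes(digest[2 * i:2 * i + 2], "big") for i in range(ROUNDS + 1)]

def encrypt(block: int, keys) -> int:
    state = block
    for k in keys[:-1]:
        state = permute(substitute(state ^ k, SBOX))   # one round
    return state ^ keys[-1]

def decrypt(block: int, keys) -> int:
    state = block ^ keys[-1]
    for k in reversed(keys[:-1]):
        state = substitute(permute(state), INV_SBOX) ^ k
    return state

if __name__ == "__main__":
    keys = round_keys(b"demo key")
    c = encrypt(0x1234, keys)
    print(hex(c), hex(decrypt(c, keys)))
```

Each round is tiny and easy to analyse on its own; the strength is meant to come from iterating it, which is the design approach described above.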

Justin Beyer 00:20:00 Okay. So it’s kind of back to that whole concept that encryption is all a permutation. And, leaving the math of AES out specifically, it’s breaking things down into these little boxes and then doing specific functions on them to derive certain security properties, which we are going to discuss a little bit later with the security goals and security properties of these algorithms. But I just want to jump back a little bit. When we started talking about block ciphers, I heard you mention modes of operation. Can you dive into what that is and how that impacts how the cipher works?

JP Aumasson 00:20:33 Yeah, that’s a great point, because a block cipher alone is quite useless: it would just transform one block of 16 bytes into another block of 16 bytes, but most of the time your messages would be longer than this. So how do you do it? The very naive way is just to split your message into blocks of 16 bytes and then encrypt each block individually, independently of the others. That’s what the ECB mode, electronic codebook, is doing. But that’s not a very secure mode, because if you have two blocks which are identical, then the output blocks will be identical as well. So an attacker might be able to pinpoint which blocks are identical in the plaintext. That’s usually illustrated using the image of Tux, the Linux penguin: if you encrypt Tux in ECB mode, then the corresponding encrypted image will have a similar shape.

JP Aumasson 00:21:25 So what you want to do instead is use a mode that is a bit more sophisticated, for example the CBC mode, cipher block chaining, and this has nothing to do with blockchains. It just chains the blocks by taking the ciphertext of the first block and doing an XOR with the subsequent block, the next block, and so on. And you have another mode, which is called the counter mode, and it is quite different. It actually doesn’t work on the message blocks. What the counter mode is doing is creating a stream cipher based on a block cipher, and it’s a completely different approach.
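The ECB weakness and the effect of chaining can be shown with a stand-in block cipher. The sketch below assumes a toy 4-round Feistel network built from SHA-256 in place of AES, only because the Python standard library has no AES; all function names are hypothetical, padding is left out, and nothing here is meant for real use.

```python
import hashlib
import secrets

BLOCK = 16  # bytes, the same block size as AES

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation on one 16-byte block (stand-in for AES)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """ECB: every block is encrypted independently of the others."""
    return b"".join(toy_encrypt_block(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """CBC: each block is XORed with the previous ciphertext block first."""
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(prev, plaintext[i:i + BLOCK]))
        out.append(prev)
    return b"".join(out)

if __name__ == "__main__":
    key, iv = secrets.token_bytes(16), secrets.token_bytes(BLOCK)
    message = b"A" * 32                      # two identical 16-byte blocks
    ecb = ecb_encrypt(key, message)
    cbc = cbc_encrypt(key, iv, message)
    print(ecb[:16] == ecb[16:])              # ECB leaks the repetition
    print(cbc[:16] == cbc[16:])              # CBC hides it
```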

Justin Beyer 00:22:00 Okay. So essentially never use ECB anymore, if you can avoid it. And then you’re looking at either CBC or CTR mode when you’re using something like AES.

JP Aumasson 00:22:11 Yeah, unless it’s a very specific case, when you encrypt small pieces of data. A funny story: we were doing a security audit with other people, and my colleague, another cryptographer, said he had found a critical vulnerability, they use ECB. And I asked him, okay, what do they encrypt with ECB? Ah, they encrypt four to eight characters. I said, well, that’s fine, you don’t need another mode for that. So that’s the only exception.

Justin Beyer 00:22:41 Yeah, I’m sure there’s probably somewhere in the PCI PIN pad standard where ECB is still acceptable for small data inputs. But we’ve discussed these modes of operation, and you mentioned how in a block cipher everything’s split up into these nice little blocks of a certain size, 16 bytes. So what do you do if your message doesn’t fit nicely into that?

JP Aumasson 00:23:04 Yeah. So what do you do? You’re going to use a mode like CBC or the counter mode, and maybe I can briefly explain how counter mode works, because it’s quite interesting to see how you can create a stream cipher from any block cipher. The way it works, you are not going to split the message into blocks at all. You’re first going to generate a stream of pseudorandom bits, like you would do with a stream cipher. And the way you do it: your first block will be, for example, the value zero, and you would encrypt the value zero with your key, and then you’re going to encrypt one, you’re going to encrypt two, and so on, as blocks of 16 bytes, and you’re going to take the results as a long stream of bytes.

JP Aumasson 00:23:44 And then you’re just going to XOR this stream of bytes with your message. There’s just one catch here. If you encrypt multiple messages, you don’t want to have the same stream of bytes, as it would be insecure. So instead you’re going to use a nonce, a number that you’re going to use only once, for each encryption instance. And instead of just encrypting zero and one and two and three and so on, you’re going to encrypt, for the first block, the combination of zero with your nonce, which is sometimes called the IV. And this will ensure that the pseudorandom stream you are going to generate will be different for every message to be encrypted. And that’s how the AES counter mode works. And that’s how AES-GCM works. You might be familiar with GCM, which is the standard used pretty much everywhere, in TLS and in SSH.
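Counter mode as described above can be sketched as follows. The assumptions are loud ones: HMAC-SHA256 truncated to 16 bytes stands in for the AES block encryption (counter mode only ever runs the cipher in the forward direction), the input block is a 12-byte nonce followed by a 4-byte counter, and every name is hypothetical.

```python
import hashlib
import hmac

BLOCK = 16
NONCE_LEN = 12

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_counter_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES: a keyed function from 16 bytes to 16 bytes."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def ctr_keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Encrypt nonce||0, nonce||1, nonce||2, ... and concatenate the results."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be 12 bytes in this sketch")
    out, counter = b"", 0
    while len(out) < n:
        out += encrypt_counter_block(key, nonce + counter.to_bytes(4, "big"))
        counter += 1
    return out[:n]

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt any length of data; no padding is needed."""
    return xor(data, ctr_keystream(key, nonce, len(data)))

if __name__ == "__main__":
    key, nonce = b"k" * 32, b"n" * NONCE_LEN
    c1 = ctr_xor(key, nonce, b"first message!!")
    c2 = ctr_xor(key, nonce, b"second message!")   # nonce reused: a mistake
    # With a repeated nonce the keystream cancels out and leaks p1 XOR p2.
    print(xor(c1, c2) == xor(b"first message!!", b"second message!"))
```

Because the message is only XORed with the keystream, its length does not have to be a multiple of the block size, which is one answer to the question about messages that do not fit neatly into blocks.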

Justin Beyer 00:24:34 Yeah. So I’ve heard AES-GCM thrown around. I believe it’s the Galois/Counter Mode. Can you talk a little bit about why you might use GCM mode?

JP Aumasson 00:24:44 Yeah. So you have very good reasons to use GCM. GCM is not just an encryption mode. It’s an authenticated encryption mode, which means that not only are you going to encrypt the message, but you’re also going to generate a MAC, a message authentication code, something that lets you verify that the message has not been modified. Listeners might be familiar with HMAC, a hash-based MAC, which is a way to produce values that are symmetric signatures, so to speak, a value that you regenerate when you receive the message to make sure it has not been modified. So AES-GCM is just a two-in-one algorithm whereby you both encrypt the message and authenticate it. You don’t have to combine AES-GCM with HMAC or any MAC whatsoever; it’s doing everything for you. And that’s a much simpler and much more efficient way than combining CBC and HMAC, for example.
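To show what authenticated encryption buys you, here is a sketch of the manual combination that GCM makes unnecessary: encrypt, then compute a MAC over the ciphertext, and refuse to decrypt if the tag does not match. It is not GCM. It assumes a toy SHAKE-256 keystream for encryption and HMAC-SHA256 for the tag, with separate keys, and `seal` and `open_box` are hypothetical names.

```python
import hashlib
import hmac
import secrets

NONCE_LEN, TAG_LEN = 12, 32

def _xor_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream encryption (stand-in for a real cipher)."""
    ks = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt-then-MAC: returns nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(NONCE_LEN)
    ciphertext = _xor_keystream(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def open_box(enc_key: bytes, mac_key: bytes, box: bytes) -> bytes:
    """Verify the tag first; only decrypt if the message is untouched."""
    if len(box) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    nonce, ciphertext, tag = box[:NONCE_LEN], box[NONCE_LEN:-TAG_LEN], box[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _xor_keystream(enc_key, nonce, ciphertext)

if __name__ == "__main__":
    ek, mk = secrets.token_bytes(32), secrets.token_bytes(32)
    box = seal(ek, mk, b"attack at dawn")
    print(open_box(ek, mk, box))
```

An authenticated mode such as AES-GCM, available in maintained third-party libraries, provides both steps in a single call with a single key.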

Justin Beyer 00:25:38 Yeah. And that’s actually probably a good point to start talking about hashing, and then working into that discussion on keyed hashing and how that impacts things like these authenticated schemes, whether you’re using HMAC in combination or AES-GCM. So just to start off and set that base level, how does a hashing algorithm work? I mean, you commonly hear, oh, I’m going to hash this piece of data for the purpose of integrity, but how does that work under the hood?

JP Aumasson 00:26:02 So I would say it’s maybe the simplest type of algorithm you can imagine and design in crypto, because it’s just about taking a piece of data of any size, and from this piece of data producing a value of fixed, short size, typically 256 bits in the case of SHA-256; BLAKE2 is kind of the same. And under the hood, it’s pretty much the same kind of approach you’re going to find in a block cipher. These are operations where you mix bits in a way that is highly nonlinear, and in cryptography nonlinear usually means secure, which means that you have a very strong, very complex relationship between the input bits and the output bits. A hash function behaves randomly, but that does not mean that it’s probabilistic. If you hash the same input twice, you will have the same output twice; the algorithm is deterministic. But it behaves in such a way that you cannot predict the output if you don’t know the input at all. So let’s say you have two different messages that have only one bit of difference. I may know the hash of the first one, but that will not tell me anything about the hash of the second one, because it would be completely different from the first hash. So any small change in the input will make a huge difference in the output. It behaves in this kind of unpredictable way that you need when you build a MAC, for example.
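Both properties, determinism and a large output change for a one-bit input change, are easy to observe with SHA-256 from Python’s `hashlib`. This is a minimal sketch; `flip_bit` and `sha256_bits_changed` are hypothetical helper names.

```python
import hashlib

def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with bit number `index` flipped."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)

def sha256_bits_changed(a: bytes, b: bytes) -> int:
    """Number of differing bits between SHA-256(a) and SHA-256(b)."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

if __name__ == "__main__":
    msg = b"hello world"
    # Same input, same 256-bit output: the function is deterministic.
    print(hashlib.sha256(msg).hexdigest() == hashlib.sha256(msg).hexdigest())
    # One flipped input bit typically changes around half of the output bits.
    print(sha256_bits_changed(msg, flip_bit(msg, 0)), "of 256 bits differ")
```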

Justin Beyer 00:27:26 Yeah. And usually when I hear hashes discussed, there are two different families of them: a compression hash versus a sponge. Can you explain what those are and how they differ?

JP Aumasson 00:27:39 Yeah, sure. So what you refer to is the way you construct the hash function internally, because if you look at how it works, when you’re going to process a big message, you’re not going to directly read the message and do something cryptographic with this gigantic message. What you’re going to do instead is, like with a block cipher, split the message into blocks, and you’re going to process each of these blocks iteratively in your hash function. As for the two types you refer to: historically we’ve built hash functions using what we call a compression function. A compression function is like a hash function, but with a fixed-size input. For example, if you take the case of SHA-256, it takes two inputs. The first input is what we call the chaining value, which is essentially the state of the hash function, and it is 32 bytes.

JP Aumasson 00:28:29 And it would take a second input, which would be the message block, and this message block would be 64 bytes, if I’m not mistaken, and the output would be another 32-byte value. So you’re going to compress a combination of the state and the message block to get another state value. And you’re going to iterate this over all the message blocks. Internally you use something that is quite similar to a block cipher. The other hash functions, like SHA-3, also called Keccak, work a bit differently. They rely on something called a sponge construction, where what’s used internally is not technically a block cipher, but a permutation. So let’s say a block cipher is just a permutation with a key, and a permutation is just a block cipher without a key. So it relies on the same types of bitwise transformations; the difference is that the permutation is like a block cipher without a key, which means it’s a bit simpler to design, and it has to have different security properties.
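The sponge idea, absorbing message blocks into part of a state and running an unkeyed permutation in between, can be sketched in Python. This is a toy and not SHA-3: the permutation borrows the add-rotate-xor quarter-round used in ChaCha, while the round count, round constants, padding, and the 32-byte rate are assumptions invented for this example, and the names are hypothetical.

```python
import struct

RATE = 32            # bytes absorbed or squeezed per permutation call
STATE_LEN = 64       # 16 words of 32 bits; the other 32 bytes are the capacity
MASK = 0xFFFFFFFF

def rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(s, a, b, c, d):
    """ChaCha-style add-rotate-xor mixing of four state words, in place."""
    s[a] = (s[a] + s[b]) & MASK; s[d] = rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK; s[b] = rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK; s[d] = rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK; s[b] = rotl(s[b] ^ s[c], 7)

def permute(state: bytes) -> bytes:
    """Toy unkeyed permutation of the 64-byte state."""
    s = list(struct.unpack("<16I", state))
    for rnd in range(8):
        s[0] ^= rnd + 1                       # round constant (assumption)
        for col in range(4):                  # mix columns of the 4x4 word grid
            quarter_round(s, col, col + 4, col + 8, col + 12)
        for i in range(4):                    # then mix diagonals
            quarter_round(s, i, 4 + (i + 1) % 4, 8 + (i + 2) % 4, 12 + (i + 3) % 4)
    return struct.pack("<16I", *s)

def sponge_hash(message: bytes, out_len: int = 32) -> bytes:
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % RATE)
    state = bytes(STATE_LEN)
    for i in range(0, len(padded), RATE):     # absorbing phase
        block = padded[i:i + RATE]
        mixed = bytes(x ^ y for x, y in zip(state[:RATE], block))
        state = permute(mixed + state[RATE:])
    out = b""
    while len(out) < out_len:                 # squeezing phase
        out += state[:RATE]
        state = permute(state)
    return out[:out_len]

if __name__ == "__main__":
    print(sponge_hash(b"hello").hex())
```

Only the first 32 bytes of the state ever touch the message or the output; the remaining bytes are never exposed directly, which is the role of the capacity in a sponge.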

Justin Beyer 00:29:27 Okay. So essentially when we’re inputting a key into something, it’s used as a way to almost randomize the output, in a way that you’re not necessarily worried about from a permutation perspective with hashing, because you want the same input file to result in the same output, whereas with encryption you don’t necessarily want that, right?

JP Aumasson 00:29:46 Yes, just maybe one detail. When I said that you use a block cipher in the compression function of a hash function, the important detail is that the key of the block cipher is not secret. We use the key slot, for example, for the message block or for the chaining value, but it’s not necessarily a secret value. You can also construct hash functions where you take a key as a parameter, and that’s a technique that you would use to build a MAC, a message authentication code, or a pseudorandom function, but the key will be processed a bit differently.

Justin Beyer 00:30:16 Yeah. And now that you mentioned the message authentication code, how does that differ from traditional hashing when you’re doing keyed hashing, which, as you mentioned, is the same for a compression function no matter what? Is it just that with something like a keyed hash, you know, doing HMAC, you’re using a secret key instead?

JP Aumasson 00:30:37 So historically we wanted to build keyed hash functions, to get MACs, to get pseudorandom functions, but we didn’t have any, and cryptographers thought about the idea of just taking a hash and, instead of just hashing a message, concatenating a key and the message, hashing the message with the key, and then having a PRF or a MAC. But they realized that this naive construction was not secure, in theory at least. So they set out to design something for which they could have a nice mathematical security proof that the design fulfills all the security goals you want to achieve. And they ended up with HMAC, hash-based MAC, where you combine two hash function calls. You have this nested construction whereby you compute a first hash, and then you hash a second piece of data that includes the first hash. So from my perspective, it’s a bit over-engineered. Historically we had it because we wanted to have these security proofs, but now, if you look at modern hash functions like BLAKE2, you don’t need the HMAC construction, because you already have a keyed mode designed specifically with the hash function. So if you use BLAKE2, you can do HMAC, but you can directly use the keyed version of BLAKE2, which would be as secure as HMAC.
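The nested structure of HMAC, and the simpler keyed mode of BLAKE2, can be compared with Python's standard `hashlib` and `hmac` modules. The sketch below spells out HMAC-SHA-256 by hand (two hash calls, the second one covering the output of the first) and checks it against the library; the key and message are placeholder values.

```python
import hashlib
import hmac

def hmac_sha256_by_hand(key: bytes, message: bytes) -> bytes:
    """HMAC as two nested hash calls: H(outer_key || H(inner_key || message))."""
    block_size = 64  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(block_size, b"\x00")     # then padded to one block
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()

key = b"demo key, not a real secret"   # placeholder value
message = b"hello"

tag_by_hand = hmac_sha256_by_hand(key, message)
tag_stdlib = hmac.new(key, message, hashlib.sha256).digest()

# BLAKE2 has a built-in keyed mode, so no nested construction is needed.
tag_blake2 = hashlib.blake2b(message, key=key, digest_size=32).digest()
```

When verifying a received tag, `hmac.compare_digest` is the usual way to compare it with the expected value.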

Justin Beyer 00:32:03 Okay. So as time has gone on, as with most things, the way that we approach it is changing: HMAC is starting to get less important, and instead you can use these more modern hashing functions to achieve the same security goals, which we are going to talk about a little bit later when we start to discuss how an algorithm is proved out, provable security, and those concepts. But I just want to cover a couple more things before we go into that discussion. So we’ve discussed using hash functions, and we’ve discussed them for data integrity. Now, one common use I usually hear about is the concept of hashing a password and using a salt. Why would you want to do that? What are some situations in which you wouldn’t want to do that? What are some algorithms you would want to use versus not use? Yeah.

JP Aumasson 00:32:50 Yeah. That’s a great point, and maybe the most important question in the whole discussion. I think what’s important here is that even if secure hash functions such as BLAKE3 may be secure in the context of general cryptographic applications, maybe collision resistance, signature systems, you name it, you don’t want to use these general-purpose cryptographic hashes for hashing passwords. And the reason is that a cryptographic hash is designed to be fast, because most of the time that’s what you want. You want speed. You want to, you know, protect the environment and do as few computations as possible. But the context of passwords is quite different, because the stuff you’re hashing, passwords, has low entropy, if we want to refer to entropy again. And it means that if the hash function is very fast, very efficient, then an attacker that knows a hash might be able to brute-force your password by trying many candidate passwords repeatedly.

JP Aumasson 00:33:44 So what you want instead is something that we call a password hashing function, and it’s designed specifically to be used with passwords, where you want the function to be relatively slow, and also to use a fair amount of memory to make password cracking much harder, even if you have a GPU, an FPGA, or another specific device. So what you are going to use instead is something like PBKDF2, which is a NIST standard for key derivation. So it’s called key derivation, which is quite similar to password hashing. And now we have modern algorithms such as Argon2. Argon2 was designed specifically to be secure against GPUs and FPGAs. And the way it works is that it takes not only a password; it takes another value that we call a salt. So for each password you’re going to hash, you want to use a different salt. And it also comes with two additional arguments. One is for the speed: how many iterations you want to do, how slow you want the hash function to be. And the last argument is the memory usage, because you want to make the function not only slow, but to force the computation to use, for example, 10 megabytes, or even a gigabyte of memory, to make it even harder to crack with password cracking techniques.
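The three inputs mentioned here (salt, time cost, memory cost) can be seen in Python's standard library. Argon2 is not part of the standard library, so this sketch uses `hashlib.scrypt` as a stand-in memory-hard function next to `hashlib.pbkdf2_hmac`; the parameter values are illustrative assumptions, not recommendations, and `hashlib.scrypt` requires a Python build linked against a recent OpenSSL.

```python
import hashlib
import os

password = b"correct horse battery staple"   # example password only
salt = os.urandom(16)                        # fresh random salt per password

# PBKDF2: the only cost knob is the iteration count (time).
pbkdf2_hash = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

# scrypt (stand-in for a memory-hard function such as Argon2):
# n controls both time and memory; memory use is roughly 128 * n * r bytes.
n, r, p = 2**12, 8, 1
scrypt_hash = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=32)
approx_memory_bytes = 128 * n * r            # about 4 MiB with these values
```

The salt and the cost parameters are stored alongside the hash so that the same computation can be repeated at login time.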

Justin Beyer 00:35:02 Okay. So with the discussion around using these key derivation functions with salts and such, the goal is to get, one, an algorithm that’s hard to brute-force, and two, using the salt gives you that precomputation resistance, where someone can’t precalculate an entire table of hashes and then just match the two up. Yeah,

JP Aumasson 00:35:23 Exactly. That’s a great summary. You’re going to use a salt because otherwise the attacker knows in advance which hash function you’re going to use. So they know in advance that if you hash the password 1234, you’re going to end up with a specific hash value. So the salt is just a simple way to simulate the use of different hash functions, because instead of using different hash functions, you just have to use different salts and the hash will be computed differently, so the attacker doesn’t know in advance what he’s up against.
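A short sketch of the effect of the salt: the same password, hashed under two different salts, produces two unrelated values, so a table precomputed for one salt is useless for the other. The helper names (`hash_password`, `new_record`, `verify`) and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os

def hash_password(password: bytes, salt: bytes) -> bytes:
    # Iteration count is an illustrative value, not a recommendation.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 50_000)

def new_record(password: bytes):
    """Return (salt, hash) as it might be stored for one user."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def verify(password: bytes, salt: bytes, stored_hash: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)

# Two users pick the same weak password; the stored hashes still differ.
salt_a, hash_a = new_record(b"1234")
salt_b, hash_b = new_record(b"1234")
```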

Justin Beyer 00:35:55 So now that we’ve laid this foundation, we’ve discussed encryption, we’ve discussed hashing, and we’ve hinted at it throughout the show, I want to start talking about security goals and how we actually validate algorithms as cryptographically secure. So can you discuss the difference between informational and computational security in cryptography?

JP Aumasson 00:36:22 Yeah, sure. Maybe one example, maybe the simplest one, is hash functions. So let’s say I have some data or some passphrase I want to hash, and I hash the passphrase. Then I give you the hash, and let’s say you have plenty of time on your hands. You even have infinite time and infinite computing power. So the question is, can you recover the password or the passphrase I hashed? And the answer is no, because by what we call the pigeonhole principle, since there are many more possible inputs to the hash than possible outputs, there will be, for each possible output, many different inputs. So even if you have infinite computing power, you’re never going to find the input that I used unless you already know something about it.

JP Aumasson 00:37:14 That’s what we call informational security, which is security even if the attacker has an infinite amount of computing power and infinite time. Computational security is just a way to say, okay, if you have infinite computing power, you might find a solution to my problem, but in practice you will need to carry out a huge amount of computation that will take you millions of years. So we just accept the risk and we say it’s secure. So maybe I can briefly explain what it means in the context of encryption. Say I have an encryption key of 128 bits. You can of course try all the possible keys, but you will need to iterate over 2 to the power of 128 possible keys, which is a number... I’m trying to think of a comparison. There have been only 2 to the 80 or 2 to the 90-something nanoseconds since the Big Bang, like 14 billion years ago. So that’s quite a high amount of computation.
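Both arguments can be checked with a few lines of Python. The first part illustrates the pigeonhole principle with a deliberately tiny hash (the first byte of SHA-256, an assumption made so that collisions are easy to find). The second part redoes the back-of-the-envelope arithmetic, assuming an age of the universe of about 13.8 billion years and a hypothetical attacker testing 10^12 keys per second.

```python
import hashlib
import math

def one_byte_hash(data: bytes) -> int:
    """Toy hash with only 256 possible outputs."""
    return hashlib.sha256(data).digest()[0]

def find_collision():
    """Among 257 distinct inputs, two must share one of the 256 outputs."""
    seen = {}
    for i in range(257):
        data = i.to_bytes(2, "big")
        h = one_byte_hash(data)
        if h in seen:
            return seen[h], data
        seen[h] = data
    raise AssertionError("unreachable: 257 inputs cannot fit in 256 outputs")

first, second = find_collision()

# Computational security: how big is 2**128 in practice?
SECONDS_PER_YEAR = 365.25 * 24 * 3600
ns_since_big_bang = 13.8e9 * SECONDS_PER_YEAR * 1e9   # assumed age: 13.8e9 years
log2_ns = math.log2(ns_since_big_bang)                # between 88 and 89

guesses_per_second = 1e12                             # hypothetical attacker speed
years_for_2_128 = 2**128 / guesses_per_second / SECONDS_PER_YEAR
```

Under these assumptions, the number of nanoseconds since the Big Bang is a bit under 2^89, and exhausting a 128-bit key space would take on the order of 10^19 years.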

Justin Beyer 00:38:21 So essentially it all comes down to, you’re kind of taking two views of an attacker. They either have infinite computing or you’re kind of looking at it from the realist point of view where you’re saying realistically, no one has infinite computing power. What’s the limit that someone would need to get to, to actually be able to successfully break this algorithm.

JP Aumasson 00:38:42 Yeah. So usually we consider that 128 bits of security is enough. Generally, 128-bit security just means that you need to carry out on the order of 2 to the 128 operations, whatever your operation is. So if you look at AES with a 128-bit key, then you’re fine. If you look at elliptic curve cryptography with a curve of 256 bits, then you would get 128 bits of security. One big difference is RSA. Of course, if you have an RSA key of 1024 bits, you’re not going to have 1024 bits of security. You’re just going to have a key of this size, but the security level would be much lower. So for example, what you get with RSA-1024 is, I think, at most approximately 80 bits of security or even lower, I think it’s much lower than this. So obviously you don’t want to use RSA-1024; you want to use 2048.

Justin Beyer 00:39:41 Okay. And that just has to do with how the underlying mathematics of the algorithm and key derivation are occurring. It’s just the fact that RSA with the nature of it, it’s not a direct correlation like it is with a symmetric algorithm.

JP Aumasson 00:39:56 Yeah. This is a completely different thing. A key for a stream cipher or for a block cipher is just any string of the right size. But if you take the case of RSA, for example, the key is not any string; it is a specific type of number with specific properties. So specifically prime numbers; it’s not any value of that size that would be a valid key. And the best attacks to break RSA and the best attacks to break elliptic curve crypto are not just brute-force-like attacks. They are attacks that really exploit the mathematical structure of the algorithm. So if you want to break RSA, you don’t brute-force all the numbers. You use something much smarter, which is called the number field sieve.
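A rough feel for why RSA key sizes and security levels differ can be had from the standard heuristic running time of the general number field sieve, exp((64/9)^(1/3) · (ln n)^(1/3) · (ln ln n)^(2/3)). The function below evaluates that formula while ignoring its lower-order terms, so it should be read as a ballpark only; the commonly cited figures (about 80 bits for RSA-1024 and about 112 bits for RSA-2048) come out somewhat lower than this estimate.

```python
import math

def gnfs_security_bits(modulus_bits: int) -> float:
    """Heuristic log2 of the number field sieve cost for an RSA modulus.

    Assumption: the o(1) term of the asymptotic formula is dropped,
    so the result is an approximation, not a precise security level.
    """
    ln_n = modulus_bits * math.log(2)
    c = (64 / 9) ** (1 / 3)
    ln_work = c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)
    return ln_work / math.log(2)

rsa_1024_bits = gnfs_security_bits(1024)   # roughly 87
rsa_2048_bits = gnfs_security_bits(2048)   # roughly 117
```

Doubling the modulus size adds only about 30 bits of estimated security, which is very different from a symmetric key, where each extra key bit doubles the brute-force cost.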

Justin Beyer 00:40:50 So why do you want to use something like elliptic curve over something like RSA?

JP Aumasson 00:40:55 Yeah, that’s a very good question. And there are a lot of little debates in the crypto community, RSA versus elliptic curves. So the difference is that you cannot do exactly the same things. RSA lets you do encryption and signatures. With encryption, you take a plaintext, you do the RSA operation, and you end up with a ciphertext. You cannot do this with elliptic curves. Curves can be used to encrypt stuff, but indirectly: essentially you would do a kind of Diffie-Hellman, and then you would use the shared secret to encrypt the message you want to encrypt. Both RSA and elliptic curve crypto can be used to compute signatures, but with a big difference. With RSA, you have a private key and a public key, and to sign a message you use the private-key operation, which is much slower than the one you would do with elliptic curve crypto. However, the verification time, when you verify the signature, is much faster with RSA than with Ed25519. So if you really care about signature verification speed, and if it’s really important in a product, then you might prefer to use RSA signatures rather than elliptic curve signatures. But what we see, you know, in most applications, for example blockchain applications, is that

Justin Beyer 00:42:20 Okay. Each of them gives you different benefits of speed and cost, depending on what functionality you’re trying to use with the algorithm.

JP Aumasson 00:42:28 Yeah. And in both cases, you have to be very careful about how you implement or use the algorithm, and you must pay attention to side-channel attacks, which are about exploiting some physical information such as the execution time. With both RSA and elliptic curves, if the execution time depends on the private key, then you have quite powerful attacks that can be used to forge signatures or to directly recover the private key.
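A toy model of the timing issue: in textbook square-and-multiply exponentiation, the number of multiplications depends on how many 1 bits the secret exponent has, whereas a ladder-style loop performs the same amount of work for every bit. The sketch counts modular multiplications instead of measuring time; the small modulus and exponents are made-up values, and Python code like this is not actually constant-time, so it only illustrates the idea.

```python
def square_and_multiply(base: int, exponent: int, modulus: int):
    """Textbook exponentiation; the operation count leaks the exponent's bits."""
    result, operations = 1, 0
    for bit in bin(exponent)[2:]:
        result = result * result % modulus
        operations += 1
        if bit == "1":                      # extra work only for 1 bits
            result = result * base % modulus
            operations += 1
    return result, operations

def ladder(base: int, exponent: int, modulus: int):
    """Montgomery-ladder style loop: two multiplications for every bit."""
    r0, r1, operations = 1, base % modulus, 0
    for bit in bin(exponent)[2:]:
        if bit == "1":
            r0, r1 = r0 * r1 % modulus, r1 * r1 % modulus
        else:
            r0, r1 = r0 * r0 % modulus, r0 * r1 % modulus
        operations += 2
    return r0, operations

modulus = 1019                                          # small example modulus
sparse_key, dense_key = 0b10000000, 0b11111111          # same length, different weight

_, naive_sparse_ops = square_and_multiply(5, sparse_key, modulus)   # 9 operations
_, naive_dense_ops = square_and_multiply(5, dense_key, modulus)     # 16 operations
_, ladder_sparse_ops = ladder(5, sparse_key, modulus)               # 16 operations
_, ladder_dense_ops = ladder(5, dense_key, modulus)                 # 16 operations
```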

Justin Beyer 00:42:59 Okay. And we’re going to actually kind of discuss that when we start talking about black box and gray box attack models, but just before we get there, can you kind of discuss how an algorithm in the crypto community gets proven as secure?

JP Aumasson 00:43:14 So, breaking news: we never actually prove the security of an algorithm. We just prove that breaking one algorithm is at least as hard as breaking another algorithm, or at least as hard as solving a math problem that we hope, we believe, to be hard enough. So like the factoring problem for RSA, or the discrete logarithm or Diffie-Hellman problem for elliptic curve crypto. Sometimes, you know, the marketing department goes a bit overboard and says, oh, this has been proven secure, whereas in reality what’s been proven is just, let’s say, security against one type of attack, or one specific security notion. In the case of symmetric cryptography, you rarely have a full-fledged security proof. So most of the time we just rely on the failed attempts by many cryptanalysts over many years.

JP Aumasson 00:44:10 But since we don’t have the mathematical structure that you have in RSA or elliptic curves, you have no real way to prove stuff, because it’s a complete mess. You have a very complex relation between the input and the output, but you don’t have a math structure such as exponentiation, finite fields and stuff that you can leverage to say something about the security, and that’s what you have with public-key crypto. And then again, you might have a proof on paper, but your system might end up being completely broken in practice because of a software bug or because of bad parameters. So it can be a bit misleading to say, oh, our algorithm is provably secure, when in practice it’s a different game.

Justin Beyer 00:44:50 Yeah. Traditionally, I hear the example of things like elliptic curve, where if you select a bad curve or a non-secure curve, it defeats the entire purpose of the algorithm. Even though you might have a provably secure algorithm, it comes back to the implementation and the parameters that you’re using for it.

JP Aumasson 00:45:06 Totally, such as the type of curve you want to use. So the first thing you learn at elliptic curve school is not to design your own curve, and to use a standardized curve. So don’t negotiate the curve parameters. And even then, I think last week I looked at a very nice article about one of the most widely used elliptic-curve-based signature schemes, called Ed25519, which is the state-of-the-art signature scheme based on Curve25519, which is safe and really reliable, and used pretty much everywhere. But if you look at the standards, if you look at the different specifications, they will tell you different things regarding testing and input validation. And in this article, I think they looked at maybe five or six different implementations, and all the implementations behave differently in terms of input validation and what type of signatures are going to be verified. So you also have to be very careful when validating inputs, because you have specific public keys that should not be used. So if you receive a public key, you want to verify that it’s a valid one; otherwise you might end up with, let’s say, signatures that will be accepted when they should not be accepted, for example.

Justin Beyer 00:46:25 It’s kind of, again, as we’ve discussed already, that implementation issue, and I’ll definitely link that article in the show notes. And I’ll also link some information on what we discussed earlier with the number field sieve. But just changing directions a little bit here: you’ve mentioned the concept of security notions, and we’ve discussed, you know, side-channel attacks. And I mentioned attack models before. Can you talk about what attack models are, and black box versus gray box?

JP Aumasson 00:46:52 Yeah, sure. Typically when we approach the security of a cipher, when we make a security claim, say you assess a cipher, you don’t just say, oh, it’s secure. You say it’s secure with respect to this security goal against this type of attacker. And when we say type of attacker, we have well-defined attack models. So in the case of encryption, maybe the weakest attacker we can imagine is the one that only sees ciphertexts. That’s what we call the ciphertext-only model. But in practice, real attackers typically can inject content, they can modify content, and they can drop packets on the network. So you’re dealing with active attackers, and to model active attackers we have different notions. We have what we call chosen plaintext attacks. The chosen plaintext attacker is the kind of attacker that we imagine has access to a black box, and this black box will give them the ciphertext for any plaintext they send to the box, so they can send any number, or any reasonable number, of plaintexts.

JP Aumasson 00:47:54 And they would get a ciphertext. In practice, you might not have access to such a black box because you don’t have access to the key; there may be some rate limiting in the system. Maybe you don’t see the full ciphertext, maybe you only see part of it. But if you can prove that the cipher is secure in this context, then it would be secure in any model where the attacker has less power than a full chosen plaintext attacker. And there’s an even stronger model that we call CCA, the chosen ciphertext attacker. This one makes little sense to people who are not familiar with cryptography, because they’re like, well, if you can decrypt any ciphertext, if you have this black box to which you can send any ciphertext and get the plaintext, then what’s the point of attacking the cipher at all, because you can decrypt the message you want. So that’s right, that makes sense. But you have some cases, for example DRM, where you may be able to decrypt what you want, but what you’re after is the key itself, not just the decryption capability. And then again, if you can prove a cipher secure against an active attacker that can decrypt anything they want, then it gives you a pretty high level of assurance in contexts where the attacker has fewer capabilities than this.

Justin Beyer 00:49:12 But what about if an attacker let’s say is able to compromise that key, let’s say that they do defeat your cipher. How do you prevent them from being able to then decrypt everything going back in history?

JP Aumasson 00:49:27 Yeah, I mean, interestingly, if you look at, for example, messaging protocols, like what you have in Signal or in WhatsApp, which is pretty much the same thing, you make the assumption that at some point the attacker might have access to your device, or might have access to some of the keys used to encrypt the messages, because in the Signal protocol each single message is encrypted using a different key. So now you have these notions of backward secrecy and forward secrecy, which are quite different. Forward secrecy is about ensuring that if the attacker has access to your system, to your current keys, then they will not be able to decrypt the past messages and they will not be able to compute the previous keys. That’s achieved, for example, by rotating keys, for example by hashing the previous key and getting another key, and so on.

JP Aumasson 00:50:19 So backward secrecy, backward security, which is also sometimes called post-compromise security or future secrecy, is a bit different. It’s like the dual notion, and much harder to achieve. It’s about an attacker that had access to your device, that compromised your system at some point, maybe they got a snapshot of your system: you don’t want them to be able to predict the future keys in your system. That’s achieved quite differently. For example, if you have a full compromise of your Signal or WhatsApp state, then in theory you can achieve this kind of security notion, but you need some different assumptions. And that’s quite different from what we discussed before with CPA and CCA. That’s the security of a protocol versus the security of an encryption algorithm.
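The "hash the previous key to get the next key" idea can be sketched as a simple one-way key chain. This is a minimal illustration, not the Signal ratchet: the `next_key` helper, the `b"ratchet"` label, and the initial secret are assumptions made for the example.

```python
import hashlib

def next_key(current_key: bytes) -> bytes:
    """Derive the next message key by hashing the current one (one-way step)."""
    return hashlib.sha256(b"ratchet" + current_key).digest()

# Hypothetical starting point; in a real protocol this would come from a key agreement.
key = hashlib.sha256(b"initial shared secret (example)").digest()

message_keys = []
for _ in range(3):
    message_keys.append(key)   # use this key for one message...
    key = next_key(key)        # ...then move forward and forget the old key
```

Someone who steals the current `key` cannot run the hash backwards to recover earlier message keys, which is the forward secrecy property, provided old keys are really erased. They can, however, compute every later key by hashing forward, which is why the post-compromise notion needs fresh secret input and cannot be obtained from hashing alone.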

Justin Beyer 00:51:15 Okay. So when you hear those things like perfect forward secrecy, or as you kind of mentioned, you know, the reverse of that, those are more about the actual protocol implementation of something versus what we’re talking about, where you have chosen ciphertext attacks or known plain text attacks, or what have you, where you’re actually trying to defeat the underlying cipher. Yeah,

JP Aumasson 00:51:36 Totally. And if you look at this relatively simple case of messaging encryption, it looks quite simple. I send you a message, you send me a message. I mean, how hard is it going to be? But if you dig under the hood, it’s terribly complex. You have a symmetric encryption scheme, you have Diffie-Hellman. And you also want to manage the case where, you know, you make the assumption that you have guaranteed delivery, but at some point one party may be offline. So you have to queue the messages and make sure no message is lost. And you want to some extent to guarantee that you will decrypt messages in the same order as they were sent to you. So you have a lot of things to deal with, and it’s a really complex cryptographic protocol internally. It also makes it harder to prove the system secure, I guess.

Justin Beyer 00:52:25 Yeah, because there’s so many moving parts, it’s kind of hard to just say, well, this big number math problem that we know is hard is, you know why this whole algorithm is completely safe. Whereas it’s more about how they interact with one another. Yes.

JP Aumasson 00:52:40 Totally. And that’s what we call composition in the context of crypto. You might have a protocol A that is secure and a protocol B that is secure individually, but when you combine them, because of some interaction between the two protocols, the combination is not as secure as what you would expect. And I think one of the tools used to prove the security of this is formal verification techniques, which give you a mathematical proof that your protocol, under some reasonable assumptions, will behave in a secure way. And we have this kind of proof, for example, for the Signal protocol.

Justin Beyer 00:53:16 Okay. And I’ll actually try to get some documentation and include that in the show notes, in case there’s any listeners that are interested, but we’ve kind of discussed security goals a bit. Can you kind of dive into some examples of what those might be?

JP Aumasson 00:53:29 Yeah. Great point, I was thinking about it actually. So we talked about security in the CPA model, with chosen plaintext attackers, and about chosen ciphertext attacks, but what does it mean to be secure? Maybe the intuitive notion is that you should not be able to learn anything about the plaintext from the ciphertext. But more precisely, how do you define this in mathematical terms? The definition that cryptographers use is what they call semantic security, which is technically the same as what they call indistinguishability in the chosen plaintext model. To explain this in very simple terms, it just means that if you see a ciphertext, and even if you can do chosen plaintext queries, the attacker should not be able to determine any single bit of the plaintext. The only thing they should learn is the approximate size of the message that was encrypted, because the ciphertext is approximately the same size as the plaintext, but otherwise they should not be able to say anything about what the plaintext looks like, whether it’s, you know, only zeros or only ones. And that’s why it’s called indistinguishability, because the way we formalize it is by saying that ciphertexts, encrypted messages, should look the same as random values. So we prove that an encryption scheme is secure by coming up with a mathematical proof that ciphertexts are indistinguishable from truly random numbers, truly random bytes.
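The indistinguishability notion is often phrased as a game: the attacker picks two messages, receives the encryption of one of them chosen at random, and has to guess which one, with access to an encryption oracle. The sketch below plays that game against two toy schemes built from a hash-based keystream (an assumption for illustration, not a real cipher): one is deterministic, the other uses a fresh random nonce per message.

```python
import hashlib
import os
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_deterministic(key: bytes, plaintext: bytes) -> bytes:
    stream = keystream(key, bytes(16), len(plaintext))      # fixed nonce
    return bytes(p ^ s for p, s in zip(plaintext, stream))

def encrypt_randomized(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)                                   # fresh nonce each time
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def attacker_wins(encrypt, key: bytes, m0: bytes, m1: bytes) -> bool:
    """One round of the game; the attacker only uses the encryption oracle."""
    b = secrets.randbelow(2)
    challenge = encrypt(key, (m0, m1)[b])
    guess = 0 if encrypt(key, m0) == challenge else 1        # chosen-plaintext query
    return guess == b

def win_rate(encrypt, rounds: int) -> float:
    key = os.urandom(32)
    wins = sum(attacker_wins(encrypt, key, b"yes", b"no!") for _ in range(rounds))
    return wins / rounds
```

Against the deterministic scheme this attacker always wins, so it is not semantically secure. Against the randomized one the same strategy does no better than a coin flip.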

Justin Beyer 00:55:02 Okay. So it’s kind of that, almost that entropy discussion again, but instead you’re talking about whether there’s a relation between the actual plaintext prior to encryption and the ciphertext after encryption.

JP Aumasson 00:55:15 Yeah, totally. And if you look at the whole point of the security of symmetric primitives, we talked about these fancy notions, semantic security, chosen ciphertext, you name it, but ultimately what you want to have is a transformation that behaves completely randomly, that behaves as if you were picking a random function from the set of all possible functions. If you talk in terms of mathematical notions, it means you should have no correlation between the input and the output. In the terms of a physicist, you should have no symmetry, which means that there should be no single property in the input that translates into another property in the output. So it should be completely random. But the flip side is that because you have such a property, you have something that behaves in a completely random, chaotic way, so you have no structure that you can exploit to prove formally that breaking it is a hard problem. The listeners might be familiar with the class of hard problems known as NP-complete problems, but you will not find any NP-complete problems within block ciphers or stream ciphers, because they don’t lend themselves to this kind of analysis.

Justin Beyer 00:56:28 Okay. So essentially with these symmetric primitives, it’s a lot harder to draw these direct math correlations to them. Okay. So kind of moving into a discussion on some active breakage of these ciphers, can you give some examples of some of the common attacks that you see against different ciphers, like block and stream ciphers and even hashing algorithms?

JP Aumasson 00:56:53 When we talk about security failures with symmetric algorithms, I don’t think I’ve ever seen a symmetric algorithm being broken in practice, in a really realistic scenario where it was broken by exploiting a flaw in the cipher. What we see fairly frequently is either bad implementations or keys that are poorly generated. For example, last week I found a cryptocurrency wallet that was using an AES key of the right size, a correct implementation of AES, but the way the key was computed from a password was quite weak, and you could brute-force the password and therefore determine the key. So that’s the kind of thing that we find. Another example is when you use AES-GCM, or when you use a stream cipher like ChaCha20: you want to use a key, as a secret parameter, and also a nonce that is unique each time you encrypt a message. Unique, of course, means it should be different each time. But what we sometimes observe is that this nonce is hard-coded in the source code, which means it is not unique, and then it defeats the purpose of using a stream cipher. So yeah, the upshot is that what you should be careful about is less the name of the algorithm you use than how you use it, how you generate the key, how you generate the parameters, and also how you test the whole thing.
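Why a hard-coded nonce is so damaging for a stream cipher can be shown with a toy example. The keystream below is built from SHA-256 purely for illustration (it is not ChaCha20 or AES-GCM), and the key, nonce, and messages are made-up values. With the same key and nonce, two messages are XORed with the same keystream, so XORing the two ciphertexts cancels the keystream entirely.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

key = hashlib.sha256(b"example key").digest()
hard_coded_nonce = bytes(12)          # the mistake: same nonce for every message

p1 = b"attack at dawn!!"
p2 = b"retreat at nine!"
c1 = encrypt(key, hard_coded_nonce, p1)
c2 = encrypt(key, hard_coded_nonce, p2)

leak = xor(c1, c2)                    # equals p1 XOR p2, no key needed
recovered_p2 = xor(leak, p1)          # knowing or guessing p1 reveals p2
```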

Justin Beyer 00:58:24 Okay. So it’s an implementation issue that’s occurring, and it’s kind of back on the developer to appropriately test their code. And we can discuss that a little bit later on, but as you mentioned the nonce, you know, or initialization vector, IV, being hard-coded, it reminds me of the Zerologon vulnerability pretty recently for Microsoft, where they had a hard-coded IV of all zeros. And I mean, it kind of goes to show that even these huge companies are making these kinds of oversights and mistakes in their code. But an initialization vector, an IV, and a nonce are actually two different things, and I hear them used interchangeably a lot. Just for the listener, can you clarify exactly what they are and what the difference between the two is?

JP Aumasson 00:59:11 Both terms are used a lot, but they’re quite different. So IV stands for initialization vector, or sometimes initial value. The main difference, in the context of encryption, is that for an IV, if you look at CBC mode, for example, the security requirement is that the IV should be unpredictable, which essentially means random, so that an attacker cannot predict the future IVs. And the nonce is quite different. It’s a number used only once, in the context of cryptography. It just means that you should not use the same value twice as long as you use the same key. So it doesn’t have to be random. You may just use a counter, like, you know, 1, 2, 3, 4, 5, or you can start from any place you want; you just don’t want it to repeat. If you use a random nonce, you might be fine, but you have to be careful that the nonce is not too short, because let’s say you use a nonce of 64 bits and take a random value each time. Then on average you will have a repetition of a nonce after about 2 to the 32 instances. So you should only use a random nonce if the nonce size is large enough. Typically, if it’s 128 bits, you should be good until about 2 to the 64 instances. But what you really care about is uniqueness, not randomness.
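The 2^32 and 2^64 figures follow from the birthday bound: with random b-bit nonces, the chance of at least one repeat among n messages is approximately 1 − exp(−n(n−1)/2^(b+1)). A small sketch of that approximation (the function name is chosen for the example):

```python
import math

def nonce_collision_probability(nonce_bits: int, messages: int) -> float:
    """Birthday-bound approximation of the chance that a random nonce repeats."""
    pairs = messages * (messages - 1) / 2
    return -math.expm1(-pairs / 2**nonce_bits)

p_64_bit = nonce_collision_probability(64, 2**32)        # about 0.39
p_128_bit = nonce_collision_probability(128, 2**32)      # about 2.7e-20
p_128_bit_many = nonce_collision_probability(128, 2**64) # about 0.39 again
```

With 64-bit random nonces, a repeat becomes likely around 2^32 messages under one key; with 128-bit nonces the same message count leaves a negligible probability, and the danger zone moves out to around 2^64 messages.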

Justin Beyer 01:00:40 So is there a recommended way that you would want to generate nonces?

JP Aumasson 01:00:45 One way you can do it: if you’re in a protocol where you have different packets and each packet has a unique sequence number, then you can directly use the sequence number of the packet as a nonce. So you don’t have to generate nonces and worry about how to manage them. And you know that the receiver on the other end will know this number, so you don’t have to transmit the nonce separately, because you know it’s equal to the sequence number. And that’s pretty much what TLS 1.3 is doing now.
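A sketch of a sequence-number nonce in the style of TLS 1.3, where the record sequence number is left-padded with zeros to the IV length and XORed with a per-connection static IV. The 12-byte IV value below is a made-up example; in the real protocol it is derived from the handshake secrets.

```python
def record_nonce(static_iv: bytes, sequence_number: int) -> bytes:
    """Per-record nonce: static IV XOR zero-padded big-endian sequence number."""
    padded = sequence_number.to_bytes(len(static_iv), "big")
    return bytes(a ^ b for a, b in zip(static_iv, padded))

static_iv = bytes.fromhex("000102030405060708090a0b")   # example value only

nonces = [record_nonce(static_iv, seq) for seq in range(4)]
```

Because both sides count records in the same order, neither has to send the nonce on the wire, and as long as the sequence number never repeats under a given key, neither does the nonce.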

Justin Beyer 01:01:14 So then kind of moving along here, how would a developer pick what cipher or mode or hashing algorithm they should be using in their application?

JP Aumasson 01:01:25 Okay. So ideally, if you use the type of API or library that people call modern, you would have a function, for example, that would be called encrypt or crypto box, where you don't even have to choose the underlying algorithm. You might not even know which one it is. I think in libsodium you have a function called crypto_box or crypto_secretbox, and you just give it a key and a message and it takes care of everything for you. However, it may not always be the case. You might have to choose which algorithm to use, or you might have to use a specific algorithm because you must implement a specific protocol, or your system might only support, I don't know, AES in CBC mode.

JP Aumasson 01:02:17 So what you want to pay attention to is the key size, how you generate the key, and the mode: don't use ECB. But if you have the choice of the library, what I generally recommend is to use a library that does all the work for you. You have libraries like, for example, I don't know, Crypto++ or Bouncy Castle, where you have a choice between secure algorithms and less secure ones. So if you have the choice, better pick a library like NaCl or libsodium, where you have no choice of algorithms and it will only offer you good modern ones, not stuff like RC4. By the way, don't use RC4. It was quite popular 20 years ago, but today it's not recommended.

Justin Beyer 01:03:10 Given the hypothetical situation here, let's say I can't pick my library and I have to pick an algorithm for something. Is that where I want to look at some type of standard, like a NIST standard or a FIPS standard, to kind of decide what the appropriate amount of security is for my data?

JP Aumasson 01:03:27 Yeah. I think if you stick to the NIST recommendations and the FIPS series of standards, usually you're good to go, because they follow pretty well the recent attacks and they deprecate the old insecure ciphers. If I had to choose just one, I think most of the time you're going to use AES, AES-GCM. What's important, and it's a mistake that I see quite often, is that people just encrypt stuff, whereas they ought to encrypt and authenticate the data. So if you can use AES-GCM, which will give you authenticity and integrity and also confidentiality, then you should do it, whether you encrypt, I don't know, database content or some network communication. You want to use AES-GCM or ChaCha20-Poly1305, which is another type of authenticated cipher. But yeah, most of the time you want to use one of these standards. And in terms of hashing, you may also care about speed and performance. So a small personal plug here: I designed BLAKE3 with a number of other people. BLAKE3 is a hash function that is extremely fast if you have multiple cores and SIMD instructions. So if you care about speed, you might pick a different algorithm than if you don't. BLAKE3, for example, is considerably faster than, say, SHA-256.

Justin Beyer 01:04:52 Alright, awesome. And I'll include some links and resources to some of the libraries and the protocols that we mentioned here. So just to start to wrap up the show here, what would be one takeaway from this episode that a software engineer should have?

JP Aumasson 01:05:04 Don't hash passwords with a normal hash function, and do not hardcode the nonces in your system. Those are maybe the most common bugs I find. And also use authenticated encryption, not just encryption, when you can, because it saves you a lot of problems: detecting corrupted data and preventing attackers from modifying your data. These are basic things, but when you're not a cryptographer, you don't know this. And maybe a last bit of advice, if you want to learn more about cryptography: I heard there's a nice book called Serious Cryptography, which I happen to be the author of, which kind of explains all these concepts in relatively simple terms.
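The first piece of advice can be sketched with Python's standard library. The episode does not name a specific password hashing function; PBKDF2 is used here only because it ships with `hashlib`, and dedicated functions such as Argon2, scrypt, or bcrypt are common alternatives. The iteration count and the helper names are assumptions for the example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256 and a fresh random salt."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the derived key and compare it in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected_key)

salt, stored_key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored_key))
print(verify_password("wrong guess", salt, stored_key))
```

Unlike a plain SHA-256 call, the derivation is deliberately slow and salted, so an attacker who steals the stored values has to pay the full work factor for every guess and cannot reuse precomputed tables.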

Justin Beyer 01:05:49 Yeah. So I'll definitely make sure to include your book in the show notes and in the reference section. Just to kind of, you know, completely close this out: is there any further work you're doing on the topic?

JP Aumasson 01:05:59 Well, these days I'm working on a different kind of cryptography, the kind that you find in blockchain. In particular, I'm interested in threshold signatures and threshold encryption, whereby you have multiple parties, but you only need a subset of these parties to collaborate to create a signature or to decrypt. That's really cool, but that's really for specific applications, like what we find in enterprise. And I'm also interested in post-quantum cryptography, which is the kind of crypto that would be safe against quantum computers. We don't have quantum computers these days, but NIST is working on standards just in case it happens.

Justin Beyer 01:06:38 Yeah. We didn't really talk much about quantum crypto, quantum-safe, and post-quantum in this episode, and maybe that'll be a topic for another show. And I hear you're also working on a new book, is that correct?

JP Aumasson 01:06:48 Yeah, that's right. So that's a completely different book than the last one, but it will be about explaining many different concepts in cryptography, from the simplest one to the most complex one, with some, you know, historical anecdotes. It should be out later this year, in December, with No Starch Press. I'm really excited about it. It will be a kind of dictionary of cryptography.

Justin Beyer 01:07:12 That sounds awesome. Well, JP, I just want to thank you for coming on the show and discussing all these cryptography topics, encryption and hashing and all these other things.

JP Aumasson 01:07:23 Thanks for having me. That was a really good discussion. We could have gone much longer, but I hope the listeners of the podcast enjoyed most of it. And yeah, see you next time.

Justin Beyer 01:07:31 See you next time. And I will definitely include some social media resources in case any of our listeners want to reach out to you. Excellent. This is Justin Beyer for Software Engineering Radio. Thank you for listening.

SE Radio 01:07:42 Thanks for listening to SE Radio, an educational program brought to you by IEEE Software magazine. For more about the podcast, including other episodes, visit our website at [email protected]. To provide feedback, you can comment on each episode on the website or reach us on LinkedIn, Facebook, Twitter, or through our Slack channel at [email protected]. You can also email [email protected]. This and all other episodes of SE Radio are licensed under Creative Commons license 2.5. Thanks for listening.

[End of Audio]

SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
