SE Radio 467: Kim Carter on Dynamic Application Security Testing

Kim Carter of BinaryMist discusses dynamic application security testing (DAST) and how the OWASP purpleteam project can improve early defect detection. Host Justin Beyer spoke with Carter about how DAST can provide meaningful feedback loops to developers to improve code quality and push penetration testing to the detection of higher-level vulnerabilities. They also discussed how the OWASP purpleteam project fills a gap in the open source DAST space. While discussing purpleteam, they dove into the project’s underlying architecture, such as how it leverages the Zed Attack Proxy (ZAP) project to detect the actual vulnerabilities in the application. There was also a discussion on how to integrate DAST into your software deployment pipelines.

Show Notes

Transcript

Transcript brought to you by IEEE Software
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected].

SE Radio 00:00:00 This is Software Engineering Radio, the podcast for professional developers, on the web at se-radio.net. SE Radio is brought to you by the IEEE Computer Society and IEEE Software magazine, online at computer.org/software.

Justin Beyer 00:00:16 Well, this is Justin Beyer for Software Engineering Radio. And today I’m speaking with Kim Carter. Kim is a technologist, information security professional, entrepreneur, and the founder of BinaryMist. He’s a chapter leader of OWASP New Zealand, is also a Certified Scrum Master, was also a former host of Software Engineering Radio, and has 20 years of commercial industry experience across many domains. Welcome to the show, Kim.

Kim Carter 00:00:39 How’s it going, Justin?

Justin Beyer 00:00:40 Hey Kim. So just to start off the show, I kind of want to set a foundation for the audience. So today we’re gonna be talking about dynamic application security testing, and specifically the purpleteam project that you’re working on. So to start off, can you kind of tell us what dynamic application security testing is?

Kim Carter 00:00:56 Yeah, so it’s a kind of black box testing, but it’s obviously security testing. So you’ve got lots of different types of testing, right? You’ve got your unit testing, you’ve got integration testing, which generally focus on testing the actual code. So you’re actually testing the implementation of the code rather than testing, like, the interface of the software itself. Whereas dynamic application security testing is usually coming at the software from a black box perspective, as an end user would, but in an automated fashion.

Justin Beyer 00:01:30 Yeah. So when you say black box, you’re referring to the concept of black box versus white box testing: known source code versus completely unknown source code.

Kim Carter 00:01:38 Yeah. Yeah. So with DAST, or dynamic application security testing, you don’t get a view into the source code at all. You’re coming at the application from outside of it rather than, uh, looking at the code itself.

Justin Beyer 00:01:53 Okay. So that’s how it’s kind of different from a static security analysis scan, where that’s looking at the actual code and saying, you know, this is a common format for an XML entity injection, or this is a common format for SQL injection, and then kind of taking a guess. You’re actually trying to say, does this actually have a SQL injection here?

Kim Carter 00:02:13 Yeah, yeah, exactly. It’s actually putting it through the test as an end user or as an attacker would.

Justin Beyer 00:02:19 Okay. And just at kind of a high level, and we’ll dive into some of the specifics with how the purpleteam project actually works, but from a general perspective, how does a dynamic application security test work? You know, do I just put in my webpage and say here, go at it, or do I have to give it specific endpoints and tests to do?

Kim Carter 00:02:37 Um, different tools do things, I guess, in slightly different ways. So with purpleteam, you create a job file, a build user config file. And in that file, basically what we’ve got is we list out the routes, we list some fields of those routes, authentication, and how to get to the application. Basically just details around how to actually get to the application and how to go about testing it. There’s some stuff in there, like the routes and fields, and also a little bit of dummy data to put into those fields for the first pass over. And basically what that’s doing is giving the application tester just a starting point. And then from then on it knows that those routes exist and those fields exist, and it can then start to get a little bit more dynamic and it sort of takes off of its own accord.
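
To make the idea of a job file concrete, here is a minimal sketch of what such a starting point could contain: a target, an authentication route, and a few routes with fields and dummy seed values. Every key name, the host, and the `validate` helper are assumptions made up for illustration; this is not purpleteam’s actual job file schema.

```python
import json

# Hypothetical job description. Key names and values are illustrative only
# and are NOT purpleteam's real job file schema.
job = {
    "target": {"protocol": "https", "host": "app.example.test", "port": 443},
    "authentication": {"route": "/login", "usernameField": "user", "passwordField": "pass"},
    "routes": [
        {
            "path": "/profile",
            "method": "POST",
            # Dummy seed data for the first pass over each field.
            "fields": [
                {"name": "firstName", "value": "Ada"},
                {"name": "lastName", "value": "Lovelace"},
            ],
        },
        {
            "path": "/search",
            "method": "GET",
            "fields": [{"name": "q", "value": "shoes"}],
        },
    ],
}


def validate(job_description):
    """Return a list of human-readable problems; an empty list means usable."""
    problems = []
    for route in job_description.get("routes", []):
        if not route.get("path", "").startswith("/"):
            problems.append(f"route path must start with '/': {route.get('path')!r}")
        for field in route.get("fields", []):
            if not field.get("name"):
                problems.append(f"field without a name on {route.get('path')}")
            if "value" not in field:
                problems.append(f"field {field.get('name')!r} has no seed value")
    return problems


problems = validate(job)
serialized = json.dumps(job, indent=2)  # the file a build user would commit
```

The point of the sketch is only that the build user supplies a small, declarative description and the tester takes it from there.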

Justin Beyer 00:03:31 Okay. And we’ll dive into a little bit more of the specifics of how purpleteam works a little bit later on, but essentially what you’re saying is it’s almost like a cannon. You’re kind of pointing it at the website and you’re saying, you know, here’s where I kind of want you to hit, and you’re giving it some data to start off with, to almost lay a foundation for it to start testing. And then how the application responds changes how the actual security testing application is going to try to test things. So if it sees, oh, there’s these fields, let me try these different known injections that should work. And then if the application gives me a 500 error, it might try different things.
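
The loop described here (start from seed data, swap in known payloads one field at a time, watch how the application responds) can be sketched roughly as below. The `fake_app` function is a stand-in for a system under test so the example runs without a network, and the payload list is a tiny illustrative sample; none of this is how ZAP or purpleteam is actually implemented.

```python
from typing import Callable, Dict, List, Tuple

# A few well-known probe strings; real scanners carry far larger sets.
PAYLOADS = ["' OR '1'='1", "<script>alert(1)</script>", "../../etc/passwd"]

Finding = Tuple[str, str, str, int]  # (route, field, payload, status)


def fake_app(route: str, fields: Dict[str, str]) -> int:
    """Stand-in for the system under test: /search breaks on a stray quote."""
    if route == "/search" and "'" in fields.get("q", ""):
        return 500
    return 200


def probe(send: Callable[[str, Dict[str, str]], int],
          route: str, seed: Dict[str, str]) -> List[Finding]:
    """Replace one field at a time with each payload and record server errors."""
    findings: List[Finding] = []
    for name in seed:
        for payload in PAYLOADS:
            attempt = dict(seed)      # keep the other fields at their seed values
            attempt[name] = payload
            status = send(route, attempt)
            if status >= 500:         # a 5xx hints that the input was mishandled
                findings.append((route, name, payload, status))
    return findings


findings = probe(fake_app, "/search", {"q": "shoes", "page": "1"})
```

Here only the quote-bearing payload in the `q` field triggers a server error, which is the kind of signal a tester would then dig into further.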

Kim Carter 00:04:07 Yeah. That’s pretty much it. Basically just pointing it in a general direction and giving it some tidbits to get started. Yeah.

Justin Beyer 00:04:14 So back in episode 453, I had spoken with Aaron Rinehart about security chaos engineering, and we kind of discussed this concept of continuous verification. You know, the idea that you’re always coming up with different scenarios which are plausible and then testing your code against those, to determine if there’s a regression or a known issue that you need to address and those kinds of things. How do you think DAST, or dynamic application security testing, would compare to something like continuous verification? Would it be an augmentation or replacement?

Kim Carter 00:04:46 Dynamic application security testing is generally an augmentation to most of the other types of testing, like static code analysis and that sort of thing. You can’t basically just say all we’re going to do is dynamic application security testing, because it’s not really enough. It doesn’t give you defense in depth. In order to get defense in depth, you need lots of different types of testing, because they’re all going to find different things.

Justin Beyer 00:05:12 Exactly. They all kind of have their own strengths and weaknesses. And on that matter, I mean, you kind of said that it’s not going to replace it, but how do you think it compares to something like doing a penetration test on your applications?

Kim Carter 00:05:26 Quite similar, but it’s automated. And the idea, at least with purpleteam, is to have it running as the developers and engineers are writing code and introducing defects. So as they’re introducing defects, purpleteam’s running, and then it comes up with a report, some results, and basically gives you details on where the defects are, what to use to reproduce them, and some advice on how to go about fixing them, which is quite similar to penetration testing. But the main difference is that penetration testing is usually performed very late in the SDLC, like often at go-live or, you know, several months after the code’s been written. And often then the developers can’t even remember the specific areas that they were working on that may have introduced a specific defect. So yeah, if you can find and fix those defects as they’re being introduced, it’s just so much cheaper as well.

Justin Beyer 00:06:28 But kind of going down this penetration testing route. So you said the difference is you’re trying to catch defects early in the process versus having almost the big bang at the end, where the penetration testers come and they give you a 500-page report that says your application has a thousand SQL injection vulnerabilities, 2000 XML entity injections, and every other OWASP or MITRE Common Weakness Enumeration in the book. But are there things that a penetration test is going to catch that a dynamic application security tester isn’t?

Kim Carter 00:07:00 Oh, that’s a good point. What usually happens is that developers are introducing often pretty trivial sorts of defects that penetration testers can find very easily, but the penetration testers only have the budget of, say, a week or two weeks to spend on the actual system under test. So what happens often is they’ll come along and they’ll, say, find five criticals, 10 highs, 10 mediums, and 10 low severity bugs. And the business decides, because it’s so expensive to actually fix those defects at that point in time, well, we’re only going to fix the five criticals. So it’s super expensive and you don’t get all the defects fixed. With the likes of purpleteam running over your source code daily, the idea is that you find and fix all those simple defects as they’re being introduced, and even some of the more complex ones that purpleteam will tell you about.

Kim Carter 00:07:56 And then basically when it comes time to hand it over to the penetration testers, they’re going to be able to concentrate on defects that are further up the ladder, not the lowest-hanging fruit anymore. So you’re basically just raising the bar, really, for the attacker, because purpleteam has already found a lot of the defects, and the developers have fixed them because they’ve been made aware that they’ve introduced defects, like, the day before or whatever. And then the penetration testers come along and they can focus on the more gnarly sort of defects. And then some of those will get fixed as well. So you’re just raising the bar considerably for the attacker. Yeah.

Justin Beyer 00:08:35 Yeah. So the idea kind of being that penetration testing is catching, as you kind of said, the higher on the ladder, you know, the business logic vulnerabilities, where for some reason there was a weird condition where certain business logic conflicted with what the intended output was, or some higher level certificate issue or backend infrastructure issue that, you know, it’s not necessarily application level, it’s the backend systems that caused an issue in the application. But can you kind of name some of the vulnerabilities that you would normally see something like purpleteam detecting most?

Kim Carter 00:09:10 They’re usually found in the application code itself. Like you mentioned, defects can be found in infrastructure and SSL/TLS certs, and that sort of thing. Purpleteam’s got an application tester, and it’s got a TLS checker and a server scanner, which are not yet implemented, though they’re on the backlog. So that will be finding some of the infrastructural stuff as well.

Justin Beyer 00:09:33 Okay. And that might lead to a discussion later about how that might integrate with things like Terraform and more ephemeral infrastructure. But one other question I kind of had: how does dynamic application security testing and purpleteam work with distributed applications? You know, something that I’m deploying across 50 data centers across the world.

Kim Carter 00:09:53 So because it’s coming at your application from a black box perspective, it’s seeing what an attacker is going to see. So it’s attacking your application like a typical attacker would. So there’s really not too much difference there. You can point it at a specific endpoint and then it’s going to find what an attacker would normally find.

Justin Beyer 00:10:13 Okay. So as long as my code deploys are working correctly and all my data centers are the exact same replica of the application, it should be catching everything that you would see with those kinds of things.

Kim Carter 00:10:25 You could point it at several different endpoints. There’s no reason why you couldn’t do that.

Justin Beyer 00:10:29 Okay. So you could kind of almost check to make sure that all of your defects that your developers are saying are fixed are actually fixed and deployed across the entire environment versus just specific environments, like a staging environment, for example.

Kim Carter 00:10:42 Yeah. But, um, in order to do that, you’d have to point it at a few different endpoints with a multiple.

Justin Beyer 00:10:47 And then how would something like DAST compare to using things like development frameworks? You know, I’m thinking of things like an ORM, from a very high level concept, where it’s starting to abstract away a lot of these injection vulnerabilities. Is that kind of removing the need for dynamic application security testing, or is that actually giving more of a reason to keep doing it? Because now you’re making this assumption that the ORM took care of it, I don’t need to worry about it.

Kim Carter 00:11:15 Again, it comes down to defense in depth. I mean, adding all these different frameworks and ORMs and that sort of thing, sure, the idea is that it helps to steer developers in the right direction, but then developers are still writing those ORMs and frameworks, so often there’s defects in those as well. Coming at your application from a black box perspective, you’ll end up finding the defects in the ORMs and frameworks as well, rather than just in the developers’ application code.

Justin Beyer 00:11:47 Yeah, exactly. You’re not only testing just your application code, but now you’re testing all of these dependencies that you’ve also pulled into your application code. Because you brought in, you know, 20 different sanitizers for the output and 50 different ORMs because your developers thought that if they added more, it would make it more safe.

Kim Carter 00:12:05 Inadvertently.

Justin Beyer 00:12:06 Exactly. And it’s that time complexity cost. Now, if I’m trying to sell this to my organization, we already have a static analysis tool we’re already paying for penetration testing. Why would I want to bring in something like a dynamic application security tester, other than in the case of it being open source and purple team?

Kim Carter 00:12:24 Yeah. So your static code analysis is not going to find everything. It finds known defects. Whereas dynamic application security testing is going to find some of the known defects and a lot of the unknown defects as well. It’s just coming at your application in a far more dynamic sense. So if all you’re doing is covering your code base with static code analysis, you are not going to find all the defects. You’ll find some obvious ones in the libraries and the dependencies and that sort of thing, but it’s definitely not going to find most of the defects.

Justin Beyer 00:12:59 Essentially, it’s just going to look for known patterns. If I think of something like, you know, the Semmle project, it’s looking for specific groupings of commands in a specific language. But with DAST, does it have any limits on languages that it works with? You know, can I have a development shop that does JavaScript and does Golang and does Perl all under the same dynamic application security tester?

Kim Carter 00:13:25 Yeah. Yeah. So dynamic application security testing is, um, language agnostic. It doesn’t even know what language you’re using because it’s coming at the application from the outside.

Justin Beyer 00:13:35 Okay. And is there any special instrumentation that a developer would need to add into the code to make it so that a DAST could hook in, in some way, to get feedback on the actions that it’s doing?

Kim Carter 00:13:46 Not normally. With purpleteam there definitely isn’t.

Justin Beyer 00:13:49 Okay. And what kind of overhead do you see something like a purpleteam product adding? You know, manpower hours, complexity. You know, do I need to hire someone specifically to run this? Is this something that the development team could run? Do I need a dedicated security team?

Kim Carter 00:14:06 The idea of purpleteam is that you don’t need a dedicated security team. And so your development team is actually getting mentored by the tool itself. So the idea is that with purpleteam running over your applications and telling you about the defects you’ve introduced the day before, or whenever the test was run last, it’s still fresh in your head, the code you wrote, and the tool actually guides you along the path to fix your own defects. So it points you in the direction of where the defects are, how to reproduce them, and provides some details on how to actually fix them. So it’s working as a mentor. So the idea is that you don’t need to have security knowledge upfront, but working with the tool actually helps you to learn more about security, as well as finding your defects and helping you to fix them.

Justin Beyer 00:15:02 Okay. So it’s the concept in education that you usually hear thrown around: you know, you’re shortening the feedback loop for the student, or in this case the developer, to learn from their mistakes with what they’re doing. You know, I wrote bad code on, you know, Monday, and then on Tuesday morning when I come in, I see, hey, that was bad code. It has these issues. So now in my head, I’m always going to remember, or hopefully always remember, that writing, you know, direct user input into SQL calls is a bad idea. And I shouldn’t just concatenate strings. That ends badly.
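
The string-concatenation mistake mentioned here is easy to show with Python’s built-in `sqlite3` module. The table, the data, and the two helper functions below are made up for illustration; the contrast between concatenating input into the query and passing it as a bound parameter is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_unsafe(name):
    # BAD: user input is concatenated straight into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_safe(name):
    # GOOD: the driver binds the value, so it is never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


attack = "nobody' OR '1'='1"
leaked = find_unsafe(attack)   # the OR clause matches every row
blocked = find_safe(attack)    # no user has this literal name
```

With the concatenated query the crafted input returns every secret in the table, while the parameterized query returns nothing for the same input.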

Kim Carter 00:15:32 Yeah, exactly. The other thing is, if you do do it again, then the tool should find it again. And so, you know, once you’ve made the same mistake several times, you’re not going to make it again.

Justin Beyer 00:15:45 When you get the 50th report and your boss starts asking, you’re going to stop making that mistake hopefully, or you might not have a job very long. So changing a little bit here. Have you seen a big difference in the DAST space between commercial tools and open source tools? I mean, primarily I’ve seen a huge amount of commercial tools in the space and very little in the open source world.

Kim Carter 00:16:09 I think probably mainly because the tools are actually quite expensive to create, very time consuming, because it can be quite complex. Yeah. So there’s not very many tools that are actually freely available at all, purpleteam being one of them. So purpleteam’s got an OWASP offering and a paid-for offering. The only difference is that the paid-for offering is already set up for you. So all the infrastructure is set up for you, but if you decide to use OWASP purpleteam, which has the same code base, then you’ve actually got to set up all the components yourself. And that sort of takes some time. On a good run, that can take probably two to three days.

Justin Beyer 00:16:47 Okay. And that’s something where, as an organization, you’re probably making that manpower hours, complexity decision of build versus buy, you know, where you’re saying, is it worth standing up the open-source component with what we’re going to save? And it may be more of something where you’re doing it as a test run or a dry run to see, is it worth the investment?

Kim Carter 00:17:07 Yeah, exactly. I’m planning on having a free trial for the paid tiers anyway, so you can actually trial it without going through the motions of setting the whole thing up, spending several days setting it all up. The one thing that is common to both OWASP purpleteam and BinaryMist purpleteam, which is just, um, my own company, is that you’ve still got to set up the CLI and the job file, which just tells the purpleteam back end where your application is, how to authenticate against it, which routes you want tested, and some field information, which is usually fairly trivial.

Justin Beyer 00:17:44 Perfect. And we’re going to discuss that a little bit later when we start diving into the OWASP purpleteam specifics, because I do want to dive into some of the brains behind purpleteam and how these job files work and the CLI works. But a couple more questions on just DAST. Have you seen it delay software releases or builds in any significant manner, or is it really just almost, I want to say, an augmentation to, you know, the secure code training that you send every developer to before you let them sit down and do their coding?

Kim Carter 00:18:11 The best place for it is in something like a nightly build, because it can take some time, depending on the size of your application. I’m sort of expecting, you know, to allow somewhere between 15 minutes and an hour for a full sort of system test, so that’s more of a nightly build than a continuous integration type build.

Justin Beyer 00:18:34 Okay. So when I’m thinking of this from the high level defense in depth perspective, I think static analysis tool integrated into the IDE, catching some of the lower level known issues right off the bat. Then you have some type of continuous integration testing, which might be unit testing or some type of, like, security chaos engineering, continuous verification process that’s running on each commit. And then you have your dynamic application security testing running nightly, that’s doing some more dynamic testing, discovering some vulnerabilities that you may not have known about. And then, you know, you have your annual pen test on there for PCI and your annual secure development training that you’re required to do because, you know, you need the checkbox.

Kim Carter 00:19:23 Pretty

Justin Beyer 00:19:24 Moving into, you know, the specifics of the OWASP purpleteam tool, you know, why we brought you on the show. Can you tell us just a little bit about the project as a whole? You know, what was the inspiration for it?

Kim Carter 00:19:36 The inspiration is basically just to help developers create more secure software. I did a proof of concept about three years ago and took it to a lot of different conferences and workshops all around the world and that sort of thing, to get developer feedback and get developers actually using the proof of concept, and working out whether it was going to be something viable to take further and actually turn into something. And it seemed like it was. Developers were pretty keen on it, pretty excited about it. So what I was doing in that proof of concept was simply using ZAP. I was writing some tests for it, and it was hitting NodeGoat, which is just another application. It’s a purposely vulnerable Node application. So I was just using that as the system under test. Then basically I had a single script that I was using to drive ZAP, and that was sort of the beginning of it all. And then I also had some discussions with Simon Bennetts, ZAP’s creator, and he was talking about writing a service that he could put ZAP behind so that it was running in the cloud, but he didn’t have enough bandwidth for it. So I decided to actually just jump straight into it and get started. And it just so happens that OWASP ZAP is one of the emissaries.

Justin Beyer 00:20:57 Yeah. Let’s dive into that a little bit. Cause you use the term Emissary here. What do you mean by that?

Kim Carter 00:21:02 So with the architecture, what we’ve got is a CLI, which sits on your build box or basically as close to your build as possible. And then the build user basically just creates a job file, which I’ve discussed. And then when you run the CLI, that fires that job file to the back end. If you’re running locally or you’re running in the cloud, there’s some slight differences. If you’re running in the cloud, then most of the infrastructure is AWS, and it goes to an API gateway and a network load balancer. And we use Cognito for the authorization. If you’re running locally, obviously those components up front aren’t there and you’re hitting the orchestrator directly. Now the orchestrator is the first main component of the purpleteam back end. And the orchestrator basically does what the name says. It orchestrates everything. It talks to the different types of testers. At the moment, we’ve got the application tester, which, with our repository name, is the app-scanner.

Kim Carter 00:22:07 So we’ve got the testers there. We’ve got the application tester, which is fully implemented, and we’ve got the server scanner and TLS checker, which are on the to-do list. It’s all architected so that these testers can be plugged in and the community can add additional testers. Now these testers are responsible for the emissaries. In the case of the application tester, the emissary is OWASP ZAP. In the case of the server scanner, that emissary is probably going to be Nikto. In the case of the TLS checker, the emissary’s probably going to be SSLyze. So the number of emissaries that are spun up is determined by the number of routes that are passed in to the back end from the CLI, and that’s all defined in the job file. We’ve got a current maximum of 12, which may or may not be increased in the future. So the emissaries are spun up dynamically via Lambda functions. Those Lambda functions, when run locally, are hosted via SAM CLI. When they run in AWS, they just run on AWS Lambda. We’re using Redis as the messaging system, which sends back all the tester events to the orchestrator. The orchestrator basically puts all those events together and then sends them back to the CLI. We’ve got some details in the documentation around the different types of messaging that you can use to have them sent back to the CLI.
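
The pluggable-tester idea described here can be sketched as an orchestrator that hands the same job to every registered tester and gathers their events for the CLI. The class, function, and tester names below are hypothetical and the testers are trivial stubs; this illustrates the shape of the architecture only, not purpleteam’s actual code.

```python
from typing import Callable, Dict, List

Job = Dict[str, object]
Tester = Callable[[Job], List[str]]  # a tester turns a job into a list of events


class Orchestrator:
    """Hypothetical sketch: fan a job out to every registered tester."""

    def __init__(self) -> None:
        self._testers: Dict[str, Tester] = {}

    def register(self, name: str, tester: Tester) -> None:
        # New testers plug in here without changes to the orchestrator itself.
        self._testers[name] = tester

    def run(self, job: Job) -> List[str]:
        events: List[str] = []
        for name, tester in self._testers.items():
            for event in tester(job):
                events.append(f"{name}: {event}")  # collected for the CLI
        return events


def app_tester(job: Job) -> List[str]:
    # Stub standing in for a tester that drives an emissary per test session.
    return [f"scanned {route}" for route in job["routes"]]


def tls_tester(job: Job) -> List[str]:
    # Stub for a tester that is still on the to-do list.
    return ["not implemented yet"]


orchestrator = Orchestrator()
orchestrator.register("app", app_tester)
orchestrator.register("tls", tls_tester)
events = orchestrator.run({"routes": ["/profile", "/search"]})
```

Keeping testers behind one small interface is what lets a community contribution add a new tester without the CLI or job file needing to know about it.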

Justin Beyer 00:23:36 So just to kind of recap what you’re talking about here, and we’ll definitely link to the documentation so people can dive more into the specifics on the code if they want to, but essentially the high level application architecture is you have the CLI, which is on your build server or somewhere in that region. That’s then taking your job files, which we’re going to dive a little bit into, because I have a couple more questions on that, but it’s taking your job file. It’s using that as almost a message, essentially, to say to the orchestrator, hey, I want you to do these 12 things. The orchestrator then takes that and uses the abstraction of a tester to say, okay, we’re going to do an application test, or we’re going to do a server test, or we’re going to do an SSL test or TLS test. And then that’s all defined in the job file, specifying what kind of testers it’s doing?

Kim Carter 00:24:26 Uh, not quite. So those different testers get run no matter what. The job file simply has the routes and the fields. It doesn’t say, um, which testers are going to be used or anything like that, because all the testers are used.

Justin Beyer 00:24:42 Okay. So the job file’s almost an architecture overview, or a very simple architecture review, of your application. So the orchestrator sees that and says, okay, these are the endpoints we’re going to run all three of these testers against. Then the testers are defining the emissaries, like you mentioned earlier, which is what’s doing the actual work under the hood.

Kim Carter 00:25:02 So the testers themselves are responsible for spinning up a number of emissaries. And it’s dependent on how many test sessions are in the build user’s job file.

Justin Beyer 00:25:13 So if you have 12 jobs defined, or 12 routes defined in the job, you’re going to end up with three times 12 testers or emissaries running.

Kim Carter 00:25:23 So what you’ve got is sort of a hierarchical thing in your job file. You’ve got test sessions and you’ve got routes, and then you’ve got the fields. So a test session can have any of those routes that are defined in the job file. So it may have two of those routes, and another test session, which is in the same job file, may have one route or may have ten routes. The reason we’ve got the test sessions is so that we can make sure that there’s no cross-pollination happening between the testers. So it’s basically just separating everything out so that we don’t get overlapping results in between test sessions. Basically what defines a test session is a username, basically a set of credentials. So you can think of a test session as someone that’s going to log into your application.
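
The hierarchy described here (test sessions, each tied to one set of credentials and referencing some of the routes defined in the job file) can be sketched as a small planning step that yields one isolated work unit per test session. The limit of 12 comes from the interview; the `Session` type, the `plan_emissaries` function, and the usernames are assumptions for illustration and do not reflect purpleteam’s implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

MAX_TEST_SESSIONS = 12  # the current limit mentioned in the interview


@dataclass(frozen=True)
class Session:
    """Hypothetical test session: one set of credentials plus its routes."""
    username: str
    routes: Tuple[str, ...]


def plan_emissaries(sessions: List[Session], known_routes: Set[str]) -> Dict[str, List[str]]:
    """Return one isolated unit of work per test session, keyed by username."""
    if len(sessions) > MAX_TEST_SESSIONS:
        raise ValueError(f"at most {MAX_TEST_SESSIONS} test sessions are supported")
    plan: Dict[str, List[str]] = {}
    for session in sessions:
        if session.username in plan:
            raise ValueError(f"duplicate test session: {session.username}")
        unknown = [r for r in session.routes if r not in known_routes]
        if unknown:
            raise ValueError(f"{session.username} references undefined routes: {unknown}")
        # Each session gets its own list, so results never mix across sessions.
        plan[session.username] = list(session.routes)
    return plan


routes = {"/profile", "/search", "/admin"}
plan = plan_emissaries(
    [Session("admin", ("/profile", "/search")), Session("lowPrivUser", ("/search",))],
    routes,
)
```

Two sessions may share a route, as `/search` does here, yet each is planned separately, which mirrors the point about avoiding cross-pollination between test sessions.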

Justin Beyer 00:26:18 Okay. So a test session is almost like a login session. But then, as you mentioned that cross-pollination, do you see some types of overlap between, you know, what ZAP, or Zed Attack Proxy, is reporting in a tester for your app versus what Nikto is reporting?

Kim Carter 00:26:39 Uh, so Nikto isn’t implemented yet. There could be crossover between the testers, but it’s pretty unlikely, because ZAP doesn’t really touch on the SSL/TLS defects.

Justin Beyer 00:26:51 Okay. So whereas Nikto does more of that route discovery, additional endpoint discovery, and doing additional testing on those, you’re not going to see that necessarily with ZAP, where ZAP is just doing, you know, known MITRE Common Weakness Enumerations.

Kim Carter 00:27:07 The thing is, each tester has got quite a different responsibility. So there shouldn’t be much overlap, if any, between those testers.

Justin Beyer 00:27:14 Okay. And when you say, you know, server tester with Nikto, what are you imagining that’s going to pick up vulnerability-wise versus Zed Attack Proxy?

Kim Carter 00:27:23 So it’ll pick up configuration issues with how you set your server up. Last time I looked at this was like a couple of years ago, so it’s not fresh in my mind. I’ve been working on the application tester and the emissaries for the application tester. So the server scanner and the TLS checker are still to be implemented. I’ll probably start on the TLS checker, which is probably going to be the next cab off the rank.

Justin Beyer 00:27:49 Okay. So we’ll kind of bring you back on the show once those other testers are a little bit more fleshed out. Maybe we can discuss, you know, your thought processes on implementation for that. But kind of going back a little bit with the job file, you were mentioning there’s a limit of 12 right now that may or may not change. And I assume you put the limit in for obvious rate-limiting reasons, but could you define multiple job files? If I had, you know, 24 routes, or I wanted 24 test sessions, could I define two job files?

Kim Carter 00:28:18 I’ll answer that first. Well, it wasn’t actually a question, but I’ll just discuss that first point. There’s a limit of 12 simply because of the UI. Because we’re using a CLI UI, watching more than 12 test sessions at once was just too much. It took up too much space. And that was simply the only reason. What we want to do is get hands on the tool and actually see what people need. Do they need more than that? Is it enough? And we’ll go from there. Your main question there, around multiple job files: yeah, there’s no reason why you can’t have multiple job files, but there may or may not be a need for it. I think what we need to do is actually get people that are using it to tell us whether they want more job files or not. Yeah. There’s nothing to stop you writing multiple job files and starting a test with a different job file.

Justin Beyer 00:29:10 Okay. And I guess if you really examine it from the perspective of test sessions are really just credentials, there’s probably not a huge amount of need to have, you know, 24 sets of credentials to test, unless you have quite a lot of, you know, RBAC roles or something you want to test.

Kim Carter 00:29:24 Yeah. Although I don’t think there’s really anything that’s stopping users doing that. So it would be interesting to see how it actually went.

Justin Beyer 00:29:32 Yeah, and what their actual use case around that was, and why they went that route versus a different route. But out of curiosity, is there a specific language that the job files are written in?

Kim Carter 00:29:43 It’s just JSON.

Justin Beyer 00:29:45 Okay. So it’s pretty accessible to most developers then. All right. So earlier you mentioned, you know, Redis messaging and some options around that. Can you kind of discuss how that works? We did an episode on Redis, um, episode 444, where we kind of discussed some of the functionality of it, but I’m curious how you’re actually leveraging it in a real production app for messaging.

Kim Carter 00:30:07 Yeah. So we’ve got a couple of different use cases here. So when you’re running purple team locally, you can use server-sent events, which uses pub/sub, or you can use long polling, which uses pub/sub and lists. When we’re running in the cloud, we’re constrained to using long polling, which basically means we’re using Redis pub/sub and lists. There’s a few pros and cons to each of those, but they’re all documented in the purple team README, currently on GitHub. I could go into a whole lot more details, but it starts getting quite complicated quite quickly.

Justin Beyer 00:30:48 So just at a high level, what are some of the trade-offs that you’re going to make as a developer or as a security person setting this up? If you choose, you know, long polling versus pub sub, I assume it’s response time for test completion.

Kim Carter 00:31:01 So with server-sent events, if we’re using that, that’s just pub/sub. Basically what that means is if the CLI goes offline, or we close the CLI and then start it up again while the test is still running in the back end, we can miss a section of messages that may come from the orchestrator. If we’re using long polling, what happens is, because we’re using lists and using pub/sub, we’re essentially, um, persisting the messages to Redis lists and then using pub/sub to push those messages back to the CLI. So if we’re using long polling, we don’t actually lose any messages when the CLI goes offline. In saying that, long polling is slightly more chatty, but not a lot, because what we do is we send a request out to the orchestrator, and then the orchestrator basically pops as many messages off the Redis list as are currently there and then returns those via pub/sub. So it’s actually not too chatty. So if you’re using purple team locally, I’d just try using both and see which works best for you.
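
The trade-off described above can be illustrated with a small in-memory model. This sketch does not use Redis or any purpleteam code: the classes `PubSubChannel` and `ListBackedQueue` are hypothetical stand-ins, and the only behavior modeled is that pub/sub messages reach just the currently connected listeners, whereas list-backed messages wait until a poll drains them.

```python
from collections import deque


class PubSubChannel:
    """Fire-and-forget delivery: only connected subscribers see a message."""

    def __init__(self) -> None:
        self.subscribers: list[list[str]] = []

    def publish(self, message: str) -> None:
        for inbox in self.subscribers:
            inbox.append(message)


class ListBackedQueue:
    """Messages are persisted to a list until a poll request drains them."""

    def __init__(self) -> None:
        self.backlog: deque[str] = deque()

    def publish(self, message: str) -> None:
        self.backlog.append(message)

    def poll(self) -> list[str]:
        # Pop everything currently waiting, as one long-poll response would.
        drained = list(self.backlog)
        self.backlog.clear()
        return drained


def simulate() -> tuple[list[str], list[str]]:
    events = ["test started", "25% done", "50% done", "test finished"]

    # Pub/sub only: the CLI is connected, drops off, then reconnects.
    channel = PubSubChannel()
    sse_inbox: list[str] = []
    channel.subscribers.append(sse_inbox)
    channel.publish(events[0])
    channel.subscribers.remove(sse_inbox)   # CLI goes offline
    channel.publish(events[1])
    channel.publish(events[2])
    channel.subscribers.append(sse_inbox)   # CLI comes back
    channel.publish(events[3])

    # List-backed long polling: same outage, but messages wait in the list.
    queue = ListBackedQueue()
    poll_inbox: list[str] = []
    queue.publish(events[0])
    poll_inbox.extend(queue.poll())
    queue.publish(events[1])                # CLI offline, nothing polls
    queue.publish(events[2])
    queue.publish(events[3])
    poll_inbox.extend(queue.poll())         # CLI back, drains the backlog

    return sse_inbox, poll_inbox


sse_received, poll_received = simulate()
print("pub/sub only:", sse_received)
print("list-backed: ", poll_received)
```

In this model the pub/sub listener misses the two progress messages sent during the outage, while the list-backed listener receives all four once it polls again, at the cost of one extra request per poll.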

Justin Beyer 00:32:08 Okay. So essentially you’re almost getting close to guaranteed delivery with the long polling versus pub/sub, where it’s just firing it off blind. Why did you opt for Redis over a more traditional, you know, message queuing system like RabbitMQ or Kafka or something like that?

Kim Carter 00:32:27 It seemed to be pretty easy to set up.

Justin Beyer 00:32:29 So it was mainly just, uh, ease of use for the end user when they actually went to build their infrastructure out locally.

Kim Carter 00:32:35 Yeah. Well, the thing is, Redis is just running in a container, so it’s just another container to deploy. There’s quite a few services here and they’re all sort of brought up at once, um, other than the actual emissaries, which get fired up when the, uh, job file comes through.

Justin Beyer 00:32:52 Okay. And how does that work? I know you had mentioned a little bit earlier about SAM CLI that kind of works in between when you’re running it locally, but how does that work when launching a Lambda job out of a container?

Kim Carter 00:33:05 Ah, yeah, so what’s happening there, uh, locally, is we use Docker Compose UI. So that allows the tester containers access to the local Docker socket to spin up, uh, emissary containers on the host as well. So we don’t have containers running in containers or anything silly like that happening. We’ve just got some additional containers that are spawned using Docker Compose UI to host the emissaries. So what’s happening there is the actual testers talk to SAM CLI locally, and then SAM CLI talks to Docker Compose UI, which has an API and also has access to the local Docker socket, so it can spin up containers. In the cloud it’s just Lambda. So, uh, the ECS containers, which have our testers running in them, talk to AWS Lambda, and then AWS Lambda just spins up, uh, the ECS containers on demand.

Justin Beyer 00:34:02 Okay. So essentially locally you have an API abstraction to just spin up more Docker containers versus in the cloud, you’re just doing an API abstraction to spin up Lambda jobs for temporary short-lived,

Kim Carter 00:34:13 Which spin up the ECS container.

Justin Beyer 00:34:16 Okay. Now we’ve kind of talked a lot about how this project is working under the hood and a lot about, you know, the underlying architecture behind it. But I want to talk a little bit about the emissaries. You know, why did you choose to go with something like Zed Attack Proxy for the application scanner?

Kim Carter 00:34:33 So OWASP ZAP, it’s free and open source, basically. And it’s a great intercepting proxy. I mean, we’ve got others like Burp Suite and some other older ones, but ZAP’s still pretty much cutting edge, and Simon and the team are still working hard on that and just doing really well with it. I don’t think it’s going to be going anywhere, so.

Justin Beyer 00:34:54 Okay. And just from your perspective, if I’m starting a new project, let’s say I’m doing, you know, something like a progressive web app, or I’ve decided that I’m building a giant distributed system over the weekend. How am I going to integrate this into my new project?

Kim Carter 00:35:08 So the details for that are in the purple team CLI README on GitHub. But basically it’s just a, so there’s a few options there. You can spin up the purple team CLI so that you can see the actual interface and what’s happening, or you can run it headless. We run it headless when we’re putting it into a build pipeline. So yeah, there’s a few different ways you can sort of set it up. There’s a few different ways you can install it as well. You can install it into any type of build pipeline. I haven’t come across any build pipelines that we couldn’t install it into. It’s language agnostic. So we’re just basically spawning a process. We’re spawning the purple team CLI process and then just feeding it the job file.

Justin Beyer 00:35:50 Okay. And out of curiosity, what language did you end up writing purple team in?

Kim Carter 00:35:54 Pretty much everything’s in JavaScript.

Justin Beyer 00:35:57 Okay. So everything’s pretty much in JavaScript, and then you’re just injecting into the build pipeline with some type of call, I’m assuming to the shell, to call the CLI to upload the job files.

Kim Carter 00:36:08 Uh, yeah, so we just run purple team basically. So we just spawn purple team with the test command. You can run any of its commands. Um, it’s only got about three or four of them at the moment. Um, test, test plan, status, just to check whether your back end’s up and responsive, and test plan, which returns what the current test plan is that’s going to be happening in the back end.
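
Spawning the CLI from a build step might look roughly like the following sketch. The command names follow the ones listed in the conversation; the executable name `purpleteam`, the spelling `testplan`, and the idea that a non-zero exit code signals a problem are assumptions for illustration, so the CLI README remains the reference for actual usage and for how the job file is supplied.

```python
import subprocess

# Commands named in the episode; the exact spellings are an assumption.
KNOWN_COMMANDS = ("test", "testplan", "status")


def build_argv(command: str, executable: str = "purpleteam") -> list[str]:
    """Build the argument vector for one CLI invocation."""
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    return [executable, command]


def run_step(command: str, executable: str = "purpleteam") -> int:
    """Spawn the CLI as a child process and return its exit code."""
    completed = subprocess.run(build_argv(command, executable), check=False)
    return completed.returncode


def pipeline_stage() -> int:
    """Hypothetical build stage: check the back end, then start a test run."""
    if run_step("status") != 0:  # assumes a non-zero code means "not ready"
        return 1
    return run_step("test")
```

Because the pipeline only spawns a process and reads its exit code, the same pattern applies from any build tool that can run a shell command, which is what makes the approach language agnostic.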

Justin Beyer 00:36:31 And let’s say a developer decides they don’t want to use ZAP. They want to put in a Burp connector. How would they do that? Would they write a whole other tester, or would they write, I don’t know, comparable tests that they could put into ZAP to run that would meet the same needs as Burp Suite?

Kim Carter 00:36:53 Yeah. So the idea would be to, um, create a tester, and then that just sits in a container, and basically just follow suit with what’s already been done with ZAP. So ZAP’s actually also driven with Selenium. With the application scanner there’s two emissaries side by side there. So we’ve got Selenium, which basically gets its requests proxied through ZAP. We need that in order to run active tests against your system. And the testers for the likes of the server scanner, using Nikto, or the TLS checker, using SSLyze, we haven’t necessarily implemented those yet, but we think for the emissaries there’s only going to be like one each. So you’ve got the server scanner tester, which will have the Nikto emissary, and the TLS checker, which will have SSLyze, probably.

Justin Beyer 00:37:46 Okay. So essentially you’re using Selenium as like a headless browser for ZAP to go through, to hit your test endpoints. Okay. Totally understand that. So, and this’ll probably be a silly question since it’s kind of answered by the fact that you’re using ZAP, but I assume this comes with pretty much any test case you could think of out of the box?

Kim Carter 00:38:08 Yeah. Yeah. So what’s happening there is we’re running a full ZAP active scan. So we set up a bunch of things before we actually start the active scan, like different thresholds and different scanners that it needs to use, and plugins, and that sort of thing. There’s a bunch of setup required before we actually tell ZAP to actually do the active scan. And then once the active scan’s started, it goes through a plethora of tests. Those tests are listed on the ZAP proxy docs site.

Justin Beyer 00:38:40 Okay. So we’ll link to the ZAP proxy documentation and what kind of testing it supports. I assume if a developer wanted to, and this may not be a great question for you, they could write some type of custom test if they had a weird application edge case that they wanted to check for?

Kim Carter 00:38:55 Um, so at this point in time, we don’t have the ability to do that. That’s a potential for the future, but we want to basically get hands on the tool, more hands on the tool to start with, and then actually see whether that’s something that people are actually even wanting. I don’t want to spend time on something that people don’t even need, and no one’s actually requested that yet. So if there is a request then yeah, we’ll definitely look into that. We have been thinking, right from the start, um, around the ability of actually providing the test plan as well as the job file, and then the back end could run it, which is a lot more work. So yeah, I mean, we’ll get to that bridge when we get to it. I guess we’ll cross that bridge when we get to it.

Justin Beyer 00:39:41 Exactly. It’s kind of one of those things where until there’s an actual proven use case or a need for it, there’s no reason to build it. Again, agile development: there’s no reason to spend time and money on a feature.

Kim Carter 00:39:51 Yeah. But it has been thought about from the beginning. So there’s no reason why we can’t do that.

Justin Beyer 00:39:57 Exactly. It was built into your abstractions from the initial standpoint of the architecture. Now you’ve kind of mentioned this, that there’s some amount of intelligence, but I’m assuming none of this is, you know, hard scripted testing where you’re saying, you know, take URL from job file and run X string on it.

Kim Carter 00:40:17 Yeah. So the emissary is responsible for actually how to test. We just tell the tester what we want tested, and then the tester spins up the emissary, and then the emissary’s responsible for how to go about doing its own business.

Justin Beyer 00:40:33 Okay. So again, it would be that end emissary’s responsibility to actually write the actual vulnerability test, whereas purple team is orchestrating all of these emissaries to run all of these end tests. Is there any thought process of leveraging something, you know, like, to use some buzzwords here, artificial intelligence or machine learning to kind of make test plans more efficient or get more coverage on code?

Kim Carter 00:41:02 Yeah. So that was an idea from day one as well, of having a machine learning module, um, sitting out there. I can’t remember exactly whether the orchestrator consulted it or whether each of the testers consulted it, probably the testers. But the idea was that the specific tests got more focused and got better at basically what they were doing, based on how they’d done previously, so on previous test runs and that sort of thing. So the idea is that the orchestrator or the testers consult the ML module and then get back sort of like a refined set of directives.
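
The notion of refining which checks to run based on previous test runs can be sketched with a plain frequency heuristic. This is not machine learning and not anything purpleteam implements; the function `refine_checks`, the check names, and the `min_runs` threshold are all hypothetical, chosen only to show the shape of the idea.

```python
from collections import Counter


def refine_checks(all_checks: list[str],
                  history: list[set[str]],
                  min_runs: int = 3) -> list[str]:
    """Keep checks that have found something, or all checks if history is thin."""
    if len(history) < min_runs:
        # Not enough evidence yet, so keep the full set of checks.
        return list(all_checks)
    hits = Counter(check for run in history for check in run)
    return [check for check in all_checks if hits[check] > 0]


checks = ["sql-injection", "xss", "xxe"]
# Each set holds the checks that produced at least one finding in a past run.
previous_runs = [{"sql-injection"}, {"sql-injection", "xss"}, {"sql-injection"}]
print(refine_checks(checks, previous_runs))
```

A real module would probably de-prioritize rather than drop quiet checks entirely, since a check that never fired can still catch a regression later.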

Justin Beyer 00:41:39 Okay. So it would almost be like if every single time I ran the application scanner, we got a SQL injection vulnerability and never found any XML issues, because we don’t actually use XML in the app, stop testing for XML. Yeah. Okay. That makes sense. And you’ve kind of mentioned this a little bit and it’s kind of implied by the fact that you did have an ML module concept in there, but what kind of reporting does, you know, OWASP purple team actually provide back to users?

Kim Carter 00:42:08 So what we’ve got is we’ve got the reports from the emissaries. We’ve got the test outputs from the Cucumber CLI. I think that’s it, actually. So at the moment, with two test sessions, where one of those sessions is testing one route and the other session is testing two, we’re getting back eight files in our outcomes package. So they get delivered back to the CLI, and then you can open those up and go through the reports, which are from the emissaries, and the actual test outputs, which are from the Cucumber CLI.

Justin Beyer 00:42:48 Okay. And those reports, what format are they in? I mean, I assume they’re not PDF reports, but are they just JSON reports that could be fine-tuned later? Or is it something that’s coming up in the,

Kim Carter 00:42:58 Yeah. So those reports are defined in the job file. You can tell the system whether you want an HTML report, a Markdown report, or a JSON report.

Justin Beyer 00:43:11 Okay. So essentially you could almost get a pre-packaged report to hand to the executives, or you could get something that you could put into an in-house tool that you want to format. Okay. And you kind of mentioned this earlier that you had, you know, some statistics and facts, and I assume it’s from seeing all of these reports from, you know, actual use cases. Does DAST actually reduce defects, you know, specifically something like purple team? How are you seeing an improved code quality?

Kim Carter 00:43:38 Yeah. So there’s quite a few different stats and studies around this. One I looked into quite a while ago, when I was writing my book series, was a study that was in Steve McConnell’s Code Complete book. He had a graph. There were three rows: a requirements row, an architecture row, and a construction row. I’m mainly focusing on the construction, because that’s basically developers writing code, and that’s the construction phase. So there’s a specific cost when introducing a defect. So we’ll call that 1X. If the defect is not found and fixed until system test time, then it’ll cost 10 times the original cost. If it’s not found and fixed until post-release, which is basically, um, penetration testing time, uh, traditionally, um, a team will get a penetration test done on their application, that will cost between 10 and 25 times what it would have initially. And I’ve got some other stats here. Say, for a six-month project and a two-week penetration testing engagement, that’ll generally cost you somewhere around $40,000. And what generally happens is only the top five critical bugs are fixed, and that generally costs about $24,000. You know, you have $64,000 to find and fix five criticals, as opposed to spending a hundred or $200 on a tool like purple team per month to find and fix almost all of those vulnerabilities.
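
The arithmetic behind those figures can be laid out as a short calculation. It uses only the numbers quoted in the conversation (a $40,000 engagement, $24,000 of fixes, five criticals, a six-month project, and $100 to $200 per month for tooling); it does not account for developer time spent fixing what the tool finds, which is not quantified here.

```python
PEN_TEST_COST = 40_000            # two-week engagement, figure from the episode
FIX_COST = 24_000                 # fixing the top five critical bugs
CRITICALS_FIXED = 5
PROJECT_MONTHS = 6
TOOL_COST_PER_MONTH = (100, 200)  # low and high monthly figures quoted

pen_test_total = PEN_TEST_COST + FIX_COST
cost_per_critical = pen_test_total / CRITICALS_FIXED
tool_total = tuple(monthly * PROJECT_MONTHS for monthly in TOOL_COST_PER_MONTH)

print(f"pen test + fixes: ${pen_test_total:,}")
print(f"per critical fixed: ${cost_per_critical:,.0f}")
print(f"tool over {PROJECT_MONTHS} months: ${tool_total[0]:,} to ${tool_total[1]:,}")
```

On these numbers the late-stage route works out to $12,800 per critical fixed, against a tooling subscription of $600 to $1,200 across the whole project.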

Justin Beyer 00:45:10 So when you put it like that with, you know, the fact that you’re reducing the lifetime of these code defects, you’re increasing fix rates, what is the point of doing an actual penetration test?

Kim Carter 00:45:22 Yeah. So the point of the penetration test is that you’ve got experts that are dying to find gnarly bugs. So penetration testers get quite sort of down when all they’re finding is the likes of the OWASP Top 10 and simpler sort of defects that developers should have found. So if you give them some code where they struggle a little bit more, then they’re actually going to dive in, and they’re going to find the more gnarly bugs. So they’re still going to find the defects. They’re just going to be more gnarly defects.

Justin Beyer 00:45:57 Okay. So, and I think we kind of discussed this earlier, when we talked about, you know, the overlap between penetration testing and DAST and where they kind of fit together. It’s that, you know, even though it costs a lot of money, you want to make that money worthwhile and actually find the things that your automated tools aren’t going to find. Yeah. So would you say that by using something like, you know, OWASP purple team, it’s actually going to influence the architecture of your application one way or the other? Is it going to force you away from using lower code level things that are going to have a high rate of defects, where you’re going to see a huge amount of these issues, and push you towards something that is going to be harder to introduce defects? Or do you think it’s not going to change anything?

Kim Carter 00:46:41 Yeah, well, I would say it would because, I mean, if you’re getting bombarded with similar types of defects, then the developers are smart. They’re going to basically work out, well, if we introduce this other library or some other technique or something that can stop these similar defects happening all the time, then we should be doing it, and they’re going to do it. So yeah, it will improve the architecture. It’s the same as test-driven development, right? Or similar. Test-driven development is not about testing. It’s about the actual development. It’s about the architecture. It’s about creating loosely coupled code that can change easily. Same sort of deal.

Justin Beyer 00:47:21 Exactly. Leveraging the abstractions and making it easy for you to test portions independently, to find some type of defect with functionality or features or what have you. Whereas this is kind of pushing you to not do silly things in your code, essentially, that you probably knew were bad and you did it because, you know, it was Friday at four o’clock and you wanted to be done for the day. And on that kind of note, you kind of mentioned the reporting and how that has, you know, some output formats. Have you guys done any integrations for, you know, the ChatOps concept, you know, integrating into Slack, or I guess Discord at this point would count, or something more traditional, like integrating into Jira or another ticketing system, where it’s going to create a record of everything that it found automatically and possibly apply some additional automation in that system to assign developers to it?

Kim Carter 00:48:13 Yeah. So, no, we haven’t, probably because we’ve basically just been focusing on the core product itself, and getting that as good as it can be, but we’re definitely open to anything that potentially the community would like to introduce.
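
Since the reports can be requested as JSON, a community integration of this kind could start by turning a report into ticket-sized records. The sketch below is an assumption throughout: the sample is modeled loosely on the layout of a ZAP JSON report (`site`, `alerts`, `name`, `riskdesc`, `count`), the actual purple team output may differ, and `to_tickets` is a hypothetical helper that stops short of calling any chat or ticketing API.

```python
import json

# Hypothetical sample, loosely following a ZAP-style JSON report layout.
SAMPLE_REPORT = """
{"site": [{"@name": "https://app.example.test",
           "alerts": [
             {"name": "SQL Injection", "riskdesc": "High (Medium)", "count": "2"},
             {"name": "X-Content-Type-Options Header Missing",
              "riskdesc": "Low (Medium)", "count": "5"}
           ]}]}
"""


def to_tickets(report_json: str, risk_level: str = "High") -> list[dict[str, str]]:
    """Turn alerts at the given risk level into ticket-like records."""
    report = json.loads(report_json)
    tickets = []
    for site in report.get("site", []):
        for alert in site.get("alerts", []):
            risk = alert.get("riskdesc", "")
            if risk.startswith(risk_level):
                tickets.append({
                    "title": f"[{risk}] {alert.get('name', 'unnamed alert')}",
                    "target": site.get("@name", "unknown"),
                    "instances": alert.get("count", "?"),
                })
    return tickets


for ticket in to_tickets(SAMPLE_REPORT):
    print(ticket)
```

Each record could then be posted to whichever chat or ticketing system the team uses, with assignment rules handled on that side.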

Justin Beyer 00:48:28 Yeah. Maybe that’s a good addition. Since you already have the JSON output, maybe that’s another project for someone to work on, actually integrating that JSON output into these other alerting mechanisms to pull in developers. But just to kind of close out the show here, is there anything that I missed or didn’t bring up so far that you think would be good for developers or software engineers to know about, you know, either the entire dynamic application security testing space or OWASP purple team specifically?

Kim Carter 00:48:57 No, I can’t really think of anything. I think we’ve done some pretty good coverage on it.

Justin Beyer 00:49:02 All right. Perfect. And what other research are you working on? Are you mainly just working on OWASP purple team, or are there other projects you’re working on or books you’re working on?

Kim Carter 00:49:13 Yes. At the moment it’s just purple team, and it’s been purple team for pretty much the last three years. And I started that sort of like when I was getting near towards the end of writing my book series.

Justin Beyer 00:49:27 Okay. And we’ll definitely include a link to your book series, since I think it has some helpful overlaps with some of the concepts that we discuss in here in application security as a whole. All right. Well, Kim, I just wanted to thank you for coming on the show and discussing, you know, how we can leverage things like OWASP purple team and dynamic application security testing to improve application security and reduce the code defects. This is Justin Beyer for Software Engineering Radio. Thank you for listening.

SE Radio 00:49:57 Thanks for listening to SE Radio, an educational program brought to you by IEEE Software magazine. For more about the podcast, including other episodes, visit our [email protected]. To provide feedback, you can comment on each episode on the website or reach us on LinkedIn, Facebook, Twitter, or through our Slack [email protected]. You can also email [email protected]. This and all other episodes of SE Radio are licensed under Creative Commons license 2.5. Thanks for listening.

[End of Audio]

SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)

Join the discussion

3 comments

Kim Carter says:

July 7, 2021 at 7:15 am

Hi.
https://doc.purpleteam-labs.com/ is 404
New docs are at https://purpleteam-labs.com/doc/

Byron says:

July 12, 2021 at 3:07 pm

It looks like the link `OWASP purpleteam Project Documentation` is not working as of 7/12

SE-Radio says:

October 6, 2021 at 10:16 pm

Thank you for letting us know. The link has been updated.
