Thomas Graf

SE Radio 445: Thomas Graf on eBPF (extended Berkeley Packet Filter)

Thomas Graf, co-founder of Cilium, discusses eBPF and how it can be leveraged to improve kernel-level visibility. Host Justin Beyer spoke with Graf about where eBPF and XDP can be leveraged and how they function at the kernel level. They also explored how eBPF can be leveraged across multiple networking, observability, and security use cases, including in microservice environments. They also discussed how eBPF projects, such as Cilium, can compare to side-car service mesh models, such as Istio.


Transcript

Transcript brought to you by IEEE Software
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected].

Justin Beyer 00:00:21 Hello, this is Justin Beyer for Software Engineering Radio. And today I’m speaking with Thomas Graf about eBPF. Thomas is a co-founder of Cilium and the CTO and co-founder of Isovalent, the company behind Cilium. Previously, Thomas worked at Red Hat and Cisco as a Linux kernel developer on the Linux kernel and various other open source projects. Welcome to the show, Thomas.

Thomas Graf 00:00:40 Thanks a lot for having me, Justin.

Justin Beyer 00:00:42 Yeah, of course. So just to start off the show, I want to discuss a little bit of a baseline before we dive into eBPF. I want to talk a little bit about how the Linux kernel is structured. So from a high level, can you kind of discuss, you know, how does my user space program interact with the Linux kernel, and a little bit about the internal structure of the kernel?

Thomas Graf 00:01:02 Sure. The kernel is what typically sits between an application and hardware or appliances, and it is there to abstract the hardware away, so applications don’t have to care about what hardware you’re actually on. That could be real hardware, or it could be virtualized hardware. And the kernel is fundamentally event driven. So it will only do something if you ask it to do something, right? It’s not magically doing something. There are maybe some time-related things that are happening, but generally there are system calls coming from the applications, for example: Hey, I want to open a file, or, Hey, I want to open a TCP network connection. Or it is hardware telling the kernel: I have something to read, or I have a network packet that has just come in, you need to do something about it. So the kernel is there to act on those events and to connect hardware with applications, for example, allowing applications to open a TCP connection or to store a file or to read from a file, and so on.

Justin Beyer 00:02:03 Okay. So it’s event driven, and we’re leveraging syscalls from user space into kernel space to drive those events. We’re going to dive a little bit into where eBPF fits in that, but before we get there, I just want to talk a little bit about how changes are actually made to the kernel.

Thomas Graf 00:02:20 So if we specifically talk about the Linux kernel, which we do in the context of eBPF, it is an open source community. It’s hundreds or thousands of kernel engineers, or kernel developers, who make source code modifications. I would call it relatively old school. It’s still using mailing lists. It’s not using Slack, it’s not using GitHub yet. It’s using Git; Git actually came out of the kernel development movement. So you have hundreds or thousands of people making code changes that then run on millions and millions of devices, from high-end servers to really small embedded devices, which leads to this interesting scenario that you have many, many different interests, and they all have to agree that a particular change makes sense for all of them. So all of this creates this very unique, very interesting kind of group of people that all have a similar goal, but they’re not all caring about the exact same thing.

Justin Beyer 00:03:19 Yeah. So it’s almost like that debate you get when you have the 500 distributions of Linux, of which one do I pick. You’re still at the kernel level, trying to summarize all of those higher-level specifics, like Ubuntu changes or Debian changes or Arch changes. And you’re trying to kind of boil that down to the core essence and say, okay, but this is the actual functionality we need in the kernel.

Thomas Graf 00:03:42 Exactly, exactly. But the huge benefit is that if you have something in the kernel, it’s available everywhere, right? Linux is everywhere these days. So if you manage to get functionality in, it means you have this available on any smartphone that is Linux-based, on any Linux server, on most home appliances [unclear]. So the gain you get from something integrated into the Linux kernel is huge, obviously.

Justin Beyer 00:04:06 So then, kind of diving into eBPF, how did that actually get introduced into the Linux kernel?

Thomas Graf 00:04:12 Interesting. eBPF originates from BPF, which is the older, more original form of it: the Berkeley Packet Filter, which is not older than I am, but it’s very, very old, and it is the original packet filter from the BSD days. It’s, call it, a small coding language that allows you to write filters to, for example, if you run tcpdump, specify which packets should actually be displayed, or if you have a raw socket, which packets should be received on that raw socket. So it’s something low-level in the kernel that was used to select a subset of network packets, and eBPF is the extended version of that. And the motivation for it was that it became harder and harder to find consensus among all of the Linux kernel engineers, because a certain change may make sense for a Google or a Facebook or one of the other very big companies utilizing Linux for high-scale environments, but it may not make sense for, for example, embedded systems. And this led to a discussion along the lines of: can we make the Linux kernel programmable? And this is where eBPF came in.

Justin Beyer 00:05:21 Okay. So just to kind of go a little bit back, when we’re talking about packet filtering, what actually is that? Because I know some of our listeners may not be super familiar with that lower-level network stuff.

Thomas Graf 00:05:35 Yeah. So packet filtering is done on multiple occasions. The most obvious one is, for example, a firewall, right? If you want a security firewall, you need to be able to do packet filtering to decide: should I allow this packet, or should I not allow this packet? The example that most people are probably aware of is iptables, where you will have a rule such as: I want to allow SSH, and I want to drop everything else, and I want to allow all of my reply packets. That’s a typical stateful packet filter. But there are other cases as well. One is something I’ve already mentioned, which is tcpdump. You can run tcpdump to capture network packets on a network device and display them, for example, for network troubleshooting. In this case, you also want to apply a filter and, for example, say: I only want to see network packets on port 80, or I only want to see TCP packets with the SYN flag, and so on. Those are typical examples of a packet filter.
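
The stateful rule set described above (allow SSH, allow reply packets, drop everything else) can be sketched in a few lines of Python. This is only an illustration of the idea, not how iptables or eBPF implement it; the packet dictionary layout, the `verdict` function, and the `established` set are all hypothetical names chosen for this sketch.

```python
def verdict(packet, established):
    """Decide what to do with one packet.

    packet is a dict with the (assumed) keys: src, sport, dst, dport, proto.
    established is a set of flows we have already allowed; it is the
    "state" in "stateful packet filter".
    """
    flow = (packet["src"], packet["sport"], packet["dst"], packet["dport"])
    reverse = (packet["dst"], packet["dport"], packet["src"], packet["sport"])

    # Rule 1: allow reply packets for flows we allowed earlier.
    if reverse in established:
        return "ACCEPT"

    # Rule 2: allow SSH (TCP port 22) and remember the flow.
    if packet["proto"] == "tcp" and packet["dport"] == 22:
        established.add(flow)
        return "ACCEPT"

    # Rule 3: drop everything else.
    return "DROP"


established = set()
ssh = {"src": "10.0.0.1", "sport": 40000, "dst": "10.0.0.2", "dport": 22, "proto": "tcp"}
reply = {"src": "10.0.0.2", "sport": 22, "dst": "10.0.0.1", "dport": 40000, "proto": "tcp"}
http = {"src": "10.0.0.1", "sport": 40001, "dst": "10.0.0.2", "dport": 80, "proto": "tcp"}

ssh_verdict = verdict(ssh, established)
reply_verdict = verdict(reply, established)
http_verdict = verdict(http, established)
```

The reply is accepted only because the SSH packet was seen first; the same reply packet checked against an empty `established` set would be dropped.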

Justin Beyer 00:06:33 Yeah. And commonly, I hear it discussed as, you know, tcpdump underlying something like Wireshark to display the actual captures and view the pcap files and all that kind of stuff.

Thomas Graf 00:06:43 Exactly, exactly. tcpdump is kind of the raw form and then Wireshark is the GUI.

Justin Beyer 00:06:47 Exactly. So you mentioned how eBPF has kind of this programmability in the Linux kernel. Can we kind of talk about how eBPF is actually working under the hood, like integrated into it? How is that actually hooking into the kernel?

Thomas Graf 00:07:00 Absolutely. eBPF by itself is a bytecode language, which is restricted, but relatively general purpose. So you can write programs which contain logic. The kernel then provides a whole set of hook points to which you can attach those programs and run them when a certain event happens. And this fits nicely into the event-driven architecture of the Linux kernel. So for example, you can run an eBPF program when a network packet is received, or when a network packet is transmitted, or when a system call is being made, or when a tracepoint is passed, or when a user space probe function is called, and so on. So the kernel provides a set of hook points which you can extend by running an eBPF program. And that program can then return a verdict to instruct or to indicate to the kernel what to do at that hook point, how to proceed.

Justin Beyer 00:07:53 Okay. And we are going to dive a little bit into those discussions about, you know, what those hook points are and stuff, when we discuss a little bit later about writing eBPF programs. But I just kind of want to talk about what we are talking about with the complexity of eBPF, you know, time costs, memory costs. If I want to write an eBPF program, how much of a hit on my system am I going to see?

Thomas Graf 00:08:15 eBPF is specifically designed to be as efficient as possible. And for that reason, it actually comes with a just-in-time compiler, which means that even though you’re loading generic, portable bytecode into your Linux kernel, that just-in-time compiler will compile that and translate that into whatever your CPU runs. So from an efficiency perspective, you’re very close to execution speed as if you had recompiled the kernel. eBPF programs are typically not very complex. There’s even a complexity limit to programs, because from a code perspective, you don’t want to allow users to load programs which might stall the kernel. eBPF programs have to be fundamentally secure, and having a known complexity, or an upper complexity limit, is a key part of that.

Justin Beyer 00:09:01 Yeah. And we’ll dive into some of those restrictions a little bit more later on. I know there are some restrictions around looping and pointers and all that kind of stuff. So when you’re saying just-in-time compiler, though, is that where LLVM kind of fits into that layer?

Thomas Graf 00:09:14 There are two layers of compilation that happen. First of all, eBPF is an assembly-like language. It looks very similar to x86 assembly, and yes, you can write eBPF bytecode at that level, but that’s not what you typically do. You typically use C, or a pseudo-C language, and then use LLVM to compile the pseudo-C into eBPF bytecode. And then the kernel will translate that again from the generic bytecode into, for example, x86.

Justin Beyer 00:09:42 Okay. And who’s maintaining all of this stuff within eBPF? Because it sounds like there’s definitely a lot of underlying functionality that’s dependent on constant maintenance.

Thomas Graf 00:09:52 Exactly. There are multiple layers that are stacked on top. If we start from the very bottom, there is the eBPF subsystem in the kernel, which is maintained by Daniel Borkmann and Alexei and Andrii. Sorry, Andrii, I completely forgot your last name. But there are three maintainers on the eBPF kernel side. That’s what’s on the kernel side: this is the engine, this is the JIT compiler, this is the verifier, which we’ll talk about later, these are the hook points, everything that runs in the kernel. And then there’s libbpf, which is also maintained by the kernel community; that’s low level. And then on top of that are the various toolchains to compile, such as LLVM, and then tools like Cilium, BCC, and bpftrace, which are higher-level tools. They’re still very much in the eBPF ecosystem, because most users will not typically write eBPF programs themselves. They will leverage a tool that sits on top that provides a more user-friendly way of actually using it.

Justin Beyer 00:10:50 Yeah. And we’re going to discuss bpftrace and some of those other tools when we start talking about troubleshooting the programs you’re writing in eBPF, but I just want to get a couple of examples of reasons for writing an eBPF program. What’s going to motivate me to write one of these things, rather than just depending on Cilium or something like that?

Thomas Graf 00:11:08 I think the vast majority will not write eBPF programs. I think in the beginning that wasn’t very much the case; a lot of the high-level tooling did not exist. If you look at the typical use cases for leveraging eBPF, it’s tracing, it’s networking, it’s security. For all of these major use cases, tooling exists, right? If you’re looking into profiling and tracing, you will use something like bpftrace, and only if there’s no specific program in the huge library of existing eBPF programs, only then would you actually go in and write your own eBPF program. The learning curve is definitely high. Similarly on, let’s say, the system call security side, there’s seccomp, there’s seccomp-bpf. There’s no real reason why you would go down to the eBPF level. You want to benefit from the power of eBPF, but typically you can make full use of it without actually fully understanding eBPF. So I think unless you’re looking to start a new eBPF project, most likely you will start out with some of the higher-level tooling first.

Justin Beyer 00:12:09 Yeah. So unless you’re trying to dive straight into, you know, I’m writing XYZ project and I need to do this kind of interaction with the kernel and the kernel devs won’t accept my patches, that’s when you might start looking at using something like eBPF to do that. So just kind of changing direction a little bit, the other acronym that I hear thrown around when eBPF is discussed is XDP. Can you kind of discuss what that is and how that fits into eBPF?

Thomas Graf 00:12:35 Yeah. So XDP is the eXpress Data Path. In a nutshell, it’s the ability to run an eBPF program inside of the network driver of Linux. And why would you want to do that, right? It sounds a bit crazy. If you come from the networking world, it’s actually not crazy. In the networking world, several use cases quickly revolve around how many packets per second you can handle. So for example, if you perform high-performance load balancing, or if you perform DDoS mitigation, the type of measurement you care about very quickly is packets per second. How many packets per second can you load balance? What type of DDoS attack can I sustain from a packets-per-second perspective? And in that context, it’s actually very useful to run as close to the hardware as possible. So you want to be able to do packet filtering or load balancing without actually going through millions and millions of lines of kernel code first. In that sense, XDP comes in and it allows you to program network logic incredibly close to the hardware, which means that you pay the least amount of overhead before you get to the eBPF program that does the actual work.

Justin Beyer 00:13:49 So it’s almost getting an eBPF insert before drivers, or at the same level as a driver would be, within the networking hardware.

Thomas Graf 00:13:58 It’s at the same level as a driver, but it runs, for example, before creating what’s called an SKB, a socket buffer, which is the generic abstraction of a network packet. Inside of a driver, every driver will have its own view of things; it’s still very hardware specific. And then the Linux kernel has this abstraction layer on top, which makes everything generic, and that is already pretty costly. You do memory allocations and so on. XDP allows you to run logic before you pay this big first cost. It means that you don’t have access to a lot of what the kernel provides. For example, there is no IP or TCP or anything like that; you’re talking about raw network packets. But if you do load balancing, or if you do DDoS mitigation, you don’t need any of that. All you want is to be able to look at the packet as early as possible and either make a load balancing decision or say, I need to drop this packet because it is part of an attack.

Justin Beyer 00:14:52 Okay. And just kind of diving into, let’s say, the DDoS example a little bit, or the load balancing example a little bit: how am I passing that information into XDP to say, you know, yes, these packets with this signature need to go away, or these packets with this signature need to be directed to this IP address, or something to that effect?

Thomas Graf 00:15:11 Exactly. Yes. This is where the generic nature of eBPF comes in. So eBPF consists of the program aspect, which is the actual program code, the logic. And then the second big aspect is eBPF maps. Those are data structures that can be hash tables, that can be LPM (longest prefix match) tables, and so on; a wide set of data structures. And typically the configuration of a program is in those maps. So if you look at, for example, a DDoS mitigation program, you would typically store the IP ranges, or the CIDRs, which you want to drop in an LPM table. And your XDP program will basically parse the network header, look at the source IP address, and then see if there is an LPM match in the LPM table, and if so, drop the packet. So the program logic is relatively simple, and the actual configuration of what you want to drop is encoded in the eBPF map, which means that if you need to change what you want to drop, you’re not rewriting the program. You’re only changing the content of the eBPF map. And this eBPF map can be both read and written from user space as well. So there is a system call that allows an application running in user space to take control over that eBPF program and the eBPF map.

Justin Beyer 00:16:29 Okay. So I could almost have my user space program running, you know, a DDoS mitigator that’s getting network calls with CIDR information from my security infrastructure, and that’s then updating the map in eBPF. And then my XDP program is referring to that eBPF map and acting based off that.

Thomas Graf 00:16:49 Exactly. So you have what we would typically call the control plane. The controller portion is in user space, typically written in something like Go or another high-level language, and the actual data path that needs to be really fast, handling millions and millions of packets per second, that is in eBPF and XDP, very, very low level.
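
The split described above, where the data path logic stays fixed and only the map contents change, can be illustrated with a small user-space simulation in Python. This is a sketch under stated assumptions, not real eBPF: a Python set stands in for the eBPF map, and `data_path`, `control_plane_block`, and `control_plane_unblock` are hypothetical names. `XDP_PASS` and `XDP_DROP` mirror the names of the real XDP verdicts, but here they are just strings.

```python
import ipaddress

XDP_PASS = "XDP_PASS"
XDP_DROP = "XDP_DROP"

# Stands in for an eBPF map shared between the kernel program and user space.
blocklist_map = set()


def data_path(src_ip, blocklist):
    """Fixed per-packet logic: drop if the source address is in a blocked range."""
    addr = ipaddress.ip_address(src_ip)
    if any(addr in net for net in blocklist):
        return XDP_DROP
    return XDP_PASS


def control_plane_block(blocklist, cidr):
    """User-space controller: add a CIDR to the map; the data path is untouched."""
    blocklist.add(ipaddress.ip_network(cidr))


def control_plane_unblock(blocklist, cidr):
    """User-space controller: remove a CIDR from the map."""
    blocklist.discard(ipaddress.ip_network(cidr))


before = data_path("203.0.113.7", blocklist_map)
control_plane_block(blocklist_map, "203.0.113.0/24")
during = data_path("203.0.113.7", blocklist_map)
other = data_path("198.51.100.1", blocklist_map)
control_plane_unblock(blocklist_map, "203.0.113.0/24")
after = data_path("203.0.113.7", blocklist_map)
```

The verdict for the same source address changes from pass to drop and back purely through map updates; `data_path` itself is never modified, which is the property the conversation emphasizes.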

Justin Beyer 00:17:07 Okay. And you mentioned LPM tables in passing there. I’m assuming you’re talking about longest prefix match. Can you kind of discuss what that is? Because I’m not sure how many of our listeners would be super familiar with that.

Thomas Graf 00:17:18 Yeah, this is something quite networking specific. The notion is that on the internet you have routing, and routing is the logic that decides where network packets need to go. In order to avoid every single device on the internet needing to know every other device on the internet, there is something called subnets, which means that some devices will only generally know in which direction the packet needs to go. And then the closer you get to the destination, the more information you have. In order to implement this, you have something like a longest prefix match, which means it will always match on the most accurate and most exact version of the information you have. So if you know about the exact destination IP, it will match on that, but you can also add an entry for something like 10.0.0.0/24, which means the last eight bits could be anything, and it will only match on the first 24 bits. That’s longest prefix matching. It’s something very networking specific.

Justin Beyer 00:18:21 Yeah. So usually I hear the simple example, as you kind of mentioned: you have that 10.0.0.0/24, then you have that 10.0.0.0/8, and the /24 is what’s going to win because it’s the longer prefix. It’s more defined, more specific.

Thomas Graf Exactly.
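
The /24-versus-/8 example can be worked through with Python’s standard `ipaddress` module. This is a minimal sketch using a linear scan; the function name `longest_prefix_match` and the table contents are assumptions made for illustration, and a real LPM table (in eBPF or in a router) would use a trie or similar structure rather than a loop.

```python
import ipaddress


def longest_prefix_match(table, address):
    """Return the value of the most specific prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    best_net = None
    best_value = None
    for prefix, value in table.items():
        net = ipaddress.ip_network(prefix)
        # A match wins if it is the first one, or has a longer prefix length.
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_value = net, value
    return best_value


routes = {
    "10.0.0.0/8": "coarse route",
    "10.0.0.0/24": "specific route",
}

inside_both = longest_prefix_match(routes, "10.0.0.5")
inside_slash8_only = longest_prefix_match(routes, "10.1.2.3")
no_match = longest_prefix_match(routes, "192.168.1.1")
```

Here 10.0.0.5 falls inside both prefixes and the /24 wins because 24 bits are matched instead of 8; 10.1.2.3 is covered only by the /8; 192.168.1.1 matches nothing.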

Justin Beyer So just kind of moving along here, you talked about XDP fitting in at that, you know, hardware interface, NIC level, network interface card, and eBPF kind of underlying that as the logic. Is there any other use case where you’re going to want to start using XDP in combination with eBPF?

Thomas Graf 00:19:02 I think the prominent use cases right now are DDoS mitigation and load balancing, and load balancing primarily of what we call north-south load balancing. That’s typically packets coming into a data center or into a set of machines. These types of load balancers are typically exposed to the internet, so they’re subject to a lot of traffic, and you typically have a small number of load balancers and then a vast number of servers behind those load balancers. For those, XDP is ideal, because what you typically end up doing in those load balancers is you receive a packet from the network card, you do something to the network packet, and then you send it out the same network card. So you’re not actually delivering it to an application running on that same server, on that Linux machine. All you want to do is manipulate the network packet to do the load balancing, so change certain fields in the network headers and then send it out again. Those are the prime use cases for XDP, but there are definitely others as well. There has been talk about doing early demux in XDP. There are definitely cases where you can do something like encapsulation in XDP. There are cases where I’ve seen people build a NAT46, the translation from IPv4 into IPv6 and back, for example for IoT. A lot of IoT devices will only do IPv6, but the actual backend applications are written in a way that they don’t understand IPv6, and so on. All of those low-level network functions can be done at very high speeds with XDP. And when we talk high speeds, that usually also translates to very low power consumption per operation. So it’s not necessarily millions of packets per second; it’s also just watts per packet in that sense. So if you have very little compute power available, XDP is very interesting as well.

Justin Beyer 00:20:48 Yes. So that’s where it almost has that perfect fit for the IoT environment where, you know, if I want to stick X IoT device in someone’s refrigerator, I don’t want them changing the batteries every single day, because they’re not going to buy my product anymore if that’s the case. Let’s kind of talk about how we write these programs, how I’m writing an eBPF or an XDP program. Just from a high level, what kinds of libraries can I use to write an eBPF program, for example? And then we’ll kind of discuss writing XDP programs.

Thomas Graf 00:21:20 Yeah. So first of all, I think the first category is to not even write any eBPF program, but to just leverage one of the tools. I can’t emphasize that enough, because I think there’s a perception that, oh, I need to learn about eBPF. I think that’s great, but you can leverage all of its powers without actually having to write your own program. If you want to write your own program, there is a set of libraries that you can use. There’s a Rust implementation, so you can write programs in Rust. There are ways of writing eBPF programs in Golang. Typically the actual program code is always in pseudo-C, so you always use some form of C code. It’s kind of a limited set of C, so you can’t do arbitrary C. You can do function calls by now. You can do what’s called bounded loops, but you can’t loop forever, for example. So there’s a subset of C that you can use. You write that code and then you either use LLVM, or more recently GCC also has eBPF support, and you run it through the compiler and that will generate the bytecode for you. At that point, you will load that bytecode into the kernel using the loader infrastructure. There is libbpf, there is tc, there is BCC, and a couple of other loaders. This is the code that will take the eBPF program and load it into the kernel. It will make sure that the eBPF program has the proper ELF headers and all of that, to make sure the kernel actually understands it, and then it will load it into the kernel. At that point, the kernel’s verifier will check it and make a judgment on whether the program is safe to run, and then either complain or successfully load the program.

Justin Beyer 00:22:53 Okay. So that’s a lot to unpack there. Let’s just kind of take this one bit at a time. So first we’re writing our code using some libraries. Essentially we’re creating some type of pseudo-C, and I’ve also heard the library for Python, BCC, thrown around as handling a little bit of that compilation automatically for you. So then you’re compiling it into bytecode and you’re putting it into an ELF format. Can you kind of dive into what that is and why you’re putting it into that format?

Thomas Graf 00:23:19 Yes. One aspect is just the code itself, and then another aspect is that you need metadata around the program. So you need to understand which eBPF maps this program uses. You need to understand how the program should be loaded, so what hook points it should be attached to. You need to understand who is loading this program. And then more recently also eBPF signing, so the signature of the program is also being discussed. So there is the program itself and then a bunch of metadata that needs to go along with it. And that needs to be standardized, so kernel and user space need to understand how this information is encoded, so both sides can actually understand it.

Justin Beyer 00:23:59 Yeah. And ELF is a common standardized format in the Linux world for passing these binaries around.

Thomas Graf Yes.

Justin Beyer So then, and we’re going to discuss those hooks and tracepoints and stuff like that a little bit later, it’s getting loaded in. How is that loading process actually working? Is that a kernel syscall that’s pulling that in? Is there a loader within the kernel that’s doing this?

Thomas Graf 00:24:24 So it’s a system call. There’s the bpf system call that you can use to load the program. The same system call is also used to create maps, to attach to maps, to pin maps (we can talk about that as well), to access maps, to write into maps, to read from maps, and so on. But there’s one system call that you use to load the programs. So you can say: here is my program, this is the ELF, attach it to, let’s say, the TC layer, that’s the traffic control layer, and I want to run this program every time a network packet is received, for example. You tell this to the kernel, and the kernel will take the program, verify it, and attach it to that hook point. And the moment it is attached, the next network packet that arrives will flow through the eBPF program.

Justin Beyer 00:25:12 Okay. So it’s still got that verification layer. Can we kind of talk about what that’s doing? What is that checking for? How does that kind of work? And then what kind of feedback is it giving to you as a developer if something’s wrong?

Thomas Graf 00:25:24 Yeah. I think a lot of our listeners might be wondering at this point: why are you going through all of this? Why aren’t you just loading a Linux kernel module? The Linux kernel was always extensible, and the very big difference between a Linux kernel module and eBPF is safety. eBPF is secure by design, which means that you can’t call arbitrary functions. It may sound like making the kernel programmable means I can call into arbitrary kernel functions. That’s not true. You can call into what’s called eBPF helpers, or helper calls. It’s a subset of the functions that the internal kernel API offers, but in a stable manner, so even across kernel versions it will be the same. So the amount of complexity and the number of operations an eBPF program can do is limited, and the verifier ensures that this is met.

Thomas Graf 00:26:16 So a couple of checks are performed. It will, first of all, check the overall program size. You can’t load a 50-megabyte program into the kernel; the kernel will reject that. There is an agreed-on upper limit for eBPF programs. And then the kernel will do a verification of every single branch that exists in a program. It will calculate all the possible paths that the program could go into, and it will make sure that all of those paths run to completion. So it will ensure that nothing can loop forever. It will also make sure that nothing is uninitialized, so every single piece of memory has to be initialized, and so on. There’s a long list of things that it will do to ensure that an eBPF program, even if it’s harmful, cannot crash and cannot stall the kernel. So even if somebody has access to the bpf system call, which is typically protected by privileges, and even if somebody loads a malicious program, the verifier will ensure that the kernel can continue running. So that’s a major difference between a Linux kernel module and eBPF.
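
Two of the checks mentioned above, a size limit and a guarantee that every path runs to completion, can be illustrated with a toy model in Python. This is emphatically not the kernel verifier, which also tracks register and memory state and accepts bounded loops; the sketch simply rejects any cycle in a control-flow graph. `MAX_INSNS`, `toy_verify`, and the program representation (instruction index mapped to successor indices) are all hypothetical choices for this example.

```python
MAX_INSNS = 8  # hypothetical size limit, far smaller than any real one


def toy_verify(program):
    """Check a toy program given as {instruction_index: [successor_indices]}.

    An instruction with no successors is an exit. Execution starts at index 0.
    Returns (accepted, reason).
    """
    if len(program) > MAX_INSNS:
        return False, "program too large"

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    state = {i: WHITE for i in program}

    def visit(i):
        if i not in program:
            return "jump out of range"
        if state[i] == GREY:
            # We came back to an instruction that is still on the current path.
            return "back-edge (possible infinite loop)"
        if state[i] == BLACK:
            return None
        state[i] = GREY
        for nxt in program[i]:
            err = visit(nxt)
            if err:
                return err
        state[i] = BLACK
        return None

    err = visit(0)
    return (err is None), (err or "ok")


branching = {0: [1], 1: [2, 3], 2: [3], 3: []}   # two paths, both reach the exit
looping = {0: [1], 1: [0]}                        # jumps back to the start forever
too_big = {i: [i + 1] for i in range(8)}
too_big[8] = []                                   # nine instructions in total

branching_result = toy_verify(branching)
looping_result = toy_verify(looping)
too_big_result = toy_verify(too_big)
```

The branching program is accepted because every path ends at an exit, the looping one is rejected at load time rather than at run time, and the oversized one is rejected before any path analysis happens.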

Justin Beyer 00:27:22 Okay. So it has that almost implicit benefit that, even though it’s event driven, there’s no way to completely stall the kernel on an event. So, you know, if I hook into network traffic, I can’t write a program that just does an infinite loop and basically takes your entire server down if I got on it.

Thomas Graf Yeah, exactly. You can still do things like drop all your SSH packets and lock yourself out, but you cannot do something like, let’s just loop forever, and then the kernel will stall.

Justin Beyer 00:27:52 Exactly. So you can still shoot yourself in the foot, like everything with Linux, but you can’t completely take the system down. So what is the benefit of this over a user space program? I mean, compared with a kernel module, as you kind of mentioned, the benefit is that it’s safer. But from a developer perspective, why am I going to go towards eBPF over writing a kernel module or writing some user space program to do this?

Thomas Graf 00:28:20 The only occasion where you actually want to do this is if you need to run in the Linux kernel. So when do you need to run in the Linux kernel? One aspect is that you are seeing everything that the application is doing. So for example, if you want to implement what’s called container runtime security, or system call filtering, and restrict the set of system calls a particular application can make, you need to run in the Linux kernel, because system calls are being made through the kernel. So you need to run as part of the Linux kernel. Or if you want to get involved in networking, you either need to do this in the kernel, or you need to do something like user space networking, where you bring the hardware to user space. But typically the main reason why you want to write an eBPF program is if the use case you have needs to be solved in the kernel. That’s why eBPF is there. It does not make sense to port arbitrary applications into eBPF.

Justin Beyer 00:29:18 Exactly. So there’s no reason to port my hello world program into eBPF because, you know, it works just fine in user space. And it’s really starting to get into that concept of, if you need to start intercepting syscalls, that’s when you need to start going into the kernel.

Thomas Graf 00:29:33 But, well, why not? A good example is if you want to do performance troubleshooting of your applications. You can either go and make modifications to your application and, for example, add trace points and add printks or printfs, or link in some debugging library that will do profiling for you, but that requires you to recompile and restart your application. With eBPF you can do a lot of this for an already running application, because the kernel is always in a position to observe what the application is doing. And this is where it gets really powerful. The kernel has access to observe everything that is going on on the system, and it can take control over everything that’s going on in the system in a completely transparent manner, without having to change any application code. This is where a lot of the benefit is coming in.

Thomas Graf 00:30:25 If you look at a lot of the eBPF talks around, for example, tracing and profiling, it’s about performance troubleshooting on live applications in production, where you cannot restart them. You cannot run a debug version of this, but you still need to figure out why your application is slow. You still need to figure out, for your distributed application running on hundreds of servers, what is consuming CPU, like which functions are consuming how much CPU. Typically, if you run this on your laptop, you would link in some debug library or something that will give this information. You cannot do this for the application running in production, right? This is the key benefit of the kernel. It can see and control everything that’s on the system.

Justin Beyer 00:31:04 Exactly. Cause you know, you can’t push a debug version of my giant distributed mail client. Nobody’s going to like it if all of a sudden their email has a bunch of error messages popping up on the client side. It’s bad user experience. So just kind of driving back here, we’ve talked a little bit about eBPF maps, and you mentioned pinning maps. Can we just kind of discuss what pinning a map is and why you’re going to do that?

Thomas Graf 00:31:32 So as we mentioned, eBPF maps contain the state of the application or configuration. It’s a set of data structures. That just immediately brings up the question, who owns these maps, right? So you have the eBPF program, it’s running in the kernel, but the eBPF program can actually access multiple maps, and a map can be accessed from multiple eBPF programs. So the map itself cannot be owned by the eBPF program. You can, for example, have a program on the network level and another one on the system call level, and they can both interact with the same eBPF map. We also said that the Linux kernel is event driven and a user space application may not live forever, which means that it opens up this difficult question: who owns the eBPF map? At what point is it no longer used? And for this purpose, it was decided that we need to somehow pin that to the file system.

Thomas Graf 00:32:29 So it needs to have some representation where you can see this eBPF program exists or this map exists. And there needs to be some way to release that resource, so kind of delete the map. And for this reason there is an eBPF file system, and every eBPF map can be represented on the file system and you can access it through a file descriptor. This allows a user space process that loads the eBPF program into the kernel to actually quit and terminate and then start again and access the map again. So you don’t need to have a process running all the time in user space, where as soon as that process terminates, all of your programs would be gone. All the resources, all the programs, all the maps are pinned to the file system. So they will exist even if the user space process terminates or crashes or something. This is why pinning exists. It’s also a way for other programs to discover the maps. So you can have multiple user space processes, and they can discover a map by just opening the file system and opening the file, basically.

Justin Beyer 00:33:34 Yeah. So your program’s coming up and it’s able to get that, you know, file descriptor, doing an open syscall on that actual file that’s underlying. And as you kind of said, and we might discuss this a little bit when we discuss the containers, they’re ephemeral. They can go away and disappear and go everywhere, and they still need to be able to access these things without getting rid of that file. But that brings up a question. Is there any kind of garbage collection on this? How does it know this is no longer needed anymore?

Thomas Graf 00:34:04 No, not really. It’s the responsibility of the program loading, or the process loading the program, to also remove it again. And typically eBPF programs are a very crucial part of the overall system. If the eBPF program does load balancing or, for example, container networking, you want that to continue running under all circumstances, because it not being available could have tremendous consequences. So the worry is always that some failure in user space could affect what the overall system is doing. So there is no automatic garbage collection. It’s the responsibility of the process which loaded the program to also remove it again.

Justin Beyer 00:34:46 Okay. So essentially whatever loaded the eBPF program through the eBPF syscall is also responsible for unloading the eBPF program and cleaning up any of the mess that it created. Yes. Okay. So that also brings up another question that I have. So we’ve mentioned throughout things like trace points, you know, kprobes, ktrace. What are these things within the kernel that we’re actually hooking?

Thomas Graf 00:35:11 It can be almost everything, actually. So it’s actually a very, very good question, because the number of hooks that exist is continuously being extended. They already range from, if we start from the top, user space probes. So even for your regular applications, processes running in user space, you can define a so-called user space probe and say, if this function inside my application is called, then invoke the eBPF program. This is how you can profile your own user space application. You can do kprobes. So for any kernel function with a known symbol, you can attach an eBPF program when that function is entered and when it exits. Or for any trace point: the Linux kernel provides or maintains a long list of trace points, stable names to which you can attach, for example when a network packet is being transmitted on the IP layer, or when the generic block storage layer does a read or does a write.

Thomas Graf 00:36:12 And so on. All of those are generic trace points. Then it goes all the way into very, very specific scenarios, such as TCP congestion algorithms. So you can write programs that optimize how a TCP congestion algorithm works. Or every system call that is being made, you can attach an eBPF program to. Or XDP, as we mentioned, which is very networking specific, or socket operations. So sockets are used to do network operations of the application, and you can attach your programs to those. eBPF is really becoming this very general purpose programmability layer and is making its way into almost all of the kernel subsystems. There is an eBPF program type to do infrared frequency reprogramming or something like that, very niche, right? Very niche. But it shows the wide usage of eBPF. I think in the future, it will even go as far as code that is currently written by kernel developers natively and merged into the Linux

Thomas Graf 00:37:08 kernel in the future might be written in eBPF, because it will be more secure. It might be hard for us to believe, but even kernel developers will make mistakes, even kernel developers will introduce bugs, and having a verification layer that is another safety net will have huge benefits, because any bug in kernel code will affect millions and millions of users out there. So I think eBPF is really becoming this very general programming layer for the Linux kernel in general. We’re talking Linux kernel right now, but I think it will actually make its way outside of Linux as well. We are already seeing early signs of this. Like there’s a FreeBSD port. There have been tweets from Microsoft about this. I definitely think eBPF will become much bigger and will make operating systems in general programmable.

Justin Beyer 00:37:56 Yeah, because it’s almost providing that safe abstraction layer within the kernel, not really an abstraction, but a way to hook into the kernel in a more safe manner than people were doing when they were writing modules, or like in the Windows world, when you’re writing really low-level drivers that are hooking into it to do some of these things. You know, it gives you a little bit of verification that you aren’t getting with those other things.

Thomas Graf 00:38:19 And it also makes programming easier, because if you write a Linux kernel module, for example, you are using internal kernel APIs, like functions or libraries that the kernel internally provides. There are no stability guarantees given by the kernel community that these functions have some sort of API or even ABI compatibility, which means every kernel version may break this. So you will have to adjust your kernel module code for every kernel release. You might even have to release it for different kernel versions, and so on. eBPF has this abstraction layer in between. Yes, you can, for example, call a function that will get you a random number back, but this is a stable function that you call. And even if the backing implementation changes, the actual helper call, that API, will always remain stable.

Justin Beyer 00:39:09 Yeah. So kind of going along that example, you know, if, for example, the kernel changes how you’re getting a random number and they change the function name from get random to get urandom, now that’s a breaking change for your module. And now you’re putting a nice little if statement in there to check and see which kernel version you’re on and whether it’s supported or not. But from an eBPF perspective, you can have just a simple, you know, get eBPF random. And whether it goes to random or /dev/urandom or /dev/random, it doesn’t matter, because from a developer perspective, you’re getting the same API call between

Thomas Graf 00:39:46 Exactly. For those who have written kernel modules in the past, a kernel module typically has thousands of preprocessor if conditions: if kernel version X, if kernel version Y, and then different code. It’s a mess, right? It’s a mess.

Justin Beyer 00:40:00 Yeah. And it’s horrible to maintain, because now you have to watch the kernel mailing list and, you know, almost have a little regex filter looking for whatever function names you’re using to see, did someone propose a change to this? Did someone propose a change to that? And then giving that feedback to say, no, don’t do this. Or, ooh, wait, I’ve got to fix my entire module now and re-release it. But kind of going back a little bit, you’d mentioned the concept of hooking anything with a known symbol in the kernel. I’m not sure how many of our users would be familiar. What does that mean when you’re saying a known symbol?

Thomas Graf 00:40:32 So if you run your kernel, typically you don’t have the symbol table available, which means that kernel functions are behind arbitrary addresses, right? Like memory addresses. And in order to understand where a function starts, so where the code for a function is located in memory, you need something like a symbol table. The symbol table will map the name, for example the system call write, like where is the implementation for the write system call, and will contain the memory address where the code for that system call is in memory. And if you have that symbol table available, you can basically say, I want to run an eBPF program when the system call function for write is invoked or when it returns. And this is what kprobes is about. For this to function, you need that symbol table, because as a human, you don’t recognize the memory addresses, but you will recognize the human-readable function names.

Justin Beyer 00:41:34 Yeah. And even if you do know the memory address specifically, cause you sat there and debugged it, the problem is when you have stuff like address space layout randomization turned on, which should be turned on in your kernel and you shouldn’t be turning it off. You still need that symbol table to do the translation for the address for you, since a hard-coded address really doesn’t work anymore.
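
As a rough illustration of the name-to-address lookup described above, the following Python sketch parses text in the format Linux exposes through `/proc/kallsyms` (address, type letter, symbol name, optional module). It is only a toy: the sample addresses and the `helper_fn` / `example_mod` names are made up for illustration, and the function names `parse_symbols` and `resolve` are hypothetical, not part of any eBPF tooling.

```python
# Toy symbol-table lookup using the /proc/kallsyms text layout:
#   <hex address> <type letter> <symbol name> [module]
# The addresses below are invented sample data, not real kernel addresses.
SAMPLE_KALLSYMS = """\
ffffffff81000000 T _stext
ffffffff812a4b10 T ksys_write
ffffffff812a4c40 T __x64_sys_write
ffffffffc0123000 t helper_fn\t[example_mod]
"""


def parse_symbols(text):
    """Return a dict mapping symbol name -> address (as an int)."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        address, _sym_type, name = fields[0], fields[1], fields[2]
        table[name] = int(address, 16)
    return table


def resolve(table, name):
    """Look up a symbol by its human-readable name; None if unknown."""
    return table.get(name)


symbols = parse_symbols(SAMPLE_KALLSYMS)
print(hex(resolve(symbols, "__x64_sys_write")))
```

Because the lookup goes through the name each time, it keeps working when the addresses differ from boot to boot, which is the point made about address space layout randomization.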

Justin Beyer 00:41:54 Okay. So we’ve kind of talked about writing these programs a little bit. So how am I going to troubleshoot this program? If I’m writing one, are there tools that I’m going to use to do that?

Thomas Graf 00:42:08 Yes. I think we can speak from our own experience as a maintainer of Cilium, probably one of the biggest eBPF programs that currently exist. We invested a lot into a troubleshooting and debugging layer, and it’s fantastic because we were able to leverage the tracing layer of eBPF. So one of the prime use cases of eBPF is tracing. And when you do tracing, you typically hook into user space applications or into kernel functions, and then you emit some event. For example, I have observed something, or you want to emit a string or something and then show that in user space. And for this, the so-called perf ring buffer exists. It’s a ring buffer that allows eBPF programs to send events to user space in a very efficient manner, and for user space to retrieve those events.

Thomas Graf 00:42:58 And, for example, print them. I mean, it’s an arbitrary ring buffer. They could be messages, it could be really anything. We are specifically using that to basically bake in debugging. So it means that in the code, we can basically print debug messages anywhere we want them, and then on the fly actually enable this. This is very interesting because of the nature of eBPF: it’s possible to atomically replace programs. We can, even in a production environment, actually enable data path troubleshooting messages, very similar to how a kernel developer would do debugging with printk statements. We can do that, but actually without recompiling the kernel and without restarting it. So that allowed us to do a lot of fantastic troubleshooting when needed, because if you run in the kernel, you can’t simply attach a debugger. It is possible, but it’s not as easy as it is for a user space application.
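
The producer/consumer relationship described here can be sketched with a toy bounded buffer in Python. This is only a user-space model under stated assumptions: the real perf ring buffer is shared memory between the kernel and user space, and the class name `ToyEventBuffer`, its methods, and the choice to drop new events when the buffer is full are all assumptions of this sketch rather than a description of the kernel implementation.

```python
from collections import deque


class ToyEventBuffer:
    """Bounded FIFO standing in for an event ring buffer (toy model only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.lost = 0  # events dropped because the buffer was full

    def emit(self, event):
        """Producer side: called by the (simulated) in-kernel program."""
        if len(self.events) >= self.capacity:
            self.lost += 1  # assumption of this model: drop when full
            return False
        self.events.append(event)
        return True

    def drain(self):
        """Consumer side: user space reads everything buffered so far."""
        drained = list(self.events)
        self.events.clear()
        return drained


buf = ToyEventBuffer(capacity=4)
for seq in range(6):  # producer runs before the reader gets scheduled
    buf.emit({"seq": seq, "msg": "debug"})
batch = buf.drain()  # reader catches up later, in one batch
print(len(batch), "events read,", buf.lost, "lost")
```

The design point the sketch tries to show is that the producer never waits for the reader: events accumulate until user space gets around to draining them, and the buffer size decides how much can pile up in between.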

Justin Beyer 00:43:51 Well, yeah, cause you’re trying to hook into the underlying operating system. And when you hook a debugger to things, it throws everything off and it gets all weird.

Thomas Graf 00:43:59 Yeah. And in a lot of cases, you can’t even just sleep and break, right? If you are processing a network packet, you can’t just say, okay, let’s wait. That option does not exist. You would break your kernel. Or if you’re in a system call, you could actually sleep, right? That’s actually feasible. The application doing the system call would then sleep and stall, and it would look like a deadlock or something, but that would actually be okay. But for a lot of the other eBPF program usages, it’s not possible to even just break and stop as a typical debugger would do.

Justin Beyer 00:44:32 Okay. And you were kind of mentioning, you know, the debug layer and being able to atomically replace programs. How are you actually doing that atomic replacement in eBPF?

Thomas Graf 00:44:40 Yeah. So the really fantastic property of eBPF is that the program and its state are separate. If you look at a typical application, if you want to replace the program of your user space process, you have to restart the entire process, and you will also lose all of its memory. You will lose the stack. You will lose everything. You will start from scratch. With eBPF, this is different. You will also lose the stack, though, because the stack only exists for one invocation of the program. But between invocations of the program, you can replace the program atomically while the map state prevails, which means you can, for example, collect statistics or maintain a connection tracking table in eBPF, or have firewall rules in the map, and then replace the program with the BPF system call atomically while the state still persists. Which means you can apply hotfixes. You can upgrade your programs without actually disrupting anything that’s going on.
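
The separation of program and state can be modeled in a few lines of Python. This is a toy sketch only: `ToyHook`, `count_v1`, and `count_v2` are hypothetical names, a plain dict stands in for an eBPF map, and swapping a Python reference stands in for the kernel’s atomic program replacement.

```python
class ToyHook:
    """A hook point holding one program reference plus shared map state."""

    def __init__(self, program, shared_map):
        self.program = program        # replaced by a single reference assignment
        self.shared_map = shared_map  # lives independently of any one program

    def fire(self, packet):
        # Each invocation starts with fresh locals (the "stack");
        # only the map carries state from one invocation to the next.
        return self.program(packet, self.shared_map)

    def replace(self, new_program):
        self.program = new_program    # the map is left untouched


def count_v1(packet, counters):
    counters["packets"] = counters.get("packets", 0) + 1
    return "pass"


def count_v2(packet, counters):
    # "Hotfix" version: also counts bytes, and keeps using the same counters.
    counters["packets"] = counters.get("packets", 0) + 1
    counters["bytes"] = counters.get("bytes", 0) + len(packet)
    return "pass"


stats = {}
hook = ToyHook(count_v1, stats)
hook.fire(b"abc")
hook.fire(b"defg")
hook.replace(count_v2)  # swap between invocations; stats survive
hook.fire(b"hi")
print(stats)
```

The packet counter keeps its value across the swap, while the byte counter only starts at the first invocation of the new version, which mirrors the "state persists, stack does not" distinction.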

Justin Beyer 00:45:47 So almost, if you had, let’s say, a debug switch in your control plane, you could say, flip this into debug mode, and it’ll swap out and load the debug version of your eBPF code. And because your map is pinned, as we discussed earlier, that doesn’t get thrown away, because that’s the responsibility of the program to clean up. Instead, now you have this debugging version. But how does that work with that ring buffer that you were discussing earlier? Does that get cleared, or is that new for each instance of a program, or does that actually persist?

Thomas Graf 00:46:19 That persists as well. A ring buffer is just another eBPF map, so the ring buffer is always there as well.

Justin Beyer 00:46:25 So essentially because that’s still there again, as you kind of said, you’re not losing anything by swapping it out to the debug version. Now, you know, the buffer might just fill up a little more because you have more debug messages coming out.

Thomas Graf 00:46:36 Yeah. And because it’s a ring buffer, even if the eBPF program is very quick and may process, let’s say, only a couple of thousand packets per second, it may still emit hundreds of thousands of debug messages. The user space program doesn’t have to be quick enough to read everything in real time. It can buffer everything in the ring buffer and then at a later stage read it, because the user space program may not be scheduled on the CPU at the same time as the eBPF program is running. So this allows for a kind of single switch, like, oh, I need debugging now, maybe for the next 10 packets, and then disable it again, because all I need is a couple of traces to figure out what’s going on in my program. And because you can atomically replace the program, you can even do things like this.

Thomas Graf 00:47:21 If you know the failure scenario, you can only enable the debugging if a packet is arriving matching that specific failure scenario. This is how we have, for example, tracked down a bug around IPv6 extension headers. We would have logic in there that would only emit the traces if a packet with IPv6 extension headers was received. And so it’s actually been fantastic, at least from the perspective of a Linux kernel developer, where that’s kind of our standard, right? From the perspective of, let’s say, a Java application developer, it’s still very difficult to debug an eBPF program, but compared to kernel development, it’s actually very handy.
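
The idea of emitting traces only for packets that match a known failure scenario can be sketched as follows in Python. This is a toy model, not Cilium’s actual logic: the function names are hypothetical, only the first Next Header field of the fixed 40-byte IPv6 header is inspected, and the table lists just four of the extension header types.

```python
# Next Header values for a few IPv6 extension headers (IANA protocol numbers).
# This is a deliberately incomplete subset for illustration.
EXTENSION_HEADERS = {
    0: "hop-by-hop options",
    43: "routing",
    44: "fragment",
    60: "destination options",
}
IPV6_HEADER_LEN = 40
NEXT_HEADER_OFFSET = 6  # byte offset of the Next Header field


def has_extension_header(packet):
    """True if the first Next Header of a raw IPv6 packet is an extension header."""
    if len(packet) < IPV6_HEADER_LEN:
        return False
    if packet[0] >> 4 != 6:  # version nibble must be 6
        return False
    return packet[NEXT_HEADER_OFFSET] in EXTENSION_HEADERS


def handle(packet, trace_log):
    """Process a packet; emit a trace only for the suspected failure case."""
    if has_extension_header(packet):
        kind = EXTENSION_HEADERS[packet[NEXT_HEADER_OFFSET]]
        trace_log.append("ext header: " + kind)
    return "pass"


def make_ipv6(next_header):
    """Build a minimal 40-byte IPv6 header with the given Next Header value."""
    header = bytearray(IPV6_HEADER_LEN)
    header[0] = 0x60
    header[NEXT_HEADER_OFFSET] = next_header
    return bytes(header)


log = []
handle(make_ipv6(6), log)   # TCP directly after the fixed header: no trace
handle(make_ipv6(44), log)  # fragment header: traced
print(log)
```

Keeping the match condition narrow is what makes this usable in production: ordinary traffic pays only for the check, and the trace volume stays proportional to the rare case being investigated.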

Justin Beyer 00:48:03 Yeah. You know, when you’re used to having that user space APM, you know, application performance monitor, reading everything in your code, and you just go to a web interface and see run times for each function, it’s a lot easier than trying to write a program to handle some ring buffers and maps and all that kind of stuff. So I kind of want to change directions a little bit here and start talking about some of the use cases that we have with eBPF. I mean, we’ve discussed a bunch of them so far, you know, anything from XDP being used in the NICs to projects like Cilium and some of the use cases you guys fulfill. But I do want to talk about, as we’ve moved into a more containerized world, how do you get eBPF to understand that concept of a container?

Thomas Graf 00:48:46 That’s interesting, because the Linux kernel, and this may be a surprise, does not actually know what a container is. The Linux kernel understands cgroups, namespaces, process boundaries or process forking, but it doesn’t actually understand containers. A container uses all of what I just mentioned to create container isolation and the concept of a container. So you cannot go and say, kernel, give me the memory usage of container ID X. The kernel does not know what container ID X is. You would need to translate the container ID into the actual cgroup ID and then ask the kernel, okay, this container is using this cgroup for memory consumption or for memory isolation, show me the configuration or show me the usage, and so on. And this is where eBPF comes in and can make the kernel indirectly aware of concepts like Kubernetes, containers, microservices, and so on, by allowing it to, for example, do packet filtering or system call filtering with the awareness of container IDs or containers or pods and so on, by encoding that as part of eBPF programs. That’s what’s making it so powerful. The reason why that’s even needed in the first place is because typically the time between writing a kernel feature and the time it gets available is multiple years.
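
To make the container-to-cgroup translation a bit more concrete, here is a small Python sketch that works on text in the `/proc/<pid>/cgroup` format (`hierarchy-ID:controller-list:cgroup-path`). It goes in the reverse direction for a single process, recovering a container ID from a cgroup path. The sample paths and the assumption that the runtime embeds a 64-hex-digit container ID in the path are illustrative only; actual layouts vary by container runtime and cgroup version, and the function names are hypothetical.

```python
import re

# Assumption: the container runtime embeds a 64-hex-digit ID in the cgroup path.
CONTAINER_ID = re.compile(r"([0-9a-f]{64})")

# Invented sample in the /proc/<pid>/cgroup layout:
#   hierarchy-ID:controller-list:cgroup-path
SAMPLE_PROC_CGROUP = """\
12:memory:/docker/{cid}
0::/system.slice/docker-{cid}.scope
""".format(cid="ab" * 32)


def cgroup_paths(text):
    """Map controller name -> cgroup path ('' is the unified cgroup v2 entry)."""
    paths = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            paths[controller] = path
    return paths


def container_id(path):
    """Extract a container ID from a cgroup path, or None if there is none."""
    match = CONTAINER_ID.search(path)
    return match.group(1) if match else None


paths = cgroup_paths(SAMPLE_PROC_CGROUP)
print(container_id(paths["memory"]))
```

The kernel side only ever sees the cgroup; the container ID is a naming convention layered on top by the runtime, which is why some component has to carry this mapping.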

Thomas Graf 00:50:10 Like it’s often five plus years. Somebody writes a kernel feature, and by the time that kernel release gets into the hands of users, it’s just so long that it’s very rare that the kernel is actually up to date in terms of use cases. The kernel that you run right now has probably not been written with your current use case in mind. And with eBPF programs, you can obviously load any program that you want, which means you can bridge that gap. This is what’s making it so powerful.

Justin Beyer 00:50:36 Yeah. So it’s giving you that flexibility that you don’t actually get with kernel development, because of the fact that kernel development’s delayed. It takes forever to get a patch in. And then even once you have that in, it has to be rolled out to all the distributions, and not everybody’s running the latest version of the kernel in their distribution, and, you know, every other what-if scenario. But just kind of circling back. So you mentioned the concept of namespaces, cgroups, all that kind of stuff. And as you kind of mentioned, probably most software engineers aren’t going to know what that is. They might understand the concept of a container, but they might not know how that works at the lower kernel level. Can you kind of dive into that a little bit and explain it?

Thomas Graf 00:51:13 Sure. So I think containers have two aspects. One is the container image, which I will not cover: the packaging, shipping the entire dependencies, and so on. That’s one aspect. And then the other aspect is the deployment form, like how it actually runs on the system, and the movement away from virtual machines as the deployment or isolation form to containers, in which case the kernel becomes responsible for negotiating the resources between containers. Because obviously we’re running multiple containers on a single system, which means somebody needs to be aware and in charge of, for example, making sure that one container cannot use all of the memory of the entire system or all of the CPU of the system. Or who ensures that one application running in a container cannot just access the other application in another container on the same system? Because from a networking perspective, that would obviously be possible if you don’t do the isolation. For this purpose, the concept of namespaces was created, actually long before containers existed as a concept. It’s been many, many years. And then later on the concept of cgroups was introduced as well. It’s all about making the Linux kernel multi-tenant, so that multiple applications, multiple tenants can operate on a single system without necessarily trusting each other, by providing isolation on all levels: CPU, memory, networking, storage, file system mounts, users, and so on.

Justin Beyer 00:52:43 Yeah. And this is where that benefit of the kernel almost being that choke point helps, because now it can filter syscalls that are accessing these items based on those cgroups. And that’s where, you know, you can start leveraging eBPF to hook into that. But to kind of dive into some examples here, how would you roll out eBPF in something like a microservice environment, you know, for things like observability, networking, security? I know that’s a lot of what Cilium is actually capable of doing.

Thomas Graf 00:53:14 Yeah. So Cilium is a networking and security layer specifically for Kubernetes and cloud native. And for networking, we can use the example of Kubernetes, which I guess is the most dominant container platform right now. You would typically provide a so-called CNI plugin, Container Network Interface. At that point you are the component that is responsible for allowing containers to communicate with each other, container to container, and also for how to communicate to the outside world and how to allow the outside world to talk to containers. And in that sense, the CNI plugin actually has a lot of freedom on how to do this. Specifically, most CNI plugins will use so-called virtual ethernet pairs to create a virtual ethernet cable, basically, between the network namespace of the container and what’s called the host namespace.

Thomas Graf 00:54:13 So namespaces and cgroups almost always come in the form of a tree. You have an init or a host namespace or cgroup, and then you can go down into multiple levels. So you need to bridge the container namespace with the actual host, and then also bridge that out to the outside world. This is what the CNI does, and that gives the opportunity to attach an eBPF program to that virtual ethernet cable. So it sees and controls all of the network packets that come in and out of a container. And for the case of system calls, you would typically attach the eBPF program to the cgroup. So all of the system calls being done in the context of that cgroup, which then is equivalent to all of the system calls being done in that container, would go through that eBPF program as well. So you’re using the kernel-level constructs which are used to build containers as a way to integrate into containers.

Justin Beyer 00:55:12 Okay. And that’s almost something like where Falco would hook into that system, using eBPF and filtering based on those cgroups and then looking at the specific syscalls coming from those specific containers. Exactly. Yes. Yeah. And back on Episode 341, I know we discussed a little bit about container networking, and on 290 we discussed some stuff around Docker and security and some of this lower-level stuff and how this isolation works. But just to kind of dive into some specific examples here with observability, you know, we’ve kind of mentioned tracing. Would I hook my eBPF program into some type of overall distributed tracing framework? Or is it really just pod specific, for Kubernetes, for example?

Thomas Graf 00:55:56 Both, actually. So in the case of Cilium, we offer kind of all the way from low-level network flow logs, what you would expect from a typical network layer, all the way into higher-level tracing, such as OpenTracing, tracing information that you would typically only expect from the application level, like understanding of HTTP calls, Kafka topics, gRPC calls, and so on, DNS information, clearly beyond what you would typically get from a network observability platform. We can do both. And that’s a really powerful aspect of eBPF, because we can combine what we see on the network level from a packet perspective, but we can also hook into, for example, the socket layer and understand the data that an application sends or receives, or even instrument Envoy, which is the layer 7 proxy that Cilium uses for some of the HTTP parsing, and get information from there as well. That’s the true power of it, because eBPF can be used to glue everything together and then combine all of that information to provide a very meaningful visibility layer that goes well beyond what typically existed so far.

Justin Beyer 00:57:08 Yes. So you’re almost getting that full stack monitoring capability from an observability perspective. So this would almost replace, or at least supplement, something like putting an APM library into my specific code, or, you know, having all of my engineers implement OpenTracing. I can get this same level of observability without having to make every engineer rewrite every piece of code.

Thomas Graf 00:57:34 Yeah, absolutely. And the really powerful part is that this will work even for applications that you don’t want to change anymore. Right. In a lot of scenarios, you may not even have access to the source code. So it’s not a decision whether you want to change or don’t want to change your application. For some applications, you just can’t. You’re running a proprietary piece of code and you still want to instrument it, or you want to understand what type of HTTP calls are going on, what is the latency, how long does it take for a REST API to respond? And maybe you don’t have access to the source, so you cannot add instrumentation code. And this is where eBPF comes in, and you can do this at multiple levels. You can combine the visibility you have at the network layer, and then also instrument, with eBPF, for example, the HTTP layer of the application, which is the HTTP Go library.

Thomas Graf 00:58:23 And, for example, add trace points to figure out how long does it take to decode and parse an HTTP header? How long did it take for the application to respond, and so on, and combine all of that. So you can answer questions: Is the application slow? Is the network slow? Is it TCP? Is it overall CPU on the system? You can correlate what you see in terms of latency with, for example, TCP retransmissions, so you understand whether maybe it’s a lossy network. And this is the really interesting part. You can combine all of this, which gives you a very compelling visibility layer, which was just not available before eBPF, not because the information did not exist, but simply because it was typically spread across multiple systems.

Justin Beyer 00:59:04 Yes. So how does this compare to something like an Istio service mesh?

Thomas Graf 00:59:09 Interesting question. We get that a lot, and there’s a lot of overlap in the value we provide. With both a service mesh and Cilium, the ask or the promise is to provide visibility into applications, network service calls, and so on in a transparent manner. The implementation is very different. A service mesh like Istio typically uses what are called sidecar proxies, injecting proxies into the application pods and extracting visibility from there. The outcome is similar: you get a piece of code that sits in there transparently and can observe and control. In the service mesh’s case it’s a sidecar; for Cilium it’s the eBPF program. The big difference is that there’s obviously a latency and overall performance overhead difference between a sidecar and eBPF. And we could talk a long time about that, but I think an even more fundamental difference is that a service mesh, for a lot of its functionality, typically relies on having end-to-end visibility.

Thomas Graf 01:00:15 You need to have the sidecar proxies on both sides. For example, mutual TLS will not work if you don’t have sidecars on both sides. With a lot of what Cilium is doing, because it’s integrating at the kernel level, it does not require this. It does not need to manage both sides. You can easily do, for example, HTTP latency tracing while only running on one side of the connection. So if, for example, the client is somewhere else and you are not controlling the client, you can still measure HTTP latencies. But overall, I think you can get the same or very similar information from both systems.
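
As a rough illustration of why one side of the connection is enough for latency tracing, the sketch below pairs request and response events observed at a single vantage point. It is plain Python with a hypothetical event format (timestamp in milliseconds, connection ID, event kind) and assumes one outstanding request per connection; it does not reflect how Cilium actually represents or collects these events.

```python
from typing import Dict, Iterable, List, Tuple

# (timestamp_ms, connection_id, kind) where kind is "request" or "response".
# This event shape is an assumption made for the example.
Event = Tuple[int, str, str]


def http_latencies(events: Iterable[Event]) -> List[Tuple[str, int]]:
    """Pair each response with the last request seen on the same connection.

    Both events are observed on the same host, so no cooperation from the
    remote peer is needed. Assumes one outstanding request per connection.
    """
    pending: Dict[str, int] = {}
    latencies: List[Tuple[str, int]] = []
    for ts, conn, kind in events:
        if kind == "request":
            pending[conn] = ts
        elif kind == "response" and conn in pending:
            latencies.append((conn, ts - pending.pop(conn)))
        # A response with no matching request is ignored.
    return latencies


if __name__ == "__main__":
    observed = [
        (0, "a", "request"),
        (100, "b", "request"),
        (250, "a", "response"),
        (500, "b", "response"),
    ]
    for conn, ms in http_latencies(observed):
        print(f"connection {conn}: {ms} ms")
```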

Justin Beyer 01:00:51 Yeah. And we’ve discussed, I think back in episode 361, more on the Istio service mesh specifically. But as you kind of mentioned, it’s all dependent on the proxy and the sidecar being able to work on both ends of the connection, and you get some of those benefits, like mutual TLS and some of those other security benefits. But as you mentioned, if you don’t control the client, you’re missing half the connection, so there’s not really as much value there. And you’re also adding a latency overhead for all of these connection calls, whereas with eBPF, as you mentioned earlier, it’s near execution time.

Thomas Graf 01:01:25 Exactly. And I think the other thing to mention here is that the sidecar will only see layer four and up, right? You don’t have any visibility into what’s going on in the kernel. So you cannot correlate this information with, for example, network-layer information or TCP retransmissions, and so on. You cannot answer the question of whether it’s the network or the application. All you see is kind of the sidecar. And there’s also just a complexity overhead. I think the real benefit of the eBPF model is that from a user perspective, nothing changes, right? The kernel, to most people, is basically already this huge black box, and you’re adding programs into the kernel and it will kind of extract visibility. You’re not changing anything about how it looks to the end user.

Thomas Graf 01:02:14 Whereas in the service mesh model, you’re definitely changing things in a way. I mean, the sidecar is visible to application developers. It’s running as part of the Kubernetes pod, which is typically owned by the application team. So it’s definitely visible to some extent, and you’re changing how the network flows are happening. It’s definitely a more involved architectural change. And I don’t want to bash service mesh too much. I think a lot of what service mesh is bringing in is actually absolutely fantastic. I’m not a hundred percent in agreement with the current sidecar implementation model. I think what we will see longer term is that the control plane and the user interfaces of service mesh will continue to exist. And I think we’ll see more and more of the service mesh data plane implemented by something like Cilium or eBPF, where you still get the same high-level benefits, but you will have a more efficient data plane implementation.

Justin Beyer 01:03:15 Yeah. So you’ll essentially start to see, as I can imagine, that something like mutual TLS might still exist within that service mesh layer. Some of the pod authentication might still live within that service mesh control plane. But when we start talking about lower-level observability, application tracing, and network tracing, that’s going to start to maybe migrate closer to the kernel world, where you’re preserving kind of the black box layer between the kernel and user space.

Thomas Graf 01:03:45 Absolutely. I think there’s another very fundamental, interesting point here, which is that Cilium can, for example, do what’s called policy-driven TLS termination. So you can terminate the TLS connection on behalf of the application. You can also do this with a service mesh. There’s a difference here, though: if you run this as part of a sidecar, you are within the bounds of the pod, within the bounds of what’s owned by the application. So if the application gets compromised in some way, whatever the sidecar is doing also gets compromised. In the case of Cilium and eBPF, we can do this outside of the pod. So it’s outside of the control of the application, which is really interesting if you are running untrusted workloads. So if you cannot fully trust the workloads that you’re running in your clusters, but you still need, for example, TLS termination, or you want token injection, or you want certain services to use certificates or keys in order to access an external service, with eBPF we can do this outside of the pod, which means you can run fully untrusted workloads and don’t have to expose any secrets or certificates to them.

Justin Beyer 01:04:51 Okay. So when you get the proprietary black box from your vendor, you can still, you know, inject some of the security that you wanted there.

Thomas Graf 01:04:58 Exactly. Yeah, that’s a common use case.

Justin Beyer 01:05:01 So I just want to wrap up the show now and start closing this out. So what are some resources you would direct people to if they want to learn more about eBPF, they want to learn more about XDP, you know, writing this code, something like Cilium and where that fits?

Thomas Graf 01:05:18 Yeah, I think the best starting point is a website called ebpf.io. It’s a community-owned website that we launched a couple of months ago. It will give you an overview of eBPF, basically everything we covered today, and also list all of the projects that exist in the eBPF ecosystem. So if you’re interested in tracing, what are the projects that exist around tracing? If you want system call security, what projects exist? If you need container networking, what are the projects that exist? And so on. We also have an eBPF Slack community at ebpf.io/slack. That link is also on ebpf.io. There are hundreds of eBPF developers, and other people learning about eBPF or intending to learn about eBPF, in our Slack community, helping each other out. And then we have also covered Cilium; you can find more information about Cilium on cilium.io. A source of great talks is the eBPF Summit, which we hosted a couple of weeks back: dozens of really amazing talks about open source projects in the eBPF space. You can also find that on ebpf.io. There’s everything from great intro-level talks to low-level eBPF verifier internals, really anything that your heart could ever wish for in the eBPF space.

Justin Beyer 01:06:35 Yeah, I was watching some of those summit talks. There’s actually a really good keynote from the recent one on just some basics of writing a basic eBPF program. And that might be good for our listeners if they want to start getting exposed to what the code actually looks like. So what other stuff are you working on now? Any additional research in the area?

Thomas Graf 01:06:53 There’s so much going on in eBPF right now that I think it would be unfair to mention just a couple. I think in general, we’re seeing the space literally exploding right now. We’re seeing huge investments into Linux security modules, which is the overall runtime security layer that the kernel provides. Google is heavily investing in that. I think that’s super interesting. We’re seeing eBPF obviously in the overall security space, in the visibility space, in the networking space. We’re seeing more and more tracing tools come up. I think Pixie Labs just launched a couple of weeks back as well with really interesting topics. There’s just a lot going on right now. It’s really hard to even keep track of what’s going on. I think the promise is huge, and a lot of people are just starting to realize the full potential of it. And we’ll see a ton of stuff getting created with eBPF. And again, I think it won’t be that a lot of people will natively use eBPF. eBPF has been designed for kernel-level developers and it will remain like that. But a lot of the higher-level projects that are written on top of it will be very, very interesting to a lot of people.

Justin Beyer 01:07:59 Yeah. So it’ll be like most things in open source: I’m going to start by taking someone’s already-implemented project and then tweaking it as I need to for my environment. All right. Well, Thomas, thanks for coming on the show and discussing eBPF and XDP and how we can leverage them for observability, networking, and security. I will definitely put links to those sites that you mentioned and some of the resources I used for research on this topic. Also, this was great.

Thomas Graf Thanks a lot for having me, Justin.

Justin Beyer This is Justin Beyer for Software Engineering Radio. Thank you for listening.

[End of Audio]


SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)
