<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>EBPF on Monish Kumar&#39;s Blog</title>
        <link>https://itsmonish.pages.dev/tags/ebpf/</link>
        <description>Recent content in EBPF on Monish Kumar&#39;s Blog</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en-us</language>
        <lastBuildDate>Sun, 08 Mar 2026 16:42:18 +0530</lastBuildDate><atom:link href="https://itsmonish.pages.dev/tags/ebpf/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>Barbwire: an eBPF based behavioral correlator</title>
        <link>https://itsmonish.pages.dev/blog/barbwire/</link>
        <pubDate>Sun, 08 Mar 2026 16:42:18 +0530</pubDate>
        
        <guid>https://itsmonish.pages.dev/blog/barbwire/</guid>
        <description>&lt;h2 id=&#34;introduction&#34;&gt;Introduction
&lt;/h2&gt;&lt;p&gt;After the eBPF learning phase I wrote about in the &lt;a class=&#34;link&#34; href=&#34;https://itsmonish.pages.dev/blog/learning-ebpf&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;last post&lt;/a&gt;, I wanted to build something that looked like a real security tool. Not because I thought I&amp;rsquo;d ship it, but because the programs I had written before were stupid. At some point you need a problem with actual constraints.&lt;/p&gt;
&lt;p&gt;Barbwire is that project. It&amp;rsquo;s a lightweight behavioral correlator for Linux. It watches for processes that open sensitive files and then make network connections within a short time window. This is classic exfiltration behavior and a rough fingerprint for credential harvesting and similar activity. It&amp;rsquo;s not a production EDR. It&amp;rsquo;s an experiment. It&amp;rsquo;s on my &lt;a class=&#34;link&#34; href=&#34;https://github.com/ItsMonish/barbwire&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;GitHub&lt;/a&gt; if anyone&amp;rsquo;s interested.&lt;/p&gt;
&lt;h2 id=&#34;the-core-design-decision&#34;&gt;The core design decision
&lt;/h2&gt;&lt;p&gt;The first instinct when building something with eBPF is to put logic in the BPF side. It&amp;rsquo;s running in the kernel, it&amp;rsquo;s fast, why not do the correlation there?&lt;/p&gt;
&lt;p&gt;I thought about it and decided against it early. The BPF verifier gets unhappy with complex logic, and in-kernel logic is harder to debug, less portable across kernel versions, and for this use case, unnecessary. Userspace is fast enough. The design I landed on: BPF is a dumb data collector, userspace does the thinking.&lt;/p&gt;
&lt;p&gt;The BPF side attaches to three syscall tracepoints: &lt;code&gt;sys_enter_openat&lt;/code&gt; for file opens, &lt;code&gt;sys_enter_connect&lt;/code&gt; for network connections, and &lt;code&gt;sys_enter_execve&lt;/code&gt; for process execution to track lineage. All three write into a single shared ring buffer with an event type field. Userspace reads and routes.&lt;/p&gt;
&lt;p&gt;I used a ring buffer over a perf event array deliberately. Perf event arrays are per-CPU, so you have to poll multiple buffers and deal with out-of-order events. A ring buffer is a single buffer shared across CPUs, which means better throughput and a simpler consumer. I wanted the path of least resistance.&lt;/p&gt;
&lt;h2 id=&#34;how-correlation-and-scoring-works&#34;&gt;How correlation and scoring work
&lt;/h2&gt;&lt;p&gt;On the userspace side, three in-memory maps hold state: recent file opens per PID, process lineage from exec events, and a deduplication map to avoid alerting on the same PID repeatedly. A cleanup goroutine evicts stale entries every 30 seconds.&lt;/p&gt;
&lt;p&gt;The correlation window is configurable. I have it set to 5 seconds. When a connect event comes in, Barbwire looks back at that PID&amp;rsquo;s recent file opens and checks if any fall within the window. If they do, it scores the pair.&lt;/p&gt;
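&lt;p&gt;The lookback can be sketched in a few lines of Go. This is a minimal illustration, not Barbwire&amp;rsquo;s actual code: the &lt;code&gt;fileOpen&lt;/code&gt; type, the &lt;code&gt;withinWindow&lt;/code&gt; helper and the sample paths are all hypothetical names made up for the example.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import (
	"fmt"
	"time"
)

// fileOpen is a hypothetical record of one openat event for a PID.
type fileOpen struct {
	path string
	at   time.Time
}

// withinWindow returns the opens that happened at most `window` before
// connectAt. Opens recorded after the connect are ignored.
func withinWindow(opens []fileOpen, connectAt time.Time, window time.Duration) []fileOpen {
	var hits []fileOpen
	for _, o := range opens {
		if o.at.After(connectAt) {
			continue // opened after the connect, not part of this pair
		}
		if connectAt.Sub(o.at) > window {
			continue // too old to correlate
		}
		hits = append(hits, o)
	}
	return hits
}

func main() {
	now := time.Now()
	// Recent opens keyed by PID, as the connect handler would see them.
	recent := map[uint32][]fileOpen{
		41890: {
			{path: "/etc/passwd", at: now.Add(-2 * time.Second)},
			{path: "/etc/hosts", at: now.Add(-9 * time.Second)},
		},
	}
	for _, o := range withinWindow(recent[41890], now, 5*time.Second) {
		fmt.Println("correlated:", o.path)
	}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With a 5 second window only the &lt;code&gt;/etc/passwd&lt;/code&gt; open survives, because the other one is 9 seconds old by the time the connect arrives.&lt;/p&gt;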
&lt;p&gt;Scoring is based on file and connect pairs, not individual signals. A process connecting to the network is normal. Opening &lt;code&gt;/etc/passwd&lt;/code&gt; on its own is normal. Both within 5 seconds is suspicious. The score comes from which file category was accessed:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;suspicious_files&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;category&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;credential access&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;base_score&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;patterns&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;/etc/passwd&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;/etc/shadow&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;/.aws/credentials&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;category&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;ssh key exfiltration&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;base_score&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;patterns&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;.ssh/id_rsa&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;.ssh/id_ed25519&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Process lineage modifies the score up or down. A shell spawning a process that reads credentials and connects out is more suspicious than the same behavior happening under systemd:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;suspicious_parents&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;comm&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bash&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;modifier&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;comm&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sshd&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;modifier&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;legit_parents&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;comm&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;systemd&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;modifier&lt;/span&gt;: -&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;comm&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;dockerd&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;modifier&lt;/span&gt;: -&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Alert threshold and severity thresholds are decoupled. The alert threshold decides whether a score becomes an alert at all, while severity only classifies an alert that has already fired. When a connect event comes in, Barbwire picks the highest scoring file match for that PID and emits one alert. One alert per connect event per process, no duplicates (hopefully).&lt;/p&gt;
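&lt;p&gt;Put together, the scoring could look roughly like the Go sketch below. The base scores and parent modifiers are copied from the config excerpts above. Everything else is an assumption for illustration: the substring matching, the &lt;code&gt;alertThreshold&lt;/code&gt; and &lt;code&gt;highSeverity&lt;/code&gt; values, and the function names are not taken from the real project.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import (
	"fmt"
	"strings"
)

// category mirrors one entry of suspicious_files in the config.
type category struct {
	name     string
	base     int
	patterns []string
}

var categories = []category{
	{"credential access", 3, []string{"/etc/passwd", "/etc/shadow", "/.aws/credentials"}},
	{"ssh key exfiltration", 3, []string{".ssh/id_rsa", ".ssh/id_ed25519"}},
}

// parentModifiers merges suspicious_parents and legit_parents.
var parentModifiers = map[string]int{"bash": 2, "sshd": 3, "systemd": -2, "dockerd": -2}

// Assumed cut-offs, chosen only for this example.
const (
	alertThreshold = 3
	highSeverity   = 5
)

// bestScore returns the highest score over all opened paths and the
// category that produced it. It returns 0 when nothing matches.
func bestScore(paths []string, parent string) (int, string) {
	best, reason := 0, ""
	for _, p := range paths {
		for _, c := range categories {
			for _, pat := range c.patterns {
				if !strings.Contains(p, pat) { // assumed: plain substring match
					continue
				}
				s := c.base + parentModifiers[parent] // unknown parents add 0
				if s > best {
					best, reason = s, c.name
				}
			}
		}
	}
	return best, reason
}

// severity classifies a score that has already passed the alert threshold.
func severity(score int) string {
	if score >= highSeverity {
		return "HIGH"
	}
	return "LOW"
}

func main() {
	score, reason := bestScore([]string{"/etc/hosts", "/etc/passwd"}, "bash")
	if score >= alertThreshold {
		fmt.Println(severity(score), reason, score)
	}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under these assumed numbers, the same &lt;code&gt;/etc/passwd&lt;/code&gt; read scores 5 under &lt;code&gt;bash&lt;/code&gt; and 1 under &lt;code&gt;systemd&lt;/code&gt;, so only the first one crosses the threshold.&lt;/p&gt;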
&lt;p&gt;A fired alert looks like this:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;┌─ barbwire alert — PID 41890  ─────────────
│  command  : python
│  file     : /etc/passwd
│  connect  : 2404:6800:4007:821::200e:80
│  severity : HIGH
│  reasons  : credential access, suspicious parent: fish
│  parent   : fish (pid 13387)
│  gparent  : tmux: server (pid 9483)
└─────────────────────────────────────────────
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;my-screwups-while-building-it&#34;&gt;My Screwups while building it
&lt;/h2&gt;&lt;p&gt;A few things broke in ways that were instructive. The first one was actually two bugs hiding behind each other, which made it genuinely annoying to debug.&lt;/p&gt;
&lt;p&gt;There was a constant mismatch between the C and Go sides. &lt;code&gt;EVENT_EXEC&lt;/code&gt;, the event type for the &lt;code&gt;sys_enter_execve&lt;/code&gt; tracepoint, was 2 and &lt;code&gt;EVENT_CONNECT&lt;/code&gt;, the event type for &lt;code&gt;sys_enter_connect&lt;/code&gt;, was 3 in the BPF code, but I had them swapped in Go. At the same time, I had attached the exec tracepoint and called Close() on it immediately without a defer, so it detached right after attaching (yeah, this is embarrassing).&lt;/p&gt;
&lt;p&gt;The symptom was simple: I had no events at all. What was actually happening was that execve events were being routed to the connect handler. Since those events carry no IP address family, they got dropped as invalid. Meanwhile connect events were being routed to the exec handler, which was already closed. Everything was failing quietly in a different place than where I was looking.&lt;/p&gt;
&lt;p&gt;I spent a while staring at the correlation logic before going back to basics and checking whether the tracepoints were even active. Once I caught the missing defer, connect events started showing up, but in the wrong handler. That&amp;rsquo;s when I found the swapped constants. Two separate bugs, one symptom, neither pointing at the other. Really annoying.&lt;/p&gt;
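&lt;p&gt;One way to make this kind of failure louder is to keep the Go constants next to a comment that names the C values, and to have the router report unknown types instead of dropping them. The Go sketch below is an illustration under assumptions, not the project&amp;rsquo;s code: the values 2 and 3 come from the story above, while &lt;code&gt;eventOpen = 1&lt;/code&gt; and the &lt;code&gt;route&lt;/code&gt; function are made up for the example.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import "fmt"

// Event type tags. These must match the BPF side exactly.
// EXEC = 2 and CONNECT = 3 come from the post; OPEN = 1 is assumed.
const (
	eventOpen    uint32 = 1
	eventExec    uint32 = 2
	eventConnect uint32 = 3
)

// route names the handler for an event type. Unknown types produce an
// error so that a mismatch shows up in the logs instead of vanishing.
func route(eventType uint32) (string, error) {
	switch eventType {
	case eventOpen:
		return "open", nil
	case eventExec:
		return "exec", nil
	case eventConnect:
		return "connect", nil
	default:
		return "", fmt.Errorf("unknown event type %d", eventType)
	}
}

func main() {
	for _, t := range []uint32{2, 3, 9} {
		handler, err := route(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(t, "->", handler)
	}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This would not have caught the swap by itself, but logging which handler each type lands in makes it visible within the first few events.&lt;/p&gt;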
&lt;p&gt;The most useful bug involved stale memory in the event struct. &lt;code&gt;bpf_ringbuf_reserve&lt;/code&gt; gives you a pointer to uninitialized memory. If you skip the &lt;code&gt;__builtin_memset&lt;/code&gt; before writing your fields, whatever was in memory from the previous event bleeds into the fields you didn&amp;rsquo;t set. I had filenames with garbage characters appended, leftovers from longer filenames earlier in the buffer. The fix is one line. But you have to know it&amp;rsquo;s needed.&lt;/p&gt;
&lt;h2 id=&#34;limitations-and-takeaways&#34;&gt;Limitations and takeaways
&lt;/h2&gt;&lt;p&gt;The biggest problem with Barbwire is the one that can&amp;rsquo;t be tuned away.&lt;/p&gt;
&lt;p&gt;For example, &lt;code&gt;curl&lt;/code&gt; reads &lt;code&gt;/etc/passwd&lt;/code&gt; on every invocation. Here it is making the call:
&lt;img src=&#34;https://itsmonish.pages.dev/images/barbwire/1.png&#34;
	
	
	
	loading=&#34;lazy&#34;
	
		alt=&#34;curl opening passwd&#34;
	
	
&gt;&lt;/p&gt;
&lt;p&gt;Not because it&amp;rsquo;s malicious, but because that&amp;rsquo;s how it works. I believe basically any program that uses libc networking does this. Barbwire flags it every time. No matter how hard I tuned the knobs, I couldn&amp;rsquo;t get rid of such alerts. Then it hit me: single-process correlation without broader context has a ceiling.&lt;/p&gt;
&lt;p&gt;Production security tools handle this differently. They use binary hash whitelisting instead of process name matching, which is spoofable. They build process ancestry graphs that track behavioral chains across multiple processes, and behavioral baselines built up over time. A shell that reads &lt;code&gt;/etc/passwd&lt;/code&gt; and then forks a child that connects out would be caught by a graph-based correlator. Barbwire misses it because the file open and the connect are in different PIDs.&lt;/p&gt;
&lt;p&gt;Building this made that limitation concrete in a way I don&amp;rsquo;t think I&amp;rsquo;d have gotten from just reading about EDR design. Honestly, I never even thought of these before. You also understand why process ancestry graphs exist when your tool fails exactly where they&amp;rsquo;d succeed.&lt;/p&gt;
&lt;p&gt;The other thing I took away: the BPF to userspace boundary is unforgiving. Memory layout, pointer semantics, which helper functions apply where, what the verifier accepts. The documentation covers the what, but the why only shows up when things break.&lt;/p&gt;
&lt;p&gt;All things considered, I had fun with all this. That&amp;rsquo;s the whole point if you think about it.&lt;/p&gt;
</description>
        </item>
        <item>
        <title>I wanted to poke at the Linux kernel. eBPF was the answer</title>
        <link>https://itsmonish.pages.dev/blog/learning-ebpf/</link>
        <pubDate>Sat, 07 Mar 2026 20:32:58 +0530</pubDate>
        
        <guid>https://itsmonish.pages.dev/blog/learning-ebpf/</guid>
        <description>&lt;h2 id=&#34;introduction&#34;&gt;Introduction
&lt;/h2&gt;&lt;p&gt;A while back I came across &lt;a class=&#34;link&#34; href=&#34;https://blog.cloudflare.com/how-cloudflare-auto-mitigated-world-record-3-8-tbps-ddos-attack/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Cloudflare&amp;rsquo;s writeup&lt;/a&gt; on how they mitigated a 3.8 Tbps DDoS attack, the largest ever disclosed publicly at the time. The post goes deep into how their systems detected and dropped attack traffic autonomously (quite interesting stuff). One part that caught my eye: they used XDP and eBPF to drop packets directly at the NIC level, before the kernel network stack even saw them. I had no clue what eBPF was back then. I thought that was interesting, bookmarked it, and then forgot about it (yeah, I do that a lot).&lt;/p&gt;
&lt;p&gt;When I eventually got around to it, I was looking to get into lower level Linux stuff anyway. How the kernel works, how you can poke at it from the outside. eBPF kept coming up. I also saw it framed as a safer alternative to writing kernel modules for certain use cases. That part made immediate sense to me, because writing kernel modules means one bad pointer dereference and your machine is just down. So I thought maybe I could look into it seriously.&lt;/p&gt;
&lt;h2 id=&#34;what-even-is-ebpf&#34;&gt;What even is eBPF
&lt;/h2&gt;&lt;p&gt;eBPF lets you run sandboxed programs inside the Linux kernel without modifying the kernel source or loading a module. Before your program runs, the kernel verifies it. No unbounded loops, no out-of-bounds memory access, every code path has to terminate. If it doesn&amp;rsquo;t pass, it doesn&amp;rsquo;t load. That doesn&amp;rsquo;t make it bulletproof, though. I still ran into bugs and weird behavior now and then, but the verifier catches the most obvious mistakes.&lt;/p&gt;
&lt;p&gt;It started as the Berkeley Packet Filter, which was just a way to filter network packets efficiently. The &amp;ldquo;extended&amp;rdquo; part came later and expanded it well beyond that, as people wanted better options for tracing and debugging inside the kernel. These days it shows up in observability tools, performance profilers, security products, and a lot of infrastructure tooling that runs quietly in the background of production systems.&lt;/p&gt;
&lt;h2 id=&#34;how-i-learned-it&#34;&gt;How I learned it
&lt;/h2&gt;&lt;p&gt;I started reading &lt;a class=&#34;link&#34; href=&#34;https://www.oreilly.com/library/view/learning-ebpf/9781098135119/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Learning eBPF&lt;/a&gt; by Liz Rice and went through it cover to cover. It builds up a solid mental model of how everything fits together, the verifier, maps, ring buffers, CO-RE, without throwing you into kernel internals from page one. Great book, would recommend it if you&amp;rsquo;re starting out.&lt;/p&gt;
&lt;p&gt;After that I spent time reading through the examples in &lt;a class=&#34;link&#34; href=&#34;https://github.com/libbpf/libbpf-bootstrap&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;libbpf-bootstrap&lt;/a&gt; and &lt;a class=&#34;link&#34; href=&#34;https://github.com/cilium/ebpf/tree/main/examples&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;cilium/ebpf&lt;/a&gt;. That was just to see how things are done in real projects and scripts.&lt;/p&gt;
&lt;p&gt;Then I started writing stuff. A per-process file access counter. A pure C rough version of what would eventually become &lt;a class=&#34;link&#34; href=&#34;https://github.com/ItsMonish/barbwire&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;Barbwire&lt;/a&gt;. None of it was useful or anything. It was just to get familiar with the toolchain, get yelled at by the verifier, figure out why and repeat.&lt;/p&gt;
&lt;h2 id=&#34;what-i-liked-about-ebpf&#34;&gt;What I liked about eBPF
&lt;/h2&gt;&lt;p&gt;A couple of things genuinely impressed me.&lt;/p&gt;
&lt;p&gt;The first was CO-RE, which stands for Compile Once, Run Everywhere. Despite my reservations about Java, I always thought the idea of write once, run anywhere was pretty neat (thanks to the JVM). CO-RE brings the same idea to Linux kernel programming. It handles struct field offset relocations across kernel versions, so your program isn&amp;rsquo;t silently tied to the exact kernel it was compiled against. Without it, anything that walks kernel structs like &lt;code&gt;task_struct&lt;/code&gt; can break the moment you run it on a machine with a different kernel. With it, the same binary works across versions. It is remarkable as far as I can tell.&lt;/p&gt;
&lt;p&gt;The second was how dynamic the whole thing is. With kernel modules you write code, compile, load, and if something goes wrong you restart. With eBPF you attach programs to running systems and detach them without rebooting or disrupting anything. For observability and containers especially, that matters a lot. You can instrument a live system, see what you need, and pull the probe out. No downtime, no risk of leaving a bad module loaded.&lt;/p&gt;
&lt;h2 id=&#34;what-came-next&#34;&gt;What came next
&lt;/h2&gt;&lt;p&gt;After a few weeks of experiments I felt ready to build something with an actual purpose. That became Barbwire, a lightweight behavioral correlator that watches for processes opening sensitive files and then making network connections. I&amp;rsquo;ll write about that in the next post.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s not a production security tool. It&amp;rsquo;s an experiment. But building it was a good reason to use eBPF for something real instead of just counting file accesses forever.&lt;/p&gt;
</description>
        </item>
        
    </channel>
</rss>
