BPF comes to firewalls

182 points by jasonma 6 years ago

majke 6 years ago

At Cloudflare we use BPF extensively in our firewall. We have blogged about this in the past:

* https://blog.cloudflare.com/bpf-the-forgotten-bytecode/

* https://blog.cloudflare.com/introducing-the-bpf-tools/

* https://blog.cloudflare.com/introducing-the-p0f-bpf-compiler...

We even have a piece of software called "floodgate" which is pretty much a custom runtime for iptables compiled down to BPF. It's using proprietary driver magic to offload the firewall rules engaged during large L3 attacks, and runs them with high performance in userspace. More:

* https://blog.cloudflare.com/meet-gatebot-a-bot-that-allows-u...

More BPF integration in iptables is a very good idea. The interesting bit is how to deal with more complex things like hashlimits, limits, ipsets and conntrack integration.

comex 6 years ago

So the Berkeley Packet Filter is now being used for packet filtering? What a wonderful development! What will they think of next? :)

metalliqaz 6 years ago

Yes, the title caught me off guard as well. As a BSD user for many years, it is strange to think of it as "new".
- bodyfour 6 years ago
  
  eBPF is far evolved from "classic" BPF, though. There is a historical relationship (hence the name) but they're very different animals at this point.
  Also, the traditional use of BPF wasn't to filter network traffic, but only to sieve data flowing to userland tools like tcpdump.
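
For readers who have not seen classic BPF up close: a filter is a short list of (opcode, jt, jf, k) instructions that inspects packet bytes and returns how many bytes to pass to userland (0 means the packet is dropped from the capture). Below is a minimal sketch of a toy interpreter in Python, covering only the opcodes an IPv4 "tcp dst port 80" filter needs. The opcode values are the ones defined in `<linux/filter.h>`; the program listing is modelled on what `tcpdump -d` prints for this kind of filter, but it is an assumption, not a captured dump, and `run_filter` and `make_frame` are hypothetical helper names.

```python
import struct

# Classic BPF opcodes used by this filter (values from <linux/filter.h>).
LD_H_ABS = 0x28   # A = 16-bit big-endian value at absolute offset k
LD_B_ABS = 0x30   # A = byte at absolute offset k
LD_H_IND = 0x48   # A = 16-bit big-endian value at offset X + k
LDX_B_MSH = 0xB1  # X = 4 * (pkt[k] & 0x0f), i.e. the IPv4 header length
JEQ_K = 0x15      # if A == k skip jt instructions, else skip jf
JSET_K = 0x45     # if A & k skip jt instructions, else skip jf
RET_K = 0x06      # return k (0 = drop, non-zero = bytes to keep)

# Assumed listing for "IPv4, TCP, destination port 80" on Ethernet frames.
FILTER = [
    (LD_H_ABS, 0, 0, 12),        # EtherType
    (JEQ_K, 0, 8, 0x0800),       # not IPv4 -> drop
    (LD_B_ABS, 0, 0, 23),        # IP protocol field
    (JEQ_K, 0, 6, 6),            # not TCP -> drop
    (LD_H_ABS, 0, 0, 20),        # flags + fragment offset
    (JSET_K, 4, 0, 0x1FFF),      # non-first fragment -> drop
    (LDX_B_MSH, 0, 0, 14),       # X = IP header length
    (LD_H_IND, 0, 0, 16),        # TCP destination port at 14 + X + 2
    (JEQ_K, 0, 1, 80),           # port 80 -> accept, else drop
    (RET_K, 0, 0, 262144),       # accept: keep up to 262144 bytes
    (RET_K, 0, 0, 0),            # drop
]


def run_filter(prog, pkt):
    """Run a classic BPF program over pkt and return its verdict."""
    a = x = pc = 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1  # jump offsets are relative to the next instruction
        try:
            if code == LD_H_ABS:
                a = struct.unpack_from("!H", pkt, k)[0]
            elif code == LD_B_ABS:
                a = pkt[k]
            elif code == LD_H_IND:
                a = struct.unpack_from("!H", pkt, x + k)[0]
            elif code == LDX_B_MSH:
                x = 4 * (pkt[k] & 0x0F)
            elif code == JEQ_K:
                pc += jt if a == k else jf
            elif code == JSET_K:
                pc += jt if a & k else jf
            elif code == RET_K:
                return k
            else:
                raise ValueError(f"unsupported opcode {code:#x}")
        except (struct.error, IndexError):
            return 0  # a load past the end of the packet rejects it


def make_frame(dst_port, proto=6):
    """Build a minimal Ethernet + IPv4 + L4 port header for testing."""
    eth = bytes(12) + struct.pack("!H", 0x0800)
    ip = bytes([0x45, 0]) + bytes(7) + bytes([proto]) + bytes(10)
    l4 = struct.pack("!HH", 40000, dst_port)
    return eth + ip + l4
```

Calling `run_filter(FILTER, make_frame(80))` returns the non-zero accept value, while a frame to port 443 or a UDP frame returns 0. The point is only that classic BPF selects which packets a tool like tcpdump gets to see; it does not modify or block traffic.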

nialv7 6 years ago

For anyone who doesn't know: nftables has its own byte code interpreter for filtering too. And at the time nftables was created, bpf already existed in Linux.

The author's argument against using bpf is that it's hard to reproduce the human readable rules from bpf byte code. I wonder how bpfilter is going to solve this problem.

callesgg 6 years ago

What we are missing from linux today is a networking subsystem that allows for configurable efficient hardware offloading.

Without hacking up everything, like it is done in consumer routers these days.

Will BPF handle that better than nftables?

qeole 6 years ago

Offloading eBPF programs attached to TC (traffic control interface) or XDP (at the driver level) has been possible since kernel 4.9, and keeps getting better (more complete, and easier). [Disclaimer: I work for the company working on this.]
Tools in iproute2 package are being updated too, so typically you would attach and offload programs to hardware with `tc`- or `ip`-based command lines.
- sargun 6 years ago
  
  You work at Netronome? I’m curious if we will ever see your chips in traditional data planes like switches.
ibotty 6 years ago

That's stated in the discussion as one of the major points of bpfilter against netfilter's nftables.
0xdeadbeefbabe 6 years ago

From what I can tell snabb might be able to do this, but it bypasses the kernel: https://github.com/snabbco/snabb
chrissnell 6 years ago

In this age where major security flaws are being discovered in hardware, do we really want to offload network filtering to what might be black box hardware with limited/no auditability?
- kardos 6 years ago
  
  We already have plenty of insecure non-auditable black boxes on our networks [1,2] and in our CPUs [3]. A dodgy offload chip seems like a minor concern until those are solved
  [1] https://www.networkworld.com/article/3016992/security/junipe...
  [2] https://www.theregister.co.uk/2015/09/15/compromised_cisco_r...
  [3] https://en.wikipedia.org/wiki/Intel_Management_Engine#Securi...
- viraptor 6 years ago
  
  There are various levels of offloading. There are basic operations like checksum offloading that have existed for a long time and are optional. Not sure anyone would complain about the auditability of that one. (You would see errors on the next hop if it didn't work correctly.)
  
  kardos 6 years ago
  
  I interpreted OP's point as moving that kind of complexity to a black box would open it up to being compromised and then leveraged as a backdoor into your system, or as a botnet, etc.
vbernat 6 years ago

Outside firewalling, Linux already has support for offloading switching and routing to hardware if possible (through switchdev). This works with Mellanox network cards for example.

loudmax 6 years ago

> Developers should be careful, though; this could prove to be a slippery slope leading toward something that starts to look like a microkernel architecture.

That's an interesting warning. Pushing more tasks out of the kernel would seem like a good idea to me. I thought Torvalds' argument against a microkernel design was more about performance than complexity. Is that incorrect?

mavhc 6 years ago

I assumed this was a joke, based on early Linux vs microkernel debates.

danwent 6 years ago

For those interested in a really technical deep-dive on BPF, check out: http://cilium.readthedocs.io/en/latest/bpf/

If you have questions on BPF and Cilium for advanced firewalling, you can also ask them on the cilium slack: http://www.cilium.io/slack

jandrese 6 years ago

After many years of learning the many options to iptables, I can only say it's about time. The iptables command is one of the least ergonomic ones I use regularly.

That said, while BPF syntax is great for simple cases, the boolean logic gets pretty messy in a hurry if you want to do something weird.

Simple case comparison:

  iptables: iptables -t filter -A FORWARD -p tcp --dport 80 -j ACCEPT

  theoretical bpf: allow forward tcp dst port 80

Note that the capitalization in the first command is not arbitrary, it must be there for the command to work, exactly as shown. Also, I didn't switch to the double dash option on dport frivolously, there is no single dash option for this incredibly common feature. Plus the manpages are split into a bunch of different parts and the SEE ALSO section at the bottom is woefully incomplete, making it difficult to track down exactly the page you need.

That said, incorporating all of the features from iptables into a BPF syntax is going to require a considerable expansion over what you get with tcpdump. Things like marked packets, NAT, state tracking, etc... all need to be grafted onto the language somehow. And of course everything needs to be well documented because a lot of sysadmins are going to need to learn this in a hurry and bad documentation will make them hate it and insist on keeping iptables instead, warts and all.
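
To make the comparison above concrete, here is a minimal sketch of a translator from the "theoretical bpf" one-liner to the equivalent iptables argument list. The input grammar (`<verdict> <chain> <proto> <src|dst> port <n>`) is purely hypothetical, extrapolated from the single example in this comment; the `deny` to `DROP` mapping and the name `to_iptables` are likewise assumptions. Only the iptables flags on the output side are real.

```python
# Hypothetical vocabulary for the made-up rule syntax.
VERDICTS = {"allow": "ACCEPT", "deny": "DROP"}
CHAINS = {"input": "INPUT", "forward": "FORWARD", "output": "OUTPUT"}
PORT_FLAGS = {"dst": "--dport", "src": "--sport"}


def to_iptables(rule):
    """Translate e.g. 'allow forward tcp dst port 80' into iptables argv."""
    words = rule.split()
    if len(words) != 6 or words[4] != "port":
        raise ValueError(f"cannot parse rule: {rule!r}")
    verdict, chain, proto, direction, _, port = words
    if verdict not in VERDICTS or chain not in CHAINS:
        raise ValueError(f"unknown verdict or chain in: {rule!r}")
    if proto not in ("tcp", "udp") or direction not in PORT_FLAGS:
        raise ValueError(f"unknown protocol or direction in: {rule!r}")
    if not port.isdecimal() or not 0 < int(port) < 65536:
        raise ValueError(f"invalid port in: {rule!r}")
    return [
        "iptables", "-t", "filter",
        "-A", CHAINS[chain],           # chain names are case-sensitive
        "-p", proto,
        PORT_FLAGS[direction], port,   # only the long option exists
        "-j", VERDICTS[verdict],
    ]
```

Joining the result of `to_iptables("allow forward tcp dst port 80")` with spaces gives back the iptables command shown above. Note how even this toy grammar has to fix a word order, and it covers none of the marks, NAT or state tracking mentioned in the last paragraph.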

smallbigfish 6 years ago

I somehow like the iptables syntax better. Not just in the netfilter world but anywhere there are similar ideas.
Why? Because it has a well defined structure for me.
For example '-j ACCEPT' signals my brain that I jump to a decision. I can also add after or before that a '-m comment', it will not change the jump decision I made already.
For the proposed bpf syntax it will take me 5 years to remember that the correct syntax is 'allow forward tcp' but not 'tcp forward allow' or any other combination.
(This is also part of the reason I didn't jump to nftables yet)
vbernat 6 years ago

There are two man pages: iptables(8) and iptables-extensions(8). What other man pages are you referring to?
- jandrese 6 years ago
  
  I had remembered the conntrack stuff being in a different manpage, but maybe that's faulty memory on my part.
  
  vbernat 6 years ago
  
  The iptables-extensions(8) manual page is fairly recent. AFAIK, previously, there was no manual page, but maybe it was scattered over various pages.

tinco 6 years ago

What do sysadmins generally use BPF or other more advanced firewall systems for?

I've administered only a few production systems, and the firewalls I configured were always very simple. Reject all incoming traffic except for ports 22/23 and 80/443, and all outgoing traffic except to certain package management systems, that sort of thing.

I admit I've done some slightly more complex things to rewrite things to implement some virtual network thing, but I don't think I did that in production.

tytso 6 years ago

One potential advantage of the new BPF code for firewalls is that it may make it easier to excise code owned by a certain copyright troll....

ciupicri 6 years ago

Are you referring to Patrick McHardy, a former contributor to Netfilter?
https://www.theregister.co.uk/2017/10/18/linux_kernel_commun...

arca_vorago 6 years ago

"The Linux kernel currently supports two separate network packet-filtering mechanisms: iptables and nftables. For the last few years, it has been generally assumed that nftables would eventually replace the older iptables implementation; few people expected that the kernel developers would, instead, add a third packet filter. But that would appear to be what is happening with the newly announced bpfilter mechanism. Bpfilter may eventually replace both iptables and nftables, but there are a lot of questions that will need to be answered first."

Goshdarnit. I've been trying to get ahead of the curve and have been learning and implementing nftables, and now you're telling me I might need to learn something else! Such is the life of a sysadmin I suppose.

"The use of BPF enables the writing of firewall rules in C"

Have you seen the rulesets people write in other firewall languages? This seems scary to me.

"One of the core design features for bpfilter is the ability to translate existing iptables rules into BPF programs."

nftables also does this, but I suggest not using it and writing fresh rules instead.

"even though it would be likely to supplant nftables relatively quickly. Instead, Miller said in the discussion that nftables failed to address the performance problems in Linux's packet-filtering implementation, driving users toward user-space networking technologies instead. There is a real possibility that nftables could end up being one of those experiments that is able to shed some light on the problem space but never takes over in the real world."

Ok, well I have some questions here. First, show me the benchmarks. Also, nftables is still much faster than iptables in my benchmarks, so it has largely delivered. Of course it's difficult to compete with an asic offload, but I do see how there could be lots of potential with bpf if it offloads to the interface. That said, the real potential I see is for nftables and bpf to coexist in the future as a replacement for iptables or for iptables and bpf. nftables solves a lot of real problems and working with it has been really enjoyable for me compared to years of iptables rules (I always refused to use the layer-on-top-of-iptables abstractions, so I'm talking about pure iptables.) A quick glance at bpf seems like it would be worth it for extremely high requirement cases where the investment would be worth it (just noticed the cloudflare comment for example), but for the rest of us mortals in IT depts with limited budgets, time, and knowledge workers, it seems a bit too heavy to just start implementing.

I could be wrong, but I really hope nft succeeds despite this.

marios 6 years ago

I'd really like it if instead of having "kernel developers add a third packet filter", said developers would sit down a bit and agree on how to manage firewalling at the userspace level. iptables feels like a tool that was developed to test out netfilter rules (that is, the kernel part) but not really for actual use. That would explain why there are so many frontends that attempt to abstract away the ugliness. I've seen too many "firewall setups" that are just a script calling iptables for each rule. While this works, it can be dangerous as well: if you edit the file and mess up a rule, there's a risk that only part of the ruleset is loaded. Hopefully, the operator won't have locked himself out in the process. Of course, there's the iptables-persistent package for atomically loading a ruleset (in Debian at least). The problem is that there's also a netfilter-persistent package. What's the difference between the two?
How is one supposed to pick? Then, there's also nftables. I've only glanced at it, and it looks promising but it seems there are some things that are missing. A comment in the article says that TCP MSS clamping has been added only recently. By the looks of it, it appears to be "almost there" but not quite ... which is a shame.
I'm hoping whatever implementation ends up prevailing will solve the various technical problems (performance, features w.r.t filtering capabilities) but will also provide a sane way to manage it. I feel kind of sad with the current situation. With my developer hat on, I am continuously impressed with the networking features available on Linux (Netfilter, XDP, ...). With my operator hat on, I find the general lack of usable tools as well as the inconsistency maddening.
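
As an illustration of the "script calling iptables for each rule" problem described above, here is a minimal sketch of the alternative: render the whole ruleset as one document in the iptables-save format and hand it to `iptables-restore`, which applies a table at its `COMMIT` line instead of rule by rule. The policy itself (default-drop on INPUT and FORWARD, loopback and established traffic allowed) and the helper names `render_ruleset` and `load` are assumptions made for the example, not a recommended configuration.

```python
import subprocess


def render_ruleset(tcp_ports):
    """Return an iptables-save style 'filter' table as a single string."""
    for port in tcp_ports:
        if not 0 < port < 65536:
            raise ValueError(f"invalid TCP port: {port}")
    lines = [
        "*filter",
        ":INPUT DROP [0:0]",      # default policies for the built-in chains
        ":FORWARD DROP [0:0]",
        ":OUTPUT ACCEPT [0:0]",
        "-A INPUT -i lo -j ACCEPT",
        "-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    lines += [f"-A INPUT -p tcp --dport {port} -j ACCEPT" for port in tcp_ports]
    lines.append("COMMIT")        # the table is applied as a unit here
    return "\n".join(lines) + "\n"


def load(ruleset, dry_run=True):
    """Feed the ruleset to iptables-restore (needs root; --test only parses)."""
    cmd = ["iptables-restore"] + (["--test"] if dry_run else [])
    subprocess.run(cmd, input=ruleset, text=True, check=True)
```

With this shape, a typo in one rule makes `iptables-restore` reject the input as a whole, so the previous ruleset stays in place rather than being half replaced. Running `load(render_ruleset([22, 80, 443]))` with the default `dry_run=True` only checks that the file parses.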

ciupicri 6 years ago

systemd implemented eBPF-based per-unit IP access lists and accounting [1] in version 235.

[1]: https://github.com/systemd/systemd/pull/6764
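
For context, the unit settings involved are `IPAddressAllow=` and `IPAddressDeny=`. As far as I understand the systemd.resource-control documentation, the allow list is checked first, then the deny list, and anything matching neither is permitted. The sketch below only models that evaluation order in Python with the standard `ipaddress` module; the function name `is_allowed` is hypothetical and this is not how systemd implements it (that happens in eBPF attached to the unit's cgroup).

```python
import ipaddress


def is_allowed(addr, allow, deny):
    """Model the allow-then-deny evaluation order for one address."""
    ip = ipaddress.ip_address(addr)

    def matches(networks):
        # Membership is False when the address family differs.
        return any(ip in ipaddress.ip_network(net) for net in networks)

    if matches(allow):
        return True    # an allow match wins, even if a deny entry also matches
    if matches(deny):
        return False
    return True        # no match in either list: traffic is permitted


# Example policy: deny everything except the 10.0.0.0/8 range.
ALLOW = ["10.0.0.0/8"]
DENY = ["0.0.0.0/0", "::/0"]
```

With these lists, `is_allowed("10.1.2.3", ALLOW, DENY)` is true and `is_allowed("8.8.8.8", ALLOW, DENY)` is false, which is the usual "deny any, allow my subnet" pattern.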

vegasbrianc 6 years ago

This is what project Cilium does - https://github.com/cilium/cilium

deltaprotocol 6 years ago

From your link:
>A new Linux kernel technology called BPF is at the foundation of Cilium. It supports dynamic insertion of BPF bytecode into the Linux kernel at various integration points such as: network IO, application sockets, and tracepoints to implement security, networking and visibility logic. BPF is highly efficient and flexible.

rjsw 6 years ago

The NPF firewall in NetBSD also uses BPF as its rule engine along with a JIT for several CPU architectures.

INTPenis 6 years ago

This reminds me of a cloudflare blog post I read a few years back about the xt_bpf module for iptables.

https://blog.cloudflare.com/bpf-the-forgotten-bytecode/

I wonder if the projects are related.

borplk 6 years ago

https://news.ycombinator.com/item?id=16420328

mrmondo 6 years ago

Really pleased to see BPF making further progress, well done and thank you to those involved in the implementation, testing and review process across the various projects involved.

ComodoHacker 6 years ago

Well, more user-controlled code executed by the kernel means more fun! And more work for security researchers.

snvzz 6 years ago

I'd look at Dragonfly's stack first, as that's outperforming Linux at the moment.

arca_vorago 6 years ago

DFLY has shown some really amazing numbers. I really like it and have been keeping an eye on it, not just the networking stack either, HAMMER2 seems like everything btrfs and zfs got right in one package. Have you used it in prod?
beagle3 6 years ago

What are the benchmarks you are looking at? On what kind of hardware?
mdekkers 6 years ago

> I'd look at Dragonfly's stack first, as that's outperforming Linux at the moment.

Apples, Oranges, etc.
- cat199 6 years ago
  
  true, one monolithic kernel posix-like OS which can run pretty much the same software stack as another monolithic kernel posix-like OS has nothing in common whatsoever. I'm sure that even if the code lineages are different, there are no useful algorithms which could be applied from one to the other. very good point. thanks for enlightening us.
- mdekkers 6 years ago
  
  Instead of senseless downvotes, how about some comments instead? Dragonfly isn't Linux, it is a BSD, therefore the OP is comparing apples to oranges. If you disagree, say why, don't just hit the downvotes. HN is rapidly becoming the Reddit for technosnobs.
  
  seeekr 6 years ago
  
  I would guess that both of you received downvotes because both comments failed to provide any explanation for the view expressed. I think it's good practice on HN to briefly introduce a technology where it's safe to assume that a majority are not very likely to be familiar with it, so that the comment becomes meaningful on its own.
  If you had stated: "You are comparing apples to oranges here because Dragonfly only exists on BSD, not on Linux, and as such may not be a viable choice for most users." I think you would have received no or substantially fewer downvotes. If you then had in addition to that provided a quick explanation (or more) of what "Dragonfly" is, you would have gotten upvotes instead, because that would have taught a number of us something we did not know yet.
  
  jnbiche 6 years ago
  
  I didn't downvote, but like the adjacent comment, I'm guessing you are being downvoted because you left a short, pithy, two-word comment without any explanation or justification as to why the two are apples and oranges.
  HN has always been like this, at least in the 7+ years I've been on here. Short, one-word or two-word comments are generally not appreciated. People on HN like substantive comments.