Apache Kafka — Important Designs
Filesystem, Zero-copy, and Batching

To sustain my work, I’ve enabled the Medium paywall. If you’re already a Medium member, I deeply appreciate your support! But if you prefer to read for FREE, my newsletter is open to you: vutr.substack.com. Either way, you’re helping me continue writing!
Intro
As promised in the last article, we will continue learning Apache Kafka this week. In this article, I will present my research on some of Kafka’s important designs: Filesystem, Zero-copy, and Batching.
Kafka uses the Filesystem
Before going further, let’s understand the Operating System (OS) page cache concept.

Modern operating systems usually borrow unused memory (RAM) for the page cache. Frequently accessed disk data is kept in this cache, so the system avoids touching the disk too often and mitigates the latency of disk seeks. If an application needs that memory to run, the kernel reclaims the portions used for the page cache, so the cache never degrades the system's performance.
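A quick way to observe the page cache is to read the same file twice and time both reads: the second read is usually served from RAM rather than disk. The sketch below is only an illustration and assumes a reasonably large file exists at the hypothetical path /tmp/sample.bin:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class PageCacheDemo {
    public static void main(String[] args) throws Exception {
        Path file = Path.of("/tmp/sample.bin"); // hypothetical path; point it at any large file

        long coldStart = System.nanoTime();
        Files.readAllBytes(file);               // first read likely goes to the disk
        long coldMs = (System.nanoTime() - coldStart) / 1_000_000;

        long warmStart = System.nanoTime();
        Files.readAllBytes(file);               // second read is usually served from the page cache
        long warmMs = (System.nanoTime() - warmStart) / 1_000_000;

        System.out.printf("cold read: %d ms, warm read: %d ms%n", coldMs, warmMs);
    }
}
```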
Kafka stores its data on the OS filesystem and therefore leverages the kernel page cache. Rather than keeping as much data as possible in its own memory and flushing it to the filesystem only when RAM runs out, Kafka writes all data to the filesystem immediately; the OS places it in the page cache before eventually flushing it to disk.
As a result, this approach simplifies Kafka's code base, because the OS handles the caching logic. It also benefits Kafka given that it is built on the Java Virtual Machine, which has some pain points (see the sketch after this list):
- The high memory overhead of storing objects on the heap.
- Garbage collection becomes slower as the number of in-heap objects grows.
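As a rough illustration of this idea, and not Kafka's actual storage code, the sketch below appends records to a log file through a FileChannel and lets the OS page cache buffer the writes instead of holding them as objects on the JVM heap. The class name and file path are assumptions for the example:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Minimal append-only log sketch: writes go straight to the filesystem,
// so the kernel page cache (not the JVM heap) does the buffering.
public class AppendOnlyLog implements AutoCloseable {
    private final FileChannel channel;

    public AppendOnlyLog(Path segmentFile) throws IOException {
        this.channel = FileChannel.open(segmentFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    public void append(byte[] record) throws IOException {
        // The write lands in the page cache; the OS flushes it to disk later.
        channel.write(ByteBuffer.wrap(record));
    }

    public void flush() throws IOException {
        channel.force(false); // optionally force the page cache contents to disk
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        // hypothetical segment file path for the demo
        try (AppendOnlyLog log = new AppendOnlyLog(Path.of("/tmp/segment-0.log"))) {
            log.append("hello kafka".getBytes(StandardCharsets.UTF_8));
            log.flush();
        }
    }
}
```

Because the records live in the page cache rather than as long-lived Java objects, the heap stays small and the garbage collector has far less work to do.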