
Kernel Bypass Networking with Netmap on FreeBSD

🎯 What You'll Learn

This guide delves into kernel bypass networking with Netmap on FreeBSD. You'll understand the underlying principles, how to configure and use Netmap for high-rate packet I/O, and how it integrates with VALE software switches to build advanced network topologies.

The Quest for Wire-Speed Packet Processing

In an era dominated by high-speed networks (10GbE, 40GbE, 100GbE and beyond), traditional operating system network stacks often become a bottleneck. The conventional path for a packet involves numerous context switches between user and kernel space, multiple data copies, and complex protocol processing, all of which introduce significant latency and consume valuable CPU cycles. For applications demanding wire-speed packet processing, such as high-frequency trading platforms, network intrusion detection systems (NIDS), software-defined networking (SDN) controllers, and network function virtualization (NFV) infrastructure, this overhead is unacceptable.

Kernel bypass networking emerges as a critical solution to this challenge. By allowing user-space applications to directly access network interface card (NIC) hardware, it eliminates much of the kernel's involvement in the fast path of packet I/O. This paradigm shift dramatically reduces latency, increases throughput, and frees up CPU resources for application logic. On FreeBSD, netmap stands out as a robust and highly efficient framework for achieving kernel bypass. Developed at the University of Pisa, netmap provides a unified API for direct access to NIC packet rings, enabling zero-copy packet I/O and facilitating the creation of high-performance network applications. This article will explore the architecture, implementation, and practical applications of netmap on FreeBSD, including its powerful VALE software switch component.

Understanding Kernel Bypass and Netmap Fundamentals

The traditional network stack, while incredibly flexible and feature-rich, is not optimized for raw packet throughput. When a packet arrives at a NIC, it triggers an interrupt, causing the kernel to copy the packet data from the NIC's DMA buffer into an mbuf chain. This mbuf then traverses various layers of the kernel stack (e.g., Ethernet, IP, TCP/UDP), potentially undergoing checksumming, routing, and firewall processing, before finally being copied to a user-space buffer via a system call like recvmsg or read. For outgoing packets, a similar, reverse process occurs. Each copy operation, each context switch, and each layer of processing adds overhead.

Kernel bypass technologies aim to circumvent this overhead. Instead of the kernel handling every packet, the NIC's receive and transmit rings are mapped directly into the user-space application's memory. This allows the application to read incoming packets and write outgoing packets without any kernel intervention on the data path. The kernel's role is reduced to initial setup, resource allocation, and handling exceptional conditions or control plane operations.

netmap on FreeBSD achieves this through several key mechanisms:

  • Zero-Copy I/O: The most significant performance gain comes from eliminating data copies. netmap provides a shared memory region between the kernel and the user application, where packet buffers reside. When a packet arrives, the NIC places it directly into one of these buffers. The user application then accesses this buffer directly via a memory-mapped region, avoiding any memcpy operations.
  • Direct Ring Access: Applications gain direct access to the NIC's transmit and receive rings. These rings are essentially arrays of descriptors, each pointing to a packet buffer. The application manipulates these descriptors to indicate which buffers are available for reception or ready for transmission.
  • Batch Processing: Instead of processing one packet at a time, netmap encourages batch processing. Applications can process multiple packets from a ring in a single loop iteration, amortizing the cost of system calls and other overheads.
  • Unified API: netmap provides a consistent API across different NIC drivers, abstracting away hardware-specific details. This allows applications to be written once and run on various netmap-compatible NICs.

Compared to other kernel bypass solutions like DPDK (Data Plane Development Kit) primarily used on Linux, netmap offers a more lightweight and integrated approach within the operating system. While DPDK often involves custom drivers and a complete user-space network stack, netmap leverages existing kernel drivers and integrates seamlessly with the FreeBSD kernel, making it easier to deploy and manage in a FreeBSD environment. XDP (eXpress Data Path) on Linux also provides kernel bypass, but typically operates within the kernel context using eBPF programs, offering a different trade-off between flexibility and direct hardware control. netmap provides a direct user-space view of the hardware rings, which is its distinct advantage for certain applications.

Netmap Architecture and Operation

The netmap framework introduces a pseudo-device, /dev/netmap, which serves as the primary interface for user-space applications. Through this device, applications can open a netmap port on a physical NIC or a virtual VALE port, configure its parameters, and gain access to the shared memory regions containing packet buffers and ring descriptors.

Core Structures:

  • struct netmap_if: This structure represents a netmap instance associated with a specific network interface. It contains metadata about the interface, including the number of transmit (TX) and receive (RX) rings, the number of buffers per ring, and pointers to the shared memory regions.
  • struct netmap_ring: Each netmap_if contains an array of netmap_ring structures, one for each TX and RX queue. A netmap_ring holds the head, cur, and tail indices used to exchange slots between the application and the kernel, the total number of slots (num_slots), and the offset to the associated packet buffers.
  • struct netmap_slot: Each entry in a netmap_ring is a netmap_slot. It contains an index (buf_idx) pointing to a specific packet buffer in the shared memory region, the length of the packet (len), and various flags.
  • Packet Buffers: These are raw memory regions where packet data is stored. They are allocated by the kernel and memory-mapped into the user application's address space.
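
The following sketch shows how these structures are navigated with the helper macros from net/netmap_user.h (it assumes a struct netmap_if pointer nif obtained via NETMAP_IF(), as in the setup code later in this article):

// Sketch: walking from the interface structure down to the packet bytes
struct netmap_ring *ring = NETMAP_RXRING(nif, 0);   // first RX ring of this interface
uint32_t i = ring->cur;                             // next slot to inspect
struct netmap_slot *slot = &ring->slot[i];          // descriptor for that slot
char *payload = NETMAP_BUF(ring, slot->buf_idx);    // packet bytes, no copy involved
uint16_t pktlen = slot->len;                        // length of the frame in that buffer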

Operational Flow:

  1. Opening a netmap Port: An application initiates netmap operation by opening /dev/netmap and then issuing an ioctl(NIOCREGIF) call with a struct nmreq that names the target interface (e.g., em0, igb1). This call switches the NIC into netmap mode, detaching it from the kernel's normal network stack.
  2. Memory Mapping: After configuration, the application uses mmap() on the /dev/netmap file descriptor to map the shared memory region into its address space. This region contains the netmap_if structure, all netmap_ring structures, and the actual packet buffers.
  3. Packet Reception (RX):
    • The application polls the netmap file descriptor (e.g., using poll() or select()) to wait for incoming packets.
    • When packets arrive, the NIC places them into available buffers and the kernel advances the netmap_ring's tail pointer to publish the new slots.
    • The application walks the slots from cur (initially equal to head) up to, but not including, tail. For each netmap_slot, it uses buf_idx to access the packet data directly in the shared buffer.
    • After processing, the application advances cur and head past the consumed slots, returning those buffers to the kernel/NIC for reuse.
    • A subsequent poll() (or an explicit ioctl(NIOCRXSYNC) call) synchronizes the ring state with the kernel/NIC, making the released buffers available for new incoming packets.
  4. Packet Transmission (TX):
    • The application prepares an outgoing packet in an available netmap buffer (obtained from the TX ring).
    • It updates the netmap_slot for that buffer with the packet's length and any necessary flags.
    • The application then updates the TX netmap_ring's head pointer to indicate that the packet is ready for transmission.
    • An ioctl(NIOCTXSYNC) call (or a poll() with POLLOUT set) pushes the updated TX ring state to the kernel/NIC, which transmits the queued packets.
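
Put together, the receive path typically looks like the sketch below (fd and nif are assumed to come from the setup code shown in the next section; nm_ring_empty() and nm_ring_next() are helpers from net/netmap_user.h):

// Sketch: canonical batched RX loop
struct pollfd pfd = { .fd = fd, .events = POLLIN };
for (;;) {
    if (poll(&pfd, 1, -1) < 0)                       // block until packets are available
        break;
    for (unsigned int r = 0; r < nif->ni_rx_rings; r++) {
        struct netmap_ring *ring = NETMAP_RXRING(nif, r);
        while (!nm_ring_empty(ring)) {               // slots between cur and tail hold packets
            struct netmap_slot *slot = &ring->slot[ring->cur];
            char *buf = NETMAP_BUF(ring, slot->buf_idx);
            // ... inspect buf[0 .. slot->len) here ...
            ring->head = ring->cur = nm_ring_next(ring, ring->cur); // release the slot
        }
    }
}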

Netmap Modes:

netmap supports different modes of operation. By default, when an interface is put into netmap mode, it becomes fully dedicated to netmap and is no longer accessible by the kernel's normal network stack. However, netmap also exposes a pair of "host stack" rings through which the application can exchange packets with the kernel's stack, for example to pass through traffic it does not want to handle itself. The registration mode is selected via the NR_REG_* values in nr_flags (or the corresponding suffix in an nm_open() port name). For maximum performance, dedicated mode is typically preferred.

// Example: Basic netmap setup (simplified)
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <net/if.h>
#include <net/netmap.h>
#include <net/netmap_user.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    struct nmreq req;
    struct netmap_if *nif;
    char *mem;
    int fd;
    const char *ifname = "em0"; // Target interface

    if (argc > 1) {
        ifname = argv[1];
    }

    // 1. Open /dev/netmap
    fd = open("/dev/netmap", O_RDWR);
    if (fd < 0) {
        perror("open /dev/netmap");
        return 1;
    }

    // 2. Configure netmap for the interface
    memset(&req, 0, sizeof(req));
    strncpy(req.nr_name, ifname, sizeof(req.nr_name) - 1); // leave room for the terminating NUL
    req.nr_version = NETMAP_API; // Use current API version
    req.nr_flags = NR_REG_ALL_NIC; // Register all rings for the NIC

    if (ioctl(fd, NIOCREGIF, &req) < 0) {
        perror("ioctl NIOCREGIF");
        close(fd);
        return 1;
    }

    // 3. Memory map the netmap region
    mem = mmap(NULL, req.nr_memsize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return 1;
    }

    // Get netmap_if structure
    nif = NETMAP_IF(mem, req.nr_offset);

    printf("Netmap configured on %s with %d RX rings, %d TX rings.\n",
           nif->ni_name, nif->ni_rx_rings, nif->ni_tx_rings);

    // In a real application, you would now enter a loop to poll for packets,
    // process them, and transmit.
    // Example: poll(fd, ...) and then iterate through rings.

    // Cleanup (simplified)
    munmap(mem, req.nr_memsize);
    close(fd);
    return 0;
}

This simplified example demonstrates the initial steps. A full application would involve a poll() loop, iterating through netmap_rings, accessing netmap_slots, and manipulating packet buffers.
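
For comparison, the nm_open() helpers from net/netmap_user.h (enabled by defining NETMAP_WITH_LIBS before the include) wrap the open/NIOCREGIF/mmap sequence shown above. The sketch below assumes an interface named em0; the suffix on the port name selects the registration mode:

// Sketch: equivalent setup using the nm_open() helper API
#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <stdio.h>

int main(void) {
    // "netmap:em0" registers all NIC rings (dedicated mode);
    // "netmap:em0^" would register only the host-stack rings instead.
    struct nm_desc *d = nm_open("netmap:em0", NULL, 0, NULL);
    if (d == NULL) {
        perror("nm_open");
        return 1;
    }
    printf("%s: %u RX rings, %u TX rings\n",
           d->req.nr_name, d->req.nr_rx_rings, d->req.nr_tx_rings);
    nm_close(d);
    return 0;
}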

VALE Software Switches

Beyond direct NIC access, netmap introduces a powerful concept: the VALE software switch. VALE (Virtual Local Ethernet) is a high-performance software switch implemented inside the netmap kernel module and configured entirely from user space. It allows the creation of virtual Ethernet switches that forward packets between virtual ports and physical NICs at very high rates, without the traffic ever leaving the kernel.

Key Features of VALE:

  • Virtual Ports: VALE switches can have multiple virtual ports. These ports can be connected to:
    • Physical NICs (e.g., em0, igb1) in netmap mode.
    • Other virtual VALE ports.
    • User-space applications (via netmap API).
    • FreeBSD network stack (via netmap host stack mode).
  • Efficient In-Kernel Forwarding: Packets forwarded between ports of the same VALE switch never leave the kernel. Because each port normally has its own netmap memory region (so that clients cannot interfere with one another), forwarding typically involves a single optimized copy from the source buffer to the destination buffer; even so, VALE sustains forwarding rates of millions of packets per second, making it very efficient for inter-VM or inter-container communication.
  • Configurable: VALE switches and their ports are created and managed using nm_open() (or NIOCREGIF) with vale-prefixed port names, or from the command line with the valectl(8) utility (named vale-ctl in older releases).
  • Use Cases: VALE is ideal for:
    • NFV (Network Function Virtualization): Chaining virtual network functions (e.g., virtual firewalls, load balancers) together with minimal overhead.
    • SDN (Software-Defined Networking): Building high-performance data planes.
    • Container/VM Networking: Providing fast and isolated network connectivity to virtualized guests or containers.
    • Testing and Benchmarking: Creating controlled network environments for performance analysis.

Creating and Configuring VALE Switches:

VALE switches are named valeX, where X is an identifier (e.g., vale0, vale1). Ports on a VALE switch are named valeX:Y, where Y is a unique port identifier (e.g., vale0:p1, vale0:veth0).

Example: Setting up a VALE Switch and Connecting Ports

Let's say we want to create a VALE switch vale0 and connect a physical NIC em0 to it, along with a virtual port veth0 that a user-space application will use.

  1. Create the VALE switch vale0:

    This is implicitly created when the first port is attached to it.

  2. Connect em0 to vale0:
    # Put em0 into netmap mode and attach it to vale0
    # (valectl(8) is the VALE management utility; older releases ship it as vale-ctl)
    valectl -a vale0:em0

    This command tells netmap to take control of em0 and present it as a port named em0 on the vale0 switch.

  3. Create a virtual port veth0 on vale0 for an application:

    A user-space application registers vale0:veth0 through the netmap API, either with the nm_open() helper or with the raw ioctl interface shown here (the port is created automatically on first registration):

    // In your C application:
    const char *vale_port_name = "vale0:veth0";
    fd = open("/dev/netmap", O_RDWR);
    // ... error checking ...
    memset(&req, 0, sizeof(req));
    strncpy(req.nr_name, vale_port_name, sizeof(req.nr_name));
    req.nr_version = NETMAP_API;
    req.nr_flags = NR_REG_ALL_NIC; // Register all rings of the VALE port
    
    if (ioctl(fd, NIOCREGIF, &req) < 0) {
        perror("ioctl NIOCREGIF for VALE port");
        close(fd);
        return 1;
    }
    // ... mmap and packet processing ...

    Now, packets arriving on em0 and forwarded by vale0 can be received by the application on vale0:veth0, and vice versa, with all forwarding handled by the in-kernel VALE data path.

VALE Switch Configuration and Management:

The valectl(8) utility (named vale-ctl in older FreeBSD releases) is invaluable for managing VALE switches and ports from the command line:

  • valectl (no arguments): List the VALE switches and the ports currently attached to them.
  • valectl -d vale0:em0: Detach port em0 from vale0 and return em0 to normal kernel operation.
  • There is no explicit destroy step for a switch: vale0 disappears automatically once all of its ports have been detached.

By default, a VALE switch acts as a learning Ethernet bridge: it learns source MAC addresses from observed traffic and forwards frames based on their destination MAC address, flooding frames with unknown destinations to all other ports.

Practical Implementation and Examples

Developing netmap applications requires a good understanding of C programming, low-level networking, and careful resource management. Here, we'll outline the structure of a simple packet forwarder and discuss key considerations.

Simple Packet Forwarder (Bridge) Example:

A common netmap application is a simple bridge that forwards packets between two netmap ports (e.g., two physical NICs or two VALE ports).

#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/poll.h>
#include <net/if.h>
#include <net/netmap.h>
#include <net/netmap_user.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Helper function to open and mmap a netmap interface
static int nm_open_and_mmap(const char *ifname, struct nmreq *req, struct netmap_if **nif_ptr, char **mem_ptr) {
    int fd = open("/dev/netmap", O_RDWR);
    if (fd < 0) {
        perror("open /dev/netmap");
        return -1;
    }

    memset(req, 0, sizeof(*req));
    strncpy(req->nr_name, ifname, sizeof(req->nr_name) - 1); // leave room for the terminating NUL
    req->nr_version = NETMAP_API;
    req->nr_flags = NR_REG_ALL_NIC; // Or NR_REG_NIC_SW to also include the host-stack rings

    if (ioctl(fd, NIOCREGIF, req) < 0) {
        perror("ioctl NIOCREGIF");
        close(fd);
        return -1;
    }

    *mem_ptr = mmap(NULL, req->nr_memsize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (*mem_ptr == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return -1;
    }

    *nif_ptr = NETMAP_IF(*mem_ptr, req->nr_offset);
    return fd;
}

int main(int argc, char *argv[]) {
    struct nmreq req1, req2;
    struct netmap_if *nif1, *nif2;
    char *mem1, *mem2;
    int fd1, fd2;
    const char *ifname1 = "em0", *ifname2 = "em1"; // Two interfaces to bridge

    if (argc > 2) {
        ifname1 = argv[1];
        ifname2 = argv[2];
    } else {
        fprintf(stderr, "Usage: %s <ifname1> <ifname2>\n", argv[0]);
        return 1;
    }

    fd1 = nm_open_and_mmap(ifname1, &req1, &nif1, &mem1);
    if (fd1 < 0) return 1;
    fd2 = nm_open_and_mmap(ifname2, &req2, &nif2, &mem2);
    if (fd2 < 0) {
        munmap(mem1, req1.nr_memsize);
        close(fd1);
        return 1;
    }

    printf("Bridging %s <-> %s\n", nif1->ni_name, nif2->ni_name);

    struct pollfd fds[2];
    fds[0].fd = fd1;
    fds[0].events = POLLIN;
    fds[1].fd = fd2;
    fds[1].events = POLLIN;

    for (;;) {
        // Wait for packets on either interface
        if (poll(fds, 2, -1) < 0) { // -1 for infinite timeout
            perror("poll");
            break;
        }

        // Process packets from ifname1 to ifname2
        if (fds[0].revents & POLLIN) {
            for (unsigned int r = 0; r < nif1->ni_rx_rings; r++) {
                struct netmap_ring *rx_ring = NETMAP_RXRING(nif1, r);
                struct netmap_ring *tx_ring = NETMAP_TXRING(nif2, r); // Corresponding TX ring on ifname2 (assumes both NICs expose the same number of rings)

                while (!nm_ring_empty(rx_ring) && !nm_ring_full(tx_ring)) {
                    struct netmap_slot *rx_slot = &rx_ring->slot[rx_ring->cur];
                    struct netmap_slot *tx_slot = &tx_ring->slot[tx_ring->cur];

                    // Swap buffers (zero-copy forwarding)
                    uint32_t tmp_buf_idx = tx_slot->buf_idx;
                    tx_slot->buf_idx = rx_slot->buf_idx;
                    rx_slot->buf_idx = tmp_buf_idx;

                    tx_slot->len = rx_slot->len;
                    tx_slot->flags |= NS_BUF_CHANGED; // Buffer index changed on the TX slot...
                    rx_slot->flags |= NS_BUF_CHANGED; // ...and on the RX slot as well

                    rx_ring->cur = nm_ring_next(rx_ring, rx_ring->cur);
                    rx_ring->head = rx_ring->cur; // Advance head to release buffer

                    tx_ring->cur = nm_ring_next(tx_ring, tx_ring->cur);
                    tx_ring->head = tx_ring->cur; // Advance head to queue for transmit
                }
            }
            // Synchronize rings: push TX on ifname2, release RX buffers on ifname1
            ioctl(fd2, NIOCTXSYNC, NULL);
            ioctl(fd1, NIOCRXSYNC, NULL);
        }

        // Process packets from ifname2 to ifname1 (symmetric)
        if (fds[1].revents & POLLIN) {
            for (unsigned int r = 0; r < nif2->ni_rx_rings; r++) {
                struct netmap_ring *rx_ring = NETMAP_RXRING(nif2, r);
                struct netmap_ring *tx_ring = NETMAP_TXRING(nif1, r);

                while (!nm_ring_empty(rx_ring) && !nm_ring_full(tx_ring)) {
                    struct netmap_slot *rx_slot = &rx_ring->slot[rx_ring->cur];
                    struct netmap_slot *tx_slot = &tx_ring->slot[tx_ring->cur];

                    uint32_t tmp_buf_idx = tx_slot->buf_idx;
                    tx_slot->buf_idx = rx_slot->buf_idx;
                    rx_slot->buf_idx = tmp_buf_idx;

                    tx_slot->len = rx_slot->len;
                    tx_slot->flags |= NS_BUF_CHANGED;
                    rx_slot->flags |= NS_BUF_CHANGED;

                    rx_ring->cur = nm_ring_next(rx_ring, rx_ring->cur);
                    rx_ring->head = rx_ring->cur;

                    tx_ring->cur = nm_ring_next(tx_ring, tx_ring->cur);
                    tx_ring->head = tx_ring->cur;
                }
            }
            ioctl(fd1, NIOCTXSYNC, NULL);
            ioctl(fd2, NIOCRXSYNC, NULL);
        }
    }

    // Cleanup
    munmap(mem1, req1.nr_memsize);
    close(fd1);
    munmap(mem2, req2.nr_memsize);
    close(fd2);
    return 0;
}

This example shows the core logic of a zero-copy bridge. Instead of copying packet data, it swaps the buf_idx between the RX slot of one interface and the TX slot of the other, transferring ownership of the packet buffer without moving any data. Note that swapping buffer indices is only valid when both ports are backed by the same netmap memory region, which is normally the case for physical NICs opened with the default allocator.
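
When the two ports do not share a memory region (for example, a NIC bridged to a VALE port), buffer indices are not interchangeable and the swap above is not valid. A copy-based fallback along these lines can be used instead; the sketch reuses the rx_slot, tx_slot, rx_ring, and tx_ring variables from the loop above and the nm_pkt_copy() helper from net/netmap_user.h:

// Sketch: copy-based forwarding when buffer swapping between ports is not possible
char *src = NETMAP_BUF(rx_ring, rx_slot->buf_idx);
char *dst = NETMAP_BUF(tx_ring, tx_slot->buf_idx);
nm_pkt_copy(src, dst, rx_slot->len);   // optimized copy helper from netmap_user.h
tx_slot->len = rx_slot->len;           // no NS_BUF_CHANGED needed: buf_idx is unchanged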

Integrating with Packet Filters (BPF/PF):

While netmap bypasses the kernel's network stack, it's still possible to integrate with kernel-level packet filtering for certain use cases. For instance, you could use bpf (Berkeley Packet Filter) to capture packets before they enter netmap mode or after they leave it, for monitoring or specific filtering. However, for high-performance filtering on the fast path, the netmap application itself would implement the filtering logic in user space, leveraging its direct access to packet data. FreeBSD's pf (Packet Filter) operates within the kernel stack and would not directly interact with packets processed in netmap mode, unless netmap is configured in host-stack mode to pass certain packets to the kernel. For maximum performance, all filtering should be done in the netmap application.
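
As an illustration of filtering on the fast path, a per-packet check such as the following (a hypothetical helper, not part of netmap) could be applied to each frame before it is forwarded or dropped:

// Sketch: trivial user-space filter, called on the bytes returned by NETMAP_BUF()
#include <net/ethernet.h>   // struct ether_header, ETHERTYPE_IP
#include <arpa/inet.h>      // ntohs
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static bool pass_filter(const char *buf, uint16_t len) {
    struct ether_header eh;
    if (len < sizeof(eh))
        return false;                  // runt frame: drop
    memcpy(&eh, buf, sizeof(eh));      // copy the header to avoid unaligned access
    return ntohs(eh.ether_type) == ETHERTYPE_IP;   // forward only IPv4 frames
}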

Sysctl Configuration:

netmap behavior can be tuned via sysctl variables, which on FreeBSD live under the dev.netmap tree. Key variables include:

  • dev.netmap.buf_size: Size of each packet buffer (default is typically 2048 bytes).
  • dev.netmap.buf_num: Total number of packet buffers allocated for netmap.
  • dev.netmap.verbose: Enable verbose logging for debugging.

These can be adjusted to match specific hardware and application requirements. For example, increasing dev.netmap.buf_num may be necessary for applications that open many rings or must absorb high burst rates.

Performance Considerations and Best Practices

Achieving optimal performance with netmap requires careful attention to system configuration and application design.

  • CPU Affinity and NUMA Awareness:
    • CPU Affinity: Pin your netmap application to specific CPU cores using cpuset(1) or cpuset_setaffinity(2). This reduces context switching overhead and improves cache locality (a short sketch follows this list).
    • NUMA (Non-Uniform Memory Access): On multi-socket systems, ensure that the NIC, its DMA memory, and the CPU cores running your netmap application are all on the same NUMA node. Accessing memory across NUMA nodes incurs significant latency penalties. On FreeBSD, cpuset(1) can also control memory-domain placement, which helps keep the application, its memory, and the NIC on the same domain.
  • Batch Processing: Always process multiple packets in a single loop iteration. The overhead of poll() and ioctl(NIOCTXSYNC) is amortized over many packets, significantly boosting throughput. Aim to drain entire rings if possible.
  • Minimizing System Calls: The goal of kernel bypass is to minimize kernel interaction. Use poll() efficiently, and only call ioctl(NIOCTXSYNC) when necessary to synchronize ring states, typically after processing a batch of packets.
  • Hardware Offloads: While netmap bypasses much of the kernel stack, some NIC hardware offloads (e.g., TCP checksum offload, TSO/LRO) might still be beneficial or require careful consideration. Ensure your NIC drivers are netmap-compatible and that relevant offloads are configured appropriately. For raw packet processing, many offloads are disabled by netmap to ensure predictable behavior.
  • Memory Alignment: Ensure that your application's data structures and packet buffers are properly aligned to cache lines for optimal CPU performance. netmap buffers are typically aligned by the kernel, but any custom data structures should also follow this practice.
  • Error Handling and Monitoring: Implement robust error handling for netmap operations, and monitor ring occupancy, dropped packets, and CPU utilization to identify bottlenecks. Because the ring state (head, cur, tail) is directly visible in shared memory, per-ring counters are cheap to maintain in the application; ioctl(NIOCGINFO) and the dev.netmap sysctls additionally expose the global netmap configuration, which is useful for debugging and tuning.
  • Application Design:
    • Single-threaded per ring: For maximum performance, dedicate a single thread to each RX/TX ring pair of a NIC. This avoids locking overhead and maximizes cache utilization.
    • Lock-free data structures: If multiple threads need to share data, use lock-free algorithms or carefully designed mutexes to avoid contention.
    • Minimal processing: Keep the packet processing logic as lean and efficient as possible. Offload complex tasks to other threads or processes if they are not on the critical path.
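
For the CPU-affinity point above, pinning the current thread on FreeBSD can be done with cpuset_setaffinity(2), roughly as in the sketch below (the core number is an arbitrary example):

// Sketch: pin the calling thread to a single CPU core
#include <sys/param.h>
#include <sys/cpuset.h>

static int pin_to_core(int core) {
    cpuset_t mask;
    CPU_ZERO(&mask);
    CPU_SET(core, &mask);
    // CPU_WHICH_TID with id -1 targets the calling thread
    return cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
                              sizeof(mask), &mask);
}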

Use Cases for Netmap:

netmap is particularly well-suited for applications that require extreme packet I/O performance and low latency:

  • High-Performance Firewalls/Routers: Implementing custom packet filtering and forwarding logic directly in user space.
  • Load Balancers: Distributing incoming traffic across multiple backend servers at wire speed.
  • Intrusion Detection/Prevention Systems (IDS/IPS): Analyzing network traffic for malicious patterns without introducing significant latency.
  • Network Taps/Monitors: Capturing and analyzing full packet streams for diagnostics or security auditing.
  • Network Function Virtualization (NFV) Infrastructure: Building virtual network functions (VNFs) like virtual NATs, VPN gateways, or DPI engines.
  • Traffic Generators: Creating high-rate packet streams for network testing and benchmarking.

By carefully designing and optimizing netmap applications, developers can unlock the full potential of modern network hardware on FreeBSD, achieving throughput and latency figures that are simply not possible with the traditional kernel network stack.


โ“ Frequently Asked Questions

What is kernel bypass networking?
Kernel bypass networking is a technique that allows user-space applications to directly access network interface card (NIC) hardware, bypassing the operating system's kernel network stack. This significantly reduces latency and increases throughput by eliminating context switches, data copies, and much of the kernel's protocol processing overhead.
How does Netmap achieve high performance?
Netmap achieves high performance primarily through zero-copy I/O and direct ring access. It memory-maps NIC transmit and receive rings and their associated packet buffers directly into the user application's address space. This allows applications to read incoming packets and write outgoing packets without any data copying, and to process packets in batches, amortizing system call overhead.
What is a VALE switch?
VALE (Virtual Local Ethernet) is a high-performance software switch built on top of the Netmap framework. It allows the creation of virtual Ethernet switches within the kernel, enabling extremely fast in-kernel packet forwarding between virtual ports, physical NICs, and user-space applications. VALE is ideal for network function virtualization (NFV) and inter-VM/container communication.