Network stack cloning / virtualization extensions to the FreeBSD kernel
News
With generous support from the FreeBSD Foundation and NLnet Foundation,
a project aimed at virtualizing the network stack in FreeBSD -CURRENT was
started in 2006. As of September 2007, the code is already reasonably stable
for testing; you can check out the project's
web page for more
details.
Maintenance of network stack virtualization patches for FreeBSD 4.11
continues as a part of the University of Zagreb project
IMUNES, an Integrated Multiprotocol Network
Emulator / Simulator.
The latest patchset now also virtualizes the IPv6 portion of the stack.
If you encounter any bugs with IPv6 please don't hesitate to report them
to me!
Files
Here you can find patches against the FreeBSD 4.11-RELEASE kernel that
provide the functionality of maintaining multiple independent network stack
instances within a single operating system kernel. No userland modifications are
necessary, except for the management utility included below.
4.11-R-20050703.diff.gz
(gzipped kernel diffs, 330K)
vimage-20040209.tgz (userland management
utility, 5K)
Functionality
Within a patched kernel, every process, socket and network interface belongs to
a unique virtual image. Each virtual image provides entirely
independent:
- set of network interfaces and userland processes;
- interface addresses and routing tables;
- TCP, UDP, raw protocol control blocks (PCBs);
- network traffic counters / statistics;
- set of net.inet tunable sysctl variables (well, most of them actually);
- ipfw and dummynet instance;
- kernel message buffer instance;
- system load and CPU usage accounting;
- proportional share CPU scheduling
From the userland perspective, all the virtualization modifications within
the kernel have been designed to preserve complete API/ABI compatibility,
so all existing userland binaries should be able to run unmodified
on the virtualized kernel. Furthermore, as there are no address translation
hacks, library replacements/hooks etc., the overall performance penalty
of introducing the virtualization layer is mostly negligible.
Within the kernel, API compatibility is preserved at the device
driver layer; however, most modules will require recompilation, and some
of them source code modifications, because of API changes in the
higher-level networking routines and data structures.
Additional goodies contained in the above patch include:
- "ve" virtual ethernet clonable interfaces, which can be created
  on demand, assigned to a target virtual image, and then bridged either
  internally or externally through a real physical ethernet interface, to
  provide convenient access to the outside network from within the virtual
  images. This feature is mostly useful in virtual hosting applications.
- "vipa" virtual internal IP address interface - a loopback type interface,
  which enables transparent binding of all outgoing TCP/UDP sessions
  to the IP address configured on this internal interface. This can be very
  useful for enhancing the robustness of sessions originating from / connecting
  to a system with more than one physical network interface, in case of changes
  in availability of one of the real interfaces. The idea is borrowed from
  IBM's OS/390 V2R8 TCP/IP stack implementation.
- hiding of "foreign" filesystem mounts within chrooted virtual images
Installation
Patch the kernel sources:
# cd /usr/src
# mkdir sys/modules/if_ve
# gzcat 4.11-R-20050703.diff.gz | patch
Configure, build and install the kernel and the modules
in the usual way. LIMUNES and VMBSD are sample kernel config files
that compile successfully with the above patches applied. Do not bother
trying to include INET6 or IPSEC in the config file; they are currently
not supported, and neither are many other options.
Furthermore, you have to compile and install the vimage management
utility:
# tar -xzvf vimage-20040209.tgz
# cd vimage
# make && make install
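The steps above can be collected into one script. The following is only a minimal sketch, not a tested installer: it assumes the kernel diff listed under Files has been copied to /usr/src and the vimage tarball sits in the current directory. The PATCH and DRY_RUN variables and the run and install_vimage helpers are hypothetical names introduced here for illustration. By default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of the installation steps as one script.
# PATCH and DRY_RUN are hypothetical variables introduced for this example.
# DRY_RUN defaults to 1, so commands are printed, not executed.
PATCH="${PATCH:-4.11-R-20050703.diff.gz}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode; otherwise execute it in a
    # subshell and abort the whole script if it fails.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        sh -c "$*" || exit 1
    fi
}

install_vimage() {
    # Patch the kernel sources (assumes the diff is in /usr/src).
    run "cd /usr/src && mkdir -p sys/modules/if_ve"
    run "cd /usr/src && gzcat $PATCH | patch"
    # Build and install the userland management utility.
    run "tar -xzvf vimage-20040209.tgz"
    run "cd vimage && make && make install"
}

install_vimage
```

Running it with DRY_RUN=0 on a build host would execute the same commands; the kernel itself still has to be configured, built and installed in the usual way.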
Configuration examples
A manual page
is included with the vimage userland utility, which
explains the command syntax and options.
Below are three (mostly) self-explanatory setup scripts, which demonstrate
possible applications of the virtualized networking code:
bridging - setting up a virtual hosted
environment with two virtual nodes, visible on a single LAN segment.
routing - setting up a simulated network
topology with multiple virtual nodes interconnected through point-to-point
links, exchanging routing information via RIP/routed.
overlay - demonstrates how to set up an overlay
VPN network using clonable network stacks and IP tunnels.
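To give a rough idea of what the first steps of a bridging-style setup might look like, here is a dry-run sketch that only prints commands. It is not taken from the scripts above. The vimage flags used (-c to create an image, -i to assign an interface, and a bare image name to run a command inside it) and the "ifconfig ve0 create" cloning syntax are assumptions that should be checked against the vimage manual page. The image names v1 and v2, the 192.0.2.0/24 addresses, and the show and setup_node helpers are hypothetical. The actual bridging of the ve interfaces is left out.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# ASSUMPTIONS (not confirmed by this page): "vimage -c NAME" creates a
# virtual image, "vimage -i NAME IFC" moves an interface into it,
# "vimage NAME CMD..." runs a command inside it, and "ifconfig IFC create"
# clones a ve interface. Verify all of these in the vimage manual page.
show() { echo "+ $*"; }

setup_node() {
    # $1 = hypothetical image name, $2 = ve interface, $3 = IPv4 address
    name="$1"; ifc="$2"; addr="$3"
    show vimage -c "$name"
    show ifconfig "$ifc" create
    show vimage -i "$name" "$ifc"
    show vimage "$name" ifconfig "$ifc" "$addr" netmask 255.255.255.0
}

setup_node v1 ve0 192.0.2.1
setup_node v2 ve1 192.0.2.2
```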
To run the "routing" simulated network, you will need to install
the ng_dummy netgraph traffic
shaper. If you prefer to run the zebra routing daemon instead of
routed, you will probably have to modify the script to create the
virtual images in separate chrooted environments. See the jail(8)
manual page for further instructions on how to properly construct the
chrooted directory tree.
Known bugs / errata
- IPFilter compiles with tons of warnings, and is probably unusable;
- many, many more - please report bugs/problems to the author
To do
- document CPU usage accounting algorithms
- fix IPFW2 code
- remove many now obsolete variable and struct definitions from
  various header files
- virtualize other network protocols/domains (IPX, AppleTalk...)
- ...
Publications
Julian Elischer gave a talk on the subject at the
USENIX Annual Technical Conference (FreeNIX track)
in San Antonio, TX, June 2003. You can fetch the accompanying
paper
here.
Here you can also find the
slides
from the lecture I presented at
BSDCon Europe 2002 in Amsterdam.
You might also want to check a brief
FAQ
containing miscellaneous questions with hopefully useful answers on the
subject.