As processor speeds have increased, the time spent processing communication protocols has become a less significant factor in network throughput. It has been shown that for higher-speed networks such as FDDI, most of the time is spent in the kernel performing mbuf management, data movement, and checksum computation. Kay and Pasquale demonstrated that UDP performs only ten to fifteen percent better than TCP on a DECstation 5000 [7]. The authors' own experiments using netperf confirmed this result. Because UDP involves far less protocol processing than TCP, so small a gap suggests that the fundamental problem lies not in the protocols themselves but in the way data is exchanged between the user, the kernel, and the communication device. This is not surprising, since most operating systems' networking code uses the 4.3 BSD release of Unix as a reference point.
As communication bandwidth increases, the industry is slowly starting to recognize this shortcoming and is rectifying it through various optimizations [1]. Until these changes are made public and applied across a wide variety of machines, the user has only one way to guarantee efficient use of the machine's resources: to use specialized APIs that allow a non-root user to remove much of the kernel from the processing loop by controlling the network device directly.