Gigabit TechBlast Newsletter, March-April 2003 Issue
Welcome to the March-April 2003 edition of the Gigabit TechBlast Newsletter from DSS Networks.
– The Gigabit Experts™
[The up-to-date resource for information on DSS Networks high-performance network solutions]
Comments, questions or suggestions? Please send feedback to firstname.lastname@example.org
If you wish to opt-in or out of future issues of this e-letter, please send us an email with 'subscribe' or 'unsubscribe' in the subject line to email@example.com
In this month's edition:
- Featured products
- Recent press
- New board support packages
- New performance benchmark data
- Gigabit-Performance networking "Tech Tips"
===== F E A T U R E D P R O D U C T S =====
1) The new single port Gigabit Ethernet 64-bit PCI Fiber controller. Featuring
1000BASE-SX/LX over single-mode or multimode fiber in a low-profile PCI card form factor
with new ultra low power full-featured transceiver. Linux 2.4 and VxWorks 5.5 driver support.
2) The new single port Gigabit Ethernet 64-bit PCI copper controller. Featuring
1000BASE-T over CAT5 RJ-45 cabling in a low-profile PCI card form factor with new
ultra low power full-featured transceiver. Linux 2.4 and VxWorks 5.5 driver support.
===== N E W B O A R D S U P P O R T P A C K A G E S =====
** New board support and benchmarks include SuperMicro P4DL6 with dual 2 GHz Xeon processors,
64-bit 133/100/66 PCI-X slots, DDR SDRAM, dual PCI bus and DDR SDRAM memory channels.
** Driver support and performance benchmarks now available for Linux 2.4.20 SMP
as tested on SuperMicro mainboard based system.
** Tornado 2.2 and VxWorks 5.5 support is now available for both PowerPC and Intel
(pcPentium4 BSP) targets.
===== N E W P E R F O R M A N C E D A T A =====
Benchmark tests were performed on a single processor 2 GHz Intel Xeon based system
using our Models 5161 and 6161 Gigabit Ethernet embedded PMC controllers. Main board
used was a SuperMicro model P4DL6 with ServerWorks GC-LE chipset running both Linux
2.4.18 and VxWorks 5.5 with BSP support for pcPentium4. Test results are as follows:
Single-port test benchmarks:
- Over 850,000 60-byte frames per second.
- Over 654,000 204-byte (170-byte payload) frames per second.
- Over 242 megabytes per second (1.93 Gb) using 1500-byte frames.
Dual-port test benchmarks (2-ports in same system):
- Over 1.2 million 60-byte frames per second.
- Over 1.02 million 204-byte (170-byte payload) frames per second.
- Over 484 megabytes per second (3.87 Gb) using 1500-byte frames.
For additional benchmark data, please see our Gigabit Ethernet FAQs page.
===== R E C E N T P R E S S =====
1) DSS Networks Announces Gigabit Products for High Throughput and Frame Rate
Applications Adding Support For Tornado2.2/VxWorks 5.5 For High-End Intel
Pentium4 and Xeon Processor Based Boards
Irvine, CA – Feb 19, 2003 -- DSS Networks announces one of the fastest network
benchmarks ever for supporting high-frame applications such as voice over IP, digital
broadcast video and other high-speed multimedia applications. To support these very
demanding high-performance applications, additional driver and board support for Tornado
2.2 and VxWorks 5.5 for Intel architecture and pcPentium4 BSP is now available.
2) DSS NETWORKS ANNOUNCES NEW SINGLE PORT GIGABIT ETHERNET PCI FIBER
CONTROLLER FOR REAL-TIME AND EMBEDDED SYSTEMS APPLICATIONS
Irvine, CA - Feb 17, 2003 -- DSS Networks announces support for a single port Gigabit
Ethernet controller card with multiple fiber optic and copper interface options in a low-profile
PCI form factor -- expanding existing family of Gigabit Ethernet products targeted for
Embedded Broadband Network Applications.
===== T E C H T I P S =====
This month's tech-tip addresses the issues regarding "high-frame-rate" Gigabit
Networking for applications requiring short frames like VoIP.
First we should define what "high-frame-rate" refers to. High-frame-rate applications are
those which require a specified performance level in frames per second, typically using
short Ethernet frames of fixed length. Some applications including voice over IP (VoIP)
have special requirements for packaging and transmitting payload data in short frames
and then transmitting them at high rates. So in addition to high frame rate, these
applications may still require a high overall throughput in megabytes-per-second. This
performance model is in contrast to the "typical" model for Ethernet networks where the
TCP/IP protocol stack is employed. In the conventional TCP/IP network, payloads are
normally large for information including file transfer, web page download and email and the
frames can be packed to the standard maximum size of around 1500 bytes. Using
conventional TCP/IP or UDP/IP "file" based transfers, wire-speed is much more easily attained.
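To put the short-frame numbers in context, the following minimal sketch computes the theoretical one-direction frame rate of a Gigabit Ethernet link for a given frame size. It assumes the frame sizes quoted in this newsletter exclude the 4-byte frame check sequence, and it adds the standard 8-byte preamble and 12-byte interframe gap. The function names are illustrative only and are not part of any DSS Networks software.

```python
# Per-frame bytes on the wire that are not counted in the quoted frame sizes.
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
FCS_BYTES = 4               # frame check sequence (assumed excluded from quoted sizes)
INTERFRAME_GAP_BYTES = 12   # minimum idle time between frames, expressed in bytes
LINE_RATE_BPS = 1_000_000_000


def wire_frame_rate(frame_bytes: int) -> float:
    """Maximum frames per second in one direction at Gigabit line rate."""
    bytes_on_wire = (frame_bytes + FCS_BYTES
                     + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES)
    return LINE_RATE_BPS / (bytes_on_wire * 8)


def megabytes_per_second(frame_bytes: int, frames_per_second: float) -> float:
    """Throughput in MB/s (10**6 bytes) for a given frame size and frame rate."""
    return frame_bytes * frames_per_second / 1_000_000


if __name__ == "__main__":
    for size in (60, 204, 1500):
        fps = wire_frame_rate(size)
        print(f"{size:>5}-byte frames: {fps:>12,.0f} frames/s, "
              f"{megabytes_per_second(size, fps):6.1f} MB/s")
```

The shorter the frame, the more frames per second the system must handle to keep the link full, which is why short-frame applications stress the bus and memory path far more than large TCP/IP transfers do.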
The issues that arise from the high frame rate model present special technical challenges
due to several system-level factors. The factors include the arbitration, transaction and
handshaking overhead of both the PCI bus and of the SDRAM controllers along the data
path as well as other factors including the CPU, graphics engine and cache coherency. All
of these factors impact the maximum performance that a gigabit controller can attain, and
special attention and evaluation must be given to the system controllers involved in the
data path between the Gigabit Ethernet controller and SDRAM. This would include the CPU,
the SDRAM controller and the PCI interface controller, which on standard PCs are
housed in the North and South bridges. The CPU is involved only to the extent that the
driver performs "buffer management", posting transmit descriptors and processing receive
descriptors. The CPU is not involved in the data transfer itself, as this is directly
controlled by bus-master DMA from the Gigabit controller.
System level performance issues for Gigabit Ethernet can be best analyzed using loopback
tests, end-to-end saturation tests and frame-generators that will stress the system to the
maximum allowable performance. For example, you may want to test end-to-end between
two systems running short frames of a given size to test and verify at the maximum speed
possible. Our driver supports several types of loopbacks, end-to-end tests and includes a
frame generator module. It also provides statistics maintained by the driver that can be
displayed on the console including both frames and bytes per second throughput. This
allows accurate measurements to be taken while monitoring the health and performance
of the driver. In Linux, the "top" command can also be run simultaneously to monitor the CPU usage.
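As a rough illustration of how frames-per-second and bytes-per-second figures can be derived from running counters, the sketch below computes both rates from two counter snapshots. The CounterSample type and its fields are hypothetical; they do not describe the actual statistics interface of the DSS driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Hypothetical snapshot of cumulative driver counters at one instant."""
    timestamp_s: float
    frame_count: int
    byte_count: int


def throughput(earlier: CounterSample, later: CounterSample) -> tuple[float, float]:
    """Return (frames per second, megabytes per second) between two snapshots."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be taken in increasing time order")
    frames_per_second = (later.frame_count - earlier.frame_count) / elapsed
    mb_per_second = (later.byte_count - earlier.byte_count) / elapsed / 1_000_000
    return frames_per_second, mb_per_second


if __name__ == "__main__":
    start = CounterSample(timestamp_s=0.0, frame_count=0, byte_count=0)
    end = CounterSample(timestamp_s=2.0, frame_count=864_018, byte_count=176_259_672)
    fps, mbps = throughput(start, end)
    print(f"{fps:,.0f} frames/s, {mbps:.1f} MB/s")
```

Sampling over an interval of a second or more smooths out burstiness and keeps the measurement overhead small relative to the traffic being measured.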
The remainder of this tech-tip will focus on a high-frame rate benchmark application using a
frame size of 204 bytes. This is approximately a 170-byte payload plus the overhead of a
34-byte IP and Ethernet frame header. As measured on a 2 GHz Intel Xeon based system
(SuperMicro P4DL6) using a high-end ServerWorks chipset and our Model 6161 Gigabit
Ethernet controller in a single 64-bit PCI-X slot, we can attain the following benchmark for
frame-rate performance when running a bi-directional end-to-end transfer test:
- 654,000 204-byte (170-byte payload) frames per second / 133.41 MB (1.07 Gb)
However, when we run a "transmit only" frame generator test, throughput is lower,
although 88 MB using 204-byte frames is still fairly substantial:
- 432,009 204-byte (170-byte payload) frames per second / 88.1 MB (704.8 Mb)
Since the CPU is not fully utilized and the PCI bus is not running at full capacity, you might
expect to see similar aggregate results for transmit only vs. transmit plus receive.
However, this is not always the case as the dynamics for transmitting from SDRAM out the
Gigabit Ethernet controller (PCI bus-master reads) are not as fast as receiving frames and
writing to SDRAM (posted writes). These dynamics can be summarized as follows. The
transmit descriptor lists have transmit buffers queued at all times, which means either the
SDRAM access, PCI or gigabit controller (or combination thereof) is the limiting factor in
short-frame transmits. PCI (and SDRAM) bus-master reads (memory reads from gigabit
controller during transmit) are always somewhat slower than memory writes (frame
receives) due to turnaround time, wait states (SDRAM fetching) and cache coherency
enforcements incurred on SDRAM reads across the PCI bus. You would need to use a PCI
bus analyzer to determine for certain on your particular system, but the following
calculations on the PCI bus transactions seem to confirm these limitations during frame
transmit. If we calculate the setup, transfer and wait-state time on SDRAM memory reads,
we can calculate the PCI bus usage as in the following example for our 204 byte frame
size. Similar calculations can be done for both shorter (60 byte) and longer (512 byte)
frames. The actual overhead on the PCI bus may be slightly higher as this calculation only
considers the gigabit controller's bus-master DMA SDRAM accesses and not driver accesses
to the controller's registers, but the driver accesses are mostly insignificant in comparison.
The results seem to indicate that the degradation in frame rate performance with short
frames is caused by the high ratio of buffer descriptor accesses to frame data, together
with bus transaction overhead, wait-state delays and cache coherency enforcements.
The shorter the frame, the more significant the overhead and the greater the degradation.
Example of performing PCI bus frame rate calculations (aimed at average and not worst case) for 204-byte frame size:
204 / 8 = 25.5, rounded up to 26 (number of 64-bit transfers)
204 / 32 = 6.375, rounded up to 7 cache lines read per frame
For each burst DMA: Assume 6 clocks setup, hold and turnaround time on bus (minimum transaction time on PCI bus).
Add additional 8-16 clocks (wait states) per cache line to fetch data from SDRAM (will
assume average of 8) as some prefetching may occur.
For each transmit packet the controller must DMA one 32-byte transmit descriptor (read),
transfer the packet then DMA update (write) same transmit descriptor back to SDRAM. For
descriptor read + write:
6 (setup) + 4 (transfer) + 16 (wait states) = 26 times 2 = 52 (read + write).
For each 204-byte frame DMA: 6 (setup) + 26 (transfer) + 56 (additional wait states for 7 cache lines) = 88 PCI clocks for frame transfer.
88 (frame) + 52 (descriptor) = 140 clocks.
66.6M / 140 = 475,714 FPS / 97.04 MB (maximum 204-byte frames-per-second obtainable over 64/66 PCI during transmits using calculations)
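The arithmetic above can be reproduced with a short script. This is a minimal sketch of the same average-case estimate, using the clock counts stated in the example (6 setup clocks, 8 wait states per cache line, 26 clocks per descriptor access). The function and constant names are illustrative, and real systems should be checked with a PCI bus analyzer.

```python
import math

PCI_CLOCK_HZ = 66_600_000          # 64-bit / 66 MHz PCI, as in the example
BUS_WIDTH_BYTES = 8                # one 64-bit transfer per clock
CACHE_LINE_BYTES = 32
SETUP_CLOCKS = 6                   # setup, hold and turnaround per burst
WAIT_CLOCKS_PER_CACHE_LINE = 8     # assumed average SDRAM fetch delay
DESCRIPTOR_CLOCKS = 2 * (6 + 4 + 16)   # descriptor read plus write-back = 52


def frame_clocks(frame_bytes: int) -> int:
    """PCI clocks to DMA one frame from SDRAM to the controller."""
    transfers = math.ceil(frame_bytes / BUS_WIDTH_BYTES)
    cache_lines = math.ceil(frame_bytes / CACHE_LINE_BYTES)
    return SETUP_CLOCKS + transfers + cache_lines * WAIT_CLOCKS_PER_CACHE_LINE


def max_transmit_fps(frame_bytes: int) -> float:
    """Estimated maximum transmit frame rate allowed by the PCI bus."""
    return PCI_CLOCK_HZ / (frame_clocks(frame_bytes) + DESCRIPTOR_CLOCKS)


if __name__ == "__main__":
    size = 204
    fps = max_transmit_fps(size)
    print(f"frame clocks: {frame_clocks(size)}")
    print(f"max transmit rate: {fps:,.0f} frames/s, "
          f"{fps * size / 1_000_000:.2f} MB/s")
```

Changing WAIT_CLOCKS_PER_CACHE_LINE toward the upper end of the 8-16 range quoted above shows how sensitive the estimate is to SDRAM fetch latency.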
Summary: The calculated PCI and SDRAM access overhead for transmitting these short
frames is substantial and a probable cause of the transmit performance degradation. The
problem degrades the performance further when using 60-byte frames and diminishes
when using frames of 500 bytes or longer. For example, it can be seen by using similar calculations for
500-byte frames and also running a benchmark, that this bus and SDRAM access
transaction overhead decreases significantly to the point that the benchmark can be run at
wire-speed. Using a combination of high-performance frame tests, calculations and a PCI
bus analyzer can help analyze and address these system level issues so that the best
system can be obtained and tuned for a high-frame-rate application.
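The trend described in this summary can be sketched by applying the same average-case PCI estimate to several frame sizes and comparing it with the one-direction line rate of the link. The sketch assumes the quoted frame sizes exclude the 4-byte FCS and that each frame also occupies 8 bytes of preamble and 12 bytes of interframe gap on the wire. All names are illustrative, and the model is a simplification rather than a measurement.

```python
import math

PCI_CLOCK_HZ = 66_600_000
LINE_RATE_BPS = 1_000_000_000
WIRE_OVERHEAD_BYTES = 4 + 8 + 12   # FCS + preamble/SFD + interframe gap (assumed)


def bus_limited_fps(frame_bytes: int) -> float:
    """Transmit frame rate permitted by the PCI bus under the example's model."""
    transfers = math.ceil(frame_bytes / 8)        # 64-bit transfers
    cache_lines = math.ceil(frame_bytes / 32)     # 32-byte cache lines
    frame_clocks = 6 + transfers + 8 * cache_lines
    descriptor_clocks = 2 * (6 + 4 + 16)          # descriptor read + write-back
    return PCI_CLOCK_HZ / (frame_clocks + descriptor_clocks)


def wire_limited_fps(frame_bytes: int) -> float:
    """Frame rate permitted by the Gigabit link in one direction."""
    return LINE_RATE_BPS / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def limiting_factor(frame_bytes: int) -> str:
    """Return 'bus' if the PCI estimate is below line rate, else 'wire'."""
    if bus_limited_fps(frame_bytes) < wire_limited_fps(frame_bytes):
        return "bus"
    return "wire"


if __name__ == "__main__":
    for size in (60, 204, 500):
        print(f"{size:>4} bytes: bus {bus_limited_fps(size):>10,.0f} fps, "
              f"wire {wire_limited_fps(size):>10,.0f} fps, "
              f"limited by {limiting_factor(size)}")
```

Under these assumptions the bus estimate falls below line rate for 60-byte and 204-byte frames but exceeds it for 500-byte frames, which is consistent with the observation that longer frames can run at wire-speed.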
Other system level issues to consider and checklist items for deploying Gigabit
Ethernet in your system for high frame rate and high throughput applications:
- CPU speed. Use the fastest CPU available, such as a 2.4 GHz Pentium 4 or Xeon. Dual-
processors may also help in certain applications where the application processing
can be distributed.
- Evaluate the "system controllers". Use high-end system controllers like the Intel
7505/7501 series or Serverworks GC-HE / GC-LE.
- Use system controllers that support multiple 64-bit PCI-X slots (multiple PCI-X bus
segments and channels)
- Use DDR SDRAM (fastest available) and system controllers that support multi-
gigabit dual DDR-SDRAM access channels.
- Double check and compare the performance specifications and architecture for the
system controllers, especially the PCI and SDRAM access capabilities.
- Make sure Gigabit Ethernet controller is also full-featured and supports high-end
features including buffer descriptor caching and bursting and advanced DMA and
- Consider using a Gigabit Ethernet controller that allows buffer descriptors to be
located in the Gigabit Ethernet controller's local memory as opposed to system
SDRAM. This may offer performance increases for some applications, however there
are obvious tradeoffs.
- Make sure your Gigabit Ethernet vendor provides system level benchmarks for their
Gigabit Ethernet controller on various systems and evaluate these benchmarks
against your proposed target system.
- Investigate other alternatives to TCP/IP such as UDP/IP, "direct IP" or other high-
performance protocol implementations. You may need a custom high-frame rate
kernel module used in conjunction with our Gigabit Ethernet driver.
Contact the DSS team via email at firstname.lastname@example.org
Thanks for reading! We hope you found this information useful.
-- The team at DSS Networks - "The Gigabit Experts"
For a Customized application to your product: email your requirements to email@example.com
DSS Networks designs and manufactures its products in Lake Forest, CA under the highest quality standards. We provide embedded solutions based on high-performance next generation broadband networking technologies including Gigabit Ethernet, 10-Gigabit Ethernet and CWDM/DWDM. DSS markets its products to OEMs, VARs and Systems Integrators.
For Complete Information: visit our website at http://www.dssnetworks.com or call us at 949.727.2490 for additional information.