Gigabit TechBlast Newsletter, August-September 2003 Issue
Welcome to the August-September 2003 edition of the Gigabit TechBlast Newsletter from DSS Networks.
– The Gigabit Experts™
[The up-to-date resource for information on DSS Networks high-performance network solutions]
Comments, questions or suggestions? Please send feedback to firstname.lastname@example.org
If you wish to opt-in or out of future issues of this e-letter, please send us an email with 'subscribe' or 'unsubscribe' in the subject line to email@example.com
In this month's edition:
- Featured products
- Recent press
- New drivers and board support
- Latest performance data
- Gigabit-performance "Tech Tips" technical article
===== F E A T U R E D P R O D U C T S =====
1) The new 2-port Gigabit Ethernet PMC/PCI-X fiber controller, featuring 1000BASE-SX/LX over single-mode or multimode fiber in a PMC form factor, using a highly integrated, low-power Intel 82546 Gigabit PCI-X controller. Linux 2.4 and VxWorks 5.5 driver support.
More info >>>
2) The new 2-port Gigabit Ethernet PMC/PCI-X copper controller, featuring 1000BASE-T over CAT5 RJ-45 cabling in a PMC form factor, using a highly integrated, low-power Intel 82546 Gigabit PCI-X controller. Linux 2.4 and VxWorks 5.5 driver support.
More info >>>
===== R E C E N T P R E S S R E L E A S E S =====
1) DSS NETWORKS, THE GIGABIT EXPERTS™, TODAY ANNOUNCED A NEW 12-PORT PICMG 2.16 COMPLIANT GIGABIT ETHERNET BACKPLANE SWITCH BLADE FOR EMBEDDED & REAL-TIME TELECOM / DATACOM APPLICATIONS.
Irvine, CA - September 3, 2003 - DSS Networks announces another innovative board-level product - a high-performance, scalable 12-port CompactPCI PICMG 2.16 compliant 6U switch fabric blade - demonstrating commitment to quality board and system level products targeted at OEMs and integrators of embedded and real-time telecom/datacom systems.
More info >>>
2) DSS NETWORKS, THE GIGABIT EXPERTS™, EXPANDS SALES INTO EUROPEAN MARKETS.
Irvine, CA - June 18, 2003 - DSS Networks announced today that it has expanded its sales presence into Europe by entering into a sales representation agreement with ADCOMTEC. Founded in 1998 by Mr. Nick Maddicks, ADCOMTEC has been servicing the European market with its primary focus on industrial and embedded computing products. The company specializes in the creation, development and management of high-value, long-term OEM accounts throughout Europe. Its location near Munich in the heart of "Silicon Bavaria" makes ADCOMTEC optimally sited to service the European marketplace via the region's excellent road and rail links and direct access to air links to all of Europe.
More info >>>
===== N E W D R I V E R S A N D B O A R D S U P P O R T =====
** Driver support for Tornado 2.2 and VxWorks 5.5 for both PowerPC and Intel pcPentium4 BSPs is now available for all Intel-based cards.
** Driver support and performance benchmarks are now available for Linux 2.4 (2.4.18 - 2.4.20) for all Intel-based cards.
** New board support and benchmarks include Intel-based gigabit controllers on the SuperMicro P4DL6 with dual 2 GHz Xeon processors, 64-bit 133/100/66 MHz PCI-X slots, dual PCI buses and dual DDR SDRAM memory channels.
===== L A T E S T P E R F O R M A N C E D A T A =====
The following driver-level benchmark tests were performed on a single-processor 2 GHz Intel Xeon based system using our Intel-based models 5262-RJ and 6162 Gigabit Ethernet controllers. The main board was a SuperMicro model P4DL6 with the ServerWorks GC-LE chipset, running both Linux 2.4.20 and VxWorks 5.5 (pcPentium4 BSP). Test results for Linux and VxWorks are as follows:
Linux single-port test benchmarks:
- Over 164,000 1500-byte frames per second - 246 MB/sec (1.96 Gb/sec)
- Over 592,000 204-byte frames per second - 118.4 MB/sec (947.2 Mb/sec)
- Over 674,000 64-byte frames per second - 43.13 MB/sec (345.08 Mb/sec)
VxWorks single-port test benchmarks:
- Over 164,000 1500-byte frames per second - 246.2 MB/sec (1.97 Gb/sec)
- Over 726,000 204-byte frames per second - 148.32 MB/sec (1.19 Gb/sec)
- Over 756,000 64-byte frames per second - 48.4 MB/sec (387.2 Mb/sec)
For both Linux and VxWorks, a 2-port 1500-byte frame test resulted in aggregate system throughput of 492 MB/sec (3.93 Gb/sec) sustained.
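As a quick cross-check of how the figures above relate, the following is a minimal Python sketch that converts a frame rate and frame size into MB/sec and Mb/sec. The function name `throughput` is our own for illustration, and decimal units (1 MB = 1,000,000 bytes) are assumed. Because the frame rates are quoted as "over" a value, the results are lower bounds and can differ slightly from the rounded figures listed.

```python
def throughput(frames_per_sec: int, frame_bytes: int) -> tuple[float, float]:
    """Return (MB/sec, Mb/sec) for a frame rate, using decimal units."""
    bytes_per_sec = frames_per_sec * frame_bytes
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e6


# Frame rates and sizes taken from the benchmark lists above.
for fps, size in [(164_000, 1500), (674_000, 64), (756_000, 64)]:
    mb, mbit = throughput(fps, size)
    print(f"{fps:>7} frames/sec x {size:>4} bytes = {mb:.2f} MB/sec ({mbit:.2f} Mb/sec)")
```

The same function can be used to compare a measured frame rate on your own target system against these benchmarks.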
For additional information and benchmark data, please see our Gigabit Ethernet FAQS and Screenshot pages at the following links:
===== T E C H T I P S =====
This month's tech-tip addresses how to analyze and resolve system-level issues with PCI bus-master DMA when using Gigabit Ethernet controllers in multi-vendor integrated hardware platforms.
Introduction: Gigabit Ethernet controller cards are used by OEMs and systems integrators in almost every type of system imaginable. In many cases these systems are unlike a "PC", which has a very standard and specific processor and hardware platform. Instead, they typically use a wide variety of hardware components including processors, SDRAM, system controllers and other devices. Because the hardware varies so widely, the system initialization and configuration firmware, board support packages and HALs for the operating systems involved are as varied as the hardware they support. In most cases this does not present a problem because of adequate testing and quality measures taken by the board vendor. However, in some instances these combinations may introduce hardware and software incompatibilities that affect the operation and performance of a Gigabit Ethernet controller card. The most obvious and typical of these system-level issues are ones that affect the performance of data transfer across the PCI bus - commonly referred to as "bus-master DMA".
Background: Like many high-speed intelligent I/O controllers, virtually all PCI-based Gigabit Ethernet controllers employ PCI bus-master DMA to transfer network data (TCP/IP messages encapsulated into IP packets and Ethernet frames) between the host's memory (SDRAM or DDR SDRAM) and the physical wire. This is done to offload the host CPU from the data-intensive task of transferring Ethernet frames and is especially important in a high-speed technology like Gigabit Ethernet. The DMA operations are initiated by the Gigabit Ethernet card (the initiator), but the actual delivery mechanism involves a two-way handshake with a "target", which is usually a system controller. It is the target that is responsible for interacting with the SDRAM controller to fetch read data from SDRAM (tx direction) or to post write data (rx direction). It is during these PCI bus-master DMA operations that system-level issues may arise that prevent data transfer or limit the rate at which data can be transferred between the gigabit controller and SDRAM.
How PCI bus-master DMA issues can occur: Bus-master DMA issues that prevent or limit the transfer of Gigabit Ethernet data across the PCI bus to and from SDRAM can be classified as "hard" or "soft" errors. In this context, hard errors are those that prevent data transfer from taking place, while soft errors are ones that limit performance. Hard errors can be caused by any number of system integration problems, including invalid or improperly mapped SDRAM addresses, and are outside the scope of this article. Soft errors - problems characterized by poor performance - can be caused by either the Gigabit Ethernet controller or the system controller housing the PCI target interface. In our experience, in many cases they are caused by system-level issues with the configuration of the system controller and/or SDRAM interface. To understand why this is a likely cause, we should first provide a short description of how PCI-based Gigabit Ethernet cards transfer data.
Gigabit Ethernet card bus-master DMA: Gigabit Ethernet controllers (also known as MACs or MAC/PHY combos) house the hardware logic for the PCI or PCI-X interface. On the front end, they provide PCI bus-master transmit and receive DMA engines, and on the back end they provide an interface to a transceiver connected to the physical medium (CAT5 copper or fiber optic cable). In between, the Gigabit Ethernet controller contains a fixed amount of buffering or FIFO memory for very short-term storage of sequences of Ethernet frames. The intention of this onboard FIFO buffering is to absorb the latencies or short delays incurred while transferring frames across the PCI bus to and from main memory - it is not designed for large storage or exceedingly long latencies over the PCI bus. Because of gigabit speeds and limited FIFO buffering, the PCI bus must be fast enough in terms of bandwidth and latency to support Gigabit Ethernet. A 32-bit PCI bus operating at 33 MHz (about 133 MB/sec peak) is not fast enough to carry gigabit data in full duplex, as Gigabit Ethernet represents a potential of 250 MB/sec per port (1 Gb/sec in each direction). Without flow control the PCI bus would be overrun. However, a 64-bit and/or 66 MHz PCI bus does have enough speed to meet the bandwidth and latency requirements for full-duplex gigabit transfers.
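The bandwidth comparison above comes down to simple arithmetic: peak PCI bandwidth is the bus width in bytes multiplied by the clock rate. The sketch below is illustrative only; the function name `pci_peak_mb_per_sec` is our own, nominal clock values of 33 and 66 MHz are assumed, and the results are theoretical peaks that a real system will not sustain once arbitration, wait states and other bus traffic are counted.

```python
def pci_peak_mb_per_sec(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak PCI bandwidth: one bus-width transfer per clock."""
    return width_bits / 8 * clock_mhz


# Full-duplex Gigabit Ethernet: 1000 Mb/sec in each direction, in MB/sec.
GIGE_FULL_DUPLEX_MB_PER_SEC = 2 * 1000 / 8

for width, clock in [(32, 33), (64, 33), (32, 66), (64, 66)]:
    peak = pci_peak_mb_per_sec(width, clock)
    verdict = "above" if peak > GIGE_FULL_DUPLEX_MB_PER_SEC else "below"
    print(f"{width}-bit / {clock} MHz: {peak:.0f} MB/sec peak, "
          f"{verdict} the {GIGE_FULL_DUPLEX_MB_PER_SEC:.0f} MB/sec per-port requirement")
```

Note how little headroom the 64-bit/33 MHz and 32-bit/66 MHz cases leave over a single full-duplex port, which is one reason target-side delays matter so much.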
To explain further, Gigabit Ethernet controllers operate on linked lists of buffer descriptors located in SDRAM. These buffer descriptors are normally 16 or 32 bytes each and contain a pointer to a buffer from which to read data for a transmit or into which to write data for a receive. The gigabit controller transfers the data by acquiring the bus and performing a "burst mode" transfer of the packet data. The descriptors themselves are normally "cached" by the gigabit controller and also transferred in bursts. The gigabit controller is designed to transfer this data at wire speed and can operate on entire lists of buffer descriptors without CPU intervention. The driver is a "buffer manager" that formats and queues transmit descriptors and processes receive descriptors - the transfers themselves are purely bus-master DMA driven. DMA burst lengths depend on the length of the Ethernet frame and are typically anywhere from 64 to 1024 bytes (16 - 256 32-bit words on 32-bit PCI, 8 - 128 64-bit words on 64-bit PCI). It is during these burst bus-master DMA transfers that the flow of data can be compromised by the performance of the system controller or SDRAM interface. In other words, the initiator (and therefore DMA performance) depends on the target's ability to deliver: the gigabit controller is dependent on the ability of the system controller and SDRAM to perform.
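To make the descriptor mechanism more concrete, here is a minimal Python sketch of a transmit descriptor list and of a "controller" that walks it, one burst per frame. This is a simplified model and not the layout of any particular controller: the `TxDescriptor` fields, the buffer addresses and the function names are all hypothetical.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class TxDescriptor:
    buffer_addr: int    # hypothetical address of the frame buffer in host memory
    length: int         # frame length in bytes
    done: bool = False  # set by the simulated controller once the DMA completes


def burst_words(length_bytes: int, bus_width_bits: int) -> int:
    """Number of bus-width data phases needed to burst one frame."""
    return ceil(length_bytes / (bus_width_bits // 8))


def controller_process(ring: list, bus_width_bits: int) -> int:
    """Walk the descriptor list without 'CPU' help; return total data phases."""
    total = 0
    for desc in ring:
        if not desc.done:
            total += burst_words(desc.length, bus_width_bits)
            desc.done = True  # the driver would later reclaim this descriptor
    return total


# The "driver" queues three frames; the "controller" then bursts them out.
ring = [TxDescriptor(0x100000 + i * 2048, n) for i, n in enumerate([64, 512, 1024])]
print(controller_process(ring, 32), "data phases on a 32-bit bus")
```

The driver's only role in this model is to fill in and reclaim descriptors, which mirrors the "buffer manager" description above.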
Potential for DMA issues: Gigabit Ethernet controllers are very standard devices from companies including Intel, National Semiconductor and Broadcom. They are normally tested and certified both by the vendor and in laboratories such as the UNH IOL, and have been deployed in just about every type of system imaginable -- their use is very broad and dynamic. They are also fully contained in one piece of silicon from one vendor. On the other hand, a system board usually contains chips from multiple vendors, including the system controller, and is far less integrated. In addition, it has probably not been tested with a gigabit-speed bus-master controller like Gigabit Ethernet - certainly not at maximum wire speed. Because of this, and because the host's PCI and SDRAM interfaces are "centrally shared resources", it is much more likely and indeed common for a bus-master DMA issue to be caused by some type of problem in the system controller or SDRAM interface. This is not to say that the driver could not be responsible, only that in hardware terms the problem is more likely to lie somewhere in the target's central resource/path to SDRAM. To be certain, there are methods described below that can be used to isolate and find the root cause of the problem.
The performance problem itself may or may not be hardware related -- in many cases it is not a hardware problem but is due to how the operating system has configured the system controllers and SDRAM interfaces. The following are the typical causes of DMA performance limitations on the target (system controller) side of PCI transactions. These issues are soft failures because the system and the driver appear to be operating normally, just somewhat slowly. It would be reasonable to suspect that the driver is the bottleneck; whether it is can be determined using the methods described below.
Target initiated termination or delays: These problems are categorized as follows:
- Slow access to SDRAM during reads causing excessive wait states on the PCI bus (PCI 2.2, section 3.5)
- System controller/SDRAM unable to keep up with the aggregate throughput (target initiated retry and disconnect conditions, PCI 2.2 spec, section 3.3.3.2)
- Larger DMA transfers being broken down into smaller ones due to the reasons above.
Target initiated termination issues are described in section 3.3.3.2 of the PCI 2.2 specification. Target initiated "wait-states" are described in section 3.5, titled "Latency". They can have a major impact on bus-master DMA transfers and overall Gigabit Ethernet performance. Cache coherency issues can also impact performance, but in most cases cause "hard failures". Please also note that the conditions listed are not "hard failures" like Master or Target Aborts, and are not due to too many devices on the PCI bus or to the PCI bus not being fast enough - they can occur in optimum conditions due to a poorly configured but otherwise working system controller and/or SDRAM interface.
Target initiated termination including retries and disconnects can be caused by system controller/SDRAM configuration issues including the following:
- PCI interface has lower priority than other devices including the CPU and display/graphics controller.
- System is not configured to handle large bursts of DMA transfers properly, thus stopping transfers prematurely.
- System is not configured to "pre-fetch" data from SDRAM for burst reads of cache lines or multiples thereof.
- System controller is not using a "burst-mode" to transfer writes to SDRAM for receive data or is waiting for writes to complete.
Master initiated termination: Master initiated termination is characterized by either completion of the transaction or a timeout due to latency timer expiration, as explained in section 3.3.3.1 of the PCI 2.2 specification. There are clearly fewer reasons for early termination by a master (the gigabit controller), because for reads (frame transmit) it is prepared to receive all of the data, transferring the entire Ethernet frame to its FIFO, and for writes (frame receive) it is prepared to transfer an entire Ethernet frame to SDRAM. While it is possible for the driver to have improperly configured the FIFO thresholds in the gigabit controller and so cause such problems, these problems are obvious and show up as FIFO under-runs or overruns in the controller statistics. Latency timer issues are also not normally a cause, as the latency timer only comes into play when other masters are trying to acquire the bus. The latency timers are also configured by default to a value high enough to allow uninterrupted burst transfers large enough not to impact performance.
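The cost of target-initiated disconnects that break large DMA transfers into smaller ones can be estimated with a simple model. The Python sketch below is a rough approximation under stated assumptions, not a measurement: every burst is charged a fixed number of overhead clocks (address phase, turnaround, initial target latency and re-arbitration), and the value of 8 clocks used in the example is an assumed figure chosen for illustration. The function name and parameters are our own.

```python
from math import ceil


def effective_mb_per_sec(frame_bytes: int, burst_bytes: int, width_bits: int,
                         clock_mhz: float, overhead_clocks: int) -> float:
    """Estimate throughput when a frame is moved in bursts of burst_bytes.

    Each burst costs overhead_clocks of non-data bus time (an assumed value).
    """
    bytes_per_clock = width_bits // 8
    bursts = ceil(frame_bytes / burst_bytes)
    data_clocks = ceil(frame_bytes / bytes_per_clock)
    total_clocks = data_clocks + bursts * overhead_clocks
    return frame_bytes / total_clocks * clock_mhz


# A 1024-byte frame on a 64-bit / 66 MHz bus, assuming 8 overhead clocks per burst.
whole = effective_mb_per_sec(1024, 1024, 64, 66, 8)  # one uninterrupted burst
split = effective_mb_per_sec(1024, 32, 64, 66, 8)    # target disconnects every 32 bytes
print(f"single burst: {whole:.0f} MB/sec, 32-byte bursts: {split:.0f} MB/sec")
```

Under these assumptions the same bus drops from well above the 250 MB/sec needed for one full-duplex port to well below it, purely because of how the target handles bursts.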
How to analyze bus-master DMA issues: Analyzing DMA issues is not difficult once you have a basic understanding of how Gigabit Ethernet controllers perform burst-mode bus-master DMA. The most direct approach is to use a PCI bus analyzer. PCI bus analyzers can be rented or purchased, are not difficult to set up or use and are very effective -- they clearly show how bus-master DMA is being set up, how transfers are taking place and the exact causes of delays and early termination. For example, in the trace from a PCI bus analyzer, terminations including wait states and stop conditions are easily identified, as is the side (initiator or target) responsible. As with most problems, once the root cause is found the solution is attainable using the proper resources.
Using driver statistics to isolate the problem: The statistics instrumented into our VxWorks, Linux and Windows drivers also provide important clues as to the source of the problem. By disabling pause-frame flow control in the driver and by running our frame generator in loopback and end-to-end tests, it can be determined whether the problem is on the transmit side, the receive side or both. Disabling pause-frame flow control allows performance problems on the PCI bus to show up more readily in the driver statistics, because flow control can mask DMA performance problems. Driver-level statistics including transmit and receive frame counts, byte counts, transmit frames queued, transmit flow count and transmit empty count can be used to determine whether the driver or bus-master DMA is the bottleneck. These statistics are available from the driver.
Transmit DMA isolation: By running the loopback or transmit-only frame generator test, it can be determined whether the CPU and driver are keeping the transmit queues busy (not empty) at all times. If so, then performance is limited by the rate at which the system controller is able to deliver data to the gigabit controller across the PCI bus. Statistics indicators can be monitored as follows to determine where the bottleneck is occurring. For example:
- If the transmit empty count is low
- And the transmit flow on/off counts are high
- And the transmit throughput is lower than expected (lower or much lower than wire speed)
- And other transmit errors such as aborts and underruns are not occurring
Then it is a very significant clue that the CPU and driver are not the bottleneck and that the problem is likely a system-level problem with the ability of the system controller and/or SDRAM interface to fetch and deliver burst-mode PCI reads initiated by the gigabit controller. These statistics can be displayed from the driver using the VxWorks "nsShow" or Linux "dmUtil -s" commands. Please see the User's Manual for usage and a description of these statistics.
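The transmit-side checks above can be expressed as a small decision function. The following Python sketch is illustrative only: the `TxStats` field names and the thresholds are hypothetical, chosen for this example, and do not represent the actual output format of "nsShow" or "dmUtil -s". Suitable thresholds depend on the test duration and frame size.

```python
from dataclasses import dataclass


@dataclass
class TxStats:
    tx_empty_count: int     # times the transmit queue was found empty
    tx_flow_count: int      # transmit flow on/off events
    tx_errors: int          # aborts, underruns and similar transmit errors
    throughput_mbps: float  # measured transmit throughput in Mb/sec


def tx_dma_suspected(stats: TxStats, wire_speed_mbps: float = 1000.0,
                     low_empty: int = 10, high_flow: int = 1000,
                     slow_fraction: float = 0.9) -> bool:
    """True when all four transmit indicators point away from the CPU and driver."""
    return (stats.tx_empty_count <= low_empty                 # queues stayed busy
            and stats.tx_flow_count >= high_flow              # driver kept being throttled
            and stats.throughput_mbps < slow_fraction * wire_speed_mbps
            and stats.tx_errors == 0)                         # no aborts or underruns


sample = TxStats(tx_empty_count=2, tx_flow_count=50_000, tx_errors=0, throughput_mbps=520.0)
print("suspect system controller / SDRAM path:", tx_dma_suspected(sample))
```

A result of True is a clue rather than proof; a PCI bus analyzer trace remains the most direct confirmation.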
Receive DMA isolation: Isolation of problems on the receive side can similarly be performed using the frame generator from a remote end or a loopback test (the remote end provides better isolation of transmit from receive). Again, statistics indicators can be monitored as follows to determine where the bottleneck is occurring. For example:
- If receive over-runs are occurring frequently
- And the receive descriptor lists are not full or overflowing
- And other driver errors are not occurring such as master/target aborts
Then it is a very significant clue that the CPU and driver are not the bottleneck and that the problem is likely a system-level problem with the ability of the system controller and/or SDRAM interface to accept and complete burst-mode PCI writes initiated by the gigabit controller.
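The receive-side checks can be sketched the same way. As with any such sketch, the `RxStats` field names and the `frequent_overruns` threshold below are hypothetical and only illustrate the decision logic; they are not taken from the driver.

```python
from dataclasses import dataclass


@dataclass
class RxStats:
    rx_overruns: int          # controller FIFO over-runs on receive
    rx_ring_full_events: int  # times the receive descriptor list was full
    bus_abort_errors: int     # master/target aborts and other driver errors


def rx_dma_suspected(stats: RxStats, frequent_overruns: int = 100) -> bool:
    """True when over-runs occur even though the driver kept descriptors available."""
    return (stats.rx_overruns >= frequent_overruns
            and stats.rx_ring_full_events == 0  # driver is replenishing buffers in time
            and stats.bus_abort_errors == 0)    # no hard failures on the bus


sample = RxStats(rx_overruns=4200, rx_ring_full_events=0, bus_abort_errors=0)
print("suspect system controller / SDRAM path:", rx_dma_suspected(sample))
```

If the descriptor list is regularly full, the over-runs point back at the CPU and driver instead, since the controller had nowhere to write the frames.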
Note: As previously noted, pause-frame flow control must be disabled during configuration to prevent the gigabit controller from flow controlling frame transmits and receives, which might mask the errors that these tests are intended to expose.
Summary: Performance limitations due to bus-master DMA problems can occur in multi-vendor integrated systems and can be caused by a malfunctioning driver, system controller or Gigabit Ethernet controller. However, in cases where the driver is otherwise operating normally and not reporting master aborts, target aborts, overruns or under-runs, the cause is likely a system-level problem on the side of the system controller, because this central resource is the most probable area where stalls can occur. Cache coherency related issues usually cause hard failures but should not be ruled out until verified. A PCI bus analyzer used in combination with the driver statistics and frame generators is the recommended approach to identify and isolate the problem. In most cases, once the problem is known it can be resolved quickly and a vendor-supplied patch should correct it.
Gigabit Ethernet Checklist [system-level issues to consider when deploying Gigabit Ethernet in your system, to minimize the possibility of bus-master DMA issues]:
- CPU speed. Use the fastest CPU available, such as a 2.4 GHz Pentium 4 or Xeon. Dual processors may also help in applications where the processing can be distributed.
- Evaluate the system controllers providing the PCI interface. Use the latest full-featured high-end system controllers like the Intel 7505/7501 series or ServerWorks GC-HE / GC-LE.
- Review the system configuration of the system controllers in respect to the PCI interface, write/read modes, bursting, DMA priority and max DMA lengths and/or consult the board documentation and vendor.
- Use system controllers that support multiple 64-bit PCI-X slots (multiple PCI-X bus segments and channels)
- Use DDR SDRAM (fastest available) and system controllers that support multi-gigabit dual DDR-SDRAM access channels.
- Double check and compare the performance specifications and architecture for the system controllers, especially the PCI and SDRAM access capabilities.
- Make sure the Gigabit Ethernet controller is also full-featured and supports high-end features including buffer descriptor caching and bursting and advanced DMA and interrupt options.
- Make sure your Gigabit Ethernet vendor provides system level benchmarks for their Gigabit Ethernet controller on various systems and evaluate these benchmarks against your target system.
Contact the DSS team via email at firstname.lastname@example.org
Thanks for reading! We hope you found this information useful.
-- The team at DSS Networks - "The Gigabit Experts"
For a Customized application to your product: email your requirements to email@example.com
DSS Networks designs and manufactures its products in Lake Forest, CA under the highest quality standards. We provide embedded solutions based on high-performance next generation broadband networking technologies including Gigabit Ethernet, 10-Gigabit Ethernet and CWDM/DWDM. DSS markets its products to OEMs, VARs and Systems Integrators.
For Complete Information: visit our website at http://www.dssnetworks.com or call us at 949.727.2490 for additional information.
If you wish to be removed from this newsletter, please send an email to firstname.lastname@example.org