Use the Bird’s Eye View to Debug PCIe Flow Control
One of the most difficult debugging challenges in PCIe involves flow control. In stark contrast to PCI, in which flow control was handled through sideband signals, PCIe flow control is an in-band, point-to-point mechanism using both DLLPs (Update FC packets) and TLPs to update flow control state between the ends of a single link (not the ends of an entire set of links). Through a scheme of debits and credits, the components at each end of a link determine whether to continue transmitting. In addition, ordering rules permit the “passing” of high priority traffic past lower priority traffic.
The PCIe specification does not require any specific design for a flow control scheme; rather, it specifies two quantities the transmitter gating function must track, Credits Consumed and Credit Limit, as well as two quantities the receiver must track, Credits Allocated and Credits Received. Using these four variables, designers are free to choose among a variety of possible algorithms to tune their flow control gating functions. See Figure 1 for an idea of the complexity of tracking these values.
When the link is not being managed correctly due to a flaw in the flow control design or a bug in its implementation, the effects can vary: dropped packets may simply reduce overall quality of service (QoS) or they may result in link degradation and deadlock. In the worst cases, dropped packets result in silent data corruption – for example, a dropped Posted Write could mean bad data is written (or good information is not written) to the disk.
Because the flow control mechanism relies on both DLLPs and TLPs, the bug hunter must account for credits and debits across these two layers of the protocol. Complicating matters further, the specification divides flow control packets into three categories: P, NP and CPL types, corresponding to Posted, Non-Posted and ComPLetion type requests. Posted Requests include Messages and Memory Writes; Non-Posted Requests include all Reads, I/O Writes, Configuration Writes and AtomicOps; Completions are associated with their corresponding Non-Posted Requests.
Within each category, flow control information is further sub-divided into HDR (header) and DATA types. Flow control is inherently a bi-directional operation: Accounting must occur on both sides of the link. If that weren’t complicated enough, the specification allows for separate flow control operations on each of the eight Virtual Channels (VC).
Even limiting the situation to VC0 (the only required channel in which flow control must be implemented), manually tracking just the 12 possible subtypes is not straightforward. Flow control credits are visible through the HdrFC and DataFC fields (in DLLPs), but debits are calculated for each type of TLP that participates. The problem of accounting for TLP debits is discussed a little further on.
The values transmitted in flow control DLLPs do not contain the current available receiver buffering, but rather contain the cumulative total credits that the receiver has granted to the transmitter since link initialization (using modulo 2^[field size] arithmetic). An initial credit grant is communicated using InitFC DLLPs, and subsequent grants are communicated using UpdateFC DLLPs.
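Because the counters are cumulative and wrap, the transmitter's gating check must be done in modulo arithmetic rather than by direct comparison. The sketch below (not from the article) illustrates the idea using the standard 8-bit header and 12-bit data FC field widths; the function name is ours.

```python
# Sketch of the modulo-arithmetic gating check implied above: counters
# are cumulative and wrap at 2**field_size, so the difference is taken
# modulo the field size and compared against half the range.

HDR_FIELD_BITS = 8    # HdrFC field width in the DLLP
DATA_FIELD_BITS = 12  # DataFC field width in the DLLP

def may_transmit(credit_limit, credits_consumed, credits_needed, field_bits):
    """Return True if the pending TLP fits within the advertised limit.

    All three counters are cumulative (mod 2**field_bits), so a wrapped
    credit_limit can legitimately be a smaller number than consumed.
    """
    modulus = 1 << field_bits
    diff = (credit_limit - (credits_consumed + credits_needed)) % modulus
    return diff <= modulus // 2

# Example: the limit has wrapped past zero but is still ahead of consumed.
print(may_transmit(credit_limit=3, credits_consumed=250, credits_needed=2,
                   field_bits=HDR_FIELD_BITS))  # True: (3 - 252) mod 256 = 7
```

This is why a snapshot of one UpdateFC value in isolation tells the bug hunter very little: only the modulo difference between the two cumulative counters is meaningful.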
Tracking TLPs is even more troublesome because the values are not transmitted with a packet; they are calculated based on the type of TLP, the amount of data it carries, the Traffic Class (TC) assigned to the TLP, and so forth. For example, per the PCIe specification: I/O and Configuration Write Requests consume 1 NPH (Non-Posted Request Header) and 1 NPD (Non-Posted Data payload) credit each.
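The per-TLP debit rules can be tabulated. The partial sketch below covers a few common request types, using the spec's convention that one data credit covers 16 bytes (4 DW) of payload; the function and type names are ours, not the spec's.

```python
# A partial sketch of per-TLP credit debits. Only a few common request
# types are shown; one data credit = 4 DW (16 bytes) of payload.
from math import ceil

DATA_CREDIT_BYTES = 16  # one data credit covers 4 DW of payload

def tlp_debit(tlp_type, payload_bytes=0):
    """Return (pool, hdr_credits, data_credits) consumed by one TLP."""
    data = ceil(payload_bytes / DATA_CREDIT_BYTES)
    if tlp_type == "MWr":                     # Memory Write: posted
        return ("P", 1, data)
    if tlp_type in ("IOWr", "CfgWr"):         # per the spec: 1 NPH + 1 NPD
        return ("NP", 1, 1)
    if tlp_type in ("MRd", "IORd", "CfgRd"):  # reads carry no payload
        return ("NP", 1, 0)
    if tlp_type == "CplD":                    # Completion with data
        return ("CPL", 1, data)
    raise ValueError(f"unhandled TLP type: {tlp_type}")

print(tlp_debit("MWr", payload_bytes=64))   # ('P', 1, 4)
print(tlp_debit("CfgWr", payload_bytes=4))  # ('NP', 1, 1)
```

Even this abbreviated table hints at the bookkeeping burden: every TLP on the link must be classified and costed before the accounts can be debited.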
To keep track of all of these flow control values, the bug hunter must do the following:
1) Capture the InitFCs as the link is being established.
2) Track all UpdateFCs as they are issued.
3) If multiple VCs are in use, track the Traffic Class (TC) to Virtual Channel (VC) mapping.
4) Track all TLPs that participate in flow control, calculate the appropriate credits and debit the accounts properly.
5) Track each buffer’s (PH, PD, NPH, NPD, CPLH, CPLD) current credit capacity at any given time in the acquisition.
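The five steps above amount to a running ledger. A minimal sketch for a single VC and one direction of the link might look as follows; the class and method names are ours, and a real tool would track both directions and all VCs.

```python
# A minimal sketch of the bookkeeping described in steps 1-5, for one VC
# and one direction of the link. Names are illustrative, not normative.

BUFFERS = ("PH", "PD", "NPH", "NPD", "CPLH", "CPLD")

class FlowControlLedger:
    def __init__(self):
        self.credit_limit = {b: 0 for b in BUFFERS}      # from InitFC/UpdateFC
        self.credits_consumed = {b: 0 for b in BUFFERS}  # from TLPs

    def init_fc(self, buffer, credits):
        """Step 1: record the initial grant from an InitFC DLLP."""
        self.credit_limit[buffer] = credits

    def update_fc(self, buffer, cumulative_credits):
        """Step 2: an UpdateFC carries the cumulative total, not a delta."""
        self.credit_limit[buffer] = cumulative_credits

    def tlp(self, hdr_buffer, data_buffer, data_credits):
        """Step 4: debit one header credit plus the payload's data credits."""
        self.credits_consumed[hdr_buffer] += 1
        self.credits_consumed[data_buffer] += data_credits

    def available(self, buffer):
        """Step 5: credits still available in a given buffer."""
        return self.credit_limit[buffer] - self.credits_consumed[buffer]

ledger = FlowControlLedger()
ledger.init_fc("PH", 32)
ledger.init_fc("PD", 256)
ledger.tlp("PH", "PD", data_credits=4)  # e.g. a 64-byte Memory Write
print(ledger.available("PH"), ledger.available("PD"))  # 31 252
```

Multiply this by two directions, up to eight VCs, and millions of packets, and the scale of the manual accounting problem becomes clear.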
For anything but the most trivial acquisitions, tracking and calculating all of these values by hand is time consuming, prone to error and expensive. Devising some method of calculating these values automatically would surely improve the accuracy of identifying flow control values. One can imagine, for example, capturing the flow control values, exporting them to a spreadsheet and running a macro to calculate the current value of the buffers as each DLLP and TLP are transmitted. While automatically calculating the value would help, simply displaying these values within the stream of packets doesn’t isolate where credit violations may have occurred. Displaying and calculating the proper values is only the first step in the bug hunter’s effort to isolate flow control problems.
In addition to acquiring the InitFCs, all UpdateFCs and all pertinent TLPs, the time-window of a PCIe stream capture must be long enough to capture the point of interest where flow control issues appear. For example, it is possible the problem occurs well after the initialization phase, perhaps several seconds or minutes into the circuit’s operation. Even the 16 GB of acquisition memory available in some instruments on the market today (using aggressive hardware filtering to only capture Flow Control DLLPs and TLPs), may not be sufficient to capture the necessary time window.
Creating a Bug Trap, Relatively Speaking
The process of capturing every credit and debit starting from InitFCs is often referred to as “absolute flow control accounting.” In this type of capture, the initial credit grant is known exactly and every change, whether from a TLP or UpdateFC DLLP, is tracked and accounted for. Although this form of flow control accounting is accurate it has two limitations:
* It depends on capturing data during the initialization phase.
* It is limited to the length of acquisition memory.
To see how long a time window we could capture, we ran a couple of experiments using a Logic Protocol Analyzer (LPA). Through creative storage techniques and filtering in the LPA, capturing only InitFCs, UpdateFCs and TLPs, we were able to extend the time window on one of the test systems to as long as 24 minutes. While this time window may be more than sufficient to contain flow control problems, it isn’t guaranteed to be long enough. Unfortunately, the length of the time window isn’t constant. Different systems issue UpdateFCs at different rates. For example, on a different system we tested, UpdateFCs were issued 10 times more frequently, reducing the overall time window to just 2.5 minutes.
If the problem occurs at some arbitrary time much later in the acquisition, outside of the storage time window limits, is there a way to debug flow control? To do so would require either capturing the buffer initialization values and storing them for later use, or somehow calculating flow control credits within any given region irrespective of “knowing” the real absolute value of the transmitting buffers.
Capturing the initialization values and using them at an arbitrarily later time might be of value for very well defined system behaviors, for example if the system had minimal TLP traffic for a period of time. But in the general case, losing track of the DLLP and TLP values between the InitFCs and the start of the capture window means not knowing what values are present at the start of that window.
For the second case, there are a couple of ways of calculating flow control without capturing the initialization:
1. Ask the bug hunter to provide values for the InitFCs.
2. Assume a starting value and use the MAX credit allocation value as the buffer size.
Although seemingly straightforward, the first alternative is problematic in a few ways:
a) The bug hunter is busy; the system shouldn’t require a lot of work to set up, and entering the six InitFC values for each side of the link is a step easily forgotten.
b) More importantly, the number the bug hunter provides (other than “INFINITE”) is an expected value for the InitFCs. Unfortunately, these Credits Allocated values are only half of the equation; the bug hunter doesn’t know the current values of Credits Consumed.
c) In addition, because the PCIe protocol allows a receiver to add or remove buffers “on the fly,” the actual allocated credits buffer may differ from the expected value at the start of the acquisition time window.
The second alternative, assuming a starting value for the start of the acquisition, is a perfectly reasonable simplification for most flow control debugging situations. The reason is that when flow control problems appear, they are usually local to a region in the time window. If flow control values begin to vary wildly or in unexpected ways, what matters is not their absolute value but their relative value.
In this simplified model, assume the Credits Allocated value at the start of the time window is ½ the maximum buffer size specified in the PCIe Gen 3 specification (128 for Header credits, 2048 for Data credits). Similarly, assume the Credits Consumed values are zero, just as in the absolute credit tracking case.
Using these assumptions, relative flow control calculations proceed almost identically to the absolute case with one major exception: If the actual Credits Consumed values at the start of the acquisition time window aren’t zero, they may go “negative”, an event that obviously can’t occur in the absolute tracking case. This is mostly an aesthetic problem; once the bug hunter understands the situation, the sign can be ignored.
But two more serious problems arise with this relative approach:
1. If the actual Credits Allocated buffers are much larger than MAX/2, the system may provide false indications of buffer overflows.
2. If the actual Credits Allocated buffers are much lower than MAX/2, the system may fail to report errors.
It seems providing the option for the bug hunter to enter InitFC values may be of use after all.
In fact, these false results can be reduced by introducing another element into the calculation: a threshold for buffer overflows. By establishing a threshold (for example, when Credits Consumed equals 80 percent of MAX), the bug hunter provides a “high water mark” above which the system can assume there is a problem.
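The relative scheme with a high-water mark can be sketched in a few lines: start the running balance at zero at an arbitrary window start, and flag any event where the balance crosses the bug-hunter-supplied threshold. The MAX values and the 80 percent example follow the text; the helper name and event encoding are ours.

```python
# Sketch of relative flow control tracking with a high-water mark.
# deltas encodes per-event credit changes: positive for TLP debits,
# negative for credits granted by UpdateFCs.

MAX_HDR, MAX_DATA = 128, 2048  # Gen 3 maximum buffer sizes cited above

def flag_overflows(deltas, max_credits, threshold_pct=80):
    """Return the indices of events where the running relative balance
    of Credits Consumed exceeds the high-water mark."""
    high_water = max_credits * threshold_pct / 100
    balance, flagged = 0, []
    for i, delta in enumerate(deltas):
        balance += delta  # may go "negative": relative, not absolute
        if balance > high_water:
            flagged.append(i)
    return flagged

# Posted-header traffic: steady debits, one grant, then a suspicious burst.
events = [20, 30, -40, 60, 50]          # running: 20, 50, 10, 70, 120
print(flag_overflows(events, MAX_HDR))  # [4]: only the burst crosses 102.4
```

Note that only the final event is flagged: the earlier swings stay under the 80 percent mark, which is exactly the locality the relative approach exploits.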
So, imagine a spreadsheet macro that calculates flow control for all six credit types, on both sides of the link, taking into account the eight VCs, displaying the current “relative” value of the Credits Consumed for each DLLP and TLP. Imagine also it provides a means to establish a “high-water mark” threshold to help distinguish credit values that are suspicious.
Mapping the Terrain: Getting the Bird’s Eye View
The relative tracking approach overcomes the need to capture InitFCs and permits arbitrary start points for capture time windows. Imagine, however, that the flow control problem isn’t limited to a specific area, or perhaps flow control problems build over a lengthy period, perhaps seconds of capture. In these and many other cases, it is almost impossible to locate flow control problems just by browsing through the packet data. Even searching for suspect values may require hours of detailed investigation.
As in any hunt: finding flow control bugs is a lot easier if you have a map. Wouldn’t it make sense to somehow chart the flow control values over the time window capture and observe how the values change? In the case of this imaginary spreadsheet, applying the flow control values to a charting package could, in theory, show you whether (and perhaps even where in the time window) there was a problem.
Consider a system that presents just such a chart — a “Bird’s Eye View” of all of the data captured in the entire acquisition. A Bird’s Eye View (BEV) visualization of flow control could display the relative flow control buffer values at each point (or “chronitude”) in the time window. Depending on the flow control values, the BEV could render the chart to clearly indicate if there is a problem.
Even in the thumbnail image of the BEV (shown in Figure 2), the flow control visualization clearly shows two different problems, one on the down side and one on the up side of the link. On the down side of the link (left-hand side of the chart), the Credits Consumed values are intermittently crossing the critical threshold (small red lines). On the up side of the link (the right-hand side of the chart), the Credits Consumed values are increasing over time until they pass a threshold. In both cases, the problems appear to be cyclical.
In this approach to rendering the BEV, the Credits Consumed values are assigned a line color and thickness to indicate how close they are to the threshold. When the Credits Consumed values stay around zero (meaning there has been a reasonable balance of credits consumed and credits allocated) the visualization shows a thin blue line. As the values become unbalanced, the line thickens but remains blue. Finally, as the balance exceeds the threshold defined by the bug hunter, the line is thickened and painted red.
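The rendering rule just described reduces to a small mapping from a credit value to a line style. In the sketch below, the color rule follows the text; the thickness scale and the half-threshold cutoff for "unbalanced" are our assumptions.

```python
# Sketch of the BEV line-style rule: thin blue when balanced, thicker
# blue as the balance grows, thick red past the bug hunter's threshold.
# The thickness values and half-threshold cutoff are assumptions.

def line_style(credits_consumed, threshold):
    """Map a relative Credits Consumed value to (color, thickness)."""
    magnitude = abs(credits_consumed)  # sign is ignored, as noted earlier
    if magnitude > threshold:
        return ("red", 3)              # over the high-water mark
    if magnitude > threshold // 2:
        return ("blue", 2)             # unbalanced but under the mark
    return ("blue", 1)                 # balanced: thin blue line

threshold = 100
print(line_style(5, threshold))    # ('blue', 1)
print(line_style(-80, threshold))  # ('blue', 2)
print(line_style(130, threshold))  # ('red', 3)
```

Because the style is a pure function of the value and the threshold, re-rendering the whole chart after the bug hunter adjusts the threshold is cheap.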
Here is where having an adjustable threshold value (rather than having to adjust 12 different InitFC values) comes in handy. By adjusting this threshold the bug hunter can “fine tune” the sensitivity of the visualization or explore different scenarios. For example, by adjusting down the threshold value, the bug hunter could see how “sensitive” the flow control values are – noting the lines thickening but not turning red until a specific threshold is reached.
Consider additional ways to enhance the visualization: What if the chart could show more than just two lines? Consider a problem that is isolated to one direction: The values might be just the Header and Data buffers for that direction (as shown in Figure 3), or in the case of payloads, perhaps only the Data values are most interesting to the bug hunter.
To be of greatest use, the chart should allow the bug hunter to easily adjust the values of interest quickly revealing patterns in the data. Some of the ways the BEV could be adjusted include the colors of the lines, the threshold value, the number of lines displayed, and the types of data the lines represent.
In these examples, the BEV displays the state of the flow control credits from the start of the acquisition to the end whether an InitFC was captured (absolute) or not. In either case, the visualization provides immediate at-a-glance indications of issues.
But simply showing the problems where they occur within the time window is just the beginning of a usable bug hunting map. What would really be great is if the chart could automatically link to the detailed flow control information: not only could it be a map of the problem areas, it could be a direct means of navigating to the problems.
Closing In On the Bug
What if just clicking on a spot in the BEV could update the spreadsheet of values to show the packets flowing on the link at that chronitude? Clicking on a red region, for example in Figure 4, could display the first instance of the packet causing the overflow. Using the BEV as a navigational aid, the hunter could quickly move around the acquisition exploring details revealed in the patterns of data.
Quickly navigating to points of interest is a necessity when the time window is defined in seconds and possibly millions of packets are displayed in the spreadsheet.
But the BEV can bring even more utility to the bug hunter: Since the chart already has captured and calculated flow control values across the entire time window, why not provide basic statistics about the flow control values as well?
What if the bug hunter could reveal patterns in the data merely by brushing the mouse cursor over one of the visualizations in the BEV? Using this approach, the display could provide a report of the top three Credits Consumed values for the chronitude under the cursor, as suggested by Figure 5.
In another example, imagine the bug hunter could stretch a rectangular graticule, or Viewfinder, over a region of the BEV to calculate summary statistics for that region. If the statistics could be displayed near the BEV and the spreadsheet, the bug hunter would have almost all of the information required to analyze flow control anywhere within the trace.
In Figure 6, the statistics panel displays two sets of values: the maximum Credits Consumed across the entire acquisition (Total) and the maximum Credits Consumed within the Viewfinder region. As the bug hunter moves the Viewfinder around the BEV, the statistics panel updates immediately, helping the bug hunter identify key values within the various regions.
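The two statistics the panel displays are straightforward to compute from the calculated values. The sketch below takes a list of (chronitude, Credits Consumed) samples and a Viewfinder range; the function name and data shape are ours.

```python
# Sketch of the Viewfinder statistics: the maximum Credits Consumed over
# the whole acquisition (Total) and over the selected region. Input is a
# list of (chronitude, value) samples; the data shape is an assumption.

def viewfinder_stats(samples, region_start, region_end):
    """Return ((chronitude, max) over all samples,
               (chronitude, max) within the Viewfinder region, or None)."""
    total_max = max(samples, key=lambda s: s[1])
    region = [s for s in samples if region_start <= s[0] <= region_end]
    region_max = max(region, key=lambda s: s[1]) if region else None
    return total_max, region_max

samples = [(0, 10), (5, 90), (12, 40), (20, 75)]
print(viewfinder_stats(samples, 10, 25))  # ((5, 90), (20, 75))
```

Returning the chronitude alongside each maximum is what makes the hyperlinking described next possible: the panel knows exactly where in the trace each peak occurred.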
Even here, the information can be more than a display. As in the BEV itself, the statistics panel could act as a navigation aid to the bug hunter. Clicking on one of the blue underlined (hyperlinked) elements could update the adjacent spreadsheet to display the first chronitude in the trace (or the Viewfinder region) where that value occurs.
Closing the Trap (Quickly)
An integrated system such as an interactive BEV with adjacent detailed spreadsheet and statistics panel can quickly help the bug hunter identify and navigate to a problem, but what if the system required no interaction on the part of the bug hunter at all? What if the bug hunter could be certain flow control wasn’t a problem anywhere in the system merely by looking at the BEV? Consider the situation in which flow control issues on one link are the root cause of problems occurring on a second link. What would debugging flow control problems on a circuit be like if the BEV could absorb more than one link’s worth of information at a time?
As an example, assume a circuit with several PCIe links. Assume the hunt is for a flow control problem, but it is not clear which link might have the issue. Assume also that no matter how many links are being captured, the system will display and analyze the data in the spreadsheet, the BEV and the statistics panel. If, by default, the visualization is designed to show the maximum Credits Consumed values for all credit types on all captured links, then the bug hunter can immediately see if there are no flow control issues on any link — the BEV will show only thin blue lines. This of course is the “happy case” in which the bug hunter has to take no action at all except to glance at the BEV.
In the case of a possible flow control problem, the bug hunter must determine which link is causing the issue. Again, merely glancing at the BEV will show the bug hunter there is a problem (red lines or thick lines in the visualization), but here some action must be taken. Clicking on a red region in the BEV would bring the bug hunter to the associated packet in the spreadsheet showing the offending link. Alternatively the bug hunter could click in the statistics panel on the hyperlinked value in the totals section to display the offending packet. In yet another scenario, the bug hunter could reconfigure the BEV display to limit which links are displayed, ultimately isolating the problem to a specific link.
Keeping Bugs in Sight, No Matter How Fast They Fly
Efficiently stomping bugs requires a speedy and fluid experience. For such a system to be usable, it should display information as quickly as possible. Consider, for example, what it would feel like to see the first piece of information display in the BEV (and associated spreadsheet) within a couple of seconds, no matter how long the time window or how much data has been acquired. Say, for example, the system could acquire as many as 160M symbols representing as much as 24 seconds of system operation. A really usable system would display that data as it is being processed, immediately after the trigger occurred in the capture system.
There are three main advantages to this approach:
1. There is no time wasted preprocessing the data.
2. The hunt can continue through the acquisition even if the processing isn’t complete.
3. If the trigger didn’t work as expected, the bug hunter can interrupt the BEV display processing and take a new acquisition.
Rather than having to wait for several minutes until the final data is displayed, the bug hunter can rework the trigger, take additional acquisitions and repeat the process until the instrument has triggered on the desired information.
Even as the system processes the acquisition to display items of interest in the BEV, the hunt can continue using more traditional methods such as filters, searches and the like. Naturally, once the entire acquisition has been processed, a well-designed system would immediately let the bug hunter navigate through the data, no matter how long the time window.
By updating the BEV to display different credit types, isolate links, or adjust the threshold values immediately, the system would allow the bug hunter to rapidly and fluidly view different aspects of the flow control values.
One of the most difficult challenges in debugging PCIe designs involves flow control. The consequences of flow control bugs can be significant: at a minimum slowing down the system, at worst corrupting data or generating system errors. To date, the only way of finding flow control bugs has been to manually calculate flow control values – a time consuming and error prone method usable only on very small acquisitions.
This article describes a novel “map” of a PCIe acquisition to investigate flow control: the Bird’s Eye View. Not only does the BEV provide a full-acquisition view of the trace, it provides the bug hunter with new ways of navigating through large data sets to quickly see where trouble spots may be hiding.
The BEV quickly and fluidly displays information as the hunter needs it, regardless of the scale of the acquisition – whether on one link or across many. It provides the bug hunter with a map of previously uncharted territory, the proverbial high-ground from which to see patterns of flow control information otherwise invisible in the data.
Glossary
Absolute Flow Control – the calculation of flow control values using the InitFC DLLP values.
BEV – Bird’s Eye View, a full-trace visualization of captured PCIe data
Chronitude – a Cartesian point in a time window, the geometrical equivalent of a point in time
Data Credit – one of two credit types associated with a DLLP or TLP.
DLLP – Data Link Layer Packet, the middle tier of the PCIe protocol responsible for handling flow control and other health-of-link communication
Hdr Credit – one of two credit types associated with a DLLP or TLP.
InitFC – a DLLP issued at the initialization of the PCIe link advertising the allocated credits for the various buffers
P, NP, CPL – Posted, Non-Posted and Completion type TLPs. Each consumes a different number of Hdr and Data credits
Relative Flow Control – the calculation of flow control values based on an assumed starting value.
Time window – the amount of time captured in a trace
TLP – Transaction Layer Packet – the top tier of the PCIe protocol responsible for carrying data across the link
UpdateFC – a DLLP that advertises an allocation of credits.
About the Author
Leo Frishberg is the Principal Architect of the User Experience for the Digital Analyzer Product Line at Tektronix, Inc.