
Broadband speeds have been on a steady rise. For MSOs, DOCSIS 1.0 offered bandwidths that were sufficient for initial deployments. However, getting beyond 3 Mbps per subscriber was challenging with first-generation DOCSIS equipment. The main reason for this barrier was the interaction between TCP, the primary transport protocol used on the Internet, and the DOCSIS scheduler. Three primary strategies arose to address this limitation:

1 - DOCSIS concatenation
2 - TCP ACK Suppression (TAS)
3 - Traffic shaping

The DOCSIS scheduler

To better understand the interaction between DOCSIS and TCP, consider the operation of the DOCSIS scheduler. To transmit a packet upstream, a modem must first request bandwidth. The CMTS then grants the bandwidth, and the modem waits for its scheduled time before it transmits. This cycle is referred to as the request-grant cycle. The number of transmit bursts per second that an individual cable modem can send is inversely proportional to the request-grant cycle duration, TRGC. For example, if TRGC is 5 ms, the modem can send at most 200 bursts per second.

Figure 1: DOCSIS request grant cycle.

Downstream TCP throughput is related to TRGC because each TCP ACK is a packet in the upstream. Hence, TCP throughput is inversely proportional to TRGC. For TCP transfers, more bursts upstream equate to more bandwidth downstream.

Typical values of TRGC range from 3 to 5 ms. With 1,500-byte packets, these values allow a single modem to achieve download speeds in the range of 2.4 Mbps to 4 Mbps, assuming 1 TCP ACK for each TCP packet received. Many CPEs send 1 TCP ACK for every 2 TCP packets received. This effectively doubles the downstream throughput to between 4.8 Mbps and 8 Mbps.
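The arithmetic behind these figures can be sketched in a few lines of Python. This is a back-of-the-envelope model only: the function name and parameters are illustrative, and the 1,500-byte packet size is an assumption inferred from the numbers quoted above.

```python
def downstream_mbps(t_rgc_ms, packet_bytes=1500, packets_per_ack=1):
    """Estimate the ACK-limited downstream TCP rate in Mbps.

    Assumes one upstream burst per request-grant cycle and one TCP ACK
    per burst (no concatenation, no TCP ACK suppression).
    """
    acks_per_second = 1000.0 / t_rgc_ms          # one burst per cycle
    bits_per_ack = packets_per_ack * packet_bytes * 8
    return acks_per_second * bits_per_ack / 1e6


if __name__ == "__main__":
    for t_rgc in (5.0, 3.0):
        for ratio in (1, 2):
            rate = downstream_mbps(t_rgc, packets_per_ack=ratio)
            print(f"TRGC={t_rgc} ms, {ratio} packet(s)/ACK: {rate:.1f} Mbps")
```

The model treats the upstream ACK rate as the only bottleneck, which is the situation the article describes for first-generation equipment.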

TRGC can be minimized in order to increase downstream throughput. It is composed of the average MAP duration plus the average MAP-ahead time. Practical MAP durations are in the 1.5 ms to 3 ms range. Any smaller and the MAP messages start to consume a significant percentage of the downstream bandwidth.

MAP-ahead time consists of three components: a minimum of 200 µs by which a cable modem MUST receive a MAP message before any grant becomes active, propagation delay, and any latency induced by downstream interleaving. A 16-tap interleave depth adds 480 µs of latency.

Therefore, the minimum MAP-ahead time, assuming no propagation delay, is 200 µs + 480 µs, or 680 µs. However, the DOCSIS specification allows for 100 miles of HFC between the headend and a subscriber's home, which equates to 800 µs of propagation delay. This number can be reduced if there is less than 100 miles of HFC. It is even possible to change this parameter dynamically by having the CMTS keep track of the modem with the largest timing offset and adjusting the MAP-ahead time accordingly. This produces the lowest possible value of TRGC.
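To make the budget concrete, the following Python sketch adds up the components described above. The constants come from the text; the function names are illustrative, and scaling the propagation delay linearly with plant distance is a simplifying assumption rather than anything the DOCSIS specification states.

```python
MIN_MAP_LEAD_US = 200        # minimum lead time before a grant becomes active
INTERLEAVER_16_TAP_US = 480  # latency of a 16-tap downstream interleaver
PROP_DELAY_100_MILES_US = 800


def map_ahead_us(hfc_miles, interleaver_us=INTERLEAVER_16_TAP_US):
    """MAP-ahead time in microseconds for a plant of the given length.

    Assumption: propagation delay scales linearly with distance, using
    800 us per 100 miles as the reference point.
    """
    propagation_us = PROP_DELAY_100_MILES_US * hfc_miles / 100
    return MIN_MAP_LEAD_US + interleaver_us + propagation_us


def t_rgc_ms(map_duration_ms, hfc_miles):
    """Request-grant cycle: average MAP duration plus MAP-ahead time."""
    return map_duration_ms + map_ahead_us(hfc_miles) / 1000.0


if __name__ == "__main__":
    for miles in (0, 10, 100):
        print(f"{miles:>3} miles: MAP-ahead {map_ahead_us(miles):.0f} us, "
              f"TRGC {t_rgc_ms(2.0, miles):.2f} ms with a 2 ms MAP")
```

Under these assumptions, plant distance alone accounts for up to 0.8 ms of the cycle, which is why a CMTS that tracks the largest timing offset can shorten TRGC on compact plants.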

However, dynamically changing the MAP-ahead time based on the current worst-case propagation delay does have issues. If, for example, all registered CMs are within 10 miles of the headend one day, subscribers will see very good throughput. But if a subscriber registers the next day at the far end of the cable, 100 miles away, the throughput seen by all subscribers will be reduced. Any change made to the MAP-ahead time because of newly registered modems also affects modems that were previously registered. This type of inconsistent delivery of service is typically unacceptable.

Methods of optimization

Due to the connection-oriented nature of TCP, throughput from server to host is proportional to the rate of TCP ACKs from host to server.

Figure 2: Base Case 1 TCP ACK per burst.

In Figure 2, the CPE is sending TCP ACKs at a rate that is proportional to the rate at which it is receiving TCP data packets. Neither concatenation nor TCP ACK suppression is enabled.

While the CPE is transmitting at a rate greater than 1/TRGC packets per second, the cable modem is limited to sending 1/TRGC packets per second. Therefore the total upstream throughput is limited by TRGC.

Concatenation

Concatenation is the ability to send more than one Ethernet PDU per upstream burst. This allows more packets to be sent upstream per second, thus increasing the number of TCP ACKs sent upstream per second. With today's downstream TMAX values exceeding 6 Mbps or 8 Mbps, concatenation is critical. Marked improvements in downstream throughput can be observed when concatenation is enabled on DOCSIS 1.0, 1.1, or 2.0 modems. Figure 3 shows how concatenation increases the number of packets per second that can be sent upstream.

Figure 3: Concatenated TCP ACKs.

Concatenation allows the cable modem to send at a rate greater than 1/TRGC packets per second, thus increasing the TCP ACK rate, which corresponds to an increase in the downstream TCP throughput. Note in both Figures 2 and 3 there is a one-for-one correspondence between the packets sent by the CPE and the packets forwarded by the CMTS.
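A simple way to see the effect is to cap the upstream packet rate at the number of PDUs per burst times the burst rate. The Python sketch below does that; the function and parameter names are illustrative, and the model ignores real-world limits on concatenated burst size.

```python
def upstream_packets_per_second(cpe_pps, t_rgc_ms, pdus_per_burst=1):
    """Packets per second the modem can forward upstream.

    cpe_pps is the rate at which the CPE offers packets. The modem gets
    one burst per request-grant cycle and may place up to pdus_per_burst
    packets in it (1 means concatenation is disabled).
    """
    bursts_per_second = 1000.0 / t_rgc_ms
    return min(cpe_pps, bursts_per_second * pdus_per_burst)


if __name__ == "__main__":
    cpe_rate = 1000.0 / 1.5   # CPE sends one ACK every 1.5 ms
    for pdus in (1, 2, 3):
        rate = upstream_packets_per_second(cpe_rate, 4.5, pdus)
        print(f"{pdus} PDU(s) per burst: {rate:.0f} packets/s upstream")
```

With a TRGC of 4.5 ms and ACKs arriving every 1.5 ms, three PDUs per burst are enough in this model for the modem to keep pace with the CPE.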

TCP ACK Suppression

TCP ACK Suppression overcomes the TRGC limitation without changing the DOCSIS specification or involving the CMTS. It improves downstream TCP transmissions by taking advantage of the wait imposed by TRGC: when its data grant becomes active, the modem sends only the most recent ACK it has received. Thus, fewer TCP ACKs are sent, but the number of bytes acknowledged by each TCP ACK is increased.

Consider a user who is FTPing a file downstream. There will be a succession of ACKs sent at a rate proportional to the TCP data packets received. Assume that TCP ACKs are being sent every 1.5 ms on a system that has a TRGC of 4.5 ms.

When the cable modem receives the first TCP ACK from the CPE, it will send a request for bandwidth equivalent to one TCP ACK. Each ACK contains an acknowledgement number that corresponds to the byte in the transfer that is being acknowledged. All prior bytes are considered acknowledged.

In this example, TCP ACK #1 is acknowledging byte 1500, ACK #2 byte 3000, and ACK #3 byte 4500. All three ACKs are the same size. At time 4.5 ms, the TAS-enabled modem will have the opportunity to send one packet whose size is the length of a standard ACK packet.

Figure 4: TCP ACK suppression.

Therefore, instead of sending ACK #1, the modem sends ACK #3. This is possible because the CM has received ACK #3 at T = 3.0 ms, has had time to inspect the packet, and has had time to make the switch before the grant becomes active. This grant, remember, was received as a result of the request sent after receiving ACK #1.

Without TAS enabled, the user's FTP transfer was limited to about 222 ACKs per second (one per 4.5 ms cycle) × 1500 bytes per ACK × 8 bits per byte, or about 2.7 megabits acknowledged per second. With TAS enabled, this maximum increases to 222 ACKs per second × an average of 4500 bytes per ACK × 8 bits per byte, or about 8 megabits acknowledged per second.
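The example can be replayed with a small simulation. The Python sketch below is a simplified model, not a description of any vendor's TAS implementation: it assumes one grant per request-grant cycle, ACKs from the CPE at a fixed interval starting at time zero, and that an ACK arriving exactly at grant time is too late to be substituted. All names are illustrative.

```python
from collections import deque


def bytes_acknowledged(num_grants, ack_interval_us=1500, t_rgc_us=4500,
                       bytes_per_ack=1500, tas=True):
    """Highest byte number acknowledged upstream after num_grants grants."""
    queue = deque()       # cumulative ACK numbers waiting in the modem
    generated = 0         # ACKs produced by the CPE so far
    highest_sent = 0
    for grant in range(1, num_grants + 1):
        grant_time_us = grant * t_rgc_us
        # Enqueue every ACK that arrived strictly before this grant.
        while generated * ack_interval_us < grant_time_us:
            generated += 1
            queue.append(generated * bytes_per_ack)
        if not queue:
            continue
        if tas:
            highest_sent = queue[-1]   # send only the newest ACK
            queue.clear()              # older ACKs are redundant
        else:
            highest_sent = queue.popleft()  # send the oldest, one per grant
    return highest_sent


def acknowledged_mbps(num_grants, t_rgc_us=4500, **kwargs):
    """Average acknowledged rate; bits per microsecond equals Mbps."""
    acked = bytes_acknowledged(num_grants, t_rgc_us=t_rgc_us, **kwargs)
    return acked * 8 / (num_grants * t_rgc_us)


if __name__ == "__main__":
    print(f"without TAS: {acknowledged_mbps(222, tas=False):.2f} Mbps")
    print(f"with TAS:    {acknowledged_mbps(222, tas=True):.2f} Mbps")
```

Because TCP acknowledgements are cumulative, discarding the older ACKs loses no information: the newest ACK covers every byte the earlier ones did.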

The benefit that TAS has over concatenation is that it not only increases downstream throughput but it also decreases the amount of bandwidth consumed in the upstream. However, TAS only works on TCP ACKs. It has no effect on any other traffic. Concatenation, on the other hand, works on all traffic.

Concatenation and TAS are not mutually exclusive. They operate independently, but they can operate at the same time.

The effect of using both simultaneously will be more downstream TCP bandwidth with less upstream overhead.

Traffic shaping

Traffic shaping is the third technique used to maximize throughput. Unlike concatenation and TAS, traffic shaping does not increase downstream throughput by increasing the number of bytes acknowledged per second in the upstream. It actually speeds up downstream transmissions by first slowing them down.

Typically, the maximum sustained traffic rate, or TMAX, is set to a value that is much less than the total rate of the downstream channel. DOCSIS-defined policing algorithms limit subscriber throughput to the value of TMAX with a configurable window that allows the subscriber to burst beyond TMAX for some number of bytes. This is typically implemented using a Token Bucket algorithm that allows the user to burst beyond TMAX as long as there are tokens in the bucket.

For each byte transmitted at a rate faster than TMAX, tokens are removed from the bucket. Tokens are added back to the bucket for every quantum of time that the transmission is not exceeding TMAX. This quantum is implementation dependent. This has the effect of emptying the bucket when the transmission exceeds TMAX, and filling the bucket when the transmissions are below TMAX. Packets are allowed to be transmitted as long as the bucket is not empty. Arriving packets are dropped when the bucket is empty. Transmissions are forced to conform to the TMAX value. This is referred to as "Traffic Policing."
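A minimal policer along these lines can be sketched in Python. This is the common textbook form of a token bucket (tokens accrue continuously at TMAX and each packet spends tokens equal to its size), which may differ in detail from the accounting a particular CMTS uses. The class name, the integer microsecond timestamps, and the parameter names are all illustrative.

```python
class TokenBucketPolicer:
    """Drop packets that exceed TMAX once the burst allowance is spent."""

    def __init__(self, tmax_bps, depth_bytes):
        self.bytes_per_us = tmax_bps / 8e6   # token refill rate
        self.depth = depth_bytes             # maximum burst allowance
        self.tokens = float(depth_bytes)     # bucket starts full
        self.last_us = 0

    def allow(self, now_us, size_bytes):
        """Return True to forward the packet, False to drop it."""
        elapsed_us = now_us - self.last_us
        self.last_us = now_us
        # Refill for the elapsed time, never beyond the bucket depth.
        self.tokens = min(self.depth,
                          self.tokens + elapsed_us * self.bytes_per_us)
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True
        return False                          # bucket empty: police (drop)


if __name__ == "__main__":
    policer = TokenBucketPolicer(tmax_bps=8_000_000, depth_bytes=3000)
    for t_us in (0, 0, 0, 1500):
        verdict = "forward" if policer.allow(t_us, 1500) else "drop"
        print(f"t={t_us} us: {verdict}")
```

In this run the first two back-to-back packets consume the burst allowance, the third is dropped, and a packet arriving 1.5 ms later is forwarded because the bucket has refilled by 1,500 bytes.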

Policing is an effective way to enforce TMAX. Unfortunately, it has an undesirable side effect with TCP. TCP slows down dramatically when any packet loss occurs. Thus, what you see with a token bucket-based policing algorithm is a saw-toothed traffic pattern that ramps up toward the channel bandwidth until the bucket is exhausted and then cuts in half. This cycle repeats throughout the transmission. The average throughput of the transmission is somewhere between the peaks and valleys on the saw tooth. To reach an average transmission rate of TMAX, the depth of the token bucket must be sized appropriately.

Policing worked well in initial deployments because TMAX values were low and bucket depths were small. However, as TMAX values increase beyond the 3 Mbps to 5 Mbps range, the depth of the bucket must increase as well. Ten times the bandwidth requires ten times the bucket size.

Also, at high values of TMAX, such large bucket sizes produce inconsistent results because of differences between modems from different vendors, interactions with concatenation and TAS, and CPEs with different TCP/IP protocol stacks. It is difficult to produce consistent results from subscriber to subscriber.

By tweaking the Token Bucket algorithm from a Policing algorithm to a Shaping algorithm, consistent and accurate results can be obtained. There may still be limitations to specific brands or models of cable modems that are beyond the ability of traffic shaping to fix, but traffic shaping will deliver consistent results on the same model in download after download.

Traffic shaping also uses a Token Bucket to allow bursts, as this is still desirable, but packets are not dropped during periods when the bucket is empty. Instead, they are queued until they conform to the TMAX rate. The saw-tooth traffic graph of Policing is replaced by a flat graph at a value of TMAX bps.
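The difference from policing can be illustrated with a small Python sketch that computes when each packet is released instead of deciding whether to drop it. This is a generic token-bucket shaper under stated assumptions, not any vendor's implementation: packets are handled in FIFO order, no packet is larger than the bucket depth, and the queue is unbounded. The class and parameter names are illustrative.

```python
import math


class TokenBucketShaper:
    """Delay non-conforming packets until they fit within TMAX."""

    def __init__(self, tmax_bps, depth_bytes):
        self.bytes_per_us = tmax_bps / 8e6   # token refill rate
        self.depth = depth_bytes
        self.tokens = float(depth_bytes)     # bucket starts full
        self.clock_us = 0                    # time of the last release

    def release_time_us(self, arrival_us, size_bytes):
        """Return the time at which the packet may be transmitted."""
        if size_bytes > self.depth:
            raise ValueError("packet larger than bucket depth")
        # A packet cannot leave before it arrives or before its predecessor.
        start_us = max(arrival_us, self.clock_us)
        refill = (start_us - self.clock_us) * self.bytes_per_us
        self.tokens = min(self.depth, self.tokens + refill)
        self.clock_us = start_us
        if self.tokens < size_bytes:
            # Queue the packet: wait until enough tokens have accrued.
            wait_us = math.ceil((size_bytes - self.tokens) / self.bytes_per_us)
            self.clock_us += wait_us
            self.tokens = min(self.depth,
                              self.tokens + wait_us * self.bytes_per_us)
        self.tokens -= size_bytes
        return self.clock_us


if __name__ == "__main__":
    shaper = TokenBucketShaper(tmax_bps=8_000_000, depth_bytes=3000)
    for n in range(4):
        print(f"packet {n}: release at {shaper.release_time_us(0, 1500)} us")
```

Four 1,500-byte packets arriving together are all delivered: two leave immediately on the burst allowance and the rest are spaced 1.5 ms apart, which is exactly the 8 Mbps TMAX in this example. No packet is lost, so TCP never sees the drops that cause the saw-tooth pattern.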

Conclusion

In order to deliver optimum throughput in DOCSIS systems, MSOs must deploy both cable modems and CMTSs that work with TCP and its state machines.

Concatenation and TAS have the ability to increase the downstream bits per second by increasing the upstream bytes acknowledged per second. At the same time, TAS reduces the total amount of upstream bandwidth required for a TCP flow at a given data rate.

Lastly, traffic shaping reduces retransmissions due to dropped packets and smooths traffic rates for more dependable operation. The three strategies for optimizing TCP throughput can work independently or in concert with one another. The best results come from using all three.

E-mail: Steven.Krapp@arrisi.com
