doi:10.3772/j.issn.1006-6748.2018.04.007

# Effect of DFE error propagation and its mitigation using MUX-based FEC interleaving for 400 GbE electrical link<sup>①</sup>

Zhan Yongzheng(展永政), Hu Qingsheng<sup>②</sup>
(Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, P. R. China)

## **Abstract**

This paper evaluates the effect of decision feedback equalizer (DFE) error propagation on a 400 Gb/s Ethernet (400 GbE) electrical link and proposes effective methods to improve the bit error rate (BER). First, an analytical model for the DFE burst error length distribution is proposed and simulated for an NRZ electrical link that includes a 5-tap DFE combined with a multiple-tap feed forward equalizer (FFE). Then, a detailed derivation of the BER under DFE error propagation is given based on the distribution of burst error run length, and the BER performance with and without forward error correction (FEC) is simulated. After that, several MUX-based FEC interleaving methods are investigated to improve the BER further. Finally, three FEC interleaving schemes are compared in terms of interleaving gain, hardware complexity and latency. Simulation results show that pre-interleave bit muxing achieves a good tradeoff between BER and complexity for the 400 GbE electrical link.

**Key words:** decision feedback equalizer (DFE) error propagation, forward error correction (FEC) interleaving, multiplexer, 400 GbE electrical link

# 0 Introduction

Since the establishment of the IEEE P802.3bs 400 Gb/s Ethernet (400 GbE) task force in 2013<sup>[1]</sup>, high-bandwidth interconnect technology has developed rapidly to support many emerging application areas, such as cloud data centers, online social networks and Internet exchanges. In 400 GbE, a forward error correction (FEC) technique based on Reed-Solomon (RS) coding<sup>[2]</sup> is recommended for the physical coding sublayer (PCS) to further improve the bit error rate (BER). However, error propagation in the decision feedback equalizer (DFE) of the receiver<sup>[3]</sup> produces long burst errors that can exceed the error correction capability of the RS code and thus degrade the BER.

To solve this problem, some studies have focused on estimating the probability of burst errors of various lengths, with the aim of finding FEC codes that correct burst errors effectively. For example, Liu et al. [4] demonstrated that a double-burst-error correcting RS code over GF(2<sup>10</sup>) outperformed a single-burst-error correcting one. This RS code, however, could only correct a single 11-bit burst error in each symbol, so it could not meet the BER requirement for 400 GbE when long burst errors occurred. Meanwhile, others have concentrated on modifying the DFE itself to reduce the burst error length. Tang et al. [5] replaced the hard decision with two opposite thresholds in a dual DFE structure to minimize the probability that subsequent decisions are wrong, at the cost of a more complex DFE than the conventional one. On the other hand, some alternative methods, e.g. multi-codeword interleaving, were proposed to break long burst errors into short ones and improve FEC performance in high-speed Ethernet systems. For example, FEC orthogonal multiplexing (FOM), which can be implemented in the physical media attachment (PMA) sublayer, and an FEC pre-interleaving scheme that stripes FEC codewords to PCS lanes have been investigated for 400 GbE<sup>[6]</sup>. In these two methods, however, the DFE has only a 1-tap structure, which may be too simple to analyze the effect of DFE error propagation accurately, especially for very high speed Ethernet<sup>[7]</sup>.

This paper focuses on evaluating the effect of DFE error propagation based on multi-tap DFE structure and improving BER performance using adequate FEC interleaving techniques which have good tradeoff between performance and complexity.

① Supported by the National Natural Science Foundation of China (No. 61471119).

② To whom correspondence should be addressed. E-mail: qshu@seu.edu.cn. Received on Dec. 13, 2017

# 1 Effect of DFE error propagation

## 1.1 DFE error propagation

Fig. 1 illustrates the process of DFE error propagation, where  $\{x_1, x_2, \cdots, x_n, \cdots\}$ ,  $\{y_1, y_2, \cdots, y_n, \cdots\}$  and  $\{z_1, z_2, \cdots, z_n, \cdots\}$  are the received signals, pre-slicer signals and post-slicer signals, respectively, and  $\{c_1, c_2, \cdots, c_k\}$  are the coefficients of the feedback filter.



Fig. 1 The process of DFE error propagation

Then the following can be obtained.

$$y_n = x_n - \sum_{i=1}^k c_i \times z_{n-i} \tag{1}$$

From Eq. (1), it can be seen that because of the feedback structure, when a single or multiple errors occur in post-slicer signals, they may impact the current one with a certain probability. For example, if  $z_{n-1}$  is wrong and  $c_1$  has a larger magnitude, then a larger offset may occur for  $y_n$ , which may result in an error for  $z_n$ . This is the primary process of DFE error propagation. It is worth noting that the probability of error propagation depends on the structure of feedback filter, not only the number of taps but also the tap magnitudes. In other words, the larger the tap numbers or magnitudes are, the larger the probability of error propagation is.
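The feedback mechanism of Eq. (1) can be illustrated with a minimal Python sketch. It is not the link model used in this paper: the toy channel (post-cursor ISI exactly equal to the DFE taps), the alternating symbol pattern and the single noise sample `kick` are all assumptions chosen only to make the propagation visible.

```python
def dfe_slice(x, taps):
    """Slice received samples x to NRZ levels (+1/-1) using Eq. (1)."""
    z = []
    for n, x_n in enumerate(x):
        # Feedback term: sum of c_i * z_{n-i} over decisions that already exist.
        feedback = sum(c * z[n - i] for i, c in enumerate(taps, start=1) if n - i >= 0)
        y_n = x_n - feedback               # pre-slicer signal
        z.append(1 if y_n >= 0 else -1)    # post-slicer decision
    return z


def received(symbols, taps, noise=None):
    """Toy channel (assumption): post-cursor ISI equal to the DFE taps, plus optional noise."""
    noise = noise or {}
    x = []
    for n, s in enumerate(symbols):
        isi = sum(c * symbols[n - i] for i, c in enumerate(taps, start=1) if n - i >= 0)
        x.append(s + isi + noise.get(n, 0.0))
    return x


def count_errors(c1, noise=None):
    """Number of wrong decisions for an alternating pattern and a 1-tap DFE."""
    symbols = [1, -1, 1, -1, 1, -1]
    decisions = dfe_slice(received(symbols, [c1], noise), [c1])
    return sum(d != s for d, s in zip(decisions, symbols))


kick = {1: 1.2}  # hypothetical noise sample that flips decision z_1
print(count_errors(0.6, kick), count_errors(0.4, kick))
```

In this toy setting the same initial error stays isolated when $c_1 = 0.4$ but turns into a burst that lasts to the end of the pattern when $c_1 = 0.6$, which matches the statement that larger tap magnitudes raise the probability of error propagation.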

Based on the analysis above, it can be known that the effect of error propagation can be reduced by decreasing either the number of taps or the magnitude of tap coefficients. The cost may be the degradation of equalization performance to some extent. An effective method to overcome this problem is to combine a feed forward equalizer (FFE) with appropriate post-cursor taps and control the magnitude of DFE taps.

As shown in Fig. 2, by increasing the magnitude of the FFE post-cursor tap, e.g.  $a_1$ , the DFE coefficients  $c_i$  ( $i = 1, \cdots, k$ ) can be reduced without affecting the equalization performance.

## 1.2 Analysis of burst error run length

Fig. 2 Suppress error propagation by combining FFE

To further evaluate the effect of DFE error propagation on BER performance, the first step is to analyze the burst length distribution, which can be defined as the cumulative-probability distribution of error bursts as a function of burst length. Let  $p(e_i \mid E)$  be the probability that the i-th bit is in error given error pattern E, which can be modeled by flipping the sign of the corresponding DFE feedback tap values. Then  $p(e_i \mid E)$  can be calculated from the signal-to-noise ratio (SNR) using the following analytic model.

$$P_{err} \approx \frac{1}{2} \operatorname{erfc} \left( \frac{\sqrt{SNR}}{2\sqrt{2}} \right) \tag{2}$$
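Eq. (2) can be evaluated directly with the standard library. The sketch below assumes that SNR is a linear power ratio (the text does not state the unit), and the dB conversion helper is added only for convenience.

```python
import math


def snr_db_to_linear(snr_db):
    """Convert an SNR in dB to a linear power ratio (helper, not from the paper)."""
    return 10.0 ** (snr_db / 10.0)


def p_err(snr_linear):
    """Eq. (2): P_err ~ 0.5 * erfc(sqrt(SNR) / (2 * sqrt(2))), SNR assumed linear."""
    return 0.5 * math.erfc(math.sqrt(snr_linear) / (2.0 * math.sqrt(2.0)))


for snr_db in (10, 15, 20):
    print(snr_db, p_err(snr_db_to_linear(snr_db)))
```

Flipping the sign of a feedback tap reduces the effective SNR seen by the slicer, so the same function would yield the larger conditional error probabilities $p(e_i \mid E)$.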

Further, let brl be the burst run length,  $brl_{\max}$  the maximum error run length to be considered, and p(brl = l) the probability of error run length l given that an error has occurred. The probabilities of the various burst error lengths are then obtained as follows:

$$\begin{aligned}
p(brl = 1) &= \sum_{j=1}^{1} p(brl = 1, E_{1,j}) = p(brl = 1, E = \{1\}) \\
&= \prod_{i=2}^{brl_{\max}+1} (1 - p(e_i \mid \{e_1\} = \{1\})) \\
p(brl = 2) &= \sum_{j=1}^{1} p(brl = 2, E_{2,j}) = p(brl = 2, E = \{1,1\}) \\
&= p(e_2 \mid \{e_1\} = \{1\}) \cdot \prod_{i=3}^{brl_{\max}+2} (1 - p(e_i \mid \{e_1, e_2\} = \{1,1\})) \\
p(brl = 3) &= \sum_{j=1}^{2} p(brl = 3, E_{3,j}) \\
&= p(brl = 3, E_{3,1} = \{1,0,1\}) + p(brl = 3, E_{3,2} = \{1,1,1\}) \\
&= (1 - p(e_2 \mid \{e_1\} = \{1\})) \cdot p(e_3 \mid \{e_1, e_2\} = \{1,0\}) \cdot \prod_{i=4}^{brl_{\max}+3} (1 - p(e_i \mid \{e_1, e_2, e_3\} = \{1,0,1\})) \\
&\quad + p(e_2 \mid \{e_1\} = \{1\}) \cdot p(e_3 \mid \{e_1, e_2\} = \{1,1\}) \cdot \prod_{i=4}^{brl_{\max}+3} (1 - p(e_i \mid \{e_1, e_2, e_3\} = \{1,1,1\}))
\end{aligned} \tag{3}$$

where  $E_{1,j}$ ,  $E_{2,j}$  and  $E_{3,j}$  represent the j-th error pattern of burst errors with brl=1, brl=2 and brl=3, respectively.  $p(e_i \mid \{e_1\} = \{1\})$  is the probability that the i-th bit is wrong when the first bit is in error,  $p(e_i \mid \{e_1, e_2\} = \{1,1\})$  is that when both the first and second bits are in error, and  $p(e_i \mid \{e_1, e_2, e_3\} = \{1,0,1\})$  is that when both the first and third bits are in error.

Fig. 3 illustrates the meaning of Eq. (3) in detail. Unlike the cases brl=1 and brl=2, which each have a single error pattern, the case brl=3 has two error patterns,  $E_{3,1}=\{1,0,1\}$  and  $E_{3,2}=\{1,1,1\}$ , so the probability of brl=3 is the sum of their probabilities.



Fig. 3 Examples of error pattern with different brl

In general, for  $l \ge 2$  a burst of run length l contains  $2^{l-2}$  error patterns in total, because the first and last bits are in error and each of the  $l-2$  bits between them may or may not be. Let  $E_{l,j}$  represent the j-th of these error patterns; the probability of brl = l is then obtained as follows:

$$\begin{aligned}
p(brl = l) &= \sum_{j=1}^{2^{l-2}} p(brl = l, E_{l,j}) \\
&= \sum_{j=1}^{2^{l-2}} \prod_{i=2}^{l} p(e_i^{l,j}) \cdot \prod_{i=l+1}^{brl_{\max}+l} (1 - p(e_i \mid E_{l,j}))
\end{aligned} \tag{4}$$

where  $p(e_i^{l,j})$ , given by Eq. (5), is the probability of the i-th bit in error pattern  $E_{l,j}$ ,  $1 \le l \le brl_{\max}$ , and  $\prod_{i=l+1}^{brl_{\max}+l} (1-p(e_i \mid E_{l,j}))$  represents the probability that none of the following  $brl_{\max}$  bits is in error due to error pattern  $E_{l,j}$ .

$$p(e_i^{l,j}) = \begin{cases} 1 - p(e_i \mid \{e_1, e_2, \dots, e_{i-1}\}), & \text{if } e_i^{l,j} = 0\\ p(e_i \mid \{e_1, e_2, \dots, e_{i-1}\}), & \text{if } e_i^{l,j} = 1 \end{cases} \tag{5}$$

From Eq. (4), the BER performance under DFE error propagation can be obtained.
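Eqs. (4) and (5) can be sketched in Python as follows. The callable `cond_err` is hypothetical: it stands in for the simulated conditional probabilities  $p(e_i \mid \{e_1, \dots, e_{i-1}\})$ , and the toy model used in the demonstration (only the immediately preceding bit matters) is an assumption, not the 5-tap DFE of this paper. The tail product is evaluated by extending the pattern with error-free bits, which is one reading of  $p(e_i \mid E_{l,j})$ .

```python
from itertools import product


def burst_patterns(l):
    """All error patterns of run length l: first and last bits are errors."""
    if l == 1:
        return [(1,)]
    return [(1, *middle, 1) for middle in product((0, 1), repeat=l - 2)]


def p_brl(l, brl_max, cond_err):
    """Eq. (4): probability of burst run length l, summed over all patterns."""
    total = 0.0
    for pattern in burst_patterns(l):
        prob = 1.0
        # Eq. (5): bits 2..l of the pattern, each conditioned on the bits before it.
        for i in range(1, l):
            p_i = cond_err(pattern[:i])
            prob *= p_i if pattern[i] == 1 else 1.0 - p_i
        # The following brl_max bits must all be error-free.
        history = pattern
        for _ in range(brl_max):
            prob *= 1.0 - cond_err(history)
            history = history + (0,)
        total += prob
    return total


def toy_cond_err(history, a=0.25):
    """Toy assumption: the next bit fails with probability a only after an error."""
    return a if history[-1] == 1 else 0.0


print([p_brl(l, 4, toy_cond_err) for l in (1, 2, 3)])
```

With this toy model the pattern  $\{1,0,1\}$  has zero probability, so  $p(brl = 3)$  comes entirely from  $\{1,1,1\}$ ; with measured conditional probabilities such as those in Table 2 both patterns would contribute.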

## 1.3 Simulation results

After obtaining the analytic model of burst run length in Eq. (4), the distribution of DFE burst error length is further analyzed by simulation. Fig. 4 shows the simulation platform, which contains a transmitter, a lossy channel and a receiver. In the transmitter, equalization blocks such as pre-emphasis and the TX buffer are modeled<sup>[8]</sup>, and in the receiver a 5-tap DFE combined with a dedicated FFE is considered. Additionally, in order to estimate the error propagation accurately, the device package and crosstalk are added to the link system in the form of S-parameters<sup>[9]</sup>.

Three backplane channels A, B and C, each including the device package, are simulated. Their frequency responses are depicted in Fig. 5, in which the near-end and far-end crosstalk are also presented. Table 1 lists the channel insertion losses for 25 Gb/s and 50 Gb/s NRZ signaling. It can be observed that the insertion loss increases significantly as the data rate rises. For example, the insertion loss of channel C reaches 48.7 dB at 25 GHz.

Fig. 5 Frequency responses

Table 1 Insertion loss

| Channel | Insertion loss at 12.5 GHz (dB) | Insertion loss at 25 GHz (dB) |
|---------|---------------------------------|-------------------------------|
| A       | 14.6                            | 28                            |
| B       | 22.6                            | 40                            |
| C       | 25.8                            | 48.7                          |

Table 2 lists part of the simulation results for a 5-tap DFE combined with a 2-tap FFE, and Fig. 6 plots the probability of different burst error run lengths. From Table 2, it can be seen that a single random error  $p_1$  may be propagated to a short or long burst error with a certain probability. For channel A, for example,  $p(e_2 \mid e_1)$ , the probability of the second bit being in error due to the first error bit  $e_1$ , and  $p(e_3 \mid e_1)$  and  $p(e_4 \mid e_1)$ , those of the third and the fourth bits being in error due to the first error bit, are 2.672e-1, 1.1504e-1 and 4.585e-2, respectively, decreasing as *brl* increases. Similarly, Fig. 6 shows that all the probabilities decrease significantly as *brl* increases, for both the 2-tap and the 3-tap FFE. For a given *brl*, the probability with the 3-tap FFE is lower than that with the 2-tap FFE, which confirms that the effect of DFE error propagation can be mitigated effectively by increasing the number and/or magnitude of the FFE post-cursor taps.

Table 2 Partial simulation results of burst error run length for 5-tap DFE combined with 2-tap FFE

| brl | $p(e_i \mid E)$                                 | Channel A | Channel B | Channel C |
|-----|-------------------------------------------------|-----------|-----------|-----------|
| 1   | $p_1$                                           | 2e-06     | 5e-06     | 1.3e-05   |
| 2   | $p(e_2 \mid \{e_1\} = 1)$                       | 2.672e-1  | 5.3587e-1 | 6.544e-1  |
| 3   | $p(e_3 \mid \{e_1\} = 1)$                       | 1.1504e-1 | 1.6422e-1 | 1.892e-1  |
|     | $p(e_3 \mid \{e_1, e_2\} = \{1, 1\})$           | 1.1829e-1 | 9.945e-2  | 1.226e-1  |
| 4   | $p(e_4 \mid \{e_1\} = 1)$                       | 4.585e-2  | 6.905e-2  | 7.944e-2  |
|     | $p(e_4 \mid \{e_1, e_2\} = \{1, 1\})$           | 8.289e-2  | 1.436e-1  | 1.649e-1  |
|     | $p(e_4 \mid \{e_1, e_2, e_3\} = \{1, 0, 1\})$   | 8.012e-2  | 1.3608e-1 | 1.697e-1  |
|     | $p(e_4 \mid \{e_1, e_2, e_3\} = \{1, 1, 1\})$   | 5.657e-2  | 2.645e-2  | 2.672e-2  |



Fig. 6 Probability of different DFE burst error run length over channels

# 2 BER with DFE error propagation

Based on the above discussion, it can be known that DFE error propagation may result in burst errors with different run length and this may affect BER performance to some extent. In the following we will investigate the effect of DFE error propagation on BER in detail.

## 2.1 Effect of DFE error propagation on BER

First, p(W(E) = w) is defined as the probability that a total of w bits are in error among the n bits of a block, where W(E) is called the weight of error pattern E. For example,  $W(E = \{1\}) = 1$ ,  $W(E = \{1,0,1\}) = 2$  and  $W(E = \{1,1,1\}) = 3$ .

Next, p(W(E) = w) is calculated for  $1 \le w \le \infty$ . For the simplest case, i.e.  $W(E = \{1\}) = 1$ , the probability is:

$$p(W(E) = 1) = n \cdot p_1 \cdot p(brl = 1) \cdot (1 - p_1)^{n - brl_{\max} - 1} \tag{6}$$

where, n is the length of a block,  $p_1$  is the random bit error rate, obtained without flipping any DFE tap value, and p(brl = 1) can be got from Eq. (4).

For the case of  $W(E) \ge 2$ , however, there are two different subcases: either the error pattern is a single burst error, or it consists of multiple random and/or burst errors. Since the probability of the former is much greater than that of the latter, p(W(E) = 2) can be calculated as follows:

$$\begin{aligned}
p(W(E) = 2) &= p(\text{burst error of 2 bits}) + p(\text{two separate errors}) \\
&\approx p(\text{burst error of 2 bits}) \\
&= p(E = \{1,1\}) + p(E = \{1,0,1\}) + p(E = \{1,0,0,1\}) + \cdots \\
&= \sum_{l=2}^{brl_{\max}} p(brl = l, W(E) = 2) \\
&= \sum_{l=2}^{brl_{\max}} n \cdot p_1 \cdot \Big( \sum_{j, W(E_{l,j}) = 2} p(brl = l, E_{l,j}) \Big) \cdot (1 - p_1)^{n - brl_{\max} - l}
\end{aligned} \tag{7}$$

where p(brl = l, W(E) = 2) is the probability that an error pattern E of weight 2 occurs with burst error run length l, and  $\sum_{j, W(E_{l,j}) = 2} p(brl = l, E_{l,j})$  stands for the probability of brl = l summed over all error patterns  $E_{l,j}$  of weight 2.

Similarly, p(W(E) = w) can be derived as follows.

$$\begin{aligned}
p(W(E) = w) &\approx p(\text{burst error of } w \text{ bits}) \\
&\approx \sum_{l=w}^{brl_{\max}} p(brl = l, W(E) = w) \\
&= \sum_{l=w}^{brl_{\max}} n \cdot p_1 \cdot \Big( \sum_{j, W(E_{l,j}) = w} p(brl = l, E_{l,j}) \Big) \cdot (1 - p_1)^{n - brl_{\max} - l}
\end{aligned} \tag{8}$$

Now, from Eq. (8) the bit error rate (BER) with effect of DFE error propagation can be calculated as

$$BER = p_{bit} = \frac{\sum_{w=1}^{\infty} p(W(E) = w) \cdot W(E)}{n} \tag{9}$$

By replacing p(W(E) = w) with Eq. (8), Eq. (9) can be rewritten as:

$$BER = \sum_{l=1}^{brl_{\max}} \sum_{all\ E} p(brl = l, E) \cdot W(E) \cdot p_1 \cdot (1 - p_1)^{n - brl_{\max} - l} \tag{10}$$

It is obvious that BER with effect of DFE error propagation is larger than  $p_1$ . For the detailed derivation please see Appendix A.
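The relation between Eqs. (8), (9) and (10) can be checked numerically with a short Python sketch. The pattern probabilities below are toy values, not simulation results from this paper, and the block length n = 5440 bits (one RS(544,514) codeword of 10-bit symbols) is an assumption made only for the example;  $p_1$  is taken from channel A in Table 2.

```python
def p_weight(w, pattern_probs, p1, n, brl_max):
    """Eq. (8): probability that a block of n bits holds a burst of weight w."""
    total = 0.0
    for pattern, prob in pattern_probs.items():
        if sum(pattern) == w:
            l = len(pattern)
            total += n * p1 * prob * (1.0 - p1) ** (n - brl_max - l)
    return total


def ber_eq9(pattern_probs, p1, n, brl_max):
    """Eq. (9): weight-averaged error count per block divided by n."""
    max_w = max(sum(pattern) for pattern in pattern_probs)
    numerator = sum(p_weight(w, pattern_probs, p1, n, brl_max) * w
                    for w in range(1, max_w + 1))
    return numerator / n


def ber_eq10(pattern_probs, p1, n, brl_max):
    """Eq. (10): the same quantity regrouped by burst run length."""
    return sum(prob * sum(pattern) * p1 * (1.0 - p1) ** (n - brl_max - len(pattern))
               for pattern, prob in pattern_probs.items())


# Toy values of p(brl = l, E), keyed by error pattern (assumption).
toy_probs = {(1,): 0.75, (1, 1): 0.1875, (1, 0, 1): 0.0, (1, 1, 1): 0.046875}
p1, n, brl_max = 2e-6, 5440, 3

print(ber_eq9(toy_probs, p1, n, brl_max), ber_eq10(toy_probs, p1, n, brl_max))
```

Both functions return the same value, and for these toy inputs the result exceeds  $p_1$  because every burst contributes its full weight rather than a single error.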

## 2.2 BER with FEC coding

From the analysis above, it is known that the link performance is degraded to some extent by DFE error propagation. As a result, RS(544,514), which can correct a single burst error of up to 141 bits and provides a burst coding gain of 6.64 dB at a BER of  $10^{-15}$ <sup>[10]</sup>, has been recommended for 400 GbE applications<sup>[11]</sup>.
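The 141-bit figure follows from the symbol-level correction capability of the code. The sketch below assumes 10-bit RS symbols and the standard bound that an RS(n, k) code corrects up to (n - k)/2 symbol errors; the function names are illustrative only.

```python
def correctable_symbols(n_symbols=544, k_symbols=514):
    """Symbol errors an RS(n, k) code can correct: t = (n - k) / 2."""
    return (n_symbols - k_symbols) // 2


def max_symbols_hit(burst_bits, symbol_bits=10):
    """Worst-case number of symbols touched by a contiguous burst of bits.

    The worst case starts on the last bit of a symbol, so the burst covers
    1 bit of the first symbol and the remaining bits spill into later ones.
    """
    if burst_bits <= 0:
        return 0
    return (burst_bits + symbol_bits - 2) // symbol_bits + 1


t = correctable_symbols()
longest = max(b for b in range(1, 400) if max_symbols_hit(b) <= t)
print(t, longest)
```

Under these assumptions a 141-bit burst touches at most 15 symbols, which is exactly the correction capability, while a 142-bit burst can touch 16.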

Fig. 7 gives the BER simulation results with and without FEC for the same NRZ electrical link mentioned above. For random errors, RS(544,514) provides a random coding gain of 3.6 dB at a BER of 10<sup>-7</sup>. Compared with the random-error case, however, DFE error propagation degrades the BER performance by about 1.8 dB even though FEC is used. It should also be noted that, at a BER of 10<sup>-7</sup>, the 3-tap FFE structure achieves a performance improvement of 1.5 dB over the 2-tap FFE structure when FEC is employed.



Fig. 7 Simulation results of BER with DFE error propagation using FEC coding

## 2.3 BER improvement using interleaving scheme

Various techniques, such as interleaving<sup>[12]</sup>, can be employed to enhance the FEC coding gain. In fact, some high-performance interleaving schemes, e.g. FOM (FEC orthogonal multiplexing) bit muxing, symbol pre-interleave bit muxing and symbol pre-interleave symbol muxing, have been investigated for the 400 GbE physical layer to relax the layout constraints of the PMA service interface, i.e. 400 GAUI.

Fig. 8 shows a block diagram of the 400 GbE physical layer<sup>[13]</sup>, in which electrical signals pass through the 400 Gb/s eight-lane attachment unit interface (400 GAUI-8) between two PMA layers. FEC interleaving can be realized effectively in the PMA. Fig. 9 gives the details of the three FEC interleaving schemes. In FOM bit muxing, each FEC symbol from a given FEC lane is distributed to 4 sub-lanes in a round-robin manner, and every 2 bits from different FEC lanes are then multiplexed in the PMA. The disadvantage of this scheme is that there are many sub-lanes (= 16) before multiplexing, which may impose tight routing constraints.



Fig. 8 FEC in 400 GbE physical layer



Fig. 9 Block diagram of FEC interleaving scheme

This problem can be solved by another scheme, i.e. pre-interleaving, in which the FEC lane is selected in a round-robin manner and every 16 symbols in the selected lane are then distributed into 16 sub-lanes. For this scheme, there are two multiplexing methods in the PMA: bit muxing and symbol muxing. It can be shown that symbol muxing outperforms bit muxing. For example, consider a 4-bit burst error that occurs around the boundary of two symbols from different lanes (see Fig. 9(b), (c)): the 4 errors hit 4 FEC symbols in bit muxing, but only 2 in symbol muxing.
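The 4-bit burst example can be reproduced with a small Python sketch. It models a single 2:1 multiplexer fed by two lanes of 10-bit symbols, which is a simplification of Fig. 9; the position-to-symbol mappings and the burst start position are assumptions chosen so that the burst straddles a symbol boundary in both cases.

```python
SYMBOL_BITS = 10


def bit_mux_owner(pos):
    """Bit muxing: output bits alternate between lane 0 and lane 1."""
    lane = pos % 2
    symbol = (pos // 2) // SYMBOL_BITS
    return lane, symbol


def symbol_mux_owner(pos):
    """Symbol muxing: output alternates whole 10-bit symbols between lanes."""
    block = pos // SYMBOL_BITS
    lane = block % 2
    symbol = block // 2
    return lane, symbol


def symbols_hit(owner, start, length):
    """Number of distinct (lane, symbol) pairs touched by a contiguous burst."""
    return len({owner(pos) for pos in range(start, start + length)})


burst_start, burst_len = 18, 4  # straddles the boundary at output bit 20
print(symbols_hit(bit_mux_owner, burst_start, burst_len),
      symbols_hit(symbol_mux_owner, burst_start, burst_len))
```

In this model the burst touches two symbols in each lane under bit muxing (four in total) but only two symbols under symbol muxing, in line with the example in the text.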

Fig. 10 compares the BER performance with and without FEC interleaving over the 400 GbE electrical link for the same equalization scheme, i.e. a 5-tap DFE combined with a 3-tap FFE. Among the three interleaving schemes, pre-interleave symbol muxing is the best, with an interleaving gain of 0.48 dB at a BER of 10<sup>-7</sup>. The pre-interleaving schemes outperform the FOM scheme because more codewords from different FEC lanes are interleaved together, so that longer burst errors can be split. Between pre-interleave bit muxing and pre-interleave symbol muxing, the latter reduces the number of symbols hit by the same burst error, as shown in Fig. 9, and therefore makes better use of the FEC error correction capability.



Fig. 10 BER improvement using FEC interleaving
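The way interleaving splits a long burst among codewords can be sketched as follows. This is a generic round-robin symbol interleaver, not the exact lane mapping of Fig. 9; the interleaving depth, the burst length and the per-codeword limit of 15 symbols (the usual (n - k)/2 bound for RS(544,514)) are assumptions for illustration.

```python
from collections import Counter


def errors_per_codeword(burst_start, burst_symbols, depth):
    """Count corrupted symbols per codeword for round-robin interleaving.

    Stream position p carries a symbol of codeword p % depth, so a
    contiguous burst is shared among `depth` codewords.
    """
    hits = Counter()
    for pos in range(burst_start, burst_start + burst_symbols):
        hits[pos % depth] += 1
    return dict(hits)


def correctable(burst_start, burst_symbols, depth, t=15):
    """True if no codeword receives more than t corrupted symbols."""
    return max(errors_per_codeword(burst_start, burst_symbols, depth).values()) <= t


print(errors_per_codeword(5, 40, 1), correctable(5, 40, 1))
print(errors_per_codeword(5, 40, 4), correctable(5, 40, 4))
```

Without interleaving the single codeword sees all 40 corrupted symbols, whereas with a depth of 4 each codeword sees only 10, which stays within the assumed correction limit.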

Table 3 compares the three FEC interleaving schemes in terms of the memory needed to buffer codewords, the number of 2:1 multiplexers and the latency introduced by interleaving. For FOM bit muxing, because of the parallel distribution, at least a 3-symbol (i.e. 30-bit) memory is needed for each FEC lane, giving 120 bits (= 4 × 30) for the 4 lanes. In addition, eight 2:1 bit multiplexers are required. If T is the clock period of an FEC lane, then the clock period of the 16 sub-lanes is 4T, so the total latency of FOM is about 120T (=  $(4-1) \times 10 \times 4T$ ). Similarly, the distribution scheme in pre-interleave bit muxing requires a 1500-bit memory, which is the sum of the 150-bit, 300-bit, 450-bit and 600-bit buffers that hold the 16 symbols from the first to the fourth FEC lane, respectively. Moreover, it takes 600T (=  $(16-1) \times 10 \times 4T$ ) to prepare all 16 symbols before they can be sent to the multiplexer. Finally, pre-interleave symbol muxing needs 10-bit 2:1 multiplexers instead of 1-bit ones to multiplex pairs of parallel 10-bit symbols from two sub-lanes, and an additional 400T of latency for the serial-to-parallel conversion before the multiplexer must be taken into account. Table 3 shows that pre-interleave symbol muxing has the best interleaving performance at the cost of the highest complexity and latency, whereas FOM bit muxing has the lowest cost but also the lowest interleaving gain. Compared with these two schemes, pre-interleave bit muxing provides a good tradeoff between complexity and interleaving gain and is therefore preferable in practical use.

Table 3 Comparisons of FEC interleaving schemes

| Performance              | Fig. 9(a) | Fig. 9(b) | Fig. 9(c)  |
|--------------------------|-----------|-----------|------------|
| Complexity: memory (bit) | 120       | 1500      | 1500       |
| Complexity: 2:1 MUX      | 8 (1-bit) | 8 (1-bit) | 8 (10-bit) |
| Latency                  | 120T      | 600T      | 1000T      |
| Interleaving gain (dB)   | 0.27      | 0.35      | 0.48       |
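The memory and latency entries of Table 3 follow from the arithmetic given in the text, which the Python sketch below simply reproduces. The function names are illustrative, latencies are expressed in units of the FEC lane clock period T, and the per-lane buffer sizes are taken as quoted rather than derived.

```python
SYMBOL_BITS = 10      # bits per FEC symbol
SUBLANE_PERIOD = 4    # sub-lane clock period in units of T


def fom_cost():
    """FOM bit muxing: 3-symbol buffer on each of 4 FEC lanes."""
    memory_bits = 4 * 3 * SYMBOL_BITS
    latency_t = (4 - 1) * SYMBOL_BITS * SUBLANE_PERIOD
    return memory_bits, latency_t


def pre_interleave_cost(symbol_mux=False):
    """Pre-interleave bit muxing, or symbol muxing when symbol_mux is True."""
    # Per-lane buffers of 150, 300, 450 and 600 bits, as quoted in the text.
    memory_bits = sum(150 * lane for lane in range(1, 5))
    latency_t = (16 - 1) * SYMBOL_BITS * SUBLANE_PERIOD
    if symbol_mux:
        latency_t += 400  # extra serial-to-parallel conversion before the MUX
    return memory_bits, latency_t


print(fom_cost(), pre_interleave_cost(), pre_interleave_cost(symbol_mux=True))
```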

# 3 Conclusions

In this paper, the effect of 5-tap DFE error propagation on the BER performance of a 400 GbE electrical link is evaluated using a proposed analytical model for the DFE burst error length distribution, and is then mitigated by MUX-based FEC interleaving. The hardware complexities and latencies of three FEC interleaving methods are also compared. Simulation results show that pre-interleave bit muxing obtains better BER performance than FOM bit muxing and requires lower hardware complexity and latency than pre-interleave symbol muxing, which makes it well suited to the 400 GbE environment. Future work will focus on the circuit implementation of pre-interleave bit muxing.

#### References

- [1] Gustlin M, Nicholl G, Ofelt D. 400 GbE PCS architectural options [EB/OL]. http://grouper.ieee.org/groups/802/3/400GSG/public/13\_07/gustlin\_400\_02\_0713.pdf: IEEE 400 Gb/s Ethernet Study Group, 2017
- [2] Tzimpragos G, Kachris C, Djordjevic I B, et al. A survey on FEC codes for 100 G and beyond optical networks [J]. *IEEE Communications Surveys & Tutorials*, 2016, 18(1): 209-221
- [3] Kocaman N, Ali T, Rao L P, et al. A 3.8 mW/Gbps quad-channel 8.5-13 Gbps serial link with a 5 tap DFE and a 4 tap transmit FFE in 28 nm CMOS [J]. *IEEE Journal of Solid-State Circuits*, 2016, 51(4): 881-892
- [4] Liu C Y, Jin W Y, Healey A. Forward error correction for high-speed SerDes link system of 25-28 Gb/s [C]. In: Proceeding of DesignCon, Santa Clara, USA, 2011
- [5] Tang T, Li Y B, Wang J. Improved methods of decision feedback equalization for error propagation prevention [C]. In: IEEE 9th Conference on Industrial Electronics and Applications, Hangzhou, China, 2014. 1072-1076
- [6] Wang T T, Wang Z F, Wang X Y, et al. Analysis and comparison of FEC schemes for 200 GbE and 400 GbE [J]. *IEEE Communications Standards Magazine*, 2017, 1(1): 24-30
- [7] Han J, Lu Y, Sutardja N, et al. A 60 Gb/s 288 mW NRZ transceiver with adaptive equalization and baud-rate clock and data recovery in 65 nm CMOS technology [C]. In: IEEE Solid-State Circuits Conference, San Francisco, USA, 2017. 112-113
- [8] Bae W, Ju H, Park K, et al. A 6-to-32 Gb/s voltage-mode transmitter with scalable supply, voltage swing, and pre-emphasis in 65 nm CMOS [C]. In: IEEE Asian Solid-State Circuits Conference, Toyama, Japan, 2016. 241-244
- [9] Yao W, Lim J, Zhang J, et al. Design of package BGA pin-out for >25 Gb/s high speed SerDes considering PCB via crosstalk [C]. In: IEEE Symposium on Electromagnetic Compatibility and Signal Integrity, Santa Clara, USA, 2015. 111-116
- [10] Cideciyan R, Ewen J. Transcoding/FEC options and trade-offs for 100 Gb/s backplane and copper cable [EB/OL]. http://www.ieee802.org/3/bj/public/nov11/cideciyan\_01a\_1111.pdf: IEEE 802.3bj Task Force, 2011
- [11] Chagnon M, Lessard S, Plant D V. 336 Gb/s in direct detection below KP4 FEC threshold for intra data center applications [J]. *IEEE Photonics Technology Letters*, 2016, 28(20): 2233-2236
- [12] Chen M, Xiao X, Li X Y, et al. Improved BER performance of real-time DDO-OFDM systems using interleaved Reed-Solomon codes [J]. *IEEE Photonics Technology Letters*, 2016, 28(9): 1014-1017
- [13] Kono M, Kanbe A, Toyoda H, et al. A novel 400-Gb/s (100-Gb/s × 4) physical-layer architecture using low-power technology [J]. *IEICE Transactions on Communications*, 2012, E95.B(11): 3437-3444

**Zhan Yongzheng**, born in 1989. He is currently working toward the Ph. D. degree in Southeast University. He received his M. S. degree from Zhejiang Gongshang University in 2015. His research interests focus on FEC and equalization technology and circuit design for 400 GbE.

# Appendix A

The numerator in Eq. (9) is calculated as follows:

$$\begin{aligned}
\sum_{w=1}^{\infty} p(W(E) = w) \cdot W(E) &= \sum_{w=1}^{\infty} \left( \sum_{l=w}^{brl_{\max}} p(brl = l, W(E) = w) \right) \cdot W(E) \\
&= \sum_{w=1}^{\infty} \left( \sum_{l=w}^{brl_{\max}} n \cdot p_1 \cdot \left( \sum_{j, W(E_{l,j}) = w} p(brl = l, E_{l,j}) \right) \cdot (1 - p_1)^{n - brl_{\max} - l} \right) \cdot w \\
&= \sum_{l=1}^{brl_{\max}} n \cdot p_1 \cdot \left( \sum_{j, W(E_{l,j}) = 1} p(brl = l, E_{l,j}) \right) \cdot (1 - p_1)^{n - brl_{\max} - l} \cdot 1 \\
&\quad + \sum_{l=2}^{brl_{\max}} n \cdot p_1 \cdot \left( \sum_{j, W(E_{l,j}) = 2} p(brl = l, E_{l,j}) \right) \cdot (1 - p_1)^{n - brl_{\max} - l} \cdot 2 \\
&\quad + \sum_{l=3}^{brl_{\max}} n \cdot p_1 \cdot \left( \sum_{j, W(E_{l,j}) = 3} p(brl = l, E_{l,j}) \right) \cdot (1 - p_1)^{n - brl_{\max} - l} \cdot 3 \\
&\quad + \sum_{l=4}^{brl_{\max}} n \cdot p_1 \cdot \left( \sum_{j, W(E_{l,j}) = 4} p(brl = l, E_{l,j}) \right) \cdot (1 - p_1)^{n - brl_{\max} - l} \cdot 4 + \cdots
\end{aligned}$$

Writing out the terms of each weight gives

$$\begin{aligned}
&= n \cdot p_1 \cdot p(brl = 1, E_{1,1}) \cdot (1 - p_1)^{n - brl_{\max} - 1} \cdot 1 \\
&\quad + n \cdot p_1 \cdot p(brl = 2, E_{2,1}) \cdot (1 - p_1)^{n - brl_{\max} - 2} \cdot 2 + n \cdot p_1 \cdot p(brl = 3, E_{3,1}) \cdot (1 - p_1)^{n - brl_{\max} - 3} \cdot 2 \\
&\quad + n \cdot p_1 \cdot p(brl = 4, E_{4,1}) \cdot (1 - p_1)^{n - brl_{\max} - 4} \cdot 2 + \cdots \\
&\quad + n \cdot p_1 \cdot p(brl = 3, E_{3,2}) \cdot (1 - p_1)^{n - brl_{\max} - 3} \cdot 3 + n \cdot p_1 \cdot \sum_{j, W(E_{4,j}) = 3} p(brl = 4, E_{4,j}) \cdot (1 - p_1)^{n - brl_{\max} - 4} \cdot 3 \\
&\quad + n \cdot p_1 \cdot \sum_{j, W(E_{5,j}) = 3} p(brl = 5, E_{5,j}) \cdot (1 - p_1)^{n - brl_{\max} - 5} \cdot 3 + \cdots
\end{aligned}$$

and regrouping the terms by burst run length l instead of by weight gives

$$\begin{aligned}
&= n \cdot p_1 \cdot \sum_{all\ E_{1,j}} p(brl = 1, E_{1,j}) \cdot W(E_{1,j}) \cdot (1 - p_1)^{n - brl_{\max} - 1} \\
&\quad + n \cdot p_1 \cdot \sum_{all\ E_{2,j}} p(brl = 2, E_{2,j}) \cdot W(E_{2,j}) \cdot (1 - p_1)^{n - brl_{\max} - 2} \\
&\quad + n \cdot p_1 \cdot \sum_{all\ E_{3,j}} p(brl = 3, E_{3,j}) \cdot W(E_{3,j}) \cdot (1 - p_1)^{n - brl_{\max} - 3} \\
&\quad + n \cdot p_1 \cdot \sum_{all\ E_{4,j}} p(brl = 4, E_{4,j}) \cdot W(E_{4,j}) \cdot (1 - p_1)^{n - brl_{\max} - 4} + \cdots \\
&= n \cdot p_1 \cdot \sum_{l=1}^{brl_{\max}} \sum_{all\ E} p(brl = l, E) \cdot W(E) \cdot (1 - p_1)^{n - brl_{\max} - l}
\end{aligned}$$

Dividing by n, as in Eq. (9), gives Eq. (10):

$$BER = \sum_{l=1}^{brl_{\max}} \sum_{all\ E} p(brl = l, E) \cdot W(E) \cdot p_1 \cdot (1 - p_1)^{n - brl_{\max} - l}$$