#### **RESEARCH ARTICLE**

# Redundancy-bandwidth scalable techniques for signal-independent element transition rates in high-speed current-steering DACs

Longqiang Lai | Xueqing Li | Huazhong Yang

<sup>1</sup>Department of Electronic Engineering, Tsinghua University, Beijing, China

#### Correspondence

Xueqing Li, Department of Electronic Engineering, Tsinghua University, Beijing, China. Email: xueqingli@tsinghua.edu.cn

#### **Funding information**

This research was supported by the National High Technology Research and Development Program of China ("863" program, Grant No. 2013AA014103)

#### Summary

This paper presents redundancy-bandwidth scalable techniques to deal with the intersymbol interference (ISI) distortions for current-steering digital-to-analog converters (DACs) in high-speed applications. A switching strategy that explores the use of redundant current sources is proposed to realize a signal-independent element transition rate (ETR), i.e. the number of switching activities during the transition between successive sampling clock cycles. With a certain number of redundant current sources, this strategy significantly reduces the ISI distortions without oversampling or signal attenuation, which makes it appealing for high-speed applications. As analyzed in this paper, the number of required redundant current sources scales with the bandwidth requirement of the specific application, leading to three redundancy-bandwidth scalable trade-offs between the cost of redundant current sources and the high-dynamic-range bandwidth. For implementation, we propose a custom-designed decoder, named overlap-controlled data-weighted averaging (OC-DWA). Compared with existing similar-purpose designs, the proposed OC-DWA decoder realizes the current source selection with a simple barrel rotator, which has much lower hardware complexity and energy consumption. Simulations of a DAC with this decoder exhibit an enhanced dynamic range over the entire Nyquist band, which verifies the redundancy-bandwidth scalability of the proposed techniques.

#### **KEYWORDS:**

current-steering, digital-to-analog converters (DACs), element transition rate (ETR), inter-symbol interference (ISI), switching strategy, redundancy-bandwidth scalable

# **1** | **INTRODUCTION**

The current-steering architecture is widely used in high-speed digital-to-analog converters (DACs) because of its high intrinsic switching speed and moderate-matching property. One key performance metric for its high-speed and wideband applications is the dynamic linearity, usually evaluated as the spurious-free dynamic range (SFDR). When the output frequency or the sampling rate goes up, the inter-symbol interference (ISI) errors during the switching transition of current sources become one of the main bottlenecks towards a higher SFDR.<sup>1–11</sup> Techniques used to improve the SFDR in high-speed applications rely on more stringent clocking<sup>1, 2, 12–20</sup> (e.g. half-cycle sampling) or increased area<sup>3–8, 19, 21–24</sup> (e.g. two interleaved sub-DACs). Although the dynamic linearity is increased with these techniques, extra power consumption shortens battery life for edge devices, e.g. fielded software-defined portable radio stations<sup>25</sup> and 5G handset transmitters,<sup>26</sup> where low-power design of high-performance wideband DACs is highly preferred. Meanwhile, the signal bandwidth and the location of the band of interest vary from one application to another,<sup>27–29</sup> which indicates power-performance co-design opportunities for DAC operation optimization based on the specific application requirement on the signal being converted. Therefore, understanding the existing bottlenecks for potential design space extension may enable a new paradigm of power savings.

**FIGURE 1** (A), Illustration of switching activities in a differential-ended current steering DAC during a transition. " $1 \rightarrow 0$ " means that the output of the current source is steered from the positive output to the negative output. (B), Example of the element transition rate (ETR) in a 3-bit current steering DAC with 7 elements. Each box represents a current source. The dashed box (blank box) represents that the corresponding current source is connected to the positive (negative) output [Colour figure can be viewed at wileyonlinelibrary.com]

For a differential-ended current steering DAC, a certain number of current sources are connected to the positive output while the rest of the current sources are connected to the negative output. As depicted in Figure 1, during the transition between successive sampling clock cycles, a certain number of switching activities occur; this number is referred to as the element transition rate (ETR). Prior efforts<sup>1–8, 12–16</sup> to achieve a high dynamic range have revealed that a signal-independent ETR is effective in reducing the distortions caused by the ISI errors. With reduced correlation between the input code and the ETR, these techniques<sup>1–8, 12–16</sup> turn most of the ISI errors into an offset, and thus increase the linearity significantly. The techniques with signal-independent ETRs can be categorized into the quad-switching-based approaches,<sup>5–8, 22</sup> the randomized return-to-zero-based (RZ-based) techniques,<sup>1–4</sup> and the $\Delta\Sigma$-based techniques.<sup>12–16</sup>

From the aspect of switching circuits, the differential quad switching (DQS) techniques<sup>5–8, 22</sup> add two dummy switches. Each of the two dummy switches is connected in parallel with one of the original complementary switches. If the output direction of a current source does not change during the transition, a dummy switching occurs to compensate for the number of switching operations, leading to an increased dynamic range. However, the SFDR improvement from the dummy switching is still limited by the remnant signal dependency at the output, as dummy switching generates a different glitch.

From the aspect of decoding approaches, the randomized return-to-zero (RZ) techniques, including the half-cycle randomized RZ solutions<sup>1, 2</sup> and the time-interleaved randomized RZ solutions,<sup>3, 4</sup> insert a random state during the normal operation and realize signal-independent ETRs during transitions. The half-cycle RZ techniques<sup>1, 2</sup> require no redundant current sources but suffer from an equivalently doubled sampling clock frequency plus attenuated signal power in the first Nyquist band. In contrast, the interleaved RZ techniques<sup>3, 4</sup> do not affect the sampling frequency or signal power but suffer from the doubled number of current sources to build two sub-DACs, and consequently, larger area, power, and mismatches between the current sources. In both RZ techniques, the active randomized operation may also increase the noise floor, and limit their usage in noise-sensitive applications.

Alternatively, the techniques<sup>12–16</sup> used in $\Delta\Sigma$ modulators provide another kind of decoding approach to achieving a constant ETR. While keeping a certain number of current sources from being switched, these techniques can precisely control the number of switching operations, achieving significant improvement of in-band linearity and noise performance in low-speed applications, such as high-resolution audio DACs. However, the range of the controllable ETR in those techniques<sup>12–16</sup> requires restrictions on the minimal oversampling rate (OSR) and the maximal swing in the input digital code even if redundant current sources are used.<sup>15, 16</sup> On the one hand, achieving a high OSR is challenging for high-speed applications. On the other hand, the limitation of the input digital code range even with redundant current sources, as revealed in Sanyal et al,<sup>15, 16</sup> makes this technique less practical in high-speed applications.

In this paper, we explore the use of redundant current sources to realize the signal-independent ETR. With a certain number of redundant current sources, the proposed techniques do not require oversampling or the half-cycle RZ operation, which is favorable for high-speed applications. Different from the concept in the $\Delta\Sigma$-based techniques,<sup>12–16</sup> the proposed strategy places no restriction on either the upper or the lower bound of the input digital code. Notably, for the first time, the required number of redundant current sources for specific applications is derived in this paper, which provides guidance on how much redundancy is sufficient for a given application. The contributions of this paper are as follows.

- We model and analyze the theoretical range of the signal-independent ETR with and without redundant current sources, which, for the first time, provides insight into how many redundant current sources are needed to realize the signal-independent ETR without oversampling or signal attenuation.
- We propose a switching strategy to realize the signal-independent ETR with a certain number of redundant current sources. In addition to the signal-independent ETR, the proposed switching strategy can also address the mismatches by randomly selecting an ETR from a specific range. To keep the switching strategy realizable for specific applications with minimal redundancy, we provide three redundancy-bandwidth scalable trade-offs between the number of redundant current sources and the high-dynamic-range bandwidth. These trade-offs reveal how much redundancy is sufficient for specific applications with different high-dynamic-range bandwidth requirements.
- We also propose a custom-designed decoder, named overlap-controlled data-weighted averaging (OC-DWA), to implement the switching strategy. The OC-DWA decoder utilizes a barrel rotator as the current source selection logic, and its complexity is $(\log_2 N)^2/4$ times lower than that of existing similar-purpose designs, where N represents the number of current sources. Simulations of a 14-bit DAC with the proposed decoder exhibit an enhanced dynamic range over the entire Nyquist band, which verifies the redundancy-bandwidth scalability of the proposed techniques.

In the rest of this paper, Section 2 models and analyzes the ETR with a different number of redundant current sources and provides the boundary of required redundancy to realize the signal-independent ETR. Section 3 presents the proposed switching strategy, the trade-offs between the redundancy and the high-dynamic-range bandwidth, and the comparisons between similar techniques. Section 4 presents the proposed OC-DWA decoder and the comparisons between existing implementations. Section 5 provides the simulation results, and finally, Section 6 concludes this paper.

# 2 | SIGNAL-INDEPENDENT ELEMENT TRANSITION RATES

This section provides modeling and analyses of the ETR in two cases: with and without redundant current sources. For a traditional *P*-bit unary-weighted DAC, the number of current sources, represented as *N*, is  $2^P - 1$ . We use 2R to represent the number of the redundant current sources. We will show that the ETR of a traditional Nyquist DAC without redundant current sources is always dependent on the input digital code. The work of Sanyal et al<sup>16</sup> already reveals a signal-independent ETR that can be achieved with restrictions on the minimum OSR and the maximum signal swing. In this section, we reveal a different signal-independent ETR that can be realized with redundant current sources and without restrictions on the OSR or the signal swing. The required number of redundant current sources to achieve the new signal-independent ETR is analyzed in this section.

## 2.1 | ETR of traditional DACs

In a *P*-bit conventional unary-weighted differential-output DAC, let us use $S_i = [s_i^1 \ s_i^2 \ s_i^3 \ \dots \ s_i^N]^T$ to represent the switching control vector corresponding to the input digital code $D_i$, where $s_i^j$ represents the status of the *j*th current source at the *i*th clock cycle. If $s_i^j = 1$, the *j*th current source is connected to the positive output. If $s_i^j = 0$, the *j*th current source is connected to the negative output. Thus, the control vector $S_i$ obeys



**FIGURE 2** Range of  $K_i$ . (A), The upper bound for  $K_i$ . (B), The lower bound for  $K_i$ . Each row represents a control vector while each box represents an element in the control vector. The box with shaded color represents that the corresponding element equals one [Colour figure can be viewed at wileyonlinelibrary.com]

For a unary-weighted DAC, the number of 1's in  $S_i$  equals the value of  $D_i$ . Then we have

$$D_i = (S_i)^T S_i. (2)$$

Similar to Shui et al<sup>12</sup> and Sanyal et al,<sup>14–16</sup> we monitor both the up $(0 \rightarrow 1)$ and down $(1 \rightarrow 0)$ transitions. The ETR during the transition from the (i - 1)th clock cycle to the *i*th clock cycle, represented as $C_i$, can be calculated as

$$\begin{aligned}
C_i &= (S_i - S_{i-1})^T (S_i - S_{i-1}) \\
&= D_i + D_{i-1} - (S_i)^T S_{i-1} - (S_{i-1})^T S_i \\
&= D_i + D_{i-1} - 2K_i,
\end{aligned} \tag{3}$$

where  $K_i = (S_i)^T S_{i-1} = (S_{i-1})^T S_i$ , and  $K_i$  represents the number of current sources that remain connected to the positive output during the transition from the (i - 1)th clock cycle to the *i*th clock cycle.
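As an illustration of (3), the following Python sketch counts the ETR directly from two control vectors and also through $D_i + D_{i-1} - 2K_i$. The function names and the 7-element example vectors are hypothetical and chosen only for this illustration.

```python
def etr(s_prev, s_curr):
    """Count the elements whose state differs between two control vectors."""
    if len(s_prev) != len(s_curr):
        raise ValueError("control vectors must have equal length")
    return sum(a != b for a, b in zip(s_prev, s_curr))


def etr_from_overlap(s_prev, s_curr):
    """Evaluate C_i = D_i + D_{i-1} - 2*K_i as in (3)."""
    d_prev = sum(s_prev)                              # number of 1's at cycle i-1
    d_curr = sum(s_curr)                              # number of 1's at cycle i
    k = sum(a & b for a, b in zip(s_prev, s_curr))    # 1's kept across the transition
    return d_curr + d_prev - 2 * k


# Hypothetical 3-bit example with N = 7 elements.
s_prev = [1, 1, 1, 0, 0, 0, 0]   # D_{i-1} = 3
s_curr = [0, 1, 1, 1, 1, 0, 0]   # D_i = 4, two 1's are reused (K_i = 2)

print(etr(s_prev, s_curr), etr_from_overlap(s_prev, s_curr))
```

Both functions return the same count for binary vectors, since each element contributes to the ETR exactly when its state changes.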

As depicted in Figure 2A, the maximum value of  $K_i$  can be obtained when all the current sources that are connected to the positive output are still connected to the positive output at adjacent clock cycles. Then we can get the upper bound for  $K_i$ 

$$K_i \le \min(D_i, D_{i-1}). \tag{4}$$

As depicted in Figure 2B, when $D_i + D_{i-1} \le N$, the current sources that are connected to the positive output at adjacent clock cycles can be completely different. In this case, the minimum value of $K_i$ is zero. If $D_i + D_{i-1} > N$, at least $D_i + D_{i-1} - N$ current sources must remain connected to the positive output from the (i - 1)th clock cycle to the *i*th clock cycle, so the minimum value of $K_i$ is $D_i + D_{i-1} - N$. Therefore, the lower bound for $K_i$ is

$$K_i \ge \max(0, D_i + D_{i-1} - N). \tag{5}$$

Combining (3), (4), and (5), we can get the range of  $C_i$ 

$$(D_i + D_{i-1}) - 2\min(D_i, D_{i-1}) \le C_i, \tag{6a}$$

$$C_i \le (D_i + D_{i-1}) - 2\max(0, D_i + D_{i-1} - N). \tag{6b}$$

Assuming that there exists a signal-independent ETR, represented as  $C^{SI}$ , gives

$$\max\{(D_i + D_{i-1}) - 2\min(D_i, D_{i-1})\} = N \le C^{SI}, \tag{7a}$$

$$C^{\text{SI}} \le \min\{(D_i + D_{i-1}) - 2\max(0, D_i + D_{i-1} - N)\} = 0, \tag{7b}$$

where the superscript of  $C^{SI}$  (i.e., "SI") is the acronym of "signal-independent".

Note that the subscript '*i*' in (7a) and (7b) does not have to be the same, because $C^{SI}$ is completely signal-independent. Since $N > 0$, (7a) and (7b) cannot hold at the same time, which means that in a traditional Nyquist DAC where each current source has only two differential switches, the ETR is always signal-dependent.
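The contradiction between (7a) and (7b) can be checked numerically. The sketch below enumerates every pair of adjacent codes for a small DAC and intersects the per-pair ranges of (6a) and (6b). The helper names and the choice $N = 7$ are assumptions made only for this example.

```python
from itertools import product


def etr_range(d_curr, d_prev, n):
    """Return the (lower, upper) bounds of C_i from (6a) and (6b)."""
    lower = d_curr + d_prev - 2 * min(d_curr, d_prev)
    upper = d_curr + d_prev - 2 * max(0, d_curr + d_prev - n)
    return lower, upper


def common_etr_range(n):
    """Intersect the ETR ranges of all code pairs (D_i, D_{i-1}) in 0..N."""
    pairs = list(product(range(n + 1), repeat=2))
    lowest_allowed = max(etr_range(a, b, n)[0] for a, b in pairs)   # (7a)
    highest_allowed = min(etr_range(a, b, n)[1] for a, b in pairs)  # (7b)
    return lowest_allowed, highest_allowed


lo, hi = common_etr_range(7)
print(lo, hi)  # the range is empty whenever lo > hi
```

For $N = 7$ the required lower bound is $N$ and the required upper bound is $0$, so no single ETR satisfies every code pair.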

#### 2.2 | ETR with redundant current sources

With 2*R* redundant current sources added to the traditional DAC, the total number of current sources becomes N + 2R. In order to keep the differential output property of the traditional DAC, *R* of the additional units of current should be connected to the negative output while the other *R* are connected to the positive output. In this way, the differential output of the new DAC is still the same as that of the traditional DAC. Note that the added current sources have the same functionality as the original ones. Therefore, the *R* current sources connected to the negative output or the positive output are selected from the total N + 2R current sources instead of the added 2*R* current sources.

The length of the switching control vector is extended to N + 2R with  $D_i + R$  elements equaling one. In the case with more current sources, we can rewrite (2) as

$$D_i + R = (S_i)^T S_i. \tag{8}$$

Since the number of 1's in the control vector  $S_i$  is  $D_i + R$ , (3) can be rewritten as

$$C_i = D_i + D_{i-1} + 2R - 2K_i. \tag{9}$$

Following the same reasoning used to derive (4) and (5), we can get the new range of $K_i$

$$\max(0, D_i + D_{i-1} - N) \le K_i \le \min(D_i, D_{i-1}) + R. \tag{10}$$

Comparing (10) with (4), we find that the added current sources loosen the upper bound for  $K_i$  and thus the lower bound for  $C_i$ . This feature gives us the chance to find a signal-independent ETR.

In order to find a signal-independent ETR, we represent  $K_i$  as

$$K_i = \frac{1}{2}(D_i + D_{i-1} - M_i), \tag{11}$$

where  $M_i$  represents an undetermined value. Substituting (11) into (9), we get

$$C_i = 2R + M_i. \tag{12}$$

If a signal-independent  $M_i$  exists, there exists a signal-independent  $C_i$ . Combining (10) and (11) gives

$$(D_i + D_{i-1}) - 2\min(D_i, D_{i-1}) - 2R \le M_i, \tag{13a}$$

$$M_i \le (D_i + D_{i-1}) - 2\max(0, D_i + D_{i-1} - N). \tag{13b}$$

Assuming that there exists a signal-independent $M_i$, represented as $M^{SI}$, we can derive the range of $M^{SI}$ in the same way as (7a) and (7b) were obtained. Then we have

$$N - 2R \le M^{\rm SI} \le 0. \tag{14}$$

If  $N \leq 2R$ , (14) holds for all possible input patterns, which means that any  $M_i$  from the set of

$$\left\{ M^{\rm SI} \mid N - 2R \le M^{\rm SI} \le 0 \right\},\tag{15}$$

can guarantee that  $C_i = 2R + M_i$  is signal-independent, as long as (11) is guaranteed.

In summary, when the number of added current sources meets

$$N \le 2R, \tag{16}$$

we can always find a completely signal-independent ETR from the following set

$$\left\{ C^{\mathrm{SI}} \mid N \le C^{\mathrm{SI}} \le 2R \right\}. \tag{17}$$



**FIGURE 3** Basic concept of the proposed switching strategy [Colour figure can be viewed at wileyonlinelibrary.com]

Combining (12) and (13), we can get a more general representation of  $C^{SI}$  as

$$\max\{|D_{i} - D_{i-1}|\} \le C^{\mathrm{SI}} \le \min\{N + 2R - |D_{j} + D_{j-1} - N|\},\tag{18}$$

where the value of the subscript 'i' and 'j' in the lower and upper bound of  $C^{SI}$  can be different.
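To make (17) and (18) concrete, the sketch below evaluates the two bounds of (18) by brute force over all full-swing code pairs. The function name and the example sizes ($N = 7$ with $2R = 8$ or $2R = 4$) are hypothetical and used only for illustration.

```python
from itertools import product


def signal_independent_etr_range(n, two_r):
    """Evaluate the lower and upper bounds of C^SI in (18) over all code pairs."""
    pairs = list(product(range(n + 1), repeat=2))
    lower = max(abs(a - b) for a, b in pairs)
    upper = min(n + two_r - abs(a + b - n) for a, b in pairs)
    return lower, upper


# Enough redundancy (2R >= N): the range N <= C^SI <= 2R of (17) is recovered.
print(signal_independent_etr_range(7, 8))

# Too little redundancy (2R < N): the lower bound exceeds the upper bound.
print(signal_independent_etr_range(7, 4))
```

With full-swing codes the bounds reduce to $N$ and $2R$, which is why a non-empty range requires $N \le 2R$ as stated in (16).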

According to Sanyal et al, <sup>15, 16</sup> the restriction on the swing and the signal-independent ETR in the  $\Delta\Sigma$ -based techniques <sup>12–16</sup> can be represented as

$$C^{\rm SI} \le (D_i + D_{i-1}) \le 2N - C^{\rm SI}. \tag{19}$$

The new range of the signal-independent ETR in (17) is completely different from the range in (19). Comparing (18) and (19), we can see that the restrictions on both the lower bound and the upper bound of input digital codes can be alleviated with redundant current sources in a Nyquist DAC. More importantly, the costs to achieve these two different kinds of  $C^{SI}$  are also different, which will be discussed in Section 3.

# **3** | **PROPOSED SWITCHING STRATEGY**

This section describes the proposed switching strategy to achieve the new signal-independent ETR described in Section 2. In order to reduce the number of redundant current sources for specific applications, the trade-offs between the redundancy and the high-dynamic-range bandwidth are discussed under three common scenarios. The comparisons between similar techniques are also provided in this section.

### 3.1 | Switching strategy with constant or random ETR

The basic concept of this strategy is illustrated in Figure 3. In our proposed switching strategy, the number of added current sources is fixed to be 2R in the design phase. The minimum value of 2R should meet the restriction in (16) for general-purpose applications. However, it can be further reduced under certain conditions, which will be discussed later in this section. The operations of the proposed switching strategy in every clock cycle are as follows.

- 1) We choose  $M^{SI}$  from the range in (15). The selected  $M^{SI}$  can remain constant or vary in different clock cycles.
- 2) We substitute  $D_i$ ,  $D_{i-1}$ , and the selected  $M^{SI}$  into (11) to calculate the value of  $K_i$  which is the number of current sources that remain connected to the positive output during the transition.
- 3) We keep $K_i$ current sources, from the ones that have been already connected to the positive output in the previous clock cycle, connected to the positive output. Meanwhile, we select $D_i + R - K_i$ current sources, from the ones that have been connected to the negative output in the previous clock cycle, to be connected to the positive output. The rest of the current sources are all connected to the negative output. For simplicity, we name this operation as "reuse $K_i$ 1's from $S_{i-1}$" in the rest of this paper.

Note that for some combinations of $D_{i-1}$ and $D_i$, the calculated value of $K_i$ in step 2) is not an integer. For these combinations, we round $K_i$ to an adjacent integer, which shifts the ETR by only one switching operation according to (9); this deviation is negligible when the number of current sources is large.

The analyses in Section 2 reveal that when the chosen  $M^{SI}$  in step 1) belongs to the range in (15), this switching strategy is realizable for any input patterns without any restriction on the swing of the input digital code or the minimum OSR. According to (12), if the chosen  $M^{SI}$  in step 1) remains the same for every clock cycle, the achieved ETR is constant.
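The three steps of one clock cycle can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the OC-DWA decoder: the function name is hypothetical, the elements are picked in index order (no randomization), and a non-integer $K_i$ is rounded down.

```python
def next_control_vector(s_prev, d_curr, r, m):
    """Build S_i from S_{i-1} by reusing K_i ones, following steps 1) to 3).

    s_prev : control vector of length N + 2R holding D_{i-1} + R ones
    d_curr : input code D_i
    r      : half of the number of redundant current sources
    m      : chosen M^SI (assumed to lie in the range of (15))
    """
    d_prev = sum(s_prev) - r
    k = (d_curr + d_prev - m) // 2          # (11), rounded down when not an integer
    ones = [j for j, s in enumerate(s_prev) if s == 1]
    zeros = [j for j, s in enumerate(s_prev) if s == 0]
    new_ones = d_curr + r - k               # elements raised from 0 to 1
    if not (0 <= k <= len(ones) and 0 <= new_ones <= len(zeros)):
        raise ValueError("M is not realizable for this code pair, see (10)")
    s_curr = [0] * len(s_prev)
    for j in ones[:k]:                      # keep K_i elements on the positive output
        s_curr[j] = 1
    for j in zeros[:new_ones]:              # steer D_i + R - K_i elements to positive
        s_curr[j] = 1
    return s_curr


# Hypothetical example: N = 7, 2R = 8, M^SI = 0, so the target ETR is 2R + M = 8.
n, r, m = 7, 4, 0
s_prev = [1] * (3 + r) + [0] * (n + r - 3)   # D_{i-1} = 3
s_curr = next_control_vector(s_prev, 5, r, m)
print(sum(s_curr) - r, sum(a != b for a, b in zip(s_prev, s_curr)))
```

In this example $D_i + D_{i-1} - M^{SI}$ is even, so the resulting ETR equals $2R + M^{SI}$ exactly; for odd values the rounding shifts it by one.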

In addition to the signal-independent ETR, two kinds of random operations can be inserted in step 1) or step 3) to address the mismatch problem. In step 1), we can randomly choose  $M^{SI}$  from the range in (15) for each clock cycle. This random operation can introduce randomness into the static mismatches of current sources and reduce the distortions caused by the static mismatches. Besides the randomized selection of  $M^{SI}$ , we can also introduce randomness into the static mismatches by randomly selecting the current sources in step 3) of our proposed strategy, which is similar to the DEM algorithm in Sanyal et al.<sup>16</sup> Since there usually exists more than one kind of selection for the  $K_i$  current sources in the third step, we can randomly choose them while the ETR is still constant. This random operation is also effective to reduce the distortions caused by the static mismatches. Different from randomly selecting current sources in step 3), randomly selecting  $M^{SI}$  in step 1) can be realized with lower hardware complexity during the selection of current sources, which will be further discussed in Section 4. The performance of these two random operations in dealing with the static mismatches will be provided in Section 5.

#### 3.2 | Redundancy-bandwidth scalability

As analyzed in Section 2, to achieve the completely signal-independent ETR without signal attenuation or high OSR, the required number of total current sources is at least doubled. The increased number of current sources will introduce more area and power consumption, which is the main drawback of all the techniques with redundancy, such as the interleaved RZ techniques.<sup>3, 4</sup> Fortunately, the redundancy can be reduced under two facts. The first one is that the signal-dependent ETR is not the only bottleneck that affects the linearity performance within the entire Nyquist band. For instance, when the output frequency is low, the linearity is usually limited by the static mismatches instead of the ISI distortion. The second one is that, in practice, the input digital signal often contains only a single tone within the first Nyquist band. For narrowband wireless communications, the signals usually occupy only a small part of the entire Nyquist band. Based on the above two facts, we provide three redundancy-bandwidth scalable trade-offs between the redundancy and the high-dynamic-range bandwidth under three common scenarios.

#### 3.2.1 | Trade-off in the single-tone-signal conversion

In this scenario, the input digital signal  $D_i$  can be represented as

$$D_i = (1 + \sin(2\pi f_o i))N/2, \tag{20}$$

where  $f_o$  represents the normalized frequency. Substituting (20) into (13) and combining the items with trigonometric functions give

$$\left|\cos(\pi f_o + 2\pi f_o i)\sin(\pi f_o)\right| N - 2R \le M_i,\tag{21a}$$

$$M_i \le N - N \left| \sin(\pi f_o + 2\pi f_o i) \cos(\pi f_o) \right|. \tag{21b}$$
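The step from (13) to (21) relies on the sum-to-product identities. As a sketch, take the adjacent code pair as $D_{i+1}$ and $D_i$ (an indexing assumption made here so that the phase matches (21); the bounds are later taken over all $i$, so the shift does not affect them). Then (20) gives

$$\begin{aligned}
D_{i+1} + D_i &= N + N\sin(\pi f_o + 2\pi f_o i)\cos(\pi f_o), \\
\left|D_{i+1} - D_i\right| &= N\left|\cos(\pi f_o + 2\pi f_o i)\sin(\pi f_o)\right|.
\end{aligned}$$

Because $(D_{i+1} + D_i) - 2\min(D_{i+1}, D_i) = |D_{i+1} - D_i|$ and $(D_{i+1} + D_i) - 2\max(0, D_{i+1} + D_i - N) = N - |D_{i+1} + D_i - N|$, these two expressions lead directly to (21a) and (21b).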

The signal-independent  $M_i$ , represented as  $M^{SI}$ , should meet the following restrictions

$$\max\left\{\left|\cos(\pi f_o + 2\pi f_o i)\sin(\pi f_o)\right| N - 2R\right\} \le M^{\mathrm{SI}},\tag{22a}$$

$$M^{\mathrm{SI}} \le \min\left\{N - N\left|\sin(\pi f_o + 2\pi f_o i)\cos(\pi f_o)\right|\right\}. \tag{22b}$$



**FIGURE 4** Range of signal-independent  $M_i$  for different number of redundant current sources: (A), 2R = 1.5N, (B), 2R = 1N, (C), 2R = 0.5N, and (D), 2R = 0.43N, where N is 64 [Colour figure can be viewed at wileyonlinelibrary.com]

Note that, since  $M^{SI}$  is signal-independent, the subscript '*i*' in (22a) and (22b) does not have to be the same. Simplifying (22) gives

$$\max\left\{\left|\sin(\pi f_o)\right| N - 2R\right\} \le M^{\text{SI}},\tag{23a}$$

$$M^{\mathrm{SI}} \le \min\left\{N - N\left|\cos(\pi f_o)\right|\right\}. \tag{23b}$$

If we allow  $f_o$  in (23a) and (23b) to be different, we can get the same result as (14) and there is no limitation on the input frequency. In the single-tone-signal conversion, there is only one frequency component in the signal, so we can force  $f_o$  in (23a) and (23b) to be the same. In this scenario, the required number of redundant current sources should meet the following restriction

$$\{\max\{|\sin(\pi f_o)| + |\cos(\pi f_o)|\} - 1\}N \le 2R. \tag{24}$$

Using trigonometric functions, we can simplify (24) and get

$$(\sqrt{2}-1)N \le 2R. \tag{25}$$

From (25), we can get the conclusion that, for the single-tone-signal conversion, the required number of redundant current sources can be reduced to  $(\sqrt{2} - 1)N \approx 0.41N$ , which is lower than the redundancy in the interleaved RZ techniques<sup>3, 4</sup> while  $M^{\text{SI}}$  still exists. Note that for different  $f_o$ , the available selection of  $M^{\text{SI}}$ , in this scenario, is also different. Although the value of the selected  $M^{\text{SI}}$  is frequency-dependent, the selected  $M^{\text{SI}}$  can guarantee that the ETR in the time domain is still signal-independent. The value of  $(\sqrt{2} - 1)N$  is the theoretical lower bound for the number of redundant current sources to ensure that the ETR is signal-independent in the single-tone-signal conversion. Since the number of current sources should be an integer, the actual lower bound for 2R should be a little higher than the calculated value of  $(\sqrt{2} - 1)N$ .
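The bound in (25) can be reproduced numerically from (23). The sketch below evaluates the per-frequency range of $M^{SI}$ and scans the first Nyquist band for the worst case of (24). The function names, the grid resolution, and the example values ($N = 64$, $2R = 32$ or $16$) are assumptions chosen for illustration.

```python
import math


def m_si_range(f_o, n, two_r):
    """Return the (lower, upper) bounds of M^SI at normalized frequency f_o, per (23)."""
    lower = abs(math.sin(math.pi * f_o)) * n - two_r    # (23a)
    upper = n - n * abs(math.cos(math.pi * f_o))        # (23b)
    return lower, upper


def min_redundancy_single_tone(n, steps=1000):
    """Scan f_o over [0, 0.5] and return the smallest 2R satisfying (24)."""
    worst = max(
        abs(math.sin(math.pi * f)) + abs(math.cos(math.pi * f))
        for f in (0.5 * k / steps for k in range(steps + 1))
    )
    return (worst - 1.0) * n


print(min_redundancy_single_tone(64))   # close to (sqrt(2) - 1) * 64
print(m_si_range(0.25, 64, 32))         # non-empty range: lower <= upper
print(m_si_range(0.25, 64, 16))         # empty range: lower > upper
```

The worst case occurs at $f_o = 0.25$, where $|\sin(\pi f_o)| + |\cos(\pi f_o)| = \sqrt{2}$.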

The range of  $M^{SI}$  with different number of redundant current sources is calculated with (23) and depicted in Figure 4. As we can see from Figure 4, with 1.5N or N redundant current sources, there exists at least one constant  $M^{SI}$  which is applicable for every frequency point. Therefore, this  $M^{SI}$  is frequency-independent and signal-independent for arbitrary signals. When the number of redundant current sources decreases to 0.43N, we can still find  $M^{SI}$  for single-tone signals. According to the analyses above, all the  $M^{SI}$  from the shaded area in Figure 4 can ensure that the ETR is signal-independent for single-tone signals with frequencies corresponding to the selected  $M^{SI}$ . The simulation results in Section 5 verify that with 0.5N redundant current sources the SFDR performance can still be guaranteed.

As we can see from Figure 4, when the number of redundant current sources decreases, the lower bound for  $M^{SI}$  increases. Since the ETR equals  $2R + M_i$ , the ETR in this scenario can be lower than N, which also improves the noise performance.

## 3.2.2 | Trade-off in the narrowband-signal conversion

From Figure 4, we also notice that, for each  $M_i$  within the range of (23), there exists a frequency band where the in-band signals can be converted with the same signal-independent ETR. We name this frequency band as the optimum band of the corresponding  $M_i$ . For different redundancies, the optimum band of the same  $M_i$  is usually different. One interesting insight is that the redundancy can be further reduced to lower than  $(\sqrt{2} - 1)N$  if the frequency band of the signal is still in the optimum band of the selected  $M_i$ .
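One way to locate an optimum band numerically is to scan the normalized frequency and keep the points where the chosen $M_i$ satisfies both (23a) and (23b). The sketch below takes this reading of the optimum band as an assumption; the function name, grid resolution, and example values ($M_i = 0$, $N = 64$, $2R = 0.5N$) are hypothetical.

```python
import math


def optimum_band(m, n, two_r, steps=500):
    """Return the grid frequencies in [0, 0.5] where m lies within the range of (23)."""
    band = []
    for k in range(steps + 1):
        f_o = 0.5 * k / steps
        lower = abs(math.sin(math.pi * f_o)) * n - two_r   # (23a)
        upper = n - n * abs(math.cos(math.pi * f_o))       # (23b)
        if lower <= m <= upper:
            band.append(f_o)
    return band


band = optimum_band(m=0, n=64, two_r=32)
print(band[0], band[-1])   # edges of the band found on the grid
```

For $M_i = 0$ the upper bound (23b) never excludes a frequency, so the band is set by $|\sin(\pi f_o)| \le 2R/N$, which for $2R = 0.5N$ ends near $f_o = 1/6$.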



**FIGURE 6** (A) Scalable area consumption from the required redundancy (2R) for multi-tone signals which range from different lower frequency bounds to different upper frequency bounds. (B) Scalable power consumption for multi-tone signals. The power consumption comes from the static current of redundancy and the switching activity ( $C^{SI}$ ). We assume that the ratio between these two sources is 1:4 [Colour figure can be viewed at wileyonlinelibrary.com]

Figure 5 shows the optimum band of  $M_i = 0$  with different redundancies. According to (13a) and (22a), the redundancy affects only the lower bound of  $M^{SI}$ . When the redundancy decreases, the lower bound of  $M^{SI}$  increases, which may reduce the width of the optimum band of the selected  $M_i$ . As illustrated in Figure 5, if the frequency band of the signal lies in the high or low frequency band of the Nyquist band, the redundancy requirement might be further reduced to lower than  $(\sqrt{2} - 1)N$ . Although the redundancy requirement for the signal in the middle frequency band is higher, the required number of current sources is still lower than N, unless the frequency band of the signal occupies the entire Nyquist band.

Based on the concept of the optimum band, Figure 6 shows the scalable area and power consumption for multi-tone signals ranging from different lower frequency bounds to different upper frequency bounds. For example, for the multi-tone signal ranging from $0.08f_o$ to $0.24f_o$, the required number of redundant current sources is 65% of the original number of current sources, while the corresponding power consumption is 66% of the worst case. The worst case happens when 100% redundant current sources are needed and the ETR equals the number of the original current sources. Similarly, for the multi-tone signal ranging from $0.4f_o$ to $0.48f_o$, the area and power consumptions are 31% and 85%, respectively. As we can see from Figure 6, for different bandwidth requirements with signal-independent ETRs, the required redundancy and the corresponding energy are scalable.

**FIGURE 7** Range of the ETR in our proposed strategy,  $\Delta\Sigma$ -based techniques, and randomized half-cycle RZ, where 2R represents the number of redundant current sources in our proposed strategy and *L* represents the maximum ETR in the  $\Delta\Sigma$ -based techniques<sup>12, 15, 16</sup> [Colour figure can be viewed at wileyonlinelibrary.com]

Figure 6 also provides the guidance to determine the redundancy for single-tone signal conversion. For a series of single-tone signals ranging from $0.08f_o$ to $0.24f_o$, the required number of redundant current sources could be reduced to 43%, which is lower than the number required for a multi-tone signal with the same frequency range. Note that the reduced redundancy requirement comes at the cost of a varying $M^{SI}$ for different frequency points, which is applicable for single-tone signals. With the required redundancy illustrated in Figure 6, the strategy can guarantee that the signal within the frequency range could be converted without signal-dependent ETRs. Naturally, the energy and area consumption from redundancy could be further reduced by sacrificing the linearity within the bandwidth, which will be discussed in the next trade-off and verified in the simulation results in Section 5.

#### 3.2.3 | Trade-off in the wideband-signal conversion

For a wideband signal, there usually exist high-frequency components and low-frequency components at the same time. Since the ISI errors are usually not the bottleneck of linearity at low frequencies, we can choose  $M_i$  whose optimum band covers the high-frequency components of the signal. In this way, the redundancy can also be reduced while the overall linearity can still be guaranteed. The performance with 0.5N redundant current sources and the same  $M_i$  is provided in Section 5 to reveal the effect of this trade-off.

To draw a conclusion, based on the signal frequency distribution and bottlenecks of SFDR performance at different frequencies, this subsection provides trade-offs in the single-tone-signal conversion, the narrowband-signal conversion, and the wideband-signal conversion. The provided trade-offs can help to reduce the number of redundant current sources while the overall linearity can still be guaranteed. The discussed trade-offs reveal the required redundancy and the energy consumption for different applications, which makes our switching strategy redundancy-bandwidth scalable. It should be noted that the attenuation of the signal swing could also help to reduce the redundancy. However, for a Nyquist DAC without the  $\Delta\Sigma$  modulation, the attenuation of the signal swing may also reduce the SFDR and the signal-to-noise-and-distortion ratio (SNDR) because of the reduced signal energy.

#### 3.3 | Comparisons with similar techniques

Since the operations of our proposed strategy have some similarities to those of the modified mismatch shaping (MMS) technique<sup>12</sup> and the dynamic element matching (DEM) techniques in Sanyal et al, <sup>15, 16</sup> it is necessary to make a comparison between these similar techniques.

Compared with the techniques of Shui et al<sup>12</sup> and Sanyal et al,<sup>15, 16</sup> the selection range of the signal-independent ETR is different in the proposed strategy, as depicted in Figure 7. In the MMS technique<sup>12</sup> and the DEM techniques,<sup>15, 16</sup> the range of the signal-independent ETR should follow the restriction in (19), which can be easily met in an audio DAC where the signal



**FIGURE 8** (A), Required signal attenuation to achieve the signal-independent ETR with different OSRs at different output frequencies. (B), Simulated SFDR with the same transistor-level model and the same signal-independent ETR at different sampling rates. As the sampling rate goes up, the distortion in analog circuits gets worse, which deteriorates the overall performance [Colour figure can be viewed at wileyonlinelibrary.com]

frequency is low and the high OSR can be realized with reasonable power consumption. However, for a Nyquist DAC operating at gigahertz rates, a high OSR introduces undesired power consumption and strict timing requirements.<sup>10</sup> Reducing the OSR increases the value of the signal-independent ETR, which requires signal attenuation in the  $\Delta\Sigma$ -based techniques.<sup>12, 15, 16</sup> The required signal attenuation to achieve the signal-independent ETR with different OSRs at different output frequencies is illustrated in Figure 8A, where the number of current sources is 64. As depicted in Figure 8A, in order to keep the signal attenuation below 3 dB within the entire Nyquist band, the OSR must be more than five. This means that, compared with a 3-GS/s Nyquist DAC with our proposed strategy, DACs with the  $\Delta\Sigma$ -based techniques<sup>12, 15, 16</sup> require the circuits to operate at more than 15 GS/s to realize the signal-independent ETR. Besides the power consumption and timing requirements, this high operation speed also deteriorates the performance of each switching unit (the current source and the complementary switches), such as the finite output impedance,<sup>3, 24</sup> and thus reduces the linearity. Figure 8B shows the transistor-level simulation result of a 14-bit DAC with the same signal-independent ETR at different operation speeds. As we can see from Figure 8B, when the sampling rate goes up, the SFDR decreases, although the ETR is signal-independent.

In the proposed method, the signal-independent ETR belongs to the range in (17), which requires no extra clock cycles or signal attenuation. Therefore, our strategy is more suitable for high-speed applications. Comparing (18) and (19), we can find that the restrictions on both the upper bound and the lower bound of the input digital code can be relieved with redundancy in the proposed strategy. Increasing N in (19) can certainly relax the restriction on the upper bound of the input digital code. However, as we can see from (19), the lower bound of the input digital code is still restricted to  $C_i/2$. Although changing the offset of the digital signal can still increase the signal swing, this will introduce a non-zero offset into the differential output, which may not be desired in real applications and may deteriorate the common-mode rejection performance of the differential structure.

Notably, we provide the first rigorous analyses of the boundary of the redundancy, as depicted in Figure 6, which is critical for a redundancy-bandwidth scalable method to decide how much redundancy is required for specific applications with different high-dynamic-range bandwidth requirements. Compared with the  $\Delta\Sigma$ -based techniques <sup>12, 15, 16</sup> that provide trade-offs between the bandwidth and the OSR, this paper offers trade-offs between the bandwidth and the redundancy. In a  $\Delta\Sigma$  modulator, the OSR is higher for high frequencies, leading to higher power consumption. However, in the proposed strategy, the redundancy for high frequencies can be as low as that for low frequencies, which helps to reduce the power consumption for high-speed applications.

Another essential difference between our proposed strategy and the technique of Sanyal et al<sup>16</sup> is that we propose to use the randomized ETR to introduce randomness into the static mismatches, while the technique<sup>16</sup> realizes the randomization of static mismatches by randomly selecting the current sources. Note that randomly selecting the ETR also results in randomly selecting the current sources. However, as will be analyzed in Section 4, the randomized ETR in our proposed strategy can be realized more efficiently than the random selection of current sources. Although Sanyal et al<sup>16</sup> also mention




**FIGURE 10** SFDR and SNDR performance comparison. Since a wider bandwidth requires a higher OSR or more signal attenuation in the  $\Delta\Sigma$ -based technique<sup>16</sup> (See Figure 8A), only the results within  $0.15 f_{clk}$  bandwidth are compared, where  $f_{clk}$  represents the sampling rate of the proposal [Colour figure can be viewed at wileyonlinelibrary.com]

randomly selecting the ETR, that randomized operation introduces only a +1 or -1 deviation from the constant ETR, which can hardly introduce effective randomness into the static mismatches in a multi-bit DAC. Besides, the wider the range of the ETR in Sanyal et al<sup>16</sup> is, the more signal attenuation is required. Therefore, randomly selecting the ETR may not be an optimum choice when the technique in Sanyal et al<sup>16</sup> is used.

Our strategy has the best noise performance among the non-oversampling techniques with the signal-independent ETR, such as the randomized return-to-zero (RZ) techniques.<sup>1-4</sup> Meanwhile, it should be noted that, as oversampling and noise shaping are not adopted, the in-band noise performance of our strategy may not be as good as that of the  $\Delta\Sigma$  techniques (see Figure 9), such as the techniques in Shui et al<sup>12</sup> and Sanyal et al.<sup>15, 16</sup> According to Clara<sup>40</sup> and Wikner,<sup>41</sup> the theoretical maximum SNR of an *L*th-order noise shaper is

$$SNR_{\max,L} = 6.02P + 1.76 + (20L + 10)\log_{10}(OSR) - 10\log_{10}\left(\frac{\pi^{2L}}{2L+1}\right) (dB),$$
(26)

which shows that doubling the OSR improves the in-band SNR by nearly 6L + 3 dB. However, at the same operation rate, doubling the OSR also lowers the signal bandwidth by half. In addition, according to Sanyal et al, <sup>16</sup> the higher the max{|NTF(*f*)|} is, the lower the signal swing will be. On the other hand, for the same signal bandwidth, using a higher OSR needs a higher sampling rate and consequently causes more severe distortion degradation due to more nonlinear switching operations. Sometimes these deteriorated distortions may even dominate the SNDR performance, resulting in reduced SNDR improvement by further increasing the OSR, as depicted in Figure 10B. Therefore, this is essentially a trade-off between the signal bandwidth, the noise performance, and the linearity. Figure 10 provides the simulated in-band SFDR and SNDR performance that supports the above comparisons between the proposal and the  $\Delta\Sigma$ -based technique.<sup>16</sup>
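As a quick numerical check of (26), the following Python sketch evaluates the theoretical maximum SNR and the gain obtained by doubling the OSR. The function name `snr_max_db` and the example values (14 bits, first-order shaping, OSR of 8) are illustrative assumptions, and *P* is taken here to be the quantizer resolution in bits.

```python
import math


def snr_max_db(p_bits, order, osr):
    """Theoretical maximum SNR of an L-th order noise shaper, following (26).

    p_bits: P in (26), assumed here to be the quantizer resolution in bits.
    order:  L, the order of the noise shaper.
    osr:    oversampling ratio.
    """
    shaping_gain = (20 * order + 10) * math.log10(osr)
    penalty = 10 * math.log10(math.pi ** (2 * order) / (2 * order + 1))
    return 6.02 * p_bits + 1.76 + shaping_gain - penalty


# Hypothetical example values, not taken from the paper.
P_BITS, ORDER, OSR = 14, 1, 8

gain_db = snr_max_db(P_BITS, ORDER, 2 * OSR) - snr_max_db(P_BITS, ORDER, OSR)
print(f"SNR gain from doubling the OSR (L = {ORDER}): {gain_db:.2f} dB")
print(f"Rule of thumb 6L + 3: {6 * ORDER + 3} dB")
```

Only the OSR-dependent term changes when the OSR is doubled, so the gain is independent of the chosen resolution and of the starting OSR.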

Compared with the area and power consumption of a traditional Nyquist DAC, the consumption from the redundancy is another drawback of the proposed method. However, compared with the techniques in Shui et al<sup>12</sup> and Sanyal et al,<sup>15, 16</sup> the reduced operation speed and the simplified implementation which will be introduced in Section 4 guarantee the advantage of the



**FIGURE 11** (A), Basic concept of the proposed decoder. (B), Illustration of the calculation of  $H_n$  [Colour figure can be viewed at wileyonlinelibrary.com]

proposed techniques in high-speed applications. Besides, the redundancy-bandwidth scalability of the proposed strategy also helps to reduce the area and power consumption.

## 4 | PROPOSED OC-DWA DECODER

This section presents a decoder, named OC-DWA, to implement our proposed switching strategy. This decoder can achieve the signal-independent ETR without oversampling or signal attenuation. Unlike the vector-quantizer-based (VQ-based) structures in the  $\Delta\Sigma$ -based techniques, <sup>12–16</sup> this decoder also needs no complex sorting. Due to these features, the proposed OC-DWA decoder can be realized more efficiently, which makes it more appealing for high-speed applications. Considering the cost from redundant current sources, we provide comprehensive comparisons on the area and power consumption between the DAC with the proposed OC-DWA decoder and the  $\Delta\Sigma$  DAC with the VQ-based DEM implementation in Sanyal et al.<sup>16</sup>

#### 4.1 | Structure and operation principle

Figure 11A shows the basic concept of the OC-DWA decoder. In our proposed decoder, the decoding operation is similar to that of a data-weighted averaging (DWA) decoder, except that there are redundant control signals for the redundant current sources and that the start point of the 1's in the control vector is calculated differently. In the original DWA decoder, the start point of the 1's in the next clock cycle is the end point of the 1's in the previous clock cycle. With this feature, the DWA decoder cannot achieve the signal-independent ETR, as it cannot guarantee (11). In order to reuse  $K_i$  1's from  $S_i$, as mentioned in step 3) of our proposed strategy, we modify the calculation of the start point in the traditional DWA decoder to meet the requirement in (11). Since the number of the overlapped 1's is controlled to meet (11), we call this decoder the overlap-controlled DWA, abbreviated as OC-DWA.

In our proposed decoder, four registers  $(H_p, H_n, E_p, \text{ and } E_n)$  are used to record the information of the control vector.  $H_p$ and  $E_p$  record the start and end points of the 1's in the previous control vector, respectively, while  $H_n$  and  $E_n$  record the same information for the next control vector. As depicted in Figure 11B, in order to reuse  $K_i$  1's in the previous control vector  $S_i$, the start point of 1's in  $S_{i+1}$  is  $E_p - K_i + 1$, which is equivalent to  $D_{i-1} + R - K_i + H_p$. Due to the finite number of current sources, the mod operation, which can be realized with a barrel rotator, is used in the calculation of  $H_n$. Therefore, we have

$$\begin{aligned} H_n &= (E_p - K_i + 1) \bmod (N + 2R) \\ &= (D_{i-1} + R - K_i + H_p) \bmod (N + 2R). \end{aligned} \tag{27}$$

Figure 12 provides an operation example for a 3-bit OC-DWA decoder. In this example, the number of redundant current sources is eight and the selected  $M^{SI}$  is zero. The filled boxes represent those current sources connected to the positive output, and the corresponding element in the control vector equals one. At the (i - 2)th clock cycle, the input digital code is two and



**FIGURE 12** Operation example of a 3-bit OC-DWA decoder [Colour figure can be viewed at wileyonlinelibrary.com]

the number of 1's in  $S_{i-2}$  is six from 2 + 8/2. For the transition from  $D_{i-2}$  to  $D_{i-1}$ , the calculated value of  $K_{i-1}$  is four, and the value of  $E_p$  is six. To realize the calculated  $K_{i-1}$ , we use (27) to calculate the start point of 1's in  $S_{i-1}$ . As we can see from the second row in Figure 12, it is three from 6 - 4 + 1 or 2 + 4 - 4 + 1. Once the start point of 1's for  $S_{i-1}$  is obtained, the barrel rotator in Figure 11A can realize the selection of current sources with a certain number of cyclic-shift steps. In Figure 12, the number of blue arrows represents the ETR, which is configured to be eight. Note that from  $D_i$  to  $D_{i+1}$  or  $D_{i+1}$  to  $D_{i+2}$ , the calculated value of  $K_{i+1}$  or  $K_{i+2}$  is 3.5 and the nearest integer, three or four, is used. As depicted in Figure 12, the OC-DWA decoder ensures a constant ETR, neglecting the small variation due to the non-integer  $K_i$. Although the ETR deviates by +1 or -1 due to the approximation of the non-integer  $K_i$, this deviation is negligible for a large N.
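The start-point update in (27) and the first transition of this example can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the helper names are hypothetical, *N* = 7 unary elements is assumed for the 3-bit case, positions wrap modulo *N* + 2*R*, and the next input code of six is a hypothetical value chosen so that the number of switched elements matches the configured ETR of eight.

```python
def next_head(h_p, d_prev, r, k, total):
    """Start point of the 1's in the next control vector, following (27)."""
    return (d_prev + r - k + h_p) % total


def control_vector(head, ones, total):
    """Cyclically rotate a block of `ones` 1's so that it starts at `head`."""
    vec = [0] * total
    for j in range(ones):
        vec[(head + j) % total] = 1
    return vec


def count_transitions(prev_vec, next_vec):
    """Number of elements that switch between two control vectors (the ETR)."""
    return sum(a != b for a, b in zip(prev_vec, next_vec))


# Assumed setup: N = 7 elements (3-bit), 2R = 8 redundant sources.
N, R = 7, 4
TOTAL = N + 2 * R

h_p, d_prev = 1, 2   # previous start point and input code, as in the text
k = 4                # number of reused 1's, as in the text
d_next = 6           # hypothetical next code (not given in the text)

s_prev = control_vector(h_p, d_prev + R, TOTAL)   # six 1's
h_n = next_head(h_p, d_prev, R, k, TOTAL)
s_next = control_vector(h_n, d_next + R, TOTAL)   # ten 1's

print("next start point:", h_n)
print("switched elements:", count_transitions(s_prev, s_next))
```

With these values the new block of 1's starts four positions before the end of the previous block, so two elements turn off and six turn on.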

According to Section 3, the switching strategy can also achieve a random ETR by randomly selecting  $M^{SI}$  from the range in (15) or (22). The random selection of  $M^{SI}$  can be realized with a pseudo-random number generator (PRNG).<sup>3</sup> As analyzed in Section 3, this random operation can also help to deal with the static mismatches of current sources. The simulation results in Section 5 will verify the effectiveness of this decoder to deal with the ISI errors and the static mismatches.

#### 4.2 | Hardware comparison and redundancy-bandwidth scalable configurations

Figure 13 shows the architecture of a  $\Delta\Sigma$  DAC with the DEM technique in Sanyal et al<sup>16</sup> and the standard vector quantizer diagram where a sorter is required.<sup>15, 16</sup> The main difference between the implementation of the techniques in Sanyal et al<sup>15, 16</sup> and the OC-DWA decoder is the current-source selection logic, i.e. the sorter and the barrel rotator, which are the most complex parts of the respective decoding circuits. The smallest-area sorter can be built with only one serial comparison-exchange module, at the cost of more comparison and exchange steps. According to Thompson et al,<sup>30</sup> the area-time complexity of sorting can be evaluated by the area\*time<sup>2</sup> metric. In order to make a fair comparison, both the "area" and "time" are normalized with the basic operation unit and time unit.<sup>30</sup> Although there exist many VLSI solutions for a complete sorting, the area\*time<sup>2</sup> metric has a lower bound of  $\Omega^1(N^2 \log_2 N)$ .<sup>30</sup> In the OC-DWA decoder, the area\*time<sup>2</sup> metric of the barrel rotator is  $\Omega((\log_2 N)^2 N)$ . For the same number of current sources, the area-time complexity of the current-source selection logic of the OC-DWA decoder is therefore  $N/\log_2 N$  times lower than that of the VQ-based implementations.<sup>15, 16</sup>

For high-speed applications, the sorter is usually realized with pipelined comparators, each of which has two input words and two output words.<sup>30–33</sup> According to Yasuda et al,<sup>32</sup>  $N\log_2 N(\log_2 N + 1)/4$  comparators are required to realize a fast complete sorting. In the OC-DWA decoder, the sorter is replaced by the barrel rotator, which needs only  $N\log_2 N$  multiplexers, each of which has two 1-bit inputs and one 1-bit output. We assume that the hardware complexity of a comparator with  $\log_2 N$ -bit input width is  $\log_2 N$  times that of the multiplexer used in the OC-DWA decoder. Then, for the same number of current sources, the hardware complexity of the current-source selection logic is nearly  $(\log_2 N)^2/4$  times lower than that of the existing VQ-based implementations.<sup>12–16</sup> This is not a precise comparison, since we do not take the consumption of the PRNG in the  $M^{SI}$  generator and the  $H_n$  calculator into account. This is because they are not the dominating consumption and are usually more cost-friendly than the comparator and the loop filter depicted in Figure 13. Figure 14 shows an example of a vector quantizer and the barrel rotator with seven current sources (3-bit). As we can see from Figure 14, the complexity of the proposed OC-DWA

<sup>1</sup> $\Omega$ represents the asymptotic lower bound.<sup>30</sup>



**FIGURE 13** Architecture of a  $\Delta\Sigma$  DAC with the DEM technique in Sanyal et al<sup>16</sup> [Colour figure can be viewed at wileyonlinelibrary.com]

**FIGURE 14** (A), Example of a 3-bit barrel rotator with 21 2-1 multiplexers, each of which has 1-bit input width.<sup>1, 4</sup> (B), Example of a 3-bit pipelined sorter with 16 2-2 comparators, each of which has 3-bit input width.<sup>31</sup> A 2-2 comparator with 3-bit input width requires at least three times the area of a 2-1 multiplexer with 1-bit width. Therefore, the area of the 3-bit sorter is at least two times that of the 3-bit barrel rotator

decoder is indeed much simpler than that of the VQ-based implementations. The low complexity of digital circuits makes the proposed OC-DWA decoder more suitable for high-speed applications. The complexity reduction approaches with the splitter in Sanyal et al <sup>16</sup> and Galton <sup>34</sup> can also be used to further reduce the complexity of the rotator in the OC-DWA decoder.
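The scaling of the two comparisons above can be made concrete with a short Python sketch. It simply evaluates the expressions quoted in this subsection for a few power-of-two values of *N*; the function names are hypothetical, and the weighting of one comparator as $\log_2 N$ multiplexers is the assumption stated in the text rather than a measured cost.

```python
import math


def sorter_comparators(n):
    """Comparators for a fast complete sort: N*log2(N)*(log2(N) + 1)/4."""
    lg = math.log2(n)
    return n * lg * (lg + 1) / 4


def rotator_multiplexers(n):
    """1-bit 2-to-1 multiplexers in a barrel rotator: N*log2(N)."""
    return n * math.log2(n)


def selection_logic_ratio(n):
    """Sorter cost over rotator cost, counting one comparator as
    log2(N) multiplexers (assumption taken from the text)."""
    return sorter_comparators(n) * math.log2(n) / rotator_multiplexers(n)


def area_time2_ratio(n):
    """Ratio of the quoted area*time^2 bounds: N^2*log2(N) / (N*log2(N)^2)."""
    lg = math.log2(n)
    return (n ** 2 * lg) / (n * lg ** 2)


for n in (8, 64, 256):
    print(n, selection_logic_ratio(n), math.log2(n) ** 2 / 4, area_time2_ratio(n))
```

The exact hardware ratio under these assumptions is $\log_2 N(\log_2 N + 1)/4$, which approaches the quoted $(\log_2 N)^2/4$ as *N* grows. The small Figure 14 example uses seven elements, so its multiplexer and comparator counts do not follow these power-of-two expressions exactly.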

It should be noted that the above comparisons are based on the same number of current sources. Since the proposed strategy requires a certain number of redundant current sources and the  $\Delta\Sigma$  DAC may use fewer current sources than a traditional Nyquist DAC, the overall area consumption of the DAC with the OC-DWA decoder may not be lower than that of the  $\Delta\Sigma$  DAC with the DEM technique in Sanyal et al.<sup>16</sup> In order to clarify the advantage of the proposed OC-DWA decoder, we provide comprehensive comparisons of the overall area and power consumption of the DACs with these two techniques. The comparisons are based on two basic configurations of the proposed OC-DWA decoder.

#### 4.2.1 | Configuration with the highest dynamic range performance

In this configuration, the OC-DWA decoder operates at the normal sampling rate and the input of the OC-DWA decoder is the input digital code of the DAC. Although the area cost of this configuration is the worst case, the low operation speed offers the highest dynamic performance. The analog area of the DAC with the OC-DWA decoder is definitely larger than that of the  $\Delta\Sigma$  DAC with the techniques in Shui et al <sup>12</sup> and Sanyal et al. <sup>15, 16</sup> However, the digital area of the DAC with the OC-DWA decoder



**FIGURE 15** Normalized total area against the number of current sources at different ratios of the analog area to the digital area (represented as  $\eta$ ) [Colour figure can be viewed at wileyonlinelibrary.com]

can be lower. Thus, the overall area comparison is dependent on the truncation and the ratio between the analog area and the digital area in the  $\Delta\Sigma$  DAC. In order to provide a numerical comparison, we define a parameter, represented as  $\eta$ . The value of  $\eta$  equals the ratio of the analog area to the digital area in the  $\Delta\Sigma$  DAC. We assume that the area of digital circuits in a  $\Delta\Sigma$  DAC is proportional to  $N\log_2 N(\log_2 N + 1)/4$ . Figure 15 shows the total area consumption of the DAC with the OC-DWA decoder and the  $\Delta\Sigma$  DAC with the VQ-based DEM technique<sup>16</sup> under different  $\eta$ . In Figure 15, the analog area of the DAC with the OC-DWA decoder is assumed to be doubled, which is the worst case of the proposed strategy. Besides, no redundancy is added into the  $\Delta\Sigma$  DAC. As we can see from Figure 15, when  $\eta$  is lower than 0.6, the total area could be smaller with our techniques. Although this low  $\eta$  may require calibration to address the mismatch problem, the reduced area of analog circuits also helps to reduce parasitic capacitance, which is widely used in high-speed DACs.<sup>8-10, 35-37</sup>

As for the power consumption, the reduced operation rate reduces the dynamic power from digital circuits, while the redundancy increases the static power of the DAC with the OC-DWA decoder. For high-speed DACs, the dynamic power is usually much higher than the static power. In order to provide a numerical comparison, we define another parameter,  $\beta$, as

$$\beta = \frac{P_{\text{swing}}}{P_{\text{dynamic}} + P_{\text{static}}},$$
(28)

where  $P_{dynamic}$  and  $P_{static}$  are the dynamic power and the static power of the  $\Delta\Sigma$  DAC with the technique in Sanyal et al,<sup>16</sup> and  $P_{swing}$  is the product of the output current swing and the analog power supply. Ignoring the advantage of the digital hardware reduction of the OC-DWA decoder, we approximately calculate the power of the DAC with the OC-DWA decoder as

$$P_{\text{OC-DWA}} = \left[\frac{P_{\text{dynamic}}}{\text{OSR}} + P_{\text{static}}\right] * 2 + P_{\text{swing}}.$$
(29)

In (29), twice the number of current sources is used in the DAC with the OC-DWA decoder, which introduces an extra  $P_{swing}$ . Combining (28) and (29) gives

$$P_{\text{OC-DWA}} = \left(\frac{2}{\text{OSR}} + \beta\right) P_{\text{dynamic}} + (2 + \beta) P_{\text{static}}.$$
(30)

For a high-speed CMOS current-steering DAC, the static power mainly comes from the leakage of transistors, the bandgap reference, the output current, the bias circuit, etc. Typically, the static power is dominated by the output current.<sup>38</sup> Therefore, we assume that the static power equals the value of  $P_{swing}$ . In the state-of-the-art high-speed designs<sup>9, 10, 19</sup> that utilize  $\Delta\Sigma$  modulation, the value of  $\beta$  is less than 0.2, while the ratio of  $P_{static}$  to  $P_{dynamic}$  is less than 0.3. Let us assume that the value of  $\beta$  equals 0.2 while the ratio of  $P_{static}$  to  $P_{dynamic}$  is 0.3. Then we have

$$P_{\text{OC-DWA}} = \left(\frac{2}{\text{OSR}} + 0.86\right) P_{\text{dynamic}},\tag{31a}$$

$$P_{\rm VQ-based} = 1.3 P_{\rm dynamic},\tag{31b}$$

where  $P_{\text{VQ-based}}$  is the power consumption of the  $\Delta\Sigma$  DAC with the technique in Sanyal et al.<sup>16</sup> According to the analyses in Section 3, for a  $\Delta\Sigma$  DAC with the technique in Sanyal et al,<sup>16</sup> the required OSR is more than five to keep the signal swing attenuation below 3 dB. Even if the OSR is as low as five, the value of  $P_{\text{OC-DWA}}$  is still lower than that of  $P_{\text{VQ-based}}$ . Therefore,



**FIGURE 16** (A), Simulated ETR at a low frequency point. (B), Simulated ETR at a high frequency point. (C), Energy of the highest harmonic in the FFT of the ETR versus the signal frequency [Colour figure can be viewed at wileyonlinelibrary.com]

when used in high-speed applications, the DAC with the OC-DWA decoder under this configuration is more promising with lower energy consumption.
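The power comparison in (30) and (31) can be reproduced with a small Python sketch. All powers are normalized to $P_{dynamic}$; the function names are hypothetical, the default values are the $\beta = 0.2$ and static-to-dynamic ratio of 0.3 assumed in the text, and the break-even OSR is a quantity derived here for illustration rather than a figure stated above.

```python
def p_oc_dwa(osr, beta=0.2, static_ratio=0.3):
    """Power of the DAC with the OC-DWA decoder from (30),
    in units of P_dynamic. static_ratio = P_static / P_dynamic."""
    return (2 / osr + beta) + (2 + beta) * static_ratio


def p_vq_based(static_ratio=0.3):
    """Power of the VQ-based delta-sigma DAC, P_dynamic + P_static,
    in units of P_dynamic."""
    return 1 + static_ratio


def break_even_osr(beta=0.2, static_ratio=0.3):
    """OSR at which both models predict the same power (derived here)."""
    return 2 / (p_vq_based(static_ratio) - beta - (2 + beta) * static_ratio)


for osr in (4, 5, 8, 16):
    print(osr, round(p_oc_dwa(osr), 3), p_vq_based())
print("break-even OSR:", round(break_even_osr(), 2))
```

Under these assumptions the OC-DWA model falls below the VQ-based model once the OSR required by the $\Delta\Sigma$ DAC exceeds roughly 4.5, which is consistent with the OSR of more than five discussed above.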

#### 4.2.2 | Configuration with the lowest area consumption

Combining the  $\Delta\Sigma$  modulator in Figure 13 with the proposed OC-DWA decoder, we can realize a DAC with the lowest area consumption. In this configuration, the input digital code of the DAC is first fed into the  $\Delta\Sigma$  modulator. The input of the OC-DWA decoder is the output of the  $\Delta\Sigma$  modulator. Although the oversampling operation degrades the performance of the analog circuits, for example by leaving less settling time for the current sources, this configuration allows the OC-DWA decoder to use the same number of current sources as the  $\Delta\Sigma$ -based techniques.<sup>12–16</sup> In this configuration, the simplified current-source selection logic (the barrel rotator) further reduces the area and energy consumption. However, the benefit from area and energy reduction comes at the cost of reduced randomness, since the range of the signal-independent ETR is reduced. Therefore, calibration might be required in this configuration to address the mismatch problem.

To draw a conclusion, this section has presented an area- and power-efficient implementation of the proposed strategy. Although the redundancy of the proposed strategy introduces extra area and power consumption, the simplified digital circuits and the reduced operation speed guarantee the overall performance of the DAC with our proposed techniques. Comprehensive comparisons between the OC-DWA decoder and the VQ-based implementations<sup>12–16</sup> verify the advantages of our proposed techniques in high-speed applications over the signal-independent switching techniques<sup>12–16</sup> used in  $\Delta\Sigma$  DACs.

## 5 | SIMULATION RESULTS

This section provides the simulation results of the proposed switching strategy. The performance comparisons with some similar techniques are further discussed in this section.

Figure 16A and Figure 16B show the simulated ETR of a 12-bit DAC using the proposed switching strategy. In this simulation, the digital data are the samples of a sine signal and the number of sampling points is 1024. Figure 16C shows the energy of the highest harmonic in the FFT of the ETR from low frequencies to high frequencies. If the ETR is signal-dependent, the FFT of the ETR will contain significant signal tones. As depicted in Figure 16A and Figure 16B, the proposed switching strategy has a signal-independent ETR, while the ETRs of the other three decoding methods are signal-dependent. Therefore, as shown in Figure 16C, the proposed switching strategy yields much lower harmonic energy than the others.
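The reasoning that a signal-dependent ETR leaves signal-related tones in its FFT can be illustrated with a small Python sketch. It compares plain thermometer decoding, where the number of switched elements equals the magnitude of the code step, with an idealized constant ETR. The sample count (256 rather than 1024), the seven-cycle sine, the naive DFT, and the constant value are assumptions chosen to keep the example short; the sketch does not reproduce Figure 16.

```python
import cmath
import math


def thermometer_etr(codes):
    """ETR of plain thermometer decoding: switched elements = |code step|."""
    return [abs(b - a) for a, b in zip(codes, codes[1:])]


def largest_ac_bin(seq):
    """Magnitude of the largest non-DC DFT bin of seq (naive O(n^2) DFT)."""
    n = len(seq)
    best = 0.0
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * m / n)
                  for m, x in enumerate(seq))
        best = max(best, abs(acc) / n)
    return best


# Assumed test signal: coherent full-scale sine for a 12-bit code range.
N_SAMPLES, N_BITS, CYCLES = 256, 12, 7
FULL_SCALE = 2 ** N_BITS - 1
codes = [round(FULL_SCALE / 2 * (1 + math.sin(2 * math.pi * CYCLES * m / N_SAMPLES)))
         for m in range(N_SAMPLES + 1)]

etr_thermo = thermometer_etr(codes)          # follows the signal slope
etr_const = [FULL_SCALE // 2] * N_SAMPLES    # idealized constant ETR

print("thermometer, largest AC bin:", largest_ac_bin(etr_thermo))
print("constant ETR, largest AC bin:", largest_ac_bin(etr_const))
```

The thermometer ETR is a rectified version of the signal slope, so its spectrum carries strong components at even multiples of the signal frequency, while the constant sequence has essentially no energy outside DC.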

Simulations in the rest of this section are based on the Spectre model of the transistor in a 65-nm CMOS process. The simulated DAC is constructed with transistors and has 14-bit resolution with 6 unary-weighted most-significant bits (MSBs), 4 unary-weighted upper-least-significant bits (ULSBs), and 4 binary-weighted least-significant bits (LSBs). The switching strategy is applied to the MSBs and ULSBs.

Figure 17 shows the simulated SFDR and SNDR of a 14-bit 1-GS/s DAC with different techniques. Compared with the traditional DAC, the total number of current sources in our proposed strategy is doubled and the ETR is configured to be constant



**FIGURE 18** (A), Values of the selected varying  $M_i$ . (B), Simulated SFDR of the 14-bit 1-GS/s DAC using our proposed switching strategy with 0.5*N* redundant current sources and varying  $M_i$  [Colour figure can be viewed at wileyonlinelibrary.com]

in this simulation. To reveal the impact of the signal-dependent ETR on SFDR, we do not add static mismatches or timing mismatches. As depicted in Figure 17, due to the ISI distortion, the linearity of thermometer decoding drops rapidly when the frequency goes up. The signal-independent ETR significantly improves the SFDR. Among the compared techniques with signal-independent ETRs used in Nyquist DACs, the proposed switching strategy shows a higher SNDR than the other techniques. This is because the randomized operations in the time-relaxed interleaving digital-random-return-to-zero (TRI-DRRZ)<sup>3</sup> and the time-relaxed interleaving DEMRZ (TRI-DEMRZ)<sup>4</sup> usually increase the noise floor. Therefore, the SNDR of these two techniques is lower than that of our proposed switching strategy with a constant ETR. At low output frequencies where the ISI distortion is not the bottleneck, the increased noise floor may also affect the SFDR. As we can see from Figure 17, the SFDR of the DAC with the proposed strategy is higher than that of the DAC with TRI-DEMRZ and TRI-DRRZ at low output frequencies.

Figure 18 shows the simulation results of a 14-bit 1-GS/s DAC using our proposed switching strategy with fewer redundant current sources and varying  $M_i$ . In this simulation, the number of redundant current sources is 0.5N. Three different sets of varying  $M_i$  are simulated separately. The three sets of  $M_i$  for MSBs are depicted in Figure 18A. Note that the reduced number of redundant current sources reduces the dc offset of the single output and increases the voltage headroom for stacked transistors. In this simulation, we adjust the voltage for the case with N redundant current sources and constant  $M_i$  to keep the voltage headroom for these two different cases the same. In this way, the glitches in these two cases are also the same and the only difference is the ETR. The provided analyses in Section 3 have revealed that, with  $(\sqrt{2} - 1)N$  redundant current sources, the signal-independent ETR can still be realized for the single-tone-signal conversion. As we can see from Figure 18B, all the cases with different sets of  $M_i$  within the signal-independent range still show significant SFDR improvement as the case with N redundant current sources and constant  $M_i$  does, which verifies the feasibility to reduce the redundancy for the single-tone-signal conversion.
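For reference, the single-tone bound quoted above can be compared numerically with the redundancy used in this simulation. The line below only evaluates the $(\sqrt{2} - 1)N$ figure from Section 3:

$$(\sqrt{2} - 1)N \approx 0.414N < 0.5N,$$

so the 0.5*N* redundant current sources used here exceed the single-tone bound by roughly 0.086*N*.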



**FIGURE 19** (A), Values of the selected constant  $M_i$ . (B), Simulated SFDR of the 14-bit 1-GS/s DAC using our proposed switching strategy with 0.5*N* redundant current sources and constant  $M_i$  [Colour figure can be viewed at wileyonlinelibrary.com]

Figure 19 shows the simulation results of a 14-bit 1-GS/s DAC using our proposed switching strategy with fewer redundant current sources and constant  $M_i$ . In this simulation, all the configurations are the same as those of the simulation in Figure 18 except that the selected  $M_i$  for each SFDR curve is constant. The selected  $M_i$  for MSBs is depicted in Figure 19A. The analyses in Section 3 have revealed that our switching strategy in this case can still achieve the signal-independent ETR within the optimum band of the selected  $M_i$ . As depicted in Figure 19A, when  $M_i$  increases, the corresponding optimum band moves toward high frequencies. As we can see from Figure 19B, since the optimum band of  $M_i = 0$  does not cover high frequencies, the SFDR is relatively low at high frequencies, where the ISI distortion is the bottleneck of SFDR performance. When  $M_i$  increases from 20 to 40, the corresponding SFDR tends to be higher at high frequencies and lower at low frequencies. Since the ISI distortion is usually not the bottleneck at low frequencies, the SFDR of  $M_i = 40$  is 9 dB lower than that of  $M_i = 0$  at low frequencies but 18 dB higher at high frequencies. The results of this simulation are consistent with the discussion about the optimum band in Section 3 and confirm the feasibility of further reducing the number of current sources based on different bottlenecks of SFDR performance and the signal frequency distribution.

Figure 20 shows the simulation results of a 14-bit 1-GS/s DAC using our proposed switching strategy with N redundant current sources and a 0.1% standard deviation of current-source mismatch. In order to verify the effectiveness of the proposed switching strategy in dealing with static mismatches, we utilize two randomized operations mentioned in Section 3. In one of the randomized operations, the  $K_i$  reused 1's are chosen randomly while  $M_i$  is constant. In the other randomized operation, the OC-DWA decoder is utilized with  $M_i$  being randomly selected from the range in (23). Compared with the case without the randomized operation, the cases with the randomized operations improve the SFDR by 6 dB at low frequencies, which verifies the effectiveness of the proposed switching strategy in dealing with the static mismatches. As we can see from Figure 20, both randomized operations show effective SFDR enhancement at low frequencies. However, realizing the random selection of the reused 1's requires more complicated decoding than the random rotation with different  $M_i$  in the OC-DWA decoder.

Figure 21 shows the simulation results of a 14-bit 1-GS/s DAC using our proposed switching strategy with 0.5*N* redundant current sources and constant  $M_i$ . In this simulation, the signal swing is attenuated to -6 dBFS, -12 dBFS, and -18 dBFS, respectively. According to (13), for the signal attenuated to -6 dBFS, the required number of redundant current sources to achieve the completely signal-independent ETR could be reduced to 0.5*N*. When the attenuation of the signal changes from -6 dBFS to -12 dBFS, the effective number of input digital bits decreases by 1 bit. As depicted in Figure 21, every 1-bit loss in the effective number of input digital bits reduces the SNDR by nearly 6 dB. According to the relationship between the SNDR and the resolution, we can see that the linearity of the proposed switching strategy is still guaranteed when the swing changes. Since the proposed strategy does not utilize the  $\Delta\Sigma$  modulator to increase the in-band SNDR, the SFDR and SNDR will decrease when the signal swing decreases.
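As a rough cross-check of the 6 dB-per-bit trend, the textbook expression for the SNDR of an ideal converter driven by a sinusoid can be written with a back-off term. This is only the quantization-noise-limited bound, assuming a uniform quantizer with no mismatch or ISI errors; the symbols $B$ (resolution in bits) and $A_{\mathrm{dBFS}}$ (signal level relative to full scale) are introduced here for illustration, and the numbers below are not the simulated values in Figure 21.

$$
\mathrm{SNDR}_{\mathrm{ideal}} \approx 6.02\,B + 1.76 + A_{\mathrm{dBFS}} \ \ \text{dB}, \qquad A_{\mathrm{dBFS}} \le 0
$$

$$
B = 14:\quad
A_{\mathrm{dBFS}} = -6 \Rightarrow 80.04\ \text{dB}, \quad
A_{\mathrm{dBFS}} = -12 \Rightarrow 74.04\ \text{dB}, \quad
A_{\mathrm{dBFS}} = -18 \Rightarrow 68.04\ \text{dB}
$$

Under these assumptions, each additional 6.02 dB of back-off costs the equivalent of one bit of resolution, which matches the trend described above.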



**FIGURE 20** Simulated SFDR of a 14-bit 1-GS/s DAC using our proposed switching strategy with two kinds of randomized operations and a 0.1% standard deviation of current-source mismatch [Colour figure can be viewed at wileyonlinelibrary.com]

**FIGURE 21** Simulated SFDR and SNDR of a 14-bit 1-GS/s DAC using our proposed switching strategy with 0.5*N* redundant current sources and different signal swing [Colour figure can be viewed at wileyonlinelibrary.com]

## 6 | CONCLUSION

This paper proposes a switching strategy for current-steering DACs with redundant current sources to achieve a signal-independent element transition rate (ETR) and thus a high dynamic-linearity range. Compared with existing techniques in  $\Delta\Sigma$  DACs for a signal-independent ETR, the proposed switching strategy can be used in Nyquist DACs without restriction on the lower and upper bounds of the input digital code. Mathematical analysis provides new insights into the range of the signal-independent ETR and the number of required redundant current sources, which further leads to three practical redundancy-bandwidth scalable trade-offs under three common scenarios. These trade-offs avoid excessive use of redundant switching current sources while ensuring signal-independent switching for a high dynamic-linearity range. A low-complexity low-power decoder, named OC-DWA, is proposed to realize the switching strategy. Besides realizing the signal-independent ETR, the OC-DWA decoder can also deal with the static mismatches by randomly selecting the ETR from the revealed ETR range. The relaxed clocking requirement and the low decoding complexity make the OC-DWA decoder appealing for high-speed applications. Simulation results verify the effectiveness of the proposed techniques in enhancing the DAC linearity, as well as their redundancy-bandwidth scalability.

## ACKNOWLEDGMENTS

The authors would like to thank Hua Fan from State Key Laboratory of Electronic Thin Films and Integrated Devices, School of Microelectronics and Solid-State Electronics, University of Electronic Science and Technology of China, Chengdu, China for useful discussions.

## ORCID

Longqiang Lai, https://orcid.org/0000-0001-8654-3695

Xueqing Li, https://orcid.org/0000-0002-8051-3345

## REFERENCES

- 1. Lin W-T, Kuo T-H. A 12b 1.6 GS/s 40 mW DAC in 40 nm CMOS with > 70 dB SFDR over entire Nyquist bandwidth. Proceeding IEEE International Solid-State Circuits Conference 2013; 474–475.
- 2. Tseng W-H, Fan C-W, Wu J-T. A 12-bit 1.25-GS/s DAC in 90 nm CMOS With > 70 dB SFDR up to 500 MHz. *IEEE Journal of Solid-State Circuits*. 2011; 46(12):2845–2856.
- 3. Li X, Wei Q, Xu Z, Liu J, Wang H, Yang H. A 14 bit 500 MS/s CMOS DAC using complementary switched current sources and time-relaxed interleaving DRRZ. *IEEE Transactions on Circuits and Systems-I: Regular Papers*. 2014; 61(8):2337–2347.
- 4. Liu J, Li X, Wei Q, Yang H. A 14-bit 1.0-GS/s dynamic element matching DAC with > 80 dB SFDR up to the Nyquist. Proceeding IEEE International Symposium on Circuits and Systems 2015; 1026–1029.
- 5. Park S, Kim G, Park S-C, Kim W. A digital-to-analog converter based on differential-quad switching. *IEEE Journal of Solid-State Circuits*. 2002; 37(10):1335–1338.
- 6. Schafferer B, Adams R. A 3 V CMOS 400 mW 14b 1.4 GS/s DAC for multi-carrier applications. Proceeding IEEE International Solid-State Circuits Conference 2004; 360–532.
- 7. Engel G, Kuo S, Rose S. A 14b 3/6 GHz current-steering RF DAC in 0.18 μm CMOS with 66 dB ACLR at 2.9 GHz. Proceeding IEEE International Solid-State Circuits Conference 2012; 458–460.
- 8. Engel G, Clara M, Zhu H, Wilkins P. A 16-bit 10 Gsps current steering RF DAC in 65 nm CMOS achieving 65 dBc ACLR multi-carrier performance at 4.5 GHz Fout. Symposium on VLSI Circuits Technical Digest Papers 2015; C166–C167.
- 9. Su S, Chen M S-W. A 12-bit 2 GS/s dual-rate hybrid DAC with pulse-error pre-distortion and in-band noise cancellation achieving > 74 dBc SFDR and < -80 dBc IM3 up to 1 GHz in 65 nm CMOS. *IEEE Journal of Solid-State Circuits*. 2016; 51(12):2963–2978.
- 10. Su S, Tsai T I, Sharma P K, Chen M S-W. A 12 bit 1 GS/s dual-rate hybrid DAC with an 8 GS/s unrolled pipeline delta-sigma modulator achieving > 75 dB SFDR over the Nyquist band. *IEEE Journal of Solid-State Circuits*. 2015; 50(4):896–907.
- 11. Lin C-H, Goes F M L, Westra J R, et al. A 12 bit 2.9 GS/s DAC With IM3 < -60 dBc Beyond 1 GHz in 65 nm CMOS. *IEEE Journal of Solid-State Circuits*. 2009; 44(12):3285–3293.
- 12. Shui T, Schreier R, Hudson F. Mismatch shaping for a current-mode multibit delta-sigma DAC. *IEEE Journal of Solid-State Circuits*. 1999; 34(3):331–338.
- 13. Risbo L, Hezar R, Kelleci B, Kiper H, Fares M. Digital approaches to ISI-mitigation in high-resolution oversampled multilevel D/A converters. *IEEE Journal of Solid-State Circuits*. 2011; 46(12):2892–2903.
- 14. Sanyal A, Sun N. An enhanced ISI shaping technique for multi-bit  $\Delta\Sigma$  DACs. Proceeding IEEE International Symposium on Circuits and Systems 2014; 2341–2344.
- 15. Sanyal A, Sun N. Dynamic element matching techniques for static and dynamic errors in continuous-time multi-bit  $\Delta\Sigma$  modulators. *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*. 2015; 5(4):598–611.
- 16. Sanyal A, Chen L, Sun N. Dynamic element matching with signal-independent element transition rates for multibit  $\Delta\Sigma$  modulators. *IEEE Transactions on Circuits and Systems-I: Regular Papers*. 2015; 62(5):1325–1334.

- 17. Porrazzo S, Morgado A, San Segundo Bello D, et al. A design methodology for power-efficient reconfigurable SC  $\Delta\Sigma$  modulators. *International Journal of Circuit Theory and Applications*. 2015; 43(8):1024–1041.
- 18. Sabouhi V, Aghdam E N, Saeedi S. A single-bit continuous-time delta-sigma modulator using clock-jitter and inter-symbol-interference suppression technique. *International Journal of Circuit Theory and Applications*. 2017; 45(1):63–82.
- 19. McCue J J, Dupaix B, Duncan L, et al. A time-interleaved multimode ΔΣ RF-DAC for direct digital-to-RF synthesis. *IEEE Journal of Solid-State Circuits*. 2016; 51(5):1109–1124.
- 20. Bhide A, Najari O E, Mesgarzadeh B, Alvandpour A. An 8-GS/s 200-MHz bandwidth 68-mW ΔΣ DAC in 65-nm CMOS. *IEEE Transactions on Circuits and Systems-II: Express Briefs*. 2013; 60(7):387–391.
- 21. Grasso A D, Mirabella C A, Pennisi S. CMOS current-steering DAC architectures based on the triple-tail cell. *International Journal of Circuit Theory and Applications*. 2008; 36(3):233–246.
- 22. Olieman E, Annema A-J, Nauta B. A 110 mW, 0.04 mm<sup>2</sup>, 11 GS/s 9-bit interleaved DAC in 28 nm FDSOI with > 50 dB SFDR across Nyquist. Symposium on VLSI Circuits Technical Digest Papers 2014; 1–2.
- 23. Olieman E, Annema A-J, Nauta B. An interleaved full Nyquist high-speed DAC technique. *IEEE Journal of Solid-State Circuits*. 2015; 50(3):704–713.
- 24. Li X, Wei Q, Yang H. Code-independent output impedance: a new approach to increasing the linearity of current-steering DACs. Proceeding IEEE International Conference on Electronics Circuits and Systems 2011; 216–219.
- 25. Kreutz D, Ramos F M V, Verissimo P, et al. Software-defined networking: a comprehensive survey. *Proceedings of the IEEE*. 2015; 103(1):14–76.
- 26. Veyrac Y, Rivet F, Deval Y, et al. A 65-nm CMOS DAC based on a differentiating arbitrary waveform generator architecture for 5G handset transmitter. *IEEE Transactions on Circuits and Systems-II: Express Briefs*. 2016; 63(1):104–108.
- 27. Sun H, Nallanathan A, Wang C-X, Chen Y. Wideband spectrum sensing for cognitive radio networks: a survey. *IEEE Wireless Communications*. 2013; 20(2):74–81.
- 28. Rappaport T S, Sun S, Mayzus R, et al. Millimeter wave mobile communications for 5G cellular: It will work!. *IEEE Access*. 2013; 1:335–349.
- 29. Bechthum E, Radulov G I, Briaire J, et al. A wideband RF mixing-DAC achieving IMD < -82 dBc up to 1.9 GHz. *IEEE Journal of Solid-State Circuits*. 2016; 51(6):1374–1384.
- 30. Thompson C D. The VLSI complexity of sorting. *IEEE Transactions on Computers*. 1983; C-32(12):1171–1184.
- 31. Codish M, Cruz-Filipe L, Frank M, Schneider-Kamp P. Twenty-five comparators is optimal when sorting nine inputs (and twenty-nine for ten). Proceeding IEEE International Conference on Tools with Artificial Intelligence 2014; 186–193.
- 32. Yasuda A, Tanimoto H, Iida T. A third-order Δ-Σ modulator using second-order noise-shaping dynamic element matching. *IEEE Journal of Solid-State Circuits*. 1998; 33(12):1879–1886.
- 33. Batcher K E. Sorting networks and their applications. Proceeding ACM Spring Joint Computer Conference 1968; 307–314.
- 34. Galton I. Spectral shaping of circuit errors in digital-to-analog converters. *IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing*. 1997; 44(10):808–817.
- 35. Sande F, Lugil N, Demarsin F, et al. A 7.2 GSa/s, 14 bit or 12 GSa/s, 12 bit signal generator on a chip in a 165 GHz f<sub>T</sub> BiCMOS process. *IEEE Journal of Solid-State Circuits*. 2012; 47(4):1003–1012.
- 36. Wang R, You Y, Wu G, et al. A 150 MHz bandwidth continuous-time  $\Delta\Sigma$  modulator in 28 nm CMOS with DAC calibration. Proceeding IEEE International Midwest Symposium on Circuits and Systems 2015; 1–4.
- 37. Duncan L, Dupaix B, McCue J J, et al. A 10-bit DC-20-GHz multiple-return-to-zero DAC with > 48-dB SFDR. *IEEE Journal of Solid-State Circuits*. 2017; 52(12):3262–3275.

- 39. Bhide A, Alvandpour A. An 11 GS/s 1.1 GHz bandwidth interleaved ΔΣ DAC for 60 GHz radio in 65 nm CMOS. *IEEE Journal of Solid-State Circuits*. 2015; 50(10):2306–2318.
- 40. Clara M. High-performance D/A-converters: application to digital transceivers. Berlin, Germany: Springer-Verlag, 2013.
- 41. Wikner J. Studies on CMOS digital-to-analog converters. Ph.D. dissertation, University of Linköping, 2001.

**How to cite this article:** Lai Longqiang, Li Xueqing, and Yang Huazhong (xxxx), Redundancy-bandwidth scalable techniques for signal-independent element transition rates in high-speed current-steering DACs, *Int J Circ Theor Appl.*, *xxx;xxx:xx–xx*.