Exploiting Floating-Gate Transistor-Ozalevli Erhan 200612 PHD
CIRCUIT DESIGN
A Dissertation
Presented to
The Academic Faculty
By
Erhan Özalevli
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
in
Electrical and Computer Engineering
Approved by:
To my family...
ACKNOWLEDGEMENTS
I would like to thank my family for their endless support and love through all my endeavors.
I wish to express my sincere gratitude to my advisor Dr. Hasler for his support through-
out my PhD. I am also grateful to Dr. Higgins, Dr. Anderson, and Dr. Ayazi for serving in
Also, I would like to thank all the members of the CADSP Lab for a pleasant and
Kofi, Thomas, Gail, David, Ryan, Degs, Huseyin, and Jenny for their friendship and sup-
port. Lastly, I would like to thank Serdar, Koray, Yakup, Gunay, Zafer, and Menderes for
TABLE OF CONTENTS
DEDICATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
SUMMARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
CHAPTER 4 A TUNABLE FLOATING-GATE CMOS RESISTOR USING SCALED-
GATE LINEARIZATION TECHNIQUE . . . . . . . . . . . . . 37
4.1 Scaled-gate linearization technique . . . . . . . . . . . . . . . . . . . . . 38
4.2 Circuit Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.3 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
CHAPTER 10 IMPACTS AND APPLICATIONS OF THE PRESENTED WORK . . . . 103
10.1 Impacts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
10.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
10.2.1 Tunable resistors . . . . . . . . . . . . . . . . . . . . . . . . . . 107
10.2.2 Epot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
10.2.3 Mixed-signal implementation of the distributed arithmetic . . . . . 107
LIST OF TABLES
Table 3 Speed comparison of the BWCDAC and the FGDAC for one-stage am-
plifier case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Table 4 Ratio of noise contributions for the BWCDAC and the FGDAC . . . . . 81
Table 8 Ideal and actual coefficients of the comb, low-pass, and band-pass filters . 100
Table 9 Performance and design parameters of the DA based FIR filter. . . . . . 100
LIST OF FIGURES
Figure 1 Typical artificial neural network setup and McCulloch-Pitts neuron model 3
Figure 9 Design of floating-gate transistors from regular nMOS and pMOS tran-
sistors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Figure 10 Gate sweeps of a floating-gate pMOS transistor and its injection efficiency 16
Figure 23 Scaled-gate linearization technique . . . . . . . . . . . . . . . . . . . . 39
Figure 27 Effect of the well voltage on the FGRS GL resistance, linearity test of the
FGRS GL , and its die photo . . . . . . . . . . . . . . . . . . . . . . . . . 44
Figure 32 Effect of the well voltage on the FGRCML resistance and voltage sweeps
of the well computation circuit . . . . . . . . . . . . . . . . . . . . . . . 53
Figure 35 Linearity test of the FGRCML for a range of well feedback ratios . . . . . 56
Figure 36 The second and third-order harmonics of the FGRCML for a range of well
offset voltages and normalized resistance of the FGRCML circuits . . . . . 57
Figure 40 Linearity tests and frequency sweeps of the highly linear amplifier . . . . 64
Figure 45 MSB step responses, sinusoidal transient response, and short term lin-
earity test of the DAC . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Figure 48 Speed comparison of the BWCDAC and the FGDAC for one-stage am-
plifier case and small amplifier input capacitance . . . . . . . . . . . . . 76
Figure 58 Transient responses of the DA based FIR filter for 50kHz sampling fre-
quency and their power spectrums . . . . . . . . . . . . . . . . . . . . . 99
Figure 59 Magnitude and phase responses of the DA based FIR filter at 32/50kHz
sampling rates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Figure 62 Small signal models used to analyze the DAC structures . . . . . . . . . 114
Figure 63 Models used to analyze the noise of the DAC structures. . . . . . . . . . 118
SUMMARY
With the downscaling trend in CMOS technology, it has been possible to utilize
the advantages of high element densities in VLSI circuits and systems. This trend has
readily allowed digital circuits to predominate VLSI implementations due to their ease of
scaling. However, high element density in integrated circuit technology has also entailed a
decrease in the power consumption per functional circuit cell for the use of low-power and
Analog circuits have the advantage over digital circuits in designing low-power and
compact VLSI circuits for signal processing systems. Also, analog circuits have been em-
ployed to utilize the wide dynamic range of the analog domain to meet the stringent signal-
imperfections and mismatches of CMOS devices can easily deteriorate the performance
of analog circuits when they are used to realize precision and highly linear elements in
the analog domain. This is mainly due to the lack of tunability of the analog circuits that
These problems can be alleviated by making use of the analog storage and capaci-
elements and analog storages are built using floating-gate transistors to be incorporated
into signal processing applications. Tunable linearized resistors are designed and imple-
mented in CMOS technology, and are employed in building a highly linear amplifier,
Moreover, a tunable voltage reference is designed by utilizing the analog storage feature of
the floating-gate transistor. This voltage reference is used to build low-power, compact, and
architecture.
CHAPTER 1
EXISTING APPROACHES IN ANALOG AND MIXED-SIGNAL
CIRCUIT DESIGN
Maintaining the signal integrity and precision through the signal processing path is one
of the most challenging issues in analog and mixed-signal circuit design. To achieve this,
analog and mixed-signal circuits are generally designed to preserve the accuracy and pre-
cision in the signal amplitude and time while processing them. On the other hand, digital
circuits process the information in two amplitude states of a bit during a predefined time
interval, thus the accuracy in the signal amplitude is not the main constraining issue for dig-
ital circuits. Therefore, the demands of analog and mixed-signal circuits from the process
The scaling in the process technology has enabled designers to obtain high element
densities with digital circuits. However, this scaling trend has imposed different design
challenges for analog and mixed-signal circuits and the cost-effective CMOS integration.
Especially as the supply voltage is decreased due to technology scaling, it has become
more difficult to process signals in the analog domain with the reduced voltage headroom.
In addition, the relative parametric variations have increased with the scaling in
process technologies [1], making the linearity, noise, and distortion issues become more
While being less prone to the device imperfections, digital circuits also offer design
generally large and power-hungry [2], [3]. The multiplication and addition operations are
the repetitive functions frequently used in signal-processing systems. Even for custom dig-
ital circuits, their digital implementations cause increase in the total die area and power
of the signal-processing systems. In contrast, the area and power consumption associated
with the addition and multiplication operations can be easily optimized with analog and
employed for analog and mixed-signal circuits to achieve reconfigurability and tunability
and to deal with the device mismatches and imperfections. For instance, tunable resis-
tors are incorporated into artificial neural networks (ANN) to set and tune the synaptic
weights. Similarly, linearization techniques are employed for highly linear amplifiers and
multipliers to increase the circuit linearity and minimize the signal distortion. In data con-
verters, a variety of calibration methods are utilized to alleviate the device imperfections.
and mixed-signal circuits to achieve the reconfigurability and tunability. In the subsequent
and to achieve certain tasks. Adaptive ANN systems learn by example and, as is the
case for biological systems, they adjust their synaptic weights and connections to adapt to
Figure 1a illustrates a typical artificial neural network architecture [4], where the inputs
are usually binary, and the connections between the input layer and the middle or hidden
layer contain the weights. These weights are generally determined by training the system.
In addition, the middle layer processes the weighted inputs and sums them. The output
is created based on the transfer function of the system. This transfer function can be a
sigmoid function, which varies from 0 to 1 for a range of inputs. The connections between
the middle and output layer also have weights, and the output layer contains the transfer
Figure 1. (a) Typical artificial neural network setup [4]. (b) McCulloch and Pitts neuron model [5]. The
inputs are weighted so that the effect that each input has at decision making is dependent on
the weight of the particular input
Moreover, a neuron model by McCulloch and Pitts [5] is depicted in Figure 1b. In
this model, the inputs are weighted so that their effect at decision making is dependent
on the weight of a particular input. These weighted inputs are then added together and if
they exceed a pre-set threshold value, the neuron fires. This neuron model has the ability
to adapt to a particular situation by changing its weights and/or threshold. This has been
achieved by employing algorithms such as the back error propagation and the Delta rule.
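The threshold behavior described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `neuron_fires` and the example weights and threshold are assumptions chosen here, not values taken from the cited work.

```python
from typing import Sequence


def neuron_fires(inputs: Sequence[float],
                 weights: Sequence[float],
                 threshold: float) -> bool:
    """Return True when the weighted sum of the inputs exceeds the threshold."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum > threshold


# Hypothetical example: two binary inputs with equal weights behave like an
# AND gate when the threshold lies between one and two times the weight.
and_weights = [1.0, 1.0]
and_threshold = 1.5
```

Changing the entries of `and_weights` or the value of `and_threshold` changes the decision boundary, which is the adaptation mechanism that learning rules adjust.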
The synaptic weights in ANN systems can be implemented in CMOS processes by us-
ing resistors [6]. The resistors in such applications can be designed and made tunable by
exploiting the CMOS transistor properties. While the linearity is one of the most important
metrics used to design tunable CMOS resistors, they are usually built based on the spec-
ifications imposed by their application. Therefore, depending upon the application, the
CMOS resistors are generally required to be highly linear, area and power efficient, and to
have a wide tuning/operating range. The compactness, power efficiency, and tuning range
applying linearization techniques to MOS transistors. These techniques exploit both the
MOSFET’s square-law characteristic in the saturation region [7], [8], and its resistive na-
ture in the triode region [9], either separately or in combination [10], [11]. Although the
linearization of MOS transistors in the saturation region has been achieved to obtain CMOS
square-law method [7], these structures generally suffer from channel-length modulation,
mobility degradation, and device mismatches. In addition, MOS transistors have been lin-
earized by operating them in the triode region, and using balanced networks [13], [14],
[9] or depletion devices [15]. However, the balanced resistor structures are sensitive to the
mismatches that cause even-ordered distortion, and to the mobility degradation that results
in odd-ordered distortion. Moreover, the tuning range of the linearization technique with
depletion devices is strongly limited [16]. As an alternative to these approaches, the gate
linearization [17] or common-mode strategy [18] can be applied to a single MOS transistor
in the triode region to alleviate the linearity, mismatch, operating-range, and tuning-range
issues.
2a. This resistor is similar to the resistor structures proposed by Rasmussen [20] and Singh
[8]. In addition to the mirror transistors, four more MOS transistors and two control voltage
sources are used for the design of this resistor. A pair of MOS transistors, M1 and M2 , is
connected as a bilateral resistor while the other pair of MOS transistors, M3 and M4 , is
similarly connected to the middle right of the circuit. These resistors are controlled by the
Furthermore, a tunable CMOS resistor for ANN systems can also be implemented by
using the circuit shown in Figure 2b. This circuit is a floating resistor exhibiting positive
or negative resistance values depending on its biases [21]. The transistor nonlinearities
are cancelled by operating the transistors in their saturation region. The nodes VX and
VY are the two terminals of the resistor, and the resistance is inversely proportional to the
difference of control voltages VC1 and VC2 . If VC1 is greater than VC2 , the circuit operates
(a) (b)
Figure 2. (a) Circuit schematic of the CMOS bilateral linear floating resistor [19]. (b) Circuit diagram
of floating resistor [21].
as a resistor circuit with positive resistance. Alternatively, if VC2 is greater than VC1 , the
log circuit blocks and are widely used in signal and information processing applications.
Highly linear amplifiers are particularly important for the design of data converters and
continuous-time filters, and multipliers are essential components of modulators and mix-
require highly linear circuits that can handle large signal swings at their inputs/outputs.
serve this purpose [22] as shown in Figure 3a. However, the use of a single transistor alone
is not effective because a MOS transistor in the triode region exhibits a large dependence
on the common mode of its input signals. Another approach is to use a cross-coupled
(a) (b)
Figure 3. (a) V − I conversion based on a single MOS triode transistor [22]. (b) Circuit realization of the
linearized transconductance based on the cross-coupled quad configuration [23].
quad cell that has transistors n times larger than the input transistors and acts as a source
follower to create a constant sum of Vgs [23] as illustrated in Figure 3b. This topology re-
sults in increased power consumption, and the linearity of the amplifier is limited. Similar
triode region suffer not only from mismatch and offset, but also from the MOS transistor
straints imposed by the trade-offs between power, speed, resolution, and area. This is espe-
cially the case for embedded on chip systems where die area tends to be a major concern.
Depending upon the application, accuracy and/or resolution is often sacrificed for reduced
area.
1.3.1 Binary-weighted capacitor DAC
Within the Nyquist rate DACs, the binary-weighted capacitor DAC (BWCDAC) allows for
obtaining a good accuracy [25]. This DAC architecture, shown in Figure 4, was first pre-
sented by McCreary and Gray [26], and implemented by utilizing the scaled capacitors.
Although it yields a good accuracy, its binary-weighted capacitor array causes a large el-
ement spread and an exponential growth in the total area as the number of bits increases.
Also, the achievable resolution and accuracy of this DAC are limited for higher resolutions,
since the matching accuracy of the capacitors degrades as the capacitor ratio increases. In
order to ease the area and resolution trade-off, DAC architectures based on two-stage capacitor
arrays [27] and C − 2C ladders [28] were proposed. The C − 2C ladder structure is one of
the best area optimization techniques for the BWCDAC, since in this case the area increases
linearly with the number of bits and the element spread is only 2. However, the accuracy of
this DAC is sensitive to the parasitic capacitances at the capacitor ladder interconnections.
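The area argument above can be made concrete with a short Python sketch that counts unit capacitors. The ladder model is an assumption made here for illustration (N branch capacitors of C and N − 1 series capacitors of 2C, with no termination or feedback capacitor counted); actual C − 2C implementations may differ in detail, and the function names are hypothetical.

```python
def binary_weighted_units(n_bits: int) -> int:
    """Unit capacitors in an N-bit binary-weighted array: C, 2C, ..., 2^(N-1) C."""
    return sum(2 ** i for i in range(n_bits))  # equals 2^N - 1


def binary_weighted_spread(n_bits: int) -> int:
    """Ratio of the largest to the smallest capacitor in the binary-weighted array."""
    return 2 ** (n_bits - 1)


def c2c_ladder_units(n_bits: int) -> int:
    """Unit capacitors in the assumed ladder: N capacitors of C and N-1 of 2C."""
    return n_bits + 2 * (n_bits - 1)  # equals 3N - 2, linear in N
```

Under these assumptions an 8-bit binary-weighted array needs 255 unit capacitors with an element spread of 128, whereas the assumed ladder needs 22 units with a spread of 2.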
While it is possible to reduce the total area of the BWCDAC by employing different design
strategies, the accuracy and the area of this converter are mostly dictated by the capacitor
matching. Therefore, it becomes crucial for this kind of converter to have minimized
capacitor mismatches. The mismatch between capacitors is caused by the systematic and
random errors [29–31]. The area and perimeter of capacitors, capacitor-to-capacitor gap,
corner-cutting, and capacitor ratio determine the maximum achievable capacitor matching
[32]. The capacitor matching can be improved by employing unit capacitors that have the
modern CMOS processes can be obtained by employing different layout techniques [33],
the total capacitor area dictated by the capacitor matching and unit capacitor size increases
with these techniques. It has been shown that capacitor mismatch errors can be filtered out
by employing signal processing techniques such as dynamic element matching [34], data-weighted
averaging [35], and noise-shaping [36, 37] techniques. These design strategies
use digital signal processing techniques to minimize the effect of the mismatch errors in
Figure 4. Traditional design of binary-weighted capacitor charge amplifier DAC circuit. Cf is the feedback
capacitor and is equal to 2^N C. φ is the digital reset signal used to clear the inverting node
of the amplifier.
the frequency range of interest. For that purpose, the sampling rate has to be increased
verters, multi-bit quantizers can be successfully employed to improve the overall perfor-
mance. In pipelined ADCs, the use of multi-bit quantizers decreases the number of stages
and reduces the conversion latency. Also, interstage analog signal processing performance
can be optimized depending on the accuracy of the sub-stages. Proper selection of stage
resolution and use of multi-bit quantizers allow for the optimization of silicon area, power
consumption, and conversion speed for resolutions higher than 10 bits [38].
designing a converter with a high dynamic range for the low-voltage and low-power appli-
cations, the signal swing at the integrator output needs to be lowered, and this requirement
can be readily met by employing multi-bit quantizers. Also, increasing the number of bits
of the internal quantizer in ∆Σ modulators enables the reduction of the quantization
noise by 6dB for each additional bit, and improves the stability of the higher order ∆Σ
Figure 5. Traditional design of N-bit binary-weighted resistor DAC circuit. Rf is the feedback resistor
and bi is the digital input bit for i = 0, ..., N − 1.
A quantizer can be easily built by using a binary-weighted resistor DAC structure shown
in Figure 5. Although this kind of DAC structure can be fast and insensitive to parasitics,
it is susceptible to resistor mismatches, which can substantially alter the linearity perfor-
mance of the converter. Passive resistors in CMOS technologies are typically implemented
by utilizing polysilicon, diffusion or well strips. These resistors exhibit around ±0.1%
matching accuracy and ±30% tolerance due to device-to-device and lot-to-lot variations
in semiconductor fabrication processes [33]. Thin film resistors typically have much better
matching accuracy and temperature coefficients, but they are not available in
mainstream CMOS processes. The device mismatches and component variations in CMOS
techniques include trimming and the use of programmable binary-weighted array. Compo-
nent trimming is achieved during the test phase of the production by using laser technology.
The programming method is used to choose the desired array of elements by blowing fuses.
These methods are irreversible and introduce problems over time due to aging, stress, and
temperature.
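To illustrate how resistor mismatch affects a binary-weighted resistor DAC, the following Python sketch models the converter as an ideal inverting summing amplifier. The model, the resistor scaling R_i = 2^(N−1−i) R, the choice of feedback resistor, and all function names are assumptions made for this sketch rather than details of the circuit in Figure 5.

```python
from typing import List, Sequence


def nominal_resistors(n_bits: int, r_unit: float = 1.0) -> List[float]:
    """Assumed scaling R_i = 2^(N-1-i) * r_unit, so bit i has a weight proportional to 2^i."""
    return [r_unit * 2 ** (n_bits - 1 - i) for i in range(n_bits)]


def code_to_bits(code: int, n_bits: int) -> List[int]:
    """Split an integer code into bits, least significant bit first."""
    return [(code >> i) & 1 for i in range(n_bits)]


def resistor_dac_output(bits: Sequence[int], resistors: Sequence[float],
                        r_feedback: float, v_ref: float = 1.0) -> float:
    """Ideal inverting summing amplifier: -v_ref * r_feedback * sum(b_i / R_i)."""
    return -v_ref * r_feedback * sum(b / r for b, r in zip(bits, resistors))


def max_error_lsb(n_bits: int, relative_errors: Sequence[float],
                  r_unit: float = 1.0, v_ref: float = 1.0) -> float:
    """Worst-case output deviation, in LSBs, caused by per-resistor relative errors."""
    ideal = nominal_resistors(n_bits, r_unit)
    actual = [r * (1.0 + e) for r, e in zip(ideal, relative_errors)]
    r_feedback = r_unit / 2.0  # makes one LSB equal to v_ref / 2**n_bits
    lsb = v_ref / 2 ** n_bits
    worst = 0.0
    for code in range(2 ** n_bits):
        bits = code_to_bits(code, n_bits)
        v_ideal = resistor_dac_output(bits, ideal, r_feedback, v_ref)
        v_actual = resistor_dac_output(bits, actual, r_feedback, v_ref)
        worst = max(worst, abs(v_actual - v_ideal) / lsb)
    return worst
```

In this model, a 0.1% error on the MSB resistor of a 10-bit converter alone produces a deviation of roughly half an LSB, which suggests why matching at this level limits the achievable linearity.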
Figure 6. Example of switched-capacitor FIR filters. A general purpose 6th-order direct-form FIR
filter by using switched-capacitor technique [43].
suggested [41, 42]. The analog and mixed-signal implementations of FIR filters have
been generally designed for pre and post-processing applications by employing switched-
Switched-capacitor techniques are suitable for FIR filter implementations and offer pre-
cise control over the filter coefficients. A general purpose of FIR filter implementation
pose different design challenges depending upon the implementation. To avoid the power
and speed trade-off in the switched-capacitor FIR filter implementations, a transposed FIR
Figure 7. Example of switched-capacitor FIR filters. Sampled-data analog FIR filter with digitally
programmable coefficients [44].
filter structure is usually employed [41]. Also, a parallel filter concept is suggested to in-
switch matrix is used to eliminate the error accumulation [46]. Alternatively, these prob-
The filter implementations with these techniques offer a design flexibility by allowing for
coefficient and/or input modulation [49, 50]. However, this design approach requires the
The programmability in analog FIR filter implementations can also be obtained by uti-
tation is shown in Figure 7b. These techniques allow for the integration of the digital
coefficients through the use of the current division technique [51] or multiplying digital-to-
analog converters (MDAC) [52,53]. Moreover, a circular buffer architecture can be utilized
to ease the problems associated with analog delay stages and to avoid the propagation of
both offset voltage and noise [44, 54–56]. Recently, a switched-current FIR filter based
on DA has also been suggested for pre-processing applications to decrease the hardware
1.5 Motivation for using floating-gate transistors in analog and mixed-
signal circuits
In the previous sections, an overview of the techniques used to deal with device imperfections
and to obtain tunable and/or reconfigurable circuits is given. These techniques generally
result in an increase in power consumption and/or die area, which negates the benefits of the
to design programmable circuits for signal-processing systems. In this respect, the design
issues associated with the analog and mixed-signal circuits are circumvented by introduc-
ing floating-gate transistors to the available devices in the mainstream of the CMOS pro-
transistors, enables designers to exploit the benefits of the analog and mixed-signal circuits.
resistors and voltage references, which further extend the capabilities of the programmable
circuit design. These circuit blocks are used in analog and mixed-signal circuit applications
and compactness. The tunable resistors are used in highly-linear amplifier and multiplier
circuits to improve the linearity and to obtain precise resistors. Moreover, these resistors are
are employed in the implementation of a low-power and compact DAC and a reconfigurable
This thesis is organized into ten chapters. In Chapter 2, we present the design and the
conditions for designing tunable resistors and the role of floating-gate transistors in these
resistor implementations. After that we explain the design of a voltage reference and an-
alyze its noise, temperature dependence, and charge retention. In Chapter 3, we describe
the implementation of a tunable resistor using the gate-linearization technique and present
Figure 8. Design flow of the presented work. Floating-gate transistors are added to the available de-
vices in the mainstream CMOS processes to design tunable resistive elements and voltage
references, which are then used to build tunable and reconfigurable analog and mixed-signal
circuits.
its experimental results. In Chapter 4, we explain the design and implementation of a com-
pact tunable resistor using scaled-gate linearization technique and present its experimental
based on the common-mode linearization technique and compare it with other existing tun-
amplifier and a transconductance multiplier employing the tunable resistor based on the
binary-weighted resistor DAC using the tunable resistor based on the scaled-gate lineariza-
weighted DAC using tunable voltage references and discuss the design issues. In Chapter
based FIR filter. Lastly, in Chapter 10, we discuss the impact of the presented work and
CHAPTER 2
DESIGN OF TUNABLE CIRCUITS USING FLOATING-GATE
TRANSISTORS
The programmability of floating-gate transistors enables designers to build systems that can adapt
associated with digital systems, into analog and mixed-signal circuits that are more area and
power efficient. In this chapter, the tuning mechanisms of the floating-gate transistors are
described. The use of the storage and capacitive coupling capabilities of floating-gate transistors
to build tunable resistors and a voltage reference is also explained. These tunable resistors
and voltage reference will be used to design and implement tunable and reconfigurable
technique is utilized to tune the charge on the floating-gate terminal of these transistors. In
this technique, a tunneling junction capacitor and an additional pMOS transistor are employed
to tune the charge on the floating gate without introducing additional switches in the
signal path.
Figure 10a illustrates that the threshold voltage of a floating-gate pMOS transistor can
be increased or decreased by tuning the charge on the floating-gate terminal. The charge
anisms. The hot-electron injection increases the number of electrons on the floating-gate
terminal; thus the threshold voltage of the pFET is decreased and the threshold voltage
of the nFET is increased. In contrast, the tunneling mechanism decreases the number of
Figure 9. Design of floating-gate transistors from regular nMOS and pMOS transistors. Charge on the
floating gate is tuned by employing Fowler-Nordheim tunneling and hot electron injection
mechanisms. This is achieved by utilizing an indirect programming technique, where electrons
are injected using a pMOS injection transistor, Minject, and tunneled using a tunneling
junction capacitor, Ctun.
The tunneling mechanism is utilized for the coarse programming of the threshold volt-
age. The rate of the electron tunneling can be increased by increasing Vtun . The precise
achieved by creating 6.5V voltage pulses across a pFET’s drain and source terminals. These
pulses are generated by modulating the drain terminal of Minject, while keeping its source
Figure 10b illustrates that as the floating-gate voltage decreases, the injection efficiency
drops exponentially since the injection transistor has better injection efficiency for smaller
source-to-gate voltages. This efficiency drops as the transistor channel becomes more in-
verted. Therefore, the gate voltage, Vg , is modulated during programming to keep the
floating-gate voltage at the same place, where the injection efficiency is high. In this way,
the number of injected electrons and the output voltage change is accurately controlled.
Moreover, it is observed that increasing the injection voltage, Vsd, increases the injection
efficiency. However, after 6.5V the transistor channel becomes more inverted compared to
the channel for a smaller injection voltage, and this degrades the injection efficiency.
[Figure 10 plots: (a) drain current (µA) versus source-to-gate voltage (V); (b) injection curves versus floating-gate voltage (V) for Vsd = 5.5V, 6V, 6.5V, and 7V.]
Figure 10. (a) Gate sweeps of a floating-gate pMOS transistor. The threshold voltage of the transistor
is tuned by using Fowler-Nordheim tunneling and hot electron mechanisms. The threshold
voltage can be made negative by increasing the number of electrons on floating gate using
injection mechanism. (b) Change in the floating-gate voltage for 10ms injection pulses and
for different injection voltages.
linear resistive element is to suppress its nonlinearities by applying a function of the input
signal to its gate [17] and/or its body [58]. In order to determine this function, the source
of the nonlinearities in the drain current needs to be identified, and a linearization scheme
The drain current of a MOS transistor in the strong inversion has been accurately mod-
elled [59], [60], [61]. Based on these models, three principal nonlinearities in the drain
current of a long-channel transistor in the triode region are identified as the body effect, the
mobility degradation, and the fundamental quadratic component due to the common mode of
the drain and source voltages. These nonlinearities are mostly dependent on the common-
mode of the input signals, and can be suppressed by building common-mode feedback
The linearization techniques based on the transistors operating in the triode region ne-
cessitate the generation of common-mode and large gate voltage for their proper operation.
While most of the linearization techniques are appealing in terms of the reduced nonlinear-
increased number of components and increased power consumption. In addition, creating
a large quiescent voltage with fully integrated circuits in CMOS processes is not a trivial
task. These disadvantages limit the operation of a linearized MOS transistor and, thus, the
In this section, we show that introducing floating-gate MOS transistors can effectively
circumvent these problems by providing a capacitively coupled gate connection and a quiescent
gate voltage that can be adjusted by using the hot-electron injection and Fowler-
For applications where tunable linear elements operating in triode region are required, cre-
ating a large DC offset voltage within the power supply voltage range becomes a crucial
part of the design. This offset voltage is applied to the gate of the transistors to extend
their triode operation regime. In this respect, a floating-gate transistor can be employed to
alleviate this problem by generating a large offset that is not limited by the power supply.
The quiescent gate voltage ensures the proper operation of the linearized elements by
keeping the transistors in the triode region, Vds < Vgs − VT , where Vds , Vgs , and VT are
the drain-to-source, gate-to-source, and threshold voltages, respectively. The gate voltage
is also utilized to control the resistance of these elements. Therefore, the operating range,
which is determined by Vds , has to be optimized to accommodate the desired tuning range
of the resistor while still keeping the transistor in the triode region.
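The triode condition above translates directly into a requirement on the gate drive for a given operating range. The following Python sketch shows that relationship; the function names and the threshold value used in the usage note are hypothetical and chosen only for illustration.

```python
def in_triode(v_ds: float, v_gs: float, v_t: float) -> bool:
    """Check the triode condition Vds < Vgs - VT."""
    return v_ds < v_gs - v_t


def min_gate_drive(v_ds_max: float, v_t: float) -> float:
    """Gate-to-source voltage that Vgs must exceed to stay in triode up to v_ds_max."""
    return v_ds_max + v_t
```

For example, with an assumed threshold voltage of 0.7V, an operating range of 5V calls for a gate-to-source voltage above about 5.7V, which already exceeds a 5V supply.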
The drain sweeps of a regular pMOS transistor shown in Figure 11a illustrate that in a
0.5µm CMOS process, Vsg > 5V needs to be supplied in order to keep the transistor in the
triode region for a 5V operating range. Although this yields the maximum linear
operating range of the linearized elements, it necessitates the use of voltages that are larger
than the power supply for nFETs or lower than the ground potential for pFETs.
When the common mode of the input signals is fixed, and a differential test is performed
[Figure 11 plots: panel (a) shows curves labeled VSG = 9V, 8V, 6V, 5V, 3V, and 2V with the saturation region marked; panel (b) plot data not recoverable.]
Figure 11. Drain sweeps of a pMOS transistor and differential test of a floating-gate transistor. (a)
Drain voltage, Vd, sweep of a pMOS transistor tuned for gate voltages, VG, from 2V to 9V.
Source and well voltages (Vs and Vw) are kept at 5V. The dashed line separates the triode and
saturation regions. (b) Differential test of a floating-gate CMOS transistor. The voltages, VG ,
VW , and VC are set as 0V, 5V, and 2.5V, respectively. VX is swept from −2.5V to 2.5V. The
curve-A is obtained without tuning the charge on the floating gate. The curve-B is measured
after injecting electrons to the floating gate.
with a pMOS floating-gate transistor, as illustrated in Figure 11b, the drain current exhibits
a linear characteristic as long as it stays in the triode region. The curve-A in Figure 11b
is created on the floating gate, the output current of the floating-gate transistor for the dif-
ferential test has the same characteristics as the output current of a regular MOS transistor
with the same dimensions. In addition, it can be observed that for large drain-to-source
voltages the transistor leaves the triode region, since Vds < Vgs − VT does not hold anymore.
However, after injecting enough electrons into the floating gate by using the hot-electron
injection mechanism, the floating-gate voltage decreases enough that the transistor
exhibits a very linear characteristic for the given input voltage range. This is illustrated
by the change in the transistor linearity from curve-A to curve-B in Figure 11b.
2.2.2 Common-mode voltage computation
The common-mode of the input signals can be computed using the capacitive design ap-
proach illustrated in Figure 12. This approach can readily allow for reduced power con-
sumption without increasing the total harmonic distortion of the designed circuit. For input
signals, V1 and V2 , the capacitive division with capacitors, C1 and C2 , results in an output
$$V_{out} = \frac{C_1}{C_1 + C_2} V_1 + \frac{C_2}{C_1 + C_2} V_2 + \frac{Q}{C_1 + C_2} \tag{1}$$
where Q is the charge stored at the capacitive node, Vout . If the capacitors are designed to
be equal, the above expression becomes Vout = (V1 + V2 )/2 + VQ , where VQ is the effect
of the stored charge. Although the common-mode voltage can be computed precisely with
this method, when the capacitors are used with a transistor, M, as shown in Figure 12, the
input capacitance of the transistor causes an error in the common-mode computation. In this
case, for the same size input capacitors, C1 = C2 = C, the computed voltage becomes
where Cin is the input capacitance of the transistor and composed of the gate-to-drain ca-
pacitor (Cgd ), gate-to-source capacitor (Cgs ), and gate-to-body capacitor (Cgb ). Depending
upon the transistor’s region of operation, their values change with the input voltages. In the
triode and saturation regions, Cgb becomes very small and can thus be ignored. In the triode
region,
$$C_{gs} = \frac{2}{3} C_{ox} \frac{1 + 2\alpha}{(1 + \alpha)^2} \tag{3}$$
where α = 1 − Vds(1 + δ)/(Vgs − VT), δ = γ/(2√(φB + Vsb)), and Cgd = αCgs [61]. The
crucial point here is that Cgd becomes equal to Cgs as Vgs − VT ≫ Vds(1 + δ), which can be
satisfied for deep threshold conditions. Moreover, in the saturation region of the transistor,
Figure 12. Common-mode voltage computation method using capacitive design strategy. The gate
capacitors of an nMOS transistor are shown to illustrate their effect on the common-mode
voltage computation when this transistor is integrated with input capacitors to form a
floating gate.
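The capacitive division in (1), and the error that appears once a transistor loads the node, can be illustrated with a short numerical sketch. The function name, the component values, and in particular the treatment of the transistor input capacitance as a single capacitor `c_in` tied to a fixed potential are assumptions made for illustration; in the actual device Cgs and Cgd couple to the source and drain terminals and vary with the operating region.

```python
def common_mode_node(v1, v2, c1, c2, q=0.0, c_in=0.0):
    """Voltage of a capacitive summing node, following the form of Eq. (1).

    c_in is a hypothetical lumped capacitance from the node to a fixed
    (AC-ground) potential. It stands in for the transistor input
    capacitance and is an assumption of this sketch.
    """
    c_total = c1 + c2 + c_in
    return (c1 * v1 + c2 * v2 + q) / c_total


# Equal capacitors and no stored charge: the node sits at the common mode.
ideal = common_mode_node(1.0, 3.0, 1e-12, 1e-12)

# The same inputs with a 0.2pF load: the result is scaled below (V1 + V2)/2.
loaded = common_mode_node(1.0, 3.0, 1e-12, 1e-12, c_in=0.2e-12)

print(ideal, loaded)
```

Under these assumptions the loaded node no longer equals the true common-mode voltage, which is the scaling error discussed in the text; increasing the input capacitors relative to the load reduces it.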
These capacitor characteristics not only determine the limitations in implementing the
linearization techniques, but also the amount of nonlinearity that can be suppressed with
gate transistors. The design of such a voltage reference makes it possible to store the scaled
voltage levels for data converters and to obtain tunability and reconfigurability for
mixed-signal circuits. In this work, the tunable voltage reference (epot) is built to be incorporated
Depending upon the application and its circuit specifications, the design of the epot can
be different. For the DAC, the epot programming determines the programming precision
and affects the maximum achievable DAC linearity. Also, the epot charge retention sets the
lifetime of the DAC linearity. In addition, the temperature dependence of epots determines
the operating range of the DAC, where the variation of the stored epot voltages with the
temperature is less than the tolerable error. Similarly for FIR filters, the coefficients of the
filters are stored by the epots, and thus the programming precision and charge retention of
Figure 13. (a) Circuit schematic of the epot. Charge on the floating gate is used to program the voltage
output of the epot. The number of electrons on the floating gate is increased by using
hot-electron injection and decreased by utilizing the tunnelling quantum mechanical phenomenon.
The tunnel, select, and inject signals are the digital signals used for digital control of the epots.
Ctun is the tunnelling junction used for tunnelling. (b) Low-noise amplifier used to buffer
the stored voltage. Vcas is a bias voltage used for cascoding and Ccomp is the compensation
capacitor.
epots are also important design issues. Any change in the coefficients of the filter changes
Epots are programmed using two methods defined as coarse and fine programming. The
The epot programming circuitry is shown in Figure 14. In order to program the epots,
the desired epot is first selected using a decoder and enabled by setting the select signal
to high. Depending on whether the epot is to be programmed using the coarse or fine
the epot involves the modification of the number of electrons on the floating node.
The tunnelling mechanism increases the epot voltage through the removal of electrons
on the floating-gate node. The procedure for coarse programming of an epot involves
tunnelling the epot until the epot output voltage reaches 200mV above the target voltage.
Figure 14. Programming circuitry of the epot. Vsinject and Vdinject are the source and drain voltages used
to control the injection, while Vtun and Vthr are the tunnelling voltage and the reference voltage of
the high-voltage amplifier used to control the tunnelling mechanism.
The purpose of overshooting is to avoid the coupling effect of the tunnelling junction
on the floating-gate terminal once tunnelling is disabled. Once the digtunnel terminal is
activated, a high voltage is created across the tunnelling junction. The high voltage amplifier
The hot-electron injection mechanism decreases the epot voltage through the addition
of electrons onto the floating-gate node. Precise control of the injection process is achieved
by pulsing 6.5V across the drain and source terminals of the pFET and by keeping the
trons, hence the output voltage change, can be precisely controlled. During programming
the input voltage, Vref, of the epot is modulated based on the output voltage of the epot.
This further facilitates precise programming since the epot output is approximately at the
same potential as Vfg.
Once the output voltage of the epot has been programmed to the desired value through
the use of coarse and fine programming, the tunnelling and injection voltages are set to
ground to decrease power consumption, and to minimize the coupling to the floating-gate
terminal.
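The two-phase programming procedure described above (coarse tunnelling to an overshoot above the target, followed by fine injection pulses) can be summarized with a simplified behavioral sketch. The fixed voltage steps per pulse, the tolerance, and the function name are hypothetical; the real tunnelling and injection rates are strongly nonlinear in the applied voltages, so this is only meant to show the control flow, not the device physics.

```python
def program_epot(v_start, v_target, overshoot=0.2, tunnel_step=0.05,
                 inject_step=0.001, tolerance=0.001, max_pulses=100_000):
    """Behavioral model of coarse/fine epot programming.

    All step sizes are illustrative assumptions, not measured values.
    Returns the final output voltage and the number of pulses applied.
    """
    v = v_start
    pulses = 0

    # Coarse phase: tunnelling removes electrons and raises the output
    # until it is `overshoot` (200mV in the text) above the target.
    while v < v_target + overshoot and pulses < max_pulses:
        v += tunnel_step
        pulses += 1

    # Fine phase: short injection pulses add electrons and lower the
    # output until it is within `tolerance` of the target.
    while v - v_target > tolerance and pulses < max_pulses:
        v -= inject_step
        pulses += 1

    return v, pulses


final_v, n_pulses = program_epot(v_start=1.0, v_target=2.422)
print(final_v, n_pulses)
```

Approaching the target from above with small injection steps mirrors the overshoot strategy in the text: the final approach is made with the mechanism that can be pulsed in small increments.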
2.3.2 Epot Noise
The data converters and mixed-signal circuits using epots to store their biases and
references become sensitive to the epot noise. Also, when an array of epots is incorporated
into VLSI circuits, these noise sources directly affect the linearity of the data converters
and the characteristics of the circuits. The epot output noise can be written as
$$e_{epot}^2 = g_{m6}^2 R_{II}^2 \left[ e_{n6}^2 + e_{n7}^2 + R_I^2 \left( g_{m1}^2 e_{n1}^2 + g_{m2}^2 e_{n2}^2 + g_{m3}^2 e_{n3}^2 + g_{m4}^2 e_{n4}^2 + \frac{e_{n8}^2}{r_{ds1}^2} + \frac{e_{n9}^2}{r_{ds2}^2} \right) \right] \tag{4}$$
where gmi is the transconductance of the ith transistor, RI = rdsm4 //(rdsm9 + rdsm1 (1 + rdsm9 gm9 )),
In order to minimize the flicker noise, the amplifier is designed with a pFET input stage.
Also, input/load devices are sized properly to minimize the total epot output noise. The
output noise of the epot is shown in Figure 15a. The epot voltage is measured through
an on-chip buffer. Therefore, the measured epot noise also includes the noise of the buffer.
The measured thermal noise level is −120dB, and the noise corner is measured to be around
4kHz.
The temperature dependence of the epot is crucial for the circuits if the epot is employed
to set their circuit parameters. The epot output voltage relative to the reference voltage can
be written as
$$V_{epot} - V_{ref} = \frac{Q}{C} + V_{offset} \tag{6}$$
where Voffset is the offset voltage introduced by the epot amplifier. Assuming δC/δT = α,
where α is a process dependent parameter and around 50ppm/°C for poly-poly capacitors
In addition, the temperature dependence of Voffset depends on the amplifier structure
and the layout technique used to minimize the mismatch between the critical devices. In
mismatch between input and load transistors, which are M1 − M2 and M3 − M4, respectively.
This strategy helps minimize the offset of the amplifier. If Voffset has a temperature
coefficient of around 50ppm/°C, then it can be used to obtain a minimized temperature
dependence. However, it is not possible to control this coefficient with the proposed design.
The epot output voltage for a range of temperatures is shown in Figure 15b, and the
with a maximum variation of 20.8ppm/°C. The epots are programmed relative to the ref-
After programming, it is crucial that the epots hold the stored charge for long-term circuit
reliability. The long-term charge loss of floating-gate transistors is mainly caused by
trap-assisted tunnelling as well as the thermionic emission phenomenon [62,63]. By reducing
the number of programming cycles, trap-assisted tunnelling can be minimized. Since
it may take the trapped electrons hours or days to be released from the traps, the initial
programming is performed to come close to the desired epot voltage. After that, a minimal
number of programming steps is applied for precise programming of the epots. The input
capacitance of the epot can be sized properly to reduce the effect of the release of the
trapped electrons.
Thermionic emission is a function of both temperature and time, and can be expressed
as
where Q(0) and Q(t) are the initial floating-gate charge and the floating-gate charge at time
[Figure 15 plots: (a) power (dB) versus frequency (Hz), with the flicker-noise and thermal-noise regions marked; (b) epot output (V) versus temperature (°C), measured data with the quadratic fit y = 2.421 + 6.237·10⁻⁵·x − 4.501·10⁻⁷·x²; (c) Q(t)/Q(0) versus time (days) at 300°C and 325°C.]
Figure 15. (a) Output noise of the epot. (b) Temperature sweep of the epot. The epot exhibits second-
order temperature dependence when programmed around 2.422V. (c) Stress test performed
at 300°C and 325°C to quantify the charge loss over time.
Si−SiO2 barrier potential, k is the Boltzmann constant and T is the temperature. The
change of the floating-gate charge directly affects the epot output voltage.
The design of the epots in CMOS processes with feature sizes smaller than 0.35µm necessitates
the use of transistors with thicker silicon dioxide since the gate leakage becomes
a serious issue in modern processes. Therefore, epots can be designed in these processes
The retention of the epots is determined based on the stress tests. The theoretical fits
using (8) along with the measurement results at 300°C and 325°C are shown in Figure 15c.
The worst case results are obtained after the first stress test at 300°C. After the first test,
the charge loss of the epots is decreased considerably. The φB and v from these worst-case
experiments are extracted as 0.9eV and 60s⁻¹. Based on this worst-case data, it is calculated
that the stored epot voltage drifts 10⁻³% over the period of 10 years at 25°C.
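The extrapolation from the extracted worst-case parameters to a ten-year drift can be reproduced with a short calculation. The sketch below assumes the commonly used thermionic-emission retention model Q(t)/Q(0) = exp[−t·v·exp(−φB/kT)], which matches the symbols defined in the text; the function and variable names are illustrative, and the fractional charge loss is taken as a proxy for the relative voltage drift.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def charge_loss_fraction(t_seconds, temp_kelvin, phi_b_ev, nu_per_s):
    """Fraction of floating-gate charge lost after t_seconds.

    Assumed model: Q(t)/Q(0) = exp(-t * nu * exp(-phi_B / (k*T))).
    """
    rate = nu_per_s * math.exp(-phi_b_ev / (K_B_EV * temp_kelvin))
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-t_seconds * rate)


ten_years = 10 * 365.25 * 24 * 3600  # seconds
loss_percent = 100 * charge_loss_fraction(
    ten_years, temp_kelvin=298.15, phi_b_ev=0.9, nu_per_s=60.0)
print(f"charge loss over 10 years at 25 C: {loss_percent:.2e} %")
```

With φB = 0.9eV and v = 60s⁻¹, this evaluates to a loss on the order of 10⁻³% at 25°C, in line with the figure quoted above. The strong exponential dependence on temperature is also why the stress tests are run at 300°C and above.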
CHAPTER 3
A TUNABLE FLOATING CMOS RESISTOR USING GATE
LINEARIZATION TECHNIQUE
Tunable CMOS resistors are usually built based on the specifications imposed by their
application. Depending upon the application, the CMOS resistors are generally required to
be highly linear, area and power efficient, and to have a wide tuning/operating range. The
compactness, power efficiency, and tuning range are the primary concerns for ANN sys-
tems. In this chapter, we present a tunable CMOS resistor that can be suitably employed
in ANN systems. This CMOS resistor operates in the triode region, and utilizes the gate
linearization technique [17]. In this structure, floating-gate transistors are not only em-
ployed to scale the input signals to the gate terminal [64], but also to store the charge on
In the next section, we explain the gate linearization strategy, and analyze its effect on
(FGRGL ). After that, we present the experimental results of this circuit. In the last part of
to-rail, it is generally the best design choice to fix the body/well potential of the linearized
elements to one of the rails. In such cases, the gate linearization technique depicted in
Figure 16 can be used to serve this purpose. This technique was originally proposed by
Nay et al. [17], and used to suppress the fundamental quadratic component in the drain
current. However, this technique does not completely eliminate the body effect and the
mobility degradation.
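As a minimal sketch of why feeding the common-mode voltage to the gate removes the quadratic term, consider the simplified long-channel triode model below. This model neglects the body effect and mobility degradation, and the gain factor K = µCoxW/L is introduced here only for illustration; the complete analysis is the one given in Appendix-I. Here v_c = (v_d + v_s)/2 and v_ds = v_d − v_s, with all voltages referenced to ground.

```latex
\begin{align*}
I_d &= K\left[(V_g - V_T)(v_d - v_s) - \tfrac{1}{2}\left(v_d^2 - v_s^2\right)\right],
\qquad \tfrac{1}{2}\left(v_d^2 - v_s^2\right) = v_c\, v_{ds},\\
V_g &= V_G + v_c \;\Rightarrow\;
I_d = K\left[(V_G + v_c - V_T)\, v_{ds} - v_c\, v_{ds}\right]
    = K\,(V_G - V_T)\, v_{ds}.
\end{align*}
```

In this idealized model the current becomes strictly proportional to v_ds. The residual nonlinearity discussed in the text comes from the terms that this sketch leaves out, namely the dependence of V_T on the common-mode voltage and the mobility degradation.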
Figure 16. Gate linearization technique [17] applied to an nMOS transistor in the triode region. vd and
vs are the drain and source voltages, respectively. VG and VB are the tunable quiescent gate
and body voltages, and vc is the common-mode voltage, vc = (vd + vs)/2. The common-mode
voltage is applied to the gate terminal to suppress the fundamental quadratic nonlinearity
due to the common-mode of the drain-to-source input voltage.
vs)/2, to the gate terminal with the addition of a tunable quiescent gate voltage, VG. vd and
vs are the drain and source voltages referenced to ground, respectively. By using this
technique, the quadratic term in the drain current is cancelled as shown in Appendix-I. In
order for this technique to work effectively, the MOS transistor has to be kept in the triode
region for the required input range. This requires vds < 2(VG − vs − VT), and also necessitates
$$V_T = V_{FB} + \phi + \gamma \sqrt{v_c - v_b + \phi} \tag{9}$$
where VFB is the flat-band voltage, φ is the surface potential, and γ is the body-effect
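$$\theta_1 = \frac{\theta}{1 + \theta V_{G1}}, \qquad \mu_1 = \frac{\mu_0}{1 + \theta V_{G1}} \tag{10}$$
$$V_{c1} = \gamma \sqrt{v_c - v_b + \phi}, \qquad V_{G1} = (V_G - V_{FB} - \phi) \tag{11}$$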
where µ0 is the carrier mobility and θ is the mobility degradation factor, then, as shown in
v2 −1
Appendix-I, the drain current for θ1 Vc1 − 96Vds3 can be approximated as
c1
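$$I_d = \frac{\mu_1 C_{ox} W}{L} \left\{ v_{ds} (V_{G1} - V_{c1})(1 - \theta_1 V_{c1}) + \frac{\gamma^4 v_{ds}^3}{96 V_{c1}^3} \left[ 1 + \theta_1 (V_{G1} - 2V_{c1}) \right] \right\} \tag{12}$$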
where Cox is the gate capacitance per unit area, W is the channel width, and L is the channel
length. Ignoring the higher order terms, the resistance of the linearized element becomes
$$R = \frac{L}{\mu_1 C_{ox} W (V_{G1} - V_{c1})(1 - \theta_1 V_{c1})} \tag{13}$$
By using the threshold equation in (9), and the θ1 approximation, the resistance equation
simplifies to
$$R = \frac{L}{\mu_1 C_{ox} W (V_G - V_T)} \tag{14}$$
This result reveals the fact that since VT changes with vc , the resistance of the linearized
element depends on the common-mode of the input signals. In order to obtain higher
The tunability with this structure can be obtained by changing the value of VG.
Therefore, the tuning range of this resistor is limited by the required resistor linearity for
voltages for their proper operation. In this work, we show that introducing floating-gate
MOS transistors provides a capacitively coupled gate connection and a quiescent gate voltage
that can be adjusted by using the hot-electron injection and Fowler-Nordheim tunnelling
mechanisms. These features facilitate the circuit implementation of the gate
linearization technique.
Employing a capacitive coupling connection to the gate terminal for the linearization
was first suggested by Lande et al. [65], and implemented by using quasi floating-gate
devices. However, a quasi floating-gate terminal acts as a high-pass filter with a very low
corner frequency. Therefore, the DC common-mode of the input signals cannot be tracked
by the gate of the transistor with this approach. Here, we show that employing floating-gate
transistors and using the Fowler-Nordheim tunnelling and hot-electron injection quantum
mechanical phenomena for the resistance control improves the operation of CMOS resistors
The implementation of a tunable CMOS resistor based on the gate linearization tech-
nique is shown in Figure 17. This resistor operates as a floating resistor with its well termi-
nal kept at a fixed potential. The common-mode voltage of the input signals is computed
by using the feedback capacitors, which couple the drain and source voltages to the gate
terminal. In addition, the charge stored on the floating-gate terminal creates the required
quiescent gate voltage to satisfy the triode condition and linearity requirement. In this cir-
cuit, Vtun is used to enable the tunnelling mechanism to decrease the number of electrons on
the floating-gate terminal. Also, V sPROG and VdPROG are used to create the required voltage
difference that is necessary for the hot-electron injection mechanism to occur and increase
the number of electrons at the gate terminal. As a result, the floating-gate voltage can be
expressed as
where V p is the effect of the stored charge and the capacitive coupling from the peripheral
circuit that includes Ctun and C MP . Ctun is the tunnelling junction capacitance, and C MP is
the input capacitance of the injection transistor, MP. In this equation, Cgs becomes equal
to Cgd for large quiescent gate voltages. Therefore, the necessary condition for an
accurate common-mode computation is to create a large quiescent gate voltage and to keep
Cg much larger than CMP and Ctun so that the floating-gate potential is close to
$$V_{fg} \simeq \frac{V_s + V_d}{2} + V_p \tag{16}$$
The scaling error introduced by the common-mode computation increases the common-
mode dependence of the circuit. However, the main source of the distortion in this lineariza-
tion technique is the body effect since the body potential is fixed relative to the common-
Figure 17. The circuit implementation of the gate linearization technique (FGRGL ). The common-mode
feedback is realized by using feedback capacitors (Cg ) between source-gate and drain-gate
terminals. Vwell , V s and Vd are the well, source and drain voltages of MR , respectively. This
resistor is tuned by changing the quiescent gate voltage and this is achieved by using the
tunnelling junction connected to Vtun , and the injection circuit that has source voltage V sPROG
and drain voltage VdPROG .
$$\frac{1}{R} = \frac{\mu_{0n} C_{ox} W}{L} \left[ V_G - V_T \right] \tag{17}$$
The temperature dependence of µ0n and VT can be expressed as µ0n = µ0no (T/T0)^(−m) and VT =
VT0 − αVT (T − T0), where T0 is the reference temperature, m is a positive constant that
ranges from 1.5 to 2, and µ0no and VT0 are temperature-independent parameters. Also,
αVT is in the range of 0.5 to 4 mV/°C [61]. Hence, the temperature coefficient of the FGRGL
can be expressed as
can be tuned by altering the effect of αVT through the use of VG. For a desired temperature,
Td, and VG = (αVT/m)Td + VT, the temperature coefficient of the FGRGL can be set to zero at Td.
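The zero-temperature-coefficient condition can be checked numerically from (17) and the temperature models for the mobility and threshold voltage given above. The sketch below is illustrative only: the parameter values (m, αVT, VT0, T0, and the lumped gain factor `k0`) are hypothetical, temperatures are taken in kelvin so that the (T/T0)^(−m) factor is well defined, and VT in the condition is interpreted as the threshold voltage at Td.

```python
def resistance(temp_k, v_g, *, t0=300.0, m=1.5, vt0=0.9,
               alpha_vt=2e-3, k0=1e-6):
    """R(T) from Eq. (17) with mu ~ (T/T0)^-m and VT = VT0 - alpha*(T - T0).

    k0 lumps mu_0no * Cox * W / L; all default values are assumptions.
    """
    mobility_factor = (temp_k / t0) ** (-m)
    v_t = vt0 - alpha_vt * (temp_k - t0)
    return 1.0 / (k0 * mobility_factor * (v_g - v_t))


def zero_tc_gate_voltage(t_d, *, t0=300.0, m=1.5, vt0=0.9, alpha_vt=2e-3):
    """Gate voltage VG = (alpha_VT / m) * Td + VT(Td) from the text."""
    v_t_at_td = vt0 - alpha_vt * (t_d - t0)
    return alpha_vt * t_d / m + v_t_at_td


def temperature_coefficient(temp_k, v_g, dt=1e-3, **params):
    """Relative temperature coefficient (1/R) dR/dT by central difference."""
    r_plus = resistance(temp_k + dt, v_g, **params)
    r_minus = resistance(temp_k - dt, v_g, **params)
    return (r_plus - r_minus) / (2 * dt * resistance(temp_k, v_g, **params))


v_g_zero = zero_tc_gate_voltage(300.0)
print(v_g_zero, temperature_coefficient(300.0, v_g_zero))
```

For gate voltages above this value the mobility term dominates and the coefficient is positive, while below it the threshold term dominates and the coefficient is negative, so the first-order coefficient changes sign as the resistor is tuned.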
[Figure 18 plots: (a) output current (µA) versus source-to-drain voltage (V); (b) resistance (MΩ) versus source-to-drain voltage (V). Curves are labeled by programming direction (injection, tunnelling).]
Figure 18. Experimental results. (a) I-V characteristics of the FGRGL . (b) Extracted resistances of the
FGRGL tuned to different quiescent gate voltages.
surements were obtained from the chip that was fabricated in a 0.5µm CMOS process.
The experiments for the static measurements are performed by keeping one terminal of
the floating-gate resistors at 2.5V, and then sweeping the other terminal between 0 and 5V.
Also, the well terminal of FGRGL is kept at 5V. After each programming step by tuning the
quiescent gate voltage, the experiment is repeated to observe the change in the resistance
and linearity. The I-V curves of the FGRGL are shown in Figure 18a. FGRGL exhibits
better linearity for its smaller resistance values. This is mainly because the relative effect
of the common-mode voltage on the resistance becomes less for higher VG voltages. The
extracted resistance sweeps of the FGRGL are shown in Figure 18b. It can be observed that
the resistance of FGRGL changes with the input voltage. This agrees with the theoretical
results shown in (13), since VT deviates from its nominal value with the change in the
common-mode voltage. Therefore, this resistor exhibits small changes for small resistance
values, and large changes for large resistance values. While decreasing the resistance of
the floating-gate resistor, the quiescent gate voltage is also increased. In turn, this helps the
transistor to stay in the deep triode region even for large differential input signals.
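The trend described above, where a larger quiescent gate voltage reduces the relative effect of the common-mode voltage on the resistance, can be sketched numerically from (9) and (14). The parameter values (flat-band voltage, surface potential, body-effect coefficient, and the gain factor `k`) are hypothetical and the equations are used in their nMOS form, so the numbers are not meant to match the measured pMOS device.

```python
import math


def threshold_voltage(v_c, v_b, *, v_fb=-0.9, phi=0.7, gamma=0.5):
    """VT from Eq. (9); the default parameters are illustrative assumptions."""
    return v_fb + phi + gamma * math.sqrt(v_c - v_b + phi)


def resistance(v_g, v_c, v_b, *, k=50e-6, **vt_params):
    """R from Eq. (14), with k standing in for mu1 * Cox * W / L."""
    return 1.0 / (k * (v_g - threshold_voltage(v_c, v_b, **vt_params)))


def relative_spread(v_g, v_b=0.0, v_c_values=(1.5, 2.5, 3.5)):
    """Relative change of R across a set of common-mode voltages."""
    values = [resistance(v_g, v_c, v_b) for v_c in v_c_values]
    return (max(values) - min(values)) / min(values)


for v_g in (3.0, 6.0):
    print(v_g, relative_spread(v_g))
```

Under these assumptions the spread shrinks as VG increases, because the common-mode-dependent part of VT becomes a smaller fraction of VG − VT.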
[Figure 19 plots: (a) versus input amplitude (V), for well-to-drain voltages of 2.5V and 5V; (b) versus well-to-drain potential (V); (c) nonlinearity (%) versus well-to-drain voltage (V) for W/L = 1.2/0.6, 1.2/1.2, and 1.2/7.5; (d) normalized resistance versus source-to-drain voltage (V) for the same W/L values.]
Figure 19. Experimental results. (a) Total harmonic distortion of the FGRGL for a range of sine-wave
signal amplitudes. (b) Total harmonic distortion of the FGRGL for a 1Vpp sine-wave signal and
for a range of well voltages. (c) Total nonlinearity of the FGRGL circuits in the full range of
operation (0-5V). The lengths of the transistors are 0.6µm, 1.2µm, and 7.5µm. (d) Normalized
resistances of the FGRGL circuits for different transistor lengths.
inverting amplifier with a corresponding feedback resistor (matching the resistance of the
on-chip resistor). Also, a 16-bit DAC is employed to generate the sine wave for the
characterization of the FGRGL linearity. The distortion level of this resistor for a range of signal
amplitudes is illustrated in Figure 19a. This experiment is repeated for Vwell = 5V and
Vwell = 7.5V while Vdrain is kept at 2.5V and V source is swept around 2.5V. It is observed
that the linearity is also dependent on the well-to-drain potential. Therefore, another lin-
earity test is performed for a range of well-to-drain voltages as shown in Figure 19b. For
[Figure 20 plot: transient waveforms versus time (µs).]
Figure 20. Experimental results. Transient response of the FGRGL for a 1Vpp 100kHz sine wave.
1Vpp sine wave, it is seen that the linearity of the FGRGL can be increased by keeping the
well-to-drain voltage around 4V. The main source of the distortion in FGRGL linearity is
the change in its threshold due to body effect, and this effect becomes more apparent as
the resistance of FGRGL is increased by decreasing the quiescent gate voltage. Depend-
ing upon the linearity level that certain applications may require the tuning range of these
The change in the total nonlinearity of the FGRGL for different transistor lengths is
depicted in Figure 19c. The transistor lengths are chosen as 0.6µm, 1.2µm, and 7.5µm.
more dominant, the total nonlinearity in the full range of operation becomes less for short-
channel transistors. This is mainly due to their resistance behavior with the common-mode
voltage. As shown in Figure 19c, transistors with shorter channels exhibit less variation
with the input and common-mode voltage change. Moreover, the transient test of the
FGRGL is performed by using a 1Vpp 100kHz input signal. It is seen that FGRGL operates at
The temperature test of the FGRGL is performed to characterize its temperature dependence
between −60 and 100°C. As shown in Figure 21a, the temperature behavior of the
[Figure 21 plots: (a) resistance (MΩ) versus temperature (°C), with curves labeled Tunnel and Inject; (b) temperature coefficient (ppm/°C) versus resistance (MΩ); (c) resistance (MΩ) versus temperature (°C).]
Figure 21. Experimental results. (a) Temperature behavior of FGRGL for differently tuned resistance
values. (b) Temperature coefficient of the FGRGL for a range of resistance values. (c)
Temperature behavior of the FGRGL when its first-order temperature dependence is cancelled.
FGRGL depends on the programmed resistance value. Figure 21b illustrates the change of
the temperature coefficient of the FGRGL with the programmed resistance value. Around
10MΩ, the first-order temperature dependence of the FGRGL becomes much less than its
second-order temperature dependence, thus for this operating condition FGRGL is gov-
erned by its second and higher-order temperature dependence. As shown in Figure 21c, the
Finally, the die photo of the fabricated FGRGL circuit is shown in Figure 22. The total
area of this circuit is 4900µm² and each of its gate-feedback capacitors is 1.46pF.
[Figure 22: die photo, with labels for the injection transistor, the two feedback capacitors, the resistor transistor, and the tunnelling junction.]
3.5 Discussion
The presented CMOS resistor is very suitable for a variety of applications. In particular,
alleviating the trade-off between the tuning range and the operating range by using floating-gate
transistors makes it possible to bring tunability into analog circuits without being limited by
The floating-gate resistor reported in this chapter utilizes the properties of MOS tran-
sistors in a CMOS process. FGRGL uses only 2 capacitors and 1 transistor in addition to
the programming circuit. It yields around 1.3% linearity (for 1Vpp) without consuming
additional power for the operation of the circuit. Moreover, FGRGL can be easily employed
in low-voltage applications since the operation of FGRGL does not depend on any of the
supply rails. Therefore, FGRGL offers a circuit implementation of a power efficient, com-
pact, and tunable CMOS resistor. Especially, this design becomes very suitable for the
ANN systems, where an array of compact CMOS resistors needs to be integrated while
keeping the power consumption down. Finally, this resistor has the ability to store its own
resistance value, therefore it does not require an additional circuit to generate a voltage to
CHAPTER 4
A TUNABLE FLOATING-GATE CMOS RESISTOR USING
SCALED-GATE LINEARIZATION TECHNIQUE
Tunable CMOS resistors offer design flexibility in building precise and compact
analog circuits. Therefore, they are widely used in transconductance multipliers, highly
linear amplifiers, and tunable MOSFET-C filters. While the passive resistors that are im-
around ±0.1% matching accuracy and around ±30% tolerance [33], the tunable CMOS
resistors easily achieve high and precise resistance values through the utilization of con-
ANN systems as well as other low-power and low-voltage applications require the design
of a compact and tunable resistor that is not only less sensitive to the mismatches but
also suitable for the operation at low supply voltages. Moreover, such a resistor needs
to achieve the required linearity and tuning range with low power consumption. In this
chapter, we propose such a tunable CMOS resistor that can be successfully incorporated
into ANN systems as well as low-power and low-voltage applications. This CMOS resistor
employs a floating-gate transistor operating in the triode region, and utilizes the scaled-
gate linearization technique [18] to decrease its nonlinearities. In the next section, we
explain the scaled-gate linearization technique, and theoretically analyze the nonlinearities
of a MOS transistor operating in the triode region. Subsequently, we describe the circuit
implementation of this tunable floating-gate resistor (FGRSGL). In the last part of this
4.1 Scaled-gate linearization technique
In a standard CMOS technology, a linearized tunable CMOS resistor can be designed by
acteristic in the saturation region [7], [8] or its resistive nature in the triode region [9].
In the triode region, the common-mode [18], gate [17], and scaled-gate linearization [18]
techniques can be easily utilized to suppress the nonlinearities of a MOS transistor and to
In contrast to other strategies, the common-mode strategy offers a high linearity, but its
implementation requires the use of a higher voltage than the supply voltage and increased
resistor area to generate the well-feedback voltage [66]. If high linearity is traded with
nonlinearity, then the gate linearization technique can be utilized to build a compact tunable
resistor [67]. As an alternative to these techniques, the scaled-gate linearization technique can
be applied to a single MOS transistor in the triode region to alleviate the area and linearity
a scaled common-mode voltage to the gate terminal. If the body potential, vb , is fixed at
some bias potential, VB , then the fundamental quadratic component, vds · vc , and the body
effect can be cancelled by applying a scaled common-mode voltage to the gate terminal.
$$a = 1 + \frac{\gamma}{2 \sqrt{V_B + \phi}} \tag{19}$$
For a fixed body potential, VB , a becomes a process dependent parameter. While the
variation in process parameters becomes a limiting factor for this technique, this can be
overcome by tuning VB . After applying this technique, the first order mobility dependence
Figure 23. Scaled-gate linearization technique [18] to eliminate the fundamental quadratic nonlinear-
ity and body-effect of an nMOS transistor operating in the triode region. The scale factor
a is a process and body voltage dependent parameter. Vd and V s are the drain and source
voltages, respectively. VG and VB are the tunable quiescent gate and body voltages, and Vc is
the common-mode voltage, (Vd + V s )/2.
of the transistor dominates the distortion, and the drain current can be approximated as
$$I_d = \frac{\mu_{0o} C_{ox} W}{L} \left\{ [V_G - V_T] v_{ds} - \frac{\theta_0 \gamma [V_G - V_T](v_{ds} v_c)}{\sqrt{V_B + \phi}} \right\} \tag{20}$$
where Cox is the gate capacitance per unit area, W is the channel width, L is the channel
length, VG is the quiescent gate voltage, and VT is the threshold voltage. Also, vds is the
drain-to-source voltage, vc is the common-mode voltage and equal to (vd + v s )/2, and µ0o
and θ0 are
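$$\mu_{0o} = \frac{\mu_o}{1 + \theta [V_G - V_{FB} - \phi + \gamma \sqrt{V_B + \phi}]} \tag{21}$$
$$\theta_0 = \frac{\theta}{1 + \theta [V_G - V_{FB} - \phi + \gamma \sqrt{V_B + \phi}]} \tag{22}$$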
where VFB is the flat-band voltage, µo is the carrier mobility, and θ is the mobility degradation
factor.
This technique is used to eliminate not only the fundamental quadratic term, but also
the body effect term. The input voltage range of this technique is determined by the triode
condition, which is Vds < (2/(2 − a))(VG − VT − (1 − a)Vs) for Vg = VG + a(Vd + Vs)/2. Therefore,
this technique requires the design of a scale factor to minimize the nonlinearities, and the
generation of a large VG to ensure the triode operation for a given operating range.
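The scale factor in (19) and the resulting triode limit can be evaluated with a short sketch. The parameter values are hypothetical, the function names are introduced here for illustration, and the triode test uses the simple condition Vds < Vgs − VT with a constant VT, which is the same simplification that leads to the bound quoted above.

```python
import math


def scale_factor(gamma, v_b, phi):
    """Scale factor a from Eq. (19)."""
    return 1.0 + gamma / (2.0 * math.sqrt(v_b + phi))


def max_triode_vds(v_g_q, v_t, v_s, a):
    """Upper bound on Vds from the triode condition in the text (needs a < 2)."""
    return 2.0 / (2.0 - a) * (v_g_q - v_t - (1.0 - a) * v_s)


def in_triode(v_d, v_s, v_g_q, v_t, a):
    """Direct check of Vds < Vgs - VT with Vg = VG + a * (Vd + Vs) / 2."""
    gate = v_g_q + a * (v_d + v_s) / 2.0
    return (v_d - v_s) < (gate - v_s - v_t)


# Hypothetical process and bias values, for illustration only.
a = scale_factor(gamma=0.5, v_b=1.0, phi=0.7)
limit = max_triode_vds(v_g_q=3.0, v_t=0.7, v_s=0.0, a=a)
print(a, limit)
```

The closed-form bound and the direct check agree: drain voltages just below `limit` satisfy the triode condition and voltages just above it do not, which shows how a larger quiescent gate voltage VG extends the usable input range.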
4.2 Circuit Description
The circuit implementation of the FGRSGL is shown in Figure 24. In contrast to the floating-gate
implementations of the gate linearization [67] and common-mode linearization [66]
techniques, one of the input terminals of FGRSGL has to be maintained at a fixed potential,
or at AC ground. The use of a floating-gate transistor in this structure makes it possible to obtain
the scale factor and large gate voltages due to its capacitive coupling and charge storage capabilities.
The scale factor, a, is obtained by sizing the transistors and capacitors connected to
the floating-gate terminal. Since the common-mode voltage and the scale factor in this
structure are computed at the same time, the scale factor for this implementation can be
$$\chi = \frac{C_{g1}}{C_{g1} + C_{g2} + C_P + C_{MR}} \tag{23}$$
where Cg1 and Cg2 are the gate feedback capacitor and the trimming capacitor. Also, C P ,
and C MR are the parasitic capacitance of the peripheral circuit and input capacitance of MR ,
(Cgs), and gate-to-well capacitor (Cgw). In the triode region, Cgd = αCgs, where α = 1 − Vds(1 +
δ)/(Vgs − VT) and δ = γ/(2√(φB + Vsb)) [61]. Since a part of CMR contributes to Cg1, this
effect needs to be taken into account when designing the circuit with large transistors.
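The capacitive ratio in (23) can be used to size the trimming capacitor for a desired scale factor. The sketch below is a minimal illustration: the capacitance values are hypothetical, CP and CMR are treated as constants even though CMR varies with the operating point, and the helper names are not taken from the original design.

```python
def scale_ratio(c_g1, c_g2, c_p, c_mr):
    """Capacitive ratio chi from Eq. (23)."""
    return c_g1 / (c_g1 + c_g2 + c_p + c_mr)


def trim_capacitance_for(target_chi, c_g1, c_p, c_mr):
    """Trimming capacitance Cg2 that yields target_chi, from inverting Eq. (23)."""
    c_g2 = c_g1 / target_chi - c_g1 - c_p - c_mr
    if c_g2 < 0:
        raise ValueError("target ratio is too large for the given parasitics")
    return c_g2


# Hypothetical values: 1pF feedback capacitor, 20fF and 30fF parasitics.
c_g2 = trim_capacitance_for(0.7918, c_g1=1e-12, c_p=20e-15, c_mr=30e-15)
print(c_g2, scale_ratio(1e-12, c_g2, 20e-15, 30e-15))
```

The parasitic terms set an upper limit on the achievable ratio, which is one reason the contribution of CMR has to be considered when the resistor transistor is large.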
pMOS transistor are used to tune the charge on the floating-gate terminal without introducing
additional switches in the signal path. The resistance of FGRSGL is tuned by utilizing
ena. Vtun is used to enable the tunnelling mechanism to decrease the number of electrons;
and V sPROG and VdPROG are used to create the required voltage difference that is necessary
for the hot-electron injection mechanism to occur and increase the number of electrons on
Figure 24. The circuit implementation of the scaled-gate linearization scheme (FGRSGL). Vwell, Vs and
Vd are the well, source and drain voltages of MR, respectively. Also, VsPROG and VdPROG
are the source and drain voltages of the injection transistor, MP. Vs is kept constant at AC
ground, and Vd is used as the input of the resistor. Vg2 is held fixed during normal operation.
The resistance tuning is achieved by changing Vg2 or the charge on the floating gate. The
floating-gate charge is tuned by using the tunnelling junction, Ctun, connected to Vtun for
Fowler-Nordheim tunnelling, and employing MP for hot-electron injection.
In addition to the programming circuit, this structure has only one transistor and two
capacitors, resulting in a very compact circuit. The scale factor has to be chosen properly
to minimize the nonlinearities. However, no specific matching between the devices is
necessary. Therefore, the total area can easily be optimized for a given application.
Furthermore, since the computation is achieved by utilizing the capacitive coupling and charge
its features based on the measurement results. The measurements are obtained from a chip
that was fabricated in a 0.5µm CMOS process. The test structure is built to have one main
capacitor and thirteen trimming capacitors that are used to change the scaling factor.
The static measurements of the FGRSGL are shown in Figure 25a and are obtained by keeping
the source terminal at 2.5V and sweeping the drain terminal from 0V to 5V. The
scale factor, χ, for this experiment is chosen as 0.7918. Also, the well terminal of the circuit
is kept at 5V. After each programming step using tunnelling and injection, the experiment
[Figure 25 plots. (a) Output current (µA) versus source-to-drain voltage (V), −2.5V to 2.5V, with curves labelled Inject and Tunnel. (b) Output current (µA) versus source-to-drain voltage (V), −2.5V to 2.5V, for capacitive ratios increasing from 0.417 to 0.96.]
Figure 25. (a) Experimental results obtained with differently programmed quiescent gate voltage (VG )
for χ = 0.7918. VG is increased through injection to decrease the resistance. Similarly, VG is
decreased by using tunnelling to increase the resistance. (b) The effect of χ on the linearity
of the resistor. χ is increased from 0.417 to 0.96, and the implemented scale factors are 0.417,
0.4584, 0.5001, 0.5418, 0.5835, 0.6251, 0.6668, 0.7085, 0.7502, 0.7918, 0.8335, 0.8752, 0.9169,
and 0.96.
is repeated to observe the change in the resistance and linearity. Figure 25b illustrates the
effect of the scale factor on the linearity and resistance of the FGRSGL. It is observed that
the second-order nonlinearity is compensated well, especially when the scale factor is chosen
as 0.7918. The second-order nonlinearity is more apparent for scale factors smaller
than 0.7918. As the scale factor is increased up to 0.96, the nonlinearity increases again.
Therefore, the optimum value for the scale factor is found to be 0.7918.
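One way to quantify how well the second-order term is cancelled in sweeps like those of Figure 25b is to split a measured I-V curve into odd and even parts about zero bias: an ideal resistor is purely odd, and a residual quadratic term shows up in the even part. The sketch below assumes a voltage grid symmetric about 0V and uses a synthetic curve; the function name and all coefficients are illustrative, not taken from the measurements.

```python
def even_odd_parts(voltages, currents):
    """Split I(V), sampled on a grid symmetric about 0 V, into even and odd parts."""
    n = len(voltages)
    if n != len(currents):
        raise ValueError("voltages and currents must have the same length")
    for k in range(n):
        if abs(voltages[k] + voltages[n - 1 - k]) > 1e-9:
            raise ValueError("voltage grid must be symmetric about 0 V")
    even = [(currents[k] + currents[n - 1 - k]) / 2 for k in range(n)]
    odd = [(currents[k] - currents[n - 1 - k]) / 2 for k in range(n)]
    return even, odd


# Synthetic example (assumed numbers): 250 kOhm resistor plus a small quadratic term.
g1, g2 = 1 / 250e3, 0.2e-6              # A/V and A/V^2
v = [-2.5 + 0.5 * k for k in range(11)]  # -2.5 V ... +2.5 V
i = [g1 * x + g2 * x * x for x in v]

even, odd = even_odd_parts(v, i)
ratio = max(abs(e) for e in even) / max(abs(o) for o in odd)
print(f"even/odd peak ratio: {ratio:.3f}")
```

A ratio close to zero indicates good second-order cancellation; repeating the calculation for each scale factor would give a single number per curve to compare.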
The extracted resistances of the FGRSGL for differently tuned resistance values are shown
in Figure 26a. The scale factor is fixed at 0.7918, and the resistance is again changed by
using the tunnelling and injection mechanisms. It is observed that as more electrons are
injected onto the floating gate, which means increasing VG for a pMOS transistor, the FGRSGL
becomes more linear. This is mainly because the triode condition for the resistor is satisfied
with more margin at the increased VG values. As the resistor operates in the deep triode region, it allows
for larger voltage swings across its terminals. In addition, the relative nonlinearity of the
transistor decreases for higher VG values since θ0 in (22) reduces for higher VG values. The
extracted resistances for different scale factors are shown in Figure 26b. These sweeps
[Figure 26 plots. (a) Resistance (kΩ) versus source-to-drain voltage (V), −2.5V to 2.5V, with curves labelled Inject and Tunnel. (b) Resistance (kΩ) versus source-to-drain voltage (V), −2.5V to 2.5V, for ratios increasing from 0.417 to 0.96.]
Figure 26. Experimental results. (a) Extracted resistances of the FGRSGL tuned to different quiescent
gate voltages. (b) Extracted resistance of the FGRSGL for different scale factors, from 0.417 to
0.96, used to linearize the resistor. The sweeps show the voltage dependence of the resistor,
and illustrate the compensation of the second-order nonlinearity.
confirm the previous result that the optimum value of the scale factor for better linearity
is 0.7918. Also, the extracted resistances that have a smaller or larger scale factor than
0.7918 exhibit a non-symmetric behavior because the second-order nonlinearity
becomes the dominant source of the nonlinearity if it is not cancelled properly. This non-
symmetric behavior is also partially caused by the fact that Cgd and Cgs of MR have
The low-voltage characteristics of the FGRSGL are determined by decreasing the well voltage
down to 0.25V, as illustrated in Figure 27a. Each sweep in this plot is performed by
changing Vd from −Vwell to +Vwell while keeping Vs at Vwell/2. The sweeps are obtained for
Vwell equal to 0.25V, 0.5V, 1V, 2V, and 4V. It is observed that the linearity of the resistor is
preserved at low voltages even if the capacitive division factor to obtain the scale factor is
fixed at 0.7918. Since the well voltage changes the effective scale factor, the circuit should
be designed for the desired supply voltage that also determines the well voltage. However,
the results of this test show that the change in the effective scale factor due to the well
voltage does not alter the linearity characteristics of the resistor as much as the change due
[Figure 27 panels. (a) Output current (µA) versus normalized source-to-drain voltage (V/Vwell), −1 to 1, for Vwell = 0.25V, 0.5V, 1V, 2V, and 4V. (b) Power (dB) versus frequency (Hz), logarithmic axis from 10^3 to 10^6 Hz, with a 44.9 dB annotation. (c) Die photo, annotated 28.95 × 94.2 µm².]
Figure 27. (a) Experimental results obtained with different well voltages, Vwell. The x-axis of the plot
is normalized to show the relative change. (b) The linearity test of the FGRSGL for a 1kHz
sinusoidal input signal with 1Vpp amplitude. (c) Die photo of the FGRSGL.
The dynamic measurements of the FGRSGL are shown in Figure 27b, and are obtained by
using an off-chip inverting amplifier with a corresponding feedback resistor. A 1kHz sinusoidal
wave with 1Vpp amplitude is used for the test, and the scale factor of the resistor is set to
0.7918. The second-order harmonic distortion of the resistor for this test is measured as
44.9dB. This is mainly because the second-order distortion of the FGRSGL is not com-
Furthermore, the die photo of the fabricated FGRSGL circuit is shown in Figure 27c.
In the designed circuit, MR has a dimension of W/L = 19.5µm/1.2µm. Also, the main
[Figure 28 plots. (a) Horizontal axis: effective threshold voltage of the floating-gate transistor (V), 1V to 3.5V. (b) gm(t)/gm(0) versus time (days), logarithmic time axis, with a curve annotated "at 300°C".]
Figure 28. (a) Temperature coefficient of the FGRSGL for differently programmed threshold voltages.
(b) Stress test of the FGRSGL performed at 300°C and 325°C.
capacitor is 560fF and each trimming capacitor is 56fF. The total area of the test circuit
is 2727µm².
voltages is illustrated in Figure 28. The effective threshold voltages of the FGRSGL are
obtained from their gate sweeps. It is observed that this coefficient can be changed from
thermionic emission [62]. The resistance change over time can be found by using the
following equation
\[ \frac{g_m(t)}{g_m(t_0)} = \Phi(t,T) + \frac{\beta V_T}{g_m(t_0)}\left[\Phi(t,T) - 1\right] \tag{24} \]
where $\Phi(t,T) = \exp\left[-t\,v\,\exp\left(-\phi_B/kT\right)\right]$, gm = 1/R is the conductance, v is a relaxation fre-
mann’s constant. Figure 28 illustrates the stress test results. The worst case results are ob-
tained after the first stress test at 300°C. After the first test, the charge loss of the FGRSGL
is decreased considerably. The φB and v from these experiments are extracted as 0.9eV
and 60s⁻¹. Based on this worst-case data, it is calculated that the FGRSGL resistance drifts
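The retention factor in Eq. (24) can be evaluated directly from the extracted parameters. The sketch below is a minimal illustration: it uses the quoted barrier of 0.9eV and relaxation frequency of 60 s⁻¹ as defaults, takes the ratio βVT/gm(t0) as a caller-supplied number (its value is not given in this section), and the function names are assumptions made for the example.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def retention_factor(t_seconds, temp_kelvin, barrier_ev=0.9, relax_freq=60.0):
    """Phi(t, T) = exp(-t * v * exp(-phi_B / (k T))) from the model of Eq. (24)."""
    rate = relax_freq * math.exp(-barrier_ev / (K_B_EV * temp_kelvin))
    return math.exp(-t_seconds * rate)


def conductance_ratio(t_seconds, temp_kelvin, beta_vt_over_gm0, **kwargs):
    """gm(t)/gm(t0) per Eq. (24); beta_vt_over_gm0 stands for beta*V_T/gm(t0)."""
    p = retention_factor(t_seconds, temp_kelvin, **kwargs)
    return p + beta_vt_over_gm0 * (p - 1.0)


DAY = 86400.0
for celsius in (27.0, 300.0, 325.0):
    phi_1day = retention_factor(DAY, celsius + 273.15)
    print(f"T = {celsius:5.1f} C  Phi after one day = {phi_1day:.6f}")
```

The exponential dependence on temperature is what makes accelerated tests at 300°C and above practical: the same model predicts a much slower charge loss near room temperature.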
is presented. This resistor exploits the floating-gate transistor properties and the scaled-gate
linearization technique. Better than 7-bit linearity is obtained for a 1Vpp sinusoidal input.
The circuit does not consume additional power for the offset and feedback generation, and is
thus well suited to ANN systems and low-power applications. Furthermore, we
showed that for a fixed scale factor, the well voltage can be reduced while still preserving
the linearity of the resistor. Therefore, this CMOS resistor can be easily integrated into
low-voltage applications.
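The 7-bit figure can be related to the 44.9dB measured in Figure 27b. Assuming the conventional data-converter relation between a signal-to-distortion ratio in dB and an effective number of bits (an assumption; the text does not state how the bit count was derived), the arithmetic is:

```latex
\[
N_{\mathrm{eff}} = \frac{\mathrm{SDR}_{\mathrm{dB}} - 1.76}{6.02}
                 = \frac{44.9 - 1.76}{6.02} \approx 7.2\ \text{bits}
\]
```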
CHAPTER 5
TUNABLE HIGHLY LINEAR FLOATING-GATE CMOS
RESISTOR USING COMMON-MODE LINEARIZATION
TECHNIQUE
The linearity and operating range of the resistors are the most crucial features for highly linear
applications that require a high signal-to-noise-and-distortion ratio. In this work, we propose a
tunable CMOS resistor that can be suitably employed in highly linear circuits. This CMOS
resistor operates in the triode region, utilizes the common-mode linearization technique
[18], and achieves a compact and power-efficient circuit implementation by employing
floating-gate MOS transistors. In the next section, we explain the common-mode linearization
strategy and analyze its effect. Subsequently, we describe the implementation of a
tunable floating-gate resistor using the common-mode linearization (FGRCML). After that,
we present the experimental results of this circuit. In the last part of this chapter, we compare
this resistor with previously reported resistors and discuss their characteristics.
in the triode region and these are identified as the body effect, the mobility degradation,
and the fundamental quadratic component due to the common-mode of the drain and source
voltages. These nonlinearities are mostly dependent on the common-mode of the input
transistor [18].
The common-mode linearization scheme is illustrated in Figure 29, and exploits the fact
that the linearity of a single transistor can be greatly improved by applying the common-
mode signal (with the addition of their corresponding quiescent voltages) to the gate and
body terminals [18]. Similar to the gate linearization, this technique also requires
vds < 2(VG − vs − VT) to operate in the triode region, where vds is the drain-to-source voltage, VG
Figure 29. Common-mode linearization technique [18] applied to an nMOS transistor in the triode
region. This method minimizes the nonlinearities of a MOS transistor by modulating
the body and gate terminals with the common-mode voltage. vd and vs are the drain and
source voltages, respectively. VG and VB are the tunable quiescent gate and body voltages,
and vc is the common-mode voltage, vc = (vd + vs)/2.
is the quiescent gate voltage, vs is the source voltage, and VT is the threshold voltage. In
this technique, the gate and body voltages, vg and vb, are defined as
\[ v_g = V_G + v_c, \qquad v_b = -V_B + v_c \tag{25} \]
where VB is the quiescent body voltage, and vc is the common-mode voltage and equal to
(vd + vs)/2.
\[ V_T = V_{FB} + \phi + \gamma\sqrt{V_B + \phi} \tag{26} \]
\[ \theta_2 = \frac{\theta}{1 + \theta\left(V_G - V_{FB} - \phi + \gamma\sqrt{V_B + \phi}\right)} \tag{27} \]
\[ \mu_2 = \frac{\mu_0}{1 + \theta\left(V_G - V_{FB} - \phi + \gamma\sqrt{V_B + \phi}\right)} \tag{28} \]
where VFB is the flat-band voltage, φ is the surface potential, γ is the body-effect coefficient,
θ is the mobility degradation factor, and µ0 is the carrier mobility. As suggested in [18] and
explained in Appendix-I, by using the above equations, the drain current for
$96(V_B+\phi)^{3/2} \gg \theta_2 \gamma v_{ds}^3$ can be approximated as
\[ I_d = \frac{\mu_2 C_{ox} W}{L}\left\{ [V_G - V_T]\,v_{ds} + \frac{\gamma\,(1 + \theta_2[V_G - V_T])}{96\,(V_B + \phi)^{3/2}}\,v_{ds}^3 \right\} \tag{29} \]
where Cox is the gate capacitance per unit area, W is the channel width, and L is the channel
length. The above result is remarkable in the sense that the inherent nonlinearities of a
MOS transistor can be reduced down to a cubic-order term. With a reasonable selection
of the quiescent gate and bulk voltages, the linear region of a MOS transistor can be greatly
extended. After ignoring the higher order terms, the resistance of the linearized element
can be expressed as
\[ R = \frac{L}{\mu_2 C_{ox} W (V_G - V_T)} \tag{30} \]
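The model of Eqs. (26)-(30) can be evaluated numerically to see how small the residual cubic term is relative to the linear term. The sketch below is illustrative: the `Device` parameter values are hypothetical placeholders rather than extracted process data, and the function names are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical long-channel device parameters (illustrative values only)."""
    v_fb: float = -0.9     # flat-band voltage (V)
    phi: float = 0.7       # surface potential (V)
    gamma: float = 0.5     # body-effect coefficient (V^0.5)
    theta: float = 0.1     # mobility degradation factor (1/V)
    mu0: float = 0.02      # carrier mobility (m^2/Vs)
    c_ox: float = 2.5e-3   # gate capacitance per unit area (F/m^2)
    w_over_l: float = 0.1  # W/L ratio


def cml_parameters(dev, v_g, v_b):
    """Return (V_T, theta_2, mu_2) following Eqs. (26)-(28) as printed."""
    root = math.sqrt(v_b + dev.phi)
    v_t = dev.v_fb + dev.phi + dev.gamma * root
    denom = 1.0 + dev.theta * (v_g - dev.v_fb - dev.phi + dev.gamma * root)
    return v_t, dev.theta / denom, dev.mu0 / denom


def drain_current(dev, v_ds, v_g, v_b):
    """Eq. (29): linear term plus the residual cubic term."""
    v_t, theta2, mu2 = cml_parameters(dev, v_g, v_b)
    cubic = dev.gamma * (1.0 + theta2 * (v_g - v_t)) / (96.0 * (v_b + dev.phi) ** 1.5)
    return mu2 * dev.c_ox * dev.w_over_l * ((v_g - v_t) * v_ds + cubic * v_ds ** 3)


def resistance(dev, v_g, v_b):
    """Eq. (30): resistance once the cubic term is ignored."""
    v_t, _, mu2 = cml_parameters(dev, v_g, v_b)
    return 1.0 / (mu2 * dev.c_ox * dev.w_over_l * (v_g - v_t))


dev = Device()
r = resistance(dev, v_g=4.0, v_b=2.0)
for v_ds in (0.5, 1.0, 2.0):
    i_d = drain_current(dev, v_ds, v_g=4.0, v_b=2.0)
    print(f"vds = {v_ds:.1f} V  deviation from vds/R = {(i_d * r / v_ds - 1) * 100:.3f} %")
```

With these placeholder values the deviation from the ideal vds/R stays well below one percent over the swept range, and it shrinks further as VG − VT or VB is increased.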
In (30), VT does not depend on the common-mode of the input voltages, thus
30. FGRCML operates as a tunable floating resistor by exploiting the features of the floating-
gate transistors. The common-mode voltage of the input signals is computed by using the
feedback capacitors, which couple the drain and source voltages to the gate terminal. In
addition, the charge stored on the floating-gate terminal creates the required quiescent gate
voltage to satisfy the triode condition and linearity requirement. As shown in Figure 30a,
Vtun is used to enable the tunnelling mechanism to decrease the number of electrons on
the floating-gate terminal of MR. Also, VsPROG and VdPROG are used to create the required
voltage difference that is necessary for the hot-electron injection mechanism to occur and
increase the number of electrons at the gate terminal of MR . As a result, by using (2) the
where Vp is the effect of the stored charge and the capacitive coupling from the peripheral
circuit that includes Ctun and CMP. Ctun is the tunnelling junction capacitance, and CMP is
the input capacitance of the injection transistor, MP. Cgs becomes equal to Cgd for large
quiescent gate voltages. Therefore, the necessary condition for an accurate common-mode
computation is to create a large quiescent gate voltage and to keep Cg much larger than CMP
and Ctun so that the floating-gate potential is close to
\[ V_{fg} \simeq \frac{V_s + V_d}{2} + V_p \tag{32} \]
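The sensitivity of the common-mode computation to the peripheral capacitances can be illustrated with a generic capacitive-divider model. This is a sketch under stated assumptions: it assumes two equal feedback capacitors Cg from the source and drain plus parasitics Ctun and CMP tied to fixed potentials, which is a textbook charge-conservation form and not necessarily the exact expression used in the dissertation; the parasitic values are hypothetical.

```python
def floating_gate_voltage(v_s, v_d, c_g, c_tun, c_mp, v_offset=0.0):
    """Generic capacitive-divider estimate of the floating-gate potential.

    v_offset lumps the stored charge and the coupling from fixed nodes,
    playing the role of V_p in Eq. (32).
    """
    c_total = 2.0 * c_g + c_tun + c_mp
    return c_g * (v_s + v_d) / c_total + v_offset


def common_mode_gain(c_g, c_tun, c_mp):
    """Slope of V_fg versus one input terminal; the ideal value is 0.5."""
    return c_g / (2.0 * c_g + c_tun + c_mp)


# 450 fF feedback capacitors with hypothetical parasitics (in fF).
for c_tun, c_mp in [(0.0, 0.0), (5.0, 10.0), (20.0, 40.0)]:
    gain = common_mode_gain(450.0, c_tun, c_mp)
    print(f"Ctun={c_tun:5.1f} fF  CMP={c_mp:5.1f} fF  slope per input = {gain:.4f}")
```

Any parasitic loading pulls the slope below 0.5, which is the scaling error discussed in the following paragraph.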
The scaling error introduced by the common-mode computation increases the common-
mode dependence of the circuit. The circuit, shown in Figure 30a, employs a well feedback
in addition to the gate feedback to further reduce the inherent nonlinearities of a MOS
transistor. The common-mode computation circuit is illustrated in Figure 30b. This circuit
is a source follower and used to drive the well terminal of MR . Similar to the gate common-
mode computation, two capacitors are used to compute the common-mode voltage at the
input of the source follower. Since the well voltage has to be larger than the input voltages,
Vd and V s , to prevent the drain and source junctions from being forward biased, an offset
voltage must be created at the input of the follower. This is achieved by programming (in
this case by tunnelling) the charge stored on the floating gate of the follower. In addition, if
a rail-to-rail operation is required with this resistor, the source follower needs to be powered
For an accurate common-mode well feedback voltage computation, the well feedback
capacitors (Cg ) have to be sized relative to the input capacitance of the source follower. In
this case since the input transistor of the source follower operates in the saturation region,
the input capacitance approximately becomes Cgs = 2Cox A/3, where A is the total area of
If there is a mismatch between the gate feedback capacitors, or if there is a scaling error,
then an error term, ε, is introduced to (29). This error can be approximated as ε(vd² − vs²)/2,
As a result, the error term in the drain current gives rise to a common-mode voltage depen-
dence. In modern processes, a matching accuracy better than 0.1% can be obtained with the
Figure 30. (a) Circuit implementation of the tunable floating-gate resistor. Vs and Vd are the source
and drain voltages of MR, respectively. This resistor is tuned by changing the quiescent
gate voltage. This is achieved by using the tunnelling junction connected to Vtun, and the
injection transistor that has source voltage VsPROG and drain voltage VdPROG. The feedback
capacitors (Cg) are used to compute the common-mode gate voltage. Also, the well feedback
voltage is computed by the common-mode circuit. (b) The common-mode computation circuit.
This circuit consists of a source follower, programming circuitry and input capacitors
(Cg). The input capacitors compute the common-mode voltage and apply it to the input of the
buffer. Vbias is used to set the current through the circuit, and Vcascode is employed to minimize
the effect of the output voltage on the bias current. The computed common-mode
voltage is tracked by the buffer circuit and then applied back to the well.
capacitors, and this together with high quiescent gate voltages readily allow for the circuit
surements are obtained from the chips that were fabricated in a 0.5µm CMOS process.
A 16-bit DAC is used for the measurements to characterize the linearity and voltage-
The experiments for the static measurements are performed by keeping one terminal of
the floating-gate resistors at 2.5V, and then sweeping the other terminal between 0.5 and
4.5V. Also, the source follower of the FGRCML is powered with 6V during the experiments.
After each programming step, in which the quiescent gate voltage is tuned, the experiment is
repeated to observe the change in the resistance and linearity. This is achieved by tuning
[Figure 31 plots. (a) Output current (µA) versus source-to-drain voltage (V), −2V to 2V, with curves labelled Inject and Tunnel. (b) Resistance (MΩ) versus source-to-drain voltage (V), −2V to 2V, with injection and tunnelling directions marked.]
Figure 31. Experimental results. The measurements are performed by keeping one of the terminals at
2.5V and sweeping the other terminal from 0V to 5V. These measurements are obtained for
differently tuned quiescent gate voltages. The quiescent gate voltage is increased through
injection to decrease the resistance, and decreased by using tunnelling to increase the
resistance. (a) The output current vs. input voltage sweeps. (b) The resistance vs. input voltage
sweeps. Extracted resistances of the FGRCML tuned to different quiescent gate voltages.
the amount of stored charge on the floating-gate terminal of MR . The I-V curves of the
FGRCML are shown in Figure 31a. The FGRCML exhibits less variation for its smaller
resistance values as shown in Figure 31b. This is mainly because the relative effect of
the common-mode voltage on the resistance becomes less for higher VG values, and VT
stays almost fixed in the operating range of the resistor. In addition, having a larger VG
helps the transistor to stay in the deep triode region even for large differential input signals.
Moreover, the nonlinearities of this structure are a function of the quiescent gate voltage,
and they can be better suppressed for large gate quiescent voltages. This is especially true
for the FGRCML , since θ2 becomes very small for large gate quiescent voltages.
The well voltage of MR affects the resistance and the linearity of the FGRCML. When
the offset voltage of the source follower is increased from −0.5V to 3V, the resistance of the
FGRCML changes by ±15% as shown in Figure 32. Here, the Vwell offset is defined as the voltage
difference between the source/drain and well voltages of MR when the drain/source voltage
is 5V. For that purpose, the source follower is powered with 9V to observe the resistance
variation of the FGRCML when one of its input terminals is swept from 0V to 5V and the
[Figure 32 plots. (a) Resistance curves versus source-to-drain voltage (V), −2.5V to 2.5V, for different Vwell offsets (one curve labelled Vwell offset = −0.5). (b) Source-follower output versus one of the input voltages of the buffer (V), 0V to 5V.]
Figure 32. (a) Effect of the well offset voltage on the resistance of the FGRCML . (b) Output voltage of
the source follower when one of the FGRCML inputs is swept from 0V to 5V. The slope is
measured to be 0.486.
other terminal is fixed at 2.5V. The output voltage of the source follower and its offset
programming using Fowler-Nordheim tunnelling and hot-electron injection are depicted
in Figure 32b. It is observed that the slope of the well common-mode computation is only
0.486 and not 0.5. This difference causes asymmetry in the output current of the FGRCML
The dynamic measurements of the FGRCML are obtained by using an off-chip inverting
amplifier with a corresponding feedback resistor (matched to the resistance of the on-chip
resistor) as shown in Figure 33a. For this purpose, a sine-wave with 2.5V offset and 1Vpp
amplitude is used to test the transient behavior as well as the distortion level of the FGRCML.
The maximum frequency of the input signal that can be used with the FGRCML depends on
the resistance and the input capacitance of the FGRCML. It is also important to use enough
bias current for the source follower so that it can drive the well terminal of MR at the given
frequency. The FGRCML transient response for a 100kHz, 1Vpp input sine-wave is shown
in Figure 33b. For this transient test, the source follower is biased with 10µA. Moreover,
the total nonlinearity and harmonic distortion of the FGRCML are tested using this test setup
and utilizing the 16-bit DAC. It is observed that the total nonlinearity of the FGRCML with
W/L = 1.2/7.5 in the full operating range of ±2V can be held below 1% while changing its
[Figure 33 panels. (a) Inverting-amplifier test schematic. (b) Transient waveform versus time (µs), 0 to 40µs.]
Figure 33. (a) Inverting amplifier used to test the transient behavior and distortion level of the FGRCML.
(b) Transient measurement data of the FGRCML for a 100kHz, 1Vpp sine-wave.
When the FGRCML is tuned to have a resistance around 100kΩ, its total harmonic distortion
for sine amplitude levels from 0.3V to 3V increases from 0.012% to 0.18% as illustrated
in Figure 34b. Although a better linearity performance is possible with this structure,
the measured distortion level of the FGRCML is higher because of the inaccurate
computation in the well computation circuit. Since the common-mode computation by the source
follower results in 0.486, the second-order harmonic of the FGRCML output in particular
increases considerably. This reasoning is justified by testing the FGRCML nonlinearity and
its THD for a range of well feedback ratios. This test is performed by supplying the well
As shown in Figure 35a, the nonlinearity of the FGRCML in the full operating range can
be reduced below 0.1% when the well feedback ratio is very close to 0.5. Similarly, the
THD of the FGRCML for 1Vpp input signals becomes around 0.005% for a well feedback
ratio of 0.5. Therefore, the well feedback ratio is found to be the main source of error
in this resistor structure. Furthermore, the well offset voltage also affects the linearity of
the FGRCML. The second and third-order harmonic distortions of the FGRCML are also a
function of the well offset voltage as illustrated in Figure 36a. The third-order harmonic
[Figure 34 plots. (a) Nonlinearity (%) on a logarithmic axis versus resistance (kΩ), 100kΩ to 800kΩ. (b) THD (%) versus input amplitude (V), 0V to 3V.]
Figure 34. (a) Nonlinearity of the FGRCML for differently tuned resistance values. The nonlinearity
is measured in the full operating range of ±2V. The resistance is tuned by changing the
quiescent gate voltage, which is increased through injection (to decrease the resistance) and
decreased by using tunnelling (to increase the resistance). (b) Total harmonic distortion of
the FGRCML for a sine-wave with different amplitude levels.
distortion can be reduced from −60dB to −95dB when the well offset is increased from
0.75V to 2.25V. However, the second-order harmonic distortion is mainly caused by the
inaccuracy of the well feedback ratio, thus it does not change much with the well offset
voltage.
Lastly, the change of the FGRCML resistance with the input voltage is tested for smaller
transistor lengths. Since the initial assumption in building this structure is to have a long-
channel transistor, the linearity of the FGRCML decreases and its resistance changes much
more for smaller channel lengths as depicted in Figure 36b. The die photo of the fabricated
chip is shown in Figure 37. The dimensions of MR are W/L = 1.5µm/15µm, and the values
of each gate and well feedback capacitor are 450fF and 1970fF, respectively. These
capacitors can be optimized depending on the input capacitance of the transistors. Also, an
auxiliary bias generator circuit is used to generate the bias current and the cascode voltage
[Figure 35 plots. (a) Logarithmic vertical axis versus well feedback ratio, 0.35 to 0.7. (b) Logarithmic vertical axis versus well feedback ratio, 0.3 to 0.7.]
Figure 35. Measurement results for a range of well feedback ratios. The well potential of MR is supplied
from off-chip. (a) The nonlinearity of the FGRCML in the full operating range of ±2V for a
range of well feedback ratios. (b) Total harmonic distortion of the FGRCML for a 1Vpp input
sine wave for a range of well feedback ratios.
5.4 Discussion
The results obtained from the presented CMOS resistor make this structure very suitable
tions are summarized to compare FGRCML with these implementations. These resistors are
has a resistance that is independent of the threshold voltage of the CMOS transistors [7].
This floating resistor achieves 1% THD for 2.4Vpp. It is implemented with 20 transistors,
and it allows tuning from 56kΩ to 112kΩ (can be scaled for chosen W/L). The main
shortcoming of this implementation is that it requires a large area due to the number
Moreover, a voltage-controlled MOS resistor based on the bias-offset technique operates
within 80% of the supply range, and achieves ±1% THD for 8Vpp input signals [12]. Nine
transistors are used to implement this compact MOS resistor. Although this resistor offers
[Figure 36 plots. (a) Harmonics (dB), −55dB to −100dB, versus well offset (V), 0.5V to 2.5V, with second-order and third-order harmonic curves. (b) Normalized resistance, 0.9 to 1.15, versus source-to-drain voltage (V), −2V to 2V, for W/L = 1.2/0.6, 1.2/1.2, and 1.2/7.5.]
Figure 36. (a) The second and third-order harmonics of the FGRCML for a range of well offset voltages.
The well offset voltage is changed by programming the offset voltage of the source follower
by using the injection and tunnelling mechanisms. (b) Normalized resistance of the FGRCML
circuits vs. their input voltage. The length of MR is sized as 0.6µm, 1.2µm, and 7.5µm.
division technique [68] allows for high linearity even with large voltage swings. It yields
0.01% THD for 2.5Vpp signals. However, 4 transistors, 1 amplifier, and 4 resistors increase
achieves the best linearity performance among the reported resistors. While around 0.0032%
THD is possible with this structure for 1Vpp input signals, its size and power consumption
are the main disadvantages of this design. Also, the BiCMOS process increases the cost of
The floating-gate resistor that is reported in this work utilizes the properties of MOS
two programming circuitry. This structure results in increased linearity, which is necessary
for most highly linear applications, and reduced power consumption since only one
source follower needs to be powered. At most 72dB (for 1Vpp) of linearity is obtained with
this FGRCML design. It is observed that the accuracy of the well feedback computation is
the limiting factor for the resistor linearity. Also, the implementation of the FGRCML in
[Figure 37: die photo with labelled blocks: bias generator, buffer, triode transistor, buffer input capacitors, gate feedback capacitors, and programming circuitry of the buffer and triode transistor.]
CMOS processes with feature sizes smaller than 0.35µm can be achieved by using thick-
CMOS resistor by making use of the floating-gate transistor features. We showed that the
tuning and operating ranges of the resistor are extended by employing the analog storage
characteristic of the floating-gate transistors. Also, we showed that FGRCML offers a com-
pact and power efficient implementation that yields around 72dB of linearity. The linearity
and power efficiency of this resistor make it suitable for highly linear circuit applications.
Table 1. Experimental results of tunable CMOS resistors (T: transistor, R: resistor, C: capacitor,
B: buffer, LS: level shifter, A: amplifier, PC: programming circuitry)

Design           [7]            [12]          [68]             [69]                FGRCML
Process          2µm CMOS       3µm CMOS      2µm CMOS         BiCMOS              0.5µm CMOS
Power supply     10V            10V           5V               -                   6V
Operating range  2.4V           8V            3V               10V                 4V
Tuning range     56 to 112 kΩ   -             ±5%              -                   100 to 800 kΩ
THD              1% (2.4Vpp)    ±1% (8Vpp)    0.01% (2.5Vpp)   < 0.0032% (1Vpp)    0.024% (1Vpp)
Components       20T            9T            4T+1A+4R         1T+4B+4LS           4T+4C+2PC
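The THD percentages in Table 1 can be related to the dB figures quoted in the text. The sketch below assumes that THD is an amplitude ratio, so that the conversion uses 20·log10; the function name is an assumption made for the example.

```python
import math


def thd_percent_to_db(thd_percent):
    """Convert a THD amplitude ratio given in percent to decibels."""
    if thd_percent <= 0:
        raise ValueError("THD must be positive")
    return 20.0 * math.log10(thd_percent / 100.0)


# THD values as listed in Table 1 (the [69] entry is an upper bound).
for label, thd in [("[7]", 1.0), ("[68]", 0.01), ("[69]", 0.0032), ("FGRCML", 0.024)]:
    print(f"{label:8s} {thd:7.4f} %  ->  {thd_percent_to_db(thd):6.1f} dB")
```

Under this assumption the 0.024% entry for the FGRCML corresponds to roughly −72dB, in line with the linearity figure quoted for a 1Vpp input.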
CHAPTER 6
DESIGN OF HIGHLY LINEAR AMPLIFIER AND MULTIPLIER
CIRCUITS USING A CMOS FLOATING-GATE RESISTOR
The linearity of the highly linear amplifier and multiplier circuits can be increased by em-
ploying the highly linear tunable CMOS resistor described in Chapter 5. This resistor can
serve as an alternative to passive resistors and allow the realization of a dynamic and linear
resistor while facilitating a reduction in system size and cost. In the next section, we ex-
plain how this resistor can be used to increase the linear range in differential amplifiers and
using high gain amplifiers to achieve the voltage-to-current conversion without introducing
additional distortion. Also, it employs the FGRCML circuit as a variable resistor, Rvar . Each
high gain amplifier consists of an input differential amplifier and folded-cascode output
stage that results in a high gain [70]. With the use of these amplifiers, NMOS current
mirrors achieve boosted gm and conduct the current (Ip3 + Ip2/2) plus the signal current $i_s$,
which is the current created by the differential voltage, vin, across Rvar. When the finite open-
loop gain, A0, of the amplifiers is taken into account, the signal current can be expressed
as
\[ i_s = \frac{v_{in}}{R_{var} + 2/\left(g_m(1 + A_0)\right)} \cdot \frac{A_0}{1 + A_0}. \tag{34} \]
This equation shows that for more accurate voltage-to-current conversion and less distor-
tion, high gain is required. In order to prevent the capacitive loading at the resistor stage
and to improve the frequency response and linearity of the circuit, the feedback capacitors
of the FGRCML are buffered by employing the same source follower used for the well feed-
back shown in Figure 38b. Similar to the gate common-mode computation of the FGRCML ,
two gate capacitors are used to compute the common-mode voltage at the input terminal
of the source follower. This structure employs a highly linear source follower [71] to drive
the well terminal. This open-loop source follower is preferred because of its wider band-
width than the closed loop followers and its high linearity. Since the well voltage has to
be larger than the input voltages, Vd and V s , to prevent the drain and source junctions from
being forward biased, an offset voltage must be created at the input of the follower. This is
achieved by programming (in this case by tunnelling) the charge at the follower gate input
terminal enough to obtain the voltage needed for the operation of the resistor. Additionally,
if a rail-to-rail operation is required, then a higher supply voltage needs to be used for the
source follower.
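The dependence of the voltage-to-current conversion on the amplifier gain in Eq. (34) can be explored numerically. The sketch below is illustrative: the resistance and transconductance values are hypothetical, and the function names are assumptions made for the example.

```python
def signal_current(v_in, r_var, g_m, a0):
    """Eq. (34): i_s = v_in / (R_var + 2/(g_m (1 + A0))) * A0 / (1 + A0)."""
    return v_in / (r_var + 2.0 / (g_m * (1.0 + a0))) * a0 / (1.0 + a0)


def conversion_error(r_var, g_m, a0):
    """Relative shortfall of i_s with respect to the ideal value v_in / R_var."""
    return 1.0 - signal_current(1.0, r_var, g_m, a0) * r_var


# Hypothetical operating point: R_var = 100 kOhm, g_m = 100 uS.
R_VAR, G_M = 100e3, 100e-6
for a0 in (10.0, 100.0, 1000.0, 10000.0):
    err = conversion_error(R_VAR, G_M, a0)
    print(f"A0 = {a0:7.0f}  conversion error = {err * 100:.4f} %")
```

Both error contributions, the series term 2/(gm(1 + A0)) and the factor A0/(1 + A0), vanish as A0 grows, which is the reason a high-gain amplifier is used.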
For this application, the folded cascode amplifier is preferred over grounded amplifiers
[72] not only to obtain a higher gain but also to avoid the additional Vsg drop that can
counteract the effect of the injected charge at the gate of the pMOS floating-gate resistor.
In addition, the input transistors of the amplifier are chosen to be pMOS to utilize their
n-well for eliminating the body effect and improving the noise performance.
shown in Figure 38c. One of the source/drain terminals of the floating-gate transistor is
fixed and used as an output, Vout , and the other terminal is employed as an input, Vd . The
second input of the multiplier, Vr , is supplied from the feedback gate capacitor, Cg1 . While
conductance of the FGRCML . For this purpose, the FGRCML has to be put into the triode
regime by injecting enough electrons to the gate terminal so that the transistor stays in the
Figure 38. (a) Circuit implementation of the variable gain amplifier. The FGRCML is used as a vari-
able resistor, Rvar . (b) The common-mode computation circuit. It consists of a highly linear
source follower [71], programming circuitry and input capacitors. Input capacitors com-
pute the common-mode voltage and apply it to the input of the follower. The computed
common-mode voltage is tracked by the follower circuit and applied to the well. (c) Two
quadrant multiplier circuit implementation. Vr and Vd are the input voltages and Vout is the
output voltage, and the output of the circuit is obtained in the form of current.
linear region for the required input swing. In addition, the gate capacitor modulates the
resistance of the circuit by changing the effective voltage at the gate terminal, and this can
be expressed as
$$I_{out} = \frac{\mu_0 C_{ox} W}{L}\left[\frac{V_r}{2} + V_G - V_T\right](V_d - V_{out}) \tag{35}$$
where VG in this equation is defined as the effect of the charge at the gate and capacitor
couplings when Vr is set to Vout. Assuming the common-mode voltage of Vr is Vout, the
amplitude of Vr needs to be smaller than 2(VG − VT) so that the multiplier stays in the triode
region.
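The triode-region relation of (35) and the resulting limit on the Vr amplitude can be checked numerically. The following is a minimal sketch, not the author's characterization code: the lumped constant K (standing for the mobility, oxide capacitance and W/L prefactor) and the values of VG and VT are assumptions chosen only for illustration, and Vr is taken relative to its common-mode level.

```python
def multiplier_current(v_r, v_d, v_out, v_g, v_t, k):
    """Eq. (35): I_out = K * (V_r/2 + V_G - V_T) * (V_d - V_out).

    v_r is measured relative to its common-mode level (assumed equal to V_out);
    k lumps the mobility, oxide capacitance and W/L prefactor (hypothetical value).
    """
    return k * (v_r / 2.0 + v_g - v_t) * (v_d - v_out)


def max_vr_amplitude(v_g, v_t):
    """Largest V_r amplitude for which the bracket in Eq. (35) stays positive."""
    return 2.0 * (v_g - v_t)


if __name__ == "__main__":
    K = 50e-6            # A/V^2, assumed
    V_G, V_T = 2.0, 0.9  # V, assumed
    print("V_r amplitude limit (V):", max_vr_amplitude(V_G, V_T))
    for v_r in (-1.0, 0.0, 1.0):
        i_out = multiplier_current(v_r, 0.1, 0.0, V_G, V_T, K)
        print(f"V_r = {v_r:+.1f} V -> I_out = {i_out * 1e6:.2f} uA")
```

With these assumed numbers the effective conductance scales linearly with Vr, which is the gate-modulation behavior the multiplier relies on.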
[Figure 39 plots: differential output, Vout1 − Vout2 (V), versus (a) single-ended input and (b) differential input, Vin1 − Vin2 (V).]
Figure 39. Experimental results of the highly linear amplifier. The output current of the amplifier is
converted to voltage by using 10kΩ on-chip resistors. Input-output DC characteristics of
the amplifier for differently tuned FGRCML values. (a) Differential output response of the
amplifier to a single-ended input. Vin1 is used as an input while Vin2 is kept constant at 2.5V.
(b) Differential output response of the amplifier to a differential input.
This multiplication gives two terms, Vr (Vd − Vout ) and (VG − VT )(Vd − Vout ). The second
term can be removed by using two multiplier circuits, and then by applying fully differential
signals to their input capacitors. In this case, multipliers must be programmed to the same
resistance value for accurate offset cancellation. The subtraction of output currents of the
terms of (Vr1 − Vr2 )(Vd − Vout ), where Vr1 and Vr2 are the differential inputs.
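The offset cancellation described here can be illustrated with a short numerical sketch. It assumes each multiplier follows the triode relation I = K(Vr/2 + VG − VT)(Vd − Vout) and that both are programmed to the same K and VG; the numerical values are hypothetical.

```python
def multiplier_current(v_r, v_d, v_out, v_g, v_t, k):
    """Single multiplier: I = K * (V_r/2 + V_G - V_T) * (V_d - V_out)."""
    return k * (v_r / 2.0 + v_g - v_t) * (v_d - v_out)


def differential_multiplier(v_r1, v_r2, v_d, v_out, v_g, v_t, k):
    """Subtract the currents of two identically programmed multipliers.

    The (V_G - V_T)(V_d - V_out) term is common to both and cancels, leaving
    a current proportional to (V_r1 - V_r2) * (V_d - V_out).
    """
    i1 = multiplier_current(v_r1, v_d, v_out, v_g, v_t, k)
    i2 = multiplier_current(v_r2, v_d, v_out, v_g, v_t, k)
    return i1 - i2


if __name__ == "__main__":
    K, V_T = 50e-6, 0.9  # assumed values
    for v_g in (2.0, 3.0):
        i_diff = differential_multiplier(0.3, -0.3, 0.2, 0.0, v_g, V_T, K)
        print(f"V_G = {v_g} V -> I_diff = {i_diff * 1e6:.3f} uA")
```

The printed difference current does not depend on VG, which is why both multipliers must be programmed to the same resistance for the cancellation to be accurate.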
surements are obtained from the chips that were fabricated in a 0.5µm CMOS process.
The DC characteristics of the highly linear amplifier for single-ended and differential
inputs are shown in Figures 39a and 39b, respectively. It is shown that it is possible to apply
2.5Vpp single-ended and differential inputs. This range is mainly limited by the cascode
transistors of the amplifier. The output current of the highly linear amplifier is converted to
[Figure 40 plots: (a) distortion versus gain and versus differential output voltage amplitude (V); (b) gain (dB) versus frequency (Hz).]
Figure 40. Experimental results of the highly linear amplifier. (a) Total harmonic distortion of the am-
plifier for differential input signals. The upper curve represents the total harmonic distor-
tion of the amplifier for different gains, which is defined as 10KΩ/RFGR in this context. Out-
put voltage amplitude is fixed at 1V pp for distortion measurements and the gain is changed
by tuning the FGRCML to different resistance values. The lower curve illustrates the distor-
tion levels of the amplifier for a range of output voltage amplitudes. For this measurement,
the gain is fixed at 1.5 by tuning the FGRCML . (b) The frequency response of the amplifier
for different gains obtained by tuning the resistance of the FGRCML .
voltage by using on-chip 10kΩ resistors, and then buffered for off-chip reading. Total harmonic
distortion (THD) of this amplifier for a range of signal amplitudes and amplifier gains
is illustrated in Figure 40a. The amplifier can yield 0.018% THD for 1Vpp differential input.
Increases in the input voltage amplitude and in the FGRCML resistance cause degradation in
Furthermore, the frequency sweeps of the amplifier for differently tuned FGRCML re-
sistance values are shown in Figure 40b. The amplifier has a 3dB frequency around 1MHz,
and this limitation is mainly caused by the buffer circuit as well as breadboard parasitics.
Finally, the dynamic results of the multiplier circuit are illustrated in Figure 41. The
output current of the multiplier is converted to voltage for off-chip reading. It is shown
that the output of the multiplier fits well with the theoretical results. The linearity and
linear range of the multiplier can be improved by increasing VG in (35) since FGRCML in
the multiplier circuit becomes more linear. There are two design issues with the FGRCML
[Figure 41 plot: theoretical amplitude (upper panel) and measured voltage (V) (lower panel) versus time (ms).]
Figure 41. Output of the multiplier to a 1kHz, 1Vpp input signal while its gate is modulated with a 10kHz,
1.5Vpp signal. The upper curve is the theoretical result of the multiplication and the lower
curve illustrates the measured output of the multiplier. The theoretical response follows
the equation sin(ω0 t + φ0) · (2.475 + 1.5 · sin(10ω0 t)), where φ0 is the
phase difference between the two input signals.
structure. Firstly, the source follower has to operate with a power supply voltage larger than
Vdd if rail-to-rail resistor operation is required. Secondly, the feedback capacitors have to
be large enough to minimize the effect of the peripheral circuit. The parasitic capacitors and
finite matching of the feedback capacitors may limit the accuracy of the common-mode
voltage computation.
In this chapter, it is shown that a tunable resistor can be employed to design highly
linear amplifier and two-quadrant multiplier circuits. Also, the design of a four-quadrant
multiplier circuit is described. The amplifier exhibited 0.018% THD for 1V pp differential
input, and a linear input range of 2.5V pp . These circuits will be employed in applications
CHAPTER 7
DESIGN OF A BINARY-WEIGHTED RESISTOR DAC USING
TUNABLE LINEARIZED FLOATING-GATE CMOS RESISTORS
In this chapter, the design of a binary-weighted resistor DAC using the linearized tunable
process and provides a high resolution and precise device calibration through the use of
[67] [66], this resistor has a simple structure and provides a high degree of design flexi-
bility in optimizing the overall area and the tuning range of the DAC. In the next section,
we describe the design and implementation of the binary-weighted resistor DAC. Subse-
used to obtain the scaled currents and full output voltage swing at the DAC output. The
input resistors, Ri for i = 1, ..., N, switch between ground and the voltage reference, Vref, and
generate the scaled currents. Also, Vc and Rc are used to obtain a larger output voltage range
In this kind of implementation, where accuracy is the main design objective, highly
matched passive resistors are used in the design to prevent any degradation in the DAC
linearity. However, this requirement necessitates the use of large devices, which can be
expensive in terms of area and may degrade the high frequency performance. Instead of
passive devices, tunable resistors can be used to alleviate matching and area requirements.
the voltage across the resistors can assume only two values. However, due to the limited
Figure 42. Proposed implementation of a binary-weighted DAC using tunable resistors. Ri is the tunable
resistor, where i = 0, 1, 2, 3. Also, Rf is the feedback resistor, and Rc is used to obtain
the full output voltage range and to tune the offset of the DAC. Vc is set to the supply rail of the
DAC.
low frequency voltage gain of the amplifier, the voltage across the resistors still varies by the
error voltage, e = Vo/Ao, where Vo is the output voltage swing and Ao is the low frequency
voltage gain of the amplifier. Therefore, when tunable resistors are incorporated into such
a design, the nonlinearities of these resistors have to be suppressed to obtain a better DAC
linearity.
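To give a feel for the size of the error voltage e = Vo/Ao, the sketch below evaluates it for a few amplifier gains. The gain values are assumptions for illustration only; the 4.56V swing is the output range reported for this DAC.

```python
def error_voltage(v_out_swing, gain_db):
    """Residual voltage at the inverting node, e = Vo / Ao.

    gain_db is the low-frequency amplifier gain in dB (assumed values below).
    """
    a0 = 10.0 ** (gain_db / 20.0)
    return v_out_swing / a0


if __name__ == "__main__":
    V_SWING = 4.56  # V, output range of the DAC
    for gain_db in (60.0, 80.0, 100.0):  # assumed amplifier gains
        e = error_voltage(V_SWING, gain_db)
        print(f"Ao = {gain_db:.0f} dB -> e = {e * 1e6:.1f} uV")
```

Any voltage of this size appears across the tunable resistors and is converted to an output error only through their residual nonlinearity, which is why the linearization matters.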
MOS transistor operating in the triode region. However, MOS transistors operating in the
triode-mode exhibit a large resistance variation mainly due to their quadratic dependence
on voltage across their source and drain terminals. For this reason, it is necessary to apply
obtain the scale factors. Use of floating-gate transistors in this DAC structure enables to
Due to the asymmetric structure of the FGRSGL, one of its input terminals has to be
maintained at a fixed potential. Hence, the Vs terminals of these resistors are connected to the
corresponding switches while their Vd terminals are connected to the inverting node of the
amplifier. In this resistor structure, Vg2 can be used to tune the resistance of the FGRSGL.
As long as the FGRSGL stays in the triode region, Vg2 can alter the transconductance of the
FGRSGL linearly since it has a linear relation with the effective gate voltage.
ricated in a 0.5µm CMOS process. The input capacitors, Cg1 and Cg2, are sized as 2016fF
and 784fF, respectively, to obtain a scale factor of χ = 0.72. The scaled resistors are
implemented by using scaled transistors with W = 1.2µm, and L = 2.4µm, 4.8µm, 9.6µm, and
19.2µm.
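The quoted sizing can be cross-checked with a few lines of arithmetic. The sketch below assumes that the scale factor is the capacitive divider ratio Cg1/(Cg1 + Cg2), an assumption that reproduces the quoted value of 0.72, and lists the binary-weighted channel lengths and the resulting W/L ratios.

```python
CG1_FF = 2016.0  # input capacitor Cg1 in fF
CG2_FF = 784.0   # input capacitor Cg2 in fF


def scale_factor(cg1, cg2):
    """Capacitive divider ratio Cg1 / (Cg1 + Cg2); assumed definition of chi."""
    return cg1 / (cg1 + cg2)


def binary_lengths(l_min_um, n_resistors):
    """Channel lengths that double from one resistor to the next."""
    return [l_min_um * 2 ** i for i in range(n_resistors)]


if __name__ == "__main__":
    print("chi =", scale_factor(CG1_FF, CG2_FF))
    width_um = 1.2
    for length in binary_lengths(2.4, 4):
        print(f"L = {length} um -> W/L = {width_um / length:.4f}")
```

Doubling L halves W/L, so the nominal resistances follow a 1:2:4:8 progression before any floating-gate tuning is applied.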
The DC characteristics of the FGRSGL circuits are obtained by keeping their drain terminals
at ground and sweeping their source terminals from 0 to 5V as illustrated in Figure
43a. In this experiment, the well potential is fixed to 5V. The extracted resistances of these
resistors are shown in Figure 43b, where resistances are scaled by the scale factor of the
resistors to observe the relative change in their resistance. The precise scale factors for
the implementation of the DAC are obtained by tuning the resistance of the FGRSGL for a
source-to-drain voltage of 2.5V, which is the reference voltage of the DAC. As the length
of the tunable resistor increases, the deviation in the FGRSGL resistance decreases. This is
mainly because the scaled-gate linearization technique becomes more effective for the long
channel devices. The temperature dependence of the FGRSGL is shown in Figure 43c and
obtained by changing the temperature from −40 to 80°C. The temperature coefficient of
The static characteristics of the 4-bit DAC are illustrated in Figure 44. The DAC has an
output voltage range of 4.56V, and the INL and DNL plots illustrate that the accuracy error
can be limited to less than 139µV, which corresponds to 15 bits of accuracy. The MSB step
response of the DAC is shown in Figure 45a, and depending on the size of the feedback
capacitor, a settling time of less than 10µs can be obtained. The sine wave test is shown in
Figure 45b. A 1kHz sinusoidal signal is generated by setting the sampling frequency at 170kHz.
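The link between the 4.56V range, the 139µV error bound and the 15-bit accuracy figure is simple arithmetic, sketched below. The helper names are illustrative only; the last line also shows how many samples per period the 170kHz update rate gives for a 1kHz sine.

```python
import math


def lsb_voltage(full_scale_v, bits):
    """Size of one LSB for a given full-scale range and bit count."""
    return full_scale_v / 2 ** bits


def equivalent_bits(full_scale_v, error_v):
    """Number of bits whose LSB equals the given error bound."""
    return math.log2(full_scale_v / error_v)


def samples_per_period(f_sample_hz, f_signal_hz):
    """Number of DAC updates per period of the generated sine wave."""
    return f_sample_hz / f_signal_hz


if __name__ == "__main__":
    print(f"15-bit LSB of 4.56 V: {lsb_voltage(4.56, 15) * 1e6:.2f} uV")
    print(f"Bits for a 139 uV error: {equivalent_bits(4.56, 139e-6):.3f}")
    print("Samples per period:", samples_per_period(170e3, 1e3))
```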
[Figure 43 plots: (a) drain current (µA) versus drain-to-source voltage (V) for W/L = 1.2/2.4, 1.2/4.8, 1.2/9.6, and 1.2/19.2; (b) resistance (kΩ) versus drain-to-source voltage (V), with V = 2.5V marked; (c) resistance (kΩ) versus temperature (°C), with the linear fit y = 218.47*x + 67182.]
Figure 43. (a) Voltage sweeps of the tunable resistors from 0 to 5V. (b) Extracted resistances of the
tunable resistors with different lengths. For visual purposes, all other resistances are scaled
to W/L = 1.2µm/2.4µm. (c) Temperature sweep of the FGRSGL for W/L = 1.2µm/4.8µm.
This DAC can be made much faster by properly sizing the FGRSGL.
The long-term and short-term drifts of the DAC are crucial as they determine the DAC
reliability. The short-term drift can be observed shortly after the floating-gate programming,
and can be minimized by decreasing the number of injection pulses for the fine tuning of the
devices. The short-term drift of the DAC linearity is illustrated in Figure 45c. It is observed
that after programming the DAC for 15-bit accuracy, the linearity drops to around 14 bits.
Moreover, the long-term drift of the DAC resistors is mainly caused by thermionic
emission [62]. Based on the stress tests, it is calculated that the FGRSGL resistance drifts
[Figure 44 plots: DAC output (V), INL (µV), and DNL (µV) versus digital input data (0 to 15).]
Figure 44. Static characteristics of the DAC: Output voltage, INL, and DNL.
floating-gate CMOS resistors is presented. It is shown that the resistance and temperature
coefficient of the FGRSGL can be tuned to a desired operating point. The stress test of
these resistors showed that the FGRSGL resistance drifts negligibly over time. It was also
demonstrated that a 15-bit accurate, 4-bit resolution DAC can be built using these resistors.
This will readily enable the implementation of multi-bit CMOS quantizers in pipelined and
[Figure 45 plots: (a) MSB step response versus time (µs) with 2pF and 5pF feedback capacitors; (b) DAC output voltage (V) versus time (ms), data and sine fit; (c) linearity versus time (minutes).]
Figure 45. (a) MSB step responses for 2pF and 5pF feedback capacitors. (b) Sinusoidal transient
response of the DAC. The sinusoidal fit is shown to illustrate the behavior of the DAC response.
(c) Short-term linearity test of the DAC. The 10-hour data illustrates the change of the
linearity over time for LSB = 139µV.
CHAPTER 8
PROGRAMMABLE VOLTAGE-OUTPUT DIGITAL-TO-ANALOG
CONVERTER
Nyquist rate converters that require low-power and small area. In this chapter, we propose
based binary-weighted DAC (FGDAC). The epot is an ideal device for obtaining a dynami-
[73]. Utilizing epots to compensate for capacitor mismatches and to obtain binary-weighted
voltage levels makes it possible to implement a DAC with a unity element spread. This
implementation results in a compact, low-power voltage-output DAC. Earlier results [74, 75]
demonstrated the feasibility of the epot integration into a charge amplifier architecture.
In the next section, the binary-weighted capacitor DAC (BWCDAC) is compared with
the FGDAC in terms of area, speed, accuracy, and noise performance. Subsequently,
the circuit architecture of the FGDAC is explained, and integration of epots into
this implementation is described. In the last part of this chapter, the experimental results of
incorporated to periodically clear the inverting node of the amplifier as illustrated in Figure
4. This structure has its own limitations mainly due to its scaled capacitor array. Some of
the trade-offs and limitations of the BWCDAC can be alleviated by utilizing the FGDAC
implementation, shown in Figure 46. This implementation employs epots to obtain the
scaled voltage levels, which readily allow for a fixed-area-per-bit. In addition, the reset
Figure 46. Proposed design of floating-gate based DAC (FGDAC) that uses scaled voltages instead of
scaled capacitors to achieve the digital-to-analog conversion. In this design, Cf is equal
to C. This converter is implemented by employing epots in a charge amplifier structure.
Reference voltages for each bit are programmed both to scale the input voltages and to
minimize the effect of the mismatch between capacitors.
control the charge on the inverting node of the amplifier. Therefore, return-to-zero phase
in the BWCDAC design can also be eliminated and the timing requirements of the DAC
can be relaxed. Other than these differences, the analyses of the DAC area, speed, gain
error, and noise performance are provided in the following subsections to show the design
8.1.1 Area
The area allocated for the capacitor array of the BWCDAC depends on the unity-size
capacitor area, $A_C$, and on the number of bits, $N$. Therefore, the total capacitor area used
for this converter becomes $A_{Cf} + A_{Cs} = (2^{N+1} - 1) \cdot A_C$, where $A_{Cf}$ and $A_{Cs}$ are the areas used
for the feedback capacitor and the scaled capacitors, respectively. In this equation, $C_f = 2^N C$ and
$C_s = (2^N - 1)C$.
In contrast, the total area used for obtaining the scale factors of the FGDAC is mostly
determined by the epots. Therefore, the total area increases linearly with the number of bits,
for the input capacitors and epots, and Aepot is the area of an epot.
As a result, to obtain an improvement in the total DAC area using the FGDAC implementation,
$A_{epot}$ has to be smaller than $A_C \cdot (2^{N+1} - N - 2)/N$ for an $N$-bit converter. For
[Figure 47 plot: normalized area (in units of AC, logarithmic axis) versus number of bits (4 to 12) for the BWCDAC and for the FGDAC with AEPOT = 20AC, 62.75AC, 203.6AC, and 681.5AC.]
Figure 47. Comparison of the BWCDAC and the FGDAC for the area used to achieve the binary-
weighted scaling. This area corresponds to the area of the capacitor array for the BWCDAC,
while it is the sum of the areas of the capacitor and epot arrays for the FGDAC. AC and
AEPOT are the capacitor and epot areas. The FGDAC area is computed for a range of epot
area, AEPOT = α · AC , where α = 20, 62.75, 203.6, 681.5.
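The area trade-off plotted in Figure 47 can be reproduced with a short calculation. The sketch below works in units of the unit-capacitor area AC. The BWCDAC area (2^(N+1) − 1) is taken from the text; the FGDAC area of (N + 1) capacitors plus N epots is an assumption inferred from the stated break-even condition Aepot < AC · (2^(N+1) − N − 2)/N.

```python
def bwcdac_area(n_bits):
    """Capacitor-array area of the BWCDAC in units of A_C: 2^(N+1) - 1."""
    return 2 ** (n_bits + 1) - 1


def fgdac_area(n_bits, alpha):
    """Assumed FGDAC scaling area in units of A_C.

    N input capacitors plus one feedback capacitor, and N epots of
    area alpha * A_C each (inferred from the break-even condition).
    """
    return (n_bits + 1) + n_bits * alpha


def break_even_alpha(n_bits):
    """Epot area (in units of A_C) at which both converters use equal area."""
    return (2 ** (n_bits + 1) - n_bits - 2) / n_bits


if __name__ == "__main__":
    for n in (6, 8, 10, 12):
        print(f"N = {n:2d}: break-even A_epot = {break_even_alpha(n)} A_C")
```

Under this assumption the break-even values for N = 6, 8, 10, and 12 coincide with the epot areas used for the FGDAC curves in Figure 47.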
approach to minimize the total DAC area. The areas of the BWCDAC and the FGDAC
are compared for a range of Aepot values, as shown in Figure 47. The curves in this plot
represent only the total area used for scaling, and exclude the area used for other DAC
components. The intersection of these curves represents the point where the areas of the
BWCDAC and the FGDAC become equal for a given number of bits and epot area. Therefore,
the FGDAC design strategy can yield a more compact converter depending on the
value of $A_{epot}$ and the number of bits. For instance, the total capacitor area of the 10-bit
DAC can be reduced around 100 times for $A_{epot} = 20 \cdot A_C$ if same size unit capacitors are
8.1.2 Speed
The speeds of the BWCDAC and the FGDAC are compared based on their time constants.
Here, it is assumed that these converters are structurally the same. Since the time constants
of these converters are dependent on the type of the amplifier, one and two-stage amplifier
The time constants of the BWCDAC and the FGDAC are defined as $\tau_{BWCDAC}$ and
$\tau_{FGDAC}$, respectively. For unit capacitance, $C$, the feedback and the total input
capacitance of the BWCDAC are $C_f = 2^N C$ and $C_{eq} = (2^N - 1)C$. However, these capacitance
values become $C_f = C$ and $C_{eq} = NC$ for the FGDAC. Moreover, in this analysis, the
output resistances of the voltage references are assumed to be very small compared to the
Based on the analysis given in the Appendix-II, the time constants, $\tau_{DAC1}$ and $\tau_{DAC2}$, can be
computed as
$$\tau_{DAC2} = \frac{R_{on}C\left[C_f(C_{amp} + C_L) + C_L C_{amp}\right]}{C_f(G_m R_{on}C + C_L) + (C_{amp} + C_{eq})(C_L + C_f)} \tag{37}$$
the amplifier input capacitance, $C_L$ is the load capacitance, and $C_{eq}$ is the sum of the input
capacitors.
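As a rough numerical illustration, the second time constant can be evaluated for both converter types. The sketch below uses the form tau2 = RonC [Cf(Camp + CL) + CL Camp] / [Cf(Gm RonC + CL) + (Camp + Ceq)(CL + Cf)], which is a reading of (37) with the RonC factor applied to the whole numerator (chosen for dimensional consistency). All component values are hypothetical.

```python
def tau_dac2(r_on, c, c_f, c_eq, c_amp, c_l, g_m):
    """Second time constant of the one-stage-amplifier DAC, as read from Eq. (37).

    Assumes the R_on * C factor multiplies the whole numerator.
    """
    num = r_on * c * (c_f * (c_amp + c_l) + c_l * c_amp)
    den = c_f * (g_m * r_on * c + c_l) + (c_amp + c_eq) * (c_l + c_f)
    return num / den


if __name__ == "__main__":
    N = 10
    C = 300e-15                   # unit capacitor (F)
    R_ON, G_M = 1e3, 1e-4         # assumed switch resistance and transconductance
    C_AMP, C_L = 100e-15, 1e-12   # assumed amplifier input and load capacitance
    t_bwc = tau_dac2(R_ON, C, 2 ** N * C, (2 ** N - 1) * C, C_AMP, C_L, G_M)
    t_fg = tau_dac2(R_ON, C, C, N * C, C_AMP, C_L, G_M)
    print(f"tau2 BWCDAC = {t_bwc:.3e} s, tau2 FGDAC = {t_fg:.3e} s")
```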
When designing the converters, it is important to keep $R_{on}$ small enough to utilize the
full bandwidth of the amplifier. It can be shown that if $R_{on}C \ll (C_f C_L + (C_{eq} + C_{amp})(C_f + C_L))/(G_m C_f)$,
then $\tau_{FGDAC2} < \tau_{FGDAC1}$ and $\tau_{BWCDAC2} < \tau_{BWCDAC1}$. Therefore, the first time
constants of these converters determine their maximum speed. In this case, the ratio of
The relationship between the BWCDAC and the FGDAC speeds based on the above equa-
tion is illustrated in Figure 48. This equation indicates that for negligibly small amplifier
input capacitance and for a small load capacitor the FGDAC operates much faster than the
BWCDAC does.
[Figure 48 plot: k = τBWCDAC/τFGDAC (logarithmic axis) versus number of bits (2 to 14) for several load capacitances CL, with the k = 1 line marked.]
Figure 48. Speed comparison of the BWCDAC and the FGDAC for one-stage amplifier case and small
amplifier input capacitance. The ratio of their time-constants, k, shows the relation for
increasing number of converter bits. k = 1 represents the same speed performance for these
converters. k is computed for $C_L = 2^{\lambda} \cdot C$, where λ = 0, 2, 4, 8, 10, to show the effect of the
load capacitance on the BWCDAC and the FGDAC speeds. The FGDAC is faster than the
BWCDAC for k > 1.
Table-3 summarizes all the cases based on the initial assumption that $R_{on}C$ is very small.
According to these results, it can be concluded that when the FGDAC is used with a one-stage
amplifier, it performs better than the BWCDAC for $C \gg C_L$ and $C \gg C_{amp}$. The first
condition necessitates the use of a buffer if the DAC is designed for off-chip purposes.
The time constants of the BWCDAC and the FGDAC for a two-stage amplifier are com-
puted by using the analysis in Appendix-II. Based on this analysis, the time constants, τDAC1
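$$\tau_{DAC1} = \frac{1}{GB} \cdot \left(1 + R_{on}C \cdot GB + \frac{C_{eq} + C_{amp}}{C_f}\right) \tag{39}$$
$$\tau_{DAC2} = \frac{R_{on}C}{1 + \frac{C_{eq} + R_{on}C \cdot C_f \cdot GB}{C_f + C_{amp}}} \tag{40}$$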
If $R_{on}C \gg \frac{1}{GB} \cdot \left(1 + \frac{C_{eq}}{C_f}\right)$, the converter speeds become approximately equal. However,
converters are designed not to be limited by the on-resistance of the switches. For this
reason, it can be assumed that $R_{on}C \ll \frac{1}{GB} \cdot \left(1 + \frac{C_{eq}}{C_f}\right)$ to help the speed comparison. As a
result, the speeds of these converters are mostly determined by their first time constants. In
$$\tau_{BWCDAC1} \approx \frac{2 + C_{amp}/(2^N C)}{GB} \tag{41}$$
$$\tau_{FGDAC1} \approx \frac{N + 1 + C_{amp}/C}{GB} \tag{42}$$
which implies that the BWCDAC is faster than the FGDAC by the factor determined by
the number of bits. As the number of bits increases the BWCDAC performs better than the
FGDAC in terms of speed. This is mainly caused by the fact that the feedback capacitor of
the BWCDAC is much bigger than the feedback capacitor of the FGDAC, and this enables
a better feedback factor for the BWCDAC. While $C_f/C_{eq}$ is approximately one for the
BWCDAC, it is 1/N for the FGDAC. However, it has to be noted that $\tau_{BWCDAC1}/\tau_{FGDAC1}$
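The dependence of the two-stage time constants on the number of bits can be tabulated directly. The sketch below uses the first time constant in the form tau1 = (1/GB)(1 + RonC·GB + (Ceq + Camp)/Cf), with Cf = 2^N C and Ceq = (2^N − 1)C for the BWCDAC and Cf = C and Ceq = NC for the FGDAC. The gain-bandwidth product and capacitor values are assumptions for illustration, and RonC is set to zero to match the small on-resistance case.

```python
def tau_dac1(gb, r_on_c, c_eq, c_amp, c_f):
    """First time constant for the two-stage amplifier case."""
    return (1.0 + r_on_c * gb + (c_eq + c_amp) / c_f) / gb


def tau_bwcdac1(gb, n_bits, c, c_amp, r_on_c=0.0):
    """BWCDAC: C_f = 2^N * C and C_eq = (2^N - 1) * C."""
    return tau_dac1(gb, r_on_c, (2 ** n_bits - 1) * c, c_amp, 2 ** n_bits * c)


def tau_fgdac1(gb, n_bits, c, c_amp, r_on_c=0.0):
    """FGDAC: C_f = C and C_eq = N * C."""
    return tau_dac1(gb, r_on_c, n_bits * c, c_amp, c)


if __name__ == "__main__":
    GB = 2 * 3.141592653589793 * 10e6  # rad/s, assumed
    C, C_AMP = 300e-15, 100e-15        # assumed
    for n in (4, 8, 10, 12):
        ratio = tau_fgdac1(GB, n, C, C_AMP) / tau_bwcdac1(GB, n, C, C_AMP)
        print(f"N = {n:2d}: tau_FGDAC1 / tau_BWCDAC1 = {ratio:.2f}")
```

For negligible amplifier input capacitance the ratio approaches (N + 1)/2, which is the bit-dependent penalty described in the text.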
Due to finite gain, Av , of the DAC amplifier, the BWCDAC has a gain error that can be
computed using
$$\frac{V_{out}}{V_{in}} = \frac{C_{eq}}{C_f + \frac{C_f + C_{eq} + C_{amp}}{A_v}} \approx \frac{C_{eq}}{C_f}\left(1 - \frac{C_f + C_{eq} + C_{amp}}{A_v C_f}\right) \tag{44}$$
where the gain error is represented by the term $(C_f + C_{eq} + C_{amp})/(A_v C_f)$. The gain error
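A quick numerical check shows how closely the first-order approximation tracks the exact finite-gain expression. The sketch below evaluates Vout/Vin = Ceq / (Cf + (Cf + Ceq + Camp)/Av) and its approximation (Ceq/Cf)(1 − (Cf + Ceq + Camp)/(Av Cf)) for a 10-bit BWCDAC; the amplifier gain of 80dB and the zero amplifier input capacitance are assumptions.

```python
def gain_error(c_f, c_eq, c_amp, a_v):
    """Relative gain error term (C_f + C_eq + C_amp) / (A_v * C_f)."""
    return (c_f + c_eq + c_amp) / (a_v * c_f)


def exact_gain(c_f, c_eq, c_amp, a_v):
    """Finite-gain transfer magnitude C_eq / (C_f + (C_f + C_eq + C_amp) / A_v)."""
    return c_eq / (c_f + (c_f + c_eq + c_amp) / a_v)


def approx_gain(c_f, c_eq, c_amp, a_v):
    """First-order approximation (C_eq / C_f) * (1 - gain_error)."""
    return (c_eq / c_f) * (1.0 - gain_error(c_f, c_eq, c_amp, a_v))


if __name__ == "__main__":
    N, A_V = 10, 1e4  # 10-bit converter, assumed 80 dB amplifier gain
    c_f, c_eq, c_amp = float(2 ** N), float(2 ** N - 1), 0.0  # in units of C
    print("gain error :", gain_error(c_f, c_eq, c_amp, A_V))
    print("exact gain :", exact_gain(c_f, c_eq, c_amp, A_V))
    print("approx gain:", approx_gain(c_f, c_eq, c_amp, A_V))
```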
In contrast to the BWCDAC, the FGDAC does not suffer from the gain error as long
as the gain stays constant in the bandwidth of interest. This is mainly because the voltage
levels and the least-significant-bit (LSB) of the FGDAC can be set by using the stored epot
8.1.4 Noise
In this section, the noise analyses of the BWCDAC and the FGDAC are presented for the
DAC design with one-stage and two-stage amplifiers. Also, the individual noise contribu-
tion from the switches, the amplifier, and the references are compared for different capac-
itance values to investigate the optimum design approach for the FGDAC that can yield
improved noise performance. In the bandwidth of interest, the total DAC noise can be
written as
$$e^2_{DAC} = e^2_{reset} + e^2_{amp} B_{n1} A_{n1} + (e^2_{ref} + e^2_{Ron}) B_{n2} A_{n2} \tag{45}$$
where $e^2_{amp}$, $e^2_{Ron}$, and $e^2_{ref}$ are the broadband noise contributions of the amplifier, the switches,
and the reference. Also, $B_{n1}$ and $B_{n2}$ are the noise bandwidths of the amplifier and the
reference/switches, and $A_{n1}$ and $A_{n2}$ are the gains of the DAC from the amplifier and the
reference/switches, respectively. $e^2_{reset}$ is the kT/C noise introduced during the reset phase
of the BWCDAC. This reset noise does not exist in the FGDAC, since the FGDAC operates
The output noise of the BWCDAC can be computed using the noise contributions of the
reset and amplification phases. During the reset phase, the feedback path of the amplifier
is shorted, and all the capacitors are connected to the ground. The noise coming from the
on-resistance of the switches during the reset phase is stored and added to the noise in the
amplification phase. Therefore, by using the analysis in Appendix-III and assuming that N
is large and $G_m R_{on}C \ll C$, the total thermal noise of the BWCDAC for a one-stage amplifier
can be approximated as
$$e^2_{BWC} = \frac{kT}{2^N C} + \left(kT\,\frac{R_{on}}{2^N} + \frac{e^2_{ref}}{4} + e^2_{amp}\right) \cdot \frac{G_m}{C_x} \tag{46}$$
Figure 49. Simplified noise models of the BWCDAC and the FGDAC. $e^2_{Ron}$ and $e^2_{amp}$ are the broadband
contributions of the switches and amplifier. (a) Noise model of the BWCDAC during the
amplification phase. For the worst-case analysis, all the capacitors are assumed to be connected
to the reference voltage. $e^2_{ref}$ is the noise contribution of the reference voltage. (b)
Noise model of the FGDAC. $e^2_{epot}$ is the noise contribution of the selected epot. Similar to
the worst-case analysis of the BWCDAC, all input capacitors of the FGDAC are assumed
to be connected to their corresponding epots. Noise contribution of the reference voltage is
ignored since it sets the common-mode of the amplifier and epots.
where $C_x = C_L + (1 + C_{amp}/(2^N C)) \cdot (2^N C + C_L)$. Similarly, for large values of N, the total
levels are summed for the digital-to-analog conversion. During the conversion, the total
FGDAC noise is mainly contributed by the epots, the switches, and the amplifier. Based
on the analysis in Appendix-III, and assuming that all of the epots are selected, N is large,
and $G_m R_{on}C \ll C$, then the equivalent output thermal noise of the FGDAC for one-stage
C). For a two-stage amplifier, the equivalent thermal noise of the FGDAC becomes
$$e^2_{FG} = (4kT R_{on} + e^2_{epot}) \cdot \frac{NC \cdot GB}{4((N+1)C + C_{amp})} + e^2_{amp} \cdot \frac{(N+1)^2 C \cdot GB}{4(C + C_{amp})} \tag{49}$$
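The two-stage FGDAC noise expression (49) can be evaluated term by term to see which source dominates. The sketch below is only an illustration: the noise densities of the epot and the amplifier, the on-resistance, the gain-bandwidth product, and the amplifier input capacitance are all assumed values, and GB is taken in rad/s.

```python
K_B = 1.380649e-23  # Boltzmann constant (J/K)


def fgdac_noise_two_stage(n_bits, c, c_amp, gb, r_on, e_epot2, e_amp2, temp_k=300.0):
    """Output thermal noise of the FGDAC with a two-stage amplifier, per Eq. (49).

    e_epot2 and e_amp2 are broadband noise densities in V^2/Hz (assumed values).
    Returns the switch/epot term and the amplifier term separately, in V^2.
    """
    switch_epot = (4.0 * K_B * temp_k * r_on + e_epot2) * n_bits * c * gb / (
        4.0 * ((n_bits + 1) * c + c_amp))
    amplifier = e_amp2 * (n_bits + 1) ** 2 * c * gb / (4.0 * (c + c_amp))
    return switch_epot, amplifier


if __name__ == "__main__":
    se, amp = fgdac_noise_two_stage(
        n_bits=10, c=300e-15, c_amp=100e-15, gb=2 * 3.141592653589793 * 10e6,
        r_on=1e3, e_epot2=(20e-9) ** 2, e_amp2=(10e-9) ** 2)
    print(f"switch + epot term: {se:.3e} V^2")
    print(f"amplifier term    : {amp:.3e} V^2")
    print(f"total rms noise   : {(se + amp) ** 0.5 * 1e6:.1f} uV")
```

The (N + 1)^2 factor in the amplifier term reflects the low feedback factor of the FGDAC, so the amplifier noise tends to dominate as the number of bits grows.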
The total noise of the BWCDAC and the FGDAC for one-stage and two-stage amplifiers
can be compared based on the individual contributions from the thermal noise of
the switches, the reference/epots, and the amplifier when the on-resistance, the amplifier
transconductance, the load capacitance, the input capacitance, and the unit capacitance of
To begin with, for a one-stage amplifier, if $C_x \gg G_m R_{on}C$, the ratio of noise contribution
due to the switches of the BWCDAC and the FGDAC can be expressed as
$$a_2 = \frac{1}{GB \cdot R_{on}C} \cdot \frac{(N+1)C + C_{amp}}{2^N NC} \tag{51}$$
Similar to noise ratio of switches, the ratio of noise contributions from the amplifier for
$$b_2 = \frac{2^{N+2}}{(N+1)^2} \cdot \frac{C + C_{amp}}{2^N C + C_{amp}} \tag{53}$$
Moreover, the ratio of noise contributions from the reference and epots for one-stage am-
$$c_2 = \frac{2^N}{N} \cdot \frac{(N+1)C + C_{amp}}{2^{N+1} C + C_{amp}} \tag{55}$$
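The limiting values of the two-stage noise ratios can be checked numerically. The sketch below evaluates a2, b2, and c2 in the forms a2 = y((N + 1)C + Camp)/(2^N N C) with y = 1/(GB·RonC), b2 = 2^(N+2)/(N + 1)^2 · (C + Camp)/(2^N C + Camp), and c2 = 2^N/N · ((N + 1)C + Camp)/(2^(N+1) C + Camp). The choice of N = 10 and the capacitance values are assumptions.

```python
def a2(n_bits, c, c_amp, y):
    """Switch-noise ratio for the two-stage case; y = 1 / (GB * R_on * C)."""
    return y * ((n_bits + 1) * c + c_amp) / (2 ** n_bits * n_bits * c)


def b2(n_bits, c, c_amp):
    """Amplifier-noise ratio for the two-stage case."""
    return 2 ** (n_bits + 2) / (n_bits + 1) ** 2 * (c + c_amp) / (2 ** n_bits * c + c_amp)


def c2(n_bits, c, c_amp):
    """Reference/epot noise ratio for the two-stage case."""
    return 2 ** n_bits / n_bits * ((n_bits + 1) * c + c_amp) / (2 ** (n_bits + 1) * c + c_amp)


if __name__ == "__main__":
    N, C = 10, 1.0
    for c_amp in (0.0, 10.0, 1e6):  # in units of C, assumed
        print(f"Camp = {c_amp:g} C: a2/y = {a2(N, C, c_amp, 1.0):.4g}, "
              f"b2 = {b2(N, C, c_amp):.4g}, c2 = {c2(N, C, c_amp):.4g}")
```

For Camp much smaller than C the ratios approach roughly 1/2^N (times y), 4/(N + 1)^2, and 1/2, while for very large Camp they approach yCamp/(2^N N C), 2^(N+2)/(N + 1)^2, and 2^N/N.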
To sum up, the above equations show that the total FGDAC noise due to the on-
resistance of switches, the amplifier, and the references is comparable to the total noise
of the BWCDAC. The BWCDAC exhibits better noise performance in some cases mainly
due to the scaling difference between the feedback and input capacitors of the BWCDAC and
FGDAC. $C_i/C_f$ is equal to $2^{i-1-N}$ for the BWCDAC, while it is 1 for the FGDAC.
amplifier, and unit-capacitance values. In this table, a1 , b1 , and c1 represent the ratios for
one-stage amplifier case, while a2, b2, and c2 represent the ratios for the two-stage amplifier
case. From this table, it can be observed that for large values of Camp the performance of
the FGDAC in terms of the noise contributions from the amplifier, the switches, and the
fier, epots, a buffer, switches, and a serial shift register. While the design of the FGDAC is
In this implementation, the serial shift register is utilized to load the FGDAC digital
data. This digital input word controls the desired output voltage by switching the individual
capacitors between the reference voltage and the corresponding epot output voltage. This
operation results in a charge on the input capacitors, which is then amplified by the charge
$$V_{ref} - V_{out} = \frac{1}{C_f}\sum_{i=1}^{N} a_i C_i (V_i - V_{ref}) \tag{56}$$
Table 4. Ratio of noise contributions from switches, references, and amplifier. Gm Ron = 1/x and RonC · GB = 1/y.

Capacitors                | a1                          | a2                    | b1          | b2              | c1     | c2
C ≫ CL & NC ≫ Camp        | x/2^N                       | -                     | 4/(2^N N)   | -               | 1/2^N  | -
CL ≫ 2^N C & NC ≫ Camp    | (x/2^N) · (CL/C)            | -                     | 2/N         | -               | 1/2    | -
Camp ≫ CL & CL ≫ 2^N C    | (x/(2^N N)) · (CL Camp/C^2) | (y/(2^N N)) · (Camp/C) | 2^(N+2)/N^2 | 2^(N+2)/(N+1)^2 | 2^N/N  | 2^N/N
Camp ≫ 2^N C & C ≫ CL     | (x/(2^N N)) · (Camp/C)      | (y/(2^N N)) · (Camp/C) | 4/N^2       | 2^(N+2)/(N+1)^2 | 1/N    | 2^N/N
C ≫ Camp                  | -                           | y/2^N                 | -           | 4/(N+1)^2       | -      | 1/2
where $V_{ref}$ is the reference voltage, $V_i$ is the epot output voltage, $C_f$ is the feedback
capacitor, $C_i$ is the input capacitor, and $a_i$ is the digital input bit for i = 1, 2, ..., N. In this
The epots are used to set the scaled input voltages in (56). The block diagram of the
epot is shown in Figure 13a, and is a modified version of the epot presented in [73]. This
gate transistors and programming circuitry that enables the tuning of the stored analog
voltage. The amplifier, illustrated in Figure 13b, in the epot structure is used to buffer
the stored analog voltage, enabling the epot to achieve low noise, low output resistance,
as well as the desired output voltage range. Ten epots storing the scaled voltages are used
to implement a 10-bit DAC. During programming, the epots are controlled and read by
employing a decoder.
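The charge-amplifier relation (56), Vref − Vout = (1/Cf) Σ ai Ci (Vi − Vref), can be exercised with a small model. The sketch below is an illustration under stated assumptions: the epot levels are taken as Vi = Vref − LSB · 2^(i−1) so that the output rises with the input code, and capacitor mismatch is compensated by rescaling each stored offset by C/Ci. The function names and numerical values are hypothetical.

```python
def fgdac_output(bits, v_epots, caps, c_f, v_ref):
    """Eq. (56): V_ref - V_out = (1/C_f) * sum(a_i * C_i * (V_i - V_ref))."""
    charge = sum(a * c * (v - v_ref) for a, c, v in zip(bits, caps, v_epots))
    return v_ref - charge / c_f


def ideal_epot_voltages(n_bits, lsb, v_ref):
    """Binary-weighted epot levels, V_i = V_ref - lsb * 2^(i-1) (assumed sign)."""
    return [v_ref - lsb * 2 ** i for i in range(n_bits)]


def trimmed_epot_voltages(v_epots, caps, c_unit, v_ref):
    """Rescale each stored offset by C / C_i so that C_i * (V_i - V_ref) is preserved."""
    return [v_ref + (v - v_ref) * c_unit / c for v, c in zip(v_epots, caps)]


if __name__ == "__main__":
    V_REF, LSB = 2.5, 0.1
    bits = [1, 0, 1, 0]                  # LSB first, code 5
    ideal = ideal_epot_voltages(4, LSB, V_REF)
    mismatched = [1.02, 0.98, 1.01, 0.99]  # hypothetical capacitor values
    print("matched caps        :", fgdac_output(bits, ideal, [1.0] * 4, 1.0, V_REF))
    print("mismatch, untrimmed :", fgdac_output(bits, ideal, mismatched, 1.0, V_REF))
    trimmed = trimmed_epot_voltages(ideal, mismatched, 1.0, V_REF)
    print("mismatch, trimmed   :", fgdac_output(bits, trimmed, mismatched, 1.0, V_REF))
```

Because every input capacitor nominally equals the feedback capacitor, the binary weighting lives entirely in the stored voltages, and the same stored voltages can absorb the capacitor mismatch.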
In this architecture, epots and inverting amplifiers are the main blocks that use floating-
gate transistors to exploit their analog storage and capacitive coupling properties. The epots
employ floating-gate transistors to store the analog voltages, and the inverting amplifier
uses them for their capacitive coupling properties and for removing the offset at its floating-
gate terminal. A precise tuning of the stored voltage on floating-gate nodes is achieved by
In this DAC implementation, no layout technique is employed for the input capacitor
array. As expected, due to inevitable mismatches between the capacitors, there will be a
gain error contributed from each input capacitor when epots are programmed without tak-
ing these mismatches into account. Therefore, after the initial epot programming, the stored
voltages are also trimmed to compensate for these mismatches. The stored epot voltage is
tuned by changing the floating-gate charge through the use of the internal programming circuitry.
Programming of the epots is controlled via digital signals, select, tunnel, and inject.
This digital control of the epot programming allows for the epot voltage to be adjusted to
Figure 50. (a) Inverting amplifier schematic. Ibias is the bias current and Ccomp is the compensation
capacitor of the amplifier. (b) Implemented buffer using a push-pull output stage to drive
the DAC output signal off-chip. nAMP and pAMP are the nFET and pFET input single-stage
amplifiers, and CL is the load capacitor.
Epots are required to drive capacitive loads when integrated into the FGDAC. Depend-
ing on the power consumption requirement, the output resistance of the epot amplifier can
be set to allow operation at different converter speeds. The output resistance of the epot
can be expressed as
$$R_{out} = \frac{R_{II}}{1 + g_{m2}\, g_{m6}\, R_I R_{II}} \tag{57}$$
where gm2 and gm6 are the transconductances of M2 and M6, and RI and RII are the output
resistances of the first and second stages, respectively. Here, RI is approximately equal to
the output resistance of M4 , and RII is the parallel combination of the output resistances of
M6 and M7 .
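The closed-loop output resistance of (57), Rout = RII / (1 + gm2 gm6 RI RII), can be estimated for a given bias point. The transconductance and stage-resistance values in the sketch below are assumptions chosen only to show the order of magnitude; they are not taken from the fabricated epot.

```python
def epot_output_resistance(g_m2, g_m6, r_i, r_ii):
    """Eq. (57): R_out = R_II / (1 + g_m2 * g_m6 * R_I * R_II)."""
    return r_ii / (1.0 + g_m2 * g_m6 * r_i * r_ii)


if __name__ == "__main__":
    R_I, R_II = 1e6, 1e5  # assumed stage output resistances (ohm)
    for g_m in (50e-6, 100e-6, 200e-6):  # assumed transconductances (S)
        r_out = epot_output_resistance(g_m, g_m, R_I, R_II)
        print(f"gm2 = gm6 = {g_m * 1e6:.0f} uS -> R_out = {r_out:.1f} ohm")
```

Raising the bias current increases both transconductances and lowers Rout, which is the power versus speed trade-off mentioned above.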
50a. The FGDAC implementation with one-stage amplifier is described in [75]. The two-stage
amplifier circuit provides a high gain and a large output swing [76]. The
the amplifier output while the system operates in the reset mode. In this mode, all the input
voltages to the input capacitors are set to the reference voltage. This condition ensures that
the amplifier output voltage becomes equal to the reference voltage when the charge on its
floating-gate terminal is compensated. For this purpose, a pFET and a tunnelling junction
are integrated with the floating-gate terminal of the amplifier for injection and tunnelling,
respectively. By using this technique, the offset of the amplifier is reduced to much less than
1mV. Lastly, a negative-feedback output stage [77], shown in Figure 50b, is employed to
be able to buffer the output voltage off-chip. This buffer uses complementary single-stage
fabricated in a 0.5µm CMOS process. The previous results from the FGDAC with a one-stage
amplifier were presented in [75]. For the static and dynamic tests, the input data of the
The input-output characteristic of the FGDAC is shown in Figure 51a. Epots are programmed
to obtain a 3V output voltage range with LSB = ±1.5mV. The integral and differential
non-linearity (INL and DNL) of the FGDAC are tested with a static input using an
all-codes test. From these tests, INL and DNL are found as shown in Figure 51b, and 51c,
between 0.35LSB and −0.3LSB. Within the full-scale range, the FGDAC yields better than
10-bit static linearity. In these experiments, the static linearity of the FGDAC is mainly
limited by the noise in the experimental set-up. The epot voltages are programmed with a
resolution of 100µV; higher DAC linearity would require tighter programming resolution
as well as lower DAC noise levels. High resolution of the epots makes this implementation
realizable for higher DAC resolutions. Also, flicker noise in the signal path was another
limiting factor for the static measurements. Therefore, the DAC amplifier as well as the
buffer need to be designed for low flicker noise to achieve a better DAC voltage trimming.
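The all-codes test described above can be sketched as follows. This is a minimal illustration assuming an endpoint-fit definition of INL and DNL; the dissertation does not state which fit was used, and the function name and the synthetic sweep are hypothetical.

```python
def dnl_inl(voltages):
    """Return (dnl, inl) in LSB for an all-codes sweep, using an endpoint fit."""
    n = len(voltages)
    # The average step between the first and last code defines one LSB.
    lsb = (voltages[-1] - voltages[0]) / (n - 1)
    # DNL[k]: deviation of the step from code k to code k+1 from one LSB.
    dnl = [(voltages[k + 1] - voltages[k]) / lsb - 1.0 for k in range(n - 1)]
    # INL[k]: deviation of code k from the straight line through the endpoints.
    inl = [(voltages[k] - voltages[0]) / lsb - k for k in range(n)]
    return dnl, inl


# Synthetic 10-bit sweep (assumption): ideal 3 V ramp, code 512 raised by 0.5 LSB.
ideal_lsb = 3.0 / 1023
sweep = [1.0 + k * ideal_lsb for k in range(1024)]
sweep[512] += 0.5 * ideal_lsb

dnl, inl = dnl_inl(sweep)
worst_inl = max(abs(v) for v in inl)
worst_dnl = max(abs(v) for v in dnl)
```

A single displaced code shows up once in the INL and twice, with opposite signs, in the DNL, which is why the two plots are reported separately.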
For the transient measurements, the digital data is loaded into the shift register at
[Figure 51 plots: panels (a), (b), and (c), each plotted against Digital Input Data.]
Figure 51. Experimental results obtained to characterize the static behavior of the 10-bit FGDAC. (a)
Output response of the FGDAC to 10-bit digital input code. The voltage output is a linear
function of the digital input word. (b) INL characterization results for 10-bit digital input
code. (c) DNL measurements of the FGDAC.
the FGDAC are obtained by testing the performance of the DAC for 95% of a full-scale
sinusoidal signal, as shown in Figure 52a. Also, the power spectrum of the output signal is
shown in Figure 52b. It is observed that the FGDAC yields an SFDR of 63.3dB for a 1kHz
output signal.
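For readers who want to reproduce an SFDR figure from sampled output data, the sketch below computes it with a plain DFT. It assumes coherent sampling (an integer number of signal periods in the record), so no window is applied; the function name and the synthetic tone are hypothetical, and the measurement procedure actually used for Figure 52b is not described in the text.

```python
import cmath
import math


def sfdr_db(samples):
    """SFDR in dB from a plain DFT: fundamental over the largest other non-DC bin."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2):               # skip DC, keep bins below Nyquist
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    fund = max(range(len(mags)), key=lambda i: mags[i])
    spur = max(m for i, m in enumerate(mags) if i != fund)
    return 20.0 * math.log10(mags[fund] / spur)


# Synthetic tone (assumption): 3.8 Vpp around 2.5 V plus a third harmonic at -60 dBc.
N, CYCLES = 256, 5                            # coherent sampling: 5 full periods
tone = [2.5
        + 1.9 * math.sin(2 * math.pi * CYCLES * i / N)
        + 1.9e-3 * math.sin(2 * math.pi * 3 * CYCLES * i / N)
        for i in range(N)]

sfdr = sfdr_db(tone)
```

With measured data that is not coherently sampled, a window function would be needed before the DFT, and the bins around the fundamental would have to be grouped.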
In this design, the unit capacitor is sized as 300fF. The areas of the individual blocks
are summarized in Table 5, and the die photo of the fabricated chip is shown in Figure 53.
The total DAC area including all the blocks is around 0.117mm², and the total die area
for the DAC including all the wires and blocks is 0.208mm². If this DAC was implemented
Figure 52. Dynamic measurements of the FGDAC: (a) 1kHz sinusoidal output response of the FGDAC
(DAC Output Voltage (V) versus Time (ms)). (b) Normalized power spectrum (Power (dB) versus
Frequency (Hz)) of the 1kHz, 3.8Vpp signal created by the FGDAC; SFDR = 63.3dB.
by using a binary-weighted capacitor array, the total DAC area would be 0.644mm² for the
same size unit capacitor. Therefore, the 10-bit FGDAC yields around a 3 times improvement
in the total DAC area compared to the 10-bit BWCDAC. The parameters of the FGDAC
To illustrate the total design gain of the FGDAC relative to the BWCDAC, the design
parameters are compared based on the assumption that the unit capacitor of the BWCDAC
is 10 times smaller than the unit capacitor of the FGDAC. In addition, the amplifier and load
capacitances are chosen as Camp = Cu and CL = 10Cu, where Cu is the unit capacitance of
the FGDAC. The results are summarized in Table 7. It is observed that when designed
with a one-stage amplifier, the FGDAC operates around 10 times faster than the BWCDAC,
and occupies around half the area of the BWCDAC. In the area calculation, it is assumed that
BWCDAC does not employ any layout technique, but in reality BWCDAC has to employ
it to improve its linearity. Therefore, the gain in the capacitor area is assumed to be much
higher with the FGDAC design. The trade-off with the FGDAC design is that the amplifier
contributes around 5 times (√b) more to the total DAC noise compared to the amplifier in
the BWCDAC. As long as the amplifier noise is kept below the other noise sources, the
FGDAC can provide better linearity with less area and faster speed.
scribed. Also, it is shown that it is a good candidate for implementing a compact and
low-power DAC. This structure can be used for a wide range of embedded system applications
where power and area are among the main concerns. The results illustrate the
flexibility and programmability of this architecture, which can be leveraged to create linear
Table 7. Design example for 10-bit DAC: Performance and area comparison. Unit capacitors of BWCDAC
and FGDAC: C = 0.1Cu and C = Cu. Camp = Cu, CL = 10Cu. Area: Aepot = 10ACu.
x = y = 100.
or non-linear output voltage spacing. Dynamic re-calibration can also be achieved using
CHAPTER 9
A RECONFIGURABLE MIXED-SIGNAL VLSI
IMPLEMENTATION OF DISTRIBUTED ARITHMETIC
The battery lifetime of portable electronics has become a major design concern as more
functionality is incorporated into these devices. Therefore, the shrinking power budget
of modern portable devices requires the use of low-power circuits for signal processing
applications. The data or media in these devices is generally stored in a digital format
but the output is still synthesized as an analog signal. Examples of such devices are flash
memory and hard disk based audio players. The signal processing functions employed
in these devices include finite impulse response (FIR) filters, discrete cosine transforms
(DCTs), and discrete Fourier transforms (DFTs). The common feature of these functions
is that they are all based on the inner product. DSP implementations typically make use
of multiply-and-accumulate (MAC) units for the calculation of these operations, and the
computation time increases linearly as the length of the input vector grows. In contrast,
inner product in a fixed number of cycles, which is determined by the precision of the input
data. It has been employed for image coding, vector quantization, discrete cosine transform
DA is computationally more efficient than the MAC-based approach when the input vector
length is large. However, the trade-off for the computational efficiency is the increased
power consumption and area usage due to the use of a large memory. These problems can
mance, power consumption, and area usage. In this work, we propose a mixed-signal DA
architecture built by utilizing the analog storage capabilities of floating-gate transistors for
application of the iterative nature of the DA computational framework, where many mul-
tipliers and adders are replaced with an addition stage, a single gain multiplication, and a
coefficient array.
readily ease the power consumption requirements of portable devices. Also, due to the
serial nature of the DA computation, the power and area of this filter increase linearly with
its order. Hence, this design approach allows for a compact and low-power implementation
architecture is explained, and the integration of tunable voltage references into the DA
for FIR filtering are presented. In the last part of the chapter, the characteristics of the
9.1 DA computation
The DA concept was first introduced by Croisier et al. [82], and later utilized for the
hardware implementation of digital filters using memory and adders instead of multipliers
[83]. It is an efficient computational method for computing the inner product of two vectors
in a bit-serial fashion [84]. The operation of DA can be derived from the inner product
equation as follows
\[ y[n] = \sum_{i=0}^{M-1} x[n-i]\, w[i] \tag{58} \]
In the case of FIR filtering, x is the input vector and w is the weight vector. Using a K-bit
2's-complement representation, x can be written as $x[n-i] = -b_{i0} + \sum_{j=1}^{K-1} b_{ij} 2^{-j}$, where
bi0 is the sign bit, bij is the jth bit of the ith element in the vector x, and bi(K−1) is the least
significant bit. Substituting x into (58), and by reordering the summations and grouping
the terms together, (58) can be written as
\[ y[n] = -\sum_{i=0}^{M-1} w_i b_{i0} + \sum_{j=1}^{K-1} 2^{-j} \sum_{i=0}^{M-1} w_i b_{ij} \tag{59} \]
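The equivalence between the inner product (58) and its bit-level form (59) can be checked numerically. The sketch below is an illustration only: the helper names, the example weights, and the example inputs are hypothetical. It evaluates (59) directly and also in the usual LSB-first bit-serial order, where the running result is halved before the next sum of weights is added and the sign-bit sum is subtracted in the last cycle.

```python
def to_twos_complement_bits(x, k):
    """Bits b0..b(K-1) of x in [-1, 1), so that x = -b0 + sum_j b_j * 2**-j."""
    code = round(x * 2 ** (k - 1))            # integer in [-2^(K-1), 2^(K-1) - 1]
    if not -(2 ** (k - 1)) <= code < 2 ** (k - 1):
        raise ValueError("x is outside the K-bit range")
    code &= (1 << k) - 1                      # wrap negatives into K-bit two's complement
    return [(code >> (k - 1 - j)) & 1 for j in range(k)]


def weight_sum(bits, w, j):
    """Sum of the weights selected by the jth bit of every input word."""
    return sum(wi * b[j] for wi, b in zip(w, bits))


def da_direct(bits, w):
    """Equation (59) evaluated term by term."""
    k = len(bits[0])
    y = -weight_sum(bits, w, 0)
    for j in range(1, k):
        y += 2.0 ** -j * weight_sum(bits, w, j)
    return y


def da_bit_serial(bits, w):
    """LSB-first evaluation: halve the running result, then add the next weight sum."""
    k = len(bits[0])
    acc = 0.0
    for j in range(k - 1, 0, -1):             # from the LSB (j = K-1) down to j = 1
        acc = 0.5 * acc + weight_sum(bits, w, j)
    return 0.5 * acc - weight_sum(bits, w, 0) # sign-bit cycle uses the opposite sign


# Hypothetical 3-tap example with K = 8; inputs are exact multiples of 2**-7.
K = 8
weights = [0.4, -0.25, 0.125]
inputs = [0.5, -0.75, 0.25]                   # x[n], x[n-1], x[n-2]
bit_words = [to_twos_complement_bits(x, K) for x in inputs]

y_mac = sum(wi * xi for wi, xi in zip(weights, inputs))
y_direct = da_direct(bit_words, weights)
y_serial = da_bit_serial(bit_words, weights)
```

The bit-serial form needs K cycles regardless of the number of taps, since each cycle consumes one bit from every input word at once.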
In digital implementations, the summation, $\sum_{i=0}^{M-1} w_i b_{ij}$, is pre-computed and stored in
a memory for multiplier-less operation and reduced hardware complexity. This is usually
element, a shifter, a switch, and an adder as illustrated by Figure 54a. By reusing the
the number of taps, M, and without using a multiplier. Digital DA architectures obtain
In contrast to digital implementations, the addition in the analog domain is much more
power and area efficient. Therefore, the high memory usage of digital DA implementations
can be eliminated by processing the digital input data in the analog domain as shown in
Figure 54b. To design such a structure, weights in (58) are stored in the analog domain.
For an individual weight, data is processed in a similar way as it is achieved by serial DACs,
an array of tunable FG voltage references (epots) [73], inverting amplifiers (AMP), and
sample-and-hold (SH) circuits, as illustrated in Figure 55. The timing of the digital data
and control bits governs the DA computation and is illustrated in Figure 56. Digital inputs
are introduced to the system by using a serial shift register. These digital input words
represent the digital bits, bij in (59), which select the epot voltages to form the appropriate
sum of weights necessary for the DA computation at the jth bit. The clock frequency of the
shift register is dependent on the input data precision, K, and the length of the filter, M, and
is equal to M · K times the sampling frequency. Once the jth input word is serially loaded
Figure 54. Basic DA hardware architecture. bi,k is the input bit for kth cycle of operation and y[n] is the
output. (a) Digital implementation. (b) Proposed hybrid mixed-signal implementation using
digital input data and stored analog weights. Digital input data is processed in the analog
domain.
into the top shift register, the data from this register is latched at K times the sampling
frequency. If the area used by the shift registers is not a design concern, then ideally an M-
tap FIR filter should have M shift registers. A clock that is K times faster than the sampling
The analog weights of DA are stored by the epots. When selected, these weights are
added by employing a charge amplifier structure composed of same size capacitors, and a
two-stage amplifier, AMP1. The epot voltages as well as the rest of the analog voltages in
the system are referenced to a reference voltage, Vref = 2.5V. Since the addition operation
is performed by using an inverting amplifier, the relative output voltage, when Reset signal
Figure 55. Implementation of the 16-tap hybrid FIR filter. bi is the input bit for jth cycle of operation
and y(t) is the output. Epots store the analog weights. Sample-and-holds, SHs, are used to
obtain the delay and hold the computed output voltage.
is enabled, becomes equal to the negative sum of the selected weights for Cini = CFBamp1.
For the first computational cycle, the result of the addition stage represents the summation,
$\sum_{i=0}^{M-1} w_i b_{i(K-1)}$ in (59), which is the addition of weights for the LSBs of the digital input
data.
In the feedback path of the system, delay, invert, and divide-by-two operations
are used for the DA computation. For that purpose, sample-and-hold circuits, SH1 and
SH2, and inverting amplifiers, AMP1 and AMP2, are employed in the implementation. The
SH circuits store the amplifier output to feed it back to the system for the next cycle of the
computation. Non-overlapping clocks, CLK1 and CLK2, are used to hold the analog voltage
while the next stream of digital data is introduced to the addition stage. These clocks have
a frequency of K times the sampling frequency. The stored data is then inverted relative
to the reference voltage by using the second inverting amplifier, AMP2, to obtain the same
sign as the summed epot voltages. AMP2 is identical to AMP1, and has the same size
input/feedback capacitors. After obtaining the delay and the sign correction, the stored
analog data is fed back to the addition stage as delayed analog data. During the addition, it
is also divided by two by using CFB = CFBamp1/2 = C/2, which gives a gain of 0.5 when it
Figure 56. Digital clock diagram of the filter architecture. For desired sampling frequency, fs, K-bit
precise M-bit digital input data is loaded serially to a shift register at a K · M · fs clock
frequency, and latched at a K · fs clock frequency. CLK1, CLK2, and CLK3 are the bits used
to control SH1, SH2, and SH3, respectively. Invert signal is used to obtain 2's-complement
compatibility. Also, Reset signal is used to clear the result of the previous computation.
is added to the new sum. This operation is repeated until the MSBs of the digital input data
are loaded into the shift register. The MSBs correspond to (K−1)th bits, and are used to make
the inverting amplifier in the feedback path during the last cycle of the computation by
enabling the Invert signal. As a result, during the last cycle of the computation, the relative
where the first term is the result of the calculation with the sign bits. Finally, when the
which is enabled once every K cycles. SH3 holds the computed voltage until the next analog
output voltage is ready. The new computation starts by enabling the Reset signal to zero
out the effect of the previous computation. Then, the same processing steps are repeated
9.3 Circuit description of computational blocks
To achieve an accurate computation using DA, the circuit components are designed to min-
imize the gain and offset errors in the signal path. In this architecture, those components
The epot, shown in Figure 13a, is modified from its original version [73] to obtain a
that uses a low-noise amplifier integrated with floating-gate transistors and programming
circuitry to tune the stored analog voltage. The amplifier in the epot circuit is used to
buffer the stored analog voltage so that the epot can achieve low noise and low output
resistance as well as the desired output voltage range. An array of epots is used for storing
the filter weights; and during the programming, individual epots are controlled and read by
employing a decoder.
In this architecture, epots and inverting amplifiers are the main blocks that use FG
transistors to exploit their analog storage and capacitive coupling properties. A precise
tuning of the stored voltage on FG node is achieved by utilizing the hot-electron injection
and the Fowler-Nordheim tunnelling mechanisms. The epots employ FG transistors to store
the analog coefficients of the inner product. In contrast, the inverting amplifiers use them
not only to obtain capacitive coupling at their inverting-node, but also to remove the offset
at their FG terminals.
One of the main advantages of exploiting FG transistors in this design is that the area
allocated for the capacitors can be dramatically reduced. It is shown in [75] that epots can
helps to overcome the area overhead, which is mainly due to layout techniques used to
minimize the mismatches between the input and feedback capacitors. Similarly in this DA
implementation, the unit capacitor, C, is set to 300fF, and no layout technique is employed.
As expected, due to inevitable mismatches between the capacitors, there will be a gain error
contributed from each input capacitor. The stored weights are also used to compensate for this
mismatch. When the analog weights are stored to the epots, the gain errors are also taken
without resetting the inverting node of the amplifiers. This is because the floating-gate
inverting-node of the amplifiers allows for continuous-time operation. This design approach
eliminates the need for multi-phase clocking or resetting. Inverting amplifiers are
implemented by using a two-stage amplifier structure [76], shown in Figure 57a, to obtain
a high gain and a large output swing. Similar to the epots, the charge on the floating-gate
node of these amplifiers is precisely programmed by monitoring the amplifier output while
the system operates in the reset mode. In this mode, the shift registers are cleared and the
Reset signal is enabled. Therefore, all the input voltages to the input capacitors including
the voltage to the feedback capacitor, C FB , are set to the reference voltage. These conditions
ensure that the amplifier output becomes equal to the reference voltage when the charge on
the floating-gate is compensated. The charge on the floating-gate terminal is tuned using
the hot-electron injection and the Fowler-Nordheim tunnelling mechanisms. By using this
technique, the offset at the amplifier output is reduced to less than 1mV.
and high sampling precision due to the bit-serial nature of the DA computation. Therefore,
these circuits are implemented by utilizing the sample-and-hold technique using Miller
hold capacitance [85], as illustrated in Figure 57b. This compact circuit minimizes the
signal dependent error, while maintaining the sampling speed and precision by using the
Miller capacitance technique together with Amp3 shown in Figure 57c. For simplification,
if we assume there is no coupling between M1 and M2 , and amplifier, Amp3 , has a large
gain, then the pedestal error contributed from turning switches (M1 and M2 ) off can be
written as
\[ \Delta V_{S1} + \Delta V_{S2} = \frac{\Delta Q_1 (C_2 + C_{2B})}{C_{2B}(C_1 + C_2) + C_1 C_2 (A + 1)} + \frac{\Delta Q_2}{C_2} \tag{61} \]
where ∆Q1 and ∆Q2 are the charges injected by M1 and M2 , respectively. Also, A and
C2B are the gain and input capacitance of the amplifier, Amp3. ∆Q2 is independent of the
input level; therefore, ∆VS2 can be treated as an offset. In addition, the error contributed by
M1, ∆VS1, can be minimized by the Miller feedback, and this error decreases as A increases
[85]. Due to the serial nature of the DA computation, offset in the feedback path is attenuated as
the precision of the digital input data increases. Therefore, Amp3 is designed to minimize
as shown in Figure 57d, to achieve a high gain and fast settling. Two SH circuits are used in
the feedback path to obtain the fixed delay for the sampled analog voltage. In addition, the
third SH is utilized to sample and hold the final computed output once every K cycles. This
SH uses a negative-feedback output stage [77], shown in Figure 57e, to be able to buffer
the output voltage off-chip. Due to the performance requirements of the system, these SH
which is configured as an FIR filter. The measurement results are obtained from the chips
that were fabricated in a 0.5µm CMOS process. This 16-tap FIR filter is designed to run
the digital input data is set to 8 for these experiments. To meet this sampling rate, the data
is loaded into the upper shift register at a rate of 3.84MHz for a 32kHz sampling frequency
a band-pass filter. The coefficients of these filters are shown in Table 8. Ideal coefficients
are given to illustrate how close the epots are programmed to obtain the actual coefficients.
The epots are programmed relative to a reference voltage, Vref, which is set to 2.5V. The
errors of the stored epot voltages are kept below 1mV to minimize the effect of weight errors
An 858Hz sinusoidal output of the low-pass filter at a 50kHz sampling rate is illustrated
43dB. For the comb filter with a 22kHz input signal frequency, it is observed that the SFDR
does not degrade as shown in Figure 58b. Although the input precision was set to 8 bits,
the gain error in the system as well as noise in the experimental set-up limit the maximum
achievable SFDR.
The second experiment is performed to characterize the magnitude and phase responses
of the filters. For that purpose, a sinusoidal wave at a fixed sampling rate, 32/50kHz, is
generated using the digital data, and the magnitude and phase responses are measured by
Figure 58. Transient responses for 50kHz sampling frequency and their power spectra. (a) Low-pass
filter output has a frequency of 858Hz. (b) Comb filter output has a frequency of 22kHz.
sweeping the frequency of the input sine wave from DC to 16/25kHz. For this experiment,
256 data points are collected to accurately measure the frequency response of these filters.
These responses follow the ideal responses closely even if the sampling rate is increased
as illustrated in Figures 59a, 59b, and 59c. Any variation in the frequency response as the
sampling rate increases is caused by the noise and offset in the feedback path as well as due
to the performance degradation of the circuits. As the output signal amplitude becomes
very low, the experimental set-up limits the resolvable magnitude and phase. As expected
for a symmetrical FIR filter, the measured phase responses of comb, low-pass, and band-
The static power consumption of the fabricated chip is measured as 16mW. Most of
the power is consumed by the SH and inverting amplifier circuits. The die photo of the
designed chip is shown in Figure 60. The system occupies around half of the 1.5 × 1.5mm²
die area. The cost to increase the filter order is 0.011mm² of die area and 0.02mW of power
for each additional filter tap. This readily allows for the implementation of high-order
Table 8. Ideal and actual (programmed epot voltages) coefficients of the comb, low-pass, and band-pass
filters.
              Comb                  LPF                   BPF
Coefficient   Ideal   Actual (V)    Ideal     Actual (V)  Ideal    Actual (V)
1             0.4     2.0996        -0.0190   2.5192      0.033    2.4670
2             0       2.4994        -0.0390   2.5393      -0.064   2.5639
3             0       2.4994        0.0260    2.4738      -0.053   2.5530
4             0       2.5007        0.0160    2.4835      0.038    2.4617
5             0       2.5005        -0.0240   2.5239      0.047    2.4528
6             0       2.5000        -0.0360   2.5362      -0.054   2.5541
7             0       2.4999        0.0600    2.4401      -0.056   2.5561
8             0       2.4994        0.1800    2.3201      0.057    2.4425
9             0       2.4997        0.1800    2.3201      0.057    2.4427
10            0       2.5002        0.0600    2.4391      -0.056   2.5560
11            0       2.4998        -0.0360   2.5358      -0.054   2.5535
12            0       2.5002        -0.0240   2.5240      0.047    2.4527
13            0       2.5001        0.0160    2.4853      0.038    2.4616
14            0       2.5001        0.0260    2.4743      -0.053   2.5526
15            0       2.4997        -0.0390   2.5389      -0.064   2.5638
16            0.4     2.0996        -0.0190   2.5184      0.033    2.4669
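To connect the tabulated coefficients to a frequency response, the sketch below evaluates the ideal response of the comb filter from the Comb column of Table 8. The function name is hypothetical. The mapping from a coefficient w to an epot target voltage, Vref − w, is an inference from the table entries (0.4 pairs with about 2.1V) and from the inverting addition stage; it is not stated as a formula in the text.

```python
import cmath
import math


def fir_response(w, f, fs):
    """Ideal FIR response H = sum_i w[i] * exp(-j*2*pi*f*i/fs) at frequency f."""
    return sum(wi * cmath.exp(-2j * math.pi * f * i / fs) for i, wi in enumerate(w))


# Ideal comb coefficients from Table 8: taps 1 and 16 are 0.4, the rest are 0.
COMB_IDEAL = [0.4] + [0.0] * 14 + [0.4]

V_REF = 2.5
# Assumed mapping (inferred from Table 8): epot target voltage = Vref - coefficient.
epot_targets = [V_REF - wi for wi in COMB_IDEAL]

FS = 50e3                                     # one of the two sampling rates used
dc_gain = abs(fir_response(COMB_IDEAL, 0.0, FS))
first_null = abs(fir_response(COMB_IDEAL, FS / 30.0, FS))
```

For this coefficient set the magnitude is 0.8 |cos(15πf/fs)|, so the nulls repeat every fs/15 starting at fs/30.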
[Figure 59 plots: ideal and measured Magnitude (dB) and Phase (deg) versus Normalized Frequency (×π rad/sample).]
Figure 59. Magnitude and phase responses at 32/50kHz sampling rates. (a) Comb filter. (b) Low-pass
filter. (c) Band-pass filter.
9.5 Discussion
The proposed DA structure which can be used for FIR filtering circumvents some of these
problems by employing DA for signal processing and utilizing the analog storage capabil-
urability. In this way, the DAC is used as a part of the DA implementation, which helps
by using different capacitor ratios, the proposed implementation offers more design flexibility
since its coefficients can be set by tuning the stored weights at the epots. Also, offset
accumulation and signal attenuation make it difficult to implement long tapped delay lines
decreases the offset as the precision of the digital input data increases. Also, the gain error
in this implementation is mainly caused by the two inverting stages (implemented using
AMP1 and AMP2 ), and can be minimized using special layout techniques only at these
stages. The measurement results illustrated that the output signal of the filter follows the
ideal response very closely. This is mainly because it is mostly insensitive to the number of
filter taps and most of the computation is performed in the feedback path. Also, the power
and area of the proposed design increase linearly with the number of taps due to the serial
nature of the DA computation. Therefore, this design approach is well suited for compact
programmable analog coefficients of this filter will enable the implementation of adaptive
systems that can be used in applications such as adaptive noise cancellation and adaptive
can also be utilized for signal processing transforms such as a modified discrete cosine
transform.
CHAPTER 10
IMPACTS AND APPLICATIONS OF THE PRESENTED WORK
10.1 Impacts
In this work, a tunable voltage reference and a family of tunable resistors are designed
to bring tunability and reconfigurability to analog and mixed-signal circuits.
In this way, the work aims to alleviate the precision, accuracy, compactness, and power
consumption issues associated with technology scaling and digital circuit implementations.
I have designed, simulated, and tested a tunable floating CMOS resistor using floating-
gate transistors and gate linearization technique. I also analyzed this technique to
determine the limitations of the method. This resistor uses only 2 capacitors and 1
sired performance the die area of this resistor can be easily optimized. Within the
consume additional power for the linearization and its operation does not depend on
the supply rails. In addition, since floating-gate transistors can store analog voltages,
this resistor stores its own resistor value, and thus becomes very suitable for applications
where an array of resistors is needed. It yields around 1.3% linearity for 1Vpp
sinusoidal signals and its linearity is mainly limited by the body effect.
I have designed, simulated, and tested a compact tunable CMOS resistor by employ-
based on the gate linearization is a floating resistor, by sacrificing this feature and
operating it as a grounded resistor, the tunable resistor using scaled-gate linearization
yields a more compact and more linear resistor. Better than 7-bit linearity is
obtained for a 1Vpp sinusoidal input. I have shown that the resistance and temperature
coefficient of this resistor can be tuned to a desired operating point. By using the
stress tests, I have demonstrated that the resistance of this resistor drifts negligibly
over time. Based on the worst-case data, it is calculated that the resistance drifts
1.6 × 10⁻³% over the period of 10 years at 25°C. Similar to the resistor circuit using
gate-linearization, this circuit does not consume additional power for the offset
have designed, implemented, and tested a binary-weighted resistor DAC using this
tunable floating-gate CMOS resistor. The software code to test this converter has
been written by me and Mr. Haw-Jing Lo. I have demonstrated that 15-bit accurate
Linearization Technique:
I have designed, simulated, and tested a tunable highly linear CMOS resistor by
have shown that this resistor offers a compact and power efficient implementation
that yields around 72dB of linearity. I have analyzed the common-mode linearization
method and shown the linearity limitations. I have also demonstrated the limitations
in the implementation and showed the possible causes and their effects.
I have employed this tunable resistor in the design of highly linear amplifier and
two-quadrant multiplier circuits. I have designed, simulated, and tested these highly
linear amplifier and multiplier circuits. The amplifier exhibited 0.018% THD for a
1Vpp differential input, and a linear input range of 2.5Vpp. With this implementation
it is possible to set the gain of the amplifier accurately and precisely.
I have designed, simulated, and tested a tunable voltage reference (epot). This epot
has been designed to store the scale factors of a binary-weighted capacitor DAC
and the coefficients of a distributed-arithmetic based FIR filter. For that purpose,
keeping the noise, temperature variation, and charge loss minimized. The mea-
sured thermal noise level of this voltage reference is −120dB, and its noise corner
of 20.8ppm/°C. Moreover, based on the stress test it is calculated that the stored
epot voltage drifts 10⁻³% over the period of 10 years at 25°C. I have analyzed the
noise, temperature dependence, and retention of this reference to correlate with the
measured data.
converter. I have analyzed its noise, speed, and area. Also, I have compared its
converter. I have shown that when the unit capacitor of the BWCDAC is 10 times
smaller than the unit capacitor of the FGDAC and these converters are designed with a
one-stage amplifier, the FGDAC operates around 10 times faster than the BWCDAC,
and occupies less than half the area of the BWCDAC. Also, I have
shown that as long as the amplifier noise is kept smaller than the other noise sources,
the FGDAC can provide better linearity with less area and faster speed. Therefore,
this structure will enable very compact and low-power implementation of digital-to-
analog converters.
The idea of this structure was first proposed by Dr. Paul E. Hasler, and the test codes
to characterize this converter have been written by me along with Mr. Christopher
ture. Along with Mr. Walter Huang, I have tested this architecture and demonstrated
its functionality for FIR filters. Compared to existing FIR filter implementations, the
proposed implementation offers more design flexibility since its coefficients can be
Offset accumulation and signal attenuation in the traditional FIR filter implementa-
tions make it difficult to implement long tapped delay lines with these approaches. In
the proposed implementation, we have shown that DA processing decreases the offset
as the precision of the digital input data increases. Also, the power and area of the
proposed design increase linearly with the number of taps due to the serial nature of
the DA computation. Therefore, this design approach is well suited for compact and
The idea of this structure was first proposed by Dr. Paul E. Hasler, Dr. David Ander-
10.2 Applications
The presented circuits can be used in a variety of applications where the accuracy and
precision or the area and power consumption become the main design concerns. Some of
10.2.1 Tunable resistors
Due to their compactness and power efficiency, these resistors can be used in low-power
implementations of the ANN systems for storing and tuning the weights.
Moreover, the resistor based on the common-mode linearization technique offers high
linearity at very low power consumption. Therefore, as also demonstrated
in this work, this resistor becomes very useful for highly linear amplifier and multiplier
circuits. Also, it can be integrated into a variable-gain amplifier to set the gain of the amplifier
to a desired point.
I have shown that in addition to the resistance of these resistors, their temperature co-
efficients can also be tuned. Therefore, they can be used in current reference and voltage
reference circuits to implement tunable references with very low temperature coefficients.
digital-to-analog converters. Therefore, these resistors will enable the design of multi-bit
10.2.2 Epot
In this work, I have demonstrated that a tunable reference can be built by using the analog
storage capability of the floating-gate transistors. Also, I have shown that this tunable
arithmetic based FIR filter. In addition to these applications, epots can be used in the
The programmable analog coefficients of the distributed-arithmetic based FIR filter will en-
able the implementation of adaptive systems that can be used in applications such as adap-
tive noise cancellation and adaptive equalization. Since distributed arithmetic is an efficient
computation of an inner product, this architecture can also be utilized for signal processing
architecture can be employed for image coding, vector quantization, and adaptive filtering
implementations.
APPENDIX A
LINEARITY ANALYSIS OF GATE AND COMMON-MODE
LINEARIZATION TECHNIQUES
In order to analyze these nonlinearities, the drain current of an nMOS transistor in the
$$I_d = \frac{\mu C_{ox} W}{L}\left[f(v_g, v_d, v_s) - g(v_b, v_d, v_s)\right] \qquad (62)$$
where $\mu$ is the carrier mobility, $C_{ox}$ is the gate capacitance per unit area, $W$ is the channel
width, $L$ is the channel length, and $v_g$, $v_d$, $v_s$, and $v_b$ are the gate, drain, source, and body
voltages (referenced to ground), respectively [59]. Similar to the drain current, the
carrier mobility also depends on the terminal voltages and can be expressed in terms
of $f$ and $g$ as follows
$$\mu = \frac{\mu_0}{1 + \dfrac{\theta}{v_d - v_s}\left[f(v_g, v_d, v_s) + g(v_b, v_d, v_s)\right]} \qquad (63)$$
$$f(v_g, v_d, v_s) = \left[v_g - V_{FB} - \phi\right](v_d - v_s) - \frac{1}{2}\left(v_d^2 - v_s^2\right) \qquad (64)$$
$$g(v_b, v_d, v_s) = \frac{2\gamma}{3}\left[(v_d - v_b + \phi)^{3/2} - (v_s - v_b + \phi)^{3/2}\right] \qquad (65)$$
where VFB is the flat-band voltage, φ is the surface potential, γ is the body-effect coefficient
[60]. In the subsequent subsection, the above equations are utilized to analyze the common-
result, f becomes
$$f(v_{ds}) = \left[V_G + v_c - V_{FB} - \phi\right]v_{ds} - \frac{v_{ds}^2}{2} = \left[V_G - V_{FB} - \phi\right]v_{ds} \qquad (66)$$
Also, g can be expanded by using the Taylor series expansion at vds = 0. This can be
computation. In this case, g can be expressed as
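$$g = \frac{2\gamma}{3}u^{3/2}\left[\left(1 + \frac{\delta}{u}\right)^{3/2} - \left(1 - \frac{\delta}{u}\right)^{3/2}\right] \qquad (67)$$
For $u \gg \delta$, $g$ becomes
$$g = \frac{2\gamma}{3}u^{3/2}\left[3\frac{\delta}{u} - \frac{\delta^3}{8u^3} - \frac{3\delta^5}{128u^5} + \dots\right] \qquad (68)$$
If the higher-order terms are ignored, and for $\delta = v_{ds}/2$, $g$ can be written as
$$g(v_{ds}) = \gamma\left[v_{ds}(v_c - v_b + \phi)^{1/2} - \frac{v_{ds}^3}{96(v_c - v_b + \phi)^{3/2}}\right] \qquad (69)$$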
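The quality of the truncation leading to (69) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes $u = v_c - v_b + \phi$ (inferred from the form of (69)), takes $v_d = v_c + v_{ds}/2$ and $v_s = v_c - v_{ds}/2$, and uses purely hypothetical values for $\gamma$ and $u$. The function names are illustrative.

```python
import math

def g_exact(v_ds, u, gamma):
    """Body-effect term of (65)/(67) with delta = v_ds / 2.

    Assumes u = v_c - v_b + phi, so that v_d - v_b + phi = u + delta
    and v_s - v_b + phi = u - delta.
    """
    delta = v_ds / 2.0
    return (2.0 * gamma / 3.0) * ((u + delta) ** 1.5 - (u - delta) ** 1.5)

def g_approx(v_ds, u, gamma):
    """Truncated expansion (69): linear term plus the cubic correction."""
    return gamma * (v_ds * math.sqrt(u) - v_ds ** 3 / (96.0 * u ** 1.5))

# Hypothetical values, chosen only for illustration.
gamma = 0.5   # body-effect coefficient, V^0.5
u = 2.0       # v_c - v_b + phi, V

for v_ds in (0.1, 0.5, 1.0):
    exact = g_exact(v_ds, u, gamma)
    approx = g_approx(v_ds, u, gamma)
    rel_err = abs(exact - approx) / exact
    print(f"v_ds = {v_ds:.1f} V  exact = {exact:.8f}  "
          f"approx = {approx:.8f}  rel. error = {rel_err:.2e}")
```

Under these assumed values the neglected fifth-order term stays several orders of magnitude below the linear term, which is consistent with dropping it for $u \gg \delta$.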
By using (66) and (69), and the definitions in (9) and (11), the drain current (without the
$$\mu = \frac{\mu_0}{1 + \theta\left(V_{G1} + V_{c1} - \dfrac{\gamma^4 v_{ds}^2}{96V_{c1}^3}\right)} \qquad (71)$$
$$\mu = \frac{\mu_1}{1 + \theta_1\left(V_{c1} - \dfrac{\gamma^4 v_{ds}^2}{96V_{c1}^3}\right)} \qquad (72)$$
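For $\theta_1 \ll 1/\left(V_{c1} - \frac{\gamma^4 v_{ds}^2}{96V_{c1}^3}\right)$, and applying the approximation $y = 1/(1 + x) \approx 1 - x$ to (72), the
$$I_d = \frac{\mu_1 C_{ox} W}{L}\left[v_{ds}(V_G - V_T)(1 - \theta_1 V_{c1}) + \frac{\gamma^4 v_{ds}^3}{96V_{c1}^3}\left(1 + \theta_1(V_{G1} - 2V_{c1})\right)\right] \qquad (73)$$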
In addition to the common-mode gate signal, the common-mode body signal (vb = −VB +vc )
$$g(v_{ds}) = \gamma\left[v_{ds}(V_B + \phi)^{1/2} - \frac{v_{ds}^3}{96(V_B + \phi)^{3/2}}\right] \qquad (74)$$
By using (66) and (74), the drain current can be written as
$$I_d = \frac{\mu C_{ox} W}{L}\left\{\left[V_G - V_T\right]v_{ds} + \frac{\gamma v_{ds}^3}{96(V_B + \phi)^{3/2}}\right\} \qquad (75)$$
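$$\mu = \frac{\mu_2}{1 - \dfrac{\theta_2\gamma v_{ds}^2}{96(V_B + \phi)^{3/2}}} \qquad (76)$$
Finally, by applying $\theta_2 \ll \frac{96(V_B + \phi)^{3/2}}{\gamma v_{ds}^2}$ and the approximation $y = 1/(1 - x) \approx 1 + x$ to (76), the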
APPENDIX B
SPEED ANALYSIS OF BWCDAC AND FGDAC
The speed performance of the BWCDAC and the FGDAC is analyzed and compared by
using the DAC structure illustrated in Figure 61. To simplify the analysis, the
output resistances of the voltage reference and the epots are assumed to be much smaller than
the on-resistance of the switches, and are therefore ignored. Also, the time
constants of the input branches are assumed to be the same. Hence, $R_{on,i} = R_{on}/2^{i-1}$ and $C_i = 2^{i-1}C$
for the BWCDAC, and $R_{on,i} = R_{on}$ and $C_i = C$ for the FGDAC, where $i = 1, \dots, N$, $C$ is the
Using the small signal models illustrated in Figure 62, the relation between Vin , V x , and
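$$\frac{V_{in}C_{eq1}}{1 + R_{eq1}C_{eq1}s} = V_x\left(C_f + C_{amp} + \frac{C_{eq}}{1 + R_{eq2}C_{eq2}s}\right) - V_{out}C_f \qquad (78)$$
where $C_{eq1} = \sum_{i=1}^{N} b_i C_i = k_1 C$, and $R_{eq1} = R_{on}/k_1$. Similarly, $C_{eq2} = \sum_{i=1}^{N} b_i C_i = k_2 C$ and
$R_{eq2} = R_{on}/k_2$. In addition, $C_{eq} = C_{eq1} + C_{eq2}$ and $R_{eq1}C_{eq1} = R_{eq2}C_{eq2} = R_{on}C$. Based on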
the amplifier type and model used in the analysis, the transfer function of the DACs can be
different. In the next subsections, one-stage and two-stage amplifier models are utilized to
verting amplifier. Based on the illustrated model for the amplifier, Vout can be expressed in
terms of V x as follows
where $G_m$ and $R_o$ are the transconductance and output resistance of the amplifier, and $C_f$
and $C_L$ are the feedback and load capacitors. Assuming the amplifier gain, $G_m R_o$, is large
Figure 61. DAC structure used to analyze the BWCDAC and the FGDAC. $C_L$ and $C_f$ are the load and feedback
capacitors, respectively. Also, for $x = i, j, \dots, m, n$, $R_{on,x}$ is the on-resistance of the switches
and $C_x$ is the input capacitor. This structure illustrates the connections when some of the
input capacitors are connected to the input while others are connected to ground. For the
BWCDAC, $V_{in}$ is equal to $V_{ref}$, while it is equal to $V_{epot}$ for the FGDAC. For simplification, the
output resistances of the reference and the epots are assumed to be much smaller than $R_{on}$.
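The equal-time-constant assumption used in this appendix can be illustrated with a short sketch. It is not taken from the original work: the function name, the bit pattern, and the component values are hypothetical. It combines the selected input branches into one equivalent capacitance and one equivalent switch resistance and checks that their product equals $R_{on}C$ for both the binary-weighted (BWCDAC) and the unit-capacitor (FGDAC) cases.

```python
import math

def equivalent_branch(bits, r_on, c_unit, binary_weighted):
    """Return (C_eq1, R_eq1) of the selected input branches.

    bits[i-1] is b_i for i = 1..N. Each branch is a switch resistance in
    series with an input capacitor; all branches share the same time
    constant r_on * c_unit, so they combine into a single series RC.
    """
    c_eq = 0.0
    g_eq = 0.0  # conductances of the parallel switches add
    for i, b in enumerate(bits, start=1):
        if not b:
            continue
        scale = 2 ** (i - 1) if binary_weighted else 1
        c_eq += scale * c_unit   # C_i = 2^(i-1) C (BWCDAC) or C (FGDAC)
        g_eq += scale / r_on     # R_on,i = R_on / 2^(i-1) (BWCDAC) or R_on (FGDAC)
    if g_eq == 0.0:
        return 0.0, float("inf")  # no branch selected
    return c_eq, 1.0 / g_eq

# Hypothetical values for illustration only.
bits = [1, 0, 1, 1]          # b_1 .. b_4
r_on, c_unit = 1.0e3, 100e-15

for name, weighted in (("BWCDAC", True), ("FGDAC", False)):
    c_eq1, r_eq1 = equivalent_branch(bits, r_on, c_unit, weighted)
    k1 = c_eq1 / c_unit
    print(f"{name}: k1 = {k1:.0f}, R_eq1 * C_eq1 == R_on * C: "
          f"{math.isclose(r_eq1 * c_eq1, r_on * c_unit)}")
```

In both cases the factor $k_1$ cancels in the product $R_{eq1}C_{eq1}$, which is why a single time constant $R_{on}C$ appears in the transfer functions.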
where $C_{amp}$ is the amplifier input capacitance. Assuming $C_1^4 \gg 4G_m C_f R_{eq1}C_{eq1}C_2^2$, the
$$p_1 = -\frac{G_m C_f}{C_1^2} \quad \& \quad p_2 = -\frac{C_1^2}{R_{eq1}C_{eq1}C_2^2} \qquad (83)$$
Based on this analysis, the time constants, τDAC1 and τDAC2 , can be computed as
$$\tau_{DAC2} = \frac{R_{on}C\left[C_f(C_{amp} + C_L) + C_L C_{amp}\right]}{C_f(G_m R_{on}C + C_L) + (C_{amp} + C_{eq})(C_L + C_f)} \qquad (85)$$
Figure 62. Small signal models used to analyze the DAC structures. $R_{eq1}$ and $C_{eq1}$ are the equivalent
resistance and capacitance of the selected branches, and $R_{eq2}$ and $C_{eq2}$ are the equivalent
resistance and capacitance of the unselected branches. Also, $C_{amp}$ is the input capacitance of
the amplifier, and $C_L$ is the load capacitor. (a) Using a simplified one-stage amplifier model.
$G_m$ and $R_o$ are the transconductance and output resistance of the amplifier. (b) Using a
simplified two-stage amplifier model. $A$ is the amplifier gain that has one dominant pole.
For large values of N and small values of Gm Ron , the time constants of the BWCDAC,
Similarly, based on these assumptions, the time constants of the FGDAC, τFGDAC1 and
$$\tau_{FGDAC1} = \frac{C_L + (N + C_{amp}/C)(C_L + C)}{G_m} \qquad (88)$$
$$\tau_{FGDAC2} = \frac{R_{on}C\left(C(C_{amp} + C_L) + C_L C_{amp}\right)}{C_L C + (C_{amp} + NC)(C_L + C)} \qquad (89)$$
As a result, the speed performances of the BWCDAC and the FGDAC can be compared
based on these approximated time constants, and the relationship between τDAC1 and τDAC2
is expressed as
$$\tau_{DAC1} = \frac{1}{\tau_{DAC2}} \cdot \frac{C_x R_{on}C}{G_m} \qquad (90)$$
where $C_x = C_f(C_{amp} + C_L) + C_L C_{amp}$. The above equation implies that the product of
the BWCDAC time constants is equal to the product of the FGDAC time constants
if the load capacitance, the on-resistance of the switches, the amplifier transconductance,
Figure 62b can be used to express $V_{out}$ in terms of $V_x$. In this model, it is assumed that the
second pole and the zero of the amplifier are beyond the gain-bandwidth of the amplifier.
Therefore, $V_{out}$ becomes equal to $-A(s)V_x$ for $A(s) = GB/(s + \omega_a)$, where $GB$ is the gain-
bandwidth and $\omega_a$ is the dominant pole of the amplifier. As a result, for a large amplifier
$$C_2 = C_f + C_{amp} \qquad (93)$$
The poles of (91) for $C_1^2 \gg 4C_f R_{eq1}C_{eq1}C_2 GB$ can be found as
$$p_1 = -\frac{C_f GB}{C_1} \quad \& \quad p_2 = -\frac{C_1}{R_{eq1}C_{eq1}C_2} \qquad (94)$$
Based on the computed poles, the time constants, τDAC1 and τDAC2 , can be expressed as
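$$\tau_{DAC1} = \frac{1}{GB} \cdot \left(1 + R_{on}C \cdot GB + \frac{C_{eq} + C_{amp}}{C_f}\right) \qquad (95)$$
$$\tau_{DAC2} = \frac{R_{on}C}{1 + \dfrac{C_{eq} + R_{on}C \cdot C_f GB}{C_f + C_{amp}}} \qquad (96)$$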
The expressions for the time constants of the BWCDAC, τBWCDAC1 and τBWCDAC2 , can be
$$\tau_{BWCDAC1} = \frac{2 + C_{amp}/(2^N C)}{GB} + R_{on}C \qquad (97)$$
$$\tau_{BWCDAC2} = \frac{R_{on}C(2^N C + C_{amp})}{C_{amp} + 2^N C(2 + R_{on}C \cdot GB)} \qquad (98)$$
Similarly, the time constants of the FGDAC, τFGDAC1 and τFGDAC2 , become
$$\tau_{FGDAC1} = \frac{N + 1 + C_{amp}/C}{GB} + R_{on}C \qquad (99)$$
$$\tau_{FGDAC2} = \frac{R_{on}C(C + C_{amp})}{C_{amp} + (N + 1)C + R_{on}C^2 \cdot GB} \qquad (100)$$
$$\tau_{DAC1} = \frac{1}{\tau_{DAC2}} \cdot \frac{R_{on}(C + C_{amp})}{GB} \qquad (101)$$
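The relation between the general two-stage expressions (95) and (96) and the FGDAC results (99) to (101) can be checked numerically. The sketch below is illustrative only. It assumes $C_f = C$ and $C_{eq} = NC$ for the FGDAC (inferred by comparing (95) and (96) with (99) and (100)), treats $GB$ as an angular frequency in rad/s, and uses hypothetical component values.

```python
import math

def tau_two_stage(r_on, c, c_f, c_eq, c_amp, gb):
    """Time constants (95) and (96) for the two-stage amplifier model."""
    tau1 = (1.0 + r_on * c * gb + (c_eq + c_amp) / c_f) / gb
    tau2 = r_on * c / (1.0 + (c_eq + r_on * c * c_f * gb) / (c_f + c_amp))
    return tau1, tau2

# Hypothetical FGDAC values for illustration only.
N = 10
r_on, c, c_amp = 1.0e3, 100e-15, 50e-15
gb = 2.0 * math.pi * 10e6   # gain-bandwidth in rad/s (assumed unit)

# Assumed FGDAC capacitances: C_f = C and C_eq = N * C.
tau1, tau2 = tau_two_stage(r_on, c, c_f=c, c_eq=N * c, c_amp=c_amp, gb=gb)

tau1_99 = (N + 1 + c_amp / c) / gb + r_on * c                       # (99)
tau2_100 = (r_on * c * (c + c_amp)
            / (c_amp + (N + 1) * c + r_on * c ** 2 * gb))           # (100)
product_101 = r_on * (c + c_amp) / gb                               # (101)

print("(95) matches (99): ", math.isclose(tau1, tau1_99))
print("(96) matches (100):", math.isclose(tau2, tau2_100))
print("product matches (101):", math.isclose(tau1 * tau2, product_101))
```

The product of the two time constants does not depend on $N$ under these assumptions, because the term $C_f + C_{amp} + C_{eq} + R_{on}C \cdot C_f \cdot GB$ appears in the numerator of one time constant and in the denominator of the other.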
Similar to the one-stage amplifier case, the product of the BWCDAC time constants
becomes equal to the product of the FGDAC time constants for the same unit and
APPENDIX C
NOISE ANALYSIS OF BWCDAC AND FGDAC
To compare the noise performances of the FGDAC and the BWCDAC, one-stage and two-
stage amplifier models are utilized in the analysis. The general expression used for the total
$$e_{DAC}^2 = e_{reset}^2 + e_{amp}^2 B_{n1}A_{n1} + \left(e_{ref}^2 + e_{Ron}^2\right)B_{n2}A_{n2} \qquad (102)$$
where $e_{amp}^2$, $e_{Ron}^2$, and $e_{ref}^2$ are the broadband noise contributions of the amplifier, the switches,
and the reference, and $e_{reset}^2$ is the kT/C noise introduced during the reset phase of the
BWCDAC. $B_{n1}$ and $B_{n2}$ are the noise bandwidths of the amplifier and the reference/switches,
and $A_{n1}$ and $A_{n2}$ are the gains of the DAC from the amplifier and the reference/switches,
respectively.
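As a rough illustration of how (102) combines its terms, the following sketch evaluates the total output noise for one set of inputs. It is only a sketch under stated assumptions: the function names are hypothetical, the broadband contributions are treated as power spectral densities in V^2/Hz, the bandwidths are in Hz, the gains are power gains, and all numerical values are invented for the example.

```python
BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K

def reset_noise(c_total, temperature=300.0):
    """kT/C noise power (V^2) sampled on c_total; c_total is an assumed input."""
    return BOLTZMANN * temperature / c_total

def total_dac_noise(e2_reset, e2_amp, e2_ref, e2_ron, bn1, an1, bn2, an2):
    """Total output noise power (V^2) following the structure of (102).

    e2_amp, e2_ref, e2_ron: broadband noise densities (V^2/Hz, assumed unit).
    bn1, bn2: noise bandwidths (Hz); an1, an2: power gains to the output.
    """
    return e2_reset + e2_amp * bn1 * an1 + (e2_ref + e2_ron) * bn2 * an2

# Hypothetical numbers for illustration only.
temperature = 300.0
e2_ron = 4.0 * BOLTZMANN * temperature * 1.0e3   # thermal noise of a 1 kOhm switch
e2_amp = (10e-9) ** 2                            # 10 nV/sqrt(Hz) amplifier
e2_ref = (20e-9) ** 2                            # 20 nV/sqrt(Hz) reference
e2_total = total_dac_noise(
    e2_reset=reset_noise(1e-12, temperature),
    e2_amp=e2_amp, e2_ref=e2_ref, e2_ron=e2_ron,
    bn1=5e6, an1=4.0, bn2=2e6, an2=1.0,
)
print(f"total output noise: {e2_total ** 0.5 * 1e6:.1f} uV rms")
```

For the FGDAC, the reset term would be set to zero and the reference density replaced by the epot noise, following the text.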
in Figure 63a. The transfer function for Vout /Vin is given in (80), where Ceq1 and Req1
are now equal to Ceq and Req . This transfer function can be used to express the noise
contribution from input to output. Similarly, the noise contribution of the amplifier can be
computed by finding the transfer function from Vamp to Vout . This can be expressed as
V x Ceq + C f + Camp + sReqCeq (C f + Camp ) + Vamp Ceq + C f + sReqCeqC f
Using the relationship between V x and Vout as given in (79), the transfer function for Vout /V x
can be written as
Figure 63. Models used to analyze the noise of the DAC structures. $e_{in}^2$ and $e_{amp}^2$ are the input noise and
the amplifier noise, respectively. (a) Using a simplified one-stage amplifier model for the
DAC amplifier. (b) Using a simplified two-stage amplifier model for the DAC amplifier.
$$z_1 \approx -\frac{G_m(C_{eq} + C_f)}{C_a^2} \quad \& \quad z_2 \approx -\frac{C_a^2}{R_{eq}C_{eq}C_{amp}C_f} \qquad (109)$$
1
p1,2 = (110)
2ReqCeqCb22
$$p_1 \approx -\frac{G_m C_f}{C_{b1}^2} \quad \& \quad p_2 \approx -\frac{C_{b1}^2}{R_{eq}C_{eq}C_{b2}^2} \qquad (111)$$
Depending on the capacitance values, the location of the poles and zeros may change. If
we assume that $p_1$ is the dominant pole of the system and that $p_1 \ll z_1$, which is true for
$G_m R_{on} \ll 1$ and $C_L \gg C_{amp}$, then $V_{out}/V_{amp}$ can be approximated as a single-pole
system in the bandwidth of interest. Therefore, the transfer function for $|V_{out}/V_{amp}|^2$ can be
written as
$$|H_1(j\omega)|^2 = \left(\frac{C_{eq} + C_f}{C_f}\right)^2 \frac{1}{1 + \left(2\pi f \cdot C_{b1}^2/(G_m C_f)\right)^2} \qquad (112)$$
Based on the above equation, the gain, $A_{n1}$, can be expressed as $(C_{eq} + C_f)^2/C_f^2$ and the
Moreover, if the zero of the transfer function in (80), $z = G_m/C_f$, and its second pole,
$p_2$, cancel or are farther away than the bandwidth of interest, the transfer function for
As a result, the gain, $A_{n2}$, becomes $(C_{eq1}/C_f)^2$ and the bandwidth, $B_{n2}$, can be approximated
as $G_m C_f/(4C_1^2)$.
By using these gain and bandwidth expressions together with (102), the total thermal
Using the BWCDAC capacitance values the above expression can be written as
$$e_{BWC}^2 = \frac{kT}{2^N C} + \left(kT\frac{R_{on}}{2^N} + \frac{e_{ref}^2}{4} + e_{amp}^2\right) \cdot \frac{G_m}{C_x} \qquad (116)$$
Similarly, for the one-stage amplifier, the equivalent output thermal noise of the FGDAC
can be approximated as
$$e_{FG}^2 = \left[4kT\frac{R_{on}}{N} + \frac{e_{epot}^2}{N} + e_{amp}^2\left(\frac{N + 1}{N}\right)^2\right] \cdot \frac{N^2 C \cdot G_m}{4C_1^2} \qquad (117)$$
where $e_{epot}^2$ is the broadband noise contribution of the epots and $C_1^2 \approx C_L C + (C_{amp} +
NC)(C_L + C)$ for $G_m R_{eq}C_{eq} \ll C$. For large values of $N$, the above expression becomes
$$e_{FG}^2 = \left[4kT\frac{R_{on}}{N} + \frac{e_{epot}^2}{N} + e_{amp}^2\right] \cdot \frac{N^2 \cdot G_m}{4C_y} \qquad (118)$$
output referred noise changes. Therefore, the noise analysis of the DACs with two-stage
amplifier is performed by using the simplified model illustrated in Figure 63b. Using this
where C x is
$$z = -\frac{C_{eq} + C_f}{C_f} \cdot \frac{1}{R_{eq}C_{eq}} \qquad (121)$$
and for $C_x^2 \gg 4C_{eq}R_{eq}C_f GB(C_f + C_{amp})$, the poles can be approximated as
Cf Cf
p1 ≈ − GB & p2 ≈ − GB (123)
C 2x C f + Camp
Assuming $GB < 1/(R_{eq}C_{eq})$ and $z \approx p_1$, $|V_{out}/V_{amp}|^2$ becomes
$$|H_1(j\omega)|^2 = \frac{(C_{eq} + C_f)^2/C_f^2}{1 + \left(2\pi f \cdot (C_f + C_{amp})/(C_f GB)\right)^2} \qquad (124)$$
The above equation yields $A_{n1} = ((C_{eq} + C_f)/C_f)^2$ and $B_{n1} = C_f GB/(4(C_f + C_{amp}))$.
Also, the effect of the input noise can be found by using (91) and assuming that
the first pole of (91) is the dominant pole in the bandwidth of interest. Then this transfer
function can be approximated as a single-pole system, and thus $|V_{out}/V_{in}|^2$ can be written as
$$|H_2(j\omega)|^2 = \frac{(C_{eq1}/C_f)^2}{1 + \left(2\pi f \cdot C_1/(C_f GB)\right)^2} \qquad (125)$$
Therefore, the gain and the bandwidth become $A_{n2} = (C_{eq1}/C_f)^2$ and $B_{n2} = C_f GB/(4C_1)$.
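The factor of 4 in these bandwidth expressions follows from the equivalent noise bandwidth of a single-pole response: integrating $1/(1 + (2\pi f/\omega_p)^2)$ over all positive frequencies gives $\omega_p/4$ in Hz. The sketch below checks this numerically for the pole $\omega_p = C_f GB/C_1$ of (125). The capacitor and gain-bandwidth values are hypothetical, $GB$ is assumed to be in rad/s, and the integration is a simple midpoint rule truncated at a finite frequency.

```python
import math

def noise_bandwidth_numeric(omega_p, span=2000.0, steps=400_000):
    """Midpoint-rule estimate of the integral over f (Hz) of
    1 / (1 + (2*pi*f / omega_p)^2), truncated at span * f_p."""
    f_p = omega_p / (2.0 * math.pi)      # pole frequency in Hz
    df = span * f_p / steps
    total = 0.0
    for k in range(steps):
        f = (k + 0.5) * df
        total += df / (1.0 + (f / f_p) ** 2)
    return total

# Hypothetical values for illustration only.
c_f, c_1 = 1.0e-12, 2.2e-12
gb = 2.0 * math.pi * 10e6                # gain-bandwidth in rad/s (assumed unit)

omega_p = c_f * gb / c_1                 # pole of the single-pole model in (125)
bn2_closed_form = omega_p / 4.0          # B_n2 = C_f * GB / (4 * C_1)
bn2_numeric = noise_bandwidth_numeric(omega_p)

print(f"closed form: {bn2_closed_form:.4e} Hz, numeric: {bn2_numeric:.4e} Hz")
```

The small remaining difference between the two values comes from truncating the integral at a finite upper frequency.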
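As a result, the total thermal noise of the BWCDAC for the two-stage amplifier becomes
$$e_{BWC}^2 = \frac{kT}{C_{eq}}\left(\frac{C_{eq}}{C_f}\right)^2 + \left(4kTR_{eq} + e_{ref}^2\right) \cdot \frac{C_{eq}^2 GB}{4C_1 C_f} + e_{amp}^2 \cdot \frac{(C_{eq} + C_f)^2 GB}{4C_f(C_f + C_{amp})} \qquad (126)$$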
Using the capacitance values of the BWCDAC, the above expression can be rewritten as
where $C_1 \approx (2^{N+1} - 1)C + C_{amp}$. For large values of $N$, the total thermal noise of the
BWCDAC becomes
$$e_{BWC}^2 = \frac{kT}{2^N C} + \left(4kT\frac{R_{on}}{2^N} + e_{ref}^2\right) \cdot \frac{2^{N-2}C \cdot GB}{2^{N+1}C + C_{amp}} + e_{amp}^2 \cdot \frac{2^N C \cdot GB}{2^N C + C_{amp}} \qquad (128)$$
Lastly, for the two-stage amplifier, the equivalent thermal noise of the FGDAC becomes
$$e_{FG}^2 = \left(4kTR_{on}N + e_{epot}^2\right) \cdot \frac{NC \cdot GB}{4((N + 1)C + C_{amp})} + e_{amp}^2 \cdot \frac{(N + 1)^2 C \cdot GB}{4(C + C_{amp})} \qquad (129)$$
REFERENCES
[1] A. B. R. M. Edenfeld, D. Kahng and Y. Zorian, “2003 technology roadmap for semi-
conductors,” IEEE Computer, vol. 37, 2004.
[2] C. Teuscher, I. Sheng, S.and ODonnell, K. Stone, and R. Brodersen, “Design and im-
plementation issues for a wideband indoor DS-CDMA system providing multimedia
access,” Proc. 34th Annu. Allerton Conf. Communication, Control, and Computing,
1996.
[4] M. Caudill and C. Butler, Understanding Neural Networks: Volume 1: Basic Net-
works. Cambridge, Massachusetts: The MIT Press, 1992.
[5] W. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous
activity,” Bulletin of Mathematical Biophysics, vol. 7, 1943.
[7] S. Sakurai and M. Ismail, “A CMOS square-law programmable floating resistor inde-
pendent of the threshold voltage,” IEEE Trans. on Circuit and Systems II: Analog and
Digital Signal Processing, vol. 39, 1992.
[8] S. P. Singh, J. V. Hanson, and J. Vlach, “A new floating resistor for CMOS technol-
ogy,” IEEE Trans. on Circuit and Systems, vol. 36, 1989.
[13] M. Banu and Y. Tsividis, “Fully integrated active RC filters in MOS technology,”
IEEE International Solid-State Circuits Conference, vol. 26, 1983.
[14] Z. Czarnul and Y. Tsividis, “Implementation of MOSFET-C filters based on active
RC prototypes,” Electronic Letters, 1988.
[15] J. N. Babanezhad and G. C. Temes, “A linear NMOS depletion resistor and its ap-
plication in an integrated amplifier,” IEEE Journal of Solid-State Circuits, vol. 19,
1984.
[16] Z. Czarnul, “A linear NMOS depletion resistor and its application in an integrated
amplifier,” IEEE Journal of Solid-State Circuits, vol. 22, 1987.
[17] K. Nay and A. Budak, “A voltage-controlled resistance with wide dynamic range and
low distortion,” IEEE Trans. on Circuit and Systems, vol. 30, 1983.
[18] G. Wilson and P. K. Chan, “Analysis of nonlinearities in MOS floating resistor net-
works,” IEE Proc.-Circuits Devices Syst, vol. 141, 1994.
[19] L. Sellami, S. Singh, R. Newcomb, A. Rasmussen, and M. Zaghloul, “CMOS bilateral
linear floating resistors for neural-type cell arrays,” Conference Record of the Thirty-
First Asilomar Conference on Signals, Systems and Computers, vol. 2, 1997.
[20] A. Rasmussen and M. Zaghloul, “CMOS analog implementation of cellular neural
network to solve partial differential equations with a microelectromechanical ther-
mal interface,” Proceedings of the 40th Midwest Symposium on Circuits and Systems,
vol. 2, 1997.
[21] S. Tantry, T. Yoneyama, and H. Asai, “Two floating resistor circuits and their applica-
tions to synaptic weights in analog neural networks,” IEEE International Symposium
on Circuits and Systems, vol. 1, May 2001.
[22] Z. Czarnul and S. Tagaki, “Design of linear tunable CMOS differential transconductor
cells,” Electronics Letters, vol. 26, 1990.
[23] A. Nedungadi and T. Viswanathan, “Design of linear CMOS transconductance ele-
ments,” IEEE Transactions on Circuit and Systems, 1984.
[24] G. Han and E. Sanchez-Sinencio, “CMOS transconductance multipliers: a tutorial,”
Analog and Digital Signal Processing, IEEE Transactions on Circuits and Systems II,
vol. 45, 1998.
[25] P. Allen and D. Holberg, CMOS Analog Circuit Design. Oxford University Press,
Oxford, 2002.
[26] J. McCreary and P. Gray, “All-MOS charge redistribution Analog-to-Digital conversion
techniques-part 1,” IEEE Journal of Solid-State Circuits, vol. SC-10, December 1975.
[27] Y. Yee, L. Terman, and L. Heller, “A two-stage weighted capacitor network for D/A-
A/D conversion,” IEEE Journal of Solid-State Circuits, vol. SC-14, August 1979.
[28] S. Singh, A. Prabhaker, and A. Bhattacharyya, “C-2C ladder-based D/A converters
for PCM codecs,” vol. 22, pp. 1197–1200, December 1987.
[29] J. McCreary, “Matching properties, and voltage and temperature dependence of MOS
capacitors,” vol. SC-16, pp. 608–616, December 1981.
[30] J.-B. Shyu, G. Temes, and F. Krummenacher, “Random error effects in matched MOS
capacitors and current sources,” vol. 19, pp. 948–956, December 1984.
[31] J.-B. Shyu, G. Temes, and K. Yao, “Random errors in MOS capacitors,” vol. 17,
pp. 1070–1076, December 1982.
[34] B. Leung and S. Sutarja, “Multi-bit sigma-delta A/D converter incorporating a novel
class of dynamic element matching techniques,” vol. 39, pp. 35–51, January 1992.
[35] R. Baird and T. Fiez, “Improved ∆Σ DAC linearity using data weighted averaging,”
vol. 1, pp. 13–16, May 1995.
[36] R. Schreier and B. Zhang, “Noise-shaped multibit D/A convertor employing unit ele-
ments,” vol. 31, pp. 1712–1713, September 1995.
[37] I. Galton, “Noise-shaping D/A converters for ∆Σ modulation,” vol. 1, pp. 441–444,
May 1996.
[38] J. Goes, J. Vital, and J. Franca, “Systematic design for optimization of high-speed
self-calibrated pipelined A/D converters,” IEEE Transactions on Circuits and Systems
II: Analog and Digital Signal Processing, vol. 45, 1998.
[39] T. Brooks, D. Robertson, D. Kelly, A. Del Muro, and S. Harston, “A cascaded sigma-
delta pipeline A/D converter with 1.25 MHz signal bandwidth and 89 dB SNR,” IEEE
Journal of Solid-State Circuits, vol. 32, 1997.
[40] J. Candy, “A use of double integration in sigma delta modulation,” IEEE Transactions
on Communications, March 1988.
[42] P. Wong and R. Gray, “FIR filters with sigma-delta modulation encoding,” vol. 38,
pp. 979–990, 1990.
[43] G. Fischer, “Analog FIR filters by switched-capacitor techniques,” vol. 37, pp. 808–
814, 1990.
[44] Y. L. Cheung and A. Buchwald, “A sampled-data switched-current analog 16-tap FIR
filter with digitally programmable coefficients in 0.8 µm CMOS,” pp. 54–55, 1997.
[47] Q. Huang and G. Moschytz, “Analog FIR filters with an oversampled Σ-∆ modulator,”
vol. 39, no. 9, pp. 658–663, 1992.
[48] Q. Huang, G. Moschytz, and T. Burger, “A 100 tap FIR/IIR analog linear-phase low-
pass filter,” pp. 91–92, 1995.
[49] N. Benvenuto, L. Franks, and J. Hill, F., “Realization of finite impulse response filters
using coefficients +1, 0, and -1,” vol. 33, pp. 1117–1125, 1985.
[50] S. Abeysekera and K. Padhi, “Design of multiplier free FIR filters using a LADF
sigma-delta (Σ-∆) modulator,” vol. 2, pp. 65–68, 2000.
[51] K. Bult and G. Geelen, “An inherently linear and compact MOST-only current divi-
sion technique,” vol. 27, pp. 1730–1735, 1992.
[54] S. Lyle, G. Worstell, and R. Spencer, “An analog discrete-time transversal filter in 2.0
µm CMOS,” vol. 2, pp. 970–974, 1992.
[55] X. Wang and R. Spencer, “A low-power 170-MHz discrete-time analog FIR filter,”
vol. 33, pp. 417–426, 1998.
[59] H. K. J. Ihantola and J. L. Moll, “Design theory of a surface field effect transistors,”
Solid-State Electron., vol. 7, 1964.
[60] M. H. White, F. Van De Wiele, and J. P. Lambot, “High accuracy MOS models for
computer-aided design,” IEEE Trans., vol. ED-27, 1980.
[61] Y. P. Tsividis, Operation and modelling of the MOS transistor. New York: McGraw-
Hill Companies, Inc., 1987.
[62] C. Bleiker and H. Melchior, “A four-state EEPROM using floating-gate memory cell,”
IEEE J. Solid State Circuits, vol. 22, no. 3, 1987.
[63] H. Nozama and S. Kokyama, “A thermionic electron emission model for charge re-
tention in SAMOS structures,” Japanese Journal of Applied Physics, vol. 21, 1992.
[64] T. Shibata and T. Ohmi, “A functional MOS transistor featuring gate-level weighted
sum and threshold operations,” IEEE Trans. Electron Devices, 1992.
[66] E. Özalevli and P. Hasler, “Design of a CMOS floating-gate resistor for highly lin-
ear amplifier and multiplier applications,” Proceedings of Custom Integrated Circuits
Conference, San Jose, California, 2005.
[68] K. Vavelidis and Y. Tsividis, “R-MOSFET structure based on current division,” Elec-
tronics Letters, vol. 29, 1993.
[75] E. Özalevli and P. Hasler, “10-bit programmable voltage-output Digital-Analog con-
verter,” Proceedings of International Symposium on Circuits and Systems, Kobe,
Japan, 2005.
[76] W. Black, D. Allstot, and R. Reed, “A high performance low power CMOS channel
filter,” vol. 15, pp. 929–938, 1980.
[77] K. Brehmer and J. Wieser, “Large swing CMOS power amplifier,” vol. SC-18,
pp. 624–629, 1983.
[78] S. Merchant and B. Rao, “Distributed arithmetic architecture for image coding,”
pp. 74–77, 1989.
[79] H. Cao and W. Li, “VLSI implementation of vector quantization using distributed
arithmetic,” vol. 2, pp. 668–671, 1996.
[80] M. Sun, T. Chen, and A. Gotlieb, “VLSI implementation of a 16x16 discrete cosine
transform,” vol. CAS-36, pp. 610–617, 1989.
[82] A. Croisier, D. Esteban, M. Levilion, and V. Rizo, “Digital filter for PCM encoded
signals,” 1973.
[83] A. Peled and B. Liu, “A new hardware realization of digital filters,” vol. 22, pp. 456–
462, 1974.
[85] P. Lim and B. Wooley, “A high-speed sample-and-hold technique using a Miller hold
capacitance,” vol. 26, pp. 643–651, 1991.
[86] K. Bult and G. Geelen, “A fast-settling CMOS opamp for SC circuits with 90-dB
DC-gain,” vol. 25, pp. 1379–1384, 1990.