
An Ultra-Low-Power ADPLL for BLE Applications
Analysis, Design and Implementation

Master of Science Thesis

For the Degree of Master of Science in Electrical Engineering at Delft University of Technology

Lianbo Wu
Academic Supervisor:

Prof. Dr. R. B. Staszewski

Industry Supervisor:

Dr. Xin He

Committee:
Prof. Dr. R. B. Staszewski
Prof. Dr. Michiel Pertijs
Prof. Dr. Nick van der Meijs
Dr. Xin He

Copyright © 2014 by Lianbo Wu


Simplicity is the ultimate sophistication

Leonardo da Vinci

At the origin of all things, the great Way is supremely simple; it evolves into the supremely complex

Laozi
Abstract

In recent years, wireless personal area network (WPAN) applications have created
a need for low-cost, low-power PLLs that still deliver good performance.
All-digital phase-locked loops (ADPLLs) are preferred over their analog
counterparts in nanoscale CMOS technology because of their flexibility,
configurability, small area and easy portability. However, fractional spurs and
power dissipation that is not yet low enough are the main problems of
conventional TDC-based structures. In this work, a sub-half-mW 2.2 GHz to 3 GHz
fractional-N ADPLL is presented for Bluetooth Low Energy (BLE) applications. A
coarse-fine DTC-based phase predictor with dynamic element matching (DEM)
capability and a clock-gated phase error freezer are proposed to reduce power
while maintaining good phase noise and fractional spur performance. The
prototype ADPLL was taped out on Sep. 11th, 2014 in GlobalFoundries 40 nm Low
Power (40 nm-LP) technology. Based on post-layout simulations and modelling, it
is expected to consume less than 450 𝜇W with an integrated rms jitter of 1.5 ps
for the near-integer channel and 800 fs for the remaining channels, leading to a
potential state-of-the-art FoM below -240 dB. The design of the full ADPLL,
covering system-level analysis, digital logic, mixed-signal and RF design, is
presented in this thesis.

Acknowledgement

I am deeply grateful to all the people who, in one way or another, helped me
during my MSc project. Without the support of others it would have been
impossible for me to reach this stage. First and foremost, I would like to
express my sincere thanks to my MSc supervisor, Dr. Robert Bogdan Staszewski. It
is an honor to work under his supervision in the field of ADPLL design. I thank
him for his guidance in my MSc project work and in other matters. I have learned
a lot from his incomparable expertise in this field, his strict design
requirements and his passion for the work. Once again, thank you, my supervisor,
for your time, attention and patience with me throughout the year. Special
thanks go to my colleagues at NXP. I want to thank Dr. Xin He, my industry
supervisor, for his help, care, discussions and support during my internship; I
owe him a great deal. I also want to thank Theo Thurlings for the time he spent
with me on the digital back-end flow. Thanks also go to Tarik Saric for the
discussions on DCO design; to Nenad Pavlovic for the discussions on system-level
analysis; and to Vladislav Dyachenko, Robert Rutten, Salvatore Drago and the
other members of the design team for the discussions on circuit design, layout
and chip finishing.
I would like to express my gratitude to the other members of my MSc defense
committee, Dr. Michiel Pertijs and Dr. Nick van der Meijs, for their invaluable
time, not only in attending my defence but also in helping me to improve this
thesis. I would also like to thank my friends here: the PhD students in the
Electronics Research Laboratory (ELCA), especially Gerasimos Vlachogiannakis,
Ying Wu and Zhirui Zong, for the technical discussions and their help on other
issues; Qilong Liu for the wonderful cooperation on the courses; and all my
classmates for the help and the fun during the two-year study in the
microelectronics track at TU Delft. Last but not least, I would like to thank my
parents for all that they have done for me. Only with their support could I find
the courage to study and succeed at the other end of Eurasia. Special thanks
also go to my love, Tianzi Wang, for her support and companionship since our
bachelor studies at the University of Science and Technology of China. It was
truly the right decision for us to choose Delft and stay together two years ago,
after our bachelor studies. In a word, I owe my accomplishments to my family.

Lianbo Wu
Delft, 2014
Glossary

List of Acronyms
ADPLL all-digital phase-locked loop

CP-PLL charge-pump phase-locked loop

CKR retimed reference clock

CKV variable clock

ckvd2 divide-by-2 variable clock

CMOS complementary metal-oxide-semiconductor

DCO digitally controlled oscillator

DTC digital-to-time converter

FCW frequency command word

FoM figure-of-merit

OTW oscillator tuning word

PCB printed circuit board

TDC time-to-digital converter

MoM Metal-oxide-Metal

DEM dynamic element matching

TSPC true single phase clock

SPI serial peripheral interface

WPAN wireless personal area network


WSN wireless sensor network

LMS least mean squares

LFSR linear feedback shift register

TX transmitter

INL integral nonlinearity

DNL differential nonlinearity

PVS Cadence physical verification system

CML current-mode logic


Contents

1 Introduction
1.1 Introduction
1.2 Frequency Synthesizer for Telecommunication Systems and Modulation
1.3 Common Metrics for Frequency Synthesizer
1.4 PLL based Frequency synthesizer
1.4.1 Charge-pump PLL
1.5 Motivation
1.6 Research Contribution
1.7 Outline of the Thesis
References

2 ADPLL System: Background, Analysis and Proposed Architecture
2.1 Current Popular Architectures
2.1.1 Divider-based Digital PLL
2.1.2 Counter-based Digital PLL
2.1.3 Comparison of Architectures
2.2 ADPLL System Level Analysis
2.2.1 PLL Frequency Response
2.2.2 Noise Sources and Noise Transfer Function
2.3 Worst Inband Fractional Spurious Components of ADPLL Output Spectrum
2.3.1 Limited Resolution Introduced Spur
2.3.2 TDC Nonlinearity Introduced Spur
2.3.3 TDC Normalization Introduced Spur
2.4 Specification of the targeted ADPLL Design
2.4.1 Phase Noise Specification
2.4.2 Spurious Tone Specification
2.4.3 Settling Time Specification
2.5 Implementation from System level
2.5.1 Reference Design Introduction
2.5.2 DTC-based Phase Detector
2.5.3 Proposed System
References

3 ADPLL Modelling and Simulation
3.1 General Methods
3.1.1 Direct Simulation
3.1.2 Event Driven Verilog-based Simulation
3.2 Mixed Signal Simulation Methodology
3.3 Phase Detector Modelling
3.3.1 Mismatch
3.3.2 Accumulated Noise
3.3.3 Supply Noise and Dirty Ground
3.3.4 Metastability
3.4 RF Modelling
3.4.1 DCO
3.4.2 Feedback Divider
3.5 Digital Flow
3.6 Verification of the Model
References

4 ADPLL Implementation in Transistor-Level
4.1 Low Speed Digital Implementation
4.1.1 Digital Phase Error Detection Logic
4.1.2 TX Interface for Two-point Frequency Modulation
4.1.3 5-bit ΣΔ modulator
4.1.4 Dynamic Element Matching
4.1.5 Loop Filter
4.2 Mixed Signal Implementation
4.2.1 Coarse-Fine DTC Design
4.2.2 Fine Bank of DTC
4.2.3 Coarse Bank of DTC
4.2.4 Input Buffer of DTC
4.2.5 Summary of the proposed DTC
4.2.6 Supply Noise Monitor of DTC
4.2.7 Bang-Bang Phase Detector
4.2.8 Counter Design
4.2.9 Time Freezer Design
4.3 RF Implementation
4.3.1 DCO Design
4.3.2 Dynamic Divider
References

5 ADPLL Top Level Completion, Simulation and Test Plan
5.1 Top Level Layout
5.2 Serial Peripheral Interface
5.3 Top Level Simulation Result
5.3.1 locking behaviour
5.3.2 Power
5.3.3 Output spectrum
5.3.4 Comparison with state-of-art
5.4 Test Plan
References

6 Conclusion
6.1 Conclusion for This Work
6.2 Future Work

A Appendix
A.1 Capacitor Bank simulation
A.2 SPI register table
A.3 Derivation for Mismatch for DTC unit cell design
1
Introduction

Yes, we have to divide up our time like that, between our politics and our
equations. But to me our equations are far more important, for politics are
only a matter of present concern. A mathematical equation stands forever.

Albert Einstein

1.1. Introduction
What a watch is to a human being, a frequency synthesizer is to an IC. The
frequency synthesizer is so important and indispensable that it can be found
everywhere in modern IC products. From telecommunication systems to digital
circuit applications, from clock and data recovery to modulation and waveform
generation, a frequency synthesizer appears in almost every integrated circuit,
whether wireless or wireline. Meanwhile, wireless communication has grown
remarkably since 1901, when Marconi succeeded in his radio experiment, and
people now enjoy anytime, anyplace communication thanks to emerging IC products.
In this emerging IC market, many untapped opportunities exist in the realm of
short-range, low-cost wireless networks. Applications in wireless personal area
networks (WPAN) and wireless sensor networks (WSN) are largely unexplored
territory where a new era of wireless could be triggered. Power budget analysis
of such sensor nodes reveals that the wireless transceiver dominates the overall
power. This, together with the need for high-volume, low-cost, highly integrated
solutions, warrants ultra-low-power RF transceivers in nanometer-scale digital
CMOS technologies. This thesis is an effort in that direction and explores the
implementation of an ultra-low-power all-digital phase-locked loop (ADPLL) for
frequency synthesis in WPAN transceivers, with the main focus on Bluetooth Low
Energy (BLE) applications.

1.2. Frequency Synthesizer for Telecommunication Systems and Modulation
Frequency synthesizers are in great demand in telecommunication applications,
where they provide the means by which a communication or broadcast channel is
selected. A wireless radio is such a system: it enables the flow of information
from one point to another, usually through the air. Figure 1.1 shows a
simplified block diagram of a typical low-power transceiver. A low-IF receiver
combined with a direct-modulation transmitter is typically the preferred
architecture. In a low-IF receiver, the received radio frequency (RF) signal
from the antenna is amplified by the low-noise amplifier (LNA) and then
down-converted to a low intermediate frequency (IF) by a quadrature mixer. A
low-pass filter and a programmable gain amplifier (PGA) then feed the signal
into an analog-to-digital converter (ADC), and the digitized data is processed
further by the digital baseband. On the transmitter side, the digital baseband
converts the user data to symbols, which are finally transformed to polar form
with amplitude and phase components. The phase (frequency) data then modulates
the synthesizer while the amplitude data controls the power amplifier. A
frequency synthesizer therefore not only plays a key role in generating the
local oscillator signal for the mixer in the receiver, but also contributes
significantly to frequency modulation in the transmit path.

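Figure 1.1: Example transceiver architecture (receive path: LNA, quadrature mixer driven by LO I and LO Q, low-pass filters, PGAs and ADCs delivering the I and Q RX data; transmit path: frequency synthesizer with FM input from the TX data, and PA with AM input)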

1.3. Common Metrics for Frequency Synthesizer

As shown in Figure 1.1, the synthesizer is a very important part of the radio,
and it must meet very demanding performance specifications. These include:

Frequency tuning range and resolution: The synthesizer must be able to cover
every channel of interest for the application. In addition, adjacent
programmable frequencies of the synthesizer must be separated by no more than
the channel spacing of the radio.

Acquisition time: This is the settling time after a channel switch, in other
words the time it takes to switch from one channel to another with the required
accuracy. It can be crucial for applications that require fast switching, such
as frequency-hopping spread spectrum. Moreover, considerable power is consumed
before the synthesizer settles, so a short switching time is especially
important for duty-cycled low-power systems.

Freedom from spurs: The waveform must be free of spurs, that is, spurious
frequency components other than the desired one. Every standard requires the
spurious tones to lie at least a specified number of decibels below the desired
carrier.

Purity of the output tone: In addition to the desired tone, there will generally
be unwanted frequency content. The spreading of the tone power in the frequency
domain around the desired frequency is usually called phase noise; in the time
domain it corresponds to timing jitter. Both the spurious tones and the noise
skirt degrade the performance of the transceiver. In the time domain, the output
of an ADPLL can be expressed as

$$v(t) = A \sin\bigl(\omega_0 t + \psi(t)\bigr) \qquad (1.1)$$

where $A$ represents the amplitude of the synthesized signal and $\omega_0$ is
its frequency. $\psi(t)$ captures the phase fluctuation and can be expressed as

$$\psi(t) = \Delta\psi \sin(\omega_m t) + \phi(t) \qquad (1.2)$$

The first term represents periodic modulation and appears as spurious tones at
the output. The second term is the random fluctuation, which shows up as a noise
skirt around the carrier frequency.
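
To make the link between the periodic term in Eq. (1.2) and a spurious tone concrete, the following is a minimal numerical sketch, not part of the thesis flow. It assumes NumPy is available, drops the random term, and uses illustrative values chosen only so that the tones fall exactly on FFT bins (the names `f0`, `fm` and `delta_psi` are hypothetical). For a small modulation depth, the measured spur should approach the narrowband approximation of 20 log10(Δψ/2) dBc.

```python
import numpy as np

def spur_level_dbc(delta_psi, f0=64, fm=8, n=4096):
    """Spur level at f0 + fm relative to the carrier at f0, in dBc.

    One second of signal is sampled at n points, so the integer frequencies
    f0 and fm (illustrative values) fall exactly on FFT bins.
    """
    t = np.arange(n) / n
    # Eq. (1.1) with only the periodic part of Eq. (1.2) and A = 1
    v = np.sin(2 * np.pi * f0 * t + delta_psi * np.sin(2 * np.pi * fm * t))
    spectrum = np.abs(np.fft.rfft(v))
    return 20 * np.log10(spectrum[f0 + fm] / spectrum[f0])

delta_psi = 0.01                                  # peak phase deviation in radians
measured = spur_level_dbc(delta_psi)
predicted = 20 * np.log10(delta_psi / 2)          # narrowband approximation
print(f"measured {measured:.2f} dBc, predicted {predicted:.2f} dBc")
```

Halving the phase deviation in this sketch lowers the spur by about 6 dB, which is why spur specifications translate directly into limits on periodic phase error.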

1.4. PLL based Frequency synthesizer

Although other methods such as direct digital synthesis exist, phase-locked loop
(PLL) based frequency synthesis is by far the most popular and is widely used in
communication systems because it can generate a high frequency with low phase
noise. A generic PLL diagram is shown in Figure 1.2. Essentially it is a
negative feedback system with three core parts: (1) a phase detector (PD), (2) a
loop filter and (3) an oscillator. The oscillator is the heart and engine of the
loop: it generates a high frequency that is a certain ratio times the input
reference clock. The phase detector compares the phase of the variable output
signal with the input reference clock and passes the phase error to the loop
filter. The filtered phase error then steers the oscillator in the direction
that reduces the phase error. Once the loop is locked, the average output
frequency equals the input frequency multiplied by exactly that ratio, and the
output tracks the input signal.

Figure 1.2: Essential PLL diagram (input signal, phase detector, loop filter, oscillator)

The history of the PLL dates back to the early 1930s, when British researchers
developed the zero-IF receiver as an alternative to the famous superheterodyne
receiver proposed by Armstrong in 1918. The problem was the rapid frequency
drift of the local oscillator output. The French engineer Henri de Bellescize
published a paper [1] on the idea of keeping the phase of the oscillator in a
desired relationship by adding a feedback correction. This is recognized as the
first PLL ever published. In the 1960s, driven by strong demand in consumer
products such as analog TV and by the explosive growth of the IC industry, PLL
theory and design matured, and numerous analog as well as digital PLL structures
have been explored and published since then. Nevertheless, the all-digital PLL
did not come to life until early this century. In 2004, a novel all-digital PLL
system was published by TI [2]; it triggered a new trend in PLL design and
serves as the starting point of this thesis.
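
The negative-feedback operation described at the start of this section can be illustrated with a small behavioural sketch. It is not the loop designed in this thesis: it assumes a phase-domain model updated once per reference cycle, a proportional-integral loop filter, and arbitrary illustrative gains (`kp`, `ki`), ratio and free-running frequency. All frequencies are normalized to the reference frequency.

```python
def simulate_pll(ratio=100.0, f_free=90.0, kp=0.5, ki=0.05, steps=400):
    """Phase-domain PLL model, updated once per reference cycle.

    Frequencies are normalized to the reference frequency and phases are
    counted in oscillator cycles. Returns the oscillator frequency after
    each update. All parameter values are illustrative assumptions.
    """
    phase_ref = 0.0   # reference phase scaled by the target ratio
    phase_out = 0.0   # accumulated oscillator phase
    integ = 0.0       # integral path of the loop filter
    f_out = f_free    # oscillator starts at its free-running frequency
    history = []
    for _ in range(steps):
        phase_ref += ratio
        phase_out += f_out
        err = phase_ref - phase_out          # phase detector
        integ += ki * err                    # loop filter, integral path
        f_out = f_free + kp * err + integ    # filtered error steers the oscillator
        history.append(f_out)
    return history

freqs = simulate_pll()
print(f"first update: {freqs[0]:.3f}, final: {freqs[-1]:.6f}")
```

With these gains the loop is overdamped, and the oscillator frequency settles to the target ratio times the reference frequency even though it starts 10 % low, which is the locked condition described in the text.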

1.4.1. Charge-pump PLL

Figure 1.3 shows a simplified block diagram of the conventional charge-pump PLL
(CP-PLL). The oscillator is implemented in analog form as a voltage-controlled
oscillator (VCO). The phase frequency detector (PFD) compares the phase of the
reference signal with that of the divided-down output signal and generates an
up/down pulse whose width is proportional to the phase error. The output of the
phase detector controls the magnitude and direction of the charge pump current,
which is pumped into or out of the loop filter. The passive loop filter converts
this current into a tuning voltage for the VCO while suppressing the noise from
the reference signal and the phase frequency detector.

Figure 1.3: Simplified block diagram of charge-pump PLL (FREF, PFD with Up/Down outputs, charge pump, loop filter, VCO producing FVCO, and frequency divider feeding FDIV back to the PFD)

The CP-PLL itself is a good architecture for WPAN applications in terms of power
and phase noise performance. However, its analog-intensive nature makes it
difficult to implement in advanced process nodes. Voltage headroom shrinks in
advanced technology nodes because of the reduced supply voltage, which is bad
news for the CP-PLL, since it operates essentially in the voltage domain.
Moreover, passives do not scale well with technology, and the loop filter used
in a CP-PLL occupies a large area, increasing the overall system cost. In
addition, the reduced gate-oxide thickness in nanoscale CMOS technologies
results in significant current leakage through the integration capacitor of the
loop filter. This leakage current increases the total PLL jitter and thereby
degrades performance.

1.5. Motivation
As analog frequency synthesizer design runs into problems in advanced CMOS
processes, methods to integrate the RF front-end with the baseband processor are
highly desired. Compared with conventional analog design, a digitally intensive
approach provides further benefits such as better testability, more
reconfigurability, smaller area and a higher degree of integration. It also
makes digital calibration easier, for example DCO gain calibration or
calibration that reduces fractional spurs.
The emergence of the ADPLL meets this need and also paves the way towards
digitally assisted RF, e.g. an all-digital transmitter with two-point
modulation. Compared with its analog counterpart, it is not only more economical
but also friendlier to technology scaling. Its performance has already been
proven in standards ranging from Bluetooth to GSM [3], where it has shown itself
able to replace the CP-PLL. Furthermore, settling is generally faster owing to
the digital algorithm on which it is based. Numerous publications and research
efforts have sought to optimize performance with respect to power, phase noise
and spur level. However, it is still rare to see a DPLL design realized with
sub-mW power consumption. As shown in Figure 1.4, the only design to break the
sub-mW barrier among the state of the art of the past decade [2] [4] [5] [6] [7]
[8] [9] is the one proposed by IMEC Holst Centre [5]. Nevertheless, in terms of
phase noise and power there is still considerable room for improvement. This is
the starting point as well as the motivation of this thesis work, which aims to
reduce the worst fractional spur while optimizing phase noise and power. The
target specifications are analysed in detail in Chapter 2.

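Figure 1.4: FoM for recent year designs (1). State-of-the-art fractional-N PLLs plotted as integrated jitter (ps^2) versus power (mW) on logarithmic axes; designs shown: ISSCC'12 CP [4], ISSCC'14 [5], ISSCC'13 [6], ISSCC'04 [2], ISSCC'11 [8], ISSCC'11 bang-bang [7], JSSC'04 [9].

1. The figure of merit (FoM) is defined as $\mathrm{FoM} = 10 \log_{10}\left[\left(\frac{\sigma_t}{1\,\mathrm{s}}\right)^2 \cdot \left(\frac{P}{1\,\mathrm{mW}}\right)\right]$ [10]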
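
As a worked example of the figure of merit, the sketch below evaluates it for the numbers quoted in the abstract (450 𝜇W, 1.5 ps and 800 fs rms jitter). It assumes the jitter-power FoM of [10], FoM = 10 log10[(σ_t / 1 s)² · (P / 1 mW)]; the function and variable names are illustrative only.

```python
import math

def pll_fom_db(jitter_rms_s, power_w):
    """Jitter-power figure of merit in dB (assumed definition from [10])."""
    return 10 * math.log10((jitter_rms_s / 1.0) ** 2 * (power_w / 1e-3))

# Values quoted in the abstract: power upper bound and the two jitter cases
fom_near_integer = pll_fom_db(1.5e-12, 450e-6)   # about -239.9 dB
fom_other = pll_fom_db(800e-15, 450e-6)          # about -245.4 dB
print(f"{fom_near_integer:.1f} dB, {fom_other:.1f} dB")
```

Because 450 𝜇W is quoted as an upper bound on power, these values are upper bounds on the FoM as well; any power saving below that figure lowers the FoM further.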


1.6. Research Contribution

The timeline of my master thesis design work is shown in Figure 1.5. As
described in the following chapters, a first-ever coarse-fine phase-prediction
based ADPLL design with sub-half-mW power and an optimized fractional spur level
is proposed. According to post-layout simulation and the verified model, a
state-of-the-art FoM is expected. This is a one-year, one-student project at NXP
Semiconductors, implemented in GlobalFoundries 40 nm-LP technology. My
theoretical work lies mainly in the analysis of fractional spurs and a
preliminary investigation of mismatch in the inverter-based DTC. Nevertheless,
my main contribution is the implementation of a full ADPLL. I was in charge of
the system-level analysis, the modelling of the PLL, the RTL coding of the core
digital logic and the SPI, the front-end synthesis (specification to netlist),
and the circuit-level implementation of the DCO, divider, counter, clock gating
and DTC. In late January this year I obtained access to IMEC Holst Centre's
design data [5], in TSMC 40 nm, as a reference, and carried out three
measurement sessions at IMEC before March owing to administrative arrangements.
The design was sent for tape-out on Sep. 11th and I am preparing the PCB for
measurement, which I will also carry out at NXP.
1.7. Outline of the Thesis

This thesis presents the design of a full ADPLL from system level to transistor
level. It is organized as follows. Chapter 2 begins with a background
introduction to ADPLL architectures. The fractional spur issue is then analysed
in terms of its causes, level estimation and a proposed solution. Based on that
solution, a coarse-fine phase predictor based ultra-low-power ADPLL system is
proposed. Chapter 3 explains the simulation method I used and the modelling work
I carried out for this ADPLL design. The implementation of the whole ADPLL is
described in Chapter 4 from three aspects: digital logic, mixed-signal design
and RF design. Chapter 5 summarizes the top-level simulation results, and
Chapter 6 draws the final conclusions of this work. The appendix includes
important data not shown in the main body of the thesis, such as the register
table, some direct simulation results and details of derivations referred to in
the main text.
Figure 1.5: Master thesis timeline.


References
[1] H. de Bellescize, La reception synchrone, L’Onde Electrique (1932).

[2] R. B. Staszewski, K. Muhammad, D. Leipold, C.-M. Hung, Y.-C. Ho, J. L. Wallberg, C. Fernando, K. Maggio, R. Staszewski, T. Jung, et al., All-digital TX frequency synthesizer and discrete-time receiver for Bluetooth radio in 130-nm CMOS, Solid-State Circuits, IEEE Journal of 39, 2278 (2004).

[3] R. B. Staszewski, J. L. Wallberg, S. Rezeq, H. Chih-Ming, O. E. Eliezer, S. K. Vemulapalli, C. Fernando, K. Maggio, R. Staszewski, N. Barton, L. Meng-Chang, P. Cruise, M. Entezari, K. Muhammad, and D. Leipold, All-digital PLL and transmitter for mobile phones, Solid-State Circuits, IEEE Journal of 40, 2469 (2005).

[4] Y.-H. Liu, X. Huang, M. Vidojkovic, K. Imamura, P. Harpe, G. Dolmans, and H. De Groot, A 2.7 nJ/b multi-standard 2.3/2.4 GHz polar transmitter for wireless sensor networks, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International (IEEE, 2012) pp. 448–450.

[5] V. Chillara, Y.-H. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot, and R. Staszewski, 9.8 An 860𝜇W 2.1-to-2.7GHz all-digital PLL-based frequency modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and ZigBee) applications, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International (2014) pp. 172–173.

[6] J.-W. Lai, C.-H. Wang, K. Kao, A. Lin, Y.-H. Cho, L. Cho, M.-H. Hung, X.-Y. Shih, C.-M. Lin, S.-H. Yan, Y.-H. Chung, P. Liang, G.-K. Dehng, H.-S. Li, G. Chien, and R. Staszewski, A 0.27mm2 13.5dBm 2.4GHz all-digital polar transmitter using 34DPA in 40nm CMOS, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International (2013) pp. 342–343.

[7] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, A 2.9-4.0-GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560 fs-RMS Integrated Jitter at 4.5-mW Power, Solid-State Circuits, IEEE Journal of 46, 2745 (2011).

[8] N. Pavlovic and J. Bergervoet, A 5.3GHz digital-to-time-converter-based fractional-N all-digital PLL, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, pp. 54–56.

[9] E. Temporiti, G. Albasini, I. Bietti, R. Castello, and M. Colombo, A 700-kHz bandwidth ΣΔ fractional synthesizer with spurs compensation and linearization techniques for WCDMA applications, Solid-State Circuits, IEEE Journal of 39, 1446 (2004).

[10] X. Gao, E. A. Klumperink, M. Bohsali, and B. Nauta, A 2.2 GHz 7.6 mW sub-sampling PLL with -126 dBc/Hz in-band phase noise and 0.15 ps rms jitter in 0.18 𝜇m CMOS, (2009).
2
ADPLL System: Background,
Analysis and Proposed
Architecture

The all-digital PLL (ADPLL) structure has demonstrated its potential for
ultra-low-power applications in nanoscale CMOS technology in recent publications
[1], while its conventional analog counterpart finds it increasingly difficult
to scale with CMOS technology. As one of the most exciting inventions in the RF
field, the counter-based ADPLL not only addresses issues faced by the
charge-pump PLL, such as reference spurs, but also offers more reconfigurability
and flexibility to meet the stringent requirements set by modern advanced
wireless standards.

Building on the background given previously, this chapter investigates ADPLL
design further from the system perspective. Section 2.1 outlines the currently
popular digital PLL architectures. Section 2.2 offers a basic background
analysis of the ADPLL system from both frequency and time perspectives. The
fractional spur issue is explored in depth in Section 2.3. Section 2.4 clarifies
the specifications of this thesis based on the Bluetooth Low Energy standard.
Section 2.5 proposes a novel ADPLL architecture to realize low noise and small
spurs at low power.

2.1. Current Popular Architectures

What is the criterion for deciding whether a PLL is an "analog" PLL or a
"digital" one? In general, it depends on whether the three blocks (phase
detector, loop filter, oscillator) in Figure 1.2 are designed in a digitally
intensive or an analog-intensive way. The term digital PLL refers to any PLL
architecture in which all three blocks are designed in a purely digital or
digitally intensive way, in contrast to conventional analog structures such as
the charge-pump PLL. As one of the most exciting breakthroughs in the RFIC field
over the past ten years, the ADPLL has found promising positions in applications
from wireline to wireless systems, from MHz to mm-wave products and from
ultra-low-power to high-performance applications. Although ADPLLs appear in
numerous patents from the US to Europe and in many publications from conferences
to journals, most of the proposed designs can be categorized into two types
according to how the "feedback phase detector" is implemented: divider-based and
counter-based ADPLLs.

2.1.1. Divider-based Digital PLL

The divider-based DPLL architecture, shown in Figure 2.1, can be considered a
direct digital equivalent of the charge-pump PLL [2]. As the name implies, a
divider sits at the first stage of the feedback loop, directly after the DCO
output, and divides down the variable clock CKV before feeding it to the TDC. A
ΣΔ-modulator based programmable divider determines the output frequency through
the ratio programmed into it. The TDC tracks the phase error by digitizing the
phase difference between the reference clock and the divided-down output clock.
The digital phase detector thus consists of a TDC and a programmable divider.
Based on the digitized phase error, the digital loop filter generates control
words that adjust the DCO frequency to track the reference input. Comparing with
Figure 1.3, it is clear that the charge-pump PLL and this divider-based
architecture share the same operating scheme. The divider-based DPLL has been
under investigation since the 1980s because it deviates little from the
charge-pump version.
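
How a ΣΔ-controlled programmable divider sets a fractional ratio can be sketched with the simplest case, a first-order accumulator that switches the divider between N and N+1. This is an illustrative assumption only: the order of the modulator, the function name and the numerical values are not taken from any specific design discussed here.

```python
def divider_sequence(n_int, frac_num, frac_den, cycles):
    """Division ratios chosen by a first-order accumulator (illustrative).

    Each reference cycle the fractional numerator is accumulated; on
    overflow the divider uses n_int + 1 for one cycle, otherwise n_int.
    """
    acc = 0
    seq = []
    for _ in range(cycles):
        acc += frac_num
        if acc >= frac_den:        # accumulator overflow acts as the carry bit
            acc -= frac_den
            seq.append(n_int + 1)
        else:
            seq.append(n_int)
    return seq

seq = divider_sequence(n_int=75, frac_num=3, frac_den=8, cycles=8)
average = sum(seq) / len(seq)
print(seq, average)   # the average ratio is 75 + 3/8 = 75.375
```

The instantaneous ratio is always an integer, and only its average equals the fractional target. The resulting periodic pattern in the divided clock is one source of the fractional spurs analysed later in this chapter.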

Figure 2.1: Divider-based DPLL (FREF and the divided clock feed the TDC, followed by the loop filter, DAC and VCO; the feedback path is a ÷N divider controlled by a ΣΔ modulator driven by FCW).

2.1.2. Counter-based Digital PLL


This section introduces an architecture that works in the time domain as well as
in the digital domain. It was proposed at the beginning of this century by TI
[3] as a breakthrough in PLL design.
One of the breakthroughs of this architecture lies in the oscillator design. The
DCO is based on a capacitor array and is thus much more compatible with modern
submicron technology nodes than a conventional VCO, as discussed in
[4] [5]. Another important breakthrough lies in the operation scheme of this novel
system: it operates purely in the digitized phase domain. Figure 2.2 shows a simplified
block diagram of the counter-based ADPLL. There is no longer a divider in the
feedback loop. Instead, the output frequency is set by the frequency command
word (FCW), defined as the expected number of output clock cycles
within one reference cycle. Hence, accumulating the FCW at the reference rate
provides the reference phase $\phi_R$, normalized to the output period. The integer part of the
output phase $\phi_{CKV}$ is obtained by counting the number of output clock cycles. The
fractional phase error is quantized by a fractional phase error quantizer
(usually realized by a TDC). The design challenges generally lie in the TDC
and the DCO, which are mixed-signal circuits with digital I/Os.
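The phase-domain bookkeeping described above can be sketched behaviourally. This is a minimal illustration, not the thesis implementation: the function name, the FCW value of 75.05 and the lumping of counter and TDC into one exact variable phase are assumptions made for the example.

```python
from fractions import Fraction


def phase_error_sequence(fcw, dco_cycles_per_ref):
    """Phase error (in DCO cycles) at each reference edge.

    fcw: expected number of DCO cycles per reference cycle.
    dco_cycles_per_ref: DCO cycles that actually elapsed in each reference cycle.
    Counter (integer part) and TDC (fractional part) are idealized here as one
    exact variable-phase accumulator.
    """
    ref_phase = Fraction(0)  # accumulated FCW (reference phase)
    var_phase = Fraction(0)  # accumulated DCO cycles (variable phase)
    errors = []
    for cycles in dco_cycles_per_ref:
        ref_phase += fcw
        var_phase += cycles
        errors.append(ref_phase - var_phase)
    return errors


FCW = Fraction(7505, 100)  # hypothetical fractional channel, 75.05
locked = phase_error_sequence(FCW, [FCW] * 4)            # DCO exactly on frequency
too_slow = phase_error_sequence(FCW, [Fraction(75)] * 4)  # DCO slightly too slow
print(locked)    # zero error at every edge
print(too_slow)  # error grows by 1/20 cycle per reference edge
```

When the DCO runs at exactly FCW cycles per reference cycle the error stays at zero; any frequency offset shows up as a ramp in the phase error, which the loop filter then corrects.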

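Figure 2.2: Counter-based DPLL. (Block diagram: the accumulated FCW is compared with the variable phase; the phase error drives the loop filter and the DCO; the DCO output CKV is accumulated and sampled, the TDC measures the fractional phase against FREF, and the retimed reference CKR clocks the digital blocks.)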

The digitized LF is another significant breakthrough in this class of ADPLL. A fully
synthesizable implementation not only shrinks the area significantly compared to its
analog counterpart, but also makes it possible to configure the LF's type as well
as its coefficients dynamically during normal operation without disturbing the phase
error, e.g., with the gear-shifting technique [4]. The reconfigurability brought by
digitalization thus leaves much more headroom to control the closed-loop bandwidth
and the loop dynamics. In addition, frequency or phase modulation
can be accomplished thanks to the freedom in the FCW control path. Frequency
modulation (FM) can therefore be integrated into the ADPLL directly, which is much
simpler than with the conventional structure shown in Figure 1.3.

2.1.3. Comparison of Architectures


Based on the two simplified block diagrams shown above, the main difference
between the two structures lies in the feedback
loop. This difference leads to distinctions in terms of power and phase noise,
as discussed below.
1. Divider-based architecture [6] The essence of the scheme is to first
scale down the output frequency to the reference rate and then quantize the phase
difference with the TDC. Thus the phase detector quantizes both the integer and
the fractional phase error in the same block. The range that the TDC must cover
is prohibitively large, since the whole reference period has to be
covered, which is a real disadvantage of this structure. Otherwise, a frequency
detection function has to be added. The positive aspect of this structure is that the TDC
operates only at the reference rate.
2. Counter-based architecture The scheme operates
purely in the phase domain. A counter and the reference phase accumulator are introduced
to quantize the integer part of the phase error, while the TDC carries the burden
of quantizing the fractional phase error. Thus the required operation range of the
TDC is only one output period, and the counter can be turned off in the
locked state, since the integer phase error is supposed to be zero under that circumstance.
Nevertheless, without a special clock gating technique, the TDC has to operate
at the DCO output frequency, which is usually rather high.
Table 2.1 summarizes the comparison in terms of power,
complexity and flexibility. (This comparison is based on typical designs.
Special techniques such as clock gating may change the results.)

Table 2.1: Comparison of the two basic DPLL architectures in general.

Merit         Divider-based   Counter-based
Power         Lower           Higher, due to the high TDC operation frequency
Complexity    Worse           Simpler, due to the short operation range required
Flexibility   Less            More; tricks can be introduced thanks to the operation scheme

Based on the discussion above, the counter-based architecture is chosen for
this project due to its lower complexity and greater flexibility.

2.2. ADPLL System Level Analysis


This section briefly introduces the system-level analysis of the ADPLL in
terms of its frequency response as well as the noise sources and their transfer paths in
the loop. For more details please refer to [4] [7]. The impact of each block on the phase
noise profile is also investigated and summarised.

2.2.1. PLL Frequency Response


Figure 2.3 shows a general linearised, s-domain model of an ADPLL frequency
synthesizer. It is valid as long as the fluctuation frequencies of interest are much
smaller than the sampling rate, which is the reference frequency $f_R$. It therefore
commonly holds as long as Gardner's rule [7] holds (i.e., the PLL bandwidth $f_{BW}$ is
at least 10 times smaller than the sampling frequency).
The input and output variables in Figure 2.3 are all digitized phases: the reference
phase $\phi_R$, the DCO output phase $\phi_{CKV}$ and the after-divider phase
$\phi_{CKV,D} = \phi_{CKV}/M$. The quantity $N$ is the ratio between the divided DCO clock and FREF and
is equivalent to FCW here:

$$N = \frac{f_V/M}{f_R} \qquad (2.1)$$

Figure 2.3: Linearised (s-domain) ADPLL. (Block diagram: phase detector with gain $N$ comparing $\phi_R$ with $\phi_{CKV,D}$, loop filter LF(s), normalized DCO and divider-by-2 from 2.4 GHz to 1.2 GHz. The loop filter can be configured as type-I with gain $\alpha$, type-II with $\alpha$ and an integral path $\rho f_R/s$, or type-II higher order with the IIR filter enabled.)

The block LF(s) shown in Figure 2.3 can be one of three different types:
type-I, with only a proportional gain $\alpha$; type-II, with both proportional ($\alpha$) and
integral ($\rho$) paths; or higher order, with the IIR filter turned on. These LF parameters
are programmable and can be dynamically configured during normal PLL operation.
For the following frequency response analysis, type-II operation is assumed
because of its generality (type-I can be considered as type-II with $\rho = 0$).
For convenience, we start by opening the loop at $\phi_{CKV,D}$ to calculate the
open-loop transfer function $H_{ol}(s)$ for the type-II case. It is given by

$$H_{ol}(s) = \frac{\phi_{CKV,D}}{N\,\phi_R} = \left(\alpha + \frac{\rho f_R}{s}\right)\cdot\frac{f_R}{s}\cdot\frac{K_{DCO}}{\hat{K}_{DCO}}\cdot\frac{1}{M} \qquad (2.2)$$

Assuming that the DCO gain is estimated correctly, $H_{ol}(s)$ reduces to

$$H_{ol}(s) = \frac{\phi_{CKV,D}}{N\,\phi_R} = \left(\alpha + \frac{\rho f_R}{s}\right)\cdot\frac{f_R}{s}\cdot\frac{1}{M} \qquad (2.3)$$
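The closed-loop transfer function can then be expressed as

$$H_{cl}(s) = \frac{\phi_{CKV}}{\phi_R} = \frac{N\cdot M\cdot H_{ol}(s)}{1 + H_{ol}(s)} = N\cdot\frac{\alpha f_R s + \rho f_R^2}{s^2 + \frac{\alpha f_R}{M}s + \frac{\rho}{M}f_R^2} \qquad (2.4)$$

This can be compared to the classical two-pole system transfer function

$$H(s) = N\,\frac{2\xi\omega_n s + \omega_n^2}{s^2 + 2\xi\omega_n s + \omega_n^2} \qquad (2.5)$$

where $\xi$ is the damping factor and $\omega_n$ is the undamped natural frequency.
The zero lies at $\omega_z = -\omega_n/2\xi$. Comparing the denominators of Eq. 2.4 and Eq.
2.5 gives

$$\omega_n = \sqrt{\frac{\rho}{M}}\cdot f_R \qquad (2.6)$$

and

$$\xi = \frac{1}{2}\cdot\frac{\alpha}{\sqrt{M\rho}} \qquad (2.7)$$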
For a type-I loop, the closed-loop transfer function simplifies to

$$H_{cl}(s) = N\cdot\frac{\alpha f_R}{s + \frac{\alpha f_R}{M}} \qquad (2.8)$$

and the 3-dB bandwidth of the loop is $f_{BW} = \alpha f_R/(2\pi M)$.
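As a quick numerical check of Eq. 2.6, Eq. 2.7 and the type-I bandwidth expression, the sketch below evaluates them for the loop settings listed with Figure 2.5 (reference at 32 MHz, M = 1, α = 2^-3, ρ = 2^-10). The function names are illustrative only.

```python
import math


def type2_loop_params(alpha, rho, f_ref, m=1):
    """Natural frequency (rad/s) and damping factor of the type-II loop."""
    omega_n = math.sqrt(rho / m) * f_ref       # Eq. 2.6
    xi = 0.5 * alpha / math.sqrt(m * rho)      # Eq. 2.7
    return omega_n, xi


def type1_bandwidth(alpha, f_ref, m=1):
    """3-dB bandwidth (Hz) of the type-I loop (rho = 0)."""
    return alpha * f_ref / (2 * math.pi * m)


# Loop settings quoted with Figure 2.5
omega_n, xi = type2_loop_params(alpha=2**-3, rho=2**-10, f_ref=32e6)
f_bw = type1_bandwidth(alpha=2**-3, f_ref=32e6)
print(f"omega_n = {omega_n:.3e} rad/s, xi = {xi:.2f}, f_BW = {f_bw / 1e3:.1f} kHz")
```

For these settings the loop is heavily damped (ξ = 2), and the proportional path alone would give a bandwidth of roughly 640 kHz.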


Now, considering the IIR filter in the proportional path of the LF, the transfer
function of a single-stage IIR filter in the s-domain is

$$H_{iir}(s) = \frac{1 + s/f_R}{1 + s/(\lambda f_R)} \qquad (2.9)$$

Four independently controlled IIR stages are implemented as an extra option to
shape the phase noise and attenuate the reference and the TDC quantization noise
at a -80 dB/decade slope. This strong filtering also helps to attenuate the close-in
fractional spurs. Each IIR stage has an attenuation factor $\lambda_i$, and the open-loop
transfer function becomes

$$H_{ol,iir}(s) = \left(\alpha + \frac{\rho f_R}{s}\right)\cdot\frac{f_R}{s}\cdot\frac{1}{M}\cdot\prod_{i=0}^{3}\frac{1 + s/f_R}{1 + s/(\lambda_i f_R)} \qquad (2.10)$$

2.2.2. Noise Sources and Noise Transfer Function


The linear ADPLL model including the phase noise sources is shown in Figure 2.4.
$\phi_{n,R}$ is the phase noise of the input reference clock. Its transfer function is the same
as the closed-loop transfer function of Eq. 2.4. The TDC and the DCO are the only
two places where noise can be injected into the system internally. Due to its digital
nature, the rest of the system is free from time- or voltage-domain perturbations.
The DCO phase noise $\phi_{n,DCO}$ experiences high-pass filtering by the loop. The
closed-loop transfer function is

$$H_{cl,DCO}(s) = \frac{\phi_{CKV}}{\phi_{n,DCO}} = \frac{1}{1 + H_{ol}(s)} = \frac{s^2}{s^2 + \frac{\alpha f_R}{M}s + \frac{\rho}{M}f_R^2} \qquad (2.11)$$

indicating that the DCO noise has a high-pass characteristic and that it dominates
the PLL phase noise outside the loop bandwidth. At low
frequency offsets (i.e., in-band), the type-II loop suppresses the DCO phase noise
with a slope of 40 dB/decade.
Figure 2.4: Noise sources in ADPLL. (The s-domain model of Figure 2.3 with three noise injection points: $\phi_{n,R}$ at the reference input, $\phi_{n,FEC}$ at the phase detector and $\phi_{n,DCO}$ at the DCO output.)

The second internal noise source, $\phi_{n,FEC}$, arises from the fractional phase error
counter (usually realized by a TDC) when it calculates $\epsilon$. Taking the TDC
as an example for convenience, its noise has several components: quantization, nonlinearity, and
randomness due to thermal effects, of which the quantization noise governed by
Eq. 2.12 is the major contributor [4].

$$\mathcal{L} = \frac{(2\pi)^2}{12}\left(\frac{\Delta t_{res}}{M\cdot T_V}\right)^2\frac{1}{f_R} \qquad (2.12)$$

In Eq. 2.12, $\Delta t_{res}$ is the time resolution of the TDC, and $M\cdot T_V$ is the
after-divider clock period at the input of the TDC. The factor $M$ is due to the divider. The
closed-loop transfer function of the TDC noise can be expressed as

$$H_{cl,TDC}(s) = \frac{\phi_{CKV}}{\phi_{n,FEC}} = \frac{M\cdot H_{ol}(s)}{1 + H_{ol}(s)} = \frac{\alpha f_R s + \rho f_R^2}{s^2 + \frac{\alpha f_R}{M}s + \frac{\rho}{M}f_R^2} \qquad (2.13)$$

which is a low-pass response with a gain factor of $M$ within the loop bandwidth.
Therefore the phase noise at the ADPLL RF output due to TDC quantization noise
is simply

$$\mathcal{L} = \frac{(2\pi)^2}{12}\left(\frac{\Delta t_{res}}{T_V}\right)^2\frac{1}{f_R} \qquad (2.14)$$
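To get a feeling for the magnitude of Eq. 2.14, the sketch below evaluates it for the settings quoted with Figure 2.5 (TDC resolution 2 ps, DCO at 2.402 GHz, reference at 32 MHz). The function name is illustrative, and the result is only the in-band floor caused by TDC quantization, not the composite noise.

```python
import math


def tdc_inband_phase_noise_dbc_hz(dt_res, t_v, f_ref):
    """In-band phase noise at the RF output due to TDC quantization (Eq. 2.14)."""
    linear = (2 * math.pi) ** 2 / 12 * (dt_res / t_v) ** 2 / f_ref
    return 10 * math.log10(linear)


# Settings quoted with Figure 2.5
level = tdc_inband_phase_noise_dbc_hz(dt_res=2e-12, t_v=1 / 2.402e9, f_ref=32e6)
print(f"TDC-limited in-band phase noise: {level:.1f} dBc/Hz")
```

Each halving of the TDC resolution lowers this floor by 6 dB, while doubling the reference frequency lowers it by 3 dB.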
Besides the three sources mentioned above, the finite frequency resolution of the DCO
also contributes to the phase noise at the output; it should be kept much
lower than the natural phase noise of the DCO in order to be negligible. Similar to
Eq. 2.12, the phase noise due to DCO frequency quantization can be derived [4]
as

$$\mathcal{L}(\Delta f) = \frac{1}{12}\left(\frac{\Delta f_{res}}{\Delta f}\right)^2\frac{1}{f_R}\left(\mathrm{sinc}\,\frac{\Delta f}{f_R}\right)^2 \qquad (2.15)$$

Since the DCO input tuning word is held constant between updates,
the white noise assumption is not fully justified. Hence, there is a sinc function in
Eq. 2.15 to account for the zero-order hold operation on the input tuning word of
the DCO.
The quantization noise of the DCO rolls off at 20 dB/decade, similar to
the up-converted thermal noise of the oscillator. As long as this quantization
noise is kept sufficiently low compared to the inherent phase noise of the
oscillator, the overall phase noise is not significantly affected. Usually the raw
resolution of the DCO is too coarse for the required specification. For that reason, ΣΔ
dithering of the smallest switched capacitance is applied. As a consequence, the
quantization noise is shaped in frequency, resulting in

$$\mathcal{L}(\Delta f) = \frac{1}{12}\left(\frac{\Delta f_{res,eq}}{\Delta f}\right)^2\frac{1}{f_{dth}}\left(2\sin\frac{\pi\Delta f}{f_{dth}}\right)^{2n} \qquad (2.16)$$

where $f_{dth}$ is the dithering frequency and $n$ the dithering order.
The equivalent frequency resolution after dithering is given as

$$\Delta f_{res,eq} = \frac{\Delta f_{res}}{2^{W}} \qquad (2.17)$$

where $W$ is the number of dithering bits.


The phase noise spectrum at the ADPLL RF output is depicted in Figure 2.5. It
follows the previous s-domain analysis, and the parameter settings represent
a general case chosen only for convenience. Together with a review of the
designs in [8] [9] [10], this already allows general conclusions on which blocks
determine the common ADPLL metrics, as summarized in Table 2.2.
Though the bandwidth is determined by the filter characteristics, there is an
optimum value under a certain power budget. According to the conclusion in [8],
it should be around the offset where the open-loop DCO phase noise spectrum
and the loop noise intersect. The spur issue, however, is more complex and is
investigated in more detail in the next section.
Figure 2.5: Phase noise spectrum at the output of ADPLL. (Type-II higher-order ADPLL; output phase noise [dB] versus frequency [Hz] from 10^3 to 10^8 Hz, with curves for the uncorrected DCO, reference, TDC, variable (DCO) and composite noise. Loop settings: $f_V$ = 2.402 GHz, $f_R$ = 32 MHz, $M$ = 1, LF with $\alpha = 1/2^3$ and $\rho = 1/2^{10}$, IIR turned off. Noise sources: DCO PN -107 dBc/Hz at 1 MHz offset, FREF PN -150 dBc/Hz, TDC resolution 2 ps.)

Table 2.2: Common metrics and depending blocks

Common metric       Dependence
Phase noise         In-band: phase detector + reference; out-of-band: DCO
Bandwidth           Loop filter
Locking behaviour   Loop filter + phase detector
Power               Mainly consumed in the DCO and the phase detector

2.3. Worst Inband Fractional Spurious Components of ADPLL Output Spectrum

The term spurious denotes any undesired signal present at the output of a
frequency system or any portion of it. Here, the fractional spurs of an
ADPLL are defined as the undesired tones lying at multiples of $FCW_{frac}\cdot f_R$
on both sides of the carrier. Fractional spurs are generally channel dependent and
their analysis can be rather complicated. These spurs mainly originate from
the feedback phase detector path. Any periodic error introduced by limited
resolution and mismatch in the phase detector in fractional-N mode modulates
the output of the ADPLL. In the locked state, the frequency fluctuation of the
output is as small as tens of kHz, so this error modulation is
a narrowband FM process. Thanks to the low-pass filtering characteristic of the loop,
the out-of-band fractional spurs generally do not cause serious problems. What
is of interest to designers are usually the positions and levels of the worst in-band
fractional spurs. Position estimation is mature and discussed in previous
publications [11] [12] [13], while spur level estimation has not been fully
analysed. In the remainder of this section, the spur level estimation
is derived in three parts according to the origin of the spurs: limited
resolution, nonlinearity and gain estimation error.

2.3.1. Limited Resolution Introduced Spur


Due to gate delay and mismatch limitations, the resolution of a typical inverter-based TDC
is limited to about 10 ps in 40 nm technology. For most applications in the 2.4
GHz ISM band this is more than enough. However, a TDC with a resolution of 10
ps is only a 5.3-bit quantizer over the period of a 2.4 GHz signal, while the
fractional part of the FCW is much finer than this, e.g., 16 bits wide in this
project. Eq. 2.12 holds only under the large-signal assumption (spanning multiple
quantization levels). For a close-to-integer FCW channel, the
fractional phase increases with a step size much finer than the TDC resolution.
A residue error therefore accumulates periodically and modulates the DCO
output spectrum. Figure 2.6 shows an example of such a close-to-integer
channel, assuming a 4-bit TDC and an FCW with a 6-bit fractional part
for convenience (in a real case the FCW fractional part is much finer than 6 bits).
As can be observed, the periodic residue (highlighted in green) has
a peak amplitude as large as the TDC resolution and a period of 4
($4 = 2^{(6-4)}$) reference cycles. After locking, this sawtooth-like residue behaves as
a narrowband frequency modulation at the input of the DCO. To estimate
the spur level, analyses from both the closed-loop and the open-loop
perspective are conducted as follows:
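The periodic residue of this example can be reproduced with a few lines of integer arithmetic. The sketch below assumes an ideal truncating TDC and the smallest possible fractional FCW (one LSB); the function name is hypothetical.

```python
def tdc_residue(frac_bits, tdc_bits, n_cycles):
    """Residue between the fractional phase ramp and its TDC-quantized value.

    The phase advances by one FCW LSB per reference cycle (closest-to-integer
    channel). Values are returned in units of the FCW LSB, 2**-frac_bits.
    """
    shift = frac_bits - tdc_bits
    phase = 0
    residues = []
    for _ in range(n_cycles):
        quantized = (phase >> shift) << shift   # ideal truncating TDC
        residues.append(phase - quantized)
        phase = (phase + 1) % (1 << frac_bits)  # fractional phase wraps at 1
    return residues


residue = tdc_residue(frac_bits=6, tdc_bits=4, n_cycles=12)
print(residue)  # sawtooth with a period of 2**(6 - 4) = 4 reference cycles
```

The residue is a sawtooth that repeats every 4 reference cycles and peaks just below one TDC LSB (4 FCW LSBs), in line with the sketch of Figure 2.6.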

Figure 2.6: Periodic residue resulting from limited TDC resolution. (Sketch of the value versus time in reference cycles: the 6-bit fractional phase ramp, the 4-bit TDC quantization output, and the residue error, which repeats every 4 cycles with a peak amplitude as large as the TDC resolution.)

a. Open-loop FM analysis
Frequency modulation (FM) is the process of producing a wave whose instantaneous
frequency varies as a function of the instantaneous amplitude of a modulating
wave, at a rate given by the frequency of the modulating source. Figure 2.6
shows a periodic residue resulting from the coarse resolution of the TDC.
This periodic residue can be considered as such a modulating source
for the DCO. The DCO output can then be expressed as a
frequency-modulated signal:

$$V_{out}(t) = V_0\cos\left[\omega_0 t + 2\pi K_{DCO}\int V_m(\tau)\,d\tau + \theta_0\right] \qquad (2.18)$$

where $V_m(\tau)$ is the variation at the input of the DCO, $\theta_0$ is the initial phase and
$K_{DCO}$ is the DCO sensitivity in Hz.
As the ramping shape of the quantization residue over time shows,
the "modulation" due to the quantization residue can be modelled as a sawtooth
narrowband frequency modulation. For this analysis, the modulating
signal is one of the Fourier series components of the sawtooth waveform,
which is given by

$$V_m(t) = A_m\cos(2\pi f_m t) \qquad (2.19)$$

where $A_m$ is the peak amplitude of the modulating signal from the digital phase error
and $f_m$ is its frequency. Thus the phase variation caused by the modulating signal
in the DCO output is

$$2\pi K_{DCO}\int V_m(\tau)\,d\tau = \frac{2\pi K_{DCO} A_m}{2\pi f_m}\sin(2\pi f_m t) \qquad (2.20)$$

Here we can define the modulation index $m$ as

$$m = \frac{K_{DCO} A_m}{f_m} = \frac{\Delta f_{pk}}{f_m}\ \mathrm{rad} \qquad (2.21)$$

Taking the initial phase $\theta_0$ as 0 for convenience, Eq. 2.18 can be rewritten
as

$$V_{out}(t) = V_0\cos\left[\omega_0 t + m\sin(2\pi f_m t)\right] \qquad (2.22)$$

Applying the angle-sum identity with $\omega_0 = 2\pi f_0$ gives

$$V_{out}(t) = V_0\cos(2\pi f_0 t)\cos\left(m\sin(2\pi f_m t)\right) - V_0\sin(2\pi f_0 t)\sin\left(m\sin(2\pi f_m t)\right) \qquad (2.23)$$

For narrowband modulation, $m$ is far less than 1. Thus

$$\cos\left(m\sin(2\pi f_m t)\right) \approx 1 \qquad (2.24)$$

$$\sin\left(m\sin(2\pi f_m t)\right) \approx m\sin(2\pi f_m t) \qquad (2.25)$$

Then

$$V_{out}(t) = V_0\cos(2\pi f_0 t) - V_0\sin(2\pi f_0 t)\cdot m\cdot\sin(2\pi f_m t) \qquad (2.26)$$

Since $\sin a\,\sin b = \frac{1}{2}[\cos(a-b) - \cos(a+b)]$, this is equivalent to

$$V_{out}(t) = V_0\cos(2\pi f_0 t) + 0.5\,m\,V_0\left[\cos\left(2\pi(f_0 + f_m)t\right) - \cos\left(2\pi(f_0 - f_m)t\right)\right] \qquad (2.27)$$
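In this way we can see that the single-sideband (SSB) spur-to-signal ratio can be
written as

$$\frac{spur}{signal} = 20\log_{10}(0.5m) = 20\log_{10}\left(\frac{\Delta f_{pk}}{2\cdot f_m}\right) \qquad (2.28)$$

Here $f_m$ is

$$f_m = 2^{-(W_F - W_{TDC})} f_R = \frac{FCW_{frac}\,T_V}{\Delta t_{res}}\,f_R \qquad (2.29)$$

where $f_R$ is the reference frequency, and $W_F$ and $W_{TDC}$ are the numbers of
fractional FCW bits and TDC bits (6 and 4 in the example of Figure 2.6).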
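The narrowband FM result, a single-sideband spur of about 20 log(0.5 m), can be checked numerically without any FFT library by correlating a synthesized FM signal with two complex tones. All numbers below (4096 samples, carrier in bin 400, modulation in bin 25, m = 0.02) are arbitrary illustration values, not design parameters.

```python
import cmath
import math


def tone_amplitude(samples, k):
    """Amplitude of the spectral component in DFT bin k of a real signal."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return 2 * abs(acc) / n


N = 4096                      # samples in one analysis window
F0, FM, M_IDX = 400, 25, 0.02  # carrier bin, modulation bin, modulation index

signal = [
    math.cos(2 * math.pi * F0 * i / N + M_IDX * math.sin(2 * math.pi * FM * i / N))
    for i in range(N)
]

carrier = tone_amplitude(signal, F0)
upper_sideband = tone_amplitude(signal, F0 + FM)
spur_dbc = 20 * math.log10(upper_sideband / carrier)
predicted_dbc = 20 * math.log10(0.5 * M_IDX)   # narrowband estimate, Eq. 2.28
print(f"simulated {spur_dbc:.2f} dBc, predicted {predicted_dbc:.2f} dBc")
```

For such a small modulation index the simulated sideband and the narrowband estimate agree to within a small fraction of a dB; the agreement degrades as m approaches 1.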

b. Closed-loop PM analysis
As the modulating residue is a sawtooth waveform, we first recall its
Fourier series:

$$x(t) = \frac{A}{2} - \frac{A}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2\pi n f t)}{n} \qquad (2.30)$$

$A$ is the amplitude of the waveform in the time domain, which is $\Delta t_{res}$ here. The in-band
spur level is now investigated from the closed-loop perspective. For convenience,
we start from phase modulation (with a sinusoidal modulating signal it makes
no difference whether one speaks of phase or frequency deviation, because the two
are related by the rate of modulation as in Eq. 2.21). Eq. 2.18 is thus rewritten as

$$V_{out}(t) = V_0\sin\left[\omega_0 t + K_p A_m\sin(2\pi f_m t)\right] \qquad (2.31)$$

The maximum phase deviation is

$$\sigma_\phi = K_p A_m\ \mathrm{rad} \qquad (2.32)$$

It is related to frequency modulation by

$$\sigma_\phi = m \qquad (2.33)$$

Hence Eq. 2.31 becomes

$$V_{out}(t) = V_0\sin\left[\omega_0 t + \sigma_\phi\sin(2\pi f_m t)\right] = V_0\sin(\omega_0 t)\cos\left[\sigma_\phi\sin(2\pi f_m t)\right] + V_0\cos(\omega_0 t)\sin\left[\sigma_\phi\sin(2\pi f_m t)\right] \qquad (2.34)$$

In the locked state the phase deviation is much smaller than 1 rad, so Eq. 2.34 becomes

$$V_{out}(t) \approx V_0\sin(\omega_0 t) + V_0\sigma_\phi\left[\cos(\omega_0 t)\sin(2\pi f_m t)\right] = V_0\sin(\omega_0 t) + \frac{V_0\sigma_\phi}{2}\left(\sin(\omega_0 t + 2\pi f_m t) - \sin(\omega_0 t - 2\pi f_m t)\right) \qquad (2.35)$$
Under the assumption that the thermal noise level in the loop is much smaller than
the TDC quantization step, the timing uncertainty (the amplitude of the
sawtooth fundamental in Eq. 2.30) is as large as

$$\sigma_{\Delta t} = \frac{1}{\pi}\,\Delta t_{res} \qquad (2.36)$$

The equivalent phase deviation can be estimated as

$$\sigma_\phi = 2\pi\frac{\sigma_{\Delta t}}{T_V} \qquad (2.37)$$

where $T_V$ is one period of the DCO output.
Instead of being spread evenly over the span from DC to the Nyquist frequency
(i.e., half of the reference frequency $f_R$), the phase noise energy is mainly concentrated
at the fractional spur position, so there is no need to normalize the noise
energy by the Nyquist span to calculate the spur level. The
estimated spur level can thus be expressed in SSB form as

$$\frac{spur}{signal} = 20\log_{10}(0.5\,\sigma_\phi) = 20\log_{10}\left(2\pi\frac{\Delta t_{res}}{2\pi T_V}\right) = 20\log_{10}\left(\frac{\Delta t_{res}}{T_V}\right) \qquad (2.38)$$

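c. Comparison
First, the time domain and the frequency domain are related by

$$\frac{\Delta t_{res}}{T_V}\,\alpha = \frac{\Delta f}{f_R} \qquad (2.39)$$

The corresponding peak frequency deviation is

$$\Delta f_{pk} = \alpha f_R\cdot\frac{1}{\pi}\cdot\frac{\Delta t_{res}}{T_V} \qquad (2.40)$$

Thus Eq. 2.28 is rewritten as

$$\frac{spur}{signal} = 20\log_{10}\left(\frac{\Delta t_{res}\cdot\alpha\cdot f_R}{2\pi T_V\cdot f_m}\right) \qquad (2.41)$$

To show the equivalence between Eq. 2.41 and Eq. 2.38, assume that one
fractional spur is located at the PLL cutoff frequency $f_{BW}$:

$$f_m = f_{BW} = \frac{1}{2\pi}\,\alpha f_R \qquad (2.42)$$

where the tangential open-loop response and the closed-loop response are both
at unity (actually they are both at -3 dB gain):

$$\frac{spur}{signal} = 20\log_{10}\left(\frac{\Delta t_{res}\cdot\alpha\cdot f_R}{2\pi T_V\cdot f_m}\right) = 20\log_{10}\left(\frac{\Delta t_{res}}{T_V}\right) \qquad (2.43)$$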
Table 2.3: Validation of the close-to-integer spur estimation with FCW = 38.0001.

Resolution   Estimated (dB)   Simulated (dB)
15 ps        -34.8911         -35.12
64 ps        -22.2893         -22.31

This is exactly the same as Eq. 2.38. The theoretically estimated and the Verilog-simulated
worst in-band fractional spur levels are compared in Table 2.3.
No TDC nonlinearity is added to the Verilog model and the
channel is chosen to be a close-to-integer one (FCW = 38.0001). The bandwidth is set to
200 kHz, while the standard deviation of the reference clock jitter is 3 ps, which
is much lower than the TDC resolution. The DCO frequency $1/T_V$ is 1.2 GHz in this example. However,
this match holds only under the condition that the thermal noise level of the
loop at the input of the TDC is much lower than the TDC quantization residue level,
which is expected. Once they become comparable, the random noise
dithers the quantization error and the spur energy is spread
into the white noise floor. This can be exploited to reduce the fractional
spur, as shown in [14], by adding a DTC to dither the reference clock before sending
it into the TDC.
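The estimated column of Table 2.3 can be reproduced directly from Eq. 2.38. The sketch below assumes a DCO frequency of exactly 1.2 GHz, which is why the values differ from the table in the third decimal place; the function name is illustrative.

```python
import math


def resolution_spur_dbc(dt_res, f_v):
    """Worst in-band fractional spur from limited TDC resolution (Eq. 2.38)."""
    return 20 * math.log10(dt_res * f_v)   # dt_res / T_V with T_V = 1 / f_v


F_V = 1.2e9  # DCO frequency of the Table 2.3 example (assumed exact)
spur_15ps = resolution_spur_dbc(15e-12, F_V)
spur_64ps = resolution_spur_dbc(64e-12, F_V)
print(f"15 ps: {spur_15ps:.2f} dB, 64 ps: {spur_64ps:.2f} dB")
```

Both results land within 0.01 dB of the estimated values in Table 2.3 and within about 0.3 dB of the simulated ones.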

2.3.2. TDC Nonlinearity Introduced Spur


Unfortunately, the worst-case fractional spurs for most channels do not
result from the limited resolution, but are caused by the nonlinearity of the TDC. In
Figure 2.7, the TDC resolution is assumed to be infinitely fine but with a nonlinearity.
The INL is assumed to be sinusoidal here for convenience; in a real implementation
the INL shape is rather complex and the spur level is hard to estimate.

Figure 2.7: Periodic residue from TDC nonlinearity. (Sketch of the value versus time in reference cycles: the fractional phase ramp, the nonlinear TDC transfer function and the resulting residue.)

The fundamental spur position $f_{frac}$ is related to the fractional part of the
FCW as $f_{frac} = FCW_{frac}\cdot f_R$. For example, with FCW = 38.0001
and a reference clock at 64 MHz, the fractional spurs resulting from nonlinearity
are expected at $f_{frac}(i) = i\cdot f_{frac}$, where $i = 1, 2, \cdots$ is the index of the spur
tones. Generally, the ones lying within the loop bandwidth cause problems, since they
are not filtered. Although the nonlinearity-induced error is not sawtooth-like, it is
still a periodic series, so the fundamental term of its
Fourier series can always be expressed in the form of Eq. 2.19. The sinusoidally
modulated DCO output can then be expressed by a Bessel function [15] series with the modulation
index $m$ as defined previously.
$$\begin{aligned} V_{out}(t) = V_0\{ & J_0(m)\sin(\omega_0 t) + J_1(m)[\sin(\omega_0 t + 2\pi f_m t) - \sin(\omega_0 t - 2\pi f_m t)] \\ & + J_2(m)[\sin(\omega_0 t + 2\pi 2 f_m t) + \sin(\omega_0 t - 2\pi 2 f_m t)] \\ & + J_3(m)[\sin(\omega_0 t + 2\pi 3 f_m t) - \sin(\omega_0 t - 2\pi 3 f_m t)] + \cdots\} \end{aligned} \qquad (2.44)$$

Only when the modulation index $m$ is much smaller than 1 can the fundamental tone be
approximated in the form of Eq. 2.38. Under this assumption
$J_1(m) \approx m/2$, while the higher-order terms are negligible. The modulation index $m$
depends on the peak deviation, which in turn is related to the shape of the INL. As demonstrated in Figure 2.7,
assuming the maximum INL is $t_{INL}$, the same procedure as before yields

$$\frac{spur}{signal} = 20\log_{10}\left(\pi\frac{t_{INL}}{T_V}\right) \qquad (2.45)$$
This nonlinearity is also modelled and simulated with a TDC resolution of 0.5 ps,
which is smaller than the thermal noise in the loop, so the quantization residue
is fully randomized by the thermal noise. The results match with a deviation of less
than 2 dB.
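How good the small-index approximation J1(m) ≈ m/2 behind Eq. 2.45 is can be checked with a short power-series evaluation of the Bessel function. The 5 ps peak INL and the 2.4 GHz DCO frequency below are hypothetical illustration values, and the helper names are not from the thesis.

```python
import math


def bessel_j1(m, terms=10):
    """First-order Bessel function of the first kind via its power series."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + 1)) * (m / 2) ** (2 * k + 1)
        for k in range(terms)
    )


def inl_spur_dbc(t_inl, t_v):
    """Fundamental spur for a sinusoidal INL: (Bessel value, Eq. 2.45 estimate) in dB."""
    m = 2 * math.pi * t_inl / t_v              # peak phase deviation in rad
    with_bessel = 20 * math.log10(bessel_j1(m))
    approx = 20 * math.log10(math.pi * t_inl / t_v)
    return with_bessel, approx


# Hypothetical example: 5 ps peak INL on a 2.4 GHz DCO period
with_bessel, approx = inl_spur_dbc(t_inl=5e-12, t_v=1 / 2.4e9)
print(f"J1(m): {with_bessel:.3f} dB, Eq. 2.45: {approx:.3f} dB")
```

For INL values of a few picoseconds the two numbers differ by less than 0.01 dB, so the simple estimate is adequate; the error grows with the square of m.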

2.3.3. TDC Normalization Introduced Spur


When the fractional phase exceeds $2\pi$, it wraps back to zero and then ramps up
step by step again, as shown in Figure 2.8. If the TDC normalization is not done
correctly, a residue error is generated from this non-$2\pi$ normalization and
introduces a sawtooth-like error that modulates the DCO, as long as the loop is slow
enough compared to the variation of the error. The formula to estimate this kind of
spur level follows essentially the same derivation as for the spur caused by the TDC resolution.
In general the level estimation is complex, since the residue depends on the fractional
part of the FCW. However, once the ramping step of the fractional reference phase is smaller
than the TDC resolution, the situation becomes as simple as shown in Figure 2.8.

Figure 2.8: Periodic residue from wrong TDC normalization. (Sketch of the value versus time in reference cycles: the ideal 4-bit TDC output, the wrongly normalized TDC output, and the resulting residue.)

Now the spur position is related to the number of TDC bits, while
the level can be expressed as

$$\frac{spur}{signal} = 20\log_{10}\left(\frac{t_{norm}}{T_V}\right) \qquad (2.46)$$

where the residue $t_{norm}$ is related to the number of TDC bits $n$ as

$$t_{norm} = 2^{n}\cdot\frac{Gain_{est} - Gain_{ideal}}{Gain_{ideal}}\cdot\Delta t_{res} \qquad (2.47)$$

The fundamental position of the fractional spurs is $f_{frac} = FCW_{frac}\cdot f_R$,
since this residue always wraps back to zero together with the fractional reference
phase.
Analysis results are summarised in Table 2.4. In conclusion, we need a
quantizer with not only good resolution but also good linearity, together with correct
gain calibration, to make the fractional spurs as small as possible.

Table 2.4: Fractional Spur Estimation Summary in General

Source          Position                Level                                   Solution
Resolution      $f_m$                   $20\log(\Delta t_{res}/T_V)$            finer resolution
Nonlinearity    $FCW_{frac}\cdot f_R$   $20\log(\pi\,t_{INL}/T_V)$              better linearity
Normalization   $FCW_{frac}\cdot f_R$   $20\log(t_{norm}/T_V)$ (1)              gain calibration

(1) This normalization-related error estimation depends on the fractional bits of the FCW.

2.4. Specification of the targeted ADPLL Design


This ultra-low-power ADPLL project targets the Bluetooth Low Energy (BLE)
application. BLE is an extension of the conventional Bluetooth standard, published
with version 4.0 of the latter in late 2009. Compared with the conventional Bluetooth
basic rate, the energy demand is required to be reduced by 90% in BLE. This is
made possible by relaxing the radio specifications to save power in
the physical layer; for example, the requirements for blocking interferers from
adjacent channels are reduced. At the protocol level, connection set-up is also
accelerated to make duty cycling more efficient. Table 2.5 shows the differences between these two
versions of the Bluetooth standard.
For a Bluetooth transmitter application, a frequency synthesizer can be
characterized by its phase noise, spurious tones and switching time. The
specifications are derived as follows.

Table 2.5: Bluetooth radio specifications for the basic rate and for the BLE extension

Parameter                             Bluetooth basic rate     BLE
RF channels                           f = 2402 + k MHz,        f = 2402 + 2k MHz,
                                      k = 0, 1, ..., 78        k = 0, 1, ..., 39
Carrier frequency rate                ± 75 kHz                 ± 150 kHz
Spurious emissions
  2 MHz                               -20 dBm                  -20 dBm
  3 MHz                               -40 dBm                  -30 dBm
Required sensitivity                  -70 dBm                  -70 dBm
Carrier-to-interference ratio (CIR)
  1 MHz                               0 dB                     15 dB
  2 MHz                               -30 dB                   -17 dB
  3 MHz                               -40 dB                   -27 dB

2.4.1. Phase Noise Specification

To calculate the tolerable phase noise level, a required signal-to-noise ratio
(SNR) has to be assumed in terms of demodulation performance. A GFSK demodulator
for a modulation index of h = 0.5 requires an SNR of about 11 dB. To allow
for sufficient implementation margin, an SNR of 15 dB is required. Assuming a
constant phase noise within the channel bandwidth, the required in-channel
average phase noise ℒ for a 500 kHz bandwidth channel simplifies to less than -75
dBc/Hz, which is not stringent for most present-day PLL designs. For the
spot phase noise requirement at larger offsets from the carrier, the adjacent-channel
carrier-to-interference ratio has to be taken into consideration, which is -17 dB in
BLE at 2 MHz; this results in a requirement of -92 dBc/Hz at 2 MHz offset.

Here the specification of the in-band noise floor is set to -85 dBc/Hz, with a spot
phase noise of -105 dBc/Hz at 1 MHz offset, for much more margin according to
the plan of NXP.
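The two phase noise figures above can be reproduced with a simple budget calculation. The sketch assumes that the noise is integrated over 1 MHz (both 500 kHz sides of the channel); this bandwidth is an assumption chosen because it reproduces the quoted -75 dBc/Hz and -92 dBc/Hz, and the function name is hypothetical.

```python
import math


def required_phase_noise_dbc_hz(snr_db, noise_bw_hz, cir_db=0.0):
    """Average phase noise that keeps the integrated noise snr_db below the signal.

    cir_db is the carrier-to-interference ratio of an adjacent-channel
    interferer (0 dB for the in-channel case).
    """
    return cir_db - snr_db - 10 * math.log10(noise_bw_hz)


NOISE_BW = 1e6   # assumed integration bandwidth in Hz (2 x 500 kHz)
in_channel = required_phase_noise_dbc_hz(snr_db=15, noise_bw_hz=NOISE_BW)
at_2mhz = required_phase_noise_dbc_hz(snr_db=15, noise_bw_hz=NOISE_BW, cir_db=-17)
print(f"in-channel: {in_channel:.0f} dBc/Hz, 2 MHz offset: {at_2mhz:.0f} dBc/Hz")
```

The chosen targets of -85 dBc/Hz in-band and -105 dBc/Hz at 1 MHz therefore leave roughly 10 dB or more of margin over these minimum requirements.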

2.4.2. Spurious Tone Specification


The tolerable spur level according to the spurious emission specification is -20
dBc or -30 dBc at 2 MHz or 3 MHz offset, respectively. The tolerable spur level
normalized to the carrier ($P_{spur}/P_{carrier}$) can be derived by considering how much
an interferer at a channel offset $f_{offset}$ has to be suppressed to fall below a certain
signal-to-noise ratio. A target of less than -70 dB for the reference spur is thus sufficient,
while in-band fractional spurs lower than -32 dB have to be achieved to avoid serious
adjacent-channel interference according to NXP's requirement. Fractional spur
reduction is also one of the priorities of this thesis work.

2.4.3. Settling Time Specification


BLE adopts a frequency hopping scheme: a frequency hopping transceiver is used to
combat interference and fading. The nominal hop rate is 1600 hops/s. A settling
time of less than 65 µs with an accuracy better than ± 150 kHz is sufficient for this target.
Table 2.6 summarizes the targeted specifications according to the above analysis.
Compared with current leading designs, the difficulties as well as the potential
breakthroughs lie in reducing the power consumption and the fractional spur level.
The trade-off between power and integrated jitter is pushed to make this
ADPLL power efficient, i.e., to generate as little
integrated jitter as possible within a given low power budget (< 800 µW).

2.5. Implementation from System level


2.5.1. Reference Design Introduction
The reference design becomes available to this thesis work and is published in
[1]. Here the design concept would be covered briefly.
Though TDC is a mature and exciting topic to explore in the direction of better
resolution at lower power consumption, it is already proved there is no certain
trade-off between resolution and power in its DPLL application. Besides being used
in divider-based DPLL, DTC is already introduced in [16] for fractional divider noise
cancellation and again in [17] [18] it is introduced for phase predication in counter-
based DPLL. Without a DTC, the TDC’s burden in the counter-based structure is to
detect the fractional phase error as explained in [4],
𝑡 −𝑡
𝜀=1− (2.48)
𝑇
2.5. Implementation from System level .
35

Table 2.6: Summary of Specifications and Conditions

Parameter                      Target / Condition
Technology                     GlobalFoundries 40 nm-LP
Supply voltage                 0.8 V ∼ 1 V (separate low supply for DCO)
Power consumption              < 800 𝜇W
Locking range                  2.2 GHz ∼ 3 GHz
Settling time                  < 60 𝜇s
Reference noise floor          -135 dBc/Hz
Reference spur                 < -70 dB
In-band phase noise floor      < -85 dBc/Hz
Phase noise @ 1 MHz offset     < -105 dBc/Hz
Integrated RMS jitter          < 1.5 ps
Worst fractional spur          < -30 dB

where $t_R$ is the rising edge of the reference clock, $t_V$ is the rising edge of the
feedback variable clock, $T_V$ is the period of the feedback variable clock, and
$\varepsilon$ is the quantization target of the TDC. In the lock-in state,

$$R_{R,f} + \varepsilon = 1 \tag{2.49}$$

where $R_{R,f}$ is the fractional part of the reference phase. However, it can now
also be expressed as

$$\begin{aligned} R_{R,f} + 1 - \frac{t_R - t_V}{T_V} &= 1 \\ t_R + (1 - R_{R,f})\,T_V &= t_V + T_V \end{aligned} \tag{2.50}$$

Thus, under the lock-in state assumption, what the TDC is tracking is $(1 - R_{R,f})$ plus
additional noise and residue in the loop, that is, a predictable value $(1 - R_{R,f})$
plus a small perturbation. According to Eq. 2.50, a DTC can be introduced to handle
the phase prediction, so that the reference clock is always delayed to align
with the next rising edge of the feedback variable clock. The TDC then only has to
quantize the residue between these two edges. Figure 2.10 shows the phase
prediction diagram. This phase prediction scheme is already implemented in [17]
and [1].
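
The prediction implied by Eq. 2.50 can be sketched numerically. The following Python snippet is only a minimal illustration, not the thesis implementation: the function name `predict_dtc_delays`, the CKV period of 800 ps and the DTC step of 30 ps are assumptions chosen to keep the arithmetic exact. It accumulates the fractional reference phase, computes the ideal reference delay, rounds it to the nearest DTC code and reports the residue that would be left to the TDC.

```python
from fractions import Fraction


def predict_dtc_delays(fcw_frac, t_v, dtc_step, n_cycles):
    """Return (ideal_delay, dtc_code, residue) for each reference cycle.

    fcw_frac : fractional part of the frequency command word
    t_v      : period of the variable clock (same unit as dtc_step)
    dtc_step : delay of one DTC stage
    """
    phr_f = Fraction(0)
    out = []
    for _ in range(n_cycles):
        phr_f = (phr_f + fcw_frac) % 1        # fractional reference phase
        ideal = (1 - phr_f) * t_v             # delay demanded by Eq. 2.50
        code = round(ideal / dtc_step)        # nearest available DTC code
        residue = ideal - code * dtc_step     # what the TDC still has to resolve
        out.append((ideal, code, residue))
    return out


# Assumed numbers: FCW,f = 0.25 (as in Figure 2.10), T_V = 800 ps, 30 ps DTC step.
for ideal, code, residue in predict_dtc_delays(Fraction(1, 4), 800, 30, 4):
    print(f"ideal = {ideal} ps, DTC code = {code}, residue = {residue} ps")
```

With these assumed values the residue never exceeds half a DTC step, which is the range the following TDC would have to cover.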

Figure 2.9: IMEC’s Design Diagram [1].

Figure 2.10: Phase Prediction Scheme (signals shown: CKV, PHR, PHR,f with FCW,f = 0.25, (1-PHR,f), DTC and TDC).



As shown in Figure 2.9, the phase detector in the reference design consists of a 64-stage
DTC with 30 ps steps followed by a 16-stage TDC with 15 ps resolution, which quantizes
the residue generated from the phase prediction. This structure greatly relaxes the
power-hungry TDC. Together with the smart clock gating technique, called the "snapshot"
scheme in [1], only 860 𝜇W is consumed to meet the specifications.
To sum up, this phase prediction assisted counter-based DPLL is the best candidate
for low power design, and thus the phase prediction scheme as well as the
clock gating concept are chosen as the starting point of this thesis design.

2.5.2. DTC-based Phase Detector


The phase detector and the DCO are generally where the power is consumed. For the
oscillator there is always a hard trade-off between power and phase noise
[19] [20]. For the phase detector, the trade-off is not so clear and depends largely
on the operation scheme of the phase detector. This gives much more headroom
to design a power efficient ADPLL, meaning an ADPLL offering low noise and
small spurs at low power. The reference design brought a real breakthrough
in terms of power saving. However, the phase prediction scheme has not been
fully exploited, and the worst case fractional spur
could be as high as around -20 dB, as shown in Figure 2.11.
In order to fully explore the advantage of this DTC-based phase detector,
the in-band phase noise level and the fractional spur level have to be analysed.
Apart from suspected sources such as supply coupling and supply noise,
which may add spurs via amplitude modulation, the main causes
of these fractional spurs are still the three mechanisms mentioned above:

a. Fractional Spur and In-band Phase Noise Floor


Essentially, the equivalent resolution of the DTC-based phase detector is determined
by whichever of the DTC and the TDC has the much finer resolution. Its operation is
analogous to a two-stage TDC. The DTC also performs a "quantization" (the phase prediction)
and the residue is left to the TDC for further quantization, as shown in Figure 2.12. This
is analysed in three cases in terms of the relation between their resolutions:

I. DTC resolution ≫ TDC resolution In this case the DTC step is much coarser than
the TDC step, and the limited DTC resolution contributes nothing to the output spur or
phase noise, because the residue error from phase tracking (the fractional phase steps
cannot be tracked exactly with the DTC resolution) is detected by the finer TDC. Besides,
the large residue left by the DTC fully scrambles the TDC output, so the assumption
behind the quantization noise floor (determined by the TDC) still holds [4]. Only in the
close-integer case would the TDC face the deadzone issue [9]. In this case the phase
prediction scheme shows no big difference compared with the TDC-only structure.
The only potential impact from the DTC stage would be its nonlinearity and gain
estimation error, which could modulate the TDC output and hence the output
spectrum. Other than these, the DTC only assists the narrow-range TDC.

Figure 2.11: Measured worst case spur in IMEC design

II. DTC resolution ≈ TDC resolution This is the case when the DTC resolution
is slightly coarser or finer than the TDC resolution. Here, what the DTC
cannot track cannot be detected by the TDC either. The TDC therefore theoretically
always faces the deadzone problem, since the DTC always aligns the two inputs to
the extent that their difference is smaller than the TDC's resolution. In practice,
however, the strong nonlinearity of the DTC and the thermal noise enlarge the difference
and relax this serious problem. To solve it, one can either
add dithering to the reference or use a noisier crystal to bring the TDC out of the deadzone.

Figure 2.12: DTC working principle (fractional phase, DTC value and residue input at TDC versus time).

III. DTC resolution ≪ TDC resolution In this situation, the DTC aligns the inputs
of the TDC exactly as in the close-integer case, and it takes an extremely
long time for the accumulated error from the DTC phase tracking to be detected by
the TDC. In this case the spur and the in-band noise floor are mainly dominated by
the TDC.
In short, in terms of resolution, as long as the TDC can recognise the residue
left by the DTC, the equivalent resolution of the phase error detection is mainly
determined by the TDC. In terms of nonlinearity-induced spurs, both the DTC and the TDC
contribute, and generally the DTC dominates, since the DTC's transfer characteristic
is fully traversed while the fractional phase ramps periodically.
Besides, a wrongly estimated DTC gain also contributes significantly to the
fractional spur. Since the DTC's gain determines the input error of the TDC, DTC
gain estimation is important not only for spurs but also for locking. A wrongly
estimated DTC gain causes a serious spur problem in the close-integer channel.
Nevertheless, this problem can be solved by the LMS algorithm as
depicted in [17].
The measured results of the reference design are rebuilt in the model to verify the
analysis above, and the result is shown in Figure 2.14. The reference
noise floor is -120 dBc/Hz and 𝛼 = 2 while 𝜌 = 2 . Channel is chosen as
FCW=38.001 and the reference clock is 32 MHz with RBW=1 kHz. The DTC's resolution
is 22 ps while the TDC's resolution is 15 ps. All these parameters are set as
during the measurement. Nonlinearity-induced fractional spurs are concentrated
at multiples of 32 kHz. The INL of the DTC is plotted in Figure 2.13. The simulation result
matches the analysis well in the sense that:

Inband phase noise Theoretically the in-band phase noise is determined by the
TDC resolution, and this is verified in Figure 2.14.

Nonlinearity spur This is dominated by the DTC, and the simulated result is -25.74
dBc (-55.74+10log(RBW)). The estimated level is -25.89 dBc with 0.6 LSB INL
(shown in Figure 2.13).

Quantization related spur The stronger spur outside the bandwidth at 1.188
MHz results from the quantization due to the DTC's resolution (this is the case in which
the DTC is slightly coarser than the TDC and a residue is left). Its position is
close to the estimated 1.19 MHz. The position is almost 10 times the
bandwidth, thus the equivalent in-band quantization spur level is -33.33 dBc
(-83.33+10log(RBW)+20). The estimated level is -31.39 dBc. This 2 dB
deviation can be explained by the rough 20 dB compensation added for the
loop filtering effect.
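
The spur arithmetic quoted in the three items above can be reproduced with a few lines of Python. This is only a sketch of the bookkeeping: the helper names `fractional_spur_offset` and `spur_level_dbc` are hypothetical, the fundamental fractional spur is assumed to sit at frac(FCW) times the reference frequency, and the 20 dB loop compensation is the rough figure used in the text.

```python
import math


def fractional_spur_offset(fcw, f_ref_hz):
    """Offset of the fundamental fractional spur, assumed to be frac(FCW) * f_ref."""
    return (fcw % 1.0) * f_ref_hz


def spur_level_dbc(marker_dbc_hz, rbw_hz, loop_comp_db=0.0):
    """Convert an FFT marker reading (dBc/Hz) into a tone level (dBc).

    The marker is scaled by the resolution bandwidth; loop_comp_db optionally
    refers an out-of-band spur back to its in-band level.
    """
    return marker_dbc_hz + 10.0 * math.log10(rbw_hz) + loop_comp_db


# Values read from the rebuilt reference design (FCW = 38.001, 32 MHz, RBW = 1 kHz).
print(round(fractional_spur_offset(38.001, 32e6)))       # spur offset in Hz
print(round(spur_level_dbc(-55.74, 1e3), 2))             # nonlinearity spur
print(round(spur_level_dbc(-83.33, 1e3, 20.0), 2))       # quantization spur, in-band equivalent
```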

b. DTC-based Bang-Bang Phase Detector


Above all, the analysis done for spur estimation in Section 2.3 still applies to
the DTC-TDC DPLL, since the DTC-TDC structure can be considered an alternative
to a two-stage TDC. The only difference is that the DTC always aligns the input
signals of the succeeding TDC, which potentially causes a deadzone problem. However,
this integer-like operation suggests using a Bang-Bang phase detector
instead of a TDC in the lock-in state, since Bang-Bang based DPLLs are well known for good
phase noise performance in integer-N mode compared with TDC-based ones [21] [22]. Unlike
the mid-tread-like TDC, the Bang-Bang detector's transfer function is strongly nonlinear,
with only a +1 or -1 output, as can be seen in Figure 2.15. As a 1-bit mid-rise quantizer, it

Figure 2.13: Nonlinearity of the rebuilt DTC (DNL and INL in LSB versus digital control code).

Figure 2.14: Reference design rebuilt (simulated and theoretical SSB phase noise spectrum at CKV in dBc/Hz versus frequency in Hz; CKV f0 = 1216.031250 MHz, integrated phase error 6.53841 ps, FFT length 4645250, RBW = 1000 Hz; markers at 30.86 kHz: -55.74 and 1.188 MHz: -83.33).



Figure 2.15: (a) TDC as mid-tread quantizer (b) Bang-Bang as mid-rise quantizer

could be as simple as a D flip-flop. The deadzone issue in the close-integer channel
(the TDC outputs zero while the error is accumulating) is no longer a problem for a
Bang-Bang based DPLL. However, without random noise to dither the input signal,
the Bang-Bang detector's strong nonlinearity would also cause a limit cycle problem and
make the loop hard to analyse with a linear s-domain model.
Thanks to the thermal noise in the loop, which is mainly determined by the reference
clock, the inputs are randomized and the transfer function is shaped by the
thermal noise as shown in Figure 2.16.
The transfer function of the Bang-Bang phase detector is then linearised by
the noise, with an equivalent gain of

$$K_{BB} = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma} \tag{2.51}$$

where 𝜎 is the standard deviation of the thermal noise in the loop, under the assumption
that the noise is Gaussian distributed. The gain of the Bang-Bang detector now appears
in the closed-loop transfer function. Taking the type-I closed-loop transfer function
as an example, it becomes
$$H(s) = N \cdot \frac{\alpha K_{BB} f_R}{s + \alpha K_{BB} f_R / M} \tag{2.52}$$
and the 3-dB bandwidth of the loop is now $f_{BW} = \alpha K_{BB} f_R/(2\pi M)$.
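
The linearised gain of Eq. 2.51 can be checked against the exact mean output of an ideal sign detector driven by Gaussian jitter. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the detector is an ideal comparator, and the 2 ps rms jitter is an assumed value. For a timing offset dt the mean of sign(dt + n) equals erf(dt / (sqrt(2) sigma)), whose slope at the origin is sqrt(2/pi)/sigma.

```python
import math


def bb_linearised_gain(sigma):
    """Equivalent Bang-Bang gain sqrt(2/pi)/sigma for Gaussian jitter (Eq. 2.51)."""
    return math.sqrt(2.0 / math.pi) / sigma


def bb_mean_output(dt, sigma):
    """Mean of sign(dt + n) with n ~ N(0, sigma^2), i.e. erf(dt / (sqrt(2) * sigma))."""
    return math.erf(dt / (math.sqrt(2.0) * sigma))


sigma = 2e-12   # assumed rms jitter at the detector input, 2 ps
dt = 1e-15      # small timing offset, 1 fs, well inside the linear region

slope = bb_mean_output(dt, sigma) / dt
print(slope, bb_linearised_gain(sigma))

# Far outside the noise, the detector saturates and the linear model breaks down.
print(bb_mean_output(10 * sigma, sigma))
```

The saturation in the last line corresponds to the requirement that the input phase difference stays small compared with the thermal noise.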
So far, two drawbacks of the Bang-Bang phase detector have to be clarified
here. The first is that the input phase difference should not be too large compared with

Figure 2.16: (a) Simplified Bang-Bang (b) Input edge shifted by thermal noise (c) Gain linearised by thermal noise

the thermal noise in the loop, otherwise the transfer function is still strongly nonlinear.
Another drawback of the Bang-Bang phase detector is that the loop bandwidth is
affected by the gain of the phase detector, which means the bandwidth
depends on the reference noise. However, since the quality of the reference clock can
be ensured without large variation, the bandwidth is expected to remain
stable.

Back to the first issue, a rather fine DTC with acceptable linearity is required
to ensure that the Bang-Bang phase detector's input is always randomized by the loop
thermal noise. In this way the "TDC" always works in close-integer mode,
ensured by a good DTC. Instead of a mid-tread quantizer TDC, a mid-rise
Bang-Bang phase detector is adopted to take advantage of the thermal noise, so
that the conventional deadzone problem is avoided. Now the phase noise as
well as the fractional spurs are all determined by the DTC. No dithering is needed,
and in theory the PLL is expected to work as a Bang-Bang DPLL does in
integer-N mode, as long as the DTC is extremely linear with picosecond resolution.
Hence, all of the burden of the TDC is transferred to the DTC, and there are good
reasons to do so in terms of power, mismatch as well as compatibility with digital
calibration (under the target of ultra-low power, a delay chain based DTC or TDC is
preferred to maintain low power as well as simplicity):

In terms of power The DTC does not need D flip-flops (DFFs) to sample the
output value and consumes less power than the TDC due to its simplicity.

In terms of linearity If the delay chains of the TDC and the DTC are the
same, the TDC gets additional mismatch from the array of DFFs.

In terms of calibration The DTC is a DAC-like structure to which digital techniques
such as predistortion and DEM can be applied easily. On the other
hand, it would be much more power hungry to apply those techniques to a TDC.
Another good reason to go for a DTC in terms of resolution is covered in
Chapter 4. Yet the DTC-only phase detector has its own drawback: due to
the limited linear detection range of the Bang-Bang quantizer, it is easier to lose lock.
Besides, it is undesirable that the bandwidth may be affected by the noise in the
loop.
Driven by all the motivations mentioned above, the final structure is proposed
as consisting of a DTC offering resolution at the thermal noise level (around 2 ps,
corresponding to a reference clock with a -135 dBc/Hz noise floor) with good linearity,
and a Bang-Bang detector. DEM is also implemented to relax potential nonlinearity
issues.

2.5.3. Proposed System


The proposed structure is shown in Figure 2.17. "CDTC" denotes the coarse
DTC, while "FDTC" represents the fine stage.
The phase detector is based on the discussion above. A coarse-fine DTC structure
is adopted for power saving and will be fully explained in Chapter 4. The
16-stage coarse DTC is implemented with a unit cell resolution of 58 ps to cover
more than the required range for safety. A 32-stage fine DTC with a 2 ps unit cell is
then designed to make the phase detector's resolution comparable with the thermal noise,
so that the Bang-Bang detector's nonlinear transfer function is fully randomized. DEM is
applied to the DTC to relax the nonlinearity issue. An improved clock gating technique
called "time freezer" is proposed to generate the re-timed reference clock as
well as the variable input to the Bang-Bang detector.
This design was taped out on Sep. 11th. Two additional techniques were not taped
out due to the limit of time:
1. Quadrature phase assisted DTC. This is done by taking advantage of the

4-phase output of the divide-by-2 stage. By choosing, from the four generated phases,
the one closest to the reference edge, the required range of the DTC is
reduced to only a quarter of the original $T_V$.
2. TDC assisted fast locking. A full-range TDC has the advantage that
the loop always observes a linearly quantized phase error, so locking is
rather fast. Before lock-in, the DTC does not perform the correct phase prediction,
which results in slower settling compared with full-range TDC based
ADPLL designs. However, a full-range TDC could help observe the fractional phase
error over the full $T_V$ period at the beginning. The TDC could then be switched
to 1-bit Bang-Bang operation after lock-in. Since spurs, power and
phase noise only matter after the loop locks, the resolution and linearity of the
full-range TDC have no impact at all. In this way, the advantages of both TDC and
Bang-Bang are combined.

Figure 2.17: Simplified diagram of proposed architecture (phase detector with CDTC, FDTC, DEM, TDC/BB and time freezer; loop filter LF(s); DCO and divider; 2.4 GHz, 1.2 GHz and 32 MHz clock domains).

All of these are described in detail in Chapter 4.



References
[1] V. Chillara, Y.-H. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot, and R. Staszewski, 9.8 An 860𝜇W 2.1-to-2.7GHz all-digital PLL-based frequency modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and ZigBee) applications, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International (2014) pp. 172–173.

[2] Y.-H. Liu, X. Huang, M. Vidojkovic, K. Imamura, P. Harpe, G. Dolmans, and H. De Groot, A 2.7 nJ/b multi-standard 2.3/2.4 GHz polar transmitter for wireless sensor networks, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International (IEEE, 2012) pp. 448–450.

[3] R. B. Staszewski, K. Muhammad, D. Leipold, C.-M. Hung, Y.-C. Ho, J. L. Wallberg, C. Fernando, K. Maggio, R. Staszewski, T. Jung, et al., All-digital TX frequency synthesizer and discrete-time receiver for Bluetooth radio in 130-nm CMOS, Solid-State Circuits, IEEE Journal of 39, 2278 (2004).

[4] R. B. Staszewski and P. T. Balsara, All-digital frequency synthesizer in deep-submicron CMOS (John Wiley & Sons, 2006).

[5] R. Staszewski, D. Leipold, C.-M. Hung, and P. Balsara, A first digitally-controlled oscillator in a deep-submicron CMOS process for multi-GHz wireless applications, in Radio Frequency Integrated Circuits (RFIC) Symposium, 2003 IEEE (IEEE, 2003) pp. 81–84.

[6] S. Levantino, G. Marzin, and C. Samori, An adaptive pre-distortion technique to mitigate the DTC nonlinearity in digital PLLs, Solid-State Circuits, IEEE Journal of 49, 1762 (2014).

[7] F. M. Gardner, Phaselock techniques (John Wiley & Sons, 2005).

[8] X. Gao, E. A. Klumperink, P. F. Geraedts, and B. Nauta, Jitter analysis and a benchmarking figure-of-merit for phase-locked loops, Circuits and Systems II: Express Briefs, IEEE Transactions on 56, 117 (2009).

[9] R. B. Staszewski, K. Waheed, S. Vemulapalli, F. Dulger, J. Wallberg, C.-M. Hung, and O. Eliezer, Spur-free all-digital PLL in 65nm for mobile phones, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International (IEEE, 2011) pp. 52–54.

[10] N. Pavlovic and J. R. M. Bergervoet, Digital phase locked loop, (2013), US Patent 8,362,815.

[11] E. Temporiti, C. Weltin-Wu, D. Baldi, M. Cusmai, and F. Svelto, A 3.5 GHz wideband ADPLL with fractional spur suppression through TDC dithering and feed-forward compensation, Solid-State Circuits, IEEE Journal of 45, 2723 (2010).

[12] E. Temporiti, C. Weltin-Wu, D. Baldi, R. Tonietto, and F. Svelto, A 3 GHz fractional all-digital PLL with a 1.8 MHz bandwidth implementing spur reduction techniques, Solid-State Circuits, IEEE Journal of 44, 824 (2009).

[13] L. Minjae and A. A. Abidi, A 9 b, 1.25 ps Resolution Coarse–Fine Time-to-Digital Converter in 90 nm CMOS that Amplifies a Time Residue, Solid-State Circuits, IEEE Journal of 43, 769 (2008).

[14] K. Waheed, R. B. Staszewski, F. Dulger, M. S. Ullah, and S. D. Vamvakos, Spurious-free time-to-digital conversion in an ADPLL using short dithering sequences, Circuits and Systems I: Regular Papers, IEEE Transactions on 58, 2051 (2011).

[15] V. Manassewitsch, Frequency synthesizers: theory and design, (1987).

[16] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, A 2.9-4.0-GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560 fs-RMS Integrated Jitter at 4.5-mW Power, Solid-State Circuits, IEEE Journal of 46, 2745 (2011).

[17] N. Pavlovic and J. Bergervoet, A 5.3GHz digital-to-time-converter-based fractional-N all-digital PLL, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, pp. 54–56.

[18] J. Zhuang and R. B. Staszewski, A low-power all-digital PLL architecture based on phase prediction, in Electronics, Circuits and Systems (ICECS), 2012 19th IEEE International Conference on (IEEE, 2012) pp. 797–800.

[19] D. Pfaff and Q. Huang, A quarter-micron CMOS, 1 GHz VCO/prescaler-set for very low power applications, in Custom Integrated Circuits, 1999. Proceedings of the IEEE 1999 (IEEE, 1999) pp. 649–652.

[20] D. Ham and A. Hajimiri, Concepts and methods in optimization of integrated LC VCOs, Solid-State Circuits, IEEE Journal of 36, 896 (2001).

[21] N. Da Dalt, Linearized Analysis of a Digital Bang-Bang PLL and Its Validity Limits Applied to Jitter Transfer and Jitter Generation, Circuits and Systems I: Regular Papers, IEEE Transactions on 55, 3663 (2008).

[22] N. Da Dalt, A design-oriented study of the nonlinear dynamics of digital bang-bang PLLs, Circuits and Systems I: Regular Papers, IEEE Transactions on 52, 21 (2005).
3. ADPLL Modelling and Simulation

Jitter or phase noise is always of great concern to the designers of PLL systems.
Jitter and phase noise are different ways of describing an undesired
variation in the timing of events at the output of the PLL. They are difficult to
predict with traditional circuit simulators because the PLL generates repetitive
switching events as an essential part of its operation.

Direct simulation, such as SpectreRF based simulation, is accurate but may take
as long as several months to obtain a smooth, well-characterized in-band
phase noise and spur profile. Based on the analysis in Chapter 2, a
more time efficient yet accurate method is proposed to model and simulate
the ADPLL. In this chapter, the modelling and simulation methodology
of the ADPLL is illustrated.

3.1. General Methods


Generally there are two different ways of simulating a PLL system: voltage
domain direct simulation and time domain event-driven simulation. They are
discussed as follows:


3.1.1. Direct Simulation


In many circumstances, SpectreRF can be directly applied to predict the noise
performance of a PLL. The precondition is that the PLL must at least
have a periodic steady state solution. This rules out systems such as Bang-Bang
clock and data recovery circuits and fractional-N synthesizers because they behave
in a chaotic way by design. It also rules out any PLL that is implemented with
a phase detector that has a dead zone. A dead zone has the effect of opening
the loop and letting the phase drift seemingly at random when the phase of the
reference and the output of the voltage-controlled oscillator (VCO) are close. This
gives these PLLs a chaotic nature. To perform a noise analysis, SpectreRF must
first compute the steady-state solution of the circuit with its periodic steady state
(PSS) analysis. If the PLL does not have a periodic solution, the simulation will
not converge. Unfortunately, both the Bang-Bang DPLL and the TDC based ADPLL in
close-integer mode have this problem. Besides, the oscillator works
at a frequency of several GHz or tens of GHz, while the frequency accuracy
requirement for the system is tens or hundreds of Hz. To get the response of the
oscillator with enough accuracy, the simulator has to run with time steps of ps
or fs. On the other hand, to characterize an accurate and smooth in-band phase
noise, the simulated time span needs to be several ms, which would require a simulation
run time as long as several months. With thousands of transistors involved, the
run time required by SpectreRF for phase noise estimation
is too long to be practical.
Nevertheless, SpectreRF simulation can still provide an accurate transistor-level
estimate of the design in terms of power and phase noise. Even though PSS is
not applicable in some cases, the VCO output time stamps or the input tuning signal
can still be post-processed to estimate the phase noise. Besides,
a useful side of SpectreRF simulation is that the power consumption can
be estimated roughly.

3.1.2. Event Driven Verilog-based Simulation


[1] gives a hint on the event-driven simulation method, and this Verilog/VHDL based
simulation is already widely used and verified among DPLL designers. What's more,
the time domain operation of the counter-based ADPLL makes it a perfect
candidate for event-driven Verilog-based simulation. The phase noise and locking
behaviour can be predicted well thanks to modern CMOS processes

Table 3.1: Methodologies Comparison

Metric              SpectreRF            Event Driven Verilog-based
Power               accurate             only possible via digital flow (netlist-based estimate, rough accuracy)
Locking Behaviour   accurate             accurate
Phase Noise         accurate             accurate
Spurious Tones      accurate             accurate
Time needed         5 days for 50 𝜇s     10 minutes for 1 ms

which offer steep edges and hence sharp time stamps. Table 3.1 summarises the
differences between the two methodologies mentioned above.
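
The run times in Table 3.1 can be put on a common footing with a short calculation. The sketch below assumes that the run time scales linearly with the simulated time span, which is an assumption made here for illustration and not a statement from the measurements; the helper name `runtime_for_span` is hypothetical.

```python
def runtime_for_span(ref_runtime_s, ref_span_s, target_span_s):
    """Scale a known run time linearly with the simulated time span."""
    return ref_runtime_s * target_span_s / ref_span_s


DAY = 24 * 3600.0

# Table 3.1: SpectreRF needs 5 days for 50 us, the event-driven model 10 minutes for 1 ms.
spectre_days = runtime_for_span(5 * DAY, 50e-6, 1e-3) / DAY
verilog_days = runtime_for_span(10 * 60.0, 1e-3, 1e-3) / DAY

print(f"SpectreRF, 1 ms span:    {spectre_days:.0f} days")
print(f"Event-driven, 1 ms span: {verilog_days * 24 * 60:.0f} minutes")
print(f"Speed-up factor:         {spectre_days / verilog_days:.0f}")
```

Under this linear assumption a 1 ms SpectreRF run would take on the order of a hundred days, in line with the "several months" quoted for spans of several ms.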

3.2. Mixed Signal Simulation Methodology


Though SpectreRF based voltage domain direct simulation could give sufficient
information about the design, the time required for this
simulation is prohibitively long. Instead of waiting several months
for simulated results, a sufficiently accurate but more efficient simulation method is
desired, so that different ideas and structures can be tried by simulating the model
within a relatively short time.

Table 3.2 shows the proposed mixed signal simulation methodology used
in this thesis design. All the design results presented in this thesis are based
on this flow. One thing to be explained here is the AMS in step 4 of
Table 3.2. This is a mixed signal simulator available in Cadence.
It combines the Verilog-based code with the schematic netlist in one simulation.
The essential principle is to simulate all the schematic netlist based parts with
SpectreRF while simulating the digital Verilog code with NCSim. The interaction
between the digital code and the voltage domain analog netlist is handled by
conversion between the analog and digital domains, guided by predefined rules.

Table 3.2: Proposed Methodology

Step            Phase Detector (Mixed)   DCO (RF)        LF (Digital)
1               Matlab                   Matlab          Matlab
2               Verilog                  Verilog         Verilog
3 (SpectreRF)   Sch                      Sch             Synthesized Sch
4 (AMS)         Sch                      Verilog (sch)   Synthesized netlist
5 (NCsim)       Verilog (lay)            Verilog (lay)   with backend timing annotation

1 Sch is short for Schematic here.
2 Lay is short for post-layout extracted information here.

3.3. Phase Detector Modelling


As depicted in Chapter 2, the phase detector realized in this thesis consists of
a DTC and a Bang-Bang detector, and the architecture is mainly inverter based, as will
be explained in Chapter 4. It is easy to model in Verilog, with the
nonidealities considered as follows:

3.3.1. Mismatch
Mismatch among the delay stages would cause a serious spur problem,
and it is modelled by a Gaussian distribution. Alternatively, for a
given INL shape, the delay of each stage can be predefined directly.
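
A Gaussian mismatch model of this kind can be sketched as follows. The snippet is a minimal illustration rather than the Verilog model of the thesis: the function names are hypothetical, the 64 stages and 30 ps nominal delay are borrowed from the reference design, and the 3 % relative standard deviation is an assumed value. DNL and INL are computed with an endpoint fit, taking the mean stage delay as the LSB.

```python
import random


def delay_chain_with_mismatch(n_stages, nominal_ps, sigma_rel, seed=0):
    """Draw per-stage delays with Gaussian relative mismatch (seeded for repeatability)."""
    rng = random.Random(seed)
    return [nominal_ps * (1.0 + rng.gauss(0.0, sigma_rel)) for _ in range(n_stages)]


def dnl_inl_lsb(delays):
    """Endpoint-fit DNL and INL in LSB, with the LSB defined as the mean stage delay."""
    lsb = sum(delays) / len(delays)
    dnl = [d / lsb - 1.0 for d in delays]
    inl, acc = [], 0.0
    for step_error in dnl:
        acc += step_error          # INL is the running sum of the DNL
        inl.append(acc)
    return dnl, inl


delays = delay_chain_with_mismatch(64, 30.0, 0.03)   # assumed 3 % sigma
dnl, inl = dnl_inl_lsb(delays)
print("peak |DNL| [LSB]:", max(abs(v) for v in dnl))
print("peak |INL| [LSB]:", max(abs(v) for v in inl))
```

To reproduce a given INL shape instead, the list of stage delays can simply be supplied directly to `dnl_inl_lsb`.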

3.3.2. Accumulated Noise


The loop noise in the reference path mainly comes from the reference clock jitter and
the accumulated jitter of all the delay stages of the DTC and the Bang-Bang detector.
Since these noise sources see the same transfer function, all of
them can be referred to the noise of the reference clock. The reference clock used
in the lab has a noise floor in the range from -140 dBc/Hz to -135 dBc/Hz.
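
One simple way to lump these contributions into a single reference-referred jitter is a root-sum-square, sketched below. This assumes that the individual jitter sources are independent so that their variances add, which is an assumption introduced here and not stated in the text; the function name and all numbers are hypothetical.

```python
import math


def reference_referred_jitter(sigma_ref, stage_sigmas):
    """Root-sum-square of reference jitter and per-stage jitters (independence assumed)."""
    return math.sqrt(sigma_ref ** 2 + sum(s ** 2 for s in stage_sigmas))


# Hypothetical values in ps: 1.5 ps reference jitter and 16 active stages of 0.2 ps each.
print(reference_referred_jitter(1.5, [0.2] * 16))
```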

3.3.3. Supply Noise and Dirty Ground


The supply noise and ground bounce issue is rather complex and may introduce a
global delay variation that disturbs the loop. This phenomenon is
complex to model, but it does not prevent locking. A corresponding solution is proposed
in Chapter 4.

3.3.4. Metastability
Metastability could be an issue for the Bang-Bang phase detector. Its effect on the
result is negligible, because the metastability window is small
compared with the thermal noise in the loop. The transfer function
of the Bang-Bang detector is therefore hardly affected by it. Besides (as
will be explained later), in the retiming block CKV is resampled twice to create a
1.6 ns separation between the variable input (CKVG) to the Bang-Bang detector and the
resampled reference CKR. Thus there is sufficient time for the 1-bit quantizer's output
to settle to a well-defined level.

3.4. RF Modelling
3.4.1. DCO
The oscillator jitter part and wander part are modelled according to [2]. The
flicker noise part mainly impacts the integrated jitter but does not impact the
locking behaviour at all. Besides, considering that flicker noise is not a serious
problem with respect to the design specification, it is not included in the model. For
jitter noise, the standard deviation of the time domain Gaussian distributed jitter is

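$$\sigma_{\Delta t} = \frac{T_V}{2\pi}\sqrt{\mathcal{L}\cdot f_V} \tag{3.1}$$

where ℒ is the noise floor and $T_V$, $f_V$ are the period and frequency of the oscillator output, and for wander noise the standard deviation is

$$\sigma_{\Delta T} = \frac{\Delta f}{f_V}\sqrt{T_V}\,\sqrt{\mathcal{L}(\Delta f)} \tag{3.2}$$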

where Δ𝑓 represents the offset frequency from the output carrier. The noise is
added to the DCO output in time domain based on Gaussian distribution model.
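
The two standard deviations and their injection into the oscillator time stamps can be sketched as below. This is a minimal Python illustration following the jitter/wander split of [2], not the Verilog model used in the thesis. The function names are hypothetical, the 2.4 GHz output frequency and the -150 dBc/Hz noise floor are assumed values, and -105 dBc/Hz at 1 MHz offset is taken from the target in Table 2.6 purely as an example input. Jitter perturbs each edge independently, while wander accumulates from period to period.

```python
import math
import random


def jitter_sigma(noise_floor_dbc_hz, f_v):
    """Jitter std dev: T_V / (2 pi) * sqrt(L * f_V), with L converted to linear units."""
    l_lin = 10.0 ** (noise_floor_dbc_hz / 10.0)
    return (1.0 / f_v) / (2.0 * math.pi) * math.sqrt(l_lin * f_v)


def wander_sigma(pn_dbc_hz, offset_hz, f_v):
    """Wander std dev: (df / f_V) * sqrt(T_V) * sqrt(L(df))."""
    l_lin = 10.0 ** (pn_dbc_hz / 10.0)
    return (offset_hz / f_v) * math.sqrt(1.0 / f_v) * math.sqrt(l_lin)


def noisy_edges(n, f_v, sig_j, sig_w, seed=0):
    """Rising-edge time stamps with accumulating wander and non-accumulating jitter."""
    rng = random.Random(seed)
    t_v, ideal, walk, edges = 1.0 / f_v, 0.0, 0.0, []
    for _ in range(n):
        ideal += t_v
        walk += rng.gauss(0.0, sig_w)                         # random walk of the period
        edges.append(ideal + walk + rng.gauss(0.0, sig_j))    # white jitter on each edge
    return edges


f_v = 2.4e9
sig_j = jitter_sigma(-150.0, f_v)            # assumed noise floor
sig_w = wander_sigma(-105.0, 1e6, f_v)       # example: -105 dBc/Hz at 1 MHz offset
print(sig_j, sig_w)
print(noisy_edges(5, f_v, sig_j, sig_w))
```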

3.4.2. Feedback Divider


The feedback divider impacts the output phase noise in the same way as the
DCO does. Thus they share the same modelling methodology. The phase noise
profile could be easily obtained together with DCO from post-layout simulation.

3.5. Digital Flow


The digital flow is completed in two steps. The first is to translate the RTL code into a
transistor based netlist with the help of RTL Compiler. Then Encounter is used
for the back-end flow to generate the corresponding layout.
Timing information on gate delays can be extracted from the digital flow. This
can easily be added to the Verilog-based model for more accuracy without increasing
the simulation time significantly. The digital part mainly impacts the locking, while
it contributes almost nothing to the output phase noise spectrum.

3.6. Verification of the Model


The chip in [3] was measured during the thesis. Thus the model
based simulation results and the corresponding measurement results can be compared
for verification in terms of the worst case in-band fractional spurs and the phase
noise. The comparisons are as follows:
In terms of settling time (please refer to Figure 3.1 and Figure 3.2), not only
are the Bang-Bang behaviour and the type-II behaviour rebuilt, but the settling
time is also more or less the same, at 16 𝜇s. The deviation comes from the difference in
the initial phase offset between the reference clock and the variable output, which is
not known and cannot be rebuilt.
In terms of the phase noise and fractional spur plot (please refer to Figure 3.3
and Figure 3.4), FCW=38.0001 is used with a 32 MHz reference with a -120 dBc/Hz noise floor.
All the other parameters are also set up in the same way, except for the linearity, which
is modelled roughly based on the measured level. The noise floor of -82 dBc/Hz and the
peak at -76 dBc/Hz match the measured results. Besides, the spur positions
(resolution-induced at 111 kHz and nonlinearity fundamental at 3 kHz) are also rebuilt.
The fractional spur levels due to nonlinearity are hard to rebuild, but the
quantization-induced one matches the measurement result of -26 dBc/Hz, and the
worst case fractional spurs could easily reach the measured worst case when the INL
is increased. Thus the simulation method is verified to be reliable for estimation.

Figure 3.1: Measured settling behaviour.

Figure 3.2: Simulated settling behaviour (phase error at TDC output versus time in us, showing the Bang-Bang and type-II locking phases).



Figure 3.3: Measured phase noise plot.

[Plot: spectrum of the SSB phase noise at CKV [dBc/Hz] versus frequency [Hz] (FFT: len = 9290284, rbw = 300), simulated and theory curves. CKV clock: f = 2432.005859 MHz, integ PE = 10.07631 ps. Data markers: (3148 Hz, -53.52), (9845 Hz, -81.99), (1.106e+05 Hz, -76.02), (1.113e+05 Hz, -52.01).]

Figure 3.4: Simulated phase noise plot.



References
[1] R. B. Staszewski, C. Fernando, and P. T. Balsara, Event-driven simulation and
modeling of phase noise of an RF oscillator, Circuits and Systems I: Regular
Papers, IEEE Transactions on 52, 723 (2005).

[2] R. B. Staszewski and P. T. Balsara, All-digital frequency synthesizer in deep-submicron CMOS (John Wiley & Sons, 2006).

[3] V. Chillara, Y.-H. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot,
and R. Staszewski, 9.8 An 860𝜇W 2.1-to-2.7GHz all-digital PLL-based frequency
modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and
ZigBee) applications, in Solid-State Circuits Conference Digest of Technical Pa-
pers (ISSCC), 2014 IEEE International (2014) pp. 172–173.
4
ADPLL Implementation in
Transistor-Level

In this chapter, the complete design of an ultra-low-power ADPLL for BLE applications is presented in three parts. Section 4.1 presents the synthesizable low speed digital logic blocks based on the digital flow. Section 4.2 describes the mixed-signal blocks such as the DTC and the counter. The DCO and divider design, which belongs to the RF domain, is covered in Section 4.3.

4.1. Low Speed Digital Implementation


Figure 4.1 shows a simplified top-level block diagram of the low speed digital logic. The digital flow at the specification (RTL code) level is based on Verilog due to its compatibility with the NXP digital flow.
The low speed logic operates entirely in the resampled reference clock (CKR) domain. It contains a finite state machine, an interface to the TX data, a phase detector, and a loop filter for each of the DCO's three banks: PVT, acquisition, and tracking. The input frequency control word (FCW) is used to generate the required frequency according to Eq. 4.1.

$f = F_{REF} \cdot FCW$    (4.1)


[Block diagram: TX Interface, Phase Detector, FSM, DEM and one Loop Filter each for the PVT, AB and TB banks, all clocked by CKR. Signal names recovered from the extraction: TX data, SPI_FCW, FCW, PHE, PHV_I, PHE_F, BangBang output, OTW, OTW_data, mem_dco, alpha, rho, lambda, inv_Kdco, inv_Kdco_modn, inv_Ktdc, inv_Kdtc, Kv, DTCctrl_ext, DTC ctrl, dly_dco path, bank_sel, bank_en, modn_on, rotate_on, IIR_EN, CH_SW, readout, srst.]


Figure 4.1: Simplified top-level diagram of low speed digital



where f denotes the frequency of the divided feedback variable (high-frequency) signal, so that the ADPLL output frequency is 2f. The number of bits in the integer part of the FCW determines the upper limit of the output frequency 2f according to Eq. 4.1, while the fractional part determines the resolution of the output frequency. The available reference frequencies are 16 MHz and 32 MHz, and the targeted highest output frequency is at least 2.8 GHz. Thus 7 bits are allocated to the integer part. According to the modulation requirement, an output resolution of about 1 kHz should be available. Thus 16 bits are allocated to the fractional part of the FCW.
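
The bit allocation above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the FCW is a plain unsigned fixed-point word with 7 integer and 16 fractional bits referred to the divide-by-2 signal; the function names are illustrative and are not taken from the RTL.

```python
from fractions import Fraction

INT_BITS = 7    # integer bits of the FCW (from the text)
FRAC_BITS = 16  # fractional bits of the FCW (from the text)


def fcw_from_target(f_out_hz, fref_hz):
    """Return (integer part, fractional word) of the FCW nearest to a target.

    The FCW is referred to the divide-by-2 signal, so the target output
    frequency is halved before dividing by the reference frequency.
    """
    ratio = Fraction(f_out_hz) / 2 / fref_hz
    fixed = round(ratio * (1 << FRAC_BITS))
    fcw_int, fcw_frac = divmod(fixed, 1 << FRAC_BITS)
    if fcw_int >= (1 << INT_BITS):
        raise ValueError("target exceeds the range of the integer part")
    return fcw_int, fcw_frac


def output_frequency(fcw_int, fcw_frac, fref_hz):
    """Output frequency 2*f produced by a given FCW (Eq. 4.1 times two)."""
    return 2 * fref_hz * (fcw_int + Fraction(fcw_frac, 1 << FRAC_BITS))


def output_resolution(fref_hz):
    """Output frequency step corresponding to one fractional LSB."""
    return Fraction(2 * fref_hz, 1 << FRAC_BITS)


def max_output_frequency(fref_hz, int_bits=INT_BITS):
    """Upper output frequency limit set by the integer bits alone."""
    return 2 * fref_hz * ((1 << int_bits) - 1)


if __name__ == "__main__":
    for fref in (16_000_000, 32_000_000):
        print(fref, float(output_resolution(fref)), max_output_frequency(fref))
```

With the 16 MHz reference, 6 integer bits would limit the output to about 2.02 GHz, while 7 bits reach about 4.06 GHz, which covers the 2.8 GHz target. One fractional LSB corresponds to roughly 488 Hz (16 MHz) or 977 Hz (32 MHz) at the output, in line with the 1 kHz requirement. As a cross-check, a fractional word of 6 at FCW integer part 38 and a 32 MHz reference gives 2432.005859 MHz, which coincides with the CKV frequency quoted on the simulated plot of Figure 3.4.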
The state machine assists the switching between the three DCO banks by determining the order in which the banks are activated, and for how long, after the frequency search is triggered. The "tx interface" block receives the TX data and the FCW from the digital baseband and generates the input controls for the low-frequency and high-frequency paths to facilitate two-point frequency modulation. The phase detector has four functions: (1) Generate the integer phase error by comparing the outputs of the variable phase counter and the reference phase accumulator. (2) Predict the next variable phase by generating DTC control words from the accumulated fractional reference phase. (3) Encode the Bang-Bang output, since the Bang-Bang linear range is set to double the finest DTC step size. (4) Observe the locking behaviour by monitoring the integer phase error and the frequency error deviation. Based on (4), the loop can also be duty-cycled after lock-in, at the cost of a higher in-band noise floor since the equivalent reference clock frequency is reduced. A separate loop filter for each of the three banks then processes the phase error to generate the appropriate control words for the DCO. For the DTC control word, an LFSR-based binary-to-thermometer rotating encoder is implemented to relax the nonlinearity issue by means of DEM. The implementation details of each of these blocks are described briefly in the following sections.

4.1.1. Digital Phase Error Detection Logic


The phase detector implemented in this ADPLL is shown in Figure 4.2. A difference-mode architecture is adopted: the frequency information is first extracted from the phase to generate the frequency error, and the instantaneous frequency deviation is then accumulated to produce the phase error. In this way the instantaneous frequency deviation can be observed, and functions such as freezing or duty-cycling the PLL can be implemented to save power in the locked state. The difference of two successive counter samples is subtracted from the integer part of the FCW to generate the instantaneous integer part of the frequency deviation; once the loop locks, the integer part of the phase error is zero.

[Block diagram, signal names recovered from the extraction: PHV_I, dPHV_I, FCW_I, dPHR_I, dPHE_I, dPHE accum, PHE_I, PHE_F, PHE; registers clocked by ckr with nrst and srst resets.]

Figure 4.2: Integer path of differential mode based Phase error detector.

For the fractional part, a combination of a DTC and a Bang-Bang detector is used. As discussed in Chapter 2, the accumulated value of the fractional part of the FCW generates the DTC control words. The overflow of this accumulation is added to the integer FCW. Scaling by "inv_Kdtc" is required to convert the delay from a value normalized to the period of the divide-by-2 variable clock (ckvd2) into units of the DTC delay step. The value of "inv_Kdtc" is obtained from the kdtc calibration block. The total phase error is obtained by adding the output of the Bang-Bang phase detector to the integer part of the phase error. An external control word is provided so that the DTC can be tested on its own.
The Bang-Bang output, either +1 or −1, has to be scaled by a gain equal to the ratio of a fine DTC step to a ckvd2 period, since this is its linear detection range.
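
The integer path described above can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions: the counter is modelled as an unbounded integer (no wrap-around), the locked counter samples are idealized as floor(k × FCW), and all function names are hypothetical rather than taken from the RTL.

```python
def locked_counter_samples(fcw_int, fcw_frac, n_cycles, frac_bits=16):
    """Idealized counter samples of a perfectly locked loop: floor(k * FCW)."""
    scale = 1 << frac_bits
    fcw_fixed = fcw_int * scale + fcw_frac
    return [(k * fcw_fixed) // scale for k in range(n_cycles + 1)]


def integer_phase_error(counter_samples, fcw_int, fcw_frac, frac_bits=16):
    """Difference-mode integer path: frequency error first, then phase error."""
    scale = 1 << frac_bits
    frac_acc = 0   # accumulated fractional part of the FCW
    phe_int = 0    # accumulated integer phase error
    freq_err, phase_err = [], []
    for prev, curr in zip(counter_samples, counter_samples[1:]):
        frac_acc += fcw_frac
        carry, frac_acc = divmod(frac_acc, scale)  # overflow of the fraction
        d_phr = fcw_int + carry   # reference increment: integer FCW + overflow
        d_phv = curr - prev       # difference of two successive counter samples
        d_phe = d_phr - d_phv     # instantaneous integer frequency deviation
        phe_int += d_phe          # accumulate into the integer phase error
        freq_err.append(d_phe)
        phase_err.append(phe_int)
    return freq_err, phase_err


if __name__ == "__main__":
    samples = locked_counter_samples(38, 6, 20000)
    fe, pe = integer_phase_error(samples, 38, 6)
    print(max(map(abs, fe)), pe[-1])
```

In this idealized locked case both the frequency deviation and the accumulated integer phase error stay at zero, including the cycles in which the fractional accumulator overflows. A counter that gains one extra count per reference cycle shows up immediately as a constant frequency deviation of −1, which is the kind of observation the difference-mode architecture is meant to allow.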

4.1.2. TX Interface for Two-point Frequency Modulation


The counter-based ADPLL has a natural wideband FM capability, which can be realized as a two-point modulation scheme according to [1]. In Figure 4.3, one data path directly modulates the DCO, while a second path injects the data into the reference path as compensation, so that the phase error is not disturbed by the modulating data. According to Chapter 2, the former path (DCO path) has a high-pass characteristic while the latter (reference path) experiences low-pass filtering. Once both paths are combined, an all-pass transfer function is realized and the BLE data rate of 1 Mbps can be supported without restriction.


The implementation of the TX interface is shown in Figure 4.3. Its inputs are the 10-bit modulation data "TX data", the frequency control word FCW, and a DCO gain used for normalizing the modulation data.
On the compensating (reference) path, the TX data is added to the FCW directly. The 10-bit data corresponds to a frequency resolution finer than 1 kHz, which is more than enough. Because it is added to the FCW, the compensating word is referred to the reference frequency, which is PVT independent, so no normalization is necessary. On the DCO path, on the other hand, normalization is needed and is realized by scaling with inv_Kdco_modn. To meet the corresponding SNR, a DCO resolution finer than that of the tracking bank (50 kHz) may be required; this is implemented by introducing a 5-bit ΣΔ modulator, which achieves an equivalent sub-2 kHz resolution (50 kHz / 2^5 ≈ 1.56 kHz).

[Block diagram: a chain of eight D flip-flops clocked by CKR (reset nrst) delays TX data to form dly_dcopath; a Mux controlled by modn_on combines TX data with SPI_FCW to form FCW on the low frequency path; TX data scaled by inv_Kdco_modn forms OTW_data on the high frequency path.]

Figure 4.3: TX interface block for two-point modulation

4.1.3. 5-bit ΣΔ modulator


A 5-bit first-order ΣΔ modulator is used to reduce the quantization noise from the DCO. Figure 4.4 shows its block diagram. The dithering clock can be chosen as either the reference clock or a higher-rate divide-by-16 clock derived from the output of the variable counter. The modulator is basically a carry-propagate adder (CPA) architecture, and the dithering rate is chosen according to the required noise profile.
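
A first-order ΣΔ modulator of this kind can be sketched as an accumulator whose carry-out is the dithering bit. The model below is a minimal behavioural sketch, assuming a 5-bit fractional input word and a 1-bit output; it does not model the clock selection, and the names are illustrative only.

```python
def first_order_sigma_delta(frac_word, n_cycles, frac_bits=5):
    """Return the 1-bit carry stream of an accumulator-based first-order modulator."""
    modulus = 1 << frac_bits
    if not 0 <= frac_word < modulus:
        raise ValueError("fractional word out of range")
    acc = 0
    bits = []
    for _ in range(n_cycles):
        acc += frac_word
        carry, acc = divmod(acc, modulus)  # carry-out of the adder is the output
        bits.append(carry)
    return bits


def effective_step_hz(tracking_step_hz, frac_bits=5):
    """Average frequency step obtained by dithering one tracking-bank unit."""
    return tracking_step_hz / (1 << frac_bits)


if __name__ == "__main__":
    stream = first_order_sigma_delta(5, 32)
    print(stream, sum(stream) / len(stream), effective_step_hz(50e3))
```

Over one full period of 32 cycles the average of the carry stream equals frac_word / 32, so the dithered tracking unit contributes, on average, a fraction of its 50 kHz step. With 5 bits this corresponds to an average step of 1562.5 Hz.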

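[Block diagram: the fractional word frac is accumulated in a register (D/Q, reset nrst) enabled by sigdelen; out feeds back to the adder and the carry forms OTW_f; the clock is selected between ckr and ckvd by clk_sel.]

Figure 4.4: 1st order ΣΔ modulator.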

4.1.4. Dynamic Element Matching


Linear, nonlinear and random mismatch can always be observed in integrated circuits, due to linear gradients or geometric uncertainties introduced during fabrication. To relax the fractional spurs caused by mismatch in the DTC delay chain, dynamic element matching (DEM) is introduced. DEM is a dynamic process that reduces the effects of component mismatches in electronic circuits by dynamically rearranging the interconnections of the mismatched components, so that the time averages of the equivalent components at each of the component positions are equal or nearly equal. By appropriately varying the virtual positions of the mismatched components, the effects of the mismatch can be reduced, eliminated, or frequency shifted.
Compared to the DTC designed in [2], the thermometer-code-based control logic used in this design (covered in Section 4.2) gives every unit delay cell the same weight in the final output signal. DEM techniques can therefore be applied easily to the control logic of the DTC. A linear feedback shift register (LFSR) based encoder-rotator is used here for simplicity. As shown in Figure 4.5, the encoder-rotator contains a 4-bit Galois LFSR which generates a pseudo-random number used as a rotation index. Once the rotation control logic is turned on, the binary-to-thermometer encoder for the DTC control words rotates its thermometer output according to the rotation index from the LFSR. An example is shown in Figure 4.6: the rotation index is 3, so the encoded thermometer control code is right-shifted by 3 bits. The fractional spur levels with and without rotation are compared in Figure 4.7.
The DEM implemented here pseudo-randomly selects among the possible usage patterns of the DTC in each reference period, so that the error arising from unit delay element mismatch is scrambled from cycle to cycle. DEM thus improves the DTC linearity at the expense of more in-band noise in the output spectrum of the ADPLL. However, the additional noise introduced by DEM is not a serious issue, especially when the fractional spur requirement is the more important one under certain standards.
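
The encoder-rotator can be sketched in a few lines. The sketch below makes several assumptions that are not stated in the text: the LFSR feedback polynomial is taken as x^4 + x^3 + 1, the thermometer word is 16 cells wide, and the rotation is circular. All function names are illustrative.

```python
def galois_lfsr4(state):
    """Advance a 4-bit Galois LFSR by one step (assumed taps: x^4 + x^3 + 1)."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0b1100
    return state


def thermometer(code, n_cells):
    """Binary-to-thermometer encoding: 'code' ones followed by zeros."""
    return [1] * code + [0] * (n_cells - code)


def rotate_right(bits, index):
    """Circularly shift the thermometer word to the right by 'index' cells."""
    shift = index % len(bits)
    if shift == 0:
        return list(bits)
    return bits[-shift:] + bits[:-shift]


def dem_sequence(code, n_cells, n_cycles, seed=0b0001):
    """Thermometer patterns used over successive reference cycles."""
    state = seed
    patterns = []
    for _ in range(n_cycles):
        patterns.append(rotate_right(thermometer(code, n_cells), state))
        state = galois_lfsr4(state)
    return patterns


if __name__ == "__main__":
    print(rotate_right(thermometer(5, 16), 3))
    for pattern in dem_sequence(5, 16, 4):
        print(pattern)
```

Every rotated pattern turns on the same number of cells, so the nominal delay is unchanged while the set of cells that contribute to it changes from cycle to cycle. With the assumed polynomial the LFSR cycles through all 15 non-zero states, so a rotation index of 0 never occurs in this sketch.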

[Schematic: four D flip-flops in a chain sharing the clock CLK.]

Figure 4.5: 4-bit Galois LFSR.

Figure 4.6: Principle of LFSR based DEM.



[Plot "DEM result": spectrum of the SSB phase noise at CKV [dBc/Hz] versus frequency [Hz], 10^2 to 10^9, with DEM and with no DEM.]

Figure 4.7: Simulated phase noise showing DEM effect.

[Plot "DEM improvement": improvement in dB (axis from 2 to 6 dB) versus frequency [Hz], 0 to 3 x 10^5.]

Figure 4.8: Simulated results showing reduced spurs.



The effect of DEM could be observed from Figure 4.7 and Figure 4.8. The INL
of the example in this simulation is 3 LSB.

4.1.5. Loop Filter

Figure 4.9 shows the top-level view of the digital loop filter. For reasons of area and accuracy, the DCO's capacitor array is implemented in a coarse-fine way as three segmented parts: the PVT, acquisition and tracking banks, as done in [1]. The former two mainly assist the frequency acquisition with large frequency steps, while the last one determines the frequency accuracy of the DCO and hence of the PLL output signal. Different filters are therefore used to generate the control words for the three banks from the input phase error bits (PHE). As mentioned in the previous chapter, a type-I loop consists of only a proportional path; it offers large bandwidth and fast locking at the cost of less filtering (20 dB/decade out-of-band attenuation) and is therefore used for the PVT and acquisition banks. With an additional integral path, a type-II loop filters the DCO noise with 40 dB/decade attenuation outside the bandwidth, at the cost of narrower bandwidth and slower locking. In addition, a switchable IIR filter is implemented as a potential solution for cases where less noise is preferred. The control blocks generate the zero-phase restart signals at mode switch-over and freeze the tuning word once the DCO mode is changed. They offer three features:
1. Freeze the corresponding oscillator control word, which is the input of the related capacitor bank. This is done through the bank selection indicator word bank_sel, which sets the operation mode currently chosen.
2. Generate the zero-phase restart signal. This is done by an OR gate fed by the output control words to the DCO and a preset signal that is "1" under a certain operation mode. Whenever bank_sel changes, it triggers a D flip-flop to generate a negative pulse one reference clock period wide. This negative pulse clears the contents of the phase error accumulator, so its output is reset to 0.
3. Open-loop test. When the bank signal is 0, the control word supplied externally through the SPI is selected instead of the result of filtering the PHE, so the open-loop performance of the DCO can be tested.
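
The difference between the type-I and type-II filters can be illustrated with a small behavioural model. This is a sketch only: the gain names alpha, rho and lambda follow the labels visible in Figure 4.1, but the placement of the IIR stage ahead of both paths, the single-pole IIR form and the numeric gain values are assumptions made for illustration.

```python
class LoopFilter:
    """Proportional (type-I) or proportional-integral (type-II) loop filter."""

    def __init__(self, alpha, rho=0.0, iir_lambda=None):
        self.alpha = alpha            # proportional gain
        self.rho = rho                # integral gain; 0 gives a type-I filter
        self.iir_lambda = iir_lambda  # optional single-pole IIR coefficient
        self.iir_state = 0.0
        self.integ = 0.0

    def step(self, phe):
        """Process one phase error sample and return the tuning word update."""
        x = phe
        if self.iir_lambda is not None:
            # y[k] = (1 - lambda) * y[k-1] + lambda * x[k]
            self.iir_state += self.iir_lambda * (x - self.iir_state)
            x = self.iir_state
        self.integ += self.rho * x    # integral path
        return self.alpha * x + self.integ


if __name__ == "__main__":
    type1 = LoopFilter(alpha=2 ** -5)
    type2 = LoopFilter(alpha=2 ** -5, rho=2 ** -10)
    for _ in range(4):
        print(type1.step(8), type2.step(8))
```

For a constant phase error the type-I output stays constant, while the type-II output keeps ramping because of the integral path. This is the behaviour that lets the type-II loop remove a static frequency offset, at the cost of the slower locking mentioned above.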

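[Block diagram: the phase detector output drives three loop filters. Loop Filter PVT (Type I): PROP, Gain norm, CTRL with Bank_sel, output OTW_P. Loop Filter AB (Type I): PROP, Gain norm, CTRL with Bank_sel, output OTW_A. Loop Filter TB (Type II): PROP and INT paths, Gain norm, IIR filter, CTRL with Bank_sel, output OTW_T.]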

Figure 4.9: Top-level view of digital loop filter.

4.2. Mixed Signal Implementation


Situated in the feedback path, the mixed-signal blocks (the DTC, the Bang-Bang phase detector and the counter) together act as the phase detector, one of the three classic ADPLL blocks in Figure 1.2. Through it, the error in the phase domain is digitized and sent to the digital loop filter for processing. If designing an ADPLL is compared to riding a horse, the phase detector is analogous to the harness: it steers the output signal (the horse's movement) to track the input control words (the rider's intent).

4.2.1. Coarse-Fine DTC Design


A digital-to-time converter (DTC) performs the opposite operation of a TDC: it has an n-bit digital input and produces a continuous-time digital signal in which the positions of the edges are determined by the digital control signal. Briefly speaking, it is a digitally controlled delay line. The DTC appears less frequently than the TDC in the ADPLL literature, but it is the most essential component in the phase detector of this design according to Figure 2.17. Delay lines in general (both voltage controlled and digitally controlled) are very common and can be found everywhere: from delay-line-based TDCs [3] to delay-locked loops (DLLs), and from VCOs (ring oscillators) to pulse width modulation (PWM) [4]. This simple but meaningful block is a cornerstone of many systems.
For a delay-line-based DTC design, both the system-level (control logic) and transistor-level (unit cell) concepts are discussed as follows:

a. Architecture of DTC
The DTC's resolution, complexity and power are mostly determined at the system level. The DTC in this design must cover more than the required range (about 800 ps, corresponding to roughly 1.2 GHz, the divide-by-2 of the output) with as fine a step size and as little power as possible. Following the goal set in Chapter 2, the DTC is required to cover more than 800 ps at a resolution of around 2 ps (the reference thermal noise level) with a power below 40 μW. Moreover, tolerable linearity has to be ensured to keep the worst-case fractional spur low, according to the analysis in Chapter 2.
However, none of the existing popular DTC structures implemented in DPLLs meets these needs at low power. In [2], the resolution is as coarse as 15 ps and the control logic is not simple either. The low power of that DTC is achieved by using minimum-size inverters, at the cost of intolerable nonlinearity. Moreover, the control logic dictates that the delay is always generated from the same direction along the delay chain. In certain channels, therefore, some cells are working much more often than others. This is undesirable because a certain mismatch pattern is visited periodically and thus introduces strong spurious tones at the output. In [5], the resolution is good, but the structure is complex: it needs a huge number of MOS capacitors and exhibits strong nonlinearity without look-up-table (LUT) based calibration. Besides, due to the shunt-capacitor-based delay unit, the power consumed by that structure is not low. In other publications, such as [6], the DTC resolution is not fine enough and the power of 200 μW is too high compared with the proposed target.
In this design, all the considerations of resolution, linearity and power that apply to a conventional TDC design are shifted to the DTC, as covered in Chapter 2. A sub-gate-delay DTC resolution, however, seems hard to implement. Techniques such as passive delay lines [7] and resistive interpolation [8] can offer such a fine resolution, but they rely heavily on passive or analog components and are therefore neither area economical nor digitally compatible. A delay-line-based DTC should still be the solution. Delay lines can generally be divided into two categories, absolute delay and relative delay, in terms of the timing relation between input and output. Absolute delay means

$\phi_{out} = \phi_{in} + n \cdot t_{res}$    (4.2)

where $\phi_{in}$ and $\phi_{out}$ denote the phases of the input and output signals, n is the digital delay control word and $t_{res}$ is the resolution of the delay line. Single delay line or pseudo-differential delay line based TDCs [3] and the common multiplexer-based DTC (shown in Figure 4.10) all belong to this category. Sometimes, however, the relative delay between input and output phase is of more interest, and the working principle of a relative-delay-based delay line is

$\phi_{out} + n t_0 = \phi_{in} + n t_1$
$d = t_1 - t_0$    (4.3)

where the equivalent resolution d is realized by the difference between two delay states ($t_0$ and $t_1$) of the same delay unit. To achieve this relative delay in a TDC, a so-called Vernier delay line consisting of at least two delay lines is required. The mismatch between the two delay lines then becomes a serious problem once the resolution reaches the picosecond range. Besides, a fine resolution realized by such a structure implies a very long delay line, which results in large power consumption as well as large mismatch. In [9], a coarse-fine TDC is proposed. In terms of noise and spurs it is well designed. However, due to the large time amplifier array and the complex interface between the coarse and fine TDCs, its power is prohibitively high for the BLE application.

[Diagram: the reference (ref) drives a delay line whose output is selected by a multiplexer.]

Figure 4.10: Delay line DTC based on a large multiplexer.

In this design, the DTC replaces the TDC for tracking the fractional phase change, and a Bang-Bang detector is used for the error detection. The equivalent resolution of this structure is determined by the DTC step size. Unlike in a TDC, with a DTC the relative (Vernier) delay idea can be realized with a single delay line. In this phase prediction scheme, any fixed delay between the reference edge and the delayed reference edge can be regarded as an initial phase offset, which does not affect the locking of the loop. A relative delay can therefore be tolerated in place of an absolute delay, and sub-gate-delay resolution is already possible with a single delay line, as shown in Figure 4.11. Assume that every delay cell has two states: when its control signal is low the delay is $t_0$, and when its control signal is high the delay is $t_1$. When no delay is requested, there is a fixed delay of $n t_0$, and each control bit set to "1" adds a step of $(t_1 - t_0)$. The equivalent resolution of the DTC is then only $t_1 - t_0$ and can be made arbitrarily fine as long as the mismatch is tolerable. However, covering the required range with a single fine delay line would take more than 400 stages (800 ps / 2 ps), which is not feasible in a low-power regime. A coarse-fine structure is therefore introduced to reduce the power consumption. The required number of stages should be reduced as far as possible in view of accumulated jitter, mismatch and power. Finally, 16 stages of coarse DTC with 60 ps resolution and 32 stages of fine DTC with 2 ps resolution are implemented in the tape-out, as an optimized option according to simulation. In this design, only the ratio between the coarse step and the fine step needs to be taken care of. Power-hungry blocks such as time amplifiers are avoided, in contrast to the coarse-fine TDC implementation [9]. The ratio calibration can be implemented simply with the LMS algorithm as discussed in [10].
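
The single-line Vernier delay and the coarse-fine split can be illustrated numerically. The sketch below uses the nominal stage counts and step sizes quoted above; the greedy split into coarse and fine codes and the example cell delays of 20 ps and 22 ps are assumptions for illustration and are not taken from the actual control logic.

```python
import math

COARSE_STAGES, COARSE_STEP_PS = 16, 60  # nominal values from the text
FINE_STAGES, FINE_STEP_PS = 32, 2       # nominal values from the text


def vernier_delay(bits, t0, t1):
    """Delay of a single line: n*t0 fixed, plus (t1 - t0) per control bit set."""
    return len(bits) * t0 + sum(bits) * (t1 - t0)


def split_delay(delay_ps):
    """Greedy split of a requested relative delay into (coarse, fine) codes."""
    full_range = COARSE_STAGES * COARSE_STEP_PS + FINE_STAGES * FINE_STEP_PS
    if not 0 <= delay_ps <= full_range:
        raise ValueError("requested delay outside the DTC range")
    coarse = min(int(delay_ps // COARSE_STEP_PS), COARSE_STAGES)
    residue = delay_ps - coarse * COARSE_STEP_PS
    fine = round(residue / FINE_STEP_PS)
    return coarse, fine


def programmed_delay(coarse, fine):
    """Relative delay produced by a pair of codes with ideal (matched) cells."""
    return coarse * COARSE_STEP_PS + fine * FINE_STEP_PS


if __name__ == "__main__":
    print(vernier_delay([1, 1, 0, 0], t0=20, t1=22))  # fixed 80 ps plus 2 steps
    print(split_delay(500), programmed_delay(*split_delay(500)))
    print(math.ceil(800 / FINE_STEP_PS), COARSE_STAGES + FINE_STAGES)
```

With these nominal numbers the coarse and fine banks together span 1024 ps, which exceeds the 800 ps requirement, using 48 stages instead of the 400 stages a single 2 ps line would need.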

[Diagram: a single delay line from X to Y whose cells are controlled by the bits d1, d2, ..., dn.]

$t_{delay} = n t_0 + \sum_{i=1}^{n} d_i (t_1 - t_0)$

Figure 4.11: Simplified concept of Vernier DTC

b. Unit Cell of DTC


A cascaded-inverter-based delay line is preferred in this design for its low power, low complexity and small area. One important property of an inverter is its propagation delay $\tau$, which is determined by the drive strength of the inverter, represented by the equivalent drive resistance $R_{eq}$, and by the load capacitance $C_L$ seen at the output. In general $\tau$ can be approximated by

$\tau = \ln(2) R_{eq} C_L$    (4.4)

where ln(x) is the natural logarithm of x. $C_L$ consists of the self-load, the input capacitance of any block driven by the inverter, and the capacitance of the interconnect wires. $R_{eq}$ depends on the size of the transistors in the inverter and is different for NMOS and PMOS. For a transistor used with a full-swing signal, $R_{eq}$ is given by

$R_{eq} = \frac{3}{4} \frac{V_{DD} (1 - 7 \lambda V_{DD} / 9)}{\beta ((V_{DD} - V_T) V_{DSAT} - V_{DSAT}^2 / 2)}$    (4.5)

where $\beta = k W / L$, and W and L are the transistor's width and length respectively. The parameters k, $V_{DSAT}$ and $\lambda$ and the threshold voltage $V_T$ are technology-dependent constants. From this analysis, $R_{eq}$ is proportional to L/W, and within a delay chain consisting of identical unit cells $C_L$ is approximately proportional to WL. Thus $\tau$ is proportional to $L^2$ and, to first order, independent of W.
Based on this conclusion, delay unit cells can be divided into two main categories: current-starved and shunt-capacitor based. The basic idea of the shunt-capacitor cell is to change $C_L$ by shunting a switchable capacitor at the output of an inverter, as shown in Figure 4.12. The current-starved technique, on the other hand, manipulates the (dis)charging current of the load capacitor. As shown in Figure 4.15, it effectively changes $R_{eq}$ by changing the W/L ratio. The DTC in an ADPLL mainly consumes dynamic power, which is given by [11]

$P = C_L V_{DD}^2 f$    (4.6)

where f is the gate toggling frequency. This equation implies that generating a large delay by increasing the load capacitance increases the power consumption. The current-starved delay cell is therefore chosen for the coarse stage, because it is not wise, in terms of power, to cover the whole ckvd2 range by enlarging $C_L$ with shunt-capacitor unit cells. The current-starved structure also features a wide tuning range. The capacitor-based delay cell is more suitable for the fine stage in terms of linearity. Thanks to the single-line Vernier concept, the resolution of this coarse-fine DTC is free from the impact of a low supply voltage and is well suited to low-power, low-supply applications. (The resolution does not depend on the gate delay and is thus independent of the supply level.)

c. Mismatch for DTC Unit Cell Design


Nonidealities in the CMOS production process cause variations in the properties of the transistors. Usually, global and local variations are considered separately. Global variations affect all transistors on one chip in the same way, but may differ on another sample of the same chip. They are therefore also called inter-die variations. Their effect can be reduced by correct DTC gain estimation and is not of interest here.
Local variations (also known as intra-die variations), however, change the properties of each transistor individually. They are different for every transistor, which leads to mismatch between components that were intended to be identical. This is the main source of the nonlinearity discussed in Chapter 2 as a main cause of fractional spurs.
Local transistor variations mainly appear as variations in the threshold voltage $V_T$ and the current factor $\beta$. According to Pelgrom's law, both variations can be assumed to have a Gaussian distribution with zero mean and the standard deviations

$\sigma_{V_T} = \frac{A_{V_T}}{\sqrt{WL}}$    (4.7)

and

$\frac{\sigma_\beta}{\beta} = \frac{A_\beta}{\sqrt{WL}}$    (4.8)

where $A_{V_T}$ and $A_\beta$ are technology constants and $\beta$ is the nominal value. Although this formula deviates from reality, especially in the 40 nm technology node, it can still be used for qualitative estimation during design. From Eq. 4.4, under the assumption that the variations are small and uncorrelated, $\tau$ can be approximated by a Gaussian distribution with a mean equal to the nominal $\tau$ and a standard deviation

$\sigma_\tau = \sqrt{\left(\frac{\partial \tau}{\partial V_T} \sigma_{V_T}\right)^2 + \left(\frac{\partial \tau}{\partial \beta} \sigma_\beta\right)^2}$    (4.9)

with both partial derivatives evaluated at the nominal values of $V_T$ and $\beta$. It follows that

$\frac{\sigma_\tau}{\tau} = \frac{A_\tau}{\sqrt{WL}}$    (4.10)

where

$A_\tau = \sqrt{\frac{A_{V_T}^2}{(V_{DD} - V_T - V_{DSAT}/2)^2} + A_\beta^2}$    (4.11)

The detailed derivation is given in the appendix. To first order this has the form of Pelgrom's law, except that $V_{DSAT}$ is proportional to L and is therefore not a constant. However, as L decreases in modern technologies, the contribution from $V_{DSAT}$ becomes small and even negligible in 40 nm. The important conclusion from Eq. 4.10 is that W should be enlarged to reduce mismatch while L should be kept small, since $\sigma_\tau$ is proportional to $L\sqrt{L}$.
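
The appendix derivation is not reproduced in this section. As a rough sketch of how Eq. 4.10 and Eq. 4.11 can follow from Eq. 4.4 and Eq. 4.5, assume that only the threshold voltage $V_T$ and the current factor $\beta$ vary, while the load capacitance $C_L$, the supply voltage $V_{DD}$, the saturation voltage $V_{DSAT}$ and $\lambda$ are fixed. The symbol names and the constant K are introduced here only for this illustration.

```latex
% Eq. 4.4 with Eq. 4.5 inserted, written in terms of V_T and \beta only
\tau = \frac{K}{\beta\, V_{DSAT}\,\bigl(V_{DD} - V_T - V_{DSAT}/2\bigr)},
\qquad
K = \tfrac{3}{4}\ln(2)\, C_L V_{DD}\bigl(1 - 7\lambda V_{DD}/9\bigr)

% Relative sensitivities at the nominal operating point
\frac{1}{\tau}\frac{\partial \tau}{\partial V_T}
  = \frac{1}{V_{DD} - V_T - V_{DSAT}/2},
\qquad
\frac{1}{\tau}\frac{\partial \tau}{\partial \beta} = -\frac{1}{\beta}

% Insert into Eq. 4.9 and apply Pelgrom's law, Eq. 4.7 and Eq. 4.8
\frac{\sigma_\tau}{\tau}
  = \sqrt{\frac{\sigma_{V_T}^2}{\bigl(V_{DD} - V_T - V_{DSAT}/2\bigr)^2}
          + \frac{\sigma_\beta^2}{\beta^2}}
  = \frac{1}{\sqrt{WL}}
    \sqrt{\frac{A_{V_T}^2}{\bigl(V_{DD} - V_T - V_{DSAT}/2\bigr)^2} + A_\beta^2}
```

Since $\tau$ scales with $L^2$ and the relative spread scales with $1/\sqrt{WL}$, the absolute spread $\sigma_\tau$ scales with $L^{3/2}/\sqrt{W}$, which is the basis for keeping L small and enlarging W.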
Consider an N-cell delay line in which all cells are nominally identical, with mean delay $\tau$ and standard deviation $\sigma_\tau$. It is then easy to show that the largest standard deviation occurs at the end of the delay line and is

$\sigma_{\tau,n} = \sigma_\tau \sqrt{n}$    (4.12)

where n is the number of delay cells turned on; n = N (all cells turned on) gives the largest error. With $\tau_i$ the delay of the i-th cell, the differential nonlinearity (DNL) is defined as

$DNL_i = \tau_i - \tau$    (4.13)

and the integral nonlinearity (INL) as

$INL_n = \sum_{i=1}^{n} \tau_i - n\tau = \sum_{i=1}^{n} DNL_i$    (4.14)

Thus $INL_n$ has zero mean and a standard deviation proportional to $\sigma_\tau \sqrt{n}$. This analysis is a useful guideline for transistor sizing.
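
The square-root growth of the accumulated error can be checked with a small Monte Carlo experiment. The sketch below assumes independent Gaussian cell errors; the value of sigma, the number of trials and the seed are arbitrary illustration values and are not taken from the design.

```python
import math
import random
import statistics


def accumulated_error_std(n_cells, sigma, trials=4000, seed=1):
    """Estimate the standard deviation of the summed delay error of n cells."""
    rng = random.Random(seed)
    end_errors = []
    for _ in range(trials):
        # Sum of the individual cell deviations, i.e. the INL at the line end
        end_errors.append(sum(rng.gauss(0.0, sigma) for _ in range(n_cells)))
    return statistics.pstdev(end_errors)


if __name__ == "__main__":
    sigma = 0.05  # assumed per-cell standard deviation, in arbitrary units
    for n in (8, 32):
        print(n, accumulated_error_std(n, sigma), sigma * math.sqrt(n))
```

The estimate follows the prediction of the square-root law closely, and quadrupling the number of active cells roughly doubles the spread of the accumulated error.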

4.2.2. Fine Bank of DTC


The unit cell is implemented as shown in Figure 4.12. Thirty-two of these shunt-capacitor-based unit cells form the fine bank of the DTC, with a resolution of around 2 ps to ensure that the Bang-Bang phase detector works in the noise regime. According to Eq. 4.10, L is kept at its minimum while W is enlarged to relax the mismatch. The layout is done in an S shape to reduce the mismatch introduced by linear process gradients, as shown in Figure 4.21. The width is enlarged until a satisfactory mismatch is obtained (with the worst fractional spur estimated to be lower than -40 dB), as shown in Figure 4.13.

[Schematic: inverter stages supplied from VDD with a shunt capacitor switched by Vctrl.]

Figure 4.12: Unit cell of fine DTC.

Figure 4.13: Monte Carlo simulation of fine cell.



[Histogram of the coarse cell resolution (resolution_sf) from the Monte Carlo run: mu = 58.2435 ps, sd = 1.21292 ps, npass = 200, nfail = 0; x-axis from 54 ps to 61 ps.]

Figure 4.14: Monte Carlo simulation of coarse cell.

4.2.3. Coarse Bank of DTC


As shown in Figure 4.15, 16 stages of current-starved unit cells form the coarse bank of the DTC, with a resolution of around 58 ps to cover the required ckvd2 range. To obtain a large delay step at low power, two control signals are added. One is applied at the NMOS of the first stage for pulling the signal down, and the other is applied at the starving PMOS of the second stage for pulling up. Together they mainly affect the rising edge, since the timing information carried by the falling edge of the reference clock is not of interest.
The layout is done in an S shape to reduce the mismatch introduced by linear process gradients, as shown in Figure 4.21. The mismatch based on Monte Carlo simulation is shown in Figure 4.14.
To get a feeling for the INL and DNL to be expected in practice, the mismatch extracted from the Monte Carlo simulation is modelled in the Verilog delay line. The resulting transfer function of the coarse-fine DTC is shown in Figure 4.17, and the INL and DNL in Figure 4.18. Note that the INL is based on the best-fit straight line method.
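
The best-fit straight line method can be sketched as follows. The example models a single thermometer-coded bank whose step sizes are drawn from the coarse-cell Monte Carlo statistics of Figure 4.14 (mean 58.2435 ps, standard deviation 1.21292 ps); treating the steps as independent Gaussian values and the choice of seed are assumptions, and the function names are illustrative. It is written in Python rather than Verilog for brevity and does not reproduce the actual model.

```python
import random


def best_fit_line(ys):
    """Least-squares slope and offset of ys against the codes 0..len(ys)-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dnl_inl(delays):
    """DNL and INL in LSB, with the LSB taken as the best-fit slope."""
    slope, offset = best_fit_line(delays)
    dnl = [(b - a) / slope - 1 for a, b in zip(delays, delays[1:])]
    inl = [(d - (slope * k + offset)) / slope for k, d in enumerate(delays)]
    return dnl, inl


def transfer_function(steps):
    """Cumulative delay for codes 0..len(steps) of a thermometer-coded line."""
    delays = [0.0]
    for step in steps:
        delays.append(delays[-1] + step)
    return delays


if __name__ == "__main__":
    rng = random.Random(0)
    steps = [rng.gauss(58.2435, 1.21292) for _ in range(16)]
    dnl, inl = dnl_inl(transfer_function(steps))
    print(max(abs(v) for v in dnl), max(abs(v) for v in inl))
```

Because the reference line is fitted to the measured transfer function, the INL values sum to zero over all codes, which generally yields a smaller peak INL than an end-point reference line would.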
To keep the loop bandwidth stable (it is mainly determined by the reference jitter), the jitter accumulated along the delay chain has to be rather low, and the

[Schematic: two stages supplied from VDD, with a current-starving transistor controlled by VctrlN and a current-starving transistor controlled by VctrlP.]

Figure 4.15: Unit cell of coarse DTC.

Figure 4.16: Simulated DNL in coarse band due to layout.



[Plot "DTC transfer function": delay (0 to 1200) versus digital control code (0 to 600), with mismatch and ideal.]

Figure 4.17: Transfer function of DTC including mismatch.

[Plot "DNL & INL": INL and DNL in LSB (axis from -1 to 2) versus digital control code (0 to 600).]

Figure 4.18: Linearity of DTC including mismatch.


simulated result is shown in Figure 4.19: the integrated rms jitter is 156 fs, and the phase noise is also far below the reference level (-140 dBc/Hz).

[Two plots. Left: integrated jitter from the DTC delay chain, jitter [s] versus frequency [Hz] (10^4 to 10^8), rms value = 156 fs. Right: phase noise of the DTC delay chain, noise [dBc/Hz] (axis from -166 to -150) versus frequency [Hz] (10^4 to 10^8).]

Figure 4.19: (1) JEE of DTC delay chain. (2) Phase noise of DTC delay chain.

The power of the whole DTC is only 20 μW, which is rather low. The INL can be as large as 4 ps.

4.2.4. Input Buffer of DTC


The reference clock is fed externally to the DTC via a buffer. A square-wave reference clock is available in the lab, so a DC-coupled buffer is used for simplicity. The phase noise of the FREF clock should be kept as low as possible since it contributes to the output noise floor. The phase noise contribution of the input reference buffer is mainly the flicker noise of its very first stage, so transistors with large length should be used there. The next stage ensures a rail-to-rail output to drive the long line from the pad to the DTC, where the signal is buffered again. The structure is basically a cascaded two-stage buffer, and the phase noise accumulated from the input buffer together with the DTC delay chain is simulated. According to the simulation, the phase noise stays below the reference noise floor, so the bandwidth and the Bang-Bang transfer function are still dominated by the reference noise, as shown in Figure 4.20.

4.2.5. Summary of the proposed DTC


The total power consumption of the coarse-fine DTC is less than 20 µW with tolerable linearity. The layout is drawn in an S shape rather than a straight line to reduce the impact of linear gradient mismatch, as shown in Figure 4.21.

Figure 4.20: Input buffer phase noise.

In this design, the fractional phase error is tracked by a two-stage DTC instead of a complex TDC, achieving good resolution and linearity at low power. The Bang-Bang phase detector mainly provides the feedback information. Compared with the coarse-fine TDC proposed in [12], the DTC-based phase detector benefits directly from phase prediction.

4.2.6. Supply Noise Monitor of DTC


One remaining concern is the impact of supply noise on this single-delay-line DTC. A noisy ground or supply makes the nominally fixed delay of the delay line fluctuate with the noise, which modulates the input of the Bang-Bang phase detector. For low power, a simple solution is to introduce a dummy delay line as a first-order noise "monitor". As long as the dummy line in the CKV path shares the same ground and supply as the DTC, the global bouncing effect is largely cancelled. This is shown in Figure 4.22.
However, the high-frequency clock cannot be fed directly into the dummy delay line, otherwise the power would be extremely high, so special care has to be taken here. The high-frequency variable clock CKV is first clock gated according to the reference edge, as shown in Figure 4.23 (CKV has to go through the dummy path before the time freezer).

Figure 4.21: DTC top-level layout.

Figure 4.22: Supply noise monitor concept for single delay line based DTC.

Figure 4.23: Gated CKV generation.

The gated clock "ckv_gated" is then fed into the replica supply noise monitor mentioned above, as shown in Figure 4.24. In this way a supply noise monitor is implemented with a clock-gated replica delay path at low power. Since only one OR gate toggles at the high CKV frequency, the power of this structure is very low (the delay line still toggles at the reference clock rate). In short, the potential supply-noise-induced spur is suppressed to first order.
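Why a replica path cancels supply noise only to first order can be seen from a very simple delay model. The sketch below assumes that each path delay scales linearly with the supply error; the function names, the 400 ps delay, the sensitivity of -1 per volt and the 10 mV supply error are all hypothetical values chosen for illustration.

```python
def path_delay(nominal_delay, supply_sensitivity, supply_error):
    """First-order delay model: the delay scales linearly with the supply error."""
    return nominal_delay * (1.0 + supply_sensitivity * supply_error)


def detector_input_skew(dtc_delay, replica_delay, sensitivity, supply_error,
                        t_fref=0.0, t_ckv=0.0):
    """Time difference seen by the phase detector: delayed FREF minus delayed CKV."""
    fref_edge = t_fref + path_delay(dtc_delay, sensitivity, supply_error)
    ckv_edge = t_ckv + path_delay(replica_delay, sensitivity, supply_error)
    return fref_edge - ckv_edge


DTC_DELAY = 400e-12   # hypothetical DTC delay
SENSITIVITY = -1.0    # hypothetical relative delay change per volt
for dv in (0.0, 0.01):
    no_replica = detector_input_skew(DTC_DELAY, 0.0, SENSITIVITY, dv)
    matched = detector_input_skew(DTC_DELAY, DTC_DELAY, SENSITIVITY, dv)
    mismatched = detector_input_skew(DTC_DELAY, 0.9 * DTC_DELAY, SENSITIVITY, dv)
    print(f"dv={dv:5.3f} V  no replica: {no_replica:.3e} s  "
          f"matched: {matched:.3e} s  10% mismatch: {mismatched:.3e} s")
```

With a perfectly matched replica the skew does not depend on the supply error, while any delay mismatch between the two paths leaves a residual proportional to that mismatch.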

Figure 4.24: Replica for noise cancellation.

4.2.7. Bang-Bang Phase Detector


A single TSPC DFF (shown in Figure 4.26) is preferred because it is compatible with the digital flow and because of the limited implementation time. There are two variable clock periods before its output is sampled (depicted in Figure 4.27), so the DFF has plenty of time to reach a well-defined output level. Metastability is therefore not a serious issue in this design.

4.2.8. Counter Design

Figure 4.25: Synchronous counter basic cell [13].

The counter block tracks the integer phase of the variable clock by counting the number of ckvd2 clock cycles. As discussed previously, a 7-bit integer counter is required. The counter can be implemented in either synchronous or asynchronous logic. In a synchronous counter, all the flip-flops are clocked by the input clock. This may lead to a large power consumption, but the outputs are synchronized and the delay is small. An asynchronous counter (which behaves like a chain of divide-by-2 stages) can save power, but its robustness is not satisfactory: the output of each stage is only available some delay after the output of the preceding stage has changed. To compensate for this delay, a long delay chain is required for each output bit so that the resampled clock samples the right value. However, the delay chain is PVT sensitive, and the delay chains for the first few high-frequency bits would burn a lot of power. If the MSB is sampled at the wrong time, the PLL may lose lock.
For robustness, a synchronous counter is therefore implemented, as shown in Figure 4.25. Although effort was spent on building it from standard cells only, the standard-cell DFF in 40 nm cannot handle a frequency of 1.5 GHz at 0.8 V in post-layout simulation. A custom-designed TSPC DFF, shown in Figure 4.26, is therefore used to replace the standard DFF. Although the core burns more power than the asynchronous one, the synchronous counter still wins because no delay compensation is needed, and the final power of the counter is 60 µW.
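How a free-running 7-bit count can still give the integer phase after it wraps around can be illustrated with a small behavioural model. This is a sketch only: the class and function names are hypothetical, and taking the difference of successive samples modulo 128 is an assumption about how the count is used downstream, not a description of the implemented logic.

```python
COUNTER_BITS = 7
MODULUS = 1 << COUNTER_BITS  # 128 counter states


class WrapCounter:
    """Behavioural model of a free-running 7-bit counter clocked by ckvd2."""

    def __init__(self):
        self.value = 0

    def clock(self, edges=1):
        """Advance the counter by a number of ckvd2 rising edges."""
        self.value = (self.value + edges) % MODULUS
        return self.value


def cycles_between(previous_sample, current_sample):
    """ckvd2 cycles elapsed between two samples, correct across wrap-around."""
    return (current_sample - previous_sample) % MODULUS


# Hypothetical case: 19 ckvd2 cycles elapse in every reference period.
counter = WrapCounter()
previous = counter.value
for _ in range(10):
    current = counter.clock(19)
    print(current, cycles_between(previous, current))
    previous = current
```

The difference is unambiguous as long as fewer than 128 ckvd2 cycles elapse between two samples, which is what fixes the required counter width.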

Figure 4.26: Conventional TSPC DFF.

4.2.9. Time Freezer Design


According to Eq. 4.6, the power consumption is proportional to the signal frequency. The feedback input of the phase detector is therefore clock gated, since the detector output is only sampled at the reference rate. This snapshot clock gating trick is a main reason why the reference design [2] achieves sub-mW power, and it is reused in this thesis to gate the DCO output signal that serves as one input of the Bang-Bang phase detector. The general idea of the retiming logic is shown in Figure 4.27. The rising edge of the delayed reference clock FREFdly opens a window that is transparent to the next rising edge of CKVD2, which generates the frozen signal CKVD2f. Two CKVD2 cycles later, CKR is generated as the resampled reference clock. This time margin of two CKVD2 clock periods (1.6 ns), realized by DFF sampling, avoids metastability. After that, the window is closed for CKVD2. The scheme resembles the "bullet time" or "time freeze" effect in Hollywood movies, in which time is slowed down enough to show normally imperceptible events or movements. Here, generating the low-frequency CKVD2f extracts the time difference between FREFdly and the high-frequency CKVD2, which amounts to freezing the phase error information in the time domain.
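The timing relations of the time freezer can be sketched with ideal edges. The function below is a behavioural illustration only: its name and arguments are hypothetical, gate delays are ignored, and the 0.8 ns CKVD2 period is inferred from the 1.6 ns margin quoted above.

```python
import math


def freeze(t_fref_dly, ckvd2_period, ckvd2_first_edge=0.0, margin_cycles=2):
    """Return (t_ckvd2f, t_ckr, t_error) for one reference cycle.

    t_ckvd2f: first CKVD2 rising edge at or after the delayed reference edge.
    t_ckr:    resampled reference edge, margin_cycles CKVD2 periods later.
    t_error:  time from the delayed reference edge to the frozen CKVD2 edge.
    """
    cycles = math.ceil((t_fref_dly - ckvd2_first_edge) / ckvd2_period)
    t_ckvd2f = ckvd2_first_edge + cycles * ckvd2_period
    t_ckr = t_ckvd2f + margin_cycles * ckvd2_period
    return t_ckvd2f, t_ckr, t_ckvd2f - t_fref_dly


# All times in ns: CKVD2 period 0.8 ns, delayed reference edge at 10.3 ns.
t_ckvd2f, t_ckr, t_error = freeze(10.3, 0.8)
print(f"CKVD2f edge at {t_ckvd2f:.2f} ns, CKR at {t_ckr:.2f} ns, "
      f"frozen time error {t_error:.2f} ns")
```

Only one CKVD2 edge per reference cycle propagates to the phase detector, so the detector input toggles at the reference rate while the time error is preserved.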

Figure 4.27: Timing diagram of time freezer.

However, the freezing always introduces an additional phase offset: the phase error between the two input signals is generally enlarged because they pass through different logic gates in this block. This is a problem for the phase error quantizer, since the loop cannot lock at all once the distortion is large enough to push the error out of the linear detection range. In [2], delay chains of buffers are introduced for compensation, but these are PVT sensitive, which remains a risk for locking. In this design, a dummy logic path identical to the one seen by CKVD2 is added for FREFdly to generate a compensation signal, as shown in the top path of Figure 4.28.

4.3. RF Implementation
The DCO is the soul of the ADPLL as it provides the most important signal, the output carrier. As shown in Figure 1.2, the DCO is at the heart of an all-digital phase-locked loop (ADPLL). A digital input called the oscillator tuning word (OTW) sets the frequency of its sinusoidal output. Metrics such as power consumption, area and phase noise profile depend largely on this block. If designing an ADPLL is compared to riding a horse, the DCO is the horse, without which there is no point in talking about a "PLL" at all (it would then become a delay-locked loop (DLL)).
General design considerations are discussed first in this section, followed by the implementation of key building blocks such as the inductor, the gm pair and the capacitor array. Post-layout simulation results are also shown to support the design ideas.

Figure 4.28: Time freezer.

4.3.1. DCO Design


Given the limited time and the low power target, most of the DCO design effort is spent on reducing the power of a conventional architecture, which is the safe choice. The classic capacitor array based LC DCO described in [1] is therefore realized here. Compared with a DAC based LC DCO (DAC+VCO), this structure is more compatible with the digital loop filter and better suited to advanced technology nodes such as the 40 nm used here. A negative resistance compensates the loss of the parasitics in the LC tank to sustain oscillation. Frequency tuning is realized by changing the capacitance rather than the inductance, which is more convenient in a monolithic implementation; a switched inductor would also suffer from an even lower quality factor (Q). The digital control is hence realized by a capacitor array tuned by the OTW. The frequency of the DCO with N capacitor cells and an inductance L is given by

$$f = \frac{1}{2\pi\sqrt{L \cdot \sum_{k=1}^{N} C_k}} \qquad (4.15)$$

In this work, the target is a DCO covering a frequency range of 2.2 to 3 GHz with a phase noise of -110 dBc/Hz at 1 MHz offset from the oscillation frequency and a raw frequency resolution of 50 kHz. The design goal is to achieve this phase noise over the required tuning range with minimum power consumption. The design variables are the inductor and capacitor values and the width and length of the transistors.
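To get a feeling for the capacitance the array has to provide, Eq. 4.15 can be inverted for the total tank capacitance at the band edges. The sketch below assumes the 8 nH inductance that is chosen later in this section; the function name is hypothetical.

```python
import math


def tank_capacitance(frequency_hz, inductance_h):
    """Total tank capacitance that resonates with the inductor at frequency_hz."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)


L_TANK = 8e-9  # 8 nH, the value chosen later in this section
for f in (2.2e9, 3.0e9):
    c = tank_capacitance(f, L_TANK)
    print(f"{f / 1e9:.1f} GHz -> {c * 1e15:.0f} fF total tank capacitance")
```

With these assumptions the total capacitance, fixed parasitics included, has to change from roughly 0.65 pF at 2.2 GHz to roughly 0.35 pF at 3 GHz.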

a. Inductor
The power dissipated in the LC tank needs to be compensated to sustain the oscillation, which places a lower limit on the power consumption of the DCO. This loss can be calculated with a simple tank model. The loss in the inductor due to the finite resistance of the metal is represented as a series resistance $R_s$. The quality factor of on-chip inductors is usually worse than that of the capacitor banks, so the inductor dominates the overall energy loss of the LC tank and the capacitor losses are neglected. The negative resistance models the active element that compensates the LC tank loss. The maximum energies stored in the inductor and in the capacitor are equal and given by

$$E = \frac{1}{2} L I_{peak}^2 = \frac{1}{2} C V_{peak}^2 \qquad (4.16)$$

where $I_{peak}$ and $V_{peak}$ represent the peak current and voltage of the LC tank. The power loss due to $R_s$ can be expressed as

$$P_{loss} = \frac{1}{2} I_{peak}^2 R_s = \frac{1}{2} \frac{C V_{peak}^2}{L} R_s = \frac{1}{2} \frac{V_{peak}^2 R_s}{L^2 \omega^2} = \frac{1}{2} \frac{V_{peak}^2}{L \omega Q} \qquad (4.17)$$

where $\omega$ is the operating frequency and $Q$ is the quality factor of the inductor, defined as the ratio of the imaginary part of its impedance to the real part:

$$Q = \frac{\omega L}{R_s} \qquad (4.18)$$
According to Eq. 4.17, a larger inductance and a better Q favour lower power consumption. However, the inductance has an upper limit. For a given center frequency, a larger L leaves less headroom for the tunable capacitance, which limits the tuning range of the DCO. Moreover, a larger L means more coupling capacitance to the substrate and thus a lower self-resonant frequency, beyond which the coil can no longer be considered an inductor. The Q is more or less limited by the technology. After estimating the fixed parasitic capacitance of the capacitor banks and the active part from post-layout extraction, a value of 8 nH is chosen after some iterations. Figure 4.29 plots the inductance as a function of frequency. The self-resonant frequency, beyond which the inductor behaves like a capacitor, is sufficiently far (>5.5 GHz) from the required operating frequencies (2.3–3 GHz).
Figure 4.30 plots the quality factor of the inductor versus frequency. The quality factor is slightly less than 19.
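The lower bound that Eq. 4.17 places on the DCO power can be evaluated numerically. In the sketch below the inductance and Q are the values quoted above, while the 0.25 V peak tank amplitude and the 2.4 GHz operating point are assumptions made only for illustration; the function name is hypothetical.

```python
import math


def tank_loss_power(v_peak, inductance_h, frequency_hz, q_factor):
    """Eq. 4.17: P_loss = V_peak^2 / (2 * L * omega * Q)."""
    omega = 2.0 * math.pi * frequency_hz
    return v_peak ** 2 / (2.0 * inductance_h * omega * q_factor)


V_PEAK = 0.25   # assumed peak tank amplitude in volts
F_OSC = 2.4e9   # assumed operating frequency in Hz
Q_IND = 19.0    # inductor quality factor quoted in the text

for inductance in (4e-9, 8e-9):
    p = tank_loss_power(V_PEAK, inductance, F_OSC, Q_IND)
    print(f"L = {inductance * 1e9:.0f} nH -> tank loss {p * 1e6:.1f} uW")
```

At a fixed amplitude and Q, doubling the inductance halves the tank loss. This figure is only the power dissipated in the tank; the power drawn from the supply by a real oscillator is higher.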

b. Active Part
For power reduction, it seems to be a golden rule that an NMOS-PMOS complementary pair achieves lower power than a single-switch structure such as NMOS-only, because the current is reused by the additional gm. However, this holds only for a high supply. The additional PMOS pair reduces the voltage headroom, so the complementary structure no longer wins at a low supply such as 0.4 V in GlobalFoundries 40 nm-LP. Besides, the output swing is also limited by the PMOS. For a single-switch structure such as NMOS-only, the swing can be as high as twice the supply, since the oscillator is DC biased at VDD. In this ADPLL, the dynamic divider can be used directly as a buffer for the feedback loop (an additional AC-coupled buffer is of course required to connect to the PA in the TX path, but it does not contribute to the PLL power budget). However, such a dynamic divider requires a larger input swing than a CML based divider. Considering that the buffer in [2] ended up consuming more than 130 µW, the NMOS-only structure is adopted here, with a peak-to-peak swing of at least 450 mV, to drive the divider directly without a DCO buffer. The supply can be biased as low as 0.35 V, so the RF part finally consumes much less power. The implemented circuit is shown in Figure 4.31. The current source is also removed to allow the lower DC bias.

Figure 4.29: Inductor over frequency.

Figure 4.30: Q over frequency.

Figure 4.31: NMOS only DCO designed.

c. Capacitor Array
Design of the capacitor array is rather challenging, since a tuning range from 2.2 GHz to 2.8 GHz has to be covered with a step size as fine as 50 kHz. This would result in an inconceivably large array if a purely thermometer-coded structure were adopted. A segmented structure is therefore popular, dividing the capacitor array into three banks: a PVT bank that covers the full required tuning range, with margin, at the coarsest step; an Acquisition bank that covers one PVT step, with more margin, at a medium step; and a Tracking bank that covers one acquisition step, with still more headroom, at the required resolution. If even finer resolution is required, a ΣΔ modulator can be used.
The step size requirement follows from differentiating Eq. 4.15; in magnitude,

$$\Delta f = (2\pi^2 L f^3)\,\Delta C \qquad (4.19)$$

Based on Eq. 4.19, the frequency resolution of each bank is determined by the capacitance resolution of its unit cells.
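Eq. 4.19 can be checked numerically against the exact resonance formula. The sketch below assumes the 8 nH inductance chosen earlier and takes the 14 aF tracking unit capacitance listed in Table 4.1 as the capacitance step; the function names are hypothetical.

```python
import math


def resonant_frequency(inductance_h, capacitance_f):
    """Eq. 4.15 for a given total tank capacitance."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def frequency_step(inductance_h, frequency_hz, delta_c):
    """Eq. 4.19 (magnitude): frequency change for a small capacitance change."""
    return 2.0 * math.pi ** 2 * inductance_h * frequency_hz ** 3 * delta_c


L_TANK = 8e-9      # inductance chosen in this design
DELTA_C = 14e-18   # tracking bank unit capacitance from Table 4.1

for f0 in (2.2e9, 3.0e9):
    c0 = 1.0 / ((2.0 * math.pi * f0) ** 2 * L_TANK)
    exact = f0 - resonant_frequency(L_TANK, c0 + DELTA_C)
    approx = frequency_step(L_TANK, f0, DELTA_C)
    print(f"{f0 / 1e9:.1f} GHz: exact {exact / 1e3:.1f} kHz, Eq. 4.19 {approx / 1e3:.1f} kHz")
```

Because of the cubic dependence on frequency, the same capacitance step gives a frequency step at the top of the band that is about 2.5 times the one at the bottom, so the resolution target has to be checked at the highest frequency.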
Another important requirement of the capacitor banks is their quality factor. The total tank quality factor is given by the parallel combination of the inductor and capacitor quality factors. The quality factor of the capacitor banks should therefore be much larger than that of the inductor, so that the overall Q is limited by the on-chip inductor. As mentioned earlier, a Q of around 20 is achievable for the inductor. Hence, a quality factor of at least 50 is targeted for the capacitor banks.
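As a worked example of the parallel combination, with the symbols $Q_L$, $Q_C$ and $Q_{tank}$ introduced here only for illustration and the round numbers quoted above:

```latex
\frac{1}{Q_{tank}} = \frac{1}{Q_L} + \frac{1}{Q_C}
\quad\Rightarrow\quad
Q_{tank} = \frac{Q_L\,Q_C}{Q_L + Q_C} = \frac{20 \cdot 50}{20 + 50} \approx 14.3
```

A capacitor bank Q of 50 thus still lowers the tank Q from 20 to about 14, which is why 50 is a lower bound rather than a comfortable target.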

c.1 PVT Bank


The PVT bank together with the mode-selection bit determines the tuning range of the DCO. As shown in Figure 4.32, a switched MoM architecture is used as the unit cell because it can provide a large on-off step size.

Figure 4.32: PVT bank unit capacitor cell.

When the switch is on, the equivalent capacitance is the series combination of the two capacitors $C_{MoM}$, so the on capacitance is $C_{MoM}/2$. When the switch is off, the capacitance across the two ends is determined by the parasitics of the switch and of the MoM capacitors.
In the on state, the quality factor of the capacitor is degraded mainly by the on resistance of the switch. Recalling $R_{on}$ as discussed previously, either the aspect ratio or the overdrive voltage has to be increased. To maximize the overdrive voltage, two pull-down NMOS transistors pull the DC voltage of the source and drain terminals of the MOS switch to ground in the on state. The aspect ratio of the MOS switch is then increased until the Q reaches 50.
In the off state, we have

$$C_{off} = \frac{C_{MoM}\,C_{par}}{C_{MoM} + 2C_{par}} \qquad (4.20)$$

where $C_{par}$ is mainly contributed by the drain and source junction capacitances of the NMOS switch, the drain junction capacitance of the pull-down transistors, and the parasitic capacitance from the MoM capacitors to the substrate. The Q in the off state is largely determined by that of the MoM capacitors, which is quite high.
The 5-bit control code is binary weighted, and the number of unit cells controlled by bit n is $2^n$. The on-off capacitance is simulated from values extracted with PVS.
The quality factor is also verified, as shown in the Appendix.

Figure 4.33: PVT bank layout.

The layout is shown in Figure 4.33. Capacitance, Q and coverage range simulation results are given in Appendix A.

c.2 Acquisition Bank


The same structure as the PVT bank is used to implement the 6-bit medium-resolution bank. The only difference is that a smaller MoM capacitance is now required. The Q target of 50 is easily met, since the Q of a capacitor in series with the switch on resistance is inversely proportional to its capacitance.
The on-off capacitance is simulated from values extracted with PVS, and the quality factor is verified as well.
The layout is shown in Figure 4.34. Capacitance, Q and coverage range simulation results are given in Appendix A.

Figure 4.34: Acquisition bank layout.

c.3 Tracking Bank


The tracking bank is also based on MoM capacitors, because they are less vulnerable to PVT variations than MOS varactors.
As shown in Figure 4.35, in the off state $C_b$ is in series with $C_s$ and the equivalent capacitance is

$$C_{off} = \frac{C_b C_s}{2(C_b + C_s)} \qquad (4.21)$$

In the on state, the equivalent capacitance becomes $C_b/2$, and the capacitance step is then

$$\Delta C = \frac{C_b^2}{2(C_b + C_s)} \qquad (4.22)$$

Figure 4.35: Tracking bank unit cell.

Eq. 4.22 shows that a smaller $C_b$ and a larger $C_s$ give a finer resolution. The tracking bank is also segmented, to reduce area and fixed capacitance. The on-off capacitance is simulated from values extracted with PVS.
The layout is shown in Figure 4.36. Capacitance, Q and coverage range simulation results are given in Appendix A.
The capacitor banks are summarised in Table 4.1.

Table 4.1: Capacitor Array Summary

Bank          Unit Cap   Number of bits
PVT           10 fF      5
Acquisition   1.13 fF    6
Tracking      14 aF      9 (6 MSB + 3 LSB)
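The relation between the two switch states and the capacitance step of the tracking cell can be verified with a few lines of arithmetic. The sketch assumes that closing the switch shorts $C_s$ so that only the two $C_b$ remain in series; the component values of 0.3 fF and 3 fF are hypothetical and were picked only because they land near the 14 aF unit step of Table 4.1.

```python
def tracking_cell(c_b, c_s):
    """Differential capacitance of one tracking cell in both switch states.

    Returns (c_off, c_on, delta_c), all in the unit of the inputs.
    """
    c_off = c_b * c_s / (2.0 * (c_b + c_s))  # Eq. 4.21: C_b in series with C_s on each side
    c_on = c_b / 2.0                         # switch closed: C_s shorted, two C_b in series
    return c_off, c_on, c_on - c_off


# Hypothetical component values in fF.
C_B, C_S = 0.3, 3.0
c_off, c_on, delta_c = tracking_cell(C_B, C_S)
print(f"C_off = {c_off * 1e3:.1f} aF, C_on = {c_on * 1e3:.1f} aF, step = {delta_c * 1e3:.1f} aF")

# The step agrees with the closed form of Eq. 4.22.
closed_form = C_B ** 2 / (2.0 * (C_B + C_S))
print(f"Eq. 4.22 gives {closed_form * 1e3:.1f} aF")
```

The step is much smaller than either physical capacitor, which is how an attofarad-level resolution is obtained from femtofarad-level MoM structures.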

d. DCO summary
The top-level DCO layout is shown in Figure 4.37. Its size is 270 µm × 390 µm, which is mainly determined by the inductor.
The power consumption is only 168 µW at the lowest output frequency, with a peak-to-peak output swing larger than 500 mV, as shown in Figure 4.38. According to Figure 4.39, the phase noise is about -109 dBc/Hz at 1 MHz offset.

Figure 4.36: Tracking bank layout.

4.3.2. Dynamic Divider


As shown in the system diagram, a divide-by-2 stage is placed right after the DCO output to drive the phase detector block and to let the counter operate at a lower frequency. A transmission gate based dynamic divider is chosen for its low power and low phase noise. Since it can be driven directly by the sinusoidal output of the DCO, this kind of divider can interface with the phase detector without a buffer. Given the large power consumption of a buffer in the loop, the buffer is removed from the feedback loop and is not included in the final power budget.

For Quarter Phase Assisted


The schematic of the dynamic divider used in this design is shown in Figure 4.40. The divide-by-2 contains four transmission gates and inverters connected in a loop. Each transmission gate is controlled by the differential input clock, and only two of them conduct in each half of the input clock period.
The divider is chosen not only for its low power but, more importantly, for its rail-to-rail output swing, which can directly drive the following stages such as the time freezer. With the four output phases, part of the DTC function can also be moved to the variable clock path, as depicted in Figure 4.41. By inserting a MUX, the phase of ckvd2 closest to the reference edge can always be selected. Instead of switching in more delay in the reference path, the closest variable phase is chosen by comparing the accumulated fractional reference phase with 0.25, 0.5 and 0.75. The required coarse DTC range is thus reduced.
However, the layout has to be drawn carefully, as shown in Figure 4.42.
In simulation and modelling, this architecture consumes similar power to the pure coarse-fine DTC structure, and the mismatch among the four phases can cause serious fractional spurs. This method is therefore not implemented in this tape-out. The phase noise of the divider should also be kept much lower than that of the DCO output, as shown in Figure 4.43.

Figure 4.37: DCO layout.

Figure 4.38: DCO transient simulation result.

Figure 4.39: Simulated phase noise of DCO.

Figure 4.40: Transmission gate based dynamic divider.
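The quadrature phase selection described above, which was studied but not taped out, amounts to splitting the fractional phase into a quadrant index and a residue. The sketch below is illustrative only: the function name is hypothetical, and which physical ckvd2 phase corresponds to which index is an assumption not specified here.

```python
def select_phase(frac_phase, n_phases=4):
    """Split a fractional phase in [0, 1) into a phase index and a DTC residue.

    The index results from comparing the phase with 0.25, 0.5 and 0.75; the
    residue is what the reference-path DTC still has to cover.
    """
    frac_phase %= 1.0
    index = min(int(frac_phase * n_phases), n_phases - 1)
    residue = frac_phase - index / n_phases
    return index, residue


for phase in (0.10, 0.30, 0.60, 0.95):
    index, residue = select_phase(phase)
    print(f"fractional phase {phase:.2f} -> phase index {index}, residue {residue:.2f}")
```

The residue never exceeds a quarter of the period, which is the reduction in coarse DTC range mentioned in the text.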

Figure 4.41: Quadrature phase assisted Coarse-Fine DTC.

Figure 4.42: Layout of quadrature phase divider.

Figure 4.43: Simulated phase noise of divider.


References
[1] R. B. Staszewski and P. T. Balsara, All-digital frequency synthesizer in deep-submicron CMOS (John Wiley & Sons, 2006).

[2] V. Chillara, Y.-H. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot, and R. Staszewski, 9.8 An 860µW 2.1-to-2.7GHz all-digital PLL-based frequency modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and ZigBee) applications, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International (2014) pp. 172–173.

[3] R. B. Staszewski, J. L. Wallberg, S. Rezeq, H. Chih-Ming, O. E. Eliezer, S. K. Vemulapalli, C. Fernando, K. Maggio, R. Staszewski, N. Barton, L. Meng-Chang, P. Cruise, M. Entezari, K. Muhammad, and D. Leipold, All-digital PLL and transmitter for mobile phones, Solid-State Circuits, IEEE Journal of 40, 2469 (2005).

[4] M. Park, M. Perrott, and R. Staszewski, An Amplitude Resolution Improvement of an RF-DAC Employing Pulsewidth Modulation, 58, 2590 (2011).

[5] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, A 2.9-4.0-GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560 fs-RMS Integrated Jitter at 4.5-mW Power, Solid-State Circuits, IEEE Journal of 46, 2745 (2011).

[6] N. Pavlovic and J. R. M. Bergervoet, Digital phase locked loop, (2013), US Patent 8,362,815.

[7] S. Henzler, S. Koeppe, D. Lorenz, W. Kamp, R. Kuenemund, and D. Schmitt-Landsiedel, A local passive time interpolation concept for variation-tolerant high-resolution time-to-digital conversion, Solid-State Circuits, IEEE Journal of 43, 1666 (2008).

[8] M. Mota and J. Christiansen, A high-resolution time interpolator based on a delay locked loop and an rc delay line, Solid-State Circuits, IEEE Journal of 34, 1360 (1999).

[9] L. Minjae and A. A. Abidi, A 9 b, 1.25 ps Resolution Coarse–Fine Time-to-Digital Converter in 90 nm CMOS that Amplifies a Time Residue, Solid-State Circuits, IEEE Journal of 43, 769 (2008).

[10] N. Pavlovic and J. Bergervoet, A 5.3GHz digital-to-time-converter-based fractional-N all-digital PLL, in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, pp. 54–56.

[11] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital integrated circuits, Vol. 2 (Prentice hall Englewood Cliffs, 2002).

[12] L. Minjae, M. E. Heidari, and A. A. Abidi, A Low-Noise Wideband Digital Phase-Locked Loop Based on a Coarse–Fine Time-to-Digital Converter With Subpicosecond Resolution, Solid-State Circuits, IEEE Journal of 44, 2808 (2009).

[13] M. H. Perrott, High Speed Communication Circuits and Systems Lecture 14 High Speed Frequency Dividers.

5. ADPLL Top Level Completion, Simulation and Test Plan

In previous chapters, the ADPLL has been analysed at system level in terms of its basic transfer function and fractional spur estimation, with insight into the building blocks, simulation methodology and performance analysis. Forging all these design considerations into a feasible solution (ultimately a working chip) takes significant effort at the top level of the system. This chapter presents the top-level layout completion, summarized top-level simulation results and the final test plan.

5.1. Top Level Layout


The top level layout with pin-out is shown in Figure 5.1. The chip size is 1.1 mm × 1.1 mm. As can be seen in this figure, at top level the ADPLL is divided into three parts, as explained in Chapter 4: the DCO block (RF), the DTC block (mixed signal) and the digital block (digital logic and SPI interface). There are 5 different supplies:

1. VDD for the fractional phase error block, which is implemented here as the DTC and Bang-Bang phase detector.

2. VDD for the DCO, which is the low DC bias supply.

3. VDD for all the synthesized digital blocks.

4. VDD for the variable counter, kept separate to isolate its high-frequency toggling from the supply-sensitive DTC.

5. VDD for the output buffer, kept separate because the power of the output buffer is not included in this ADPLL design.

5.2. Serial Peripheral Interface


The SPI block implements the slave device of the Serial Peripheral Interface (SPI) bus together with a set of registers. The SPI bus is chosen because it is simple and flexible and supports a high data rate (MHz to tens of MHz). The SPI block is clocked by an external clock. Before the ADPLL starts up, the SPI block reads the desired settings from the off-chip controller via the MOSI line and saves them in registers. The registers provide the select, enable and reset timing signals to the sequencer block, the low-speed digital block and the blocks in the mixed-signal part, which makes the working mode and status of the ADPLL configurable. While the ADPLL is running, some internal signals can be saved to the registers. The SPI block can read these saved signals and send them to the external controller via the MISO line, for either monitoring or test purposes. One extra input signal from a pad, SPI RESET, resets the registers. The SPI is added on top of the digital core, and both go through the digital flow together. A separate test bench is built to verify the write and read functions of the synthesized SPI in Verilog.
The SPI register table is given in Appendix A.

5.3. Top Level Simulation Result


Top level simulation results are presented in this section regarding locking time,
power consumption and phase noise.

Figure 5.1: Top-level ADPLL layout.

5.3.1. Locking behaviour


The locking behaviour is simulated both with the Verilog based model and with the AMS simulator, as mentioned in Chapter 3. The frequency locking behaviour for the channel FCW = 37.15 is shown in Figure 5.2. In addition, a sweep over different channels in the Verilog based simulation shows that the locking time varies from 20 µs to 30 µs from channel to channel.

Figure 5.2: ADPLL locking behaviour.

5.3.2. Power
The power of the digital part is simulated at schematic level only. The other parts are simulated post-layout from the PVS extracted files. The power budget is shown in Figure 5.3 and Table 5.1. The total power is less than 500 µW, and the pie chart shows that the power consumption of this PLL is dominated by the DCO.
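The total in the power budget can be reproduced from the block powers of Table 5.1. The short sketch below only repeats that arithmetic; the dictionary layout and function name are illustrative.

```python
# Block powers in microwatts, copied from Table 5.1.
SIMULATED_UW = {
    "DCO+divider": 210,
    "DTC+BangBang": 20,
    "Low speed digital": 110,
    "Counter+time freezer": 65,
}
DIGITAL_SCALE = 1.5  # margin applied to the schematic-level digital power in Table 5.1


def total_power_uw(blocks, digital_scale=1.0):
    """Sum the block powers, scaling only the low speed digital entry."""
    return sum(
        p * digital_scale if name == "Low speed digital" else p
        for name, p in blocks.items()
    )


print(total_power_uw(SIMULATED_UW))                 # schematic-level digital power
print(total_power_uw(SIMULATED_UW, DIGITAL_SCALE))  # with the 1.5x digital margin
```

With the 1.5 times margin on the digital part the sum matches the 460 µW total of Table 5.1, which leaves some headroom below 500 µW.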

5.3.3. Output spectrum


According to theory, the in-band phase noise floor of this coarse-fine phase predictor based Bang-Bang ADPLL should be dominated by the up-scaled reference noise. Moreover, there should be no spurs if the PLL is perfectly linear and the gain is correctly estimated, as can be observed in Figure 5.4. Since the 2 ps resolution of the DTC is at the same level as the thermal noise, the potential quantization spurs are already randomized into noise. The black curve is the theoretical estimate from the TDC based ADPLL model. As mentioned before, the bandwidth of the Bang-Bang based DPLL is shaped by the thermal noise, so the real bandwidth is expected to differ slightly from the theoretical bandwidth of the TDC based DPLL (depicted in black). The integrated RMS jitter is now 600 fs even in a close-integer channel. This can be explained as follows: as long as the DTC is fine enough and has good linearity, the Bang-Bang phase detector works almost as in integer-N mode, which is the mode in which it is known to perform well.

Table 5.1: Power results

Block                   Power (µW)
DCO+divider             210
DTC+BangBang            20
Low speed digital       110/165 (estimated as 1.5 times larger in reality)
Counter+time freezer    65
Total                   460 (with digital enlarged 1.5 times for estimation)

Figure 5.3: ADPLL power consumption in pie chart (DCO+divider 49%, low speed digital 30%, counter+time freezer 16%, DTC+BangBang 5%).

According to simulation, a PLL without linearity issues would thus generate almost the same phase noise for every channel.

Figure 5.4: Ideal ADPLL at FCW=38.001 channel (CKV clock f0 = 2432.062500 MHz, integ PE = 0.70020 ps; curves: simulated, theory of TDC based DPLL).

The worst-case fractional spur level with the linearity issue included, according to
Monte Carlo post-layout simulation, is also good, as shown in Figure 5.5. The
fractional spur level of -45 dB can be explained by the formula derived in Chapter
2 to within a 2 dB deviation. This deviation mainly results from the fact that the real
INL is no longer an ideal sinusoid, as was assumed in the derivation. However, this
spur level is already good enough for most wireless standards (far beyond the
BLE specification). The integrated RMS jitter is around 1.2 ps.

The fractional spur levels in both the close-integer channels and the remaining
channels are now reduced to a satisfactory level. The simulation is based on a
reference input with a thermal noise level of -138 dBc/Hz.

To sum up, for a channel with the worst fractional spur, such as FCW=38.0001, the
result is shown in Figure 5.5, with an integrated RMS jitter as large as 1.2 ps. Other
channels with a fractional part larger than 0.001 were checked at random, and their
integrated jitter is around 0.8 ps (other normal channels, including a close-integer
one such as FCW=38.001, are more or less the same). The result for the channel
FCW=38.001 is shown in Figure 5.6.
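The integrated RMS jitter figures quoted above come from integrating the simulated phase noise spectrum. A minimal sketch of that conversion is given here for illustration only: the function name, the trapezoidal integration in the linear domain and the flat example profile are assumptions, not the simulation flow used in this work.

```python
import math

def rms_jitter_from_phase_noise(offsets_hz, ssb_dbc_hz, f0_hz):
    """Integrate SSB phase noise L(f) [dBc/Hz] into RMS jitter [s]."""
    if len(offsets_hz) != len(ssb_dbc_hz) or len(offsets_hz) < 2:
        raise ValueError("need at least two matching (offset, L(f)) points")
    # Convert dBc/Hz to a linear power ratio per Hz.
    linear = [10.0 ** (l_db / 10.0) for l_db in ssb_dbc_hz]
    # Trapezoidal integration of the single-sideband noise over offset frequency.
    area = 0.0
    for i in range(1, len(offsets_hz)):
        df = offsets_hz[i] - offsets_hz[i - 1]
        area += 0.5 * (linear[i] + linear[i - 1]) * df
    # The factor 2 accounts for both sidebands; the result is RMS phase in radians.
    phase_rms_rad = math.sqrt(2.0 * area)
    # Divide by the carrier angular frequency to express the phase error in seconds.
    return phase_rms_rad / (2.0 * math.pi * f0_hz)

# Hypothetical flat -100 dBc/Hz profile from 1 kHz to 1 MHz at a 2.432 GHz carrier.
jitter = rms_jitter_from_phase_noise([1e3, 1e6], [-100.0, -100.0], 2.432e9)
print(f"{jitter * 1e12:.3f} ps")
```

For a real spectrum with many log-spaced points, the linear-domain trapezoid is only an approximation, so a dense frequency grid is advisable.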

[Plot: spectrum of the 2π·rad SSB phase noise at CKV [dBc/Hz] versus frequency [Hz] (FFT: len=4645142, rbw=300). CKV clock: f0 = 2432.005859 MHz, integrated PE = 1.13040 ps. Marker: X = 2778, Y = -76.11.]

Figure 5.5: ADPLL at FCW=38.0001 channel as worst case.

[Plot: spectrum of the 2π·rad SSB phase noise at CKV [dBc/Hz] versus frequency [Hz] (FFT: len=4645251, rbw=300). CKV clock: f0 = 2432.062500 MHz, integrated PE = 0.82622 ps. Marker: X = 3.111e+04, Y = -74.52.]

Figure 5.6: ADPLL at FCW=38.001 channel as normal case.



5.3.4. Comparison with state-of-art


It is not yet meaningful to discuss the FoM, since nothing has been measured so
far. However, the FoM can still be calculated from the simulated results to see how
this design could potentially compare, in the "best case", with others
[1] [2] [3] [4] [5] [6] [7].

[Scatter plot: State of Art Fractional-N PLL. Integrated jitter (ps^2) versus power (mW), both on logarithmic axes. Points: ISSCC'12 CP [1], ISSCC'14 [2], ISSCC'13 [3], JSSC'11 bang-bang [4], ISSCC'11 [5], JSSC'04 [6], ISSCC'04 [7], This Work.]

Figure 5.7: Comparison with state-of-art by FoM.

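According to the FoM definition,

\[
\mathrm{FoM} = 10\log_{10}\left[\left(\frac{\sigma}{1\,\mathrm{s}}\right)^{2}\cdot\left(\frac{P}{1\,\mathrm{mW}}\right)\right] \tag{5.1}
\]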
In the worst case (1.2 ps RMS jitter together with 460 µW power consumption),
the estimated FoM could be as good as -241.7 dB, while in the good case (0.8 ps
integrated jitter) the FoM could be as good as -245 dB. However, some deviation
will always occur, which is why a red range is drawn around the best FoM point
to represent a tolerable and normal deviation between simulation and
measurement, considering the noisy environment, complex nonlinearity and so on.
This is shown in Figure 5.7.
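As a quick numerical check of the two FoM values, the calculation can be sketched as follows. The function name is hypothetical, and the squared jitter term is assumed from the commonly used jitter-power FoM; with that assumption the quoted figures are reproduced to within rounding.

```python
import math

def jitter_fom_db(sigma_s, power_mw):
    """Jitter-power FoM in dB: 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10((sigma_s / 1.0) ** 2 * (power_mw / 1.0))

# Simulated operating points quoted in the text (460 uW = 0.460 mW).
worst = jitter_fom_db(1.2e-12, 0.460)  # close-integer channel, 1.2 ps
good = jitter_fom_db(0.8e-12, 0.460)   # normal channels, 0.8 ps
print(f"worst case: {worst:.1f} dB, good case: {good:.1f} dB")
```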

5.4. Test Plan


With the help of the SPI implementation, tests of the separate blocks and
observation of internal signals are available:

DCO open loop test: The open-loop test mode of the DCO is available, as
mentioned in Chapter 4. It is realized by taking the OTW from the external DCO
control signals, which are supplied via SPI from the FPGA. This test will be run
before the closed-loop test of the ADPLL.

DTC external control: The DTC control word can also be taken from the external
DTC control signals, which are supplied via SPI from the FPGA.

Phase error observation: The phase error generated by the digital phase error
detection logic can be saved to registers and read out via the SPI as well.

PCB: The PCB is designed with OrCAD, and the measurement is expected to start
in the middle of December.

References
[1] Y.-H. Liu, X. Huang, M. Vidojkovic, K. Imamura, P. Harpe, G. Dolmans, and
H. De Groot, A 2.7 nJ/b multi-standard 2.3/2.4 GHz polar transmitter for wireless
sensor networks, in Solid-State Circuits Conference Digest of Technical Papers
(ISSCC), 2012 IEEE International (IEEE, 2012) pp. 448–450.

[2] V. Chillara, Y.-H. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot,
and R. Staszewski, 9.8 An 860µW 2.1-to-2.7GHz all-digital PLL-based frequency
modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and
ZigBee) applications, in Solid-State Circuits Conference Digest of Technical
Papers (ISSCC), 2014 IEEE International (2014) pp. 172–173.

[3] J.-W. Lai, C.-H. Wang, K. Kao, A. Lin, Y.-H. Cho, L. Cho, M.-H. Hung, X.-Y. Shih,
C.-M. Lin, S.-H. Yan, Y.-H. Chung, P. Liang, G.-K. Dehng, H.-S. Li, G. Chien, and
R. Staszewski, A 0.27mm2 13.5dBm 2.4GHz all-digital polar transmitter using
34%-efficiency class-D DPA in 40nm CMOS, in Solid-State Circuits Conference
Digest of Technical Papers (ISSCC), 2013 IEEE International (2013) pp. 342–343.

[4] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, A
2.9-4.0-GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560
fs-RMS Integrated Jitter at 4.5-mW Power, Solid-State Circuits, IEEE Journal of
46, 2745 (2011).

[5] N. Pavlovic and J. Bergervoet, A 5.3GHz digital-to-time-converter-based
fractional-N all-digital PLL, in Solid-State Circuits Conference Digest of
Technical Papers (ISSCC), 2011 IEEE International (2011) pp. 54–56.

[6] E. Temporiti, G. Albasini, I. Bietti, R. Castello, and M. Colombo, A 700-kHz
bandwidth ΣΔ fractional synthesizer with spurs compensation and linearization
techniques for WCDMA applications, Solid-State Circuits, IEEE Journal of 39,
1446 (2004).

[7] R. B. Staszewski, K. Muhammad, D. Leipold, C.-M. Hung, Y.-C. Ho, J. L. Wallberg,
C. Fernando, K. Maggio, R. Staszewski, T. Jung, et al., All-digital TX frequency
synthesizer and discrete-time receiver for Bluetooth radio in 130-nm CMOS,
Solid-State Circuits, IEEE Journal of 39, 2278 (2004).
6
Conclusion


6.1. Conclusion for This Work


In this thesis, the analysis and implementation of a sub-half-mW all-digital
phase-locked loop (ADPLL) for frequency synthesis and modulation in
ultra-low-power applications (BLE applications) are presented.
In Chapter 2, it has been shown that the TDC remains the main bottleneck in
reducing the power consumption of the ADPLL to the sub-mW level. The fractional
spur issue is also investigated. Based on the conclusions drawn from this analysis,
a novel Bang-Bang DPLL based on a coarse-fine phase predictor is proposed to
improve the phase detector's phase noise performance at low power. Thus both the
fractional spurs and the power are greatly reduced at the system level. In addition,
DEM is available for further improving the linearity. Clock gating is realized by the
time freezer block. Other techniques, such as a low-power supply noise monitor
and a quadrature phase divider assisted DTC, are also proposed and verified
through simulation.
In Chapter 3, the simulation methodology is presented, discussed and compared
to the measurement results for verification.
In Chapter 4, the circuit-level implementation work is summarised. On the
mixed-signal side, a DTC delay line with sub-gate resolution is implemented in a
simple but powerful way, since the mismatch is reduced to a tolerable level with
only 20 µW consumed. Moreover, the power of the RF part is reduced by using an
NMOS-only structure at low supply. With a large output swing at low power and low
supply, the buffer is moved out of the PLL loop, so the power is largely reduced.
Regarding the digital part, effort is spent on the front-end (RTL to transistor netlist)
flow for low-power synthesis. This also contributes to the final low power budget.
In short, the efforts spent on system-level analysis, mixed-signal design, RF
design, digital flow and layout pay off in terms of reduced power and greatly
improved fractional spur performance compared with the reference design, as
summarised in Chapter 5.

6.2. Future Work


The results of this thesis leave plenty of future work to be done. Apart from the
actual measurement of the currently designed chips, a number of points require
further investigation. These can be summarised as follows:

Improving 𝐾_DTC estimation: As found during simulation and implementation,
the gain of the DTC is extremely important for the locking and fractional spur
performance. Due to the limited time available, a calibration method based on the
least mean squares (LMS) algorithm is used. However, a fast and accurate
calibration would be really meaningful for this phase prediction based ADPLL
design, because all the considerations and worries associated with the nonlinearity
of the DTC would be relaxed. In this coarse-fine DTC, the ratio between the coarse
stage and the fine stage is part of the gain of the whole DTC block, so the algorithm
still fits this structure well. Nevertheless, further effort should be spent on this block
to take full advantage of the digital-calibration-friendly nature of the DTC compared
with a conventional TDC.
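To illustrate the kind of LMS-based gain calibration referred to here, a generic sign-error LMS loop is sketched below. It is a behavioural toy model under stated assumptions, not the calibration logic implemented in this design: the function name, the gain of 200 codes per CKV period, the step size, the noise level and the fractional word are all hypothetical, and DTC quantization and nonlinearity are ignored.

```python
import random

def calibrate_dtc_gain(k_true, k_init, frac, mu, sigma_noise, steps, seed=1):
    """Sign-error LMS estimate of the DTC gain (codes per CKV period)."""
    rng = random.Random(seed)
    k_hat = k_init
    phi = 0.0
    for _ in range(steps):
        # Fractional phase accumulated on every reference cycle.
        phi = (phi + frac) % 1.0
        # DTC code predicted with the current gain estimate (unquantized).
        code = phi * k_hat
        # Leftover timing error in units of one CKV period, plus thermal noise.
        residual = phi - code / k_true + rng.gauss(0.0, sigma_noise)
        # Bang-bang detector: only the sign of the residual is observed.
        e = 1.0 if residual >= 0.0 else -1.0
        # Correlate the sign with the predicted phase to update the gain.
        k_hat += mu * e * phi
    return k_hat

k_est = calibrate_dtc_gain(k_true=200.0, k_init=150.0, frac=0.3217,
                           mu=0.05, sigma_noise=0.002, steps=20000)
print(f"estimated gain: {k_est:.2f} codes per period")
```

The step size `mu` trades convergence speed against the residual dither of the estimate around its final value, which is one reason a faster yet accurate scheme is attractive.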

Improving DCO structure: As can be seen in Chapter 4, the designed DCO is of
a conventional class, with no special techniques for phase noise improvement.
However, as another main contributor to both the integrated jitter and the power,
the DCO is the soul of the ADPLL, and further effort should be spent on the DCO
structure. For example, since low-supply, low-power applications are targeted, a
transformer-based class-F DCO could be investigated to further push the limits of
the proposed system.
A
Appendix

A.1. Capacitor Bank simulation


[Plot: DCO cap_test config, 18:45:45 Mon Jun 9 2014. Expression: IF("/V21/MINUS") / (2 * 3.14 * 2.3e+09), magnitude in fA. Markers: M10 at 168.1818m, 12.93457 fA; M7 at 884.5455m, 2.584615 fA.]

Figure A.1: PVT bank unit capacitor cell Capacitance.

Figure A.2: PVT bank unit capacitor cell Q.



Figure A.3: PVT bank swept capacitance.

Figure A.4: Acquisition bank unit capacitor cell Capacitance.



Figure A.5: Acquisition bank unit capacitor cell Q.

Figure A.6: Acquisition bank swept capacitance.



Figure A.7: Tracking bank LSB capacitor cell Capacitance.

Figure A.8: Tracking bank LSB unit capacitor cell Q.



Figure A.9: Tracking bank swept capacitance.

Figure A.10: Tracking bank MSB unit capacitor cell Capacitance.



Figure A.11: Tracking bank MSB unit capacitor cell Q.

A.2. SPI register table

Register Bit Name Description Default value

00 0 onon turn on of divider structure 1 (off)
1 spitdtc 0
2 dcopath[0] 0
3 dcopath[1] 0
4 dcopath[2] 0
5 bank_en[0] 1
6 bank_en[1] 1
7 bank_en[2] 1
01 0 inv_ktdc[0] 0
1 inv_ktdc[1] 1
2 inv_ktdc[2] 1
3 inv_ktdc[3] 0
4 inv_ktdc[4] 0
5 inv_ktdc[5] 1
6 inv_ktdc[6] 0
7 inv_ktdc[7] 1
02 0 inv_ktdc[8] 1
1 inv_ktdc[9] 1
2 inv_ktdc[10] 1
3 inv_ktdc[11] 0
4 inv_ktdc[12] 0
5 inv_ktdc[13] 1
6 inv_ktdc[14] 0
7 inv_ktdc[15] 0
03 0 DTC_coarse[0] SPI 0
1 DTC_coarse[1] 0
2 DTC_coarse[2] 0
3 DTC_coarse[3] 0
4 DTC_fine[0] SPI 0
5 DTC_fine[1] 0
6 DTC_fine[2] 0
7 DTC_fine[3] 0
04 0 FCW[0] DEFAULT 38 CLOSE IN 0
1 FCW[1] 1
2 FCW[2] 1
3 FCW[3] 0
4 FCW[4] 0
5 FCW[5] 0
6 FCW[6] 0
7 FCW[7] 0
05 0 FCW[8] 0
1 FCW[9] 0
2 FCW[10] 0
3 FCW[11] 0
4 FCW[12] 0
5 FCW[13] 0
6 FCW[14] 0
7 FCW[15] 0
06 0 FCW[16] 0
1 FCW[17] 0
2 FCW[18] 1
3 FCW[19] 0
4 FCW[20] 0
5 FCW[21] 1
6 FCW[22] 0
7
07 0 inv_kdcomod[0] 1
1 inv_kdcomod[1] 0
2 inv_kdcomod[2] 0
3 inv_kdcomod[3] 0
4 inv_kdcomod[4] 1
5 inv_kdcomod[5] 0
6 mod_on 0
7
08 0 memdcop[0] 0
1 memdcop[1] 0
2 memdcop[2] 1
3 memdcop[3] 0
4 memdcop[4] 0
5
6
7
09 0 memdcoa[0] 0
1 memdcoa[1] 1
2 memdcoa[2] 1
3 memdcoa[3] 0
4 memdcoa[4] 0
5 memdcoa[5] 0
6
7
10 0 memdcot[0] 0
1 memdcot[1] 0
2 memdcot[2] 0
3 memdcot[3] 1
4 memdcot[4] 1
5 memdcot[5] 0
6 memdcot[6] 0
7 memdcot[7]
11 0 memdcot[8] 0
1
2 alphaad[0] 0
3 alphaad[1] 1
4 alphaa[0] 0
5 alphaa[1] 1
6 alphap[0] 0
7 alphap[1] 1
12 0
1 rho[1] 1
2 rho[2] 1
3 rho[3] 1
4 rho[4] 0
5 alphat[0] 0
6 alphat[1] 0
7 alphat[2] 1
13 0 pvt_mode[0] 1
1 pvt_mode[1] 0
2 pvt_mode[2] 0
3 BUFF_en 1
4 iiren 0
5 lambda[0] 0
6 lambda[1] 1
7 lambda[2] 0
14 0
1
2 kdcop[0] 0
3 kdcop[1] 1
4 kdcop[2] 1
5 abmode[0] 0
6 abmode[1] 1
7 abmode[2] 0
15 0 dtc_mu[0] 1
1 dtc_mu[1] 1
2 dtc_mu[2] 0
3 dtc_mu[3] 1
4 dtccal 0
5 kdcoa[0] 0
6 kdcoa[1] 0
7 kdcoa[2] 1
16 0 kdcot[0] 0
1 kdcot[1] 1
2 kdcot[2] 0
3 kdcot[3] 0
4 kdcot[4] 0
5 kdcot[5] 1
6 kdcot[6] 1
7 kdcot[7] 0
17 0 Kres[0] 0
1 Kres[1] 1
2 Kres[2] 0
3 Kres[3] 0
4 Kres[4] 0
5 Kres[5] 1
6 Kres[6] 1
7 Kres[7] 0
18 0 Kres[8] 0
1 rotate_en 1
2
3
4
5
6
7
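As an illustration of how the per-bit defaults in this table map to register bytes, the sketch below packs two rows into byte values. It assumes that BIT 0 in the table is the least significant bit of each 8-bit register; the function name is hypothetical and the byte order on the SPI bus is not covered.

```python
def pack_bits_lsb_first(bits):
    """Pack eight 0/1 values, listed from bit 0 to bit 7, into one byte."""
    if len(bits) != 8 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected exactly eight bits, each 0 or 1")
    value = 0
    for position, bit in enumerate(bits):
        # Bit 0 is assumed to be the least significant bit of the register.
        value |= bit << position
    return value

# Default values copied from the table rows for registers 00 and 01.
reg00_defaults = [1, 0, 0, 0, 0, 1, 1, 1]  # onon, spitdtc, dcopath[2:0], bank_en[2:0]
reg01_defaults = [0, 1, 1, 0, 0, 1, 0, 1]  # inv_ktdc[7:0]

reg00 = pack_bits_lsb_first(reg00_defaults)
reg01 = pack_bits_lsb_first(reg01_defaults)
print(hex(reg00), hex(reg01))
```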

A.3. Derivation for Mismatch for DTC unit cell design
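Here are the details of the derivation from Eq. 4.9 to Eq. 4.10.
Substituting Eq. 4.4 into Eq. 4.9, we have

\[
\sigma_{\tau} = \ln(2)\,C\,\sqrt{\left(\left.\frac{\partial R}{\partial V_{T}}\right|_{\hat{V}_{T},\hat{\beta}}\sigma_{V_{T}}\right)^{2} + \left(\left.\frac{\partial R}{\partial \beta}\right|_{\hat{V}_{T},\hat{\beta}}\sigma_{\beta}\right)^{2}} \tag{A.1}
\]

\[
\frac{\sigma_{\tau}}{\hat{\tau}} = \frac{1}{R}\sqrt{\left(\left.\frac{\partial R}{\partial V_{T}}\right|_{\hat{V}_{T},\hat{\beta}}\sigma_{V_{T}}\right)^{2} + \left(\left.\frac{\partial R}{\partial \beta}\right|_{\hat{V}_{T},\hat{\beta}}\sigma_{\beta}\right)^{2}} \tag{A.2}
\]

where $\left.\frac{\partial R}{\partial V_{T}}\right|_{\hat{V}_{T},\hat{\beta}}$ is

\[
\left.\frac{\partial R}{\partial V_{T}}\right|_{\hat{V}_{T},\hat{\beta}} = \frac{R}{V_{GS} - \hat{V}_{T} - V_{DS}/2} \tag{A.3}
\]

and $\left.\frac{\partial R}{\partial \beta}\right|_{\hat{V}_{T},\hat{\beta}}$ is

\[
\left.\frac{\partial R}{\partial \beta}\right|_{\hat{V}_{T},\hat{\beta}} = -\frac{R}{\hat{\beta}} \tag{A.4}
\]

Substituting these into Eq. A.2 gives Eq. 4.10.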
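For reference, the final substitution can be written out explicitly. This is a sketch under assumptions: $V_{ov}$ is shorthand introduced here for the denominator of Eq. A.3, and $\sigma_{V_T}$ and $\sigma_{\beta}$ are taken to be the standard deviations of the threshold voltage and current factor mismatch. The result is expected to correspond to Eq. 4.10 but is not copied from it.

```latex
\[
\frac{\sigma_{\tau}}{\hat{\tau}}
= \frac{1}{R}\sqrt{\left(\frac{R}{V_{ov}}\,\sigma_{V_{T}}\right)^{2}
  + \left(-\frac{R}{\hat{\beta}}\,\sigma_{\beta}\right)^{2}}
= \sqrt{\left(\frac{\sigma_{V_{T}}}{V_{ov}}\right)^{2}
  + \left(\frac{\sigma_{\beta}}{\hat{\beta}}\right)^{2}}
\]
```

The resistance $R$ cancels, so the relative delay spread depends only on the relative threshold and current factor mismatch terms.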
