

#### **Copyright Undertaking**

This thesis is protected by copyright, with all rights reserved.

By reading and using the thesis, the reader understands and agrees to the following terms:

- 1. The reader will abide by the rules and legal ordinances governing copyright regarding the use of the thesis.
- 2. The reader will use the thesis for the purpose of research or private study only and not for distribution or further reproduction or any other purpose.
- 3. The reader agrees to indemnify and hold the University harmless from and against any loss, damage, cost, liability or expenses arising from copyright infringement or unauthorized usage.

If you have reason to believe that any material in this thesis is not suitable for distribution in this form, or if a copyright owner has difficulty with the material being included in our database, please contact lbsys@polyu.edu.hk with details. The Library will look into your claim and consider taking remedial action upon receipt of a written request.

Pao Yue-kong Library, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

http://www.lib.polyu.edu.hk

### **Built-In Self-Test Devices and Structures for Mixed Signal Integrated Circuits**


Yubin Zhang

Master of Philosophy

Department of Electronic and Information Engineering

The Hong Kong Polytechnic University

2006


#### **CERTIFICATE OF ORIGINALITY**

I hereby declare that this thesis is my own work and that, to the best of my knowledge and belief, it reproduces no materials previously published or written, nor material that has been accepted for the award of any other degree or diploma, except where due acknowledgement has been made in the text.

Yubin Zhang

## CONTENT

- CERTIFICATE OF ORIGINALITY
- CONTENT
- ACKNOWLEDGEMENT
- PUBLICATION
- CONTACT INFORMATION
- LIST OF SYMBOLS AND ABBREVIATIONS
- LIST OF EQUATIONS
- LIST OF FIGURES
- LIST OF TABLES
- LIST OF ALGORITHMS
- ABSTRACT
- CHAPTER 1 INTRODUCTION
- CHAPTER 2 INTEGRATED CIRCUIT (IC) TESTING METHODOLOGIES
  - 2.1 SCAN BASED TEST STRUCTURES AND METHODS
  - 2.2 OSCILLATION BASED TEST METHOD (OTM)
  - 2.3 QUIESCENT POWER SUPPLY CURRENT $(I_{DDQ})$ TESTING
  - 2.4 MEMS DEVICES TESTING
  - 2.5 SIGNAL INTEGRITY TESTING
  - 2.6 OTHER TESTING METHODS
- CHAPTER 3 SELF TESTABLE FULL RANGE WINDOW COMPARATOR (FRWC)
  - 3.1 FULL RANGE WINDOW COMPARATOR
  - 3.2 SELF-TEST CIRCUIT
  - 3.3 METHODS TO ACHIEVE HIGH ACCURACY
  - 3.4 ANALYSIS OF RESISTOR VARIATION IN FRWC
  - 3.5 SIMULATION RESULT OF FRWC
- CHAPTER 4 BIST SCHEME USING SCAN CHAIN AND FRWC
  - 4.1 CORE SELECTING MECHANISM
  - 4.2 TEST CONTROLLER
  - 4.3 TESTING INTERFACE IN SOC CORES
  - 4.4 CHARACTERISTIC FEATURES OF THE BIST SYSTEM
  - 4.5 TEST SET-UP AND PROCEDURE
  - 4.6 SIMULATION WAVEFORM
  - 4.7 HARDWARE OVERHEAD
  - 4.8 SIMULATION RESULT ANALYSIS
- CHAPTER 5 RAIL TO RAIL VOLTAGE COMPARATOR
  - 5.1 FIRST AMPLIFICATION STAGE
  - 5.2 SECOND AMPLIFICATION STAGE
  - 5.3 CURRENT SUMMING STAGE
  - 5.4 OUTPUT BUFFER STAGE
  - 5.5 CHARACTERISTIC DESCRIPTION
    - 5.5.1 Operating point at 0.02 V
    - 5.5.2 Operating point at 0.50 V
    - 5.5.3 Operating point at 0.98 V
- CHAPTER 6 INTERCONNECT SIGNAL INTEGRITY TEST PATTERN COMPACTION
  - 6.1 BACKGROUND
  - 6.2 INTRODUCTION TO SIGNAL INTEGRITY TESTING
  - 6.3 PROPOSED SIGNAL INTEGRITY TEST PATTERN COMPACTION METHOD
    - 6.3.1 Test pattern count reduction
    - 6.3.2 Test pattern length reduction
  - 6.4 EXPERIMENTAL RESULTS
- CHAPTER 7
- REFERENCES

### ACKNOWLEDGEMENT

I would like to thank Ir Dr Chi-Kwong Li and Dr Mike W.T. Wong, my supervisors during my study and research at The Hong Kong Polytechnic University. I am grateful for their warm-hearted advice, direction and help.

#### **PUBLICATION**

- [1] Mike W.T. Wong and Yubin Zhang, "Design and Implementation of Self-Testable Full Range Window Comparator," Proceedings, *IEEE Region 10 International Conference* (TENCON'2004), pp. 262-265, Vol. D, November 2004, Chiang Mai, Thailand.
- [2] Yubin Zhang and Mike W.T. Wong, "Self-Testable Full Range Window Comparator," Proceedings, *IEEE Asian Test Symposium* (ATS'2004), pp. 314-318, November 2004, Kenting, Taiwan.
- [3] Yubin Zhang, M. W.T. Wong and C. K. Li, "Built-in Self-test Structure for Analogue Cores in SOC Application using Full Range Window Comparator," accepted, *International Journal of Electronics*.
- [4] Yubin Zhang and Chi-Kwong Li, "Rail-to-rail Analogue Voltage Comparator Based on 1V 50nm CMOS Technology," submitted.

### **CONTACT INFORMATION**

#### Ir Dr Chi-Kwong Li

Email: enckli@

Address:

Department of Electronic and Information Engineering The Hong Kong Polytechnic University Hung Hom, Hong Kong

Phone: (852) 27664693

Fax: (852) 23628439

Mr. Yubin Zhang

Email: yubin.zhang@

Address:

Department of Electronic and Information Engineering The Hong Kong Polytechnic University Hung Hom, Hong Kong

Phone: (852) 27664693

Fax: (852) 23628439

## LIST OF SYMBOLS AND ABBREVIATIONS

| Abbreviation | Meaning                                    |
|--------------|--------------------------------------------|
| ATE          | Automatic Test Equipment                   |
| BIST         | Built-In Self-Test                         |
| CMOS         | Complementary Metal Oxide Semiconductor    |
| CSS          | Core Selecting Signal                      |
| CUT          | Circuit Under Test                         |
| DSM          | Deep Sub-Micron                            |
| FRWC         | Full Range Window Comparator               |
| IC           | Integrated Circuit                         |
| ILS          | Integrity Loss Sensor                      |
| I/O          | Input/Output                               |
| IP           | Intellectual Property                      |
| MA           | Maximal Aggressor                          |
| MEMS         | Micro-ElectroMechanical System             |
| MOEMS        | Micro-OptoElectroMechanical System         |
| MEFS         | Micro-ElectroFluidic System                |
| MT           | Multiple Transition                        |
| NMOS         | Negative-channel Metal Oxide Semiconductor |
| OpAmp        | Operational Amplifier                      |
| OTM          | Oscillation based Test Methods             |
| PMOS         | Positive-channel Metal Oxide Semiconductor |
| SI           | Signal Integrity                           |
| SOC          | System On Chip                             |
| SR           | Slew Rate                                  |
| STC          | Self Test Circuit                          |
| UDL          | User Defined Logic                         |
| VF           | Voltage Follower                           |
| VLSI         | Very Large Scale Integration               |
| WIC          | Wrapper Input Cell                         |
| WOC          | Wrapper Output Cell                        |

# LIST OF EQUATIONS

- Equ. 3-1
- Equ. 3-2
- Equ. 3-3
- Equ. 3-4
- Equ. 3-5
- Equ. 3-6
- Equ. 3-7
- Equ. 3-8

## LIST OF FIGURES

- Fig. 2-1. Mixed signal scan chain structure
- Fig. 2-2. Reconfigurable scan chain structure
- Fig. 2-3. Multiple scan chains with data compression
- Fig. 2-4. Converting an amplifier into an oscillator in test mode
- Fig. 3-1. Window comparator example
- Fig. 3-2. The proposed full range window comparator
- Fig. 3-3. Self test circuit for FRWC
- Fig. 3-4. FRWC output identification circuit
- Fig. 3-5. Variation range of resistors to ensure the FRWC with 5% resolution
- Fig. 3-6. Result from window comparator corresponding to different $V_i$ values
- Fig. 3-7. Accuracy of FRWC corresponding to 5% variation of resistors
- Fig. 3-8. Block diagram of the FRWC setup in self test mode
- Fig. 3-9. Switch control signals for self test circuit and the generators
- Fig. 4-1. BIST structure based on scan chain and FRWC
- Fig. 4-2. Test interface for the proposed BIST system
- Fig. 4-3. Waveform of the simulation result
- Fig. 5-1. The proposed voltage comparator circuit
- Fig. 5-2. PMOS input amplifier
- Fig. 5-3. (a) Cascode current mirror (b) Wide swing cascode current mirror
- Fig. 5-4. Second amplification stage and current summing circuit
- Fig. 5-5. Comparator DC performance within the entire power supply range
- Fig. 5-6. DC performance with $V_-$ staying at 0.02V
- Fig. 5-7. Output transition with $V_-$ staying at 0.02V
- Fig. 5-8. Pulse input with $V_-$ staying at 0.02V
- Fig. 5-9. Comparator's output corresponding to the pulse input shown in Fig. 5-8
- Fig. 5-10. 'Worst case' pulse input with $V_-$ staying at 0.02V
- Fig. 5-11. Comparator's output corresponding to the input shown in Fig. 5-10
- Fig. 5-12. DC performance with $V_-$ staying at 0.50V
- Fig. 5-13. Positive pulse input with $V_-$ staying at 0.50V
- Fig. 5-14. Comparator's output corresponding to the input shown in Fig. 5-13
- Fig. 5-15. Negative pulse input with $V_-$ staying at 0.50V
- Fig. 5-16. Comparator's output corresponding to the input shown in Fig. 5-15
- Fig. 5-17. DC performance with $V_-$ staying at 0.98V
- Fig. 5-18. Pulse input with $V_-$ staying at 0.98V
- Fig. 5-19. Comparator's output corresponding to the input shown in Fig. 5-18
- Fig. 5-20. Comparator with rail-to-rail input common-mode range [19]
- Fig. 6-1. Demonstration of signal integrity loss
- Fig. 6-2. Distributed RC crosstalk model
- Fig. 6-3. Crosstalk model at behavioral level
- Fig. 6-4. Test pattern based on MA fault model
- Fig. 6-5. Test pattern for three-interconnect crosstalk system based on MT and MA (shaded) model
- Fig. 6-6. Wrapper cell arrangement in interconnect test mode
- Fig. 6-7. Detailed structure of wrapper cells for signal integrity test
- Fig. 6-8. Demonstration of SOC interconnect topology
- Fig. 6-9. Complete graph examples
- Fig. 6-10. A set of cliques in a graph
- Fig. 6-11. Demonstration of clique partition
- Fig. 6-12. Greedy compression heuristic to compact interconnect SI test patterns
- Fig. 6-13. An example of hypergraph and its partitioning
- Fig. 6-14. Test pattern compaction results on different benchmark circuits with different number of test patterns
- Fig. 6-15. Compress two vectors by overlapping non-conflict parts
- Fig. 6-16. Constructive compression technique

### LIST OF TABLES

- Table 4-1. Hardware overhead of the proposed BIST system
- Table 5-1. Performance of the proposed comparator
- Table 5-2. Performance of the comparator in [19]
- Table 6-1. Format of signal integrity test pattern
- Table 6-2. Compression rate for different test pattern sets
- Table 6-3. Test pattern compaction on SOC d695
- Table 6-4. Test pattern compaction on SOC g1023
- Table 6-5. Test pattern compaction on SOC p34392
- Table 6-6. Test pattern compaction on SOC p22810
- Table 6-7. Test pattern compaction on SOC p93791

# LIST OF ALGORITHMS

- Algorithm 6-1. Clique Partitioning Algorithm
- Algorithm 6-2. Greedy Compression Heuristic

#### ABSTRACT

In the integrated circuit (IC) industry, circuit testing consumes a substantial portion of the total product cost and design time. IC testing is becoming more significant as the complexity and integration level of circuits rise. This is driven by the demand for smaller feature sizes, which have now shrunk into the deep sub-micron (DSM) domain, while the functional complexity of IC chips has grown by an order of magnitude. By adding circuits dedicated to testing alongside the normal logic circuits on the chip, built-in test circuits and their corresponding test architectures make circuit faults easier to detect, so product design time and cost can be greatly reduced. In addition, test strategies are required to further reduce testing cost. For example, test pattern generation and test response analysis can be integrated with data compaction algorithms to reduce test data volume and thus test time and cost.

This thesis presents a full range window comparator (FRWC) design that can be used effectively in IC testing. Supporting circuits such as a self test circuit and a decision circuit have also been designed to make the FRWC more reliable and accurate. Detailed analysis and simulation have been conducted to show the effectiveness and characteristics of the proposed FRWC design.

A new Built-In Self-Test (BIST) scheme is then proposed, based on a scan chain structure that incorporates the full range window comparator. The supporting devices, functional blocks and strategies required by this BIST scheme are also presented, such as the core selecting mechanism and the test interface in each core of the System On Chip (SOC). Simulation and analysis of this BIST scheme have also been completed, and the details are described in the subsequent chapters of this thesis.

A rail-to-rail voltage comparator design is presented in the later part of this thesis. The design is based on the BSIM4 50nm CMOS transistor model with a power supply of 0-1V. This voltage comparator works with high voltage gain and short delay time. Simulation shows that it maintains good characteristics even when the input voltages are very close to the rail voltages (power supply voltages). In particular, its transient delay time remains short in all test conditions, which is much better than that of the reference comparators. Detailed analysis and characteristics of this design are described in this thesis.

Finally, Chapter 6 of this thesis proposes a data compaction method for compressing interconnect signal integrity test patterns. The proposed method reduces both the number of test patterns and the length of each test pattern (the number of bits in a test pattern), so that the total test data volume, and in turn the testing time, is substantially reduced compared with the original uncompacted test patterns.

### **Chapter 1 INTRODUCTION**

In today's semiconductor industry, millions or even billions of transistors can be fabricated on a small area of silicon with the help of ever-advancing laser technology and nanotechnology applied to the manufacturing of Very Large Scale Integration (VLSI) circuits [1]. This enables designers to develop very complex systems on a single silicon chip, the so-called System On Chip (SOC). However, with this increasing complexity and the shortening turnaround time of new designs, it is no longer practical to design every detail of such chips at gate level. The continuous demand to reduce product cost and time-to-market calls for a much shorter design cycle and a substantial reduction in manufacturing cost. Hence, the use of off-the-shelf, proven functional blocks, or Intellectual Property (IP) cores [2], has become common practice. Reusing IP cores in integrated circuit design is of tremendous help in shortening the design and development cycle of a new system, at a time when IC systems are becoming more and more complex and time-to-market is a crucial factor in most applications, especially in the consumer electronics industry. A number of IP cores together with User Defined Logic (UDL) circuits are therefore integrated on a chip to make up a complete functional system, which brings the SOC into being.

As the value of function per IC area increases, it is important to minimize wastage, hence the demand for testing each individual IC manufactured. At the same time, testing and diagnosing IC chips in the SOC context consumes a large portion of the overall product cost and time-to-market, especially for mixed signal chips. The complexity and difficulty of testing IC chips have increased for the following reasons.

(I) Multi-level functional description

The functions of different cores on the same chip may be described at different levels: (i) the soft level, (ii) the firm level and (iii) the hard level [3].

(II) Black box concept of IP cores

Core users have only limited knowledge of the internal structure of the IP cores [4]; in other words, a core looks like a black box to its users.

(III) Limited access to internal nodes

The accessibility to the internal nodes of a core in SOC, except for those primary output nodes, is limited.

(IV) Large sized test patterns

Test time grows as the volume of test data increases, which results in a substantial increase in the overall cost of the chip.

(V) High Cost of ATE

Quite often Automatic Test Equipment (ATE) is used for testing. The complexity of the required ATE drives up its cost as the operating frequency, the functions and the pin count of IC chips continue to increase.

Therefore, the Built-In Self-Test (BIST) methodology has become increasingly important as a way to alleviate these problems. BIST aims at high circuit testability, short test time and low demands on external ATE, at the cost of a small hardware overhead and little modification to the original circuits.

Many kinds of built-in test structures, test methods and test scheduling algorithms have been proposed in the literature [3, 7, 9-11, 16, 17, 22-26, 39-44], such as the scan scheme, the oscillation-based scheme, $I_{DDQ}$ testing, reuse of available on-chip resources, parallel testing, power supply testing, test architectures for multi-frequency chips and so on. Take the analogue voltage scan scheme as an example. The voltages of some internal nodes of a circuit, or a weighted sum of several node voltages, are the key to testing the circuit and diagnosing faults in it. Voltages scanned from the test nodes are therefore compared with pre-calculated values for the fault-free circuit to detect a faulty circuit. For analogue circuits, the voltage of a circuit node is not a fixed value because the physical parameters of the devices fabricated on the IC vary. For a circuit that works as the designers expect, the voltage of an internal node should lie within a certain range, called the tolerance range. If any node voltage falls outside its tolerance range, this indicates a non-compliance and a possible fault on the IC chip, so the circuit can no longer be guaranteed to meet the designer's specification.

If a functional block on the chip can effectively judge whether the voltages scanned from internal nodes lie within their fault-free tolerance ranges, it will greatly help to detect and diagnose faults on the chip. The window comparator is one such functional block. A voltage window comparator has two input ports for the HIGH and LOW reference voltages and one input port for the test voltage to be compared against them. It outputs one logic level (one or zero, as the designer chooses) only when the test voltage is higher than the LOW reference voltage and lower than the HIGH reference voltage. In all other cases it outputs the opposite logic value.
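The decision rule described above can be sketched as a purely behavioural model. The function name `window_comparator` and the `active_high` flag are illustrative assumptions and do not come from the circuit design in this thesis; the sketch ignores gain, offset and response time.

```python
def window_comparator(v_test, v_low, v_high, active_high=True):
    """Behavioural model of a voltage window comparator (illustrative only).

    Returns the 'inside' logic level only when v_low < v_test < v_high,
    and the opposite level in all other cases.
    """
    if v_low >= v_high:
        raise ValueError("LOW reference must be below HIGH reference")
    inside = v_low < v_test < v_high
    # The designer chooses which logic level means 'inside the window'.
    return int(inside) if active_high else int(not inside)


if __name__ == "__main__":
    # Hypothetical tolerance range of 0.40 V to 0.60 V for one test node.
    for v in (0.35, 0.50, 0.65):
        print(v, window_comparator(v, 0.40, 0.60))
```

A voltage exactly equal to either reference is treated as outside the window, which follows the strict "higher than" and "lower than" wording of the text.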

Several window comparator designs have been proposed by various researchers [5, 7, 8]. Franca [5] proposed a mixed analogue-digital window comparator with independently programmable boundaries that achieves high accuracy. However, the circuit and its working mechanism are very complex: it requires capacitors, and a number of stages are needed to generate the output signal, which results in a large delay. Franca's circuit also occupies a large chip area, and its complexity makes the window comparator fragile and less attractive for practical applications. Venuto et al [6-7] made a number of proposals. Their early design was a clocked window comparator based on a conventional Operational Amplifier (Op Amp). The comparator is connected to a particular circuit node to check for faults in the mixed-signal circuit, and that structure requires an analogue voltage to be brought off the chip. Venuto and his associates [7] further proposed a digital window comparator that uses standard digital gates for on-chip evaluation of analogue circuits. Subsequently, they suggested [8] an auto-repositioning technique that compensates for lot-to-lot fabrication parameter variation in the window comparator of [7]. However, a comparator built from digital gates has fixed reference voltages and cannot realize an arbitrary reference voltage. Moreover, the repositioning technique works only for circuit parameter variations in the same direction, and with limited accuracy.

In this thesis, a full range window comparator for voltage comparison, based on operational amplifiers, is proposed. To avoid the danger of using a faulty comparator to test voltage signals from internal circuit nodes in SOC test mode, a self-test circuit is also included on the chip so that the window comparator is checked before it is used to test the core's internal circuit. The window comparator is used for testing only after it passes the self-test; otherwise, the test process does not start at all. Complete simulations have been conducted to demonstrate the effectiveness of the circuit design, and the results are presented in this thesis.

A new BIST structure that combines the scanning and comparison of test node voltages is proposed here to perform testing and diagnosis in an SOC environment. The proposed BIST system uses a full range window comparator (FRWC). As mentioned earlier, the window comparator is paired with a self-test circuit that verifies its functionality before it is used to compare analogue voltages. The proposed BIST scheme is mainly made up of (i) a voltage comparator, (ii) a test control block, (iii) a core selecting block and (iv) a decision circuit. These functional blocks must work together with the test access interface in each IP core of the SOC. The test response output by this BIST structure is a binary bit stream in which each bit represents the result of comparing one voltage signal with its tolerance range. By reading these bits, a faulty condition within the SOC can be detected and, where possible, the unique fault or equivalent fault set within the faulty core(s) can be identified. With statistics on such fault bit patterns, design engineers can improve the design and manufacturers can identify manufacturing problems. This BIST structure can also be easily incorporated into the existing test architecture for the analogue portion of a mixed-signal SOC, so that a single digital ATE is all that is required for IC testing.

An ideal voltage comparator should have high sensitivity and minimum response time. This is not only desirable but required in many time-critical applications. In a test procedure in which analogue voltages are sampled, held and compared in sequence, a large portion of the testing time is consumed by the voltage comparator. To minimize testing time, lower testing cost and improve test accuracy, high-performance voltage comparators are needed. The difficulty in designing such a comparator lies in the substantial degradation of its performance when the input voltages, including the test voltage and the reference voltages, are close to the power supply voltages. (Power supply voltages are also commonly referred to as rail voltages, and this term is used in various parts of this thesis.) In that case the voltage comparator shows lower voltage gain and longer response time than when the input voltages are far from both the positive and negative power supplies. When the input voltages are close to the rail voltages, some transistors in the comparator circuit are in deep saturation while others are cut off, so a number of internal circuit nodes have to be charged or discharged over a wide voltage range when the comparator makes a transition. Such a transition takes a long time to settle, which implies a long testing time. When a large number of test points are in this situation, the accumulated delay for testing the IC may be unacceptable.

A four-stage voltage comparator circuit is designed here to overcome this difficulty, with particular focus on improving the comparator's performance when the input voltages are close to the rail voltages. The voltage comparator consists of a first amplifier, a second amplifier, a current summing circuit and an output buffer. Simulation based on the 50nm BSIM4 model with a 1V power supply voltage shows that this comparator works well for rail-to-rail inputs. Even when the input voltages are close to the rail voltages, the DC gain and transient response time remain good, which enables voltage comparison with high accuracy and short delay over the rail-to-rail working range.

Previous IC testing efforts have mainly focused on testing the internal functionality of circuits. However, with the shrinking feature size of fabrication technologies, testing SOC interconnects has become necessary. The testing time and cost for SOC interconnect signal integrity faults can be very high. To cope with this problem, a two-dimensional signal integrity test pattern compaction scheme is proposed here. This method reduces not only the number of test patterns but also the length of those patterns. Simulation results show that the proposed solution can significantly reduce the overall interconnect test data volume, especially when the test pattern count and length are large. The testing time for interconnect signal integrity faults can therefore be reduced substantially.

This thesis is organized as follows. Chapter 2 briefly describes several types of modern IC testing methods. Chapter 3 presents the design of the self-testable Full Range Window Comparator (FRWC) with a detailed analysis, together with the simulation of the FRWC and its results. Chapter 4 demonstrates and analyzes a new type of Built-In Self-Test (BIST) structure using the FRWC and explains its working mechanism in detail; the simulation of this BIST system is described in the latter part of the chapter. Chapter 5 presents a fast voltage comparator design capable of working over the rail-to-rail input voltage range, along with its performance characteristics. Chapter 6 proposes a two-dimensional test pattern compaction strategy for SOC interconnect signal integrity test. Chapter 7 concludes the research and outlines possible future work.

# Chapter 2 INTEGRATED CIRCUIT (IC) TESTING METHODOLOGIES

Many test architectures and techniques have been presented in the open literature for different kinds of circuits, including digital and analogue scan chains, oscillation-based test methodology, I<sub>DDQ</sub> testing, the newly emerging MEMS testing, signal integrity testing and so on. Different test methods have different working mechanisms and target different circuits and faults.

#### 2.1 Scan based test structures and methods



Fig. 2-1 shows the basic structure of scan chain.

Fig. 2-1. Mixed signal scan chain structure

Analogue and digital signals can pass through their own independent scan chains, or both kinds of signals can pass through the same mixed mode scan chain. Take the mixed mode scan chain structure shown in Fig. 2-1 as an example. The scan chain consists of voltage followers or buffers, transmission gates and shift flip-flops. Voltage followers (for analogue signals) or buffers (for digital signals) sample the signals from the internal nodes of the integrated circuit and at the same time isolate those nodes from the scan chain, so that the added test circuitry does not affect the normal operation of the original circuit. The shift flip-flops generate the control signals for the transmission gates, which output the sampled signals onto the scan chain one by one so that no two signals interfere. The signals output on the scan chain are then compared, tested or diagnosed. In this way the internal nodes of the integrated circuit can be monitored in addition to the primary output nodes, which improves the circuit's observability and gives better testing and diagnosing capability.
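The one-at-a-time selection performed by the shift flip-flops can be illustrated with a small behavioural sketch. It assumes, purely for illustration, that the flip-flops hold a one-hot token that advances by one position per clock and that each flip-flop enables exactly one transmission gate; the function name `scan_out` and the node voltages are hypothetical.

```python
def scan_out(node_voltages):
    """Place sampled node voltages on a shared scan line, one per clock.

    Assumption (not taken from Fig. 2-1): the shift flip-flops hold a
    one-hot token, so only one transmission gate conducts at a time.
    """
    n = len(node_voltages)
    select = [1] + [0] * (n - 1)          # token starts at the first flip-flop
    samples = []
    for clock in range(n):
        idx = select.index(1)             # the single enabled transmission gate
        samples.append((clock, idx, node_voltages[idx]))
        select = [0] + select[:-1]        # shift the token to the next flip-flop
    return samples


if __name__ == "__main__":
    for clock, idx, v in scan_out([0.30, 0.70, 0.50]):
        print(f"clock {clock}: node {idx} -> {v} V")
```

Because only one gate is enabled in any clock cycle, the sampled signals never share the scan line at the same time, which is the non-interference property the paragraph describes.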

There are also other kinds of scan chain structures. Digital signals sit at a standard logic 1 or 0 level, so their values can be stored in storage cells such as flip-flops and shifted through chains of such cells. Fig. 2-2 shows the typical structure of one such scan chain design.



Fig. 2-2. Reconfigurable scan chain structure

Instead of being output onto the test output line as in the mixed signal scan chain of Fig. 2-1, the test signals in the reconfigurable scan chain structure of Fig. 2-2 are stored in the corresponding boundary store cells and shifted along the chain formed by these cells, for both the input test vector and the output test response. For example, to shift test vectors in to their ports, a stream of binary bits ripples along the chain of store cells. In each clock cycle the stream moves forward by one cell, until all bits arrive at their corresponding ports.
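The ripple-like shifting can be sketched as a simple shift register model. The function `shift_in` and the convention that the bit destined for the last cell enters first are assumptions made for this illustration, not details taken from Fig. 2-2.

```python
def shift_in(test_vector):
    """Serially shift a test vector into a chain of store cells.

    Assumption: the bit intended for the last cell is fed in first, and
    every clock moves each stored bit forward by exactly one cell.
    """
    cells = [0] * len(test_vector)
    history = []
    for bit in reversed(test_vector):
        cells = [bit] + cells[:-1]        # one clock: new bit in, others advance
        history.append(list(cells))
    return cells, history


if __name__ == "__main__":
    final, history = shift_in([1, 0, 1, 1])
    for clock, state in enumerate(history, start=1):
        print(f"after clock {clock}: {state}")
```

After as many clock cycles as there are cells, every bit sits in the cell facing its port, so the chain contents equal the intended test vector.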

In an integrated circuit there may be many scan chains. To reduce required test channel width and test application time, test input data can be compressed before being shifted into the circuit under test (CUT) and test output data can also be compressed before being shifted out of the CUT. This test structure employing both pre- and post-compression data chain is demonstrated in Fig. 2-3 below.



Fig. 2-3. Multiple scan chains with data compression

With multiple scan chains and data compressors to transfer and compress both test input and output data, test application time can be greatly reduced and thus design cycle and cost can also be reduced accordingly.

Scan based Built-In Self-Test (BIST) methods for integrated circuits fall into two main categories according to Agrawal et al [22]: test-per-clock and test-per-scan. In a test-per-clock scan BIST structure, a test vector is applied to the Circuit under Test (CUT) and the corresponding circuit output is captured in every clock cycle. The test procedure of a test-per-scan BIST scheme, by contrast, is made up of two cycles:

- (I) The data input cycle: a test vector is inputted, or shifted, into the scan chain.
- (II) The functional and output cycle: a functional cycle is conducted after the data input cycle and the test response to the input test vector is captured; the response is then shifted out to the following data processing stage while the next input test vector is shifted into the CUT.

The test-per-clock scan BIST scheme generally needs fewer test vectors than the test-per-scan scheme to achieve the same fault coverage. On the other hand, the hardware and timing overheads of test-per-clock scan BIST are prohibitively high in many cases. The test application time of the test-per-scan scheme, however, is longer than that of the test-per-clock scheme, as reported by Xiang et al [23] in their recent publication.
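A rough cycle count makes the difference in test application time concrete. The sketch below is a simplified model under stated assumptions, not a result from [22] or [23]: a single scan chain of `chain_length` cells, one capture cycle per vector, shift-out of each response overlapped with shift-in of the next vector, and a final shift-out of the last response. Both function names are hypothetical.

```python
def cycles_test_per_scan(num_vectors, chain_length):
    """Clock cycles for test-per-scan under the simplified model above."""
    if num_vectors == 0:
        return 0
    # Each vector costs chain_length shift cycles plus one capture cycle;
    # the last response needs chain_length extra cycles to be shifted out.
    return num_vectors * (chain_length + 1) + chain_length


def cycles_test_per_clock(num_vectors):
    """Clock cycles for test-per-clock: one vector applied per clock."""
    return num_vectors


if __name__ == "__main__":
    n, length = 100, 32   # hypothetical vector count and chain length
    print("test-per-scan :", cycles_test_per_scan(n, length))
    print("test-per-clock:", cycles_test_per_clock(n))
```

Under this model the test-per-scan cycle count grows with the product of vector count and chain length, which is consistent with the longer test application time noted in the text.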

A number of other researchers [23-26] have proposed variations of scan based test architectures and algorithms to improve fault coverage and reduce test application time. Gupta et al [24] proposed taking the scan wire length overhead into account, so as to maximize fault coverage and minimize the number of dummy flip-flops when designing a testing scheme for path delay faults. Their publication also presented layout-aware, coverage-driven scan chain ordering methodologies and algorithms that compute the achievable tradeoffs between path delay fault coverage on the one hand and the number of flip-flops and wire length overhead on the other. Shinogi et al [25] demonstrated that, without increasing the number of I/O pins used in testing, SOC testing cost can be reduced by testing cores in parallel with multiple scan chains using test vector overlapping; a data overlapping algorithm and a test controller were also proposed. Xiang et al [23] partitioned the scan chain into multiple segments, with test responses captured in each of multiple capture cycles while test vectors are shifted in. Instead of driving multiple scan segments with a single scan-in signal, this architecture controls the scan segments with different signals.

Power consumption in scan based testing has also attracted much attention, because power dissipation limits the maximum number of scan chains that can operate in parallel and excessive power consumption may damage the IC under test. Sinanoglu and Orailoglu [26] inserted additional logic gates into the scan chains alongside the scan cells. With this modification, both test input vectors and test responses are transformed into new sets of data so that the scan chains make fewer transitions, which reduces power consumption.

Furthermore, test architectures and methods to detect and diagnose faults in the scan chains themselves have attracted recent research effort. Li [27] proposed a scheme in which excitation patterns are applied to locate a single stuck-at fault and multiple timing faults. In such a single excitation pattern only one bit is flipped in the presence of multiple faults, so the diagnosis result becomes deterministic.

#### **2.2 Oscillation based Test Method (OTM)**

Generally, the procedure for oscillation based IC testing consists of two parts:

(I) The circuits to be tested are partitioned into functional blocks such as amplifiers, phase-locked loops and others. In test mode, these functional blocks are combined, with additional circuits if necessary, and converted into oscillators.

(II) The oscillators are excited in test mode, and the frequencies and/or amplitudes of the oscillations are captured and compared with the values derived from the design specifications of the original fault-free circuit components. Any deviation of the measured values provides information about the faulty condition of the circuits, because faults cause the oscillation frequency and/or amplitude to move outside the original tolerance band.
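The comparison in step (II) can be sketched as a tolerance band check on the measured frequency and amplitude. The helper names, the 5% relative tolerance and the numeric values are assumptions chosen only for illustration; a real tolerance band would be derived from the fault-free design specifications.

```python
def within_band(measured, nominal, rel_tol):
    """True when the measured value lies within +/- rel_tol of nominal."""
    return abs(measured - nominal) <= rel_tol * abs(nominal)


def otm_verdict(freq_hz, amp_v, nominal_freq_hz, nominal_amp_v, rel_tol=0.05):
    """Compare measured oscillation frequency and amplitude with nominal values."""
    freq_ok = within_band(freq_hz, nominal_freq_hz, rel_tol)
    amp_ok = within_band(amp_v, nominal_amp_v, rel_tol)
    return {"freq_ok": freq_ok, "amp_ok": amp_ok, "fault_free": freq_ok and amp_ok}


if __name__ == "__main__":
    # Hypothetical nominal oscillation: 1 MHz at 0.5 V amplitude.
    print(otm_verdict(1.02e6, 0.49, 1.0e6, 0.5))   # both inside the band
    print(otm_verdict(1.01e6, 0.30, 1.0e6, 0.5))   # amplitude outside the band
```

The second call shows a case in which the frequency alone would pass while the amplitude reveals the deviation, so checking both quantities can flag faults that a frequency-only check would miss.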

Fig. 2-4 shows the circuit structure to build an oscillator which includes the amplifier to be tested.



Fig. 2-4. Converting an amplifier into an oscillator in test mode

The great advantage of the Oscillation based Test Method (OTM) is that it needs no input test vectors; in other words, it is a vectorless testing scheme. OTM can serve both as a BIST scheme and in ordinary manufacturing test to detect defects on chips using external test equipment. Test methods that do require test vectors or stimuli need special design effort to generate the vectors, and either external test equipment or a built-in circuit to apply them to the IC chips during test, while on-chip test stimulus generation carries considerable hardware overhead. Vectorless OTM is thus very appealing from the point of view of test stimulus generation.

OTM has been broadly developed for both digital and analogue integrated circuits in the literature [9, 10, 28, 29]. Diverse approaches have been studied to transform normal integrated circuits into oscillators in test mode and to analyze the resulting oscillation frequency and amplitude, so as to achieve high fault coverage and reduce test time. Research has demonstrated that OTM can achieve high fault coverage for catastrophic faults, but that the maximum fault coverage for parametric faults is limited if only the oscillation frequency is measured. Huertas et al [28] demonstrated that measuring oscillation frequency alone cannot achieve sufficiently high fault coverage or yield coverage. Subsequently, Huertas and his associates proposed measuring the oscillation amplitude as well as the oscillation frequency to improve fault coverage. As for the test nodes, internal circuit nodes should be monitored in addition to the primary output nodes to achieve high parametric fault coverage, since the oscillation amplitude and/or frequency may be sensitive only to parameter variations of components adjacent to the test nodes.

Test application time is an important factor in oscillation-based test. Accurate computation of the oscillation frequency takes a large amount of time, so the test time may be on the order of seconds if no efficient test response analysis algorithm is adopted. A test time of several seconds is unacceptable in most cases, especially for a go/no-go test in industry. Such a long test time is a disadvantage of OTM and limits its application. By measuring the oscillation frequency indirectly, Roh and Abraham [10] used a fast comparator as a signature analyzer to reduce the testing time significantly. However, the charging and discharging of the capacitor in the comparator prolongs the response time of the comparison, and mismatches among the resistors in the comparator degrade the accuracy of the reference voltages provided by the simple voltage reference circuit unless further improvements are made.

Different oscillator structures built for oscillation test are sensitive to different sets of components. Fault coverage can therefore be improved by converting the circuit under test into several oscillators in test mode, at the cost of a longer total test time. Wong [9] studied three configurations (discrete, serial and ring oscillator) for oscillation testing of the same benchmark circuit, a low pass filter, and compared their results to reveal the effectiveness and difficulty of the OTM technique. A major drawback of oscillation-based test is its low capability for fault location identification. The simulation results reported by Wong [9] show that the ring oscillator structure is the optimal configuration for testing the benchmark circuit with OTM in terms of fault coverage, capability of fault location identification and the number of extra components required.

Roh and Abraham [10] also presented a result analysis scheme for OTM and showed that, with a small hardware overhead, it reduces test time, improves tolerance in the output response and achieves high fault coverage. Using a time-division multiplexing technique, internal test nodes are selected with only a switch and a counter for each test node. Information for detecting all catastrophic faults and most parametric faults is obtained by monitoring the test nodes, both primary output nodes and internal nodes, that are sensitive to component parameter variation. Li et al [29] included the interconnect wires between the cores of an SOC in the oscillation rings using modified IEEE P1500 wrapper cells. Their test architecture can detect stuck-at and open faults as well as delay faults and crosstalk glitches on SOC interconnects. This is of great value because the spacing between wires on the chip has become very narrow and the high working frequency of IC chips makes the effect of coupling inductance and capacitance significant.

#### **2.3 Quiescent power supply current (IDDQ) testing**

$I_{DDQ}$ testing, or quiescent power supply current testing, examines the current drawn from the power supply, rather than voltage signals, for information about the faulty condition of the circuit under test. $I_{DDQ}$ testing is part of overall chip testing in many IC companies. For static CMOS integrated circuits the power supply current is very low during the logic quiescent period (between circuit transitions). $I_{DDQ}$ is generally on the order of nano-amperes (nA) for very large scale integrated circuits and can be much lower for smaller scale integrated circuits. The state-dependent $I_{DDQ}$ is elevated if stuck-at or other types of faults are present in the CMOS circuit. The basic working mechanism of $I_{DDQ}$ testing is shown in Fig. 2-5.



(a) A faulty CMOS component

(b) Responses of  $V_{OUT}$  and  $I_{DD}$ 

Fig. 2-5. V/I schematics of CMOS IC showing IDDQ testing principles

For the inverter shown in Fig. 2-5, the supply current rises well above the fault-free quiescent value when the input signal, $V_{IN}$, makes a transition from logic 1 to 0. For a fault-free circuit, the current falls back to the very small quiescent value once the transition is completed and the circuit returns to steady state. However, if the PMOS transistor has a gate-source short defect, indicated by \*DEFECT in the figure, $I_{DDQ}$ stays at a value much higher than that of the fault-free circuit, because the short creates a current path between $V_{DD}$ and $V_{IN}$ that persists after the transition. An $I_{DDQ}$ measurement made after the circuit has stabilized, marked with an arrow in Fig. 2-5, can thus detect the fault. To achieve high fault coverage, $I_{DDQ}$ should be measured under multiple test vectors, since different test vectors put the circuit components in different states and the circuit under different working conditions.
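The pass/fail decision over multiple test vectors can be sketched as a threshold check on the settled current. The function name `iddq_failures`, the 1 µA threshold and the measured currents are illustrative assumptions; in practice the threshold depends on the process and on the fault-free leakage of the circuit.

```python
def iddq_failures(measurements, threshold_a):
    """Return the test vectors whose settled supply current exceeds the threshold.

    measurements maps a test vector (as a string) to the quiescent current,
    in amperes, measured after the circuit has stabilized.
    """
    return [vec for vec, current in measurements.items() if current > threshold_a]


if __name__ == "__main__":
    # Hypothetical settled currents: nA range when fault free, far higher
    # when a defect opens a current path for a particular input state.
    measured = {"00": 2e-9, "01": 3e-9, "10": 4.5e-5, "11": 2e-9}
    failing = iddq_failures(measured, threshold_a=1e-6)
    print("faulty" if failing else "fault free", failing)
```

In this hypothetical data only vector "10" activates the defect, which illustrates why a single vector is not enough: the elevated current appears only for input states that place the defective component in a conducting path.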

In the study of Rajsuman [30], it was reported that $I_{DDQ}$ testing remains effective even for IC chips fabricated in deep submicron technology, and that the difficulty caused by leakage current can be overcome by means of substrate bias, lower $V_{DD}$ and lower temperature. Xu et al [31] proposed an algorithm to locate multiple defects in a circuit using $I_{DDQ}$-based test. First, the current values from the test nodes that might be fault positions are calculated. Node pairs with opposite values are possible fault positions and are recorded as a set of candidate defect locations. A further algorithm then reduces the size of the candidate sets. Xie and Wang [32] proposed compact $I_{DDQ}$ test sets for bridging faults in combinational circuits; the sets are generated through local optimizing, choosing of bridging faults and generic algorithm so that faults can be tested and located quickly. Xue and Walker [33] proposed a built-in current sensor design that can monitor $I_{DDQ}$ with a resolution of 10µA. This built-in system translates the current-level information into digital signals with scan chain readout.

#### **2.4 MEMS devices testing**

Micro-ElectroMechanical System (MEMS) devices are now beginning to be integrated in Systems On Chip together with traditional digital and analogue circuits. There are also many other kinds of micro-systems, such as the Micro-OptoElectroMechanical System (MOEMS) and the Micro-ElectroFluidic System (MEFS). The rapid development of these devices benefits from techniques such as surface machining, bulk micromachining and so on. Mass production of MEMS devices is now established in industry.

Testing MEMS devices in high volume involves multi-domain issues. The testing is not a purely electrical process, and the non-electrical stimuli and responses make MEMS testing much more difficult. At present, a custom test setup is designed for each specific MEMS device, and test quality is difficult to quantify. Research efforts are needed to establish structural test methods.

It has been shown in the recent SPIE publication [35] concerning "Testing and Characterization of MEMS/MOEMS" that in many cases the packaging and testing costs of MEMS devices can reach up to 80% of the total product cost, while testing alone costs around 30%. Effective testing methods are badly needed to reduce time to market and product cost. Kerkhoff [36] pointed out that packaging has a significant influence on the test result of MEMS devices, so the final testing should include the packages. In a recent conference, Gueissaz [37] presented a leak detection method for MEMS packages with small cavity volumes that can detect both extremely fine and gross leaks simultaneously. Many MEMS devices are sensitive to the atmosphere surrounding their internal structure because of the chemical and physical interaction between the atmosphere and the internal structure surfaces. A cumulative chemical reaction (oxidation) on chip layers can be used to detect a test gas (oxygen), and the oxidation level of these layers can be assessed by a simple optical transmission measurement in the infrared region. Mailly et al [38] realized on-line testing of MEMS by superimposing a thermal variation on the normal operating mode through modulation of the electro-thermal excitation, and then processing the sensor output to extract the thermally induced signal for signature analysis. Litovski et al [39] modeled the non-electronic parts of MEMS devices as black boxes constructed with artificial neural networks, introducing new concepts of MEMS simulation, test and diagnosis to reduce time cost and improve product reliability.

#### **2.5 Signal integrity testing**

Signal integrity (SI) is the capability of an electronic signal to generate correct responses in a circuit. Violations of signal integrity include glitches, voltage overshoot/undershoot, oscillation, excessive signal delay and even signal speedup. With the feature size of VLSI circuits decreasing into the deep sub-micron (DSM) range and the working frequency of high performance IC chips increasing to multiple GHz, signal integrity has become a major concern in IC testing and design. This is especially true for the interconnects between cores embedded in an SOC, because these wires are typically much longer than those connecting logic gates inside a core. If the noise-induced voltage swing or timing skew departs from the signal tolerance region, a functional error may occur. In addition, voltage overshoot may damage the circuits and greatly shorten the life of the IC chip. Some sources state that one in five IC chips fails today due to signal integrity related problems.

The causes of signal integrity problems include crosstalk between adjacent wires, electromagnetic interference, power supply drop, etc. In many cases crosstalk, which results from the coupling capacitance and inductance between adjacent wires, is the main cause. Crosstalk is becoming more serious because the spacing between circuit wires decreases with the shrinking feature size of VLSI circuits and the signals on the wires switch more sharply at high working frequencies. Crosstalk between adjacent interconnect lines has become a major performance limiting factor for current IC chips.

Many physical design methods and fabrication solutions (e.g., [42]) have been proposed in the literature to deal with signal integrity related problems. However, none of these methods is guaranteed to solve all signal integrity related problems. In addition, process variation and manufacturing defects may aggravate the coupling effects between interconnects [59]. Since it is unacceptable to over-design VLSI circuits to tolerate all possible process variations, and it is impossible to predict the occurrence of fabrication defects, manufacturing test strategies are essential to detect signal integrity related errors.

To test signal integrity related problems on IC chips, various signal integrity fault models [46, 52, 61] and test methodologies [43, 62] have been proposed in the literature. However, none of these methodologies is both effective in terms of fault coverage and efficient in terms of testing time. Signal integrity related problems are aggravated in core-based SOC designs [58] because the interconnect wires carrying signals between embedded cores tend to be long, typically in the millimeter range, and hence suffer more from parasitic effects such as coupling capacitance and inductance. Nevertheless, most prior work in modular SOC testing focuses on core internal testing only, without considering the increasingly important signal integrity faults on core external interconnects. When only open/short faults are considered for interconnect lines, the test method is simple and the test time is short, so that the interconnect test time can almost be ignored compared with the core internal test time. However, the interconnect test time becomes much longer, comparable to that of the core internal test, if signal integrity problems are included in the testing.

When signal integrity problems resulting from cross-coupling (or crosstalk) among SOC interconnects are considered, the interconnect on which the error effects take place is denoted as victim while the affecting interconnect is denoted as aggressor. Usually, when a single crosstalk event is studied, there is only one victim and one or multiple aggressors. That is to say, research in interconnect cross-coupling is based on how a single interconnect wire is affected by other interconnect wires.

Wrapper cells that surround the cores for the purpose of testing should also be modified to facilitate the signal integrity test. At the receiving end of the interconnect, an integrity loss sensor (ILS) should be placed in the wrapper cells so that violations of signal integrity, including voltage overshoot/undershoot and excessive delay, can be detected during the test. At the driving end of the interconnects, the wrapper cells are required to generate test patterns, or to shift in test patterns, to be applied to the interconnects in test mode. This differs from core internal test, in which the driving ports of the interconnects, corresponding to the output ports of the embedded cores, capture test responses, and it makes the interconnect signal integrity test very different from core internal test.

#### 2.6 Other testing methods

Resources already existing on chip can facilitate the IC testing with much less hardware overhead. For example, in Hwang and Abraham's study [3], a microprocessor on chip accesses the ports of embedded cores through existing system and peripheral bus to feed test stimuli and capture test responses, which can significantly reduce hardware overhead for testing.

Parallel testing can greatly reduce test time and thus it's better to apply parallel testing at various levels. Arora et al [11] proposed a parallel diagnosis scheme to test memory arrays. The embedded memory arrays are accessed using a bidirectional and serial interface which minimizes the routing overhead introduced by the diagnosis hardware.

Test architecture and procedure should also be optimized to reduce test cost and improve fault coverage. In Zhao and Upadhyaya's work [34], a test scheduling algorithm for embedded core based SOC is presented in which the testing actions of a chip are arranged to balance the resource usage required by each core, so that the total test application time is minimized. To this end, the testing sequence of all the cores to be tested needs to be selected for the chip, and for each core a test method should be selected from a set of alternative test methods with different resource requirements and test times so as to minimize the total SOC test time.

# Chapter 3 SELF TESTABLE FULL RANGE WINDOW COMPARATOR (FRWC)

A voltage window comparator is the device that can judge whether the voltage of a signal is in a specific voltage window. A simple voltage window comparator can be built with two Operational Amplifiers and one XOR gate, which is shown in Fig. 3-1.



Fig. 3-1. Window comparator example

The design effort for this kind of window comparator structure is small, but the Operational Amplifiers (Op Amps) have to operate in full swing mode, i.e. the outputs of the Op Amps have to transit from the positive power supply to the negative power supply, or vice versa, whenever the outputs change. This operation mode results in high power consumption and long transition time. In the example of the Op Amp described by Wong [9], the slew rate (SR) is 2.2 V/$\mu$s. If the positive and negative rail power supplies are +5 V and -5 V respectively, the Op Amp's output needs about 4.6 $\mu$s to transit from one rail voltage to the other. Consequently, the operating frequency of this kind of window comparator cannot exceed about 200 kHz, even without considering other constraints.
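The transition time quoted above follows directly from the slew rate. The short sketch below reproduces the arithmetic; the function name is illustrative only, and the frequency bound assumes at most one rail-to-rail transition per period, which is an assumption made here to recover a figure close to the 200 kHz limit.

```python
def rail_to_rail_time_us(v_pos, v_neg, slew_rate_v_per_us):
    """Time (in microseconds) for a slew-rate-limited output to swing between the rails."""
    return (v_pos - v_neg) / slew_rate_v_per_us


# Values taken from the text: +5 V / -5 V rails, SR = 2.2 V/us.
t_us = rail_to_rail_time_us(5.0, -5.0, 2.2)

# Assumption: one full transition per period, so f_max = 1 / t.
f_max_khz = 1e3 / t_us

print(f"transition time = {t_us:.2f} us, f_max = {f_max_khz:.0f} kHz")
```

With these values the swing takes roughly 4.5 $\mu$s, so the achievable frequency is on the order of 200 kHz.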

To overcome the above-mentioned shortfalls, in this thesis we propose a novel analogue voltage window comparator design and the proposed circuit schematic is shown below as Fig. 3-2.



Fig. 3-2. The proposed full range window comparator

Previous versions of this kind of window comparator design were published by Wong and Zhang in [12, 21], and the complete design and analysis are presented by Zhang, Wong and Li in [40]. This window comparator structure has the advantage that the Operational Amplifiers do not swing between their positive and negative rail voltages, so that delay time and power consumption are reduced. In addition to the improvement in the window comparator circuit, we also propose supporting circuits, such as a self test circuit, to address the possibility that faults occur in the comparator circuit itself.

Several important components and the working mechanism of this window comparator are described in detail in the following sections.

#### 3.1 Full range window comparator

In Fig. 3-2,  $V_i$  is the analogue voltage to be compared;  $V_{refh}$  and  $V_{refl}$  are the high and low reference voltage ( $V_{ref}$ ) respectively. The voltage window comparator composed of operational amplifiers, inverters and NOR gate together with the resistors can perform the logic function that  $V_o$  will output logic 1 only in the case that  $V_{refl} < V_i < V_{refh}$ , which means that the testing voltage  $V_i$  is in the voltage window specified by  $V_{refh}$  and  $V_{refl}$ .

The output of Op Amps OA 1/OA 2 in Fig. 3-2 is $V_{OA} = V_i - V_{ref}$, while in the window comparator structure shown in Fig. 3-1 the outputs of the operational amplifiers are either the positive or the negative power supply. In this way the swing range of the operational amplifiers is limited so that both transition time and power consumption can be reduced. The outputs of OA 1 and OA 2 are fed into the following inverters. $V_o$ will output logic 1 only in the case that $V_{refl} < V_i < V_{refh}$, while in all other cases $V_o$ will be logic 0. A NOR gate and an inverter in the upper branch are better than an XOR gate because the former combination costs less hardware and eliminates the danger that $V_o$ outputs logic 1 when $V_{refl} > V_i > V_{refh}$, while at the same time the delay time is reduced.
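The logic function described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the Op Amps are ideal subtractors, the inverters are ideal with a 0 V threshold, and the upper branch is assumed to be the one compared against $V_{refh}$ (the branch assignment is taken from the description of the extra inverter, not confirmed by the schematic). All function names are hypothetical.

```python
def inverter(v_in, threshold=0.0):
    """Ideal inverter: logic 1 when the analogue input is below the threshold."""
    return 1 if v_in < threshold else 0


def nor(a, b):
    """Two-input NOR on logic values."""
    return 0 if (a or b) else 1


def frwc_output(v_i, v_refl, v_refh):
    """Behavioural model of the window comparator output V_o."""
    v_oa1 = v_i - v_refh          # assumed upper branch: V_OA = V_i - V_refh
    v_oa2 = v_i - v_refl          # assumed lower branch: V_OA = V_i - V_refl
    upper = 1 - inverter(v_oa1)   # inverter followed by the extra inverter
    lower = inverter(v_oa2)       # single inverter
    return nor(upper, lower)


print(frwc_output(1.0, 0.0, 2.0))   # input inside the window
print(frwc_output(3.0, 0.0, 2.0))   # input above the window
print(frwc_output(1.0, 2.0, 0.0))   # inverted references
```

In this model the output stays at logic 0 when the two references are swapped, which is the case an XOR-based output stage would misreport.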

#### 3.2 Self-test circuit

To avoid the potential problem that a faulty FRWC is used to test circuits in an SOC environment, an additional circuit, called the self test circuit and shown in Fig. 3-3, is introduced to construct a self-testable full range window comparator (FRWC).



Fig. 3-3. Self test circuit for FRWC

There are two branches in the FRWC circuit and one of them is tested at a time, while the inputs of the other branch are set to provide a logic 0 at the corresponding input of the NOR gate. The self test circuit is designed to provide a set of analogue voltages for the voltage comparator circuit. In this way a complete self test set is formed, which makes the FRWC self-testable. The switches in the self-test circuit, $S_{1-6}$, are operated in a certain sequence so that faults in the window comparator circuit can be detected. The analogue voltages required for self test need not be very accurate, i.e. it is sufficient that they lie in their corresponding ranges. Therefore, there is no strict requirement on the resistor values in the self test circuit, and in fact the resistors can be replaced by transistors working as resistors.

During FRWC self test a decision is to be made whether FRWC passes the self test according to the self test result. In some cases the FRWC output signal will not stay at standard digital logic 0 or 1 voltage level resulting from catastrophic or parametric faults happening to the circuit. This makes it difficult to make the pass/fail decision, so a special circuit shown in Fig. 3-4 is designed to identify the FRWC output.



Fig. 3-4. FRWC output identification circuit

In our experiments the thresholds of INV 1 and INV 2 are 4 V and -4 V respectively, while all other inverters have 0 V thresholds, corresponding to the power supplies of +5 V and -5 V in the simulation. When S stays at logic 1, the input voltage, $V_{in}$, is checked to see whether it is at the logic 1 level: $V_{out}$ will be logic 1 only if $V_{in}$ is higher than 4 V. When S stays at logic 0, $V_{in}$ is checked to see whether it is at the logic 0 level: $V_{out}$ will be logic 1 only if $V_{in}$ is lower than -4 V. With the help of this circuit one can determine whether an analogue voltage output by the FRWC circuit in test mode is at the desired logic level or not.
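The behaviour of the identification circuit can be summarised in a few lines. The sketch below is a purely behavioural model using the 4 V and -4 V thresholds stated above; the function name and parameter names are hypothetical and no attempt is made to model the transistor-level circuit.

```python
def identify(v_in, s, v_high=4.0, v_low=-4.0):
    """Return 1 when v_in is at the logic level selected by s, otherwise 0.

    s == 1: check for a valid logic 1 (v_in above v_high).
    s == 0: check for a valid logic 0 (v_in below v_low).
    """
    if s == 1:
        return 1 if v_in > v_high else 0
    return 1 if v_in < v_low else 0


# A degraded output of 2 V is rejected whichever level is expected.
print(identify(4.8, 1), identify(2.0, 1), identify(2.0, 0), identify(-4.9, 0))
```

A voltage stuck between the two thresholds fails both checks, which is how an intermediate output level caused by a fault is flagged.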

# 3.3 Methods to achieve high accuracy

Parametric variation is unavoidable because circuits cannot be fabricated with all parameters exactly matching their design values. Therefore, parameter variation should be considered in the design stage. Various methods have been proposed to tolerate or compensate for the variation of circuit devices. For example, Dowlatabadi and Connelly have proposed an offset cancellation technique that reduces the input offset voltage of CMOS differential amplifiers by a factor set by the voltage gain of a feedback loop [14]. If process variation affects the two inverters following the operational amplifiers, the inverters' thresholds will deviate from their desired values so that the accuracy of the voltage window comparator is reduced. However, the accuracy can be adjusted to obtain high precision. An extra chip pin is needed during testing to adjust the switching thresholds of these two inverters to be very close to zero volts, so as to compensate for parameter variation caused by fabrication inaccuracy and achieve high precision of the window comparator. The details of this method are described in the following.

The switching threshold of the inverter can be expressed as:

$$V_{t} = \frac{V_{DD} + V_{tp} + V_{tn} \sqrt{\frac{\beta_{n}}{\beta_{p}}}}{1 + \sqrt{\frac{\beta_{n}}{\beta_{p}}}}$$

It can be seen that the switching threshold will change if the power supply $V_{DD}$ is changed. Therefore, the inverters' threshold variation can be compensated by changing one of the power supplies of the inverters, such as the positive power supply, to bring the threshold back to zero volts. In this way the window comparator can achieve high resolution.

Steps to determine the power supply of the inverter to get zero volts switching threshold are listed here:

- (a) There are two branches in the FRWC circuit and only the power supply for the inverter in one of them is adjusted at a time. The inputs of the other branch are set to provide a logic 0 at the corresponding input of the NOR gate. The two input ports of the branch to be adjusted are set to voltages with a small difference between them, so that the window comparator generates the correct judging signal only if the switching threshold of the inverter is very close to zero volts, because $V_{OA} = V_i - V_{ref}$;
- (b) Scan the power supply over a certain range, i.e. apply different voltages to the power supply of the inverter;
- (c) Once the window comparator generates the desired judging signal, the voltage currently applied is taken as the power supply of the inverter in the window comparator during test mode.
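Steps (a) to (c) can be sketched as a simple sweep. The model below is only an illustration under several assumptions that are not in the original text: the threshold expression is extended to a non-zero negative supply $V_{SS}$ by referring all voltages to $V_{SS}$, the device parameters are invented, and the "correct judging signal" is modelled as the threshold lying within a small input difference `delta` of zero volts. All names are hypothetical.

```python
import math


def inverter_threshold(v_dd, v_ss, v_tn, v_tp, beta_ratio):
    """CMOS inverter switching threshold; beta_ratio = beta_n / beta_p, v_tp < 0.

    Assumption: the single-supply expression is shifted by V_SS so that it
    can be used with a negative rail.
    """
    k = math.sqrt(beta_ratio)
    return (v_dd + v_tp + k * (v_ss + v_tn)) / (1.0 + k)


def scan_supply(v_start, v_stop, step, judge):
    """Step V_DD from v_start to v_stop; return the first value accepted by judge."""
    n = int(round((v_stop - v_start) / step))
    for i in range(n + 1):
        v_dd = v_start + i * step
        if judge(v_dd):
            return v_dd
    return None


# Hypothetical mismatched inverter: beta_n / beta_p = 1.21 instead of 1.
V_SS, V_TN, V_TP, RATIO = -5.0, 0.7, -0.7, 1.21
delta = 0.005  # half of the small input difference applied in step (a), in volts


def judge(v_dd):
    # The comparator judges +/-delta correctly only if |V_t| < delta.
    return abs(inverter_threshold(v_dd, V_SS, V_TN, V_TP, RATIO)) < delta


found = scan_supply(4.5, 6.0, 0.01, judge)
print(found)
```

For a perfectly matched inverter the same model gives a zero threshold at the nominal +5 V supply; with the assumed mismatch the sweep settles near 5.43 V instead.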

# 3.4 Analysis of resistor variation in FRWC

When the resistors in the FRWC circuit deviate from their desired values, the accuracy of the FRWC is reduced. The following provides a detailed analysis of the influence of resistor variation on FRWC accuracy. In this section only the resistors' variation is considered; the operational amplifier input offset $V_{OS}$ is not taken into account.

The input voltages of OA 1,  $V_+$  and  $V_-$ , and output voltage,  $V_{OA}$ , are with the following relation:

$$V_{+} = \frac{R_{4}}{R_{3} + R_{4}} V_{i}, \quad V_{-} = \frac{V_{OA} - V_{ref}}{R_{1} + R_{2}} R_{1} + V_{ref} = \frac{R_{1}}{R_{1} + R_{2}} V_{OA} + \frac{R_{2}}{R_{1} + R_{2}} V_{ref}$$
Equ. 3-1

In steady state  $V_+ = V_-$  so that

$$V_{OA} = \frac{R_1 + R_2}{R_1} \left(\frac{R_4}{R_3 + R_4} V_i - \frac{R_2}{R_1 + R_2} V_{ref}\right)$$

Assume  $x = \frac{R_1}{R_2}$  &  $y = \frac{R_3}{R_4}$  so that

$$V_{OA} = \frac{x+1}{x} V_i (\frac{1}{y+1} - \frac{1}{x+1} \cdot \frac{V_{ref}}{V_i})$$
Equ. 3-2

In the ideal case that there is no catastrophic fault or parametric variation, $R_1 = R_2 = R_3 = R_4$ so that

$$x = 1, \quad y = 1 \quad \text{and} \quad V_{OA} = V_i - V_{ref}$$
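Equ. 3-2 is easy to evaluate numerically. The sketch below assumes an ideal Op Amp (no input offset, as in this section) and uses hypothetical resistor values; it checks the ideal case and shows one mismatch case in which the sign of $V_{OA}$ is wrong even though $V_i > V_{ref}$.

```python
def op_amp_output(v_i, v_ref, r1, r2, r3, r4):
    """V_OA from Equ. 3-2 with x = R1/R2 and y = R3/R4 (ideal Op Amp assumed)."""
    x = r1 / r2
    y = r3 / r4
    return (x + 1.0) / x * (v_i / (y + 1.0) - v_ref / (x + 1.0))


# Ideal case: all resistors equal, so V_OA = V_i - V_ref.
ideal = op_amp_output(2.0, 1.5, 10e3, 10e3, 10e3, 10e3)

# Hypothetical +/-5% mismatch in opposite directions, V_i only 2.5% above V_ref.
skewed = op_amp_output(2.0, 1.95, 9.5e3, 10.5e3, 10.5e3, 9.5e3)

print(ideal, skewed)
```

In the second case the output is negative although $V_i > V_{ref}$, so the following inverter would see the wrong polarity.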

Since the thresholds of the inverters following the Op Amp are set close to zero volts, errors may occur if the parametric variation is not negligible. Any significant deviation of the physical dimension(s) of a resistor in an IC may result in such a situation.

Assume $\left|\frac{V_{ref} - V_i}{V_i}\right| = a$ ($a \ge 0$). The value of $a$ shows the relative difference between $V_i$ and $V_{ref}$.

(a)  $V_i > 0$ 

(1)  $V_i > V_{ref}$ 

In this case,  $V_{ref} = V_i(1-a)$ . The correct output result of Op Amp is  $V_{OA} > 0$ if there is no fault happening to the window comparator circuit.

$$V_{OA} = \frac{x+1}{x} V_i \left( \frac{1}{y+1} - \frac{1}{x+1} \cdot \frac{V_{ref}}{V_i} \right) > 0 \quad \Rightarrow \quad y < \frac{x}{1-a} + \frac{a}{1-a} \quad \text{if} \quad a < 1$$
Equ. 3-3

If the values of x and y are outside the range described by the above inequality, the Op Amp's output will be $V_{OA} < 0$, which will make the following inverters and NOR gate output wrong logic values.

(2)  $V_i \leq V_{ref}$ 

In this case  $V_{ref} = V_i(1+a)$  and the Op Amp's output should be  $V_{OA} \le 0$ .

$$V_{OA} = \frac{x+1}{x} V_i \left(\frac{1}{y+1} - \frac{1}{x+1} \cdot \frac{V_{ref}}{V_i}\right) \le 0 \quad \Rightarrow \quad y \ge \frac{x}{1+a} - \frac{a}{1+a}$$
Equ. 3-4

(b)  $V_i < 0$ 

(1)  $V_i > V_{ref}$ 

In this case  $V_{ref} = V_i(1+a)$  and the Op Amp's output should be  $V_{OA} > 0$ .

$$V_{OA} = \frac{x+1}{x} V_i \left( \frac{1}{y+1} - \frac{1}{x+1} \cdot \frac{V_{ref}}{V_i} \right) > 0 \quad \Rightarrow \quad y > \frac{x}{1+a} - \frac{a}{1+a}$$
Equ. 3-5

(2)  $V_i \leq V_{ref}$ 

In this case  $V_{ref} = V_i(1-a)$  and the Op Amp's output should be  $V_{OA} \le 0$ .

$$V_{OA} = \frac{x+1}{x} V_i \left(\frac{1}{y+1} - \frac{1}{x+1} \cdot \frac{V_{ref}}{V_i}\right) \le 0 \quad \Rightarrow \quad y \le \frac{x}{1-a} + \frac{a}{1-a} \quad \text{if} \quad a < 1$$
Equ. 3-6

So finally it can be deduced that the variation of the resistors' values is required to stay within the range

$$y > \frac{x}{1+a} - \frac{a}{1+a} \quad \text{and} \quad y < \frac{x}{1-a} + \frac{a}{1-a} \quad \text{if} \quad a < 1$$
Equ. 3-7

For example, when a=0.05 the range becomes

$$\frac{x}{1.05} - \frac{0.05}{1.05} < y < \frac{x}{0.95} + \frac{0.05}{0.95}$$

This means that, in order to judge input voltages $V_i$ and $V_{ref}$ whose relative difference is as small as a = 5%, the resistor ratios x and y in the comparator circuit must stay between the two straight lines described by the above expression. The relation is shown in Fig. 3-5.



Fig. 3-5. Variation range of resistors to ensure the FRWC with 5% resolution
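The region bounded by the two lines of Equ. 3-7 can be checked for any pair of resistor ratios. The following sketch simply evaluates the two inequalities; the function name is hypothetical.

```python
def within_tolerance_region(x, y, a):
    """True when (x, y) satisfies both inequalities of Equ. 3-7 for resolution a."""
    if not 0.0 <= a < 1.0:
        raise ValueError("a must satisfy 0 <= a < 1")
    lower = (x - a) / (1.0 + a)   # y must lie above this line
    upper = (x + a) / (1.0 - a)   # y must lie below this line
    return lower < y < upper


print(within_tolerance_region(1.0, 1.0, 0.05))      # ideal ratios
print(within_tolerance_region(1.0, 1.04, 0.05))     # small mismatch
print(within_tolerance_region(0.905, 1.105, 0.05))  # large opposite mismatch
```

The last pair lies outside the two lines for a = 5%, so a comparator built with such ratios cannot be relied on to resolve a 5% difference.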

Summarizing the above description and analysis, the window comparator's accuracy will reduce because of the resistors' variation. In the ideal case, where x=1 and y=1, the input voltages of the Op Amps are

$$V_{+} = \frac{V_{i}}{2}, \quad V_{-} = \frac{V_{OA} + V_{ref}}{2}$$

The above equation no longer describes the input voltages if the resistors deviate from their designed values. This situation can also be viewed as the effective $V_i$ and $V_{ref}$ seen by the Op Amp deviating from their initial values. Therefore, the problem can be analyzed by considering that the reference voltages are not absolutely fixed but lie in an ambiguity range set by the accuracy of the resistors in the window comparator circuit. The locations of the effective reference voltages in this ambiguity range, seen from the point of view of the Op Amps, are determined by the actual values of the resistors, which are somewhat away from their designed values and cannot be known at the design stage. Therefore, the output of the window comparator is uncertain when the value of $V_i$ lies in the ambiguity range, because the resistors' variation cannot be predicted before fabrication. The comparison result output by the window comparator, judging whether $V_{refl} < V_i < V_{refh}$ or not, is shown in Fig. 3-6 for different values of $V_i$. From the output of the window comparator it can be deduced that (b = a here):

$$\begin{array}{l} \text{Pass} \Rightarrow (1-b) V_{refl} < V_i < (1+b) V_{refh} \\ \text{Fail} \Rightarrow V_i < (1+b) V_{refl} \quad \text{or} \quad (1-b) V_{refh} < V_i \end{array}$$
Equ. 3-8



Fig. 3-6. Result from window comparator corresponding to different  $V_i$  values

Take the case that resistors are fabricated with 5% accuracy as an example. If the resistor values deviate in the same direction and by the same degree, there is no influence on the window comparator's accuracy because the resistor ratios, x and y, do not change although the value of each single resistor changes. In the worst case, when $R_1$ and $R_2$, or $R_3$ and $R_4$, change by 5% in opposite directions, x and y take the two extreme values 0.905 (0.95/1.05) and 1.105 (1.05/0.95). This results in a maximum a = b = 0.105 to cover the variation region, which means that 10.5% accuracy can be achieved for the window comparator in this situation. Fig. 3-7 shows the resistors' variation range (shaded area) and the corresponding boundaries that enclose this area.



Fig. 3-7. Accuracy of FRWC corresponding to 5% variation of resistors

Finally, it can be deduced from the above analysis that, if the resistors in the comparator circuit are fabricated with an accuracy of w, the voltage window comparator can achieve an accuracy of $\frac{2w}{(1-w)}$ under such parameter variation.
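The closed form $\frac{2w}{1-w}$ can be cross-checked numerically against the inequalities of Equ. 3-7. The sketch below is only a verification aid with hypothetical function names: solving each inequality for $a$ shows that a point (x, y) needs $a > |x - y| / (1 + y)$, and the largest such value over the four corners of the variation region is taken.

```python
def ratio_extremes(w):
    """Extreme resistor ratios when two resistors deviate by w in opposite directions."""
    return (1.0 - w) / (1.0 + w), (1.0 + w) / (1.0 - w)


def required_a(x, y):
    """Smallest a whose Equ. 3-7 boundaries just reach the point (x, y)."""
    return abs(x - y) / (1.0 + y)


def worst_case_accuracy(w):
    """Largest required a over the four corners of the (x, y) variation region."""
    lo, hi = ratio_extremes(w)
    return max(required_a(x, y) for x in (lo, hi) for y in (lo, hi))


lo, hi = ratio_extremes(0.05)
acc = worst_case_accuracy(0.05)
print(round(lo, 3), round(hi, 3), round(acc, 3))
```

For w = 5% the corner values and the resulting accuracy agree with the 0.905, 1.105 and 10.5% figures given in the text.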

#### 3.5 Simulation result of FRWC

Catastrophic faults in the FRWC circuit have been simulated in PSPICE and the result is shown here. The self test circuit described previously, without any other circuits or external test equipment, is used to test the FRWC circuit with one fault injected at a time. The reasons for injecting one fault at a time are:

[1]. Injecting multiple faults results in much higher computation time. For example, if there are N possible faults in a circuit, the number of situations grows on the order of $N^2$ if two faults are injected at a time. It is impractical to run on the order of $N^2$ simulations for a typical circuit because of limited time and budget.

[2]. Injecting one fault at a time covers most cases that occur in reality, so it is acceptable in industry. In addition, multiple faults injected at the same time might mask the effect of one another, resulting in undetectable circuit faults.

#### **Fault Models:**

In this chapter we have considered all possible catastrophic faults, which consist of short and open faults in resistors and transistors, except transistor gate contact open faults. Fault simulations have been carried out with an open fault modeled as a 1 G$\Omega$ resistor and a short fault modeled as a 1 $\Omega$ resistor. For example, when an open fault happens to the source of a transistor, a 1 G$\Omega$ resistor is added between this source port and the corresponding circuit node; when a short fault happens between the drain and gate of a transistor, a 1 $\Omega$ resistor is added between these drain and gate ports.

#### Simulation procedure and analysis:

The faults are injected one at a time into the circuit during simulation. Multiple faults occurring at the same time are not covered in this thesis. Fault simulations have been carried out using the PSPICE program with the 0.5 $\mu$m CMOS process parameters listed by Monpapassorn in [15]. Fault simulation is performed at circuit level, i.e. all logic gates and Op Amps are flattened to transistor level.

Fig. 3-8 shows the circuit configuration for self test. The inputs of the window comparator are connected with those outputs of the self test circuit in test mode while in normal operation mode the inputs of the window comparator are connected with the reference voltages and the scanned voltages from the nodes of SOC cores. In test mode the output of the window comparator is connected with the self test identification circuit which is described previously.

The self test procedure mainly consists of two steps:

- (1)  $V_{refl}$  is connected with  $V_{SS}$ , the negative power supply, while  $V_{refh}$ ,  $V_i$  of FRWC are connected with  $V_{ref}$ ,  $V_i$  signals outputted from the self test circuit respectively. At this time the upper branch is tested while the lower branch provides a logic 0 at the connection node with the input of the final NOR gate. Those switches in the self test circuit will be open one at a time in a certain sequence to provide desired voltages to the window comparator.
- (2)  $V_{refh}$  is connected with  $V_{DD}$ , the positive power supply, while  $V_{refl}$ ,  $V_i$  of FRWC are connected with  $V_{ref}$ ,  $V_i$  signals from the self test circuit respectively. At this time the lower branch is tested while the upper branch provides a logic 0 at the connection node with the input of the final NOR gate. Those switches in the self test circuit will also be open one at a time in a certain sequence to provide desired voltages to the window comparator.



Fig. 3-8. Block diagram of the FRWC setup in self test mode

Each analogue signal from the self test circuit passes through a voltage follower (VF) and a CMOS switch before it arrives at the FRWC. The voltage follower provides high input impedance and high driving capability for the next stage, while the CMOS switch isolates the self test circuit from the FRWC when the FRWC is used to test other circuits on chip.

In addition, the signals that are required to control the switches  $S_{1-6}$  in self test circuit and their generation circuits are shown in Fig. 3-9.



(a) Switch control signals



(b) Control signal generators

Fig. 3-9. Switch control signals for self test circuit and the generators

The control signals are designed to force the FRWC output to change between high and low voltages, i.e. logic 1 and 0, so that possible circuit faults can be exposed. As can be seen from Fig. 3-9, the switches $S_{1-6}$ are closed in the sequence $S_1$, $S_5$, $S_3$, $S_4$, $S_2$, $S_6$. As a result, the relative magnitude of the voltages from the self test circuit, $V_i$ and $V_{ref}$, changes back and forth: $V_i$ will be higher, lower, higher, lower, higher, lower than $V_{ref}$ corresponding to the switch closing sequence $S_1$, $S_5$, $S_3$, $S_4$, $S_2$, $S_6$.

When the upper branch in the window comparator is tested, the window comparator should output a series of logic 101010 corresponding to the relative magnitude of  $V_i$  and  $V_{ref}$  resulting from the switch closing sequence of  $S_1$ ,  $S_5$ ,  $S_3$ ,  $S_4$ ,  $S_2$ ,  $S_6$  if there is no fault happening to the window comparator circuit. When the lower branch is tested the window comparator should output a series of logic 010101, different from the former 101010, corresponding to the same switch closing sequence of  $S_1$ ,  $S_5$ ,  $S_3$ ,  $S_4$ ,  $S_2$ ,  $S_6$  for a fault-free window comparator circuit.

The input S of the output identification circuit tells the circuit what logic value is expected for the input voltage to be tested. The identification circuit will output logic 1 if the input voltage accords with the expected logic value, that is to say, the voltage under test is at the desired logic level. Otherwise the output will be logic 0. Correspondingly, the input port *S* of the identification circuit should be 101010 when the upper branch of the window comparator is tested and 010101 when the lower branch is tested. If there is no fault in the window comparator circuit, the identification circuit will output a series of logic 1 during the FRWC self test. However, the identification circuit is first controlled to output logic 0 in order to detect a possible stuck-at-1 fault in the identification circuit itself.
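The pass/fail decision described above can be sketched as follows. The model is behavioural only and rests on the values given in this chapter (expected sequences 101010 and 010101, identification thresholds of 4 V and -4 V); the function names and the sample voltages are hypothetical, and the initial logic-0 check for a stuck-at-1 identification circuit is not modelled.

```python
# Expected FRWC output (and therefore the S input) for the six self test steps.
EXPECTED = {
    "upper": [1, 0, 1, 0, 1, 0],
    "lower": [0, 1, 0, 1, 0, 1],
}


def identify(v_in, s, v_high=4.0, v_low=-4.0):
    """Return 1 when v_in is a valid logic level matching the expected value s."""
    if s == 1:
        return 1 if v_in > v_high else 0
    return 1 if v_in < v_low else 0


def self_test_passes(branch, output_voltages):
    """True when the identification circuit outputs logic 1 in every step."""
    expected = EXPECTED[branch]
    if len(output_voltages) != len(expected):
        return False
    return all(identify(v, s) == 1 for v, s in zip(output_voltages, expected))


good = [4.9, -4.9, 4.8, -4.7, 4.9, -4.9]
degraded = [4.9, -4.9, 2.0, -4.7, 4.9, -4.9]  # one step stuck at a mid-level voltage
print(self_test_passes("upper", good), self_test_passes("upper", degraded))
```

A single step that settles at an intermediate voltage is enough to make the whole self test fail in this model.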

In the design proposed here the self test circuit provides self-test for the window comparator. Fault simulation is performed at circuit level, i.e. all logic gates and Op Amps are flattened to transistor level. A total of 156 single short and open faults of the FRWC are simulated with one fault injected at a time. The simulation result shows that 6 out of 156 faults cannot be detected if only the self test circuit is applied. However, further investigation reveals that these undetected faults could be detected if a test technique other than the voltage-based method, such as $I_{DDQ}$ measurement, were used. For example, if an open fault happens to the drain port of transistor M8 in the lower Op Amp, the current drops from the fault-free value of 220 $\mu$A to 80 $\mu$A.
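Taking fault coverage as the ratio of detected faults to simulated faults (a definition assumed here for illustration), the figures above give the coverage of the voltage-based self test alone:

$$\text{fault coverage} = \frac{156 - 6}{156} = \frac{150}{156} \approx 96.2\%$$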

In conclusion, an effective self-testable full range window comparator design has been presented in this chapter. With the self test circuitry, high fault coverage of the FRWC is achieved: 150 of the 156 simulated faults are detected by the self test alone. Furthermore, the effect of resistor variation on the FRWC's performance has been analyzed so that the FRWC's resolution under parametric resistor faults can be determined. The major advantages of the proposed self-testable design are that it is easy to implement and has small hardware overhead.

# Chapter 4 BIST SCHEME USING SCAN CHAIN AND FRWC

A new type of Built-In Self-Test (BIST) structure for SOC testing is proposed here based on scan chain structure and the above-mentioned full range window comparator (FRWC). The basic idea of this kind of BIST system is to scan those voltages of internal nodes in the cores of SOC and to judge whether these voltages are in their corresponding tolerance range or not. Faulty condition of the cores can then be diagnosed from the judging results. The basic schematic structure of this kind of BIST system is shown in Fig. 4-1. The circuit built to implement this kind of BIST system is mainly made up of five parts:

- (1) Full Range Window Comparator (FRWC). It has already been described in detail in Chapter 3. Of course, designers can use other types of voltage comparators in the BIST architecture.
- (2) Testing controller. The main function of this circuit block is to control the whole testing procedure. Once it receives a valid signal, either from an external ATE or from a built-in microprocessor on chip, indicating the beginning of the test of the cores in the SOC, the test controller activates the BIST system and the test procedure is initiated. In the first step the test controller makes the FRWC go through self test, i.e. checks whether there is an error in the FRWC circuit. Once the test controller judges, by means of the identification circuit, that the FRWC passes the self test, it sends a valid signal to the core selecting block, which then generates pre-determined core selecting signals to activate the cores in the SOC one by one. The cores are enabled one at a time so that the desired analogue voltages in each core are output to the FRWC to be evaluated serially.



Fig. 4-1. BIST structure based on scan chain and FRWC

- (3) Self test circuit. This block provides analogue voltage sets to test the Full Range Window Comparator (FRWC), i.e. to check whether there are errors happening to the FRWC before testing the analogue circuits in SOC using the FRWC.
- (4) Core selecting block. It generates valid signals to activate those cores in the SOC one by one such that the cores are tested serially in a pre-determined order.
- (5) Test interface in each core of SOC. It provides test access to the cores on chip in test mode. The design of test interface should coincide with the other parts of the BIST system. For example, test interface is expected to receive activating signal from the core selecting block and to send out a signal indicating that this core has finished outputting all needed voltages.

### 4.1 Core selecting mechanism

The first problem to be solved in this BIST system is how to activate the cores in an SOC. For the core selecting mechanism used in the core selecting block and in the test interface of each core, three options are proposed here: (1) Direct structure, (2) Chain structure and (3) Combined structure of these two.

- (1) Direct structure: The core selecting block directly generates a valid selecting signal for every core in the SOC, one by one. That is, there is a line between each core and the core selecting block to transfer the core selecting signal. At any time only one core is activated by its core selecting signal, while all other core selecting signals are invalid and the other cores are disabled from outputting voltage signals. In this way only the selected core sends out its desired analogue voltages one by one, without any interference from analogue voltages of other cores in the SOC. Since every core may have a different number of desired test points, the active time of each core selecting signal is different. The core selecting block is therefore required to generate a set of core selecting signals with different active durations.

- (2) Chain structure: In this structure there is in fact no need to build a core selecting block. Each core is equipped with a core selecting signal receiving port and a core selecting signal generating/outputting port in its test interface circuit. Once a core receives an active core selecting signal at the receiving port it will fan out those desired analogue voltages. After it finishes outputting all those voltages the core will generate an active core selecting signal at the outputting port. In this BIST system the core selecting signal receiving port of each core is connected with the core selecting signal outputting port of the previous core except that the first core receives core selecting signal from the test controller. In this way all cores together look like a chain and the cores in SOC will be activated one after another like a ripple spreading along a line.
- (3) Combined structure: In this structure both direct and chain core selecting structures are adopted. A core selecting block is needed but this block is simpler than that used in the case when all cores are selected directly by signals from the core selecting block. Usually, several important cores are activated by signals directly from the core selecting block while those cores embedded near or in a large core are activated by signals propagating through the chain structure.
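
The three options differ only in where each core's selecting signal comes from. The following is a minimal behavioural sketch in Python, not the circuit used in this thesis. The `Core` class, the `selected_by` field and the `activation_order` function are hypothetical names introduced for illustration, and the sketch assumes that each core drives at most one following core.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Core:
    name: str
    # Name of the previous core whose outputting port drives this core's
    # receiving port. None means the selecting signal comes directly from
    # the core selecting block (or from the test controller for the first
    # core of a pure chain). Hypothetical modelling choice.
    selected_by: Optional[str] = None


def activation_order(cores):
    """Return the core names in the order in which they are activated."""
    followers = {c.selected_by: c for c in cores if c.selected_by is not None}
    order = []
    for core in cores:
        if core.selected_by is not None:
            continue  # reached through a chain, not selected directly
        current = core
        while current is not None:
            order.append(current.name)
            current = followers.get(current.name)  # ripple to the next core
    return order


direct = [Core("core1"), Core("core2"), Core("core3")]
chain = [Core("core1"), Core("core2", "core1"), Core("core3", "core2")]
combined = [Core("core1"), Core("core2"), Core("core3", "core2"), Core("core4")]

print(activation_order(direct))
print(activation_order(chain))
print(activation_order(combined))
```

In all three cases exactly one core is active at a time; only the source of the selecting signal changes.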

# 4.2 Test controller

The test controller sequences the test procedure. Once a valid test mode signal reaches the test controller, indicating the beginning of the SOC test, it first makes the FRWC perform its self test. The test controller then checks the result of the FRWC self test by means of the identification circuit to judge whether the FRWC contains an error. If the result of the window comparator self test is correct, the test controller sends a valid signal to the core selecting block, which generates the signals that activate the cores in the SOC to send the desired analogue voltages to the FRWC.

The response of the window comparator to the analogue voltage sets generated by the self test circuit should be a known series of binary bits. Therefore, the circuit in the test controller that judges the self test result should be able to detect this stream of binary bits during the FRWC self test. If the stream cannot be detected, the FRWC is marked as faulty and the subsequent testing procedure is not activated.
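
The pass/fail decision described above amounts to comparing the captured FRWC output with a known bit stream. A minimal sketch follows, assuming the bits are captured one per clock. The function name and the bit pattern `EXPECTED` are hypothetical placeholders, since the actual pattern is fixed by the self test circuit and is not reproduced here.

```python
def self_test_passed(observed_bits, expected_bits):
    """Return True only if the captured FRWC output equals the expected stream."""
    return list(observed_bits) == list(expected_bits)


# Hypothetical expected response, for illustration only.
EXPECTED = [0, 1, 0, 1, 1, 0, 1, 0, 1]

good = [0, 1, 0, 1, 1, 0, 1, 0, 1]
bad = [0, 1, 0, 1, 1, 0, 1, 0, 0]  # last bit differs

print(self_test_passed(good, EXPECTED))
print(self_test_passed(bad, EXPECTED))
```

Any mismatch, including a stream of the wrong length, marks the FRWC as faulty.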

# **4.3 Testing interface in SOC cores**

The shaded area at the top of each core in Fig. 4-1 represents the test interface circuit attached to the ordinary circuit of a core. In this kind of BIST system the function of the test interface is to receive the core selecting signal, to output the desired voltages serially and, if the chain core selecting structure is used, to generate the core selecting signal for the next core.

Several kinds of structure, such as those proposed by Wey [16] and Wurtz [17], have been studied to sample, hold and output the analogue voltages from analogue circuit nodes. Combination of the sampling, holding and scanning techniques with this BIST system forms the test interface which is shown in Fig. 4-2. Switches will be closed one by one so that desired voltages are outputted serially.



Fig. 4-2. Test interface for the proposed BIST system

With the core selecting scheme and the test interface described above, it is easy for a designer to embed a core equipped with this kind of test interface at any level of SOC design for BIST purposes.

# 4.4 Characteristic features of the BIST system

This kind of BIST system has several notable advantages.

- (1) Believable: The FRWC first goes through a self test before it is used to test cores in the SOC. If the FRWC cannot pass the self test, there is at least one error in the FRWC circuit, and in that case the FRWC is not used to test circuits in the SOC. This prevents a faulty FRWC from being used to test cores and generating wrong results.
- (2) Flexible: Each core can have an arbitrary number of testing points and there is no need to fix the number of testing points or the testing time for each core, which makes the SOC system design flexible. In addition, the location of the cores in the SOC need not be fixed. Even if a core is deeply embedded in another core, its testing can easily be combined into this BIST system.
- (3) Simple test interface: The circuit needed for each core to cooperate with this BIST system is relatively simple and, because of the core selecting and testing mechanisms proposed here, does not interfere with the ordinary function circuit. Therefore, the required hardware overhead of this BIST system is small.
- (4) Clear result: The testing result is a series of digital bits and each bit represents the testing result of one test point in a certain core. For example, a logic state of zero (0) can be used to indicate that the voltage is within the tolerance range and a logic state of one (1) to indicate otherwise.

# 4.5 Test set-up and procedure

The test set-up and testing procedure for the proposed BIST system can be divided into four stages, listed as follows:

(a) STAGE (I) Test Mode Validation (TMV)

The TMV signal arrives at the test controller either from the external Automatic Test Equipment (ATE) or the microprocessor on chip. This signal will initiate the SOC test sequences.

(b) STAGE (II) Self Test of the Full Range Window Comparator (FRWC)

The controller makes the FRWC complete the self test to make sure that there is no fault in the testing circuits themselves. The test controller then checks the result of the self test according to the output signal of the FRWC.

(c) STAGE (III) Generation of Test Signals for Selecting Cores

Once the FRWC passes the self test, the test controller sends a valid signal to the core selecting block, which then generates the selecting signals for the appropriate cores in the SOC, so that every intended analogue voltage at the points of interest in each core is covered by the SOC testing procedure.

(d) STAGE (IV) Scanning of Test Points with the BIST

The BIST system scans every point of interest, one after another, to judge whether the voltage at that point is within the tolerance range. Furthermore, circuit faults can be diagnosed, where possible, from the testing result of the FRWC.

The testing result from the FRWC can be further processed if a certain result format is preferred. Moreover, the result stream can be compressed to reduce the response data volume. Otherwise, the output of the FRWC is fanned out directly to the outside environment, such as the testing equipment or an on-chip microprocessor.
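
The four stages can be summarised as a simple control flow. The sketch below is a behavioural illustration in Python, not the test controller circuit itself. The function `run_test`, its arguments and the message strings are hypothetical names chosen for illustration.

```python
def run_test(tmv, self_test_ok, points_per_core):
    """Return the list of steps taken for one SOC test run.

    tmv             -- True when a valid test mode signal arrives (Stage I)
    self_test_ok    -- outcome of the FRWC self test (Stage II)
    points_per_core -- dict mapping core name to its number of test points
    """
    trace = []
    if not tmv:
        return trace  # no valid test mode signal, nothing starts
    trace.append("I: test mode validated")
    trace.append("II: FRWC self test")
    if not self_test_ok:
        trace.append("FRWC marked faulty, test aborted")
        return trace
    trace.append("III: core selecting signals generated")
    for core, n_points in points_per_core.items():
        trace.append(f"IV: scan {n_points} point(s) of {core}")
    return trace


for step in run_test(True, True, {"core1": 3, "core2": 3}):
    print(step)
```

The early return after a failed self test reflects the rule that a faulty FRWC is never used to test the cores.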

# 4.6 Simulation waveform

The full system simulation of the proposed BIST architecture using a scan chain and the full range window comparator has been completed in PSPICE at circuit level. All blocks are built from scratch out of transistors, resistors and capacitors. Besides the circuits mentioned above, the simulation also needs some circuits that emulate the cores in the SOC to be tested by the FRWC. The function of each of these simulated cores is to output a certain number of analogue voltages when it is activated by a valid core selecting signal.

The simulation result waveform is shown in Fig. 4-3 and the signals are described below.



(a) Several important signals in the simulation



(b) Original PSpice output

Fig. 4-3. Waveform of the simulation result

- (1) CLK: System clock signal input to the BIST system. This signal works as the clock for every block of this BIST system.
- (2) Output of FRWC. It shows the output signals of the window comparator including the signals of the beginning self test result.
- (3) Self test pass signal. This signal is generated by the test controller after the FRWC finishes the self test and the test controller has determined that the self test result is just the correct one that is expected.
- (4) Core selecting signal of core1, core2 and core4. These are the core selecting signals generated directly by the core selecting block. In this simulation combined core selecting structure is used. Core1, core2 and core4, marked as ①, ②, ④ in Fig. 4-1, are activated directly by signals from core selecting block while other cores are activated by signals propagating in the chain structure.
- (5) Input voltages of FRWC. These signals show the value of input analogue voltages of  $V_{i}$ ,  $V_{refh}$  and  $V_{refl}$ . These signals can make it clear for the designers to judge whether the BIST system works correctly or not.
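
As a reading aid for signals (2) and (5), the decision made by the window comparator for each test point can be modelled as follows. This is a behavioural sketch only. It uses the example convention from Section 4.4 (0 for a voltage inside the tolerance range, 1 otherwise), and treating the window limits as inclusive is an assumption. The function names and the sample voltages are hypothetical.

```python
def frwc_bit(v_i, v_refl, v_refh):
    """Return 0 if v_refl <= v_i <= v_refh (inside the window), else 1."""
    return 0 if v_refl <= v_i <= v_refh else 1


def scan_results(samples):
    """samples: iterable of (v_i, v_refl, v_refh), one tuple per test point."""
    return [frwc_bit(v_i, lo, hi) for v_i, lo, hi in samples]


# Hypothetical voltages in volts, for illustration only.
samples = [
    (0.50, 0.45, 0.55),  # inside the window
    (0.70, 0.45, 0.55),  # above the window
    (0.10, 0.05, 0.15),  # inside the window
]
print(scan_results(samples))
```

Each output bit corresponds to one clocked comparison, which is why the FRWC output in Fig. 4-3 can be read directly as a pass/fail stream.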

# 4.7 Hardware overhead

The circuit for simulation is based on components of the 0.5 $\mu$m CMOS process listed by Monpapassorn [15]. The hardware overhead is listed in Table 4-1.

|                   | Transistor  | Resistor | Capacitor |
|-------------------|-------------|----------|-----------|
| FRWC*             | 28          | 10       | 2         |
| Self test circuit | 678         | 7        | 0         |
| CSS**             | 894         | 0        | 0         |
| Test controller   | 519         | 1        | 1         |
| Connection***     | 70          | 6        | 6         |
| Total             | 2189 (1295) | 24       | 9         |

Table 4-1. Hardware overhead of the proposed BIST system

FRWC\*: Full Range Window Comparator.

CSS\*\*: Core Selecting Signal generating block.

Connection\*\*\*: Extra hardware is required to connect those function blocks, such as the multiplexer before the voltage input port of the FRWC.

In addition, Core1~Core7 used in this simulation, marked as (1)~(7) in Fig. 4-1, are simulated cores that can output analogue voltages. They generate the same response that real cores in an SOC, equipped with a scan chain and the core selecting signal receiving/outputting structure, would generate in testing mode. The hardware overhead of these cores is not included in that of the BIST system.

From the table of hardware overhead it can be seen that in total 2189 transistors, 24 resistors and 9 capacitors are needed for this BIST system. It should be noted that 894 of these 2189 transistors are used for the Core Selecting Signal (CSS) generating block, since the combined core selecting structure is adopted in this simulation. Three core selecting signals are generated in the simulation and they are valid for 4, 8 and 16 clock cycles respectively. However, if the chain core selecting structure is used, no CSS generating block is needed, so only 1295 transistors are needed for this BIST system.
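
The totals quoted above can be cross-checked directly from the per-block counts in Table 4-1. The short Python sketch below does this. The dictionary layout and the helper name `totals` are illustrative choices, while the numbers are taken from the table.

```python
# (transistors, resistors, capacitors) per block, from Table 4-1
OVERHEAD = {
    "FRWC": (28, 10, 2),
    "Self test circuit": (678, 7, 0),
    "CSS": (894, 0, 0),
    "Test controller": (519, 1, 1),
    "Connection": (70, 6, 6),
}


def totals(blocks):
    """Sum each component type over all blocks."""
    return tuple(sum(column) for column in zip(*blocks.values()))


combined = totals(OVERHEAD)
# A pure chain structure needs no CSS generating block.
chain_only = totals({k: v for k, v in OVERHEAD.items() if k != "CSS"})

print(combined)    # (2189, 24, 9)
print(chain_only)  # (1295, 24, 9)
```

The CSS block accounts for roughly 41% of the transistors in the combined structure, which is the overhead traded against the robustness of direct selection.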

Most parts of this BIST system are not analogue but digital circuits except the FRWC circuit and those several analogue voltage followers used in the connection part. In both of these two parts, in fact, the basic analogue component is the operational amplifier. This digitalization makes it easy to design and standardize this BIST system.

#### **4.8 Simulation result analysis**

In total, seven cores are tested after the FRWC passes the self test in the full system simulation presented here. The self test of the window comparator takes nine clock cycles. After that each core outputs three analogue voltage signals serially, one signal per clock, and the FRWC finishes testing the three signals from the same core in four clock cycles, because there is one idle cycle. In general:

The number of clock cycles required to test a core = the number of analogue voltages to be tested + 1
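
The relation above can be turned into a small test time estimate. The sketch below is illustrative only. The per-core formula and the nine-clock self test come from the text, but the total assumes that the self test and the core tests run back to back with no additional cycles, which is an assumption and not a figure reported in this thesis. The function names are hypothetical.

```python
SELF_TEST_CLOCKS = 9  # FRWC self test duration in this simulation


def core_test_clocks(n_voltages):
    """Clock cycles needed to test one core: one per voltage plus one idle cycle."""
    return n_voltages + 1


def total_test_clocks(voltages_per_core, self_test_clocks=SELF_TEST_CLOCKS):
    """Estimated total, assuming back-to-back phases (assumption)."""
    return self_test_clocks + sum(core_test_clocks(n) for n in voltages_per_core)


print(core_test_clocks(3))         # 4
print(total_test_clocks([3] * 7))  # 37
```

Because the cost per core is only one cycle more than its number of test points, adding test points to a core scales the test time linearly.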

Although an SOC system consisting of seven cores is built to illustrate the effectiveness of the proposed BIST system in the simulation, only three core selecting signals are generated by the Core Selecting Signal generating block and fed into three "important" cores. This forms the direct type of core selecting structure, while for the other four cores the chain type of core selecting structure is used: each of them is activated by a core selecting signal from its previous core. The previous core is arranged to go through testing before the latter core and to output a valid signal for the next core after completing its testing process. In this way both the direct and the chain types of core selecting scheme are illustrated here. The simulation result shows that both work well.

The main disadvantage associated with direct type of core selecting structure is that an extra core selecting signal generating block is needed. This block will cost considerable hardware overhead and design effort. On the other hand for the chain type of core selecting structure the main disadvantage is that, if the core selecting signal outputting circuit in one of the cores doesn't work well, those cores following the faulty one can't get the valid core selecting signal any more. In that faulty situation those following cores will not step into the core testing procedure. The combined core selecting structure in which both direct and chain types of core selecting mechanism are adopted can get a tradeoff among hardware overhead, design effort and stability according to different SOC system requirement.
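
The weakness of the chain structure can be made concrete with a small behavioural model. In the sketch below, which is an illustration and not part of the simulated design, a core whose selecting signal outputting circuit fails is still tested itself, but none of the cores after it are reached. The function name and the core labels are hypothetical.

```python
def cores_reached(chain, broken_output=None):
    """Return the cores that receive a valid selecting signal.

    chain         -- core names in chain order
    broken_output -- name of a core whose outputting port never goes
                     active, or None for a fault-free chain
    """
    reached = []
    for name in chain:
        reached.append(name)
        if name == broken_output:
            break  # the selecting signal stops propagating here
    return reached


chain = ["A", "B", "C", "D"]
print(cores_reached(chain))
print(cores_reached(chain, broken_output="B"))
```

With direct selection the same fault would affect only one core, which is the robustness that the combined structure buys with its extra hardware.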

From the simulation results it can be seen that this kind of Built-In Self-Test (BIST) system, using a scan chain structure and a Full Range Window Comparator (FRWC), can test the internal nodes of analog circuits by comparing analogue voltage signals in the SOC. It completes the testing procedure using only on-chip test blocks, at the cost of a small hardware overhead and without the need to send analog signals off chip to the ATE. At the same time, this kind of BIST system places lower requirements on the external ATE, so the testing cost can be reduced.

# **Chapter 5 RAIL TO RAIL VOLTAGE COMPARATOR**

A typical analogue voltage comparator has one output port, $V_o$, and two input ports, $V_+$ and $V_-$, to which the two input analogue voltage signals are connected respectively. When the voltage at $V_+$ is higher than that at $V_-$, $V_o$ outputs, for example, digital 1; otherwise $V_o$ outputs digital 0.

Small offset voltage, high voltage gain, fine input sensitivity, short response time and low power dissipation are desired for an analogue voltage comparator. A straightforward design method is to use an operational amplifier as a comparator. The resulting disadvantage is that this kind of comparator can only achieve an operating frequency of several hundred kHz, as described in Chapter 3. Therefore, a comparator circuit must be specially designed.

Voltage comparators have been widely studied and can be used in many ways: Dowlatabadi and Connelly [18] used a voltage comparator, together with a resistor and an amplifier, to compose a random digital signal generator cell; Tewksbury and Brewer [19] proposed a comparator with positive feedback in the decision circuit; Song et al. [20] presented a self-biased complementary folded cascode amplifier.

Of course, comparators can also be used in integrated circuit testing. For example, the voltage sampled from a certain node of a circuit is compared with the value corresponding to fault free circuit to see whether this voltage is in its tolerance range. In this way the IC can be tested and diagnosed.

In a word, a comparator with the following properties is desired:

- (1) Rail-to-rail input range;
- (2) Rail-to-rail output;
- (3) Large voltage gain and fine sensitivity to the input voltages;
- (4) Good transient property, short response and transition time.

Here an analogue voltage comparator design is proposed to achieve the above-mentioned goal. Fig. 5-1 shows the circuit diagram of this voltage comparator.



Fig. 5-1. The proposed voltage comparator circuit

 $V_{bp}$  and  $V_{bn}$  are biasing voltages for PMOS and NMOS current sources respectively. Here in this design  $V_{bp} = 0.65$ V and  $V_{bn} = 0.35$ V corresponding to the power supplies of 1V and 0V. The design is based on 50nm BSIM4 models and the circuit is simulated on shareware version of WinSpice v1.05.01.

This comparator is made up of four parts:

- (1) First amplifier, also called the input buffer;
- (2) Second amplifier;
- (3) Current summing circuit. It sums the corresponding currents from the second stage;
- (4) Output buffer. It enables the comparator's output to be rail-to-rail of the power supply and to improve the comparator's driving capability for next devices.

Designed in this way the comparator circuit can have high voltage gain and quick transient response time. The following sections will describe the voltage comparator circuit in detail.

# **5.1 First amplification stage**

If a comparator is to be operated with rail-to-rail inputs, the input voltages may vary from the negative to the positive power supply. Complementary input buffers are therefore adopted in the first stage of this design; at the same time the first stage amplifies the input voltages and improves the comparator's sensitivity.

The first amplifier is composed of two parts: The upper part is with the PMOS transistors as the input device and NMOS transistors as the loading. The lower part is the corresponding NMOS version amplifier with NMOS transistors as the input device and PMOS transistors as the loading.



Fig. 5-2. PMOS input amplifier

The upper PMOS amplifier is shown in Fig. 5-2 and it can be seen that this amplifier is symmetric. That is to say, the right side and the left side of the circuit are the same. However, the input voltages, $V_+$ and $V_-$, are connected to different transistors on the right and left sides. Therefore, the amplifier is symmetric with respect to $V_+$ and $V_-$, i.e. $V_+$ and $V_-$ are interchangeable.

Taken alone, one side of the amplifier looks like a self-biased differential amplifier. The difference is that here the current source has been divided into two branches, and one branch of each amplifier is connected together to form common mode feedback. This amplifier works well as an input stage even when the input voltages are close to the negative power supply $V_{ss}$. It should also be noted that the transistors working as current sources are the same size as the other transistors in the amplifier. In some differential amplifier designs the current source transistor is narrower, or longer, than the other transistors. The size selected in this circuit provides better transient behaviour: the transition of the output, from logic 0 to logic 1 or vice versa, takes less time at the cost of slightly higher power dissipation.

The lower part of the first stage is the NMOS input amplifier corresponding to the upper PMOS one. The NMOS amplifier can work well even when the input voltages are close to the positive power supply  $V_{dd}$ . Therefore, when both of these two amplifiers are placed at the first stage of the comparator circuit it can provide the comparator with the ability of rail-to-rail input range.

#### **5.2 Second amplification stage**

The second amplification stage again uses complementary devices, a PMOS and an NMOS amplifier in parallel, to ensure that the comparator works for rail-to-rail input voltages and to increase the DC gain. The comparator therefore still performs well when the input voltages are close to either power supply, $V_{dd}$ or $V_{ss}$. The amplifier in the second stage has no load; its current is transferred to the following current summing stage. In addition, just as in the first stage, the current source transistors are the same size as the other transistors for better transient performance.

# **5.3 Current summing stage**

The currents coming from the second amplification stage go into the complementary folded cascode amplifier in the third stage of the comparator circuit.

In this current summing stage wide swing cascode current mirror is adopted instead of normal cascode current mirror. This can provide wide output range, large voltage gain and shorter transient time. Fig. 5-3 shows the difference of these two kinds of current mirrors.



Fig. 5-3. (a) Cascode current mirror (b) Wide swing cascode current mirror

In Fig. 5-3 (a): $V_{GS1} + V_{GS3} = 2(V_T + \Delta V)$, where $\Delta V = \sqrt{2I/\beta}$ if the body effect is neglected. The output must then stay at least $V_T + 2\Delta V$ away from the power supply.

In Fig. 5-3 (b): $V_{GS1} + V_{GS3} = V_T + \Delta V$ and the output need only stay $2\Delta V$ away from the power supply.
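
To see why this matters with a 1V supply, the two headroom expressions can be evaluated numerically. The sketch below is illustrative only: the values of $I$, $\beta$ and $V_T$ are hypothetical and are not taken from the 50nm models used in this design, and the function names are invented for the example.

```python
import math


def overdrive(i_d, beta):
    """Delta V = sqrt(2 I / beta), body effect neglected."""
    return math.sqrt(2.0 * i_d / beta)


def min_output_headroom(v_t, dv, wide_swing):
    """Minimum distance of the output from the supply rail.

    Normal cascode mirror:     V_T + 2 * Delta V
    Wide swing cascode mirror: 2 * Delta V
    """
    return 2.0 * dv if wide_swing else v_t + 2.0 * dv


# Hypothetical example values (assumptions, not from the thesis).
I_D = 10e-6   # branch current in A
BETA = 2e-3   # transconductance parameter in A/V^2
V_T = 0.3     # threshold voltage in V

dv = overdrive(I_D, BETA)
print(f"Delta V     = {dv:.2f} V")
print(f"cascode     = {min_output_headroom(V_T, dv, wide_swing=False):.2f} V")
print(f"wide swing  = {min_output_headroom(V_T, dv, wide_swing=True):.2f} V")
```

With these example numbers the wide swing mirror recovers one threshold voltage of output range, which is a large fraction of a 1V supply.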

There is also a feedback path in the current summing circuit. Fig. 5-4 shows the second amplification stage together with the current summing circuit. When $V_{on}$ increases, $V_1$ and $V_3$ decrease while $V_2$ and $V_4$ increase. $V_1$ is fed to $M_{10}$ as its gate voltage, so the current through $M_{10}$ decreases. In this way $V_{op}$ is pushed even higher.



Fig. 5-4. Second amplification stage and current summing circuit

# **5.4 Output buffer stage**

The function of the output buffer stage is to enable the comparator to output signals with rail-to-rail voltage and to improve the driving capability for the load.

The transistors in the output buffer stage need to be sized specifically:

a) For the first inverter, the PMOS and NMOS transistors should be sized so that the switching threshold of the inverter is close to the crossing point of $V_n$ and $V_p$ in the foregoing current summing stage, so that the output of the comparator switches just at the point where the input voltages, $V_+$ and $V_-$, cross each other. In this way, both the offset and the transient response time of the comparator can be reduced and the accuracy can be improved at the same time.

b) The second inverter is expected to improve the driving capability, so its transistors should be wide. Here the width of its two transistors is chosen to be about twice that of most other transistors in the comparator. Since the signal input to the second inverter is large enough, the threshold of the second inverter does not matter much and needs no special design effort.

To summarize, an analogue voltage comparator consisting of four stages is proposed to realize high sensitivity to input voltages and short response time. Its characteristics are described in detail in the following section.

#### **5.5 Characteristic description**

Among all kinds of properties that can describe the performance of a comparator circuit, the voltage gain and the transient response time are the most important for a comparator. These two characteristics determine the comparator's sensitivity and rapidity: How small difference between the two input voltages can be detected, how fast the comparator can generate the judgment result.

It should be noted that when a comparator's performance characteristics are described, the working point must be stated clearly, for example: at a voltage of 0.5V the voltage gain of a comparator is 60dB. Some comparators work well when the input voltages are around the middle of the two power supplies but not when the input voltages are close to the power supplies, where the voltage gain becomes quite low and the transient response time very long. Therefore, if a comparator is needed to work over the rail-to-rail range of the power supplies, its performance characteristics must be good within the whole rail-to-rail range. Special design effort is thus required to ensure the comparator's capability when the input voltages are close to the power supplies.

To show clearly the performance characteristics of the proposed comparator in the working range from the negative power supply to the positive power supply, three working points are specially studied at (1) 0.02V, (2) 0.50V and (3) 0.98V and the voltage difference of 20 mV between  $V_{+}$  and  $V_{-}$  is selected. The reasons are:

- [1]. Corresponding to the power supply of 0-1V, the range from 0.02V to 0.98V covers 96% of the rail-to-rail range.
- [2]. A resolution of 20 mV is enough for most cases in testing internal nodes of analog circuits. In addition, a voltage difference of 20 mV between $V_+$ and $V_-$ can show the performance difference between various comparators, while a smaller value would result in a much longer delay time.

First, the overall DC performance within the whole rail-to-rail range is shown in Fig. 5-5. In this simulation, the negative input voltage $V_-$ changes from 0V to 1V with a step of 0.1V while $V_+$ scans from 0V to 1V with a step of 0.5mV for every value of $V_-$.



Fig. 5-5. Comparator DC performance within the entire power supply range

#### 5.5.1 Operating point at 0.02 V

#### (a) DC performance

The negative input voltage,  $V_{-}$ , stays at 0.02V and the positive input,  $V_{+}$ , scans from 0V to 1V with a step of 0.5mV. Fig. 5-6 shows the simulation result.



Fig. 5-6. DC performance with $V_-$ staying at 0.02V

To make the output transition clear, the positive input  $V_+$  scans from 0.019V to 0.021V with a step of 0.02mV and Fig. 5-7 shows the simulation result.



Fig. 5-7. Output transition with $V_-$ staying at 0.02V

Therefore, at working point of 0.02V, the proposed comparator works with DC gain = 75.2 dB, offset voltage = 0.022 mV.

It should be noted that the voltage gain of the comparator circuit is nonlinear.

#### (b) Transient performance

To show the transient response of this comparator, a pulse voltage is inputted into the comparator and the output will show the comparator's transient response.

$V_-$ stays at 0.02V; $V_+$ goes from 0V to 0.04V at 10ns and then back to 0V at 20ns. The two input voltages are shown in Fig. 5-8 and the comparator's corresponding output is shown in Fig. 5-9.



Fig. 5-8. Pulse input with $V_-$ staying at 0.02V



Fig. 5-9. Comparator's output corresponding to the pulse input shown in Fig. 5-8

The transient response time, or the output delay, is 2.9ns and 3.2ns for the positive and negative edge of the comparator's output respectively.

However, the above situation is not the 'worst case' that places the greatest demand on the comparator's transient capability. A better test of the comparator's transient performance is one in which one input initially stays at a voltage that keeps part of the comparator circuit in the off condition and then changes to a value that is only slightly different from the other input. In this 'worst case' the amplitude relationship between the two inputs changes with the transition of one of the inputs, so the comparator's output is expected to make a transition.

Fig. 5-10 shows this kind of 'worst case' pulse input. $V_-$ stays at 0.02V; $V_+$ goes from 1V to 0V at 10ns and then back to 1V at 20ns. The comparator's corresponding output is shown in Fig. 5-11.



Fig. 5-10. 'Worst case' pulse input with $V_-$ staying at 0.02V



Fig. 5-11. Comparator's output corresponding to the input shown in Fig. 5-10

As can be seen, the delay of the first (negative) transition of the comparator's output increases considerably, to 6.1ns, while the positive edge of the output has a delay of only 0.7ns.

The reason for the long delay of the first transition is:

- (a) One of the comparator's inputs changes so much that the output has to make a transition;
- (b) Initially some transistors are in deep saturation states and some other transistors are in cut off states. When the pulse input comes, many transistors have to make a transition of their states. The voltages of some nodes in the circuit have to change over a wide range such that the charge and discharge cost a lot of time;
- (c) At the steady state there is only small difference between the two inputs so that the driving potential or capability is small to charge or discharge the nodes inside the circuit.

#### 5.5.2 Operating point at 0.50 V

#### (a) DC performance

The negative input voltage,  $V_{-}$ , stays at 0.50V and the positive input,  $V_{+}$ , scans from 0.499V to 0.501V with a step of 0.02mV. Fig. 5-12 shows the simulation result of the comparator's DC performance at 0.50V.



Fig. 5-12. DC performance with $V_-$ staying at 0.50V

At 0.50V, DC gain = 77.9 dB, offset voltage = 0.013 mV.

#### (b) Transient performance

In the "worst" input pulse shown in Fig. 5-13,  $V_{-}$  stays at 0.5V;  $V_{+}$  goes from 0V to 0.52V at 10ns and then back to 0V at 20ns. The comparator's corresponding output is shown in Fig. 5-14.



Fig. 5-13. Positive pulse input with $V_-$ staying at 0.50V



Fig. 5-14. Comparator's output corresponding to the input shown in Fig. 5-13

The delays at the positive and negative edges of the comparator's output are 4.9ns and 0.6ns respectively.

Another 'worst case' input at 0.5V is shown in Fig. 5-15.  $V_{-}$  stays at 0.5V;  $V_{+}$  goes from 1V to 0.48V at 10ns and then back to 1V at 20ns. The corresponding output is shown in Fig. 5-16.



Fig. 5-15. Negative pulse input with $V_-$ staying at 0.50V



Fig. 5-16. Comparator's output corresponding to the input shown in Fig. 5-15

The delays at the negative and positive edges of the comparator's output are 4.4ns and 0.9ns respectively.

#### 5.5.3 Operating point at 0.98 V

#### (a) DC performance

The negative input voltage,  $V_{-}$ , stays at 0.98V and the positive input,  $V_{+}$ , scans from 0.979V to 0.981V with a step of 0.02mV. Fig. 5-17 shows the simulation result.



Fig. 5-17. DC performance with $V_-$ staying at 0.98V

At 0.98V, DC gain = 72.5 dB, offset voltage = 0.019 mV.

#### (b) Transient performance

The 'worst case' input at 0.98V is shown in Fig. 5-18:  $V_{-}$  stays at 0.98V;  $V_{+}$  goes from 0V to 1V at 10ns and then back to 0V at 20ns.

The corresponding output is shown in Fig. 5-19.



Fig. 5-18. Pulse input with $V_-$ staying at 0.98V



Fig. 5-19. Comparator's output corresponding to the input shown in Fig. 5-18

The delays at the positive and negative edges of the comparator's output are 5.7ns and 0.4ns respectively.

Summarizing the above description, the comparator's performance corresponding to the four "worst" cases is listed in the following Table 5-1.

| Working point                  | 0.02V | 0.50V (positive pulse) | 0.50V (negative pulse) | 0.98V |
|--------------------------------|-------|------------------------|------------------------|-------|
| DC gain (dB)                   | 75.2  | 77.9                   | 77.9                   | 72.5  |
| Offset voltage (mV)            | 0.022 | 0.013                  | 0.013                  | 0.019 |
| Positive transition delay (ns) | 0.7   | 4.9                    | 0.9                    | 5.7   |
| Negative transition delay (ns) | 6.1   | 0.6                    | 4.4                    | 0.4   |

Table 5-1. Performance of the proposed comparator
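
For readers more used to linear voltage gain, the dB figures in Table 5-1 can be converted with the usual relation $A = 10^{G_{dB}/20}$. The sketch below performs this conversion. Since the gain of this comparator is nonlinear, the results should be read only as the gain around the switching point, and the function name is an illustrative choice.

```python
def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio (V/V)."""
    return 10.0 ** (gain_db / 20.0)


# DC gain at the three working points, from Table 5-1
DC_GAIN_DB = {"0.02V": 75.2, "0.50V": 77.9, "0.98V": 72.5}

for point, gain_db in DC_GAIN_DB.items():
    print(f"{point}: {gain_db} dB is about {db_to_linear(gain_db):.0f} V/V")
```

All three working points give a linear gain of several thousand, which is consistent with offsets in the tens of microvolts being resolvable at the output.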

For comparison, simulations have also been carried out on the comparator design of [19], shown in Fig. 5-20, and its performance is listed in Table 5-2.



Fig. 5-20. Comparator with rail-to-rail input common-mode range [19]

| Working point                  | 0.02V | 0.50V (positive pulse) | 0.50V (negative pulse) | 0.98V |
|--------------------------------|-------|------------------------|------------------------|-------|
| Positive transition delay (ns) | 1.0   | 5.7                    | 1.2                    | 2.1   |
| Negative transition delay (ns) | 8.6   | 0.9                    | 5.6                    | 1.6   |

Table 5-2. Performance of the comparator in [19]

The maximum transition delays of our proposed comparator and of the one in [19] are 6.1 ns and 8.6 ns respectively, so the latter is 41% longer than the former. The maximum transition delay is the bottleneck of a comparator's performance and determines the maximum frequency at which the comparator can work. Under this constraint, our proposed comparator can work up to 164 MHz while the one in [19] can only work at 116 MHz or below, which shows that the maximum working frequency of our proposed comparator is 41% higher than that of the comparator in [19].
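
The frequency figures above follow from taking the reciprocal of the longest transition delay. The short sketch below reproduces that arithmetic. It assumes, as the text does, that the comparator must complete one worst-case transition per clock period, and the function name is illustrative.

```python
def max_frequency_mhz(delay_ns):
    """Maximum working frequency in MHz for a worst-case delay in ns."""
    return 1e3 / delay_ns


T_PROPOSED_NS = 6.1  # longest delay in Table 5-1
T_REF19_NS = 8.6     # longest delay in Table 5-2

f_proposed = max_frequency_mhz(T_PROPOSED_NS)
f_ref19 = max_frequency_mhz(T_REF19_NS)
improvement = (T_REF19_NS / T_PROPOSED_NS - 1.0) * 100.0

print(f"proposed: {f_proposed:.0f} MHz")   # proposed: 164 MHz
print(f"[19]:     {f_ref19:.0f} MHz")      # [19]:     116 MHz
print(f"ratio:    {improvement:.0f} %")    # ratio:    41 %
```

Because frequency is the reciprocal of delay, the 41% figure applies equally to the delay ratio and to the frequency ratio.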

In conclusion, a comparator circuit design and its characteristic description are presented here. The proposed comparator performs well on both DC voltage gain and transient response for rail-to-rail inputs. Due to the good performance of the first stage the sensitivity of this comparator is significantly high. This comparator can work well even when the input voltages are close to the power supplies because both PMOS and NMOS amplifiers are used in the first and second stages. In the above simulation and analysis, the longest transient delay time is 6.1ns. If the corresponding situation can be viewed as the 'worst case', this comparator can work up to 164 MHz.

# Chapter 6 INTERCONNECT SIGNAL INTEGRITY TEST PATTERN COMPACTION

Previous work in modular SOC testing mainly focuses on testing the cores' internal functionality, which includes test vector generation, test response analysis, and test architecture design and optimization. For example, the devices and structures proposed in the previous chapters of this thesis are all intended for core internal logic testing. However, with the shrinking feature size of IC process technologies, especially as VLSI fabrication enters the deep sub-micron (DSM) era, testing for signal integrity (SI) related problems has become necessary, and for today's high-performance IC designs the test cost for SI faults on SOC interconnects (the wires connecting embedded cores in an SOC) may be comparable to or even higher than the test cost for the core internal functionality. This aggravates the situation in which testing already consumes a substantial fraction of the total IC chip cost. Together with the shrinking feature size, the working frequency of VLSI circuits has grown steadily, now reaching multiple GHz, while the power supply voltage and noise margin have kept decreasing. These technology advances make VLSI circuits denser, cheaper and more powerful, but at the same time they cause more serious crosstalk effects and make circuits more vulnerable unless countermeasures are applied. For example, signal wires are now so close to each other that electromagnetic interference between adjacent wires can no longer be ignored, which means that a wire can no longer be considered independent of its neighbors.

SOC interconnect wires are much longer, typically on the order of millimeters, than the wires connecting the cores' internal logic gates, so interconnect wires suffer more from crosstalk. The signal waveform arriving at the receiving end of an interconnect may differ considerably from the waveform at the driving end, showing extra delay, glitches, voltage overshoot/undershoot and so on. Therefore, testing interconnects only for short/open faults is far from enough for current VLSI circuits, and testing signal integrity, i.e., the ability of a signal to generate correct responses in the circuit, on SOC interconnects is necessary. The problem is serious because the cost of testing SOC interconnect SI may be comparable to or even higher than the test cost for the core internal functionality.

In this chapter an SI test pattern compaction method is presented. This method reduces not only the number of test patterns but also the length of the test patterns. Experimental results show that the proposed method significantly reduces the overall test data volume for core external interconnect SI fault testing, and hence the test time.

#### 6.1 Background

With the decreasing feature size and the increasing working frequency of VLSI circuits, signal integrity (SI), the ability of a signal to generate correct responses in a circuit as presented by Guler and Kilic [49], has become a major concern for the interconnects between SOC embedded cores in IC design. The undesired SI related problems, caused by cross-coupling capacitance and inductance between interconnect wires, include voltage overshoot/undershoot, glitch, oscillation, excessive signal delay and even signal speedup. Some of these phenomena are shown in Fig. 6-1. If the noise-induced voltage swing or timing delay departs from the signal tolerance region, a functional error may occur. In addition, voltage overshoot may damage the circuits so that the lifetime of IC chips is considerably shortened. Numerous physical design and fabrication solutions (e.g., the one proposed by Becer et al [42]) have been proposed in the literature to tackle SI problems, but none of them is guaranteed to solve all SI related problems. In addition, process variation and manufacturing defects may aggravate the coupling effects between interconnects, as described by Natarajan et al [59]. Since it is unacceptable to over-design the circuit to tolerate all possible process variations and it is impossible to predict the occurrence of fabrication defects, manufacturing test strategies are essential to detect SI related errors.

On the one hand, various SI fault models (e.g., [46, 52, 61]) and the associated test methodologies (e.g., [43, 62]) have been proposed in the literature, but none of them is both effective in terms of fault coverage and efficient in terms of testing time. On the other hand, signal integrity related problems are aggravated in core-based SOC designs, as presented by Nordholz et al [58], because the interconnect wires carrying signals between embedded cores tend to be much longer than core internal wires and hence suffer more from parasitic effects such as coupling capacitance and inductance. Nevertheless, most prior work in modular SOC testing focuses on core internal testing only, without considering the increasingly important core external interconnects. Since the test cost for interconnect signal integrity faults can be comparable to or even higher than the test cost for the core internal logic functionality, the problem of signal integrity test patterns for SOC interconnect test is investigated here. The main contribution of this chapter is a two-dimensional signal integrity test pattern compaction method that reduces the interconnect signal integrity test pattern data volume and thus the test time.



- T\_SI\_R/T\_SI\_F: Skew immune rising/falling range

Fig. 6-1. Demonstration of signal integrity loss

## **6.2 Introduction to signal integrity testing**

When considering signal integrity problems resulting from cross-coupling (or crosstalk) between SOC interconnects, the interconnect wire on which the error effects take place is denoted as the victim, while an interconnect wire affecting others is denoted as an aggressor. Usually, when one crosstalk event is studied, there is only one victim and one or multiple aggressors. That is to say, research in this field is based on how one interconnect is affected by other interconnects. Early attempts at testing signal integrity related problems, such as Attarha and Nourani [44] and Iyengar et al [47], model the crosstalk at circuit level. One such model, the distributed RC crosstalk model, is shown in Fig. 6-2.



Fig. 6-2. Distributed RC crosstalk model

In the model shown in Fig. 6-2, the driving end of an interconnect line is modeled as a source with resistance while the receiving end is modeled as a load capacitance. An interconnect line is modeled with distributed resistors and capacitors. The cross-coupling effects between the two lines are represented by coupling capacitors $C_x$.

Circuit level crosstalk models may be more accurate, but the complexity of the test pattern generation process limits their application to testing SOC interconnects. A signal integrity fault model at behavioral level helps in such cases. Fig. 6-3 shows a crosstalk event with one victim and one aggressor at behavioral level.



Fig. 6-3. Crosstalk model at behavioral level

Cuviello et al. [5] proposed a behavioral level signal integrity fault model, called maximal aggressor (MA) fault model. The transition patterns for a crosstalk system with one victim and two aggressors are demonstrated in Fig. 6-4. In this model all the aggressors make the same transition in the same direction at the same time and act collectively to generate the glitch when the victim is quiescent or the delay error when the victim makes an opposite transition. Therefore, only 4N test vector pairs are needed to detect signal integrity faults for a set of N interconnects using this fault model.

|               | P<sub>g0</sub> | P<sub>g1</sub> | N<sub>g1</sub> | N<sub>g0</sub> | d<sub>r</sub> | d<sub>f</sub> |
|---------------|-----|-----|-----|-----|-----|-----|
| A             | ↑   | ↑   | ↓   | ↓   | ↑   | ↓   |
| V             | 0   | 1   | 1   | 0   | ↓   | ↑   |
| A             | ↑   | ↑   | ↓   | ↓   | ↑   | ↓   |
| Vector 1: AVA | 000 | 010 | 111 | 101 | 010 | 101 |
| Vector 2: AVA | 101 | 111 | 010 | 000 | 101 | 010 |

Fig. 6-4. Test pattern based on MA fault model

However, such test patterns may not be able to generate maximum noise/delay on the victim line [47, 54]. Tehranipour et al. [61] presented a multiple transition (MT) fault model that covers all transitions on victim and multiple transitions on aggressors. The MT model and corresponding test pattern for a crosstalk system with two aggressors are demonstrated in Fig. 6-5.

| Quiescent at 0 (x0x → x0x) | Transition 0 → 1 (x0x → x1x) | Quiescent at 1 (x1x → x1x) | Transition 1 → 0 (x1x → x0x) |
|---------------------------------|---------------------------------|----------------------------------------|-----------------------------------|
| 000                             | 000                             | 010                                    | 010                               |
| 101                             | 111                             | 111                                    | 101                               |
| 001                             | 001                             | 011                                    | 011                               |
| 100                             | 110                             | 110                                    | 100                               |
| 100                             | 100                             | 110                                    | 110                               |
| 001                             | 011                             | 011                                    | 001                               |
| 101                             | 101                             | 111                                    | 111                               |
| 000                             | 010                             | 010                                    | 000                               |

Fig. 6-5. Test pattern for three-interconnect crosstalk system based on MT and MA (shaded) model
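The vector pairs in Fig. 6-5 can be enumerated mechanically. The following Python sketch assumes, as read off the figure, that the victim sits in the middle, that it takes one of four conditions, and that every aggressor makes a transition in either direction; the function name `mt_vector_pairs` is illustrative and not from the original work.

```python
from itertools import product


def mt_vector_pairs(num_aggressors=2):
    """Enumerate (vector 1, vector 2) pairs for one victim placed in the middle.

    Assumption read off Fig. 6-5: every aggressor toggles between the two
    vectors, and all direction combinations are covered.
    """
    half = num_aggressors // 2
    victim_conditions = {
        "quiescent at 0": ("0", "0"),
        "transition 0->1": ("0", "1"),
        "quiescent at 1": ("1", "1"),
        "transition 1->0": ("1", "0"),
    }
    table = {}
    for name, (v1, v2) in victim_conditions.items():
        pairs = []
        for start in product("01", repeat=num_aggressors):
            # Each aggressor ends at the opposite value, i.e. it makes a transition.
            end = tuple("1" if bit == "0" else "0" for bit in start)
            first = "".join(start[:half]) + v1 + "".join(start[half:])
            second = "".join(end[:half]) + v2 + "".join(end[half:])
            pairs.append((first, second))
        table[name] = pairs
    return table


table = mt_vector_pairs(2)
for name, pairs in table.items():
    print(name, pairs)
```

For two aggressors this yields 4 pairs per victim condition, 16 in total, which matches the four columns of Fig. 6-5.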

The number of test patterns for this MT fault model is exponential in the number of interconnects under test. To address this, an empirically determined locality factor *k*, showing how far the effect of aggressors remains significant, is introduced. Optimally, the total number of test patterns for a set of *N* interconnects using this reduced MT fault model will be $[N/(k+1)](2k+1)2^{2k+2}$.
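To get a feeling for the difference in test set size, the two counts quoted in the text can be evaluated numerically. This is a minimal sketch; reading the square brackets in the reduced MT formula as a ceiling is an assumption, and the function names are illustrative.

```python
import math


def ma_pattern_count(n):
    """4N vector pairs for N interconnects under the MA fault model (as stated in the text)."""
    return 4 * n


def reduced_mt_pattern_count(n, k):
    """[N/(k+1)] (2k+1) 2^(2k+2) for the reduced MT model with locality factor k.

    Assumption: the square brackets are interpreted as rounding up.
    """
    return math.ceil(n / (k + 1)) * (2 * k + 1) * 2 ** (2 * k + 2)


for n, k in [(32, 1), (32, 2), (1024, 2)]:
    print(n, k, ma_pattern_count(n), reduced_mt_pattern_count(n, k))
```

Even for a small locality factor the reduced MT set is considerably larger than the MA set, which illustrates why test data volume becomes a concern.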

Built-In Self-Test (BIST) has been the primary test methodology used to detect signal integrity related errors. At the driver side of interconnects, test generators are embedded to generate transitions on the aggressors and the victim. At the receiver side of interconnects, various types of integrity loss sensor (ILS) cells (e.g., the one proposed by Bai et al [43] and the one proposed by Tehranipour et al [62]) are designed to detect signal integrity related errors. The wrapper cell arrangement in interconnect test mode is shown in Fig. 6-6 and the detailed structure of wrapper cells for signal integrity testing is shown in Fig. 6-7.



Fig. 6-6. Wrapper cell arrangement in interconnect test mode



Fig. 6-7. Detailed structure of wrapper cells for signal integrity test

Hardware-based test generators may cause over-testing and/or under-testing since not all test patterns generated in the test mode are valid in the normal functional mode of the SOC. In addition, since the SOC interconnect topology can be arbitrary, interconnects between several cores may be close enough to result in signal integrity error, as shown in Fig. 6-8.



Fig. 6-8. Demonstration of SOC interconnect topology

It is very difficult for on-chip hardware-based test pattern generation techniques to take this into account. As a result, in this chapter the test stimuli are assumed to be loaded from external test equipment rather than generated by on-chip hardware.

Most prior work in SOC test focuses on core internal testing. This is mainly because testing interconnects for short/open faults only requires very little time and hence receives less attention in the test planning process. When high signal integrity fault coverage is a concern, however, the testing time for SOC interconnects can be comparable to the testing time for the core internal logic. Goel et al reported the testing time of an industrial SOC to be about 2 million clock cycles with 140 TAM wires [45]. The interconnect signal integrity testing time can be much longer than that for SOCs with several thousand interconnect wires. What is worse, with the shrinking feature size of DSM technology, short interconnects may also suffer from signal integrity problems, as presented by Nordholz et al [58]. Therefore, signal integrity faults may need to be detected for hundreds or even thousands of interconnects in the SOC, which can lead to prohibitively long testing time. It follows from the above discussion that an effective test set compaction strategy should be used to reduce the volume of the test data for signal integrity faults.

Multiple test structures have been proposed, such as the structures from Goel et al [50] and Marinissen [51]. At the core level, the wrapper output cells (WOC) at the driver ends of the interconnects should be able to apply the necessary consecutive transitions to support signal integrity test, while the wrapper input cells (WIC) at the receiving ends of the interconnects should include integrity loss sensors (e.g., [43, 62]) to capture signals with noise and/or delay violations. A typical design, proposed by Tehranipour et al [62], is shown in Fig. 6-6. From the point of view of test vector application and test response capture, interconnect testing differs from core internal testing: in core internal testing, test vectors are shifted into the WICs of each core and test responses are captured at the WOCs of the same core, whereas in interconnect testing the transitions are applied from the WOCs and the responses are captured at the WICs at the receiving ends.

## 6.3 Proposed signal integrity test pattern compaction method

In this chapter it is assumed that the test patterns for signal integrity faults are given a priori; they can be patterns generated for various signal integrity fault models. Test pattern generation is not a concern of this chapter, whose contribution focuses on test pattern compaction. In each test pattern, two consecutive logic values should be prepared for each output port of all embedded cores. Consequently, there are four types of conditions for each output port: staying at 0 or 1 (00 or 11), positive transition (01) and negative transition (10). The format of the assumed given test pattern is shown in Table 6-1, in which 'x' represents a don't-care bit, '0/1' represents that the corresponding core output terminal stays at 0/1 in consecutive cycles, and $\uparrow$ and $\downarrow$ represent a positive transition and a negative transition, respectively, at the corresponding embedded core's output port. For each test pattern, a postfix is added to denote whether the test pattern uses a shared bus line, in which '1' denotes that the specific bus line is used while 'x' means don't-care.

A victim interconnect is mainly affected by a few neighboring interconnects, as described by Kundu et al [52], while other interconnects have a negligible effect on the victim. Therefore, only a few I/O ports are involved in a crosstalk event, although the total number of I/O ports in an SOC system can be several thousand. The signal integrity test patterns resulting from such a crosstalk mechanism typically feature a large number of don't-care bits, which correspond to the I/O ports not involved in the crosstalk.

|    | Core-1 | Core-2 | Core-3 |      | Core-n | Due |
|----|--------|--------|--------|------|--------|-----|
|    | WOC    | WOC    | WOC    | •••• | WOC    | Bus |
| P1 | ↑x↓xx  | xxx    | 0xx↑   |      | xx↑    | xx1 |
| P2 | xxxxx  | x↑x    | xx↓x   |      | ↓xx    | xx1 |
| P3 | x↑xx↓  | x↓x    | xxxx   |      | xxx    | xxx |
| P4 | …xxxx↑ | xxx    | ↓xx…x  |      | x↓x    | 1xx |
| P5 | ↑x↓xx  | 1xx    | xxx↑   |      | ↑xx    | xxx |
|    |        |        |        | •••• |        |     |

Table 6-1. Format of signal integrity test pattern

## 6.3.1 Test pattern count reduction

Because of the large number of don't-care bits in each test pattern, a natural way to reduce the volume of the test data is to compact multiple test patterns into one when they are compatible. Two test patterns are compatible if no bit position holds conflicting values in the two patterns. Since bus lines are shared and may connect many cores at the same time, several signal integrity test patterns may drive the same bus line from different core boundaries, and such patterns should not be compacted into one pattern. The postfix added to each signal integrity test pattern is used to identify this situation: if the bit values at a specific position in the postfixes of two test patterns are both '1', the patterns are marked as incompatible (e.g., the first two patterns in Table 6-1). Therefore, two test patterns conflict when different values or transitions are required at the same port, or when the same bus line is occupied by both patterns.
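The compatibility rule can be made concrete with a short Python sketch. The `Pattern` type and the function names are illustrative assumptions, and the example patterns use only the four visible core columns of the first and third rows of Table 6-1, ignoring the elided cores.

```python
from typing import NamedTuple


class Pattern(NamedTuple):
    ports: str  # one symbol per core output port: 'x', '0', '1', '↑' or '↓'
    bus: str    # postfix, one symbol per shared bus line: '1' (used) or 'x'


def compatible(p, q):
    """No port holds two different specified symbols and no bus line is used twice."""
    ports_ok = all(a == "x" or b == "x" or a == b for a, b in zip(p.ports, q.ports))
    bus_ok = not any(a == "1" and b == "1" for a, b in zip(p.bus, q.bus))
    return ports_ok and bus_ok


def merge(p, q):
    """Combine two compatible patterns; specified symbols override don't-cares."""
    ports = "".join(b if a == "x" else a for a, b in zip(p.ports, q.ports))
    bus = "".join("1" if "1" in (a, b) else "x" for a, b in zip(p.bus, q.bus))
    return Pattern(ports, bus)


# Rows of Table 6-1, written core by core (adjacent string literals are concatenated).
row1 = Pattern("↑x↓xx" "xxx" "0xx↑" "xx↑", "xx1")
row2 = Pattern("xxxxx" "x↑x" "xx↓x" "↓xx", "xx1")
row3 = Pattern("x↑xx↓" "x↓x" "xxxx" "xxx", "xxx")

print(compatible(row1, row2))  # both use the third bus line
print(compatible(row1, row3))
print(merge(row1, row3))
```

The first two rows have no conflicting port symbols but both occupy the same bus line, so only the bus postfix makes them incompatible.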

The goal here is to reduce the test pattern count as much as possible. The problem of finding the minimum number of compacted test patterns for a given test set can be formulated as a maximum clique partitioning problem in graph theory, for which an algorithm such as the one presented by Jha and Gupta [53] can be used. A graph is created in which each vertex corresponds to a test pattern, and an edge is added between two vertices if the corresponding two test patterns are compatible. A set of signal integrity test patterns in which any two test patterns are compatible then forms a clique in this graph, and the objective is to find a minimum number of cliques covering all the vertices in the graph. Each such set of compatible test patterns, corresponding to a clique in the graph, can be merged into one compacted test pattern. The graph is undirected because each edge represents the compatibility of two test patterns, which is a symmetric relation.
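A minimal Python sketch of the graph construction is given below. It uses plain dictionaries of neighbor sets as the graph representation, which is an illustrative choice, and it omits the bus postfix for brevity; the patterns are made-up examples.

```python
from itertools import combinations


def compatible(p, q):
    """Two patterns are compatible if no position holds two different specified bits."""
    return all(a == "x" or b == "x" or a == b for a, b in zip(p, q))


def build_compatibility_graph(patterns):
    """Return {vertex index: set of indices of compatible patterns}."""
    graph = {i: set() for i in range(len(patterns))}
    for i, j in combinations(range(len(patterns)), 2):
        if compatible(patterns[i], patterns[j]):
            graph[i].add(j)  # undirected edge, stored in both directions
            graph[j].add(i)
    return graph


patterns = ["1xx0", "x1x0", "0xxx", "xx1x"]  # hypothetical 4-bit patterns
graph = build_compatibility_graph(patterns)
print(graph)
```

Every pair of patterns is checked once, so the construction takes a number of compatibility checks that is quadratic in the number of patterns.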

Here, a clique is defined as any complete sub-graph (not necessarily a maximum complete sub-graph) of a graph [65], and a complete graph is a graph in which each pair of vertices is connected by an edge. The complete graphs with 2-7 vertices are shown in Fig. 6-9, while a set of different cliques in the same graph is shown in Fig. 6-10.



Fig. 6-9. Complete graph examples



Fig. 6-10. A set of cliques in a graph

The problem of finding the size of the largest clique in a given graph is NP-complete, as described by Skiena [66]; that is, it is both in NP (a proposed solution can be verified in polynomial time) and NP-hard (any problem in NP can be translated into this problem). A clique partitioning example is shown in Fig. 6-11, in which the whole graph can be covered by two cliques that are composed of vertices  $\{2,4\}$  and  $\{1,3,5\}$  respectively.



Fig. 6-11. Demonstration of clique partition

For the graph corresponding to the interconnect signal integrity test patterns, the creation and partitioning of the graph are carried out as follows:

- The graph is built with the help of the Boost C++ libraries [63].
- The graph is partitioned into cliques using the algorithm shown in Algorithm 6-1, which improves on the clique partitioning heuristic from the DDEL center, University of Cincinnati [64].

Algorithm 6-1. Clique Partitioning Algorithm

Input: A graph *G* with vertices and edges

Output: A set *C* of disjoint cliques covering all vertices of the graph

- 1. **begin** Clique\_Partition
- 2. Select\_StartingVertex(*G*) $\rightarrow$ vertex *S*
- 3. **if** (no *S* can be found) program terminates
- 4. **if** (*S* has no compatible vertex) {
- 5. *S* represents a clique
- 6. *S* is removed from the graph
- 7. go to 2
- 8. }
- 9. neighboring vertices of *S* $\rightarrow$ vertex set *B*
- 10. Select\_Neighbor(*B*) $\rightarrow$ vertex *R*
- 11. Merge(*S*, *R*) $\rightarrow$ vertex *S'*
- 12. vertex *S'* $\rightarrow$ vertex *S*
- 13. go to 3
- 14. **ends** Clique\_Partition
- 15. **sub-procedure** Select\_StartingVertex(*G*)
- 16. vertices with maximum degree in *G* $\rightarrow$ vertex set *M*
- 17. **if** (the number of vertices in *M* equals one) {
- 18. output the only vertex in *M* $\rightarrow$ vertex *S*
- 19. sub-procedure terminates
- 20. }
- 21. the first vertex in *M* $\rightarrow$ vertex *S*
- 22. **for** (each vertex *m* in *M*) {
- 23. compute the sum of the degrees of all the neighboring vertices of *m*
- 24. **if** (the sum of *m* is greater than the one corresponding to *S*) {
- 25. vertex *m* $\rightarrow$ vertex *S*
- 26. }
- 27. }
- 28. **ends** Select\_StartingVertex
- 29. **sub-procedure** Select\_Neighbor(*B*)
- 30. check the compatibility of every pair of vertices in *B*
- 31. compute the compatibility of each vertex *b* in *B*
- 32. vertices with maximum compatibility $\rightarrow$ vertex set *M*
- 33. **if** (the number of vertices in *M* equals one) {
- 34. output the only vertex in *M* $\rightarrow$ vertex *R*
- 35. sub-procedure terminates
- 36. }
- 37. the first vertex in *M* $\rightarrow$ vertex *R*
- 38. **for** (every vertex *m* in *M*) {
- 39. **if** (the degree of *m* is less than the degree of *R*) {
- 40. vertex *m* $\rightarrow$ vertex *R*
- 41. }
- 42. }
- 43. **ends** Select\_Neighbor(*B*)
- 44. **sub-procedure** Merge(*S*, *R*)
- 45. add a new vertex *S'*
- 46. **for** (every neighboring vertex $n_S$ of *S*) {
- 47. **if** ($n_S$ is also a neighboring vertex of *R*) {
- 48. add an edge between *S'* and $n_S$
- 49. }
- 50. }
- 51. **for** (every neighboring vertex $n_R$ of *R*) {
- 52. **if** ($n_R$ is also a neighboring vertex of *S*) {
- 53. add an edge between *S'* and $n_R$
- 54. }
- 55. }
- 56. remove all the edges connected with *R* or *S*
- 57. remove vertices *R* and *S*
- 58. **ends** Merge(*S*, *R*)

The clique partitioning algorithm shown here generates a set of disjoint cliques that cover all the vertices of the input graph. The basic idea is to find one clique in each iteration by repeatedly merging two selected vertices of the graph. The heuristic is explained as follows.

At the beginning of each iteration, a vertex is selected as the starting vertex, or seed, denoted as *S*, using the sub-procedure Select\_StartingVertex. The rule is to select the vertex with the highest degree, where the degree of a vertex is the number of edges incident to it. In other words, the degree of a vertex is the number of test patterns compatible with the test pattern that the vertex represents. The goal of clique partitioning here is to obtain the minimum number of cliques rather than, for example, to find the maximum clique, because each clique of the graph corresponds to a compacted test pattern: the number of cliques equals the number of compacted test patterns, and fewer compacted test patterns mean less test data volume and test time. Note that, compared to the algorithm from the DDEL center, University of Cincinnati [64], no vertex priority is needed in the Select\_StartingVertex sub-procedure or in the subsequent steps. This improvement reduces the algorithm complexity and computation time.

The reason for selecting the vertex with maximum degree is that the degree of a vertex is indicative of the probability that the vertex belongs to the largest clique in the graph. However, multiple vertices may share the same maximum degree. Instead of selecting one of them arbitrarily, further information is used: for each vertex with maximum degree, all its neighboring vertices (those connected to it directly by an edge) are scanned and their degrees are summed. The vertex with the maximum sum is selected. That is, the starting vertex is the vertex with maximum degree and, if necessary, with the maximum sum of degrees of its neighboring vertices. If more than one vertex still remains, one of them can be selected arbitrarily as the starting vertex; in the algorithm used here, the first such vertex is selected.

Once the starting vertex S is determined, all its neighboring vertices, denoted as vertex set N (vertex set B in Algorithm 6-1), and all the edges among S and these neighboring vertices are taken into account in the following steps of this iteration. Correspondingly, a test pattern and all its compatible test patterns, together with their mutual compatibility, are considered. For each neighboring vertex n in N, the number of vertices in N that are compatible with n is computed and denoted as DN. Note that DN is not the degree of vertex n. The degree of n counts all the compatible, or neighboring, vertices of n in the whole graph, while DN counts only the compatibility within the domain composed of the starting vertex S and its neighboring vertices. That is to say, DN is the degree of vertex n in the sub-graph composed of only S and all its neighboring vertices, so DN is never greater than the degree of n.

After the compatibility DN of each vertex in the neighboring vertex set N is computed, the vertex with maximum compatibility is selected and denoted as R. The reason for selecting the vertex R with maximum compatibility is to include as many vertices as possible in the clique, and thus to obtain the minimum number of cliques, because the vertices that are incompatible with R cannot become components of this clique. Selected in this way, vertex R excludes the minimum number of vertices from the sub-graph.

Similar to the case of selecting the starting vertex, multiple vertices with the same maximum compatibility may exist. To break the tie, the degree of these vertices in the whole graph is used as further information, and the vertex with minimum degree is selected. That is to say, once the starting vertex S is determined, the vertex with maximum degree in the sub-graph and minimum degree in the whole graph is selected as R from the neighboring vertices of S. The reason for selecting the vertex with minimum degree in the whole graph is to leave the remaining vertices, which are excluded from the current clique, with more compatibility, so that they are more likely to form new cliques and the number of cliques is reduced. If multiple such vertices still remain, one of them can be selected arbitrarily as R; in our program, the first such vertex is selected.

By now the starting vertex *S* has been selected from the whole graph and the vertex *R* has been selected from the neighboring vertices of *S*. The next step is to merge *S* and *R* into *S'*, a vertex newly added to the graph. Vertex *S'* differs from the other vertices in that each original vertex corresponds to an original test pattern, while *S'* has no counterpart among the original test patterns. Along with the new vertex *S'*, new edges are added to the graph, each connecting *S'* to a vertex that is compatible with both *S* and *R*. That is to say, a vertex is compatible with the new vertex *S'* if and only if it is compatible with both *S* and *R*. Vertices *S* and *R* are then removed, together with all edges connected with them. After the merging step, the number of vertices in the graph has decreased by one, and the number of edges has decreased as well, since at least the edge between *S* and *R* disappears.

After the addition and removal of vertices and edges, a new graph is obtained, and vertex S' now serves as the new starting vertex, i.e., vertex S' becomes the new vertex S. As in the previous steps, the vertex S and all its neighboring vertices make up a new sub-graph. Another vertex is selected as R in the sub-graph and merged with S into S', so that the number of vertices is reduced by one again. The iteration of searching and merging continues until S has no compatible vertex any more. A clique is found at that moment, consisting of all the vertices merged into S. In the test pattern space, the test patterns corresponding to the vertices in the clique are compatible with one another and can be merged into one compacted pattern.

The final *S* can now be removed from the graph, which means that all the vertices merged into *S* and all the edges connected with these vertices are removed from the graph. The program goes back to the beginning with a smaller graph and another iteration begins. In summary, one clique is found in each iteration of the algorithm: a starting vertex is selected as the seed at the beginning of the iteration, and all the vertices in the resulting clique are removed from the graph at the end. When all cliques have been found, no vertex or edge is left in the graph and the clique partitioning terminates.
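The heuristic described above can be sketched in Python as follows. This is an illustrative re-implementation, not the Boost C++ program used for the experiments. It assumes the graph is given as a dictionary of neighbor sets with sortable vertex labels, and it reuses the key of *S* for the merged vertex *S'* instead of adding a new vertex. The example edge set is hypothetical; it is merely chosen so that the result coincides with the two cliques named for Fig. 6-11.

```python
def select_starting_vertex(g):
    """Highest degree; ties broken by largest sum of neighbor degrees, then first found."""
    top = max(len(nbrs) for nbrs in g.values())
    candidates = [v for v in g if len(g[v]) == top]
    return max(candidates, key=lambda v: sum(len(g[u]) for u in g[v]))


def select_neighbor(g, s):
    """Neighbor of s with most neighbors inside B = g[s]; ties go to lowest global degree."""
    b = g[s]
    local = {n: len(g[n] & b) for n in sorted(b)}  # compatibility within the sub-graph
    best = max(local.values())
    candidates = [n for n in local if local[n] == best]
    return min(candidates, key=lambda n: len(g[n]))


def merge(g, members, s, r):
    """Merge r into s: the merged vertex keeps only the common neighbors of s and r."""
    common = g[s] & g[r]
    for u in g[s] - common:   # vertices that lose their edge to s (includes r)
        g[u].discard(s)
    for u in g[r]:            # all edges of r disappear
        g[u].discard(r)
    g[s] = common
    members[s].extend(members.pop(r))
    del g[r]


def clique_partition(graph):
    """Partition {vertex: set(neighbors)} into disjoint cliques covering all vertices."""
    g = {v: set(nbrs) for v, nbrs in graph.items()}  # working copy
    members = {v: [v] for v in g}                     # original vertices merged into v
    cliques = []
    while g:
        s = select_starting_vertex(g)
        while g[s]:                                   # s still has a compatible vertex
            merge(g, members, s, select_neighbor(g, s))
        cliques.append(sorted(members.pop(s)))
        del g[s]
    return cliques


# Hypothetical five-vertex graph (edge set assumed for illustration only).
example = {1: {2, 3, 5}, 2: {1, 4}, 3: {1, 5}, 4: {2, 5}, 5: {1, 3, 4}}
print(clique_partition(example))
```

Because the merged vertex keeps only the common neighbors, every vertex merged into *S* is adjacent to all previously merged vertices, so the collected members always form a clique.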

The computational complexity of the above approach is high, although both the algorithm and the program have been optimized. For example, it takes about ten hours to compact 10,000 test patterns for the ITC'02 SOC test benchmark circuit p34392. An analysis of this long computation time reveals two reasons:

- (I) This kind of computation requires substantial hardware resources. In the above case, there are  $10^4$  vertices and about  $0.4 \times 10^8$  edges in the graph built for  $10^4$  test patterns. Processing such a big graph requires a large amount of memory and a fast CPU.
- (II) In the process of searching for each clique, the computational complexity is  $O(n^2)$, where *n* is the number of vertices left in the graph.
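To put the edge count in reason (I) into perspective, it can be compared with the number of vertex pairs in a graph with $n = 10^4$ vertices. This is a back-of-the-envelope estimate based only on the figures quoted above:

$$\binom{n}{2} = \frac{n(n-1)}{2} = \frac{10^4 \times 9999}{2} \approx 0.5 \times 10^8, \qquad \frac{0.4 \times 10^8}{0.5 \times 10^8} \approx 0.8$$

Roughly 80% of all pattern pairs are therefore compatible, i.e., the compatibility graph is dense, which is in line with the large number of don't-care bits in the test patterns.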

Since the computation time is too long to perform a large number of experiments on benchmark circuits, especially for big SOC systems, a greedy compression heuristic, inspired by reason (II) above, is used to reduce the computation time. The pseudo code of this method is given in Algorithm 6-2. The algorithm takes the original test set *Po* as input and outputs the compacted pattern set *Pc*. A compacted test pattern is generated by each pass of the inner loop (Lines 4-7), which merges the first pattern $p_1$ in the un-compacted test set *Pu* with the compatible patterns that follow it. The algorithm stops when all test patterns are compacted and outputs the compacted test set *Pc*.

Algorithm 6-2. Greedy Compression Heuristic

Input: Original test pattern set Po

Output: Compacted test pattern set Pc

- 1. initialize  $Pc = \Phi$ ; Pu = Po;
- 2. **while**(|Pu| > 0) {
- 3. set  $p_c = Pu(1)$ ;  $Pm = \{ Pu(1) \}$ ;
- 4. **for**(i = 2 **to** |Pu|) {
- 5. **if** $(p_c \text{ and } Pu(i) \text{ are compatible})$  {
- 6. merge  $p_c$  and Pu(i) to  $p_c$;
- 7.  $Pm = Pm \cup \{Pu(i)\}$; } }
- 8.  $Pu = Pu \setminus Pm$;  $Pc = Pc \cup \{p_c\}$; }
- 9. **return** Pc;
- 10. end

The main steps of the greedy compression heuristic are illustrated in Fig. 6-12.



Fig. 6-12. Greedy compression heuristic to compact interconnect SI test patterns

This method works as follows. At the beginning of each loop, the first test pattern in the un-compacted pattern set is selected as the starting compacted pattern  $p_c$ . The program then fetches every test pattern Pu(i) in order and checks its compatibility with  $p_c$ . If Pu(i) and  $p_c$  are compatible, they are merged into a new pattern which replaces the original  $p_c$ . The next test pattern Pu(i+1) is then fetched and checked against this new  $p_c$ , and the two are merged if they are compatible.  $p_c$  becomes "bigger" during this process of compatibility checking and pattern merging, which means that the number of don't-care bits in  $p_c$  decreases. After the last pattern has been processed, the loop terminates and one compacted test pattern has been produced, while all the original test patterns merged into it are eliminated from the test pattern set. Afterwards, a new loop begins from the start of the remaining test pattern set, which contains fewer patterns than in the previous loop. After multiple loops of iterative compatibility checking and pattern merging, all the original test patterns have been merged into a number of compacted test patterns.
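As an illustration, Algorithm 6-2 can be written as a short Python sketch. The pattern encoding is an assumption made for the example: each pattern is a string over `0`, `1` and `X` (don't-care), and two patterns are taken to be compatible when they do not specify conflicting values at any position. The function names are hypothetical, and bus occupation is not modelled here.

```python
def compatible(a, b):
    """True if no position holds two different specified values."""
    return all(x == y or x == "X" or y == "X" for x, y in zip(a, b))


def merge(a, b):
    """Merge two compatible patterns, filling don't-cares where possible."""
    return "".join(y if x == "X" else x for x, y in zip(a, b))


def greedy_compact(original):
    """Greedy compression heuristic: one compacted pattern per outer loop."""
    remaining = list(original)       # Pu
    compacted = []                   # Pc
    while remaining:
        pc = remaining[0]            # first un-compacted pattern is the seed
        leftover = []
        for p in remaining[1:]:      # single in-order pass over the rest
            if compatible(pc, p):
                pc = merge(pc, p)    # p is absorbed into pc
            else:
                leftover.append(p)   # p waits for a later outer loop
        compacted.append(pc)
        remaining = leftover         # Pu = Pu \ Pm
    return compacted


if __name__ == "__main__":
    print(greedy_compact(["1XX0", "X1X0", "0XXX", "XX1X", "00XX"]))
```

Because each pattern is fetched only when it is checked against $p_c$, no compatibility graph has to be stored; the result, however, depends on the order of the input patterns.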

Compared with the graph method, the greedy compression heuristic simplifies the computation in several steps. (i) No graph file is needed, so the requirement for computation resources is reduced. A test pattern is fetched only at the time when it is to be checked against and merged with the compacted pattern  $p_c$ . (ii) The first test pattern is always selected as the starting pattern, whereas the graph method applies complex rules to select the starting vertex, which costs a lot of computation time. (iii) The un-compacted patterns are checked and merged one by one in order, whereas the graph method selects one vertex from the neighboring vertices of the starting vertex using complex rules.

Obviously this greedy strategy is not optimal, and the quality of the resulting *Pc* depends on the order of the test patterns, because the first un-compacted pattern is always selected as the starting pattern and merged with the patterns that follow it in order. However, the experiments show that a similar compaction ratio, about five percent worse than that of the graph method based on the clique partitioning formulation, can be achieved with much less execution time. For example, only half an hour is needed to compact 10,000 test patterns for the ITC'02 SOC test benchmark circuit p34392. This is mainly because of the following two reasons:

(I) No efficient exact algorithm is available for the clique partitioning problem, so the graph method also has to rely on a heuristic; (II) Since the test patterns feature a large percentage of don't-care bits, each pattern is compatible with about 95% of the other patterns. From the point of view of compatibility, which determines how many edges are incident to a vertex in the graph, most of the patterns are similar to one another, and thus the ordering of the test patterns has little effect in practice.

The above compaction scheme to reduce test pattern count can be viewed as reducing the volume of the test data in the vertical dimension as shown in Table 6-1. As each signal integrity test pattern involves only a few cores' terminals (denoted as care-cores of this signal integrity test pattern), the boundaries of those don't-care cores (e.g., core 1 for  $p_x$  in Table 6-1) can be bypassed and the length of the test pattern can be reduced. The above strategy can be viewed as compacting the test pattern in the horizontal dimension as shown in Table 6-1 and the details are discussed in the following paragraphs.

## 6.3.2 Test pattern length reduction

Based on the above observation, compacting all the test patterns as one set leaves the length of every compacted test pattern equal to the sum of all cores' WOC numbers. Instead, a method is proposed here that partitions the entire signal integrity test set into several groups and compresses the patterns in each group separately, so that the test pattern length in each group can be less than the original length. To achieve this goal, all the SOC cores are first partitioned into several core groups. The signal integrity test patterns are then classified in such a way that the test patterns whose care-cores all lie within the same core group form a test pattern group. In each test pattern group, the length of each test pattern is reduced to the sum of the WOC numbers of the cores in this core group instead of the WOC numbers of all SOC cores. For the remaining test patterns, whose care-cores fall into multiple core groups, there are two options: either repeat the core partitioning process so that further test pattern groups are created to cover the care-cores of as many remaining test patterns as possible, or simply collect all the remaining patterns in one extra group.

To achieve the maximum compression ratio for test time reduction, the objective here is to minimize the number of remaining patterns while keeping the test pattern length of the groups roughly balanced, i.e., the sum of the cores' WOC numbers should be roughly equal in each group. This problem can be formulated as a hypergraph partitioning problem. An example hypergraph and its partitioning are shown in Fig. 6-13.



Fig. 6-13. An example of hypergraph and its partitioning

A hypergraph is a graph in which generalized edges may connect more than two nodes; correspondingly, the edges in a hypergraph are called hyperedges [68].

In the example shown in Fig. 6-13, there are four hyperedges that connect vertices  $\{1, 5\}, \{1, 4\}, \{4, 6, 7\}$  and  $\{2, 3, 7\}$  respectively. If the hypergraph is to be partitioned into two parts, the vertices are divided into the sets  $\{2, 3, 7\}$  and  $\{1, 4, 5, 6\}$  in order to achieve a roughly equal partition and a minimum cut of hyperedges. The hyperedge connecting vertices  $\{4, 6, 7\}$  is cut by the partition in Fig. 6-13, while all other hyperedges remain intact in one of the sub-graphs formed by the partition.
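The example of Fig. 6-13 can be checked with a few lines of Python. The sketch below only counts which hyperedges are cut by a given partition; it does not search for a partition, and the helper name `cut_hyperedges` is hypothetical.

```python
def cut_hyperedges(hyperedges, parts):
    """Return the hyperedges whose vertices lie in more than one part."""
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    return [e for e in hyperedges if len({part_of[v] for v in e}) > 1]


if __name__ == "__main__":
    hyperedges = [{1, 5}, {1, 4}, {4, 6, 7}, {2, 3, 7}]
    parts = [{2, 3, 7}, {1, 4, 5, 6}]
    print(cut_hyperedges(hyperedges, parts))
```

For the partition given in the text, only the hyperedge on vertices 4, 6 and 7 is reported as cut.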

In the case of core grouping, each node in the hypergraph corresponds to a core, and a hyperedge is added for each test pattern, connecting all the care-cores (nodes) of this pattern. The weight of each node is the number of WOCs of the corresponding core. The weight of each hyperedge is the number of test patterns whose care-cores are exactly the cores connected by this hyperedge, since multiple test patterns may have the same care-cores. The hypergraph partitioning problem has been well researched in the literature, and the hMetis package by Karypis Lab, University of Minnesota [57, 67] is used here to solve it. The graph built according to the rules described above is fed into the hMetis package. The nodes in each part of the hypergraph produced by hMetis form a core group, and the hyperedges cut by the partitioning correspond to the remaining patterns, which form one extra group whose pattern length remains the same as that of the original patterns. The parameters of the hMetis package are adjusted to achieve a roughly balanced partition and a minimum number of cut hyperedges in order to reduce the total data volume of all the test patterns. In this way the total signal integrity test data is reduced and the test time becomes much shorter than before.
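The construction of the weighted hypergraph can be sketched as follows. The input representation is an assumption made for the example: each test pattern is reduced to the list of its care-cores, and the WOC count of each core is given in a dictionary. The function name `build_hypergraph` is hypothetical, and writing the result in the input format expected by hMetis is deliberately left out.

```python
from collections import Counter


def build_hypergraph(care_cores_per_pattern, woc_count):
    """Build node and hyperedge weights for core grouping.

    care_cores_per_pattern: iterable of care-core collections, one per pattern.
    woc_count: dict mapping each core to its number of WOCs.
    """
    # One hyperedge per distinct care-core set; its weight is the number
    # of test patterns that share exactly this set of care-cores.
    edge_weight = Counter(frozenset(cores) for cores in care_cores_per_pattern)
    # The weight of a node is the WOC count of the corresponding core.
    node_weight = dict(woc_count)
    return node_weight, edge_weight


if __name__ == "__main__":
    patterns = [[1, 2], [2, 1], [3], [1, 3]]
    wocs = {1: 40, 2: 25, 3: 60}
    nodes, edges = build_hypergraph(patterns, wocs)
    print(nodes)
    print(dict(edges))
```

Patterns with a single care-core produce one-node hyperedges, which can never be cut and therefore do not influence the partition.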

To summarize Sections 6.3.1 and 6.3.2, the main procedure to compact signal integrity test patterns in two dimensions is as follows.

- (I) Identify the care-cores of each test pattern;
- (II) Partition the SOC cores into several core groups according to the information from (I) by building a hypergraph and using hMetis package.
- (III) Pile those test patterns whose care-cores belong to the same core group into the same test pattern group. One extra group contains those test patterns whose care-cores involve several core groups;
- (IV) Check the compatibility of each test pattern with all other patterns in the same pattern group;
- (V) Based on the information in (IV), compact the test patterns in the same group either by building graphs and clique partitioning or by greedy compression heuristic.
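Step (III) can be illustrated with a small Python sketch. It assumes that the core partition is already available as a dictionary from core to group index (for example, derived from the partitioner's output) and that each pattern is described by its care-cores; the name `group_patterns` and the key `"extra"` are hypothetical.

```python
def group_patterns(care_cores_per_pattern, core_group):
    """Assign each test pattern to a test pattern group.

    Returns a dict mapping a core-group index (or "extra") to the list of
    indices of the patterns placed in that group.
    """
    groups = {}
    for idx, cores in enumerate(care_cores_per_pattern):
        hit = {core_group[c] for c in cores}
        # All care-cores in one core group: short pattern; otherwise the
        # pattern keeps its original length and goes to the extra group.
        key = hit.pop() if len(hit) == 1 else "extra"
        groups.setdefault(key, []).append(idx)
    return groups


if __name__ == "__main__":
    core_group = {1: 0, 2: 0, 3: 1, 4: 1}
    patterns = [[1, 2], [3], [2, 3], [4, 3]]
    print(group_patterns(patterns, core_group))
```

Each resulting group can then be compacted separately in steps (IV) and (V).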

The main contributions of the method proposed here are

- (1) Compacting the test patterns in the horizontal dimension, i.e., reducing pattern length (described in (II) and (III));
- (2) Compacting the test patterns in the vertical dimension, i.e., reducing pattern count (described in (V)).

Resulting from the reduction of both test pattern count and test pattern length, the total test data volume can be greatly reduced using the test pattern compaction algorithm proposed here.

## **6.4 Experimental Results**

To analyze the effectiveness of the proposed solution, experiments are carried out for several ITC'02 benchmark SOCs from [55]. Since the test patterns for core interconnect signal integrity faults of these benchmark SOCs are not available at hand, firstly test patterns are generated for the experiments in the following manner.

- (I) In each test pattern there is one victim and Na ( $2 \le Na \le 6$ ) aggressors, where at most two aggressors are outside of the victim core boundary. The victim and the aggressors are randomly selected from all the WOC ports of SOC cores.
- (II) In addition, a 32-bit bus is assumed to be utilized in all the benchmark SOCs. In the test pattern generation, the probability that the bus is used by a test pattern is set to 50%. If the bus is used in a particular pattern, 1 to Na occupied bits are randomly generated, which means that those bits are used to transfer signals in the test pattern. Bus occupation is described in the postfix of the pattern, where a "1" indicates that the corresponding bit is occupied by the test pattern. Bus occupation is also taken into account when the compatibility between test patterns is checked.

Pattern sets with  $3 \times 10^3$,  $10^4$,  $3 \times 10^4$  and  $10^5$  random patterns are generated to demonstrate the effectiveness of the proposed technique with different numbers of test patterns. Table 6-3 to Table 6-7 show the results of the two-dimensional test compaction scheme on different benchmark circuits. The results in all these tables are based on the greedy compression heuristic in the test pattern count reduction process; the results of the graph method are not listed here. The symbols used in the tables are explained as follows.

The SOC cores are partitioned into  $N_g$  core groups using the hMetis package [67] with  $N_g$ = 1, 2, 4, 8 respectively. The row with  $N_g$ = 1 therefore corresponds to the case in which the whole test set is compressed without partitioning. When  $N_g$ = 2, the cores are partitioned into two core groups and the test patterns are classified into three test pattern groups. Two of these pattern groups are composed of test patterns with reduced pattern length, while the test patterns in the third pattern group retain the original test pattern length. Similarly, five and nine pattern groups are formed for  $N_g$ = 4 and 8 respectively.

Nr represents the number of original test patterns.

*Nc* denotes the number of finally compacted test patterns.

The total test data volume, denoted by *Ds*, is calculated as the sum of the test pattern length times the corresponding test pattern count in each test pattern group.

 $\Delta Ds$  is calculated as the test data volume reduction percentage compared to the case when Ng = 1. That is to say,  $\Delta Ds$  shows the effect of test length shortening (core grouping) on the total test data volume reduction.

It can be observed from the tables that with test pattern merging only (test pattern count reduction) the compaction ratio of the total compacted test data volume over the original test set is close to  $\Delta V = 3\%$  (i.e.,  $\Delta V = Nc / Nr \times 100\%$  when Ng = 1). For example, in Table 6-3 for benchmark circuit d695, 282 compacted test patterns are created in the case of Nr = 10,000 when Ng = 1 so that the compaction ratio  $\Delta V = 2.82\%$ .
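The quantities defined above can be reproduced with a short Python sketch using figures from Table 6-3 (d695, Nr = 10,000). The function names are hypothetical, and the pattern length of 3106 bits used below is inferred from Ds / Nc in the Ng = 1 row of the table rather than stated in the text.

```python
def data_volume(groups):
    """Ds: sum of pattern length times pattern count over all pattern groups."""
    return sum(length * count for length, count in groups)


def reduction_percent(ds, ds_ref):
    """Delta Ds: change of data volume relative to the Ng = 1 case, in percent."""
    return (ds - ds_ref) / ds_ref * 100.0


def compaction_ratio(nc, nr):
    """Delta V: compacted pattern count over original pattern count, in percent."""
    return nc / nr * 100.0


if __name__ == "__main__":
    ds_ref = data_volume([(3106, 282)])            # Ng = 1: a single group
    print(ds_ref)                                  # 875892
    print(round(reduction_percent(708002, ds_ref), 2))
    print(round(compaction_ratio(282, 10_000), 2))
```

With several pattern groups, `data_volume` takes one `(length, count)` pair per group, including the extra group that keeps the original length.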

With test pattern length reduction by core grouping, the test data volume can be further reduced, by more than 20% in many cases, on top of  $\Delta V$ . For example, 288, 314 and 355 compacted patterns are left after the two-dimensional compaction when Ng = 2, 4 and 8 respectively in the case of Nr = 10,000 shown in Table 6-3 for benchmark circuit d695. Although the numbers of finally compacted patterns are greater than that for Ng = 1, the total data volume is reduced because many of the compacted patterns are shorter when the core grouping method is applied. In the above case the total test data volume is reduced further by 19.17%, 25.10% and 26.47% when the cores are partitioned into 2, 4 and 8 groups respectively, so the smallest data volume is achieved when the cores are partitioned into 8 groups. However, partitioning the cores into 8 groups does not always give the smallest data volume. For example, the minimum is reached when the cores are partitioned into 4 groups in the case of Nr = 3,000 for benchmark SOC g1023 shown in Table 6-4.

Some experimental results are shown as histograms in Fig. 6-14 to demonstrate the effectiveness of this compaction method on different benchmark circuits with different numbers of original test patterns. The vertical axis in Fig. 6-14 represents the final total test data volume in millions of bits, and the four columns in each cluster show the data volume with Ng = 1, 2, 4 and 8 respectively from left to right.



Fig. 6-14. Test pattern compaction results on different benchmark circuits with different number of test patterns

It can be seen from Fig. 6-14 that the compaction scheme proposed here is more effective for big circuits with a large number of test patterns, which indicates that the application of the compaction scheme will be promising in industry. The detailed experimental data can be found in Table 6-3 to Table 6-7.

For comparison, in the test data compression technique proposed by Tehranipour et al. [62], one pattern is appended in each step of the vector compression process to the bit stream obtained up to that point, overlapping the similar part. Fig. 6-15 demonstrates the basic idea of compressing two vectors by overlapping the part common to both vectors. There must be no conflict in the common part, which means that at each position of the common part the two corresponding bits of the two vectors must be the same or one of them must be a don't-care bit. To compress the two vectors as much as possible, the overlapped part should contain as many bits as possible.



Fig. 6-15. Compress two vectors by overlapping non-conflict parts

In the compression process using the above-mentioned technique, one number is needed for each pattern to record how many bits have been shifted during the compression, because this number is required to decompress a given (compressed) bit stream and reconstruct the patterns from it. For example,  $d_i$  and  $d_j$  in Fig. 6-15 are the shift numbers for vectors  $V_i$  and  $V_j$  respectively.

This compression process continues for each of the patterns by compressing (overlapping and appending) one pattern at a time with the combined pattern obtained up to the previous step. The process is illustrated in Fig. 6-16 with four example vectors  $V_1$ - $V_4$ .



Fig. 6-16. Constructive compression technique

The decompression data (e.g.  $d_1$ - $d_4$  in Fig. 6-16), showing the number of shifted bits, are stored in the ATE and will be used to control the TMS signal in the test mode. Considering the case when initially there are *m* patterns each with length of *l*, the total length of the finally compressed test data, to be delivered under test mode, will be  $\sum_{i=1}^{n} d_i$  while the total length of the initially uncompressed patterns is  $l \cdot m$ . Therefore, the compression rate will be  $\frac{l \cdot m - \sum_{i=1}^{n} d_i}{l \cdot m} \times 100\%$ . The experimental results of compression rate for three kinds of interconnect fault models (MA, deterministic and pseudorandom) are listed below:

| Application Method | Compression Rate [%] |              |              |
|--------------------|----------------------|--------------|--------------|
|                    | <i>m</i> =8          | <i>m</i> =16 | <i>m</i> =32 |
| MA                 | 37.5                 | 37.8         | 38.3         |
| Deterministic      | 46.7                 | 57.2         | 59.8         |
| Pseudorandom       | 58.1                 | 61.2         | 63.8         |

Table 6-2. Compression rate for different test pattern sets
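The overlap-and-append idea and the compression rate formula can be illustrated with the following Python sketch. It is a simplified reading of the technique in [62], not its implementation: patterns are assumed to be strings over `0`, `1` and `X`, each vector is overlapped greedily with the tail of the current bit stream, and the shift number of a vector is taken to be the number of bits it appends. The function names are hypothetical.

```python
def max_overlap(stream, vec):
    """Largest k such that the last k bits of stream and the first k bits
    of vec do not conflict (equal bits, or at least one don't-care)."""
    for k in range(min(len(stream), len(vec)), 0, -1):
        tail, head = stream[len(stream) - k:], vec[:k]
        if all(a == b or a == "X" or b == "X" for a, b in zip(tail, head)):
            return k
    return 0


def overlap_compress(vectors):
    """Append vectors one at a time; return the bit stream and shift numbers."""
    stream, shifts = "", []
    for vec in vectors:
        k = max_overlap(stream, vec)
        tail = stream[len(stream) - k:]
        # Resolve don't-cares inside the overlapped part.
        merged = "".join(b if a == "X" else a for a, b in zip(tail, vec[:k]))
        stream = stream[:len(stream) - k] + merged + vec[k:]
        shifts.append(len(vec) - k)   # bits newly appended for this vector
    return stream, shifts


def compression_rate(length, count, shifts):
    """(l*m - sum of d_i) / (l*m) * 100, with one d_i per pattern."""
    return (length * count - sum(shifts)) / (length * count) * 100.0


if __name__ == "__main__":
    stream, shifts = overlap_compress(["10X1", "X100", "0011"])
    print(stream, shifts)
    print(round(compression_rate(4, 3, shifts), 2))
```

The shift numbers must be kept alongside the bit stream, since they are needed to recover the individual patterns.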

It can be seen that the highest compression rate, 63.8%, occurs in the case of m=32 with pseudorandom patterns (note that random test patterns are used in the previous experiments with our proposed algorithm), which means that the total test data volume can be compressed to 36.2% of the initial uncompressed volume. In contrast, with the compaction algorithm proposed here, the total data volume is consistently reduced to about 3% of the original, which is much lower than the best result achieved by the compression technique proposed in [62]. Of course, the patterns used in the experiments with the two techniques are not the same, so the comparison cannot be regarded as absolute. Nevertheless, our algorithm achieves much better experimental results and is inherently more powerful because

- (I) the technique in [62] has only utilized the similarity of a pattern with another pattern while our algorithm has explored the similarity of each pattern with all other patterns; therefore, our algorithm can find and thus provide more chance to compress patterns;
- (II) the technique in [62] has only utilized the similarity of two patterns at one side of each pattern while our algorithm has explored the similarity of each pattern with all other patterns at the position of all the bits.

Moreover, the number of shift bits needs to be stored for each pattern in order to decompress the compressed bit stream [62] while there is no such need at all in our algorithm. Therefore, our algorithm can greatly reduce the complexity of compressing, decompressing and testing.

To summarize this chapter, with the shrinking feature size and increasing frequency of VLSI circuits, testing interconnect signal integrity has become necessary, and the testing time might be prohibitive without further improvements. We have presented an algorithm for compacting interconnect SI test patterns which not only reduces the number of patterns but also shortens the patterns, so that the total test data volume is greatly reduced. Compared with the other technique, our algorithm is inherently better because it exploits the similarity of each pattern with all other patterns at all bit positions. Experimental results have demonstrated the success of our algorithm.

| SOC d695 | $N_r$ = 3,000 |        |                 | $N_r$ = 10,000 |        |                 |
|----------|---------------|--------|-----------------|----------------|--------|-----------------|
| $N_g$    | Nc            | Ds     | $\Delta Ds(\%)$ | Nc             | Ds     | $\Delta Ds(\%)$ |
| 1        | 93            | 288858 | /               | 282            | 875892 | /               |
| 2        | 108           | 260196 | -9.92           | 288            | 708002 | -19.17          |
| 4        | 117           | 230168 | -20.32          | 314            | 656016 | -25.10          |
| 8        | 130           | 215458 | -25.41          | 355            | 644039 | -26.47          |

| SOC d695 | $N_r$ = 30,000 |         |                 | $N_r$ = 100,000 |         |                 |
|----------|----------------|---------|-----------------|-----------------|---------|-----------------|
| $N_g$    | Nc             | Ds      | $\Delta Ds(\%)$ | Nc              | Ds      | $\Delta Ds(\%)$ |
| 1        | 781            | 2425786 | /               | 2453            | 7619018 | /               |
| 2        | 805            | 2029855 | -16.32          | 2520            | 6330063 | -16.92          |
| 4        | 862            | 1901408 | -21.62          | 2608            | 5776955 | -24.18          |
| 8        | 940            | 1851683 | -23.67          | 2769            | 5639062 | -25.99          |

Table 6-3. Test pattern compaction on SOC d695

| SOC g1023 | $N_r$ = 3,000 |        |                 | $N_r$ = 10,000 |         |                 |
|-----------|---------------|--------|-----------------|----------------|---------|-----------------|
| $N_g$     | Nc            | Ds     | $\Delta Ds(\%)$ | Nc             | Ds      | $\Delta Ds(\%)$ |
| 1         | 83            | 455255 | /               | 278            | 1524830 | /               |
| 2         | 99            | 443035 | -2.68           | 306            | 1351202 | -11.39          |
| 4         | 105           | 390453 | -14.23          | 316            | 1231557 | -19.23          |
| 8         | 126           | 401608 | -11.78          | 335            | 1156850 | -24.13          |

| SOC g1023 | $N_r$ = 30,000 |         |                 | $N_r$ = 100,000 |          |                 |
|-----------|----------------|---------|-----------------|-----------------|----------|-----------------|
| $N_g$     | Nc             | Ds      | $\Delta Ds(\%)$ | Nc              | Ds       | $\Delta Ds(\%)$ |
| 1         | 795            | 4360575 | /               | 2449            | 13432765 | /               |
| 2         | 823            | 3646085 | -16.39          | 2527            | 11233225 | -16.37          |
| 4         | 850            | 3310127 | -24.09          | 2590            | 10298042 | -23.34          |
| 8         | 901            | 3211471 | -26.35          | 2665            | 9831632  | -26.81          |

Table 6-4. Test pattern compaction on SOC g1023

| SOC p34392 | $N_r$ = 3,000 |        |                 | $N_r$ = 10,000 |        |                 |
|------------|---------------|--------|-----------------|----------------|--------|-----------------|
| $N_g$      | Nc            | Ds     | $\Delta Ds(\%)$ | Nc             | Ds     | $\Delta Ds(\%)$ |
| 1          | 93            | 267933 | /               | 286            | 823966 | /               |
| 2          | 102           | 232953 | -13.06          | 297            | 686289 | -16.71          |
| 4          | 116           | 221295 | -17.41          | 328            | 646552 | -21.53          |
| 8          | 152           | 217954 | -18.65          | 395            | 665157 | -19.27          |

| SOC p34392 | $N_r$ = 30,000 |         |                 | $N_r$ = 100,000 |         |                 |
|------------|----------------|---------|-----------------|-----------------|---------|-----------------|
| $N_g$      | Nc             | Ds      | $\Delta Ds(\%)$ | Nc              | Ds      | $\Delta Ds(\%)$ |
| 1          | 767            | 2209727 | /               | 2510            | 7231310 | /               |
| 2          | 784            | 1812550 | -17.97          | 2564            | 5976378 | -17.35          |
| 4          | 839            | 1678097 | -24.06          | 2662            | 5412729 | -25.15          |
| 8          | 973            | 1642850 | -25.62          | 2990            | 5374511 | -25.68          |

Table 6-5. Test pattern compaction on SOC p34392

| SOC p22810 | $N_r$ = 3,000 |        |                 | $N_r$ = 10,000 |         |                 |
|------------|---------------|--------|-----------------|----------------|---------|-----------------|
| $N_g$      | Nc            | Ds     | $\Delta Ds(\%)$ | Nc             | Ds      | $\Delta Ds(\%)$ |
| 1          | 91            | 633087 | /               | 273            | 1899261 | /               |
| 2          | 100           | 551423 | -12.90          | 299            | 1650915 | -13.06          |
| 4          | 109           | 509494 | -19.52          | 317            | 1498619 | -21.09          |
| 8          | 128           | 515998 | -18.49          | 352            | 1437030 | -24.34          |

| SOC p22810 | $N_r$ = 30,000 |         |                 | $N_r$ = 100,000 |          |                 |
|------------|----------------|---------|-----------------|-----------------|----------|-----------------|
| $N_g$      | Nc             | Ds      | $\Delta Ds(\%)$ | Nc              | Ds       | $\Delta Ds(\%)$ |
| 1          | 765            | 5322105 | /               | 2481            | 17260317 | /               |
| 2          | 792            | 4447422 | -16.43          | 2539            | 14140124 | -18.08          |
| 4          | 813            | 3929589 | -26.16          | 2616            | 12870199 | -25.43          |
| 8          | 868            | 3800641 | -28.59          | 2720            | 12145754 | -29.63          |

Table 6-6. Test pattern compaction on SOC p22810

| SOC p93791 | $N_r$ = 3,000 |        |                 | $N_r$ = 10,000 |         |                 |
|------------|---------------|--------|-----------------|----------------|---------|-----------------|
| $N_g$      | Nc            | Ds     | $\Delta Ds(\%)$ | Nc             | Ds      | $\Delta Ds(\%)$ |
| 1          | 91            | 969059 | /               | 271            | 2885879 | /               |
| 2          | 103           | 851510 | -12.13          | 293            | 2507958 | -13.10          |
| 4          | 106           | 750780 | -22.52          | 310            | 2222363 | -22.99          |
| 8          | 125           | 714955 | -26.22          | 331            | 2119383 | -26.56          |

| SOC p93791 | $N_r$ = 30,000 |         |                 | $N_r$ = 100,000 |          |                 |
|------------|----------------|---------|-----------------|-----------------|----------|-----------------|
| $N_g$      | Nc             | Ds      | $\Delta Ds(\%)$ | Nc              | Ds       | $\Delta Ds(\%)$ |
| 1          | 757            | 8061293 | /               | 2468            | 26281732 | /               |
| 2          | 790            | 6786332 | -15.82          | 2577            | 22102780 | -15.90          |
| 4          | 834            | 6243358 | -22.55          | 2576            | 19651865 | -25.23          |
| 8          | 883            | 5901943 | -26.79          | 2690            | 18863425 | -28.23          |

Table 6-7. Test pattern compaction on SOC p93791

In conclusion of this chapter, with the shrinking feature size of process technologies, the test cost for the SOC interconnect signal integrity faults can be prohibitive. To cope with this problem, a two-dimensional signal integrity test pattern compaction scheme is proposed here which can reduce both the test pattern count and pattern length. Experimental results show that the proposed solution can significantly reduce the overall SOC interconnects test data volume especially when the test pattern count is large and the pattern length is long. Therefore interconnect signal integrity test time can be greatly reduced.

## **Chapter 7 CONCLUSION AND FUTURE WORK**

In the above chapters the proposed designs of

- (I) self testable full range window comparator,
- (II) BIST system based on voltage scan chain and window comparator,
- (III) fast rail to rail voltage comparator and
- (IV) interconnect signal integrity test pattern compaction algorithm

are presented.

The proposed window comparator structure has the advantage that the operational amplifiers do not swing between their positive and negative rail voltages, so that delay time and power consumption are reduced. In addition, supporting circuits such as the self test circuit and the FRWC output identification circuit have been proposed, considering the possibility that faults may occur in the comparator circuit itself. An accuracy adjusting method has also been proposed to obtain high precision by compensating the fabrication variation of the comparator circuit. The analysis of the influence of resistor variation on the FRWC accuracy shows that, if the resistors in the comparator circuit are fabricated with an accuracy of *w*, the accuracy of the voltage window comparator resulting from this parameter variation is not worse than  $\frac{2w}{(1-w)}$ . Furthermore, all catastrophic faults that may occur in the FRWC circuit have been simulated in PSPICE to illustrate the effectiveness of the proposed design. The self test circuit, without using other circuits or external test equipment, is used to test the FRWC circuit with one fault injected at a time. In the design proposed here the self test circuit provides a complete test of the window comparator to ensure that the circuit is fault-free, so that the output of the FRWC during SOC test can be trusted.
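To give a feeling for the size of the accuracy bound quoted above, the following Python sketch evaluates $\frac{2w}{(1-w)}$ for a few resistor accuracies. The chosen values of *w* are illustrative assumptions and are not taken from the experiments; the function name is hypothetical.

```python
def comparator_accuracy_bound(w):
    """Worst-case window comparator accuracy 2w / (1 - w) for resistor accuracy w."""
    if not 0 <= w < 1:
        raise ValueError("w must satisfy 0 <= w < 1")
    return 2 * w / (1 - w)


if __name__ == "__main__":
    for w in (0.001, 0.01, 0.05):   # assumed example accuracies
        print(f"w = {w:.3f} -> bound = {comparator_accuracy_bound(w):.4%}")
```

For small *w* the bound is close to 2*w*, so resistors matched to 1% limit this error contribution to roughly 2%.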

The Built-In Self-Test (BIST) structure for SOC testing proposed here is based on a scan chain structure and the full range window comparator. The basic idea is to scan the voltages of internal nodes in the cores of the SOC and to judge whether each voltage lies within its tolerance range. Faulty cores can then be diagnosed from the judging results. The circuit that implements this BIST system is mainly made up of (1) the full range window comparator, (2) the testing controller, (3) the self test circuit and (4) the core selecting block. For the core selecting mechanism used in the core selecting block and the test interface of each core, three options are proposed: (1) a direct structure, (2) a chain structure and (3) a combination of the two. Correspondingly, a test interface circuit, attached to the ordinary circuit of each core, is also proposed to cooperate with this BIST system. The test interface receives the core selecting signal, outputs the desired voltages serially and generates the core selecting signal for the next core.
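The judging step described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the bit polarity (1 means "inside the window"), the inclusive window limits and the names `judge_core` and `diagnose` are hypothetical and do not describe the actual circuit.

```python
from typing import Dict, List, Tuple

Window = Tuple[float, float]  # (lower limit, upper limit) in volts


def judge_core(voltages: List[float], windows: List[Window]) -> List[int]:
    """Return one bit per scanned node: 1 if the voltage lies inside its window, else 0."""
    if len(voltages) != len(windows):
        raise ValueError("each scanned voltage needs exactly one tolerance window")
    return [1 if lo <= v <= hi else 0 for v, (lo, hi) in zip(voltages, windows)]


def diagnose(scanned: Dict[str, List[float]],
             limits: Dict[str, List[Window]]) -> Dict[str, bool]:
    """Map each core to True (all nodes in range) or False (at least one node out of range)."""
    return {core: all(judge_core(scanned[core], limits[core])) for core in scanned}


if __name__ == "__main__":
    # Hypothetical node voltages and tolerance windows for two cores.
    scanned = {"core_a": [0.48, 0.91], "core_b": [0.10, 0.70]}
    limits = {"core_a": [(0.45, 0.55), (0.85, 0.95)],
              "core_b": [(0.05, 0.15), (0.40, 0.60)]}
    print(diagnose(scanned, limits))
```

In this model a core is reported as faulty as soon as one of its scanned voltages falls outside its tolerance range, which mirrors the diagnosis rule stated in the text.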

Simulation results for this BIST system show that the self test of the window comparator takes nine clock cycles and that the number of clock cycles required to test a core is the number of analogue voltages to be tested plus one. The simulation results also show that this BIST system is effective for testing analogue voltage signals in an SOC, delivering trustworthy test results in binary form. The BIST system can easily be realized in an SOC, and the supporting test interface in each core is simple no matter how deeply the core is embedded. It completes the testing procedure with small hardware overhead and, at the same time, places lower requirements on the external ATE, so the testing cost can be reduced.
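The cycle counts reported above lend themselves to a simple test-time estimate. The sketch below assumes, beyond what the simulation results state, that the comparator self test runs once and that the cores are then tested one after another; the function names are hypothetical.

```python
from typing import Iterable

SELF_TEST_CYCLES = 9  # clock cycles for the window comparator self test (simulation result)


def core_test_cycles(num_voltages: int) -> int:
    """Clock cycles to test one core: number of analogue voltages plus one."""
    if num_voltages < 0:
        raise ValueError("num_voltages must be non-negative")
    return num_voltages + 1


def total_test_cycles(voltages_per_core: Iterable[int]) -> int:
    """Assumed total: one self test followed by sequential testing of every core."""
    return SELF_TEST_CYCLES + sum(core_test_cycles(n) for n in voltages_per_core)


if __name__ == "__main__":
    # Hypothetical SOC with three cores exposing 4, 6 and 3 analogue voltages.
    print(total_test_cycles([4, 6, 3]))
```

For the hypothetical three-core example the estimate is 9 + (5 + 7 + 4) = 25 clock cycles.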

The fast rail-to-rail voltage comparator proposed in this thesis is made up of four parts: (1) a first amplifier, which can also be regarded as the input buffer; (2) a second amplifier; (3) a current summing circuit, which sums the corresponding currents from the second stage; and (4) an output buffer, which makes the comparator's output swing rail-to-rail and improves its capability to drive subsequent devices. The proposed comparator achieves (I) rail-to-rail input range, (II) rail-to-rail output, (III) large voltage gain and fine sensitivity to the input voltages and (IV) good transient behaviour, with short response and transition times.

Three working points, at (1) 0.02 V, (2) 0.50 V and (3) 0.98 V, are studied in particular to characterize the proposed comparator with a power supply of 0-1 V. Simulation results show that the proposed comparator performs well in both DC voltage gain and transient response for rail-to-rail inputs. The comparator works well even when the input voltages are close to the power supply rails, which is much better than other comparators known so far.

With the shrinking feature size of process technologies, the cost of testing SOC interconnects for signal integrity faults can be prohibitive. To cope with this problem, a signal integrity test pattern compaction scheme is proposed that reduces both the test pattern count and the pattern length.

To shorten the test patterns, all the SOC cores are first partitioned into several core groups. The signal integrity test patterns are then classified so that the test patterns whose care cores all lie within the same core group form a test pattern group. In each test pattern group, the length of each test pattern is reduced to the sum of the WOC numbers of the cores in that core group, instead of the WOC numbers of all SOC cores. Core grouping can be formulated as a hypergraph partitioning problem: each node in the hypergraph corresponds to a core, and a hyperedge is added for each test pattern, connecting all the care cores (nodes) of that pattern. The hMetis package is used to solve the hypergraph partitioning problem and thereby partition the SOC cores.
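The hypergraph formulation and the resulting pattern lengths can be sketched as follows. The partition is supplied by hand here as a stand-in for the output of a partitioner such as hMetis, which is not called. The sketch also assumes that a pattern whose care cores span several groups keeps the full length. All names and numbers are hypothetical.

```python
from typing import Dict, List, Set


def build_hyperedges(pattern_care_cores: List[Set[str]]) -> List[Set[str]]:
    """One hyperedge per test pattern, connecting that pattern's care cores."""
    return [set(cores) for cores in pattern_care_cores if cores]


def cut_hyperedges(hyperedges: List[Set[str]], group_of: Dict[str, int]) -> int:
    """Count hyperedges (patterns) whose care cores span more than one core group."""
    return sum(1 for edge in hyperedges if len({group_of[c] for c in edge}) > 1)


def pattern_length(care_cores: Set[str], group_of: Dict[str, int],
                   woc_count: Dict[str, int]) -> int:
    """Pattern length: WOC sum of one core group if the pattern fits in it, else of all cores."""
    groups = {group_of[c] for c in care_cores}
    if len(groups) == 1:
        group = next(iter(groups))
        return sum(n for core, n in woc_count.items() if group_of[core] == group)
    return sum(woc_count.values())  # assumption: uncompacted full-length pattern


if __name__ == "__main__":
    woc_count = {"A": 10, "B": 20, "C": 30, "D": 40}  # hypothetical WOC numbers
    group_of = {"A": 0, "B": 0, "C": 1, "D": 1}       # hand-made two-way partition
    patterns = [{"A", "B"}, {"C"}, {"B", "C"}]
    edges = build_hyperedges(patterns)
    print(cut_hyperedges(edges, group_of))
    print([pattern_length(p, group_of, woc_count) for p in patterns])
```

A partition that cuts few hyperedges leaves most patterns inside a single core group, which is why the core grouping step maps naturally onto hypergraph partitioning.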

The test pattern count is reduced by merging multiple test patterns into one pattern when they are compatible, that is, when no bit position holds conflicting values across these patterns. Bus occupancy is also taken into account: bus lines are shared and may connect many cores at the same time, so several signal integrity test patterns may trigger the same bus line from different core boundaries. The goal is to reduce the test pattern count as much as possible, and the problem of finding the minimum number of compacted test patterns for a given test set can be formulated as a clique partitioning problem. A graph is created in which each vertex corresponds to a test pattern and an edge is added between two vertices if the corresponding two test patterns are compatible. The minimum number of cliques that cover all the vertices of the graph is then sought; each clique represents a compacted test pattern, and the number of cliques is the final number of compacted test patterns. In addition, a greedy compaction heuristic is proposed to reduce the computation time.
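The compatibility test and a greedy merging pass can be sketched as below. This is a generic first-fit heuristic written for illustration and is not necessarily the heuristic proposed in this thesis. Patterns are assumed to be strings over `0`, `1` and `x` (don't care), and bus occupancy is not modelled.

```python
from typing import List, Optional


def merge(a: str, b: str) -> Optional[str]:
    """Merge two equal-length patterns over {'0', '1', 'x'}; return None on a bit conflict."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    merged = []
    for p, q in zip(a, b):
        if p == "x":
            merged.append(q)
        elif q == "x" or p == q:
            merged.append(p)
        else:
            return None  # one pattern requires 0 and the other 1 at this bit
    return "".join(merged)


def greedy_compact(patterns: List[str]) -> List[str]:
    """First-fit greedy compaction: merge each pattern into the first compatible one."""
    compacted: List[str] = []
    for pattern in patterns:
        for i, existing in enumerate(compacted):
            candidate = merge(existing, pattern)
            if candidate is not None:
                compacted[i] = candidate
                break
        else:
            compacted.append(pattern)  # no compatible pattern found, start a new one
    return compacted


if __name__ == "__main__":
    print(greedy_compact(["1xx0", "x1x0", "0xxx", "xx1x"]))
```

Because each pattern is merged into an accumulated pattern rather than compared with a single member, every group built this way is pairwise compatible and therefore corresponds to a clique in the compatibility graph. The result is not guaranteed to use the minimum number of cliques.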

The main contributions of the proposed method are (1) reducing the pattern length and (2) reducing the pattern count. Experiments show that the proposed solution significantly reduces the overall SOC interconnect test data volume, especially when the test pattern count is large and the patterns are long. The interconnect signal integrity test time can therefore be greatly reduced, which indicates that the compaction scheme is promising for industrial application to large SOCs.

IC testing has become quite challenging as semiconductor technology moves into the deep-submicron region. The number of transistors on a chip is growing from millions to billions, and different kinds of cores are integrated onto a single chip. Improvements in IC testing can substantially reduce design time and chip cost. The following section lists the proposed research projects on IC testing.

[1]. Minimization of the hardware overhead introduced by the BIST system

Owing to the scan nature of the proposed method, the hardware overhead of the current BIST scheme is already small. On the other hand, the hardware overhead of the extra control signals and of routing the analogue voltages from the analogue circuits to the testing device can be large. It is therefore worthwhile to realize part of the BIST system using existing on-chip resources (such as built-in processors, A/Ds and D/As) so that the hardware can be further reduced. Another way to reduce test hardware is to improve the scan path structure for both the analogue and the digital circuits.

[2]. Parallel testing

In the current version of this BIST scheme, node voltages are tested one after another, and the test responses are sent out serially. A better approach is to test in parallel to reduce the test time.

[3]. Test and output data compression

To reduce the amount of test response data, data compression or transformation can be used, which requires a data compression algorithm and circuit. For complicated circuits the amount of test data to be fed into the circuit is also large, so data compression can be applied to the test input data as well.

[4]. Oscillation based testing method

The advantage of oscillation testing is that no input test data is needed. Further research can be done on configuring the circuit as an oscillator in test mode so that circuit complexity is reduced and fault coverage is improved. The test response analysis method can also be improved to reduce test time and test channel width.

[5]. Wrapper cell design

Marinissen et al. have proposed a wrapper design for embedded core test [56], while Iyengar et al. have proposed optimizing the wrapper cell and the test structure design at the same time [48]. At the driver side of the interconnections, wrapper cells are needed to generate or apply the required patterns with different transitions, delay times and time skews. At the receiver side, integrity sensors are needed to detect such errors as overshoot, undershoot, glitches and extra delay. Such a sensor should be accurate enough to match the circuit performance, with low power and a high working frequency. In addition, the sensor should incur little hardware overhead so that it can be embedded into wrapper cells for testing.

[6]. Automatic test pattern generation

Automatic test pattern generation is much needed because it is impractical to generate a huge volume of test patterns for a complex circuit by hand. We plan to generate test patterns after layout. That is, circuit information such as coupling capacitance can be extracted once the circuit layout is fixed, and the test patterns can then be generated from the extracted information. Coupling capacitance exists between wires that are close to each other, so test patterns are needed, and will be generated, for these wires.

# REFERENCES

- [1] A. Allan, D. Edenfeld, W. H. Joyner Jr., A. B. Kahng, M. Rogers, and Y. Zorian, "2001 Technology Roadmap for Semiconductors," *IEEE Computer*, vol. 35, pp. 42-53, Jan. 2002.
- [2] F. Beenker, B. Bennetts, and L. Thijssen, Frontiers in Electronic Testing, Volume 3, Kluwer Academic Publishers, 1995.
- [3] Sungbae Hwang, and Jacob A. Abraham, "Reuse of Addressable System Bus for SOC Testing," Proceedings of *the 14th Annual IEEE International ASIC/SOC Conference*, pp. 215-219, Sept. 2001.
- [4] K. De, "Test Methodology for Embedded Cores Which Protects Intellectual Property," Proceedings, VLSI Test Sym. (VTS-97), pp. 2-9, May 1997.
- [5] J. E. Franca, "Analogue-Digital Window Comparator with Highly Flexible Programmability," *Electronics Letters*, Vol.27, No.22, pp.2063-2064, October 1991.
- [6] D. De Venuto, M. J. Ohletz, and G. Matarrese, "Static and Dynamic On-Chip Test Response Evaluation using a Two-Mode Comparator," Proceedings, *ETW* 2000, pp.47-52, May 23-26, 2000, Cascais, Portugal.
- [7] D. De Venuto, M. J. Ohletz, and B. Riccò, "Testing of Analogue Circuits via (Standard) Digital Gates," Proceedings, *International Symposium on Quality Electronic Design (ISQED'02)*, pp. 112-119, 2002.
- [8] D. De Venuto, M. J. Ohletz, and B. Riccò, "Self-Positioning Digital Window Comparators for Mixed-Signal DfT," Proc. of *IEEE 9th International Conference on Emerging Technologies and Factory Automation (ETFA)*, Vol. 1, pp. 438-443, Lisbon, Portugal, 16-19 September 2003.
- [9] M. W.T. Wong, "On the Issues of Oscillation Test Methodology," *IEEE Transactions on Instrumentation and Measurement*, 49, pp. 240-245, 2000.
- [10] J. Roh, and J. A. Abraham, "A Comprehensive Signature Analysis Scheme for Oscillation-test," *IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems*, 22, pp. 1409-1423, 2003.
- [11] V. Arora, W. B. Jone, D. C. Huang, and S. R. Das, "A Parallel Built-in Self-diagnostic Method for Nontraditional Faults of Embedded Memory Arrays," *IEEE Transactions on Instrumentation and Measurement*, 53, pp. 915-932, 2004.
- [12] Mike W.T. Wong and Yubin Zhang, "Design and Implementation of Self-Testable Full Range Window Comparator," Proceedings, *IEEE Region 10 International Conference (TENCON*'2004), pp.262-265, Vol. D, November 2004, Chiang Mai, Thailand.
- [13] P. E. Allen, and D. R. Holberg, CMOS Analog Circuit Design, 2nd ed., Oxford, U.K.: Oxford Univ. Press, 2002.
- [14] A. B. Dowlatabadi, and J. A. Connelly, "A New Offset Cancellation Technique for CMOS Differential Amplifiers," *ISCAS*'95, pp. 2229-2232, 1995.
- [15] A. Monpapassorn, "An Analogue Switch Using a Current Conveyor," Int. J. Electronics, Vol. 89, No. 8, pp. 651-56, 2002.
- [16] C.-L. Wey, "Built-in Self-test (BIST) Structure for Analog Circuit Fault Diagnosis," IEEE Transactions on Instrumentation and Measurement, 39, pp. 517-521, 1990.
- [17] Larry T. Wurtz, "Built-In Self-Test Structure for Mixed-Mode Circuits," *IEEE Transactions on Instrumentation and Measurement*, Vol. 42, No. 1, pp. 25-29, Feb. 1993.
- [18] A. B. Dowlatabadi and J. A. Connelly, "A Generic Voltage Comparator Analogue Cell Produced in Standard Digital CMOS Technologies," *Circuits and Systems, IEEE 39th Midwest symposium on*, Vol. 1, pp. 35-38, Aug. 1996.
- [19] R. Jacob Baker, CMOS circuit design, layout, and simulation, New York: IEEE Press, 2005.
- [20] B. G. Song, O. J. Kwon, I. K. Chang, H. J. Song, and K. D. Kwack, "A 1.8 V Self-biased Complementary Folded Cascode Amplifier," *AP-ASIC* '99, *The First IEEE Asia Pacific Conference on ASICs*, pp. 63-65, Aug. 1999.
- [21] Yubin Zhang and Mike W.T. Wong, "Self-Testable Full Range Window Comparator," Proceedings, IEEE Asian Test Symposium (ATS'2004), pp. 314-318, November 2004, Kenting, Taiwan.
- [22] V. D. Agrawal, C. R. Kime, and K. K. Saluja, "A tutorial on built-in self-test, part 1: Principles," IEEE Design Test Comput., vol. 10, no. 2, pp. 73–82, Apr. 1993.
- [23] Dong Xiang, M. J. Chen, J. G. Sun, and H. Fujiwara, "Improving Test Effectiveness of Scan-based BIST by Scan Chain Partitioning," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 24, No. 6, pp. 916-927, June 2005.
- [24] Puneet Gupta, A. B. Kahng, I. I. Mandoiu, and P. Sharma, "Layout-aware Scan Chain Synthesis for Improved Path Delay Fault Coverage," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 24, No. 7, pp. 1104-1114, July 2005.
- [25] Tsuyoshi Shinogi, Y. Yamada, T. Hayashi, T. Yoshikawa, and S. Tsuruoka, "Parallel Core Testing with Multiple Scan Chains by Test Vector Overlapping," VLSI Design, Automation and Test (VLSI-TSA-DAT), 2005 IEEE VLSI-TSA International Symposium on, pp. 204-207, 27-29 April 2005.
- [26] Ozgur Sinanoglu and Alex Orailoglu, "Test Power Reductions through Computationally Efficient, Decoupled Scan Chain Modifications," *IEEE Transactions on Reliability*, Vol. 54, No. 2, pp. 215-223, June 2005.
- [27] James Chien-Mo Li, "Diagnosis of Single Stuck-at Faults and Multiple Timing Faults in Scan Chains," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 13, No. 6, pp. 708-718, June 2005.
- [28] G. Huertas, D. Vazquez, D. A. Rueda, and J. L. Huertas, "Effective Oscillation-based Test for Application to a DTMF Filter Bank," Proceedings, *International Test Conference*, pp. 549-555, Sept. 1999.
- [29] K. Shu-Min Li, C. L. Lee, C. Su, and J. E. Chen, "Oscillation Ring Based Interconnect Test Scheme for SOC," *Proceedings of the Asia and South Pacific Design Automation Conference*, Vol. 1, pp. 184-187, 18-21 Jan. 2005.
- [30] R. Rajsuman, "Iddq Testing for CMOS VLSI," *Proceedings of the IEEE*, Vol. 88, Issue 4, pp. 544-568, April 2000.
- [31] Burong Xu, Hongbing Xu, Ying Deng and Houjun Wang, "A Novel Method for Defect Location Using IDDQ," *Communications, Circuits and Systems, 2004 International Conference on*, Volume 2, pp. 1325-1328, 27-29 June 2004.
- [32] Hua Xie and Houjun Wang, "IDDQ Test Generation for BF Fault in Combinational Circuit Based on Genetic Algorithms," *Communications, Circuits and Systems, 2005 International Conference on*, Volume 2, pp. 1358-1361, 27-30 May 2005.
- [33] Bin Xue and D.M.H. Walker, "Built-in Current Sensor for IDDQ Test," *Current and Defect Based Testing, 2004 IEEE International Workshop on*, pp. 3-9, 25 April 2004.

- [34] D. Zhao and S.J.A. Upadhyaya, "Generic Resource Distribution and Test Scheduling Scheme for Embedded Core-based SoCs," *Instrumentation and Measurement, IEEE Transactions on*, Volume 53, Issue 2, pp. 318-329, April 2004.
- [35] Reliability, Testing and Characterization of MEMS/MOEMS, Proceedings of SPIE, http://www.spie.org/.
- [36] H. G. Kerkhoff, "Testing of MEMS-Based Microsystems," *Test Symposium, 2005 European*, pp. 223-228, 22-25 May 2005.
- [37] F. Gueissaz, "Ultra Low Leak Detection Method for MEMS Devices," *Micro Electro Mechanical Systems, 18th IEEE International Conference on*, pp. 524-527, 30 Jan.-3 Feb. 2005.
- [38] F. Mailly, F. Azais, N. Dumas, L. Latorre and P. Nouet, "Towards On-Line Testing of MEMS Using Electro-Thermal Excitation," *Test Symposium, 2005 European*, pp. 76-81, 22-25 May 2005.
- [39] V. Litovski, M. Andrejevic and M. Zwolinski, "ANN Based Modeling, Testing and Diagnosis of MEMS," *Neural Network Applications in Electrical Engineering*, 2004 7th Seminar on, pp. 183-188, 23-25 Sept. 2004.
- [40] Yubin Zhang, M. W.T. Wong and C. K. Li, "Built-in Self-test Structure for Analogue Cores in SOC Application using Full Range Window Comparator," accepted, International Journal of Electronics.
- [41] Yubin Zhang and Chi-Kwong Li, "Rail-to-rail Analogue Voltage Comparator Based on 1V 50nm CMOS Technology," submitted.
- [42] M. Becer, R. Vaidyanathan, C. Oh, and R. Panda, "Crosstalk Noise Control in An SoC Physical Design Flow," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 23(4):488–497, April 2004.
- [43] X. Bai, S. Dey, and J. Rajski, "Self-Test Methodology for At-Speed Test of Crosstalk in Chip Interconnects," Proc. DAC, pp. 619–624, 2000.
- [44] A. Attarha and M. Nourani, "Test Pattern Generation for Signal Integrity Faults on Long Interconnects," Proc. VTS, pp. 336–341, 2002.
- [45] S. K. Goel, K. Chiu, E. J. Marinissen, T. Nguyen, and S. Oostdijk, "Test Infrastructure Design for the Nexperia™ Home Platform PNX8550 System Chip," Proc. DATE, pp. 108–113, 2004.
- [46] M. Cuviello, S. Dey, X. Bai, and Yi Zhao, "Fault Modeling and Simulation for Crosstalk in System-on-Chip Interconnects," Proc. ICCAD, pp. 297–303, 1999.
- [47] W.-Y. Chen, S. K. Gupta, M. A. Breuer, "Test Generation for Crosstalk-Induced Delay in Integrated Circuits," Proc. *ITC*, pp. 191–200, 1999.
- [48] V. Iyengar, K. Chakrabarty, and E. J. Marinissen, "Co-Optimization of Test Wrapper and Test Access Architecture for Embedded Cores," Springer JETTA, 18(2):213–230, Apr. 2002.
- [49] M. Guler and H. Kilic, "Understanding the Importance of Signal Integrity," IEEE Circuits and Devices Magazine, 15(6):7–10, November 1999.
- [50] S. K. Goel and E. J. Marinissen, "Effective and Efficient Test Architecture Design for SOCs," Proc. *ITC*, pp. 529–538, 2002.
- [51] E. J. Marinissen, R. Arendsen, G. Bos, H. Dingemanse, M. Lousberg, and C. Wouters, "A Structured and Scalable Mechanism for Test Access to Embedded Reusable Cores," Proc. *ITC*, pp. 284–293, 1998.
- [52] S. Kundu, S. T. Zachariah, Y.-S. Chang, and C. Tirumurti, "On Modeling Crosstalk Faults," IEEE *TCAD*, 24(12):1909–1915, Dec. 2005.
- [53] N. Jha and S. Gupta, Testing of Digital Systems, Cambridge University Press, 2003.
- [54] S. Naffziger, "Design Methodologies for Interconnects in GHz+ ICs," Proc. ISSCC, 1999.
- [55] E. J. Marinissen, V. Iyengar, and K. Chakrabarty, "ITC'02 SOC Test Benchmarks," http://www.extra.research.philips.com/itc02socbenchm/.
- [56] E. J. Marinissen, S. K. Goel, and M. Lousberg, "Wrapper Design for Embedded Core Test," Proc. *ITC*, pp. 911–920, 2000.
- [57] N. Selvakkumaran and G. Karypis, "Multi-Objective Hypergraph Partitioning Algorithms for Cut and Maximum Subdomain Degree Minimization," Proc. *ICCAD*, pp. 726–733, 2003.
- [58] P. Nordholz, D. Treytnar, J. Otterstedt, H. Grabinski, D. Niggemeyer, and T. W. Williams, "Signal Integrity Problems in Deep Submicron Arising from Interconnects between Cores," Proc. VTS, pp. 28–33, 1998.
- [59] S. Natarajan, M. A. Breuer, and S. K. Gupta, "Process Variations and Their Impact on Circuit Operation," Proc. DFT, pp. 73–81, 1998.
- [60] P. Varma and S. Bhatia, "A Structured Test Re-Use Methodology for Core-Based System Chips," Proc. *ITC*, pp. 294–302, 1998.
- [61] M. H. Tehranipour, N. Ahmed, and M. Nourani, "Testing SoC Interconnects for Signal Integrity Using Extended JTAG Architecture," *IEEE TCAD*, 23(5):800–811, May 2004.
- [62] M. H. Tehranipour, N. Ahmed, and M. Nourani, "Testing SoC Interconnects for Signal Integrity Using Boundary Scan," In Proc. VTS, pp. 158–163, 2003.
- [63] Boost C++ libraries, http://www.boost.org/.
- [64] C Clique partitioning heuristic, DDEL center, University of Cincinnati, http://www.ececs.uc.edu/~ddel/theses/jay\_new/node16.html.
- [65] Clique -- from Wolfram MathWorld, http://mathworld.wolfram.com/Clique.html.
- [66] S. S. Skiena, The Algorithm Design Manual, New York: Springer-Verlag, 1997.
- [67] hMetis – Hypergraph & Circuit Partitioning, Karypis Lab, University of Minnesota, http://glaros.dtc.umn.edu/gkhome/metis/hmetis/overview.
- [68] Hypergraph -- from Wolfram MathWorld, http://mathworld.wolfram.com/Hypergraph.html.

### APPENDIX

SocName **p93791** TotalModules 33 Options Power 0 XY 0

Module 0 Level 0 Inputs 103 Outputs 79 Bidirs 66 ScanChains 0 : Module 0 TotalTests 0

Module 2 Level 2 Inputs 40 Outputs 34 Bidirs 0 ScanChains 0 : Module 2 TotalTests 1 Module 2 Test 1 ScanUse 1 TamUse 1 Patterns 192

Module 3 Level 2 Inputs 40 Outputs 29 Bidirs 0 ScanChains 0 : Module 3 TotalTests 1 Module 3 Test 1 ScanUse 1 TamUse 1 Patterns 648

Module 5 Level 1 Inputs 102 Outputs 80 Bidirs 66 ScanChains 0 : Module 5 TotalTests 1 Module 5 Test 1 ScanUse 1 TamUse 1 Patterns 6127

Module 7 Level 2 Inputs 9 Outputs 32 Bidirs 0 ScanChains 0 : Module 7 TotalTests 1 Module 7 Test 1 ScanUse 1 TamUse 1 Patterns 177

Module 8 Level 2 Inputs 9 Outputs 32 Bidirs 0 ScanChains 0 : Module 8 TotalTests 1 Module 8 Test 1 ScanUse 1 TamUse 1 Patterns 177

Module 9 Level 2 Inputs 43 Outputs 34 Bidirs 0 ScanChains 0 : Module 9 TotalTests 1 Module 9 Test 1 ScanUse 1 TamUse 1 Patterns 192

Module 10 Level 2 Inputs 267 Outputs 128 Bidirs 0 ScanChains 0 : Module 10 TotalTests 1 Module 10 Test 1 ScanUse 1 TamUse 1 Patterns 1164

Module 11 Level 1 Inputs 146 Outputs 68 Bidirs 72 ScanChains 11 : 82 82 82 81 81 81 18 18 17 17 17 Module 11 TotalTests 1 Module 11 Test 1 ScanUse 1 TamUse 1 Patterns 187

Module 15 Level 2 Inputs 44 Outputs 34 Bidirs 0 ScanChains 0 : Module 15 TotalTests 1 Module 15 Test 1 ScanUse 1 TamUse 1 Patterns 288

Module 16 Level 2 Inputs 137 Outputs 64 Bidirs 0 ScanChains 0 : Module 16 TotalTests 1 Module 16 Test 1 ScanUse 1 TamUse 1 Patterns 396

Module 17 TotalTests 1 Module 17 Test 1 ScanUse 1 TamUse 1 Patterns 216

Module 18 Level 2 Inputs 79 Outputs 34 Bidirs 0 ScanChains 0 : Module 18 TotalTests 1 Module 18 Test 1 ScanUse 1 TamUse 1 Patterns 42

Module 21 Level 2 Inputs 79 Outputs 34 Bidirs 0 ScanChains 0 : Module 21 TotalTests 1 Module 21 Test 1 ScanUse 1 TamUse 1 Patterns 42

Module 22 Level 2 Inputs 42 Outputs 34 Bidirs 0 ScanChains 0 : Module 22 TotalTests 1 Module 22 Test 1 ScanUse 1 TamUse 1 Patterns 42

Module 24 Level 2 Inputs 17 Outputs 4 Bidirs 0 ScanChains 0 : Module 24 TotalTests 1 Module 24 Test 1 ScanUse 1 TamUse 1 Patterns 3072

Module 25 Level 2 Inputs 29 Outputs 16 Bidirs 0 ScanChains 0 : Module 25 TotalTests 1 Module 25 Test 1 ScanUse 1 TamUse 1 Patterns 2688

Module 26 Level 2 Inputs 42 Outputs 34 Bidirs 0 ScanChains 0 : Module 26 TotalTests 1 Module 26 Test 1 ScanUse 1 TamUse 1 Patterns 96

Module 27 Test 1 ScanUse 1 TamUse 1 Patterns 916

Module 28 Level 2 Inputs 109 Outputs 50 Bidirs 0 ScanChains 0 : Module 28 TotalTests 1 Module 28 Test 1 ScanUse 1 TamUse 1 Patterns 396

Module 30 Level 2 Inputs 43 Outputs 34 Bidirs 0 ScanChains 0 : Module 30 TotalTests 1 Module 30 Test 1 ScanUse 1 TamUse 1 Patterns 192

Module 31 Level 2 Inputs 148 Outputs 70 Bidirs 0 ScanChains 0 : Module 31 TotalTests 1 Module 31 Test 1 ScanUse 1 TamUse 1 Patterns 204

Module 32 Level 2 Inputs 268 Outputs 128 Bidirs 0 ScanChains 0 : Module 32 TotalTests 1 Module 32 Test 1 ScanUse 1 TamUse 1 Patterns 3084