Design of Efficient Multiplier Using VHDL
DESIGN OF EFFICIENT MULTIPLIER USING VHDL A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Technology in Electronics and Communication Engineering By ARUN SHARMA
Department of Electronics and Communication Engineering J.M.I.T. RADAUR 2010
CERTIFICATE
This is to certify that the thesis entitled “TO DESIGN EFFICIENT MULTIPLIER USING VHDL”, submitted by Mr. Arun Sharma in partial fulfillment of the requirements for the award of the Master of Technology Degree in Electronics and Communication Engineering at J.M.I.T. Radaur, is an authentic work carried out by him under my supervision and guidance. To the best of my knowledge, the matter embodied in the thesis has not been submitted to any other University / Institute for the award of any Degree.
Date:
Mr. Manoj Arora
H.O.D., E.C.E. Deptt., J.M.I.T. Radaur
CONTENTS
Abstract
List of Figures
List of Tables
Introduction
VHDL
Filters
Type of filters
Adders
Binary multipliers
Results
Conclusion
References
Abstract
There are several quantities that one would like to optimize when designing a VLSI circuit. These quantities often cannot be optimized simultaneously; one can usually be improved only at the expense of one or more of the others. The design of a multiplier circuit that is efficient in power, area, and speed simultaneously has therefore become a very challenging problem. Power dissipation is recognized as a critical parameter in modern VLSI design, so low-power design is necessary. Multiplication occurs frequently in finite impulse response filters, fast Fourier transforms, convolution, and other important DSP and multimedia kernels. The objective of a good multiplier is to provide a physically compact, high-speed, and low-power chip. To save a significant amount of the power consumed by a VLSI design, a good direction is to reduce its dynamic power, which is the major part of total power dissipation. In this thesis, we propose high-speed, low-power multiplier algorithms. The Booth multiplier reduces the number of partial products generated by a factor of 2. The adder avoids unwanted additions and thus minimizes the switching power dissipation. The proposed high-speed, low-power multiplier can attain speed improvement and power reduction in the Booth encoder when compared with conventional array multipliers.
This thesis presents an efficient implementation of a high-speed multiplier using the array multiplier, the shift-and-add algorithm, the Booth multiplier, the pyramid algorithm, and the modified pyramid algorithm. We compare the working of these multipliers by implementing each of them separately.
Chapter-1
LOW POWER CONSUMPTION
INTRODUCTION
Multipliers are key components of many high-performance systems such as FIR filters, microprocessors, and digital signal processors. A system's performance is generally determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area-consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so that improving speed mostly results in larger area. As a result, a whole spectrum of multipliers with different area-speed trade-offs has been designed, with fully parallel multipliers at one end of the spectrum and fully serial multipliers at the other. In between are digit-serial multipliers, in which single digits consisting of several bits are operated on. These multipliers have moderate performance in both speed and area. However, existing digit-serial multipliers have been plagued by complicated switching systems and/or irregularities in design. Radix 2^n multipliers, which operate on digits in a parallel fashion instead of bits, bring the pipelining to the digit level and avoid most of the above problems. They were introduced by M. K. Ibrahim in 1993. These structures are iterative and modular. Pipelining at the digit level brings the benefit of constant operation speed irrespective of the size of the multiplier. The clock speed is determined only by the digit size, which is fixed before the design is implemented.
CHAPTER 2
BASICS OF MULTIPLICATION
3.1. Basic binary multiplier
The operation of multiplication is rather simple in digital electronics. It has its origin in the classical algorithm for the product of two binary numbers. This algorithm uses addition and shift-left operations to calculate the product of two numbers. Two examples are presented below.

    10 x 8 = 80           -6 x 4 = -24

          1 0 1 0                 1 0 1 0
        x 1 0 0 0               x 0 1 0 0
    -------------         ---------------
          0 0 0 0                 0 0 0 0
        0 0 0 0                 0 0 0 0
      0 0 0 0             1 1 1 0 1 0
    1 0 1 0                 0 0 0 0
    -------------         ---------------
    1 0 1 0 0 0 0         1 1 1 0 1 0 0 0

Figure 3.1.1: Basic binary multiplication
The left example shows the multiplication procedure for two unsigned binary numbers, while the one on the right is for signed multiplication. The first number is called the multiplicand and the second the multiplier. The only difference between signed and unsigned multiplication is that the sign bit has to be extended in the signed case, as depicted in partial product (PP) row 3 of the right example. Based on the above procedure, we can deduce an algorithm for any kind of multiplication, which is shown in Figure 3.1.2. Here, we assume that the MSB represents the sign of the number.
Figure 3.1.2: Signed multiplication algorithm
3.2. Partial product generation
Partial product generation is the very first step in a binary multiplier. Partial products are the intermediate terms generated from the value of each multiplier bit. If the multiplier bit is ‘0’, the partial product row is also zero; if it is ‘1’, the multiplicand is copied as it is. From the second multiplier bit onwards, each partial product row is shifted one position to the left, as shown in the example above. In signed multiplication, the sign bit is also extended to the left. Partial product generators for a conventional multiplier consist of a series of logic AND gates, as shown in Figure 3.2.1.
[Figure: each multiplicand bit X7 ... X0 is ANDed with the multiplier bit Yi to give the partial product bits PPi7 ... PPi0]
Figure 3.2.1: Partial product generation logic
Careful optimization of the partial-product generation can lead to substantial delay and area reduction [1].
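The AND-gate row of Figure 3.2.1 can be sketched in VHDL as follows. This is only a minimal illustration and not the code used in the thesis: the entity name pp_row and the port names x, yi and pp are assumptions chosen to mirror the figure labels X7..X0, Yi and PPi7..PPi0.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: one row of partial product bits for multiplier bit Yi.
entity pp_row is
  port (
    x  : in  std_logic_vector(7 downto 0);  -- multiplicand bits X7..X0
    yi : in  std_logic;                     -- one multiplier bit Yi
    pp : out std_logic_vector(7 downto 0)   -- partial product bits PPi7..PPi0
  );
end entity pp_row;

architecture dataflow of pp_row is
begin
  -- One two-input AND gate per multiplicand bit.
  gen_and : for k in x'range generate
    pp(k) <= x(k) and yi;
  end generate gen_and;
end architecture dataflow;
```

When yi is '1' the row is a copy of the multiplicand, and when yi is '0' the row is all zeros, which matches the rule stated in Section 3.2. The left shift of each row is left to the surrounding multiplier.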
Chapter 4
Type Of Adders

ADDER
In electronics, an adder is a digital circuit that performs addition of numbers. In modern computers, adders reside in the arithmetic logic unit (ALU), where other operations are also performed. Although adders can be constructed for many numerical representations, such as binary-coded decimal or excess-3, the most common adders operate on binary numbers. In cases where two's complement is used to represent negative numbers, it is trivial to modify an adder into an adder-subtracter.

Types of adders
For single-bit adders, there are two general types. A half adder has two inputs, generally labeled A and B, and two outputs, the sum S and the carry C. S is the two-input XOR of A and B, and C is the AND of A and B. Essentially, the output of a half adder is the sum of two one-bit numbers, with C being the more significant of the two outputs. The second type of single-bit adder is the full adder. The full adder takes into account a carry input, so that multiple adders can be used to add larger numbers. To remove ambiguity between the input and output carry lines, the carry in is labeled Ci or Cin, while the carry out is labeled Co or Cout.

Half adder
Fig-4.1 Half adder circuit diagram
A half adder is a logical circuit that performs an addition operation on two binary digits. The half adder produces a sum and a carry value which are both binary digits.
Following is the logic table for a half adder:

Input     Output
A   B     C   S
0   0     0   0
0   1     0   1
1   0     0   1
1   1     1   0
Table.1
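The half adder of Table.1 can be written as a short VHDL dataflow description. The sketch below is illustrative only; the entity name half_adder and the lower-case port names are assumptions, chosen to match the labels A, B, S and C used in the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity name; ports follow the labels of Table.1.
entity half_adder is
  port (
    a, b : in  std_logic;   -- the two one-bit operands
    s    : out std_logic;   -- sum
    c    : out std_logic    -- carry
  );
end entity half_adder;

architecture dataflow of half_adder is
begin
  s <= a xor b;   -- sum is the XOR of A and B
  c <= a and b;   -- carry is the AND of A and B
end architecture dataflow;
```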
Full adder
Inputs: {A, B, Carry In} → Outputs: {Sum, Carry Out} Fig-4.2 Circuit Of Full Adder
Fig-4.3 Schematic symbol for a 1-bit full adder
A full adder is a logical circuit that performs an addition operation on three binary digits. The full adder produces a sum and a carry value, which are both binary digits. It can be combined with other full adders (see below) or work on its own.
Input          Output
A   B   Ci     Co   S
0   0   0      0    0
0   0   1      0    1
0   1   0      0    1
0   1   1      1    0
1   0   0      0    1
1   0   1      1    0
1   1   0      1    0
1   1   1      1    1
Table.2

Note that the final OR gate before the carry-out output may be replaced by an XOR gate without altering the resulting logic. This is because the only discrepancy between OR and XOR gates occurs when both inputs are 1; for the adder shown here, one can check that this is never possible. Using only two types of gates is convenient if one desires to implement the adder directly using common IC chips.
A full adder can be constructed from two half adders by connecting A and B to the inputs of one half adder, connecting the sum from that to an input of the second half adder, connecting Ci to the other input, and ORing the two carry outputs. Equivalently, S could be made the three-input XOR of A, B, and Ci, and Co could be made the three-input majority function of A, B, and Ci. The output of the full adder is the two-bit arithmetic sum of three one-bit numbers.
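The two-half-adder construction described above can be sketched structurally in VHDL. The block is self-contained, so it declares its own half adder first. The entity names half_adder and full_adder and the internal signal names s1, c1 and c2 are assumptions made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b : in std_logic; s, c : out std_logic);
end entity half_adder;

architecture dataflow of half_adder is
begin
  s <= a xor b;
  c <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity full_adder is
  port (
    a, b, ci : in  std_logic;   -- operands and carry in
    s        : out std_logic;   -- sum
    co       : out std_logic    -- carry out
  );
end entity full_adder;

architecture structural of full_adder is
  signal s1 : std_logic;        -- sum of the first half adder
  signal c1 : std_logic;        -- carry of the first half adder
  signal c2 : std_logic;        -- carry of the second half adder
begin
  -- First half adder adds A and B.
  ha1 : entity work.half_adder port map (a => a, b => b, s => s1, c => c1);
  -- Second half adder adds the intermediate sum and Ci.
  ha2 : entity work.half_adder port map (a => s1, b => ci, s => s, c => c2);
  -- The two carries are never 1 at the same time, so OR (or XOR) combines them.
  co <= c1 or c2;
end architecture structural;
```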
CHAPTER 5 Multiplier types
Multipliers are categorized according to their applications, their architecture, and the way the partial products are produced and summed up. Based on these criteria, a designer might find the following types of multipliers.
5.1. Array multipliers
In array multipliers, the counters and compressors are connected in a serial fashion for all bit slices of the partial product parallelogram. As can be seen in Figure 5.1, the array topology is a two-dimensional structure that fits nicely on the VLSI planar process [2].
[Figure: partial products PP0 to PP4 summed by a chain of full adders to produce S0]
Figure 5.1: Array multiplier mechanism
There are several possible array topologies, including simple, double and higher-order arrays.
5.2. Simple array multiplier
In this type of array, the output of each row of counters ([3:2] compressors) is the input to the next row of counters [2]. In the simple array, each row of [3:2] compressors adds a partial product to the partial sum, generating a new partial sum and a sequence of carries. The delay of the array depends on its depth: the summing time for the simple array is N-2 [3:2] compressor delays, where N is the number of partial products. The drawback of this type of array is that the hardware is underutilized. The counters are used only once in the calculation of the result; for the remaining time, they are idle. This drawback can be diminished by pipelining the array so that several multiplications can occur simultaneously. Pipelining would increase the throughput of the multiplier, but would also increase its latency and area. A fully pipelined array is normally avoided, since the array would then be faster than the processor clock. Figure 5.2 depicts the layout of a simple array topology. The dots represent the partial products.
Figure-5.2: Layout Of Simple Array Topology
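The row-by-row accumulation of the simple array can be sketched in VHDL with a generate statement, one row per multiplier bit. This is a simplified, unsigned model under stated assumptions: the "+" operator of numeric_std stands in for each row of [3:2] compressors, so carries are resolved inside every row instead of being passed on in carry-save form. The names array_mult, psum and pp are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity array_mult is
  generic (N : positive := 4);                 -- operand width (assumed)
  port (
    x : in  std_logic_vector(N-1 downto 0);    -- multiplicand (unsigned)
    y : in  std_logic_vector(N-1 downto 0);    -- multiplier (unsigned)
    p : out std_logic_vector(2*N-1 downto 0)   -- product
  );
end entity array_mult;

architecture dataflow of array_mult is
  type row_array is array (0 to N) of unsigned(2*N-1 downto 0);
  signal psum : row_array;   -- psum(i) is the partial sum entering row i
begin
  psum(0) <= (others => '0');

  rows : for i in 0 to N-1 generate
    signal pp : unsigned(2*N-1 downto 0);      -- partial product of row i
  begin
    -- Partial product: multiplicand ANDed with y(i), shifted left by i.
    pp <= shift_left(resize(unsigned(x), 2*N), i) when y(i) = '1'
          else (others => '0');
    -- Each row adds its partial product to the incoming partial sum.
    psum(i+1) <= psum(i) + pp;
  end generate rows;

  p <= std_logic_vector(psum(N));
end architecture dataflow;
```

Because every row depends on the output of the previous one, the delay grows with the number of rows, which is the depth dependence noted in Section 5.2.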
5.3. Serial/Parallel Multiplier
In a serial/parallel multiplier, the multiplicand x arrives bit-serially while the multiplier a is applied in a bit-parallel format. A common approach used in such multipliers is to generate a row or diagonal of bit products in each time slot and perform the additions concurrently. Suppose the data is positive (X > 0). Using a carry-save adder, the shift-and-add algorithm can be applied as shown in the figure.
Figure-5.3: Serial/parallel multiplier
Since X is processed bit-serially and the coefficient a is processed bit-parallel, this type of multiplier is called a serial/parallel multiplier.
5.5. Shift-and-Add Multiplier
Shift-and-add multiplication is similar to multiplication performed with paper and pencil. This method forms the product by adding shifted copies of the multiplicand X, one for each 1 bit of the multiplier Y. To multiply two numbers with paper and pencil, the algorithm is to take the digits of the multiplier one at a time from right to left, multiplying the multiplicand by a single digit of the multiplier and placing the intermediate product in the appropriate position to the left of the earlier results. As an example, consider the multiplication of two unsigned 4-bit numbers, 8 (1000) and 9 (1001).

Multiplicand        1000
Multiplier        x 1001
                 -------
                    1000
                   0000
                  0000
                 1000
                 -------
Product          1001000

In the case of binary multiplication, since the digits are 0 and 1, each step of the multiplication is simple. If the multiplier digit is 1, a copy of the multiplicand (1 x multiplicand) is placed in the proper position; if the multiplier digit is 0, a number of 0 digits (0 x multiplicand) are placed in the proper position. Consider the multiplication of positive numbers. The first version of the multiplier circuit, which implements the shift-and-add multiplication method for two n-bit numbers, is shown in Figure 5.5.1.
Figure 5.5.1: First version of the multiplier circuit
The 2n-bit product register (A) is initialized to 0. Since the basic algorithm shifts the multiplicand register (B) left one position each step to align the multiplicand with the sum being accumulated in the product register, we use a 2n-bit multiplicand register with the multiplicand placed in the right half of the register and with 0 in the left half.
Figure 5.5.2: The first version of the multiplication algorithm.
Figure 5.5.2 shows the basic steps needed for the multiplication. The algorithm starts by loading the multiplicand into the B register, loading the multiplier into the Q register, and initializing the A register to 0. The counter N is initialized to n. The least significant bit of the multiplier register (Q0) determines whether the multiplicand is added to the product register. The left shift of the multiplicand has the effect of shifting the intermediate products to the left, just as when multiplying with paper and pencil. The right shift of the multiplier prepares the next bit of the multiplier to examine in the following iteration.

Example 1
Using 4-bit numbers, perform the multiplication 9 x 12 (1001 x 1100).
Answer
Table 3 shows the value of the registers for each step of the multiplication algorithm.

Table 3. Multiply example using the first version of the algorithm.

Step   A            Q      B            Operation
0      0000 0000    1100   0000 1001    Initialization
1      0000 0000    1100   0001 0010    Shift left B
       0000 0000    0110   0001 0010    Shift right Q
2      0000 0000    0110   0010 0100    Shift left B
       0000 0000    0011   0010 0100    Shift right Q
3      0010 0100    0011   0010 0100    Add B to A
       0010 0100    0011   0100 1000    Shift left B
       0010 0100    0001   0100 1000    Shift right Q
4      0110 1100    0001   0100 1000    Add B to A
       0110 1100    0001   1001 0000    Shift left B
       0110 1100    0000   1001 0000    Shift right Q
The original algorithm shifts the multiplicand left with zeros inserted in the new positions, so the least significant bits of the product cannot change after they are formed. Instead of shifting the multiplicand left, we can shift the product to the right. The multiplicand is then fixed relative to the product, and since we are adding only n bits, the adder needs to be only n bits wide. Only the left half of the 2n-bit product register is changed during the addition. Another observation is that the product register has an empty space whose size is equal to that of the multiplier. As the empty space in the product register disappears, so do the bits of the multiplier. In consequence, the final version of the multiplier circuit combines the product (A register) with the multiplier (Q register). The A register is only n bits wide, and the product is formed in the A and Q registers. Figure 5.5.3 shows the new version of the circuit.
Figure 5.5.3: Final version of the multiplier circuit.
The final version of the multiplication algorithm is shown in Figure 5.5.3.
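The final version described above, with an n-bit adder and the product formed in the combined A and Q registers, can be sketched as a clocked VHDL process. This is a hedged illustration for unsigned operands, not the thesis's implementation. The entity name shift_add_mult, the ports clk, start, b_in, q_in, product and done, and the extra carry bit held in the variable sum are all assumptions of this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity shift_add_mult is
  generic (N : positive := 4);                      -- operand width n (assumed)
  port (
    clk     : in  std_logic;
    start   : in  std_logic;                        -- load operands when '1'
    b_in    : in  std_logic_vector(N-1 downto 0);   -- multiplicand (B register)
    q_in    : in  std_logic_vector(N-1 downto 0);   -- multiplier (Q register)
    product : out std_logic_vector(2*N-1 downto 0); -- A & Q
    done    : out std_logic                         -- '1' when no steps remain
  );
end entity shift_add_mult;

architecture rtl of shift_add_mult is
  signal a, q, b : unsigned(N-1 downto 0) := (others => '0');
  signal count   : natural range 0 to N := 0;
begin
  process (clk)
    variable sum : unsigned(N downto 0);            -- n-bit sum plus carry out
  begin
    if rising_edge(clk) then
      if start = '1' then
        a     <= (others => '0');
        q     <= unsigned(q_in);
        b     <= unsigned(b_in);
        count <= N;
      elsif count > 0 then
        -- Q0 decides whether the multiplicand is added to A.
        if q(0) = '1' then
          sum := resize(a, N+1) + resize(b, N+1);
        else
          sum := resize(a, N+1);
        end if;
        -- Shift carry, A and Q right by one position together.
        a     <= sum(N downto 1);
        q     <= sum(0) & q(N-1 downto 1);
        count <= count - 1;
      end if;
    end if;
  end process;

  product <= std_logic_vector(a & q);
  done    <= '1' when count = 0 else '0';
end architecture rtl;
```

After start is released, the product is available N clock cycles later. Note that done is also '1' before the first start pulse, because the counter is initialized to 0 in this sketch.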
Booth Multiplication Algorithm
The Booth algorithm gives a procedure for multiplying binary integers in signed 2's complement representation. I will illustrate the Booth algorithm with the following example:
Example: 2 (ten) x (-4) (ten), i.e. 0010 (two) x 1100 (two)
Step 1: Making the Booth table
I. From the two numbers, pick the one with the fewest changes between consecutive bits, and make it the multiplier.
0010 -- from 0 to 0 no change, 0 to 1 one change, 1 to 0 another change, so there are two changes in this one.
1100 -- from 1 to 1 no change, 1 to 0 one change, 0 to 0 no change, so there is only one change in this one.
Therefore, in the multiplication 2 x (-4), 2 (ten) (0010 two) is the multiplicand and -4 (ten) (1100 two) is the multiplier.
II. Let X = 1100 (multiplier)
Let Y = 0010 (multiplicand)
Take the 2's complement of Y and call it -Y
-Y = 1110
III. Load the X value in the table.
IV. Load 0 for the X-1 value; it holds the previous least significant bit of X.
V. Load 0 in U and V, which will hold the product of X and Y at the end of the operation.
VI. Make four rows, one for each cycle, because we are multiplying four-bit numbers.
                 U      V      X      X-1
Load the value   0000   0000   1100   0
1st cycle
2nd cycle
3rd cycle
4th cycle
Table.4
Step 2: Booth Algorithm
The Booth algorithm requires examination of the multiplier bits and shifting of the product. Prior to the shifting, the multiplicand may be added to the partial product, subtracted from the partial product, or left unchanged according to the following rules:
I. Look at the least significant bit of the multiplier “X” and the previous least significant bit of the multiplier “X-1”.
00 Shift only.
11 Shift only.
01 Add Y to U, and shift.
10 Subtract Y from U, and shift (or add -Y to U and shift).
II. Take U and V together and apply an arithmetic right shift, which preserves the sign bit of a 2's complement number. Thus a positive number remains positive, and a negative number remains negative.
III. Rotate X right (circular right shift); this avoids the need for two registers for the X value.
After the first cycle:
U      V      X      X-1
0000   0000   1100   0
0000   0000   0110   0     Shift only

Repeat the same steps until the four cycles are completed.

After the second cycle:
U      V      X      X-1
0000   0000   1100   0
0000   0000   0110   0     Shift only
0000   0000   0011   0     Shift only

After the third cycle:
U      V      X      X-1
0000   0000   1100   0
0000   0000   0110   0     Shift only
0000   0000   0011   0     Shift only
1110   0000   0011   0     Add -Y (0000 + 1110 = 1110)
1111   0000   1001   1     Shift

After the fourth cycle:
U      V      X      X-1
0000   0000   1100   0
0000   0000   0110   0     Shift only
0000   0000   0011   0     Shift only
1110   0000   0011   0     Add -Y (0000 + 1110 = 1110)
1111   0000   1001   1     Shift
1111   1000   1100   1     Shift only
We have finished four cycles, so the answer is shown in the last row of U and V, which is 11111000 (two), that is, -8 (ten).
Note: By the fourth cycle, the two algorithms have the same values in the product register.
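The rules of Step 2 can be sketched as a behavioral VHDL process. The sketch makes two assumptions that are not in the worked example: it reads the multiplier bits by index instead of rotating X, which examines the same bit pairs, and it widens U by one guard bit so that subtracting the most negative multiplicand does not overflow. The entity name booth_mult and the variable names uv, xm1 and ye are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity booth_mult is
  generic (N : positive := 4);                 -- operand width (assumed)
  port (
    x : in  std_logic_vector(N-1 downto 0);    -- multiplier X (2's complement)
    y : in  std_logic_vector(N-1 downto 0);    -- multiplicand Y (2's complement)
    p : out std_logic_vector(2*N-1 downto 0)   -- product, U & V
  );
end entity booth_mult;

architecture behavioral of booth_mult is
begin
  process (x, y)
    variable uv  : signed(2*N downto 0);       -- guard bit, U and V together
    variable xm1 : std_logic;                  -- the X-1 bit
    variable ye  : signed(N downto 0);         -- Y sign-extended by one bit
  begin
    uv  := (others => '0');
    xm1 := '0';
    ye  := resize(signed(y), N+1);
    for i in 0 to N-1 loop
      if x(i) = '0' and xm1 = '1' then         -- pair 01: add Y to U
        uv(2*N downto N) := uv(2*N downto N) + ye;
      elsif x(i) = '1' and xm1 = '0' then      -- pair 10: subtract Y from U
        uv(2*N downto N) := uv(2*N downto N) - ye;
      end if;                                  -- pairs 00 and 11: shift only
      xm1 := x(i);
      uv  := shift_right(uv, 1);               -- arithmetic shift keeps the sign
    end loop;
    p <= std_logic_vector(uv(2*N-1 downto 0));
  end process;
end architecture behavioral;
```

With x = 1100 and y = 0010 the process performs two shifts, one subtraction with shift, and a final shift, and ends with 11111000, the value in the last row of the table above.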
Booth multiplication algorithm for radix 4
One of the ways of realizing high-speed multipliers is to enhance parallelism, which helps to decrease the number of subsequent calculation stages. The original version of the Booth algorithm (radix-2) had two drawbacks:
(i) The number of add/subtract operations and the number of shift operations become variable, which is inconvenient in designing parallel multipliers.
(ii) The algorithm becomes inefficient when there are isolated 1's.
These problems are overcome by using the modified radix-4 Booth algorithm, which scans strings of three bits according to the procedure given below:
1) Extend the sign bit by 1 position if necessary to ensure that n is even.
2) Append a 0 to the right of the LSB of the multiplier.
3) According to the value of each three-bit vector, each partial product will be 0, +y, -y, +2y or -2y.
The negative values of y are made by taking the 2's complement, and in this paper carry-look-ahead (CLA) fast adders are used. The multiplication of y by 2 is done by shifting y one bit to the left. Thus, in designing an n-bit parallel multiplier, only n/2 partial products are generated.

X(i)   X(i-1)   X(i-2)   Partial product
0      0        0        +0
0      0        1        +y
0      1        0        +y
0      1        1        +2y
1      0        0        -2y
1      0        1        -y
1      1        0        -y
1      1        1        +0
Table 6: Radix-4 modified Booth algorithm scheme for odd values of i.
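The recoding of Table 6 can be sketched as a behavioral VHDL process that forms n/2 partial products and sums them. This is an illustrative model only: it assumes that N is even, it uses the "+" operator of numeric_std in place of the carry-look-ahead adders mentioned above, and the names booth_radix4_mult, xe, grp, ye, pp and acc are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity booth_radix4_mult is
  generic (N : positive := 4);                 -- operand width, assumed even
  port (
    x : in  std_logic_vector(N-1 downto 0);    -- multiplier (2's complement)
    y : in  std_logic_vector(N-1 downto 0);    -- multiplicand (2's complement)
    p : out std_logic_vector(2*N-1 downto 0)   -- product
  );
end entity booth_radix4_mult;

architecture behavioral of booth_radix4_mult is
begin
  process (x, y)
    variable xe  : std_logic_vector(N downto 0);     -- multiplier with a 0 appended
    variable grp : std_logic_vector(2 downto 0);     -- X(i), X(i-1), X(i-2)
    variable ye  : signed(2*N-1 downto 0);           -- y at product width
    variable pp  : signed(2*N-1 downto 0);           -- one partial product
    variable acc : signed(2*N-1 downto 0);           -- running sum
  begin
    xe  := x & '0';                                  -- step 2 of the procedure
    ye  := resize(signed(y), 2*N);
    acc := (others => '0');
    for k in 0 to N/2 - 1 loop
      grp := xe(2*k+2 downto 2*k);                   -- overlapping 3-bit group
      case grp is
        when "001" | "010" => pp := ye;                  -- +y
        when "011"         => pp := shift_left(ye, 1);   -- +2y
        when "100"         => pp := -shift_left(ye, 1);  -- -2y
        when "101" | "110" => pp := -ye;                 -- -y
        when others        => pp := (others => '0');     -- +0
      end case;
      acc := acc + shift_left(pp, 2*k);              -- weight 4**k
    end loop;
    p <= std_logic_vector(acc);
  end process;
end architecture behavioral;
```

For the earlier example, x = 1100 and y = 0010, the two groups are 000 and 110, giving partial products 0 and -y shifted left by two positions, so the result is again 11111000.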
Chapter-6
VHDL: The Language
EXPERIMENTAL
Many applications demand high throughput and real-time response, performance constraints that often dictate unique architectures with high levels of concurrency. DSP designers need the capability to manipulate and evaluate complex algorithms to extract the necessary level of concurrency. Performance constraints can also be addressed by applying alternative technologies. A change at the implementation level of design by the insertion of a new technology can often make viable an existing marginal algorithm or architecture. The VHDL language supports these modeling needs at the algorithm or behavioral level, and at the implementation or structural level. It provides a versatile set of description facilities to model DSP circuits from the system level to the gate level. Recently, we have also noticed efforts to include circuit-level modeling in VHDL. At the system level we can build behavioral models to describe algorithms and architectures. We would use concurrent processes with constructs common to many high-level languages, such as if, case, loop, wait, and assert statements. VHDL also includes user-defined types, functions, procedures, and packages. In many respects VHDL is a very powerful, high-level, concurrent programming language. At the implementation level we can build structural models using component instantiation statements that connect and invoke subcomponents. The VHDL generate statement provides ease of block replication and control. A dataflow level of description offers a combination of the behavioral and structural levels of description. VHDL lets us use all three levels to describe a single component. Most importantly, the standardization of VHDL has spurred the development of model libraries and design and development tools at every level of abstraction. VHDL, as a consensus description language and design environment, offers design tool portability, easy technical exchange, and technology insertion.
VHDL: The language
An entity declaration, or entity, combined with an architecture, or body, constitutes a VHDL model. VHDL calls the entity-architecture pair a design entity. By describing alternative architectures for an entity, we can configure a VHDL model for a specific level of investigation. The entity contains the interface description common to the alternative architectures. It communicates with other entities and the environment through ports and generics. Generic information particularizes an entity by specifying environment constants such as register size or delay value. For example,

    entity A is
      generic (delay : time);
      port (x, y : in real; z : out real);
    end A;

The architecture contains declarative and statement sections. Declarations form the region before the reserved word begin and can declare local elements such as signals and components. Statements appear after begin and can contain concurrent statements. For instance,

    architecture B of A is
      component M
        port (j : in real; k : out real);
      end component;
      signal a, b, c : real := 0.0;
    begin
      -- concurrent statements
    end B;
The variety of concurrent statement types gives VHDL the descriptive power to create and combine models at the structural, dataflow, and behavioral levels into one simulation model. The structural type of description makes use of component instantiation statements to invoke models described elsewhere. After declaring components, we use them in the component instantiation statement, assigning ports to local signals or other ports and giving values to generics.

    invert: M port map (j => a, k => c);

We can then bind the components to other design entities through configuration specifications in VHDL's architecture declarative section or through separate configuration declarations. The dataflow style makes wide use of a number of types of concurrent signal assignment statements, which associate a target signal with an expression and a delay. The list of signals appearing in the expression is the sensitivity list; the expression must be evaluated for any change on any of these signals. The target signals obtain new values after the delay specified in the signal assignment statement. If no delay is specified, the signal assignment occurs during the next simulation cycle: c