# Adders, Multipliers and Comparators

[Source](https://www.realdigital.org/doc/0bb5bcc5527bc3161178769ce85e6463)

```{exercise}
:label: what_is_a_bus
What is a bus?
```
```{solution} what_is_a_bus
:class: dropdown
Circuits typically have input and output signals. Some of these are independent of each other, like reset and clock. Others are related to each other, like the individual bit signals of an 8-bit byte input. If individual signals are associated with each other, this bundle of signals is called a *bus*.
```

## Learning Goals

- Understand the operation of adders, multipliers, subtractors, and comparators;
- Be comfortable designing circuits using structural Verilog.

## Background

### Bit slice design
[Source](https://www.realdigital.org/doc/9eca319eb242a07d31574667ef02360a)

```{exercise}
:label: what_is_the_problem_with_using_truth_tables_for_buses
1. What is the problem with using truth tables for buses?
2. How can we overcome this problem?
```
```{solution} what_is_the_problem_with_using_truth_tables_for_buses
:class: dropdown
1. Truth tables can become unwieldy for buses. Take for example an adder circuit with two byte-wide inputs and one byte-wide output. For each output bit, we would need a table with $2^{16}$ rows.
2. By using a *divide and conquer* approach: we divide the circuit into smaller circuits by slicing the inputs into bits. In other words, we slice the bytes into bits. Then we implement the small circuit and replicate it as many times as the width of the inputs. Typically these blocks have to communicate with each other, and we have to take this into consideration in the implementation – for example the carry bit in an adder circuit.
```

```{exercise}
:label: what_are_the_pros_and_cons_of_bit_slicing
1. What are the pros and cons of bit slicing?
<!--TODO give an example where the bit slicing leads to a worse result compared to an alternative implementation -->
```
```{solution} what_are_the_pros_and_cons_of_bit_slicing
:class: dropdown
1. Bit slicing can be applied to every problem that can be divided into single-bit operations, so it is versatile. It may not lead to an optimal solution, though.
<!--TODO 2. -->
```
### Adders
[Source](https://www.realdigital.org/doc/21dd954e568975f7e97373e977e1ba52)
#### Ripple carry adder (RCA)

```{exercise}
:label: what_are_the_inputs_and_outputs_of_a_full_adder
1. What is a full adder?
2. What are the inputs and outputs of a full adder?
```
```{solution} what_are_the_inputs_and_outputs_of_a_full_adder
:class: dropdown
1. A full adder is a bit slice circuit of an adder. It sums three bits: two addend bits and a carry bit from the less significant bit slice.
2. Three inputs (two addend bits and a carry-in), two outputs (sum and carry-out).
```
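As a sanity check, the full adder logic can be simulated in a few lines (a minimal Python sketch; the function name is ours):

```python
def full_adder(a, b, c_in):
    """One bit slice of an adder: add two addend bits and an incoming carry."""
    s = a ^ b ^ c_in                     # sum output
    c_out = (a & b) | (c_in & (a ^ b))   # carry to the more significant slice
    return s, c_out

# The pair (c_out, s), read as a two bit number, equals a + b + c_in:
for a in (0, 1):
    for b in (0, 1):
        for c_in in (0, 1):
            s, c_out = full_adder(a, b, c_in)
            assert 2 * c_out + s == a + b + c_in
```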

```{exercise}
:label: what_is_the_motivation_for_the_half_adder
What is the motivation for the half adder?
```
```{solution} what_is_the_motivation_for_the_half_adder
:class: dropdown
The least significant bit slice of an adder does not need a carry input, so some transistors can be cut away from the full adder. The result is the half adder.
```

```{exercise}
:label: what_is_a_carry_ripple_adder
What is a carry ripple adder?
```
```{solution} what_is_a_carry_ripple_adder
:class: dropdown
A carry ripple adder adds the bits individually and *ripples* the carry bit from the least significant adder to the most significant one.
```
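The rippling can be illustrated by chaining full adder bit slices (a Python sketch; names are ours, and bit lists are least significant bit first):

```python
def full_adder(a, b, c_in):
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out

def ripple_carry_add(a_bits, b_bits):
    """Add two equally wide bit lists (LSB first) by rippling the carry."""
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # carry flows to the next slice
        sum_bits.append(s)
    return sum_bits, carry

# 3 + 3 = 6: 0b011 + 0b011 = 0b110 (bit lists are LSB first)
assert ripple_carry_add([1, 1, 0], [1, 1, 0]) == ([0, 1, 1], 0)
```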

#### Carry-lookahead adder (CLA)
<!-- TODO author katex formula render errors -->
```{exercise}
:label: what_is_the_difference_of_carry_lookahead_adder
What is the difference between the carry-lookahead adder and the ripple carry adder?
```
````{solution} what_is_the_difference_of_carry_lookahead_adder
:class: dropdown
The carry-lookahead adder tries to avoid the rippling of the carry bit by calculating carry bits earlier in each bit slice when possible. This approach stems from the fact that a part of the adding operation is independent of the less significant bits, so we can preprocess this part (called *generate*) and create a separate path for it. For the carry propagation we create a second path which does not have to go through the generate logic but has its own fast path (called *propagate*). This way a carry that has to be propagated to the more significant bits goes through fewer gates than in a ripple adder, where only a single path for communication between slices exists.

How does it work? First let us enumerate the cases in a single bit slice:

A. If two addend bits in a slice are 1:

   1. Carry: we know for sure that this slice will *generate* a carry bit for the more significant bit slice.
   2. Output: depends on the carry from the less significant bit slice.

B. If only one of the bits in a bit slice is 1:

   1. Carry: depends on the carry coming from the less significant bits. In this case the *propagate* signal is activated.
   2. Output: depends on the carry from the less significant bit slice.

C. If all the bits are 0:

   1. Carry: there will be no carry; the carry from the less significant bit slice can only influence the output.
   2. Output: depends on the carry input.

(note:we_can_convert_the_propagate_circuit_to_xor)=
```{note}
Case A can also be seen as a propagate case if the carry input is active, but the generate path will have already output a carry. It does not matter if the generate will output the carry earlier or not – the circuit has to account for the propagate case anyway which is the worst case. In summary, if we generate, then we do not have to bother about propagate.
```

The generate and propagate logic is typically implemented in a separate block as follows:

```{image} https://upload.wikimedia.org/wikipedia/commons/0/04/4-bit_carry_lookahead_adder.svg
:alt: en:User:Cburnett, CC BY-SA 3.0, via Wikimedia Commons
:target: https://commons.wikimedia.org/wiki/File:4-bit_carry_lookahead_adder.svg
```
From the analysis above we can deduce the following boolean equations:

```verilog
assign
	g(i) = a(i) & b(i),
	p(i) = a(i) | b(i);
```
The circuit for `g(i)` is already available in the full adder, which saves us some transistors in the carry-lookahead circuit:

```{image} https://upload.wikimedia.org/wikipedia/commons/a/a9/Full-adder.svg
:alt: en:User:Cburnett, CC BY-SA 3.0, via Wikimedia Commons
:target: https://commons.wikimedia.org/wiki/File:Full-adder.svg
```
What about `p(i)`? The full adder instead contains `a(i) ^ b(i)`. Can we perhaps use the `exclusive or` instead of the `or`? We can decompose the `or` operator as `( a(i) & b(i) ) | ( a(i) ^ b(i) )`; in words, the `or` operation consists of an `exclusive or` and an `and`, combined by a further `or`. We do not find this circuit in the full adder either, so it seems we would have to introduce an additional `or` gate to create `p(i)`. This is not the end of the story, though.

Remember the note above that [we do not have to bother about propagation if we generate](note:we_can_convert_the_propagate_circuit_to_xor). This means that it is enough that we take an `xor` for the propagate case, because if `a & b == True`, then the circuit will generate anyway and we do not need to consider this in the propagate circuit. This countermeasure allows us to use the `a ^ b` circuitry in the full adder and saves some more transistors as shown in the circuit below:

(fig:four_bit_adder_with_carry_lookahead)=
```{figure} https://upload.wikimedia.org/wikipedia/commons/1/16/Four_bit_adder_with_carry_lookahead.svg
:alt: Trex4321, CC0, via Wikimedia Commons
:target: https://commons.wikimedia.org/wiki/File:Four_bit_adder_with_carry_lookahead.svg
Circuit diagram for a four bit adder with carry lookahead
```
When we compare the above circuit with the full adder circuit, we see that no additional gates were needed for the generate and propagate signals.

Let us continue with the carry. There is a carry into the current bit slice if the less significant bit slice generates one, or if it propagates its own carry input. That carry input of slice `i-1` in turn comes from the second less significant bit slice `i-2` generating or propagating:
```verilog
assign
	c(i) = g(i-1) | ( p(i-1) & c(i-1) ),

// c(i-1) written out:
	c(i) = g(i-1) | ( p(i-1) & (
	                            g(i-2) | ( p(i-2) & c(i-2) )
	                           ) );
```
When we expand the `and` operator after `p(i-1)` we get:
```verilog
assign
	c(i) = g(i-1) | ( p(i-1) & g(i-2) ) | ( p(i-1) & p(i-2) & c(i-2) );
```
We can say that the current slice receives a carry if:

- the less significant (`i-1`th) bit slice generates,
- or the `i-1`th bit slice propagates and the `i-2`th bit slice generates,
- or the `i-1`th and `i-2`th bit slices propagate and there is a carry input to the `i-2`th bit slice.

We can observe this circuitry in {ref}`fig:four_bit_adder_with_carry_lookahead`. We also see that the more bit slices there are, the larger the lookahead circuitry grows.
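The equivalence of the recursive and the expanded carry equation can be brute-forced in a few lines (a Python sketch mirroring the Verilog above):

```python
from itertools import product

# g1/p1 stand for g(i-1)/p(i-1), g2/p2/c2 for g(i-2)/p(i-2)/c(i-2)
for g1, p1, g2, p2, c2 in product((0, 1), repeat=5):
    c1 = g2 | (p2 & c2)                          # c(i-1)
    recursive = g1 | (p1 & c1)                   # c(i) via the recurrence
    expanded = g1 | (p1 & g2) | (p1 & p2 & c2)   # c(i) written out
    assert recursive == expanded
```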

Typically a group of bit slices is packed together, e.g., 4 bits, which is in turn used to build larger adders. To connect these blocks, the group-generate (`gg`) and group-propagate (`gp`) signals are used. These work similarly to their bit slice counterparts. Let `N` be the size of the group. Then:
```verilog
assign
	gg = g(N-1) |
	     ( p(N-1) & g(N-2) ) |
	     ( p(N-1) & p(N-2) & g(N-3) ) |
	     ...
	     ( p(N-1) & ... & p(1) & g(0) ),
	gp = p(N-1) & p(N-2) & ... & p(0),

	// carry out of the group, c_in being the carry into the group
	cg = gg | ( gp & c_in );
```
We see that `cg` corresponds to the carry output of the most significant bit slice.
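We can again brute-force that `cg` computed from the group signals matches rippling the carry through all four slices (a Python sketch; `c_in` denotes the carry into the group):

```python
from itertools import product

def ripple(gs, ps, c_in):
    """Carry out of a group computed slice by slice, least significant first."""
    c = c_in
    for g, p in zip(gs, ps):
        c = g | (p & c)
    return c

for bits in product((0, 1), repeat=9):
    g, p, c_in = bits[0:4], bits[4:8], bits[8]   # g(0)..g(3), p(0)..p(3)
    gg = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    gp = p[3] & p[2] & p[1] & p[0]
    cg = gg | (gp & c_in)
    assert cg == ripple(g, p, c_in)
```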

<!--TODO is there another advantage of group signals other than hierarchy? -->

---

What does the critical path look like? In the worst case, all the bit slices have to propagate and none of them generates. Then propagating the carry of the least significant bit slice through a 4 bit adder reduces to:

```verilog
assign
	c(4) = c(0) & p(0) & p(1) & p(2) & p(3);
```

We see that we can propagate the carry for the whole block to the output after only a couple of gate delays (the wide `and` followed by an `or`) instead of rippling it through every bit slice as in the ripple carry adder.

An alternative explanation can be found in the article [Carry-lookahead Adder on Wikipedia](https://en.wikipedia.org/wiki/Carry-lookahead_adder#Carry_lookahead_method).
````

#### Designing an adder structurally vs behaviorally

```{exercise}
:label: how_do_we_design_an_adder_structurally_behaviorally
How do we design an adder
1. structurally?
2. behaviorally?
```
```{solution} how_do_we_design_an_adder_structurally_behaviorally
:class: dropdown
1. The basic building blocks of an adder are the full and half adder, which can be described combinationally. After creating these modules, we instantiate them according to the bit width and type of the adder. Finally we interconnect them.
2. By using the addition operator (`+`). This way we leave the choice of the adder implementation to the synthesizer.
```

(ch:representing_negative_binary_numbers)=
### Representing negative binary numbers
[Source](https://www.realdigital.org/doc/e1f5e216ec8568e4fe883ed4ecf4a275)

```{exercise}
:label: how_can_we_represent_negative_numbers_in_binary
1. How can we represent negative numbers using binary digits?
2. What are the drawbacks of these representations?
```
```{solution} how_can_we_represent_negative_numbers_in_binary
:class: dropdown
1. a. [sign-magnitude](https://en.wikipedia.org/wiki/Signed_number_representations#Sign%E2%80%93magnitude)
   b. [ones' complement](https://en.wikipedia.org/wiki/Signed_number_representations#Ones'_complement)
   c. [two's complement](https://en.wikipedia.org/wiki/Signed_number_representations#Two's_complement)
2. Sign-magnitude has a representation which is natural to humans, but not optimal for computer arithmetic. Incrementing a positive number works as an unsigned increment ($010 + 001 = 011$), but applying the same operation to a negative number decrements it instead ($110 + 001 = 111$, yet $-2 + 1 \neq -3$). This makes the addition operator dependent on the sign, which is inefficient for a hardware implementation: for an efficient implementation, having fewer exceptions is desirable.

   The problem of sign-magnitude can be solved by ones' complement: the negative version of a number is chosen so that the pair additively *complement* each other with respect to the binary number that (1) consists of only *one*s and (2) has the same width as the number to be complemented. In other words, if we add the negative and positive versions of a number, we get a binary number comprising only ones (hence *ones'* complement and not *one's*). This lets unsigned addition work for negative numbers as well (up to an end-around carry correction). The only remaining problem is that 0 has two representations: `0...0` and `1...1`.

   The problem of ones' complement can be solved by complementing the numbers with respect to $2^N$, where $N$ is the number of digits. For example, with a two digit number:

   - 0 would complement with $2^2 = {100}_b$: $100_b - 0_b = 100_b$, which does not fit in two digits. This is good, because we represent $0_d$ with $0_b$ anyway and do not waste another binary code on 0. In other words, there is only a single representation of 0, in contrast to ones' complement.

     We solved the problem for 0, but we get an asymmetry: the numbers of positive and negative values representable in two's complement cannot be equal anymore. Take for example ${10}_b$. The two's complement of this number is itself: ${10}_b$. We could assign this binary code either -2 or 2. All the negative numbers have the first digit 1, so it is consistent to assign -2 to ${10}_b$.

   - Numbers other than 0 (and -2): e.g., 1 is complemented with $100_b - 1_b = 11_b$, so $-1$ is represented as `11`.

   Subtracting the numbers from $2^N$ by hand is tedious. It is much easier to convert to ones' complement first by inverting all digits and then to increment the result.
```
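The invert-and-increment recipe translates directly into code (a Python sketch; the masking emulates the fixed width, and the function name is ours):

```python
def twos_complement(x, n):
    """Negate an n-bit number: invert all bits, add one, keep n bits."""
    mask = (1 << n) - 1
    return ((x ^ mask) + 1) & mask   # same as (2**n - x) mod 2**n

assert twos_complement(0b001, 3) == 0b111   # -1
assert twos_complement(0b010, 3) == 0b110   # -2
assert twos_complement(0b000, 3) == 0b000   # 0 has a single representation
assert twos_complement(0b10, 2) == 0b10     # 10_b complements to itself: -2
```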

### Subtractors
[Source](https://www.realdigital.org/doc/1156ec579e4e4c5c2bb71ee082e33fb3)

#### Ripple Borrow Subtractor

```{exercise}
:label: how_is_a_subtractor_implemented
How is a (1) half and (2) full subtractor implemented?
```
````{solution} how_is_a_subtractor_implemented
:class: dropdown
1. In $m-s=d$, $m$ is the minuend, $s$ is the subtrahend and $d$ is the difference. $d$ is one if exactly one of the inputs is one, which corresponds to an `xor` operation. We have an additional borrow output $b$ which is active if and only if $m = 0$ and $s = 1$:
   ```verilog
   assign
   	d = m ^ s,
   	b = ~m & s;
   ```

2. For the full subtractor we need an additional borrow input $b_\mathrm{in}$, which must be subtracted as well. The subtraction can be realized with an additional `xor` on the difference output of a half subtractor:
   ```verilog
   assign
   	d = m ^ s ^ b_in;
   ```
   The equation is the same as for the sum output of a full adder.

   The borrow must be active if $s+b_\mathrm{in} \gt m$:

   | $m$ | $s$ | $b_\mathrm{in}$ | $b_\mathrm{out}$ |
   |-----|-----|-----------------|------------------|
   |  0  |  0  | 0               | 0                |
   |  0  |  0  | 1               | 1                |
   |  0  |  1  | 0               | 1                |
   |  0  |  1  | 1               | 1                |
   |  1  |  0  | 0               | 0                |
   |  1  |  0  | 1               | 0                |
   |  1  |  1  | 0               | 0                |
   |  1  |  1  | 1               | 1                |

   This table corresponds to:
   ```verilog
   assign
   	b_out = (~m & b_in) |
   	        (~m & s) |
   	        (s & b_in);
   ```
   To save some transistors we can make use of the `xor` gate in the `m ^ s` circuit. If `(~m & ~s) | (m & s)` holds, i.e. $m = s$, then $b_\mathrm{out}$ directly depends on $b_\mathrm{in}$:

   | $m$ | $s$ | $b_\mathrm{in}$ | $b_\mathrm{out}$ |
   |-----|-----|-----------------|------------------|
   |  0  |  0  | 0               | 0                |
   |  0  |  0  | 1               | 1                |
   |  1  |  1  | 0               | 0                |
   |  1  |  1  | 1               | 1                |

   This corresponds to `xnor`, which can be built by negating the output of `m ^ s`. Then we append the rest of the minterms, which reduce to `~m & s`. So we get:
   ```verilog
   assign
   	b_out = ( ~(m ^ s) & b_in ) |
   	        ( ~m & s );
   ```

   This equation for borrow and the equation for the difference correspond to the following circuit:
   ```{image} https://upload.wikimedia.org/wikipedia/commons/8/83/Full-sub-Fixed.svg
   :alt: 48panda, CC BY-SA 4.0, via Wikimedia Commons
   :target: https://commons.wikimedia.org/wiki/File:Full-sub-Fixed.svg
   ```
   By cascading many full subtractors we can build a *ripple borrow subtractor*.
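The borrow equations can be checked by cascading them (a Python sketch; `^ 1` plays the role of the inverters, names are ours, and bit lists are LSB first):

```python
def full_subtractor(m, s, b_in):
    """One bit slice of m - s with an incoming borrow."""
    d = m ^ s ^ b_in
    b_out = ((m ^ s ^ 1) & b_in) | ((m ^ 1) & s)   # ~(m^s) & b_in  |  ~m & s
    return d, b_out

def ripple_borrow_sub(m_bits, s_bits):
    """Subtract two bit lists (LSB first) by rippling the borrow."""
    borrow = 0
    diff_bits = []
    for m, s in zip(m_bits, s_bits):
        d, borrow = full_subtractor(m, s, borrow)
        diff_bits.append(d)
    return diff_bits, borrow

# 5 - 3 = 2: 0b101 - 0b011 = 0b010, no final borrow
assert ripple_borrow_sub([1, 0, 1], [1, 1, 0]) == ([0, 1, 0], 0)
```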
````

#### Using Adders as Subtractors

```{exercise}
:label: how_can_we_build_a_subtractor_using_an_adder
How can we build a subtractor by using an adder?
```
````{solution} how_can_we_build_a_subtractor_using_an_adder
:class: dropdown
Subtraction can be represented as $a + (-b)$. We have already shown in {ref}`ch:representing_negative_binary_numbers` that signed addition works with two's complement. All we still need is to give the adder the ability to create the two's complement of $b$.

Two's complement can be achieved by a selective inversion of the bits and a subsequent increment. A dedicated subtract control signal can activate both the inversion and the additional increment. For the increment we can reuse the carry input of the least significant full adder, which is otherwise tied to 0. So if subtraction is activated we

1. invert the bits of the second addend
2. activate the carry in of the *least significant* full adder to convert the result from ones' complement to two's complement.

The first operation can be achieved with an `xor` gate on each bit of the second addend: the `xor` passes its first operand through if the second operand is 0 and inverts it if the second operand is 1. The circuit implementation of the result looks as follows:

```{image} https://www.realdigital.org/img/309399dd2d8d53f950848517d2cbe6f9.svg
:alt: CC-BY SA 4.0, Clint Cole
:target: https://www.realdigital.org/doc/1156ec579e4e4c5c2bb71ee082e33fb3#using-adders-as-subtractors
```
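Functionally, the circuit behaves as sketched below (a Python sketch; names are ours, and `sub` is the control signal that drives both the `xor` inversion and the carry input):

```python
def add_sub(a, b, sub, n=4):
    """n-bit adder/subtractor: if sub is 1, b is inverted and a carry of 1 injected."""
    mask = (1 << n) - 1
    b_eff = b ^ (mask * sub)         # xor with the control signal on every bit
    return (a + b_eff + sub) & mask  # sub doubles as the carry-in (the +1)

assert add_sub(5, 3, 0) == 8        # plain addition
assert add_sub(5, 3, 1) == 2        # 5 - 3
assert add_sub(3, 5, 1) == 0b1110   # 3 - 5 = -2 in two's complement
```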
````

### Comparator
[Source](https://www.realdigital.org/doc/a39d855f71772426c968c0151112b476)

<!-- TODO author
Figure 4. Truth table for bit-sliced 
"inputs from neighboring slices" => should be better: "inputs from the less significant slice"
-->

```{exercise}
:label: what_is_a_digital_comparator
What is a digital comparator and where is it useful?
```
```{solution} what_is_a_digital_comparator
:class: dropdown
A [digital comparator](https://en.wikipedia.org/wiki/Digital_comparator) or magnitude comparator is a circuit which compares two numbers and outputs three bits indicating whether the first number is:

1. greater than 
1. less than 
1. equal to

the second number. It is typically used in the arithmetic logic unit of processors.
```

```{exercise}
:label: how_can_we_implement_a_digital_comparator
How can we implement a digital comparator?
```
```{solution} how_can_we_implement_a_digital_comparator
:class: dropdown
The comparator input bits are related to each other similarly to the adder or subtractor circuits. We implemented the adder and subtractor using bit slicing, so we can use the same approach for the comparator: create a truth table for a full comparator with inputs and outputs ${eq}_\mathrm{in,out}$, ${gt}_\mathrm{in,out}$ and ${lt}_\mathrm{in,out}$, and eventually cascade these comparators. The inputs of the least significant comparator are tied to constant values – only ${eq}_\mathrm{in}$ is active.
<!--TODO create the truth table -->
```
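A cascade of full comparator bit slices can be sketched as follows (a Python sketch; names are ours, bit lists are LSB first, so the most significant slice decides last):

```python
def comparator_slice(a, b, eq_in, gt_in, lt_in):
    """One bit slice: an unequal bit pair overrides the result of the slices below."""
    if a == b:
        return eq_in, gt_in, lt_in
    return 0, int(a > b), int(a < b)

def compare(a_bits, b_bits):
    """Cascade from LSB to MSB; returns (eq, gt, lt)."""
    eq, gt, lt = 1, 0, 0            # least significant inputs: only eq is active
    for a, b in zip(a_bits, b_bits):
        eq, gt, lt = comparator_slice(a, b, eq, gt, lt)
    return eq, gt, lt

assert compare([0, 1], [1, 0]) == (0, 1, 0)   # 2 > 1
assert compare([1, 0], [1, 0]) == (1, 0, 0)   # 1 == 1
assert compare([1, 0], [0, 1]) == (0, 0, 1)   # 1 < 2
```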

<!-- TODO structurally describe a comparator using generate ? -->

### Multipliers
[Source](https://www.realdigital.org/doc/42f40f5d5d502c3672477a2bbd78b697)

```{exercise}
:label: how_is_a_binary_multiplier_implemented
How is a binary multiplier implemented?
```
```{solution} how_is_a_binary_multiplier_implemented
:class: dropdown
A binary multiplier is based on the pen and paper method where one multiplicand is multiplied by each digit of the other multiplicand. Depending on the significance of the digit, the partial product is shifted to the left and the empty digits are filled with zeros. These partial products are then summed with adders.
<!--TODO circuit diagram? -->
```
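The pen and paper method translates into shifted partial products that are summed at the end (a Python sketch; names are ours):

```python
def multiply(a, b, n=4):
    """Multiply by summing one shifted partial product per digit of b."""
    partial_products = []
    for i in range(n):
        bit = (b >> i) & 1                       # i-th digit of the multiplier
        partial_products.append((a * bit) << i)  # shift by the digit's significance
    return sum(partial_products)                 # summed with adders in hardware

assert multiply(0b101, 0b011) == 15   # 5 * 3
assert multiply(7, 9) == 63
```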

### Binary to BCD and BCD to Binary 
[Source](https://www.realdigital.org/doc/6dae6583570fd816d1d675b93578203d)

```{exercise}
:label: how_can_we_convert_from_bcd_to_binary_and_vice_versa
How can we convert a BCD number to a binary number and vice-versa?
```
```{solution} how_can_we_convert_from_bcd_to_binary_and_vice_versa
:class: dropdown
Each BCD digit consists of four bits representing one decimal digit. To convert a BCD number to binary, we multiply each BCD digit by $10^n$, where $n$ is the significance (position) of the digit, and sum the products.

To convert a binary number to BCD [the double dabble](https://en.wikipedia.org/wiki/Double_dabble) algorithm is used. The algorithm works as follows:
1. initialize enough BCD digit fields, e.g., if the input is a byte, then we need three BCD digit fields with four binary digits each.
2. we write the binary number on the right
3. we repeatedly left-shift the binary number into the BCD fields. Before every left-shift, if one of the BCD fields is 5 or greater, we add 3 to that field.
4. after shifting $n$ times, where $n$ is the width of the binary number, we stop.

Why do we have to add 3 if a field is 5 or greater? Each BCD field may only hold 10 values, even though four bits can represent 16. We have to compensate for this difference of 6. We would normally add 6 when a field reaches 10, but it needs fewer resources <!--TODO test this --> to add 3 when the field reaches 5: the next left-shift turns the added 3 into a 6.
```
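The double dabble steps above can be sketched as follows (a Python sketch; names are ours, and each BCD field is one nibble of `bcd`):

```python
def double_dabble(x, n_bits=8, n_digits=3):
    """Convert a binary number to BCD by shifting and adjusting digits >= 5."""
    bcd = 0
    for i in range(n_bits - 1, -1, -1):
        for d in range(n_digits):           # before each shift: adjust the fields
            if (bcd >> (4 * d)) & 0xF >= 5:
                bcd += 3 << (4 * d)         # the +3 becomes +6 after the shift
        bcd = (bcd << 1) | ((x >> i) & 1)   # shift in the next binary bit
    return bcd

assert double_dabble(243) == 0x243   # each nibble holds one decimal digit
assert double_dabble(99) == 0x99
```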

<!--TODO example verilog code? -->

## Requirements

TODO
<!-- project 2:
note that we only add or subtract two signed numbers in two's complement
how can we detect an overflow in addition? In the exercises the carry = cin xor cout of the last bit slice, but the projects asks to use `+` operation.
-->
