1.	ARM 64-bit Assembly
2.	Simplest programs
3.	Integers
4.	Main memory
5.	Arrays
6.	Branching and loops
7.	Functions
8.	Floating point

3. Representation of Integers and Shift Instructions

1. Introduction

We have to develop a good theoretical understanding of binary representation of numbers. The focus of this section is on integers. In later sections, we will learn how fractions and real numbers are represented in computers.

We will start with introducing the representation of integers and binary representation. Then, we will learn about two's complement, which is the technique used to store integer in modern computers. In the end we will learn about shift instructions LSL, LSR, and ASR in the assembly language. The shift instruction LSL is in certain situations much faster than the multiplication instruction MUL. The shift instructions LSR and ASR can be used to implement most basic divisions of integers.

2. Numbers in base \(b\) and binary numbers

Until 12th century, the Europeans did not use the decimal system to represent numbers. They wrote I, II, III, IV, and V instead of \(1\), \(2\), \(3\), \(4\), and \(5\). Chinese did almost the same. The characters for the first five numbers in Chinese alphabet are 一, 二, 三, 四, and 五.

These old traditional number systems need infinitely many symbols to support infinitely many numbers. The decimal system needs only \(10\) symbols: \(0\), \(1\), \(2\), \(\dots\), \(9\). Let us consider a positive integer \(n\) and let us assume that it has \(k\) digits. We will denote the digits from right to left by \(d_0\), \(d_1\), \(\dots\), \(d_{k-1}\). To avoid confusion with multiplication in the case when digits are labeled as variables, we will write a horizontal line over the digits and have an equation of the form \[ n=\overline{d_{k-1}\dots d_1d_0}.\] For example, if \(n=4529\), then we will write \(k=4\), \(d_0=9\), \(d_1=2\), \(d_2=5\), and \(d_3=4\). We can write \(n=\overline{4529}\) as well, but the horizontal line is not necessary because all digits are known and none of them is represented with a variable.

The following equation holds: \[\overline{d_{k-1}\dots d_1d_0}=d_{k-1}\cdot 10^{k-1}+d_{k-2}\cdot 10^{k-2}+\cdots+ d_1\cdot 10^1+d_0\cdot 10^0.\]

We are very used to this representation of numbers. The system is called decimal system and the number \(10\) is called the base of the decimal system. It is believed that \(10\) is chosen as the base for the system because humans have \(10\) fingers on their hands. The universe doesn't treat the number \(10\) as a special one. We can easily replace \(10\) with any other base and have a perfectly well functioning mathematics.

We will now define the representation of integers in an arbitrary base \(b\), where \(b\) is an integer bigger than \(1\).

We will now assume that \(b> 1\) and that \(n\) is a positive integer.

Definition The sequence of numbers \(d_0\), \(d_1\), \(\dots\), \(d_{k-1}\) is called representation of the number \(n\) in base \(b\) if \(d_0\), \(d_1\), \(\dots\), \(d_{k-1}\) are integers from \(\{0\), \(1\), \(\dots\), \(b-1\}\) that satisfy \[n=d_{k-1}\cdot b^{k-1}+d_{k-2}\cdot b^{k-2}+\cdots + d_1\cdot b^1+d_0\cdot b^0.\] The numbers \(d_0\), \(d_1\), \(\dots\), \(d_{k-1}\) are also called digits of \(n\) in base \(b\).

Theorem If \(n\) and \(b\) are integers such that \(n\geq 1\) and \(b > 1\), then there exists a representation of the integer \(n\) in base \(b\). If we require the first digit \(d_{k-1}\) to be non-zero, then the representation is unique.

This theorem is an easy result from elementary number theory. The uniqueness is proved by using the method of contradiction and the existence is proved using mathematical induction.

Uniqueness. Assume that there are two representations of the integer \(n\). In other words, assume that there are two sequences \(\left(d_0\right.\), \(d_1\), \(\dots\), \(\left.d_{k-1}\right)\) and \(\left(e_0\right.\), \(e_1\), \(\dots\), \(\left.e_{l-1}\right)\) such that \begin{eqnarray*} n&=&d_{k-1}\cdot b^{k-1}+\cdots + d_1\cdot b^1+d_0\\ &=&e_{l-1}\cdot b^{l-1}+\cdots+ e_1\cdot b^1+e_0. \end{eqnarray*} Assume that \(j\) is the biggest index for which \(d_j\neq e_j\). If \(k > l\), then we take \(j=k-1\). If \(k < l\)< then we take \(j=l-1\). Without loss of generality assume that \(d_j > e_j\). Then we can cancel the terms \(d_{j+1}\cdot b^{j+1}\) and \(e_{j+1}\cdot b^{j+1}\) from both sides of the previous equation because they are equal. We can also cancel the terms \(d_{j+2}\cdot b^{j+2}\) and \(e_{j+2}\cdot b^{j+2}\), and so on. We are left with the equation \begin{eqnarray*} d_j \cdot b^j+ \cdots + d_1\cdot b^1+d_0 &=& e_j\cdot b^j+\cdots+ e_1\cdot b^1+e_0. \end{eqnarray*} Let us denote by \(M\) the number on the left (and the right) side of the previous equation. Using that \(e_{j-1}\leq b-1\), \(e_{j-2}\leq b-1\), \(\dots\), \(e_1\leq b-1\), and \(e_0\leq b-1\), we obtain \begin{eqnarray*} M& \geq & d_j \cdot b^j \geq \left(e_j+1\right)\cdot b^j=e_j\cdot b^j+b^j. \tag*{(1)} \end{eqnarray*} Observe that \[(b-1)+(b-1)\cdot b^1+(b-1)\cdot b^2+\cdots +(b-1)\cdot b^{j-1}=(b-1)\cdot \frac{b^j-1}{b-1}=b^j-1 < b^j.\] Therefore, the equation (1) becomes \begin{eqnarray*} M& > & e_j\cdot b^j+(b-1)+(b-1)\cdot b^1+(b-1)\cdot b^2+\cdots +(b-1)\cdot b^{j-1}\\ &\geq& e_j\cdot b^j+e_{j-1}\cdot b^{j-1}+\cdots +e_1\cdot b^1+e_0\\ &=& M. \end{eqnarray*} This is a contradiction.

Existence. We use the principle of mathematical induction to prove a stronger statement. If \(n\) is a positive integer smaller than \(b^k\) then there exist a sequence \(\left(d_0\right.\), \(d_1\), \(\dots\), \(\left.d_{k-1}\right)\) of non-negative integers of length \(k\) whose elements are from the set \(\{0\), \(1\), \(\dots\), \(b-1\}\) such that \[n=d_{k-1}\cdot b^{k-1}+\cdots + d_1\cdot b^1+d_0\cdot b^0.\]

The statement is obvious for \(k=1\). Assume that \(k\geq 1\) and that the statement holds for \(k\). We will now prove that it holds for \(k+1\). Let \(n\) be an integer smaller than \(b^{k+1}\). When divided by \(b^k\) the number \(n\) gives a quotient and remainder. Let us denote by \(q\) and \(r\) the unique positive integers such that \(n=q\cdot b^k+r\) and \(r\in\{0\), \(1\), \(\dots\), \(b^k-1\}\). According to the induction hypothesis there exists a sequence \(d_0\), \(d_1\), \(\dots\), \(d_{k-1}\) for which \[r=d_{k-1}\cdot b^{k-1}+\cdots + d_1\cdot b+ d_0.\] By defining \(d_k=q\) we obtain \begin{eqnarray*} d_k\cdot b^k+d_{k-1}\cdot b^{k-1}+\cdots + d_1\cdot b+ d_0&=& q\cdot b^k+r=n. \end{eqnarray*} This completes the proof.

Problem 1. If the representation of the number \(n\) in base \(8\) is \(\overline{37105}_8\), what is the decimal representation of \(n\)?

Problem 2. If the representation of the number \(n\) in base \(5\) is \(\overline{2144}_5\), what is the decimal representation of \(n\)?

Problem 3. What is the representation of the number \(352\) in base \(5\)?

Let us denote by \(d_{k-1}\), \(d_{k-2}\), \(\dots\), \(d_2\), \(d_1\), and \(d_0\) the digits of the number \(352\) in base \(5\). The following equation holds \begin{eqnarray*} d_{k-1}\cdot 5^{k-1}+d_{k-2}\cdot 5^{k-2}+\cdots + d_2\cdot 5^2+d_1\cdot 5^1+d_0\cdot 5^0 &= & 352 . \tag*{(1)}\end{eqnarray*} We will find an easy way to obtain the digit \(d_0\) just by looking at the equation (1). Let us analyze the left-hand side of (1). Except for the last term \(d_0\cdot 5^0\), each of the numbers \(d_{k-1}\cdot 5^{k-1}\), \(d_{k-2}\cdot 5^{k-2}\), \(\dots\), \(d_2\cdot 5^2\), and \(d_1\cdot 5^1\) is divisible by \(5\). The right hand side is \(352\). The number \(352\) gives remainder \(2\) when divided by \(5\). Thus, we must have \(d_0=2\). We now place the value \(d_0=2\) in equation (1) and obtain \begin{eqnarray*} d_{k-1}\cdot 5^{k-1}+d_{k-2}\cdot 5^{k-2}+\cdots + d_2\cdot 5^2+d_1\cdot 5^1+2 &= & 352 . \end{eqnarray*} We can cancel 2 from both sides and get \begin{eqnarray*} d_{k-1}\cdot 5^{k-1}+d_{k-2}\cdot 5^{k-2}+\cdots + d_2\cdot 5^2+d_1\cdot 5^1&= & 350 . \end{eqnarray*} Both sides of the previous equation can be divided by \(5\). The equation becomes \begin{eqnarray*} d_{k-1}\cdot 5^{k-2}+d_{k-2}\cdot 5^{k-3}+\cdots + d_2\cdot 5^1+d_1\cdot 5^0 &= & 70 . \tag*{(2)}\end{eqnarray*} We will now use the same logic as before and obtain the digit \(d_1\). Each of the terms \(d_{k-1}\cdot 5^{k-2}\), \(d_{k-2}\cdot 5^{k-3}\), \(\dots\), \(d_2\cdot 5^1\) is divisible by \(5\). The right-hand side is \(70\). The number \(70\) has remainder \(0\) when divided by \(5\). Therefore, \(d_1=0\). The equation (2) becomes \begin{eqnarray*} d_{k-1}\cdot 5^{k-2}+d_{k-2}\cdot 5^{k-3}+\cdots + d_2\cdot 5^1&= & 70 .\end{eqnarray*} Dividing both sides by \(5\) gives us \begin{eqnarray*} d_{k-1}\cdot 5^{k-3}+d_{k-2}\cdot 5^{k-4}+\cdots + d_3\cdot 5^1 + d_2\cdot 5^0&= & 14 . \tag*{(3)}\end{eqnarray*} Using the same reasoning as before we conclude that \(d_2=4\). The equation (3) becomes \begin{eqnarray*} d_{k-1}\cdot 5^{k-3}+d_{k-2}\cdot 5^{k-4}+\cdots+d_4\cdot 5^2 + d_3\cdot 5^1 + 4&= & 14 .\end{eqnarray*} We now subtract \(4\) from both sides and divide the remaining quantities by \(5\) to obtain \begin{eqnarray*} d_{k-1}\cdot 5^{k-4}+d_{k-2}\cdot 5^{k-5}+\cdots+d_4\cdot 5^1 + d_3\cdot 5^0&= & 2 .\end{eqnarray*} We immediately conclude that \(d_3=2\) and \(d_4=d_5=\cdots =0\). Therefore, the representation of \(352\) in base \(5\) is \[\overline{352}_{10}=\overline{2402}_5.\]

Definition The representation in base \(2\) is also called a binary representation.

Problem 4. Determine the integer \(n\) whose binary representation is \(\overline{1101011}_2\).

Problem 5. Determine binary representation of the number \(n=150\).

Let \(d_0\), \(d_1\), \(d_2\), \(\dots\), \(d_{k-1}\) be the binary digits of \(n\). The number \(n\) and its binary digits must satisfy the equation \begin{eqnarray*} 150&=&d_{k-1}\cdot 2^{k-1}+d_{k-2}\cdot 2^{k-2}+\cdots+d_3\cdot 2^3+d_2\cdot 2^2+d_1\cdot 2^1+d_0. \end{eqnarray*} Since \(150\) is divisible by \(2\), we conclude that \(d_0=0\). The last equation becomes \begin{eqnarray*} 150&=&d_{k-1}\cdot 2^{k-1}+d_{k-2}\cdot 2^{k-2}+\cdots+d_3\cdot 2^3+d_2\cdot 2^2+d_1\cdot 2^1. \end{eqnarray*} We can divide both sides by \(2\) and obtain \begin{eqnarray*} 75&=&d_{k-1}\cdot 2^{k-2}+d_{k-2}\cdot 2^{k-3}+\cdots+d_3\cdot 2^2+d_2\cdot 2^1+d_1. \end{eqnarray*} Since \(75\) is odd, we conclude that \(d_1=1\). The last equation becomes \begin{eqnarray*} 75&=&d_{k-1}\cdot 2^{k-2}+d_{k-2}\cdot 2^{k-3}+\cdots+d_3\cdot 2^2+d_2\cdot 2^1+1. \end{eqnarray*} We now subtract \(1\) from both sides and obtain \begin{eqnarray*} 74&=&d_{k-1}\cdot 2^{k-2}+d_{k-2}\cdot 2^{k-3}+\cdots+d_3\cdot 2^2+d_2\cdot 2^1. \end{eqnarray*} We can divide both sides by \(2\) and get: \begin{eqnarray*} 37&=&d_{k-1}\cdot 2^{k-3}+d_{k-2}\cdot 2^{k-4}+\cdots+d_4\cdot 2^2+d_3\cdot 2^1+d_2 . \end{eqnarray*} Since \(37\) is odd, we must have \(d_2=1\). The last equation becomes \begin{eqnarray*} 37&=&d_{k-1}\cdot 2^{k-3}+d_{k-2}\cdot 2^{k-4}+\cdots+d_4\cdot 2^2+ d_3\cdot 2^1+1. \end{eqnarray*} We subtract \(1\) from both sides. \begin{eqnarray*} 36&=&d_{k-1}\cdot 2^{k-3}+d_{k-2}\cdot 2^{k-4}+\cdots+d_4\cdot 2^2+d_3\cdot 2^1. \end{eqnarray*} Dividing both sides by \(2\) gives us \begin{eqnarray*} 18&=&d_{k-1}\cdot 2^{k-4}+d_{k-2}\cdot 2^{k-5}+\cdots+d_5\cdot 2^2+d_4\cdot 2^1+d_3 . \end{eqnarray*} The number \(18\) is divisible by \(2\). Therefore, \(d_3=0\). Dividing both sides by \(2\) gives us \begin{eqnarray*} 9&=&d_{k-1}\cdot 2^{k-5}+d_{k-2}\cdot 2^{k-6}+\cdots+d_6\cdot 2^2+d_5\cdot 2^1+d_4 . \end{eqnarray*} Since \(9\) is not divisible by \(2\), we must have \(d_4=1\). We subtract \(1\) from both sides and divide the remaining equation by \(2\) to get \begin{eqnarray*} 4&=&d_{k-1}\cdot 2^{k-6}+d_{k-2}\cdot 2^{k-7}+\cdots+d_7\cdot 2^2+d_6\cdot 2^1+d_5 . \end{eqnarray*} The last equation implies that \(d_5=0\). Dividing both sides by \(2\) gives us \begin{eqnarray*} 2&=&d_{k-1}\cdot 2^{k-7}+d_{k-2}\cdot 2^{k-8}+\cdots+d_8\cdot 2^2+d_7\cdot 2^1+d_6 . \end{eqnarray*} We immediately obtain that \(d_6=0\) and \begin{eqnarray*} 1&=&d_{k-1}\cdot 2^{k-8}+d_{k-2}\cdot 2^{k-9}+\cdots+d_8\cdot 2^1+d_7 . \end{eqnarray*} Finally, we get \(d_7=1\) and \(d_8=d_9=\cdots =0\). Thus, \(k=8\). The binary representation of \(150\) is \[150=\overline{10010110}_2.\]

Problem 6. Create the program that reads a number from the user input and prints its binary representation.

Problem 7. Create a program that reads the string s consisting of digits \(0\) and \(1\) only and determines the integer \(n\) whose binary representation is s.

3. Unsigned integers

Non-negative integers are the easiest to handle. In computer science they are called unsigned integers because we do not need to worry about the sign. We can assume it is \(+\). The binary digits are the only elements that have to be stored.

Unsigned integers were more common in past, but not so much today. The computers had very small memory in past. If your program needed both positive and negative integers, then you would be kind of sad. You would need to devote one entire bit for the sign. If you knew that your program does not need negative numbers, you would feel happy and lucky. You would use unsigned integers. That extra bit would be an extra digit. The possible range for the data would be bigger than if one of the bits were reserved for the sign.

In modern C++ there are two major data types for unsigned integers. The first one is unsigned int and it occupies \(32\) bits. The second type is unsigned long and it takes \(64\) bits. They are rarely used any more.

4. Signed integers

Let us denote by \(l\) the total number of bits that we are allowed to use for storing our integer. For most C++ compilers, the type int has \(l=32\) while the type long has \(l=64\).

We will treat \(l\) as a general number and in our examples we will keep things simple by using \(l\) that is much smaller than \(32\) and \(64\). We will often use \(l=8\).

The maximal positive number that we will be able to store is \(2^{l-1}-1\). The minimal negative number will be \(-2^{l-1}\). Thus, the range of all integers will be \([-2^{l-1},2^{l-1}-1]\). There are exactly \(2^l\) numbers in this range. Instead of storing the number \(x\), the computer memory will contain Two's complement of \(x\). This is the formal definition of two's complement.

Definition For \(x\in[-2^{l-1},2^{l-1}-1]\) the two's complement of \(x\) is the number consisting of the last \(l\) binary digits of \(2^l+x\).

Problem 8. Assume that \(l=8\). Determine the two's complement of numbers \(0\), \(10\), \(35\), \(-35\), \(-90\), \(127\), \(-128\).

From the previous exercise we can observe that for positive \(x\) that satisfies \(x\leq 2^{l-1}-1\), the two's complement is always equal to the binary expansion of \(x\). The number \(2^l+x\) will have digit \(1\) at the position \(l+1\) which gets discarded. Moreover, the left-most digit of the two's complement is equal to \(0\). However, if \(x\) is a negative number that satisfies \(x\geq -2^{l-1}\), then the two's complement will have exactly \(l\) digits. There won't be any digit to discard. The left-most digit of the two's complement will be \(1\).

Thus, the left-most digit of the two's complement is \(0\) if the number is positive, and \(1\) if the number is negative. The left-most digit is the same as the sign.

The arithmetic operations with signed integers can be performed in the same way as if the integers are unsigned. If there is any carryover that would occupy more than \(l\) bits, that carryover should just be ignored.

We are now ready to analyze the code from the Example 1.

Problem 9. What does the following code print?

#include<iostream>
int main(){
    int x=1 ;
    for(int i=0;i<35;++i){
        std::cout<<i<<"\t"<<x<<"\n";
        x = x*2; 
    }
    return 0;
}

5. Left shift

From very young age, all humans learn that it is the easiest to multiply by numbers such as 10, 100, and 1000. The multiplication by \(10^k\) in decimal system consists of adding \(k\) zeroes to the right end of the number.

Similarly, multiplication by \(2^k\) in binary is equivalent to adding \(k\) zeroes at the end of the number. This operation is called left shift or logical left shift.

If we want to multiply the content of the register X9 by the number by \(2^5\) and store the result in X10, we would need to give the following instruction.

LSL     X10,     X9,     5

Mathematically, the instruction is equivalent to MUL X10, X9, 32. However, MUL is much slower. Hence, if we are multiplying the number by a power of \(2\), then it is much faster to use the shift instruction instead of the multiplication.

The shift operator can be a faster alternative even to multiplications by certain numbers that are not exactly powers of two. Here is one example: Multiplication by 17 can be achieved with

MUL     X10,     X9,     17

The faster way to multiply by 17 is

LSL     X8,      X9,     4      /* X8 = X9 * 2^4 = 16 * X9 */
ADD     X10,     X9,     X8     /* X10 = X9 + X8 = 17 * X9 */

Problem 10.

Assume that the register X1 contain the integer \(a\). Use the appropriate shift instruction to evaluate \(512\cdot a\) and store the result in the register X2. Do not use the instruction MUL.

Problem 11.

Assume that the register X1 contain the integer \(a\). Evaluate \(520\cdot a\) and store the result in the register X2. Do not use the instruction MUL.

6 Right shifts

There are two right shift instructions: arithmetic right shift ASR, and logical right shift LSR.

The logical right shift always adds zeroes to the left. The arithmetic right shift copies the first bit and adds it to the left.

When the numbers are signed, they are expressed in two's complement, then the arithmetic right shift by \(k\) bits corresponds to the integer division by \(2^k\). The integer division is the operation that discards the remainder.

When the numbers are unsigned, then the logical right shift is the operation that corresponds to the integer division by \(2^k\).

If we want to make the arithmetic right shift of the register X9 by \(5\) places and store the result in X10, we would need to give the following instruction.

ASR     X10,     X9,     5

An analogous syntax is used for the logical right shift.

In the exercise below, the numbers are signed, and you should use the arithmetic right shift.

Problem 12.

Assume that the register X1 contain the integer \(a\). Use the appropriate shift instruction to evaluate the integer division \(a/64\) and store the result in the register X2.