Advanced Micro Devices

Am29332 32-Bit Arithmetic Logic Unit

Instruction Manual

© 1987 Advanced Micro Devices

Advanced Micro Devices reserves the right to make changes in its products without notice in order to improve design or performance characteristics.

This manual neither states nor implies any warranty of any kind, including but not limited to implied warranties of merchantability or fitness for a particular application. AMD assumes no responsibility for the use of any circuitry other than the circuitry embodied in an AMD product.

The information in this publication is believed to be accurate in all respects at the time of publication, but is subject to change without notice. AMD assumes no responsibility for any errors or omissions, and disclaims responsibility for any consequences resulting from the use of the information included herein. Additionally, AMD assumes no responsibility for the functioning of undescribed features or parameters.

09287B

Table of Contents

Product Overview

Functional Description

Data Types

Instruction Format

Instruction Classification

Instruction Set Summary

Data Movement Instructions

Logical Instructions

Single-Bit Shift Instructions

Prioritize Instructions

Arithmetic Instructions

Shift/Rotate Instructions

Bit-Manipulation Instructions

Field Logical Instructions

Mask Instructions

Glossary

Detailed Instruction Description in Alphabetical Order

Appendix A: BCD Arithmetic

Appendix B: Multiplication, Division

Am29332 Instruction Manual

Figure 1. Detailed Block Diagram (BD007031). Blocks shown: parity checker, instruction decode, width and position multiplexers, funnel shifter, ALU and priority encoder, status register and status multiplexer, master/slave comparator, and parity generation. Signals shown: DA0-DA31, DB0-DB31, PA0-PA3, PB0-PB3, W0-W4, MLINK, MCin, RS, Y0-Y31, PY0-PY3, C, Z, N, V, L, and MSERR.

CHAPTER 1 Am29332 32-Bit Arithmetic Logic Unit

Figure 2. Simplified Block Diagram (07995A-48A)

PRODUCT OVERVIEW

The Am29332 is a 32-bit wide, high-performance, non-expandable Arithmetic Logic Unit (ALU). It has two 32-bit wide input ports (A and B) and one 32-bit wide output port (Y). These three ports provide flexibility and accessibility for high-performance processor designs. Dedicated input and output ports provide a flow-through architecture and avoid the penalty associated with switching the bus half-way through the cycle for input and output of data. The chip is designed for use with a dual-access RAM (Am29334) as a register file. In addition, the three-bus architecture facilitates the connection of other arithmetic units in parallel with the Am29332 for high-performance systems.

The Am29332 supports one-, two-, three-, and four-byte arithmetic operations. It also supports multiprecision arithmetic and multiple bit shifts. For logical operations, it can handle variable-length fields of up to 32 bits. The chip incorporates dedicated hardware to allow efficient implementation of a two-bit-at-a-time (modified Booth) multiply algorithm, supporting signed and unsigned arithmetic data types. Similarly, hardware is provided to support a bit-at-a-time divide algorithm, also supporting signed and unsigned arithmetic data types. An internal 32-bit register (Q) is used by the multiply and divide hardware for double precision operands. For business applications, the Am29332 supports variable-length BCD arithmetic.

Field logical instructions operate on bit-fields taken from the A and B data inputs; they may be of variable width and starting position. A is normally the source input and B the destination input. In general, destination bits not falling within a specified field are passed by the ALU unchanged. Field width and position are specified either by direct inputs to the chip, or by entries in the status register. There are two kinds of field logical instructions: aligned and non-aligned. The first type of instruction assumes that source and destination fields are aligned, and the operation is performed only for bits within the specified fields. In the second type of instruction, source and destination fields are normally non-aligned. However, it is always assumed that one field (either source or destination) is least-significant-bit (LSB) aligned.

If the destination field is LSB aligned, then the source field is downshifted in order to make it LSB aligned as well. Downshifting is accomplished by making the 6-bit position input equal to the two's complement of the number of places the field is to be downshifted. If the source field is LSB aligned, then it is upshifted in order to align it with the destination. Upshifting is accomplished by making the position inputs equal to the number of places the field is to be upshifted. Any other type of field operation is not allowed. Whenever the field crosses the word boundary, the portion not falling within the word boundary is ignored. This effect is useful when performing operations on fields that overlap two different words.
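The encoding of the position input for non-aligned field operations can be illustrated with a short sketch. The helper name `position_for_shift` and its argument convention are assumptions made for this illustration; only the rule itself (shift count for an upshift, 6-bit two's complement of the count for a downshift) comes from the text.

```python
def position_for_shift(places, up):
    """Return the 6-bit position value for a shift of `places` bits.

    Hypothetical helper: an upshift uses the count directly (0 to 31),
    a downshift uses the 6-bit two's complement of the count (1 to 32).
    """
    if up:
        if not 0 <= places <= 31:
            raise ValueError("upshift must be 0 to 31 places")
        return places
    if not 1 <= places <= 32:
        raise ValueError("downshift must be 1 to 32 places")
    return (-places) & 0x3F  # 6-bit two's complement


# Downshifting a source field by 3 places to make it LSB aligned:
print(format(position_for_shift(3, up=False), "06b"))
```

A microprogram assembler would typically perform this conversion once, so that the microcode author can write shift counts as signed numbers.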

Instructions to perform straightforward multiple-bit shifts (either up or down) are also provided. Additionally, it is possible to extract a bit-field from a word in one instruction, even if that field overlaps a word boundary.


The power and the flexibility of the processor come partly from its ability to generate a mask to control the width of an operation for each instruction without any overhead. For all byte aligned instructions (three quarters of the instruction set), the mask is either 1, 2, 3 or 4 bytes wide and is generated from the byte width input (I8-I7). For all field instructions the mask is of variable width and is generated from the position inputs (P0-P5) and the width inputs (W0-W4). Table 1 describes the position displacement from the position inputs and Table 2 the bit field from the width inputs.

TABLE 1. POSITION INPUTS AND BIT DISPLACEMENT

TABLE 2. WIDTH INPUTS AND BIT FIELD

Inputs
W4  W3  W2  W1  W0    Bit Field Width
0   0   0   0   0     32
0   0   0   0   1     1
0   0   0   1   0     2
...
1   1   1   1   1     31

Whenever the width of the operand is less than 32 bits, all unselected bits from the inputs of the ALU are passed to the output without any modification. Depending upon the instruction type, unselected bits are taken from different sources. For example, in all single operand instructions, bits from the source operand (from either A or B input) are passed in unselected bit positions. For two operand instructions, bits from the B input are passed in unselected bit positions. There are some exceptions which are explained in the instruction set section.

The processor has a 32-bit status register to indicate the status of different operations performed. The status register is loaded at the rising edge of the clock with new status unless the HOLD signal is HIGH. The bit position for each status bit is given in the functional description. The least significant byte of the status register holds the six position bits (PR0-PR5). The two most significant bits of this byte may be read or loaded but are otherwise unused by the ALU. The second byte (bits 8 to 15) consists of the five width bits (WR0-WR4) and three read-only bits that are a combinational function of other status bits, and which indicate useful branch conditions. The third byte consists of ALU status bits plus bits for high-speed multiply and divide. The most significant byte holds intermediate nibble carries for BCD operations. An extract-status instruction is provided which allows a Boolean value to be formed from any selected bit. This is particularly useful in machines employing a stack architecture. Instructions to save and restore the status register are provided. As the entire status of each instruction is stored in the status register, interrupts at any microinstruction boundary are feasible.

The processor has a 32-bit wide priority encoder to support floating-point and graphics operations. The priority encoder supports all byte aligned data types; the result is dependent upon the byte width specified. The result of a priority encode is also loaded into the position bits of the status register. The result of the prioritize operation can then be used in the following clock cycle, e.g., to normalize a floating-point number or to help detect the edge of a polygon in graphics applications.

To support system diagnostics, the Am29332 has a special "Master-Slave" mode. To use this mode, two chips are connected in parallel, and hence receive the same instructions and data. The master chip is used for the normal data path. However, in the slave chip, all outputs become inputs. The slave compares the outputs of the master with its own internally generated result. If the two do not match, the slave will activate an error signal.

As a further diagnostic aid, byte-wise parity checking is performed at both the A and B data inputs. The "parity" signal is activated if an error is detected. Parity bits (one per byte) are generated for the 32-bit output bus.

FUNCTIONAL DESCRIPTION

A detailed description of each functional block is given in the following paragraphs.

64-Bit Funnel Shifter

The 64-bit funnel shifter is a combinatorial network. The 64-bit input is formed from a combination of the A and B inputs. This may be left-shifted by up to 31 bits before being used by the ALU. The output of the shifter is the most significant 32 bits of the result. The 64-bit shifter can be used on either the A or B operands to perform barrel shifts (either up or down) or rotates. The operation is controlled by positioning operands properly at the input of the 64-bit up-shifter.

The number "n" by which the operand is shifted comes from two sources: the microprogram memory via the P0-P5 pins or the internal register (byte 0 of the status register), PR0-PR5, as selected by an instruction bit.

In general, the 6-bit position input, P0-P5, takes a 6-bit two's complement number representing upshifts from 0 to 31 places (positive numbers) or downshifts from 1 to 32 places (negative numbers).
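The funnel shifter can be modelled in a few lines of Python. This is a behavioural sketch only: the function names and, in particular, the way the operand is placed in the 64-bit input for each kind of shift are assumptions chosen to reproduce the described results, not a statement of the chip's internal wiring.

```python
MASK32 = 0xFFFFFFFF


def decode_position(p):
    """Interpret a 6-bit value as two's complement: +n upshift, -n downshift."""
    p &= 0x3F
    return p - 64 if p & 0x20 else p


def funnel_shift(high, low, n):
    """Upshift the 64-bit value high:low by n (0 to 31); keep the top 32 bits."""
    if not 0 <= n <= 31:
        raise ValueError("shift count must be 0 to 31")
    wide = ((high & MASK32) << 32) | (low & MASK32)
    return ((wide << n) >> 32) & MASK32


def upshift(a, n):
    """Zero-fill upshift: operand in the upper half, zeros in the lower half."""
    return funnel_shift(a, 0, n)


def rotate_up(a, n):
    """Rotate: the same operand in both halves."""
    return funnel_shift(a, a, n)


def downshift(a, k):
    """Zero-fill downshift by k (1 to 32): operand in the lower half,
    zeros in the upper half, then an upshift by 32 - k."""
    if not 1 <= k <= 32:
        raise ValueError("downshift must be 1 to 32 places")
    return funnel_shift(0, a, 32 - k)


print(hex(rotate_up(0x80000001, 1)), hex(downshift(0x80000000, 1)))
```

The point of the sketch is that one left-shifting network is enough: barrel shifts in either direction and rotates differ only in how the two 32-bit halves of the input are filled.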

Mask Generator

The mask generator logic provides the ability to generate the appropriate mask for an operand of given width and position. The generation of the mask depends upon two types of instructions. The first type has byte boundary aligned operands (widths of either 1, 2, 3 or 4 bytes) with the least significant bit aligned to bit 0. The width of an operand is specified by the byte width inputs (I8 and I7) as shown in Table 3. The second type of instruction has operands of variable width (1 to 32 bits) and position. The operand is specified by the width inputs (W0-W4) and the position inputs (P0-P5) indicating the least significant bit position of the operand. Thus, in this type of instruction the operand may or may not be least significant bit aligned. Depending upon the type of instruction, the mask generator first generates a fence of all zeros starting from the least significant bit with the width specified either by the byte width or the width input fields. This fence can be upshifted by up to 31 bits by the 32-bit mask shifter. Whenever the mask is moved up over the 32-bit boundary, it does not wrap around. Instead, ONEs are inserted from the least significant end. This configuration provides the ability to operate on a contiguous field located anywhere in a word, or across a word boundary.
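As an illustration of the mask described above, the following sketch builds the 32-bit mask from a width code and a position. The names `field_mask` and `byte_mask` are hypothetical, and treating a width code of 0 as 32 bits is an assumption taken from the reading of Table 2. Zeros mark selected bits.

```python
MASK32 = 0xFFFFFFFF


def field_mask(width_code, position):
    """Mask with 0s over the selected field and 1s elsewhere.

    width_code: 0 to 31, where 0 is assumed to select 32 bits.
    position:   0 to 31, bit position of the field's least significant bit.
    """
    width = 32 if width_code == 0 else width_code
    fence = (1 << width) - 1          # `width` selected bits at the LSB end
    # Upshift the fence; bits pushed past bit 31 are dropped (no wrap-around)
    # and the vacated low positions become ONEs after inversion.
    return ~(fence << position) & MASK32


def byte_mask(n_bytes):
    """Mask for a byte-aligned operand of 1, 2, 3 or 4 bytes."""
    return field_mask(0 if n_bytes == 4 else 8 * n_bytes, 0)


print(hex(field_mask(8, 4)), hex(field_mask(4, 30)), hex(byte_mask(2)))
```

A field that runs past bit 31, such as width 4 at position 30, simply loses its upper part, which matches the statement that the mask does not wrap around.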

The mask generator can be used as a pattern generator by allowing the mask to pass through the ALU (by using the PASS-MASK instruction). For example, a single-bit wide mask can be generated and then shifted up by different amounts to give walking ONE or walking ZERO patterns for memory tests.

TABLE 3.

Arithmetic and Logical Unit

The ALU is a three input unit which uses the mask as a second or third operand in every instruction. The mask is used to merge two operands. For all selected bits (wherever the mask is 0), the desired operation specified by the instruction input is performed, and for all unselected bits either corresponding destination bits or zeros are passed through. The status of each operation (carry, negative, zero, overflow, link) applies to the result only over the specified width. For all byte aligned arithmetic and logical operations (first three quarters of the instruction set), the status is extracted from the appropriate byte boundary. For all field operations (last quarter of the instruction set), the operand width is assumed to be 32 bits for status generation. The ZERO flag always indicates the status of all bits selected by the mask.
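The merge performed by the mask can be sketched as follows. This is an illustrative model rather than the device's logic equations: the function name `masked_alu` and the `pass_zeros` switch are assumptions, and the example uses the B input as the destination operand.

```python
MASK32 = 0xFFFFFFFF


def masked_alu(op, a, b, mask, pass_zeros=False):
    """Apply `op` where the mask is 0; pass destination bits (or zeros) elsewhere."""
    result = op(a, b) & MASK32
    unselected = 0 if pass_zeros else (b & MASK32)
    return (result & ~mask & MASK32) | (unselected & mask)


# AND the first byte of A and B; the high three bytes come from B.
y = masked_alu(lambda a, b: a & b, 0x12345678, 0xFFFF00FF, 0xFFFFFF00)
print(hex(y))
```

Because the mask is simply a third operand, the same data path serves byte-aligned and variable-length field instructions; only the mask generation differs.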

The actual width of the ALU is 34 bits. There are two extra bits used for the high speed signed and unsigned multiplication instructions. These two bits are automatically concatenated to the most-significant end of the ALU depending upon the width specified for the operation. Since the modified Booth algorithm requires a two-bit down-shift each cycle, these ALU bits generate the two most-significant bits of the partial product.

The ALU is capable of shifting data down by two bits for the multiplication algorithm, and up by one bit for the divide algorithm and for single-bit upshifts.

The processor is capable of performing BCD arithmetic on packed BCD operands. This logic generates nibble carries (BCD digit carries) from propagate and generate signals formed from the A and B operands. In order to simplify the hardware while maintaining throughput, the BCD add and subtract operations are performed in two cycles. In the first cycle, ordinary binary addition or subtraction is performed and BCD nibble carries are generated. These are blocked from affecting the result at this stage, but are saved in the status register to be used later for BCD correction (NC0-NC7). In the second cycle all BCD numbers are adjusted by examining the previously generated nibble carries. Since all the necessary information is stored in the status register, the processor can be interrupted after the first BCD cycle.
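The two-cycle BCD addition can be illustrated with a small model. It is a sketch under stated assumptions: the function names are hypothetical, and each nibble carry is taken to be the decimal carry out of its digit (digit sum plus incoming carry greater than 9). The correction step adds 6 in every digit position whose nibble carry is set, as described for the sum-correct step.

```python
MASK32 = 0xFFFFFFFF


def bcd_add_first_cycle(a, b):
    """Cycle 1: plain binary add, plus the eight BCD digit carries."""
    binary_sum = (a + b) & MASK32
    nibble_carries = 0
    carry = 0
    for i in range(8):
        digit_a = (a >> (4 * i)) & 0xF
        digit_b = (b >> (4 * i)) & 0xF
        carry = 1 if digit_a + digit_b + carry > 9 else 0
        nibble_carries |= carry << i
    return binary_sum, nibble_carries


def bcd_sum_correct(value, nibble_carries):
    """Cycle 2: add 6 to every digit whose saved nibble carry is 1."""
    correction = 0
    for i in range(8):
        if (nibble_carries >> i) & 1:
            correction |= 6 << (4 * i)
    return (value + correction) & MASK32


s, nc = bcd_add_first_cycle(0x19, 0x28)   # 19 + 28 in packed BCD
print(hex(s), bin(nc), hex(bcd_sum_correct(s, nc)))
```

Since the intermediate sum and the nibble carries are the only state carried between the two cycles, saving the status register between them is sufficient to resume the operation later.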

Priority Encoder

The priority encoder is provided to support floating-point arithmetic and some graphics primitives. The priority encoder takes up to 32 bits as input and generates a 5-bit wide binary code to indicate the location of the most significant one in the operand. Input to the priority encoder comes from the input multiplexer, which masks all bits that the user does not want to participate in the prioritization. The priority encoder supports 8-, 16-, 24-, and 32-bit operations depending upon the byte width specified. For each data type the priority encoder generates the appropriate binary weighted code. For example, when a byte width of two is specified (I7-I8 = 10), the output of the encoder is zero when bit 15 is HIGH. However, if a byte width of four is specified (I7-I8 = 00), the output of the encoder is 16 (decimal) if bit 15 is HIGH and bits 31-16 are LOW. Table 4 shows the output for each data type. If none of the inputs are HIGH or the most significant bit of the data type specified is HIGH, then the output is zero. The difference between these two cases is indicated by the Z-flag of the status register, which is HIGH only if all inputs are zero.
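The encoder's output rule can be expressed compactly. In this sketch the function name `prioritize` and the `(code, z_flag)` return convention are assumptions; the arithmetic follows the examples in the text, where the code is the distance of the highest set bit from the most significant bit of the selected data type.

```python
def prioritize(value, n_bytes):
    """Return (encoder output, Z flag) for a 1-, 2-, 3- or 4-byte operand."""
    width = 8 * n_bytes
    selected = value & ((1 << width) - 1)   # ignore bits above the data type
    if selected == 0:
        return 0, 1                         # no bit set: output 0, Z HIGH
    return width - selected.bit_length(), 0


print(prioritize(0x00008000, 2))   # bit 15 set, 16-bit operand
print(prioritize(0x00008000, 4))   # same value as a 32-bit operand
```

The returned code is exactly the upshift needed to bring the leading one to the top of the data type, which is why it is useful for normalizing a floating-point mantissa in the following cycle.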

Q-Register

The Q-register holds dividend and quotient bits for division, and multiplier and product bits for multiplication. During division, the contents of the Q-register are shifted left, a bit at a time, with quotient bits inserted into bit 0. During multiplication, the contents of the Q-register are shifted right, two bits at a time, with product bits inserted into the most-significant two bits (according to the selected byte width). The Q-register may be loaded from the A or B inputs and read onto the Y bus.

Master-Slave Comparator

All ALU outputs (except MSERR) employ three-state buffers. The master-slave comparator compares the input and output of each buffer. Any difference causes the MSERR signal to be made true. In Slave mode, all output buffers are disabled. Outputs from a second ALU may then be connected to the equivalent pins of the first. The comparator in the slave will then detect any difference in the results generated by the two. When the Y bus is three-stated by making Output-Enable false, the Y bus master-slave comparators are disabled.

Parity Logic

For each byte of the DA and DB inputs there is an associated parity bit (8 in all). If a parity error is detected on any byte, the Parity-Error signal is made true. Four parity signals (one per byte) are also generated for the Y bus outputs. EVEN parity is employed for the Am29332.
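Byte-wise even parity can be sketched as below. The function names are hypothetical, and the sketch assumes the usual meaning of even parity: each parity bit is chosen so that the byte plus its parity bit contains an even number of ones.

```python
def byte_parity_bits(word):
    """Return 4 even-parity bits, bit i covering byte i of a 32-bit word."""
    bits = 0
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        ones = bin(byte).count("1")
        bits |= (ones & 1) << i      # 1 only when the byte has an odd count
    return bits


def parity_error(word, parity_bits):
    """True if any supplied parity bit disagrees with the data."""
    return byte_parity_bits(word) != (parity_bits & 0xF)


print(format(byte_parity_bits(0x01000003), "04b"))
```

The same generation function serves both directions in this model: it produces the Y bus parity outputs, and comparing its result with the incoming parity bits detects an input error.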

Status Register

All necessary information about operations performed in the ALU is stored in the 32-bit wide status register after every microcycle. Since the register can be saved, an interrupt can occur after any cycle. The status register can be loaded from either the A or B input of the chip and can be read out on the Y bus for saving in an external register file. For loading, the byte width indicates how many bytes are to be updated. The status register is only updated if the HOLD input is inactive.

Each byte of the status register holds different types of information (see Figure 3). The least significant byte (bits 0 to 7) holds eight position bits (PR0-PR7) for the data shifter. The two most significant bits are not used. The next most significant byte (bits 8 to 15) holds the 5-bit width field (WR0-WR4) for the mask generator. The three most-significant bits of that byte (bits 13 to 15) are read-only bits that represent three different conditions extracted from the other bits of the status register. They are C + Z, N ⊕ V, and (N ⊕ V) + Z for bits 13, 14 and 15 respectively. These bits can be read on the Y0 pin by the extract-status instruction. The next byte contains all the necessary information generated by an ALU operation. The least-significant four bits (bits 16 to 19) hold carry, negative, overflow and zero flags. Bit 20 holds link information for single bit shifts and bits 21 and 22 are used by the multiply and divide instructions. The M flag holds the multiplier bit for the modified Booth algorithm or it holds the sign comparison result for the divide algorithm. The S flag holds the sign of the partial remainder for unsigned division. Both the flags (M and S) are provided as a part of the status register so that multiply and divide instructions can be interrupted at microinstruction boundaries. The most significant byte of the status register holds nibble carries for BCD arithmetic. Since BCD arithmetic is performed in two cycles, the nibble carries are saved in the first cycle and used in the second cycle. Since all the information is stored, BCD instructions are also interruptible at the microinstruction boundary.

TABLE 4.

Byte Width             Highest Priority Active Bit    Encoder Output
I7-I8 = 00 (32-bit)    None                           0
                       31                             0
                       30                             1
                       29                             2
                       28                             3
                       ...
I7-I8 = 01 (8-bit)     None                           0
                       7                              0
                       6                              1
                       5                              2
                       4                              3
                       ...
I7-I8 = 10 (16-bit)    None                           0
                       15                             0
                       14                             1
                       13                             2
                       12                             3
                       ...
I7-I8 = 11 (24-bit)    None                           0
                       23                             0
                       22                             1
                       21                             2
                       20                             3
                       ...

Status 0-7:    Position Register (PR7 to PR0 in bits 7 to 0)
Status 8-12:   Width Register (WR4 to WR0 in bits 12 to 8)
Status 13:     C + Z          (read only, unsigned)
Status 14:     N ⊕ V          (read only, signed)
Status 15:     (N ⊕ V) + Z    (read only, signed)
Status 16:     Carry (C)
Status 17:     Negative (N)
Status 18:     Overflow (V)
Status 19:     Zero (Z)
Status 20:     Link (L)
Status 21:     Multiply (and divide) Bit (M)
Status 22:     Sign Flag (S)
Status 23:     0
Status 24-31:  Nibble Carries

Note: Overflow is defined as follows: V = (carry in to MSB) ⊕ (carry out of MSB)

Figure 3. ALU Status Register Bit Assignment
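The bit assignment of Figure 3 can be turned into a small decoder, which is handy when inspecting a saved status word. The function names and dictionary keys below are hypothetical; the bit positions are those listed in the figure, and "+" in the read-only conditions is taken as logical OR.

```python
def unpack_status(word):
    """Split a 32-bit status word into the fields of Figure 3."""
    def bit(i):
        return (word >> i) & 1

    return {
        "position": word & 0xFF,           # bits 0 to 7
        "width": (word >> 8) & 0x1F,       # bits 8 to 12
        "C": bit(16), "N": bit(17), "V": bit(18), "Z": bit(19),
        "L": bit(20), "M": bit(21), "S": bit(22),
        "nibble_carries": (word >> 24) & 0xFF,
    }


def branch_conditions(flags):
    """Recompute the read-only bits 13, 14 and 15 from C, N, V and Z."""
    c, n, v, z = flags["C"], flags["N"], flags["V"], flags["Z"]
    return {13: c | z, 14: n ^ v, 15: (n ^ v) | z}


word = (0xA5 << 24) | (1 << 20) | (1 << 19) | (1 << 17) | (3 << 8) | 0x1F
flags = unpack_status(word)
print(flags, branch_conditions(flags))
```

Keeping the derived conditions as a separate function mirrors the description of bits 13 to 15 as a combinational function of the other status bits rather than stored state.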

Am29332 INSTRUCTION SET

Data Types

The Am29332 supports the following data types:

1. Integer
2. Binary-coded decimal
3. Variable-length bit field

The first two data types fall into the category of byte boundary aligned operands (Figure 4). The size of the operand could be 1 byte, 2 bytes, 3 bytes or 4 bytes. All operands are least significant bit (bit 0) aligned. The byte width is determined by bits I8 and I7 of the instruction as shown in Table 5.

TABLE 5.

I8   I7   Bytes
0    0    4
0    1    1
1    0    2
1    1    3

The third data type has operands of variable width (1 to 32 bits) as shown in Figure 4. The operand is specified by width inputs (W0-W4) and position inputs (P0-P5). The position inputs indicate the least significant bit position of the operand. Depending on bits I8 and I7 of the instruction, the width and position inputs can be selected from either the Status Register or the Width and Position Pins as shown in Table 6.

Figure 4. Data Types. Byte boundary aligned operands occupy 1, 2, 3 or 4 bytes starting at bit 0. A variable-length bit field occupies bits p+w-1 down to p of the 32-bit word, where p = bit displacement of the least significant bit of the field with respect to bit 0, and w = width of bit field.

TABLE 6.

Data Type   Size                       Signed                  Unsigned
Integer     1 byte (8 bits)            -128 to +127            0 to 255
            2 bytes (16 bits)          -2^15 to +2^15 - 1      0 to 2^16 - 1
            3 bytes (24 bits)          -2^23 to +2^23 - 1      0 to 2^24 - 1
            4 bytes (32 bits)          -2^31 to +2^31 - 1      0 to 2^32 - 1
BCD         1 to 4 bytes (8 digits)    Numeric, 2 digits per byte. Most-significant digit may be used for sign.
Variable    1 to 32 bits               Dependent on position and width inputs.

Instruction Format

The Am29332 has two types of instruction formats:

1. Byte Boundary Aligned Instructions (FORMAT 1):

   I8 I7 | I6 ... I0

2. Variable-Length Bit Field Instructions (FORMAT 2):

   I8 I7 | I6 ... I0

   Bits 10-6: WIDTH | Bits 5-0: POSITION

For instructions that allow a field to be shifted up or down, P0-P5 is a two's-complement number in the range -32 to +31 representing the direction and magnitude of the shift. For instructions that assume a fixed field position, P0-P4 represent the position of the least-significant bit of the field and P5 is ignored.


Instruction Classification

ALU instructions can be classified as follows:

A. Byte Boundary Aligned Operand Instructions:

1. Arithmetic
   - Binary, BCD
   - Multiply steps
   - Division steps (single and multiple precision)
2. Prioritize
3. Logical
4. Single-bit shifts
5. Data movement

B. Variable-Length Bit Field Operand Instructions:

1. N-bit shifts and rotates
2. Bit manipulations
3. Field logical operations (aligned, non-aligned, extract)
4. Mask generation

Three-fourths of the ALU instructions apply to operands that are byte boundary aligned. For these instructions, two orthogonal issues are the width of the operand (in bytes) and the contents of the high order unselected bytes on the Y bus. As mentioned earlier, the width of the operand is specified by I8 and I7. With the exception of a few instructions, the unselected bytes are assigned values as follows: for single operand instructions, unselected bytes are passed unchanged from the source (A or B). For two operand instructions, unselected bytes are passed unchanged from the destination (B input).

In the last quarter of the instruction set, the width of the operand is from 1 to 32 bits (based on the width input) for field operations, 32 bits for N-bit shift operations and 1 bit for bit-oriented operations. In the case of field-aligned and single-bit operands, the position bits (P0-P4) determine the least significant bit of the operand. In the case of N-bit shifts and field non-aligned operands, the position bits P0-P5 form a 6-bit signed integer determining the magnitude and direction of the shift.

Flags

Byte-Aligned Instructions

The zero flag always looks only at the selected bytes:

Z ← ((Y and bytemask(byte width)) = 0)

Similarly, N ← sign-bit(Y, byte width), where the function "sign-bit" returns bit 7, 15, 23, or 31 of the first argument for byte widths 01, 10, 11, or 00 respectively.

Also, C ← carry(byte width) returns the carry from the appropriate byte boundary, and:

V ← overflow(byte width) = (carry into MSB) ⊕ (carry out of MSB)

returns the overflow from the appropriate byte boundary.
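The flag definitions above can be checked with a short model of a byte-aligned addition. The function name and the returned dictionary are assumptions made for this sketch; the flag formulas follow the definitions in the text, evaluated at the boundary of the selected operand width.

```python
def byte_aligned_add_flags(a, b, carry_in, n_bytes):
    """Add the low n_bytes of a and b; return Y and the Z, N, C, V flags."""
    width = 8 * n_bytes
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    total = a + b + carry_in
    y = total & mask
    carry_out = (total >> width) & 1
    # Carry into the MSB: add everything below the MSB and look at bit width-1.
    low = mask >> 1
    carry_into_msb = (((a & low) + (b & low) + carry_in) >> (width - 1)) & 1
    return {
        "Y": y,
        "Z": int(y == 0),
        "N": (y >> (width - 1)) & 1,
        "C": carry_out,
        "V": carry_into_msb ^ carry_out,
    }


print(byte_aligned_add_flags(0x7F, 0x01, 0, 1))
print(byte_aligned_add_flags(0xFFFF, 0x0001, 0, 2))
```

The first call overflows a signed byte (127 + 1), so V and N are set without a carry; the second wraps an unsigned 16-bit value, so C and Z are set without overflow.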

The link (L) flag is generally loaded with the bit moved out of the highest selected byte in the case of upshifts, or the bit moved out of the least significant byte for downshifts. Figure 5 shows the shift operation using link bit. Other status flags have specialized uses, explained in the following sections.

Figure 5. Upshift/Downshift Using Link Bit (operand of 1, 2, 3, or 4 bytes; DF006190)

Variable-Length Field Instructions

Generally, only N and Z are affected. N takes the most-significant bit of the 32-bit result (i.e., N ← Y31). Z detects zeros in the selected field of the result (i.e., Z ← ((Y and bitmask(position, width)) = 0)).

Output Select

The Register Status pin, RS, may be used to switch the C, Z, N, V, and L output pins between the direct output of the ALU and the outputs of the corresponding bits in the status register. If the direct status output is selected, then for instructions that do not affect a particular flag (e.g., carry for logical arithmetic) that output will reflect the state of its corresponding bit in the status register. Similarly, when the HOLD signal is made HIGH, the C, Z, N, V and L pins will be made equal to the contents of the status register, regardless of the RS input.


INSTRUCTION SET SUMMARY

Operand Size: Variable Byte Width: 1, 2, 3, 4 Bytes

Arithmetic (Data Type: Binary Integer and BCD)
• Increment by one, two, four
• Decrement by one, two, four
• Add, addc (carry = macro/micro)
• Sub, subr
• Subc, subrc (carry/borrow)
• BCD sum and difference correct steps
• Negate (two's complement)

Arithmetic (Data Type: Binary Integer)
• Multiply steps (modified Booth) (signed and unsigned)
• Divide steps (non-restoring)

Single-Bit Shifts (single and double precision)
• Upshift with 0, 1, link fill
• Downshift with 0, 1, link, sign fill

Data Movement
• Zero extend
• Sign extend
• Pass-status, Q-Reg
• Load-status, Q-Reg
• Merge

Operand Size: 32 Bits

Shift/Rotate (Data Type: Binary)
• Upshift by 0 to 31 bits with 0 fill
• Downshift by 1 to 32 bits with 0, sign fill
• Rotate by 0 to 31 bits

Operand Size: Single Bit

Bit Manipulation (Data Type: Binary)
• Extract
• Set
• Reset

Operand Size: Variable Length Bitfield: 1 to 32 Bits

Field Logical (aligned and non-aligned)
• Not, OR, XOR, AND, extract, insert


TABLE 6-1. DATA MOVEMENT INSTRUCTIONS

Mnemonic     Code   Description
ZERO-EXTA    00     Zero Extend
SIGN-EXTA           Sign Extend
MERGEA-B            Merge A with B
MERGEB-A            Merge B with A
PASS-STAT           Pass Status Register
LDSTAT-A            Load Status Register
PASS-Q       06     Pass Q Register

Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Legend: Unsel = Unselected Byte(s) Sel = Selected Byte(s)

A =A Input

B =B Input

Q=Q Register

+ = Updated only if byte width is 3 or 4 * = Updated

Examples:

2, ZERO-EXTB   Pass lower two bytes of B to Y with zero fill on upper two bytes.

0, LOADQ-A     Load all four bytes of A into the Q Register; pass the updated Q Register to Y.


TABLE 7. LOGICAL INSTRUCTIONS

Mnemonic   Code   Description
NOT-A      08     One's Complement
NOT-B      09     One's Complement
XNOR
AND

Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Legend: Unsel = Unselected Byte(s) Sel = Selected Byte(s)

A=A Input

B =B Input

Q=Q Register * = Updated

Examples:

2, NOT-A   Complement low order two bytes of A and output to Y with high order two bytes of A uncomplemented.

1, AND     AND first byte of A and B. Output to Y with high three bytes of B.

TABLE 8-1. SINGLE-BIT SHIFT INSTRUCTIONS (SINGLE PRECISION)

Mnemonic    Code   Description             Unselected   Selected Bytes of Y
DN1-0F-A           Downshift, Zero Fill    A            Yi = Ai+1, Ymsb = 0
DN1-0F-B                                   B            Yi = Bi+1, Ymsb = 0
DN1-1F-A    24     Downshift, One Fill     A            Yi = Ai+1, Ymsb = 1
DN1-1F-B    25                             B            Yi = Bi+1, Ymsb = 1
DN1-LF-A    28     Downshift, Link Fill    A            Yi = Ai+1, Ymsb = L
DN1-LF-B    29                             B            Yi = Bi+1, Ymsb = L
DN1-AR-A    2C     Downshift, Sign Fill    A            Yi = Ai+1, Ymsb = N
DN1-AR-B    2D                             B            Yi = Bi+1, Ymsb = N
UP1-0F-A    30     Upshift, Zero Fill      A
UP1-0F-B    31                             B
UP1-1F-A    34     Upshift, One Fill       A            Yi = Ai-1, Y0 = 1
UP1-1F-B                                   B            Yi = Bi-1, Y0 = 1
UP1-LF-A    38     Upshift, Link Fill      A

Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Example: 2, UP1-1F-A   Shift lower two bytes of A up one bit. Set LSB to 1. Unselected bytes are filled with the upper two bytes of A.


TABLE 8-2. SINGLE-BIT SHIFT INSTRUCTIONS (DOUBLE PRECISION)

Mnemonic     Code   Description
DN1-1F-AQ    26     Downshift, One Fill
DN1-AR-AQ    2E     Downshift, Sign Fill
DN1-AR-BQ    2F
UP1-0F-AQ    32     Upshift, Zero Fill
UP1-0F-BQ    33
UP1-1F-AQ    36     Upshift, One Fill
UP1-1F-BQ    37
UP1-LF-AQ    3A     Upshift, Link Fill

Notes:
1. These instructions use the byte aligned instruction format (FORMAT 1).
2. Y unselected byte from A, Q unselected byte unchanged.
3. Y unselected byte from B, Q unselected byte unchanged.

Legend: Unsel = Unselected Byte(s) Sel = Selected Byte(s) A=A Input B=B Input Q=Q Register * = Updated

Examples:

0, DN1-AR-BQ   Shift 64 bits (all 32 bits of both B and Q) down by one bit. LSB of B fills MSB of Q. MSB of B is set to the sign bit (bit N of the status register).

3, UP1-LF-AQ   Shift 48 bits (24 bits of A and 24 bits of Q) up by one bit. MSB of the 24-bit Q fills LSB of A. MSB of the 24-bit A sets the link status bit. LSB of Q is filled with the original link value.


TABLE 9. PRIORITIZE INSTRUCTIONS

Mnemonic
PRIOR-A

Notes:
1. These instructions use the byte aligned instruction format (FORMAT 1).
2. Priority also loaded into STATUS <7:0>.
3. Refer to Table 4.

Legend: A = A Input B = B Input Q = Q Register * = Updated

Example: 3, PRIOR-A   Assume A is 01001011 00100010 00000000 00000000. Value placed on Y is 2.

TABLE 10-1. ARITHMETIC INSTRUCTIONS

Mnemonic   Code   Description
INCR-A     12     Increment by One
INCR-B
INCR2-A           Increment by Two
INCR2-B
INCR4-A           Increment by Four
INCR4-B
DECR-A     18     Decrement by One
DECR-B
DECR2-A           Decrement by Two
DECR2-B
DECR4-A           Decrement by Four
DECR4-B

Notes:
1. These instructions use the byte aligned instruction format (FORMAT 1).
2. Borrow, rather than carry, is generated if BOROW is HIGH (borrow = NOT carry).
3. Nibble bits are set by these instructions. NEG-A (or NEG-B) and DIFF-CORR may be used to form the 10's complement of a BCD number. Use SUM-CORR (for increment) or DIFF-CORR (for decrement) to increment or decrement a BCD number.

Legend: Unsel = Unselected Byte(s) Sel = Selected Byte(s) A= A Input B=B Input Q=Q Register * = Updated

Example: 2, DECR4-A   Decrement lower two bytes of A by 4.


TABLE 10-2. ARITHMETIC INSTRUCTIONS

Mnemonic      Description                            Y Output
ADD           Add
ADDC          Add with Carry
SUB           Subtract
SUBR
SUBC          Subtract with Carry
SUBRC
SUM-CORR-A    Correct BCD Nibbles for Addition       Corrected A
SUM-CORR-B    Correct BCD Nibbles for Addition       Corrected B
DIFF-CORR-A   Correct BCD Nibbles for Subtraction    Corrected A
DIFF-CORR-B   Correct BCD Nibbles for Subtraction    Corrected B

Notes:
1. These instructions use the byte aligned instruction format (FORMAT 1).
2. BOROW is LOW. For subtract operations, a borrow rather than a carry is stored in STATUS if BOROW is HIGH. Carry is always generated for ADD regardless of BOROW.
3. First, the nibble carries NC0-NC7 are tested. Any nibble carry/borrow that is set to 1 generates "6" internally as a correction word and then the correction word is added (SUM-CORR- ) or subtracted (DIFF-CORR- ) from the operand. NC0-NC7 are not affected by this operation.
4. Use SUM-CORR or DIFF-CORR to add or subtract a BCD number.
5. Use ADDC, SUBC, or SUBRC to perform operations on integers longer than 32 bits.
6. Carry bit is obtained from MCin if M/m is HIGH. Otherwise, carry is obtained from the C status bit.

Legend: Unsel = Unselected Byte(s) Sel = Selected Byte(s) A = A Input B = B Input Q = Q Register

* = Updated only if byte width is 3 or 4

Example:

0, ADD Add two 32-bit two's-complement integers
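Note 5 above mentions using ADDC for integers longer than 32 bits. The idea can be sketched as a 64-bit addition built from two 32-bit additions, with the carry of the low word fed into the high word. The function names `add32` and `add64` are hypothetical, and the model ignores the BOROW and M/m controls.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """One 32-bit add step: returns (sum, carry out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(a, b):
    """64-bit add from two 32-bit steps: ADD on the low words, ADDC on the high."""
    low, carry = add32(a, b)                       # low words, no carry in
    high, carry = add32(a >> 32, b >> 32, carry)   # high words plus saved carry
    return (high << 32) | low, carry


print(hex(add64(0x00000000FFFFFFFF, 1)[0]))
```

Because the carry between the two steps lives in the status register, the same scheme extends to any number of words by repeating the add-with-carry step.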


TABLE 11-1. DIVIDE INSTRUCTIONS (Aligned Format)

Mnemonic     Code   Description

Signed Divide Steps
SDIVFIRST    4E     First Instruction for Signed Divide
SDIVSTEP     50     Iterate Step (#bits - 1 times)

Unsigned Divide Steps
UDIVFIRST           First Instruction for Unsigned Divide
UDIVSTEP            Iterate Step (#bits - 1 times)

Multiprecision Divide Steps
MPUDIVSTP3

Used for Unsigned Divide Correction Steps
REMCORR      58     Correct Remainder After Divide
QUOCORR      59     Correct Quotient After Divide

Note: Divisor in A, Dividend in A, Quotient in Q, Remainder in B

Legend: A=A Input

B =B Input S = Status Register Q=Q Register

R1 = Quotient

R2 = Dividend

R3 = Remainder

R4 = Divisor


TABLE 12-1. MULTIPLY INSTRUCTIONS (Aligned Format)

Mnemonic     Description

Signed Multiply Steps
SMULFIRST    First multiply instruction
SMULSTEP     Iterate step (#bits/2 - 1 steps)

Unsigned Multiply Steps
UMULFIRST    First multiply instruction
UMULSTEP     Iterate step (#bits/2 - 1 steps)
UMULLAST     Last multiply instruction

TABLE 12-2. EXAMPLE CODING FORM (Unsigned Multiply)

Am29332 Instruction
UMULFIRST
UMULSTEP
UMULLAST
PASS-Q

Notes:
1. Put ALU output in B.
2. Multiplicand in A, Multiplier in Q; Product (HIGH) in B, Product (LOW) in Q.

Legend: A=A Input

B =B Input S = Status