
Comparing Motorola and Intel Math Coprocessors

Floating Point Coprocessors

The designer of any microprocessor would like to extend its

instruction set almost infinitely but is limited by the quantity of silicon

available (not to mention the problems of testability and complexity).

Consequently, a real microprocessor represents a compromise between what

is desirable and what is acceptable to the majority of the chip's users. For

example, the 68020 microprocessor is not optimized for applications that

require a large volume of scientific (i.e. floating point) calculations. One

method to significantly enhance the performance of such a microprocessor is

to add a coprocessor. Increasing the power of a microprocessor in this way

takes more than adding a few instructions to the instruction set: it involves

adding an auxiliary processor that works in parallel with the MPU (Micro

Processing Unit). A system involving concurrently operating processors can

be very complex, since there need to be dedicated communication paths

between the processors, as well as software to divide the tasks among them.

A practical multiprocessing system should be as simple as possible and

require a minimum overhead in terms of both hardware and software. There

are various techniques of arranging a coprocessor alongside a

microprocessor. One technique is to provide the coprocessor with an

instruction interpreter and program counter. Each instruction fetched from

memory is examined by both the MPU and the coprocessor. If it is an MPU

instruction, the MPU executes it; otherwise the coprocessor executes it. It

can be seen that this solution is feasible, but by no means simple, as it would

be difficult to keep the MPU and coprocessor in step. Another technique is

to equip the microprocessor with a special bus to communicate with the

external coprocessor. Whenever the microprocessor encounters an operation

that requires the intervention of the coprocessor, the special bus provides a

dedicated high-speed communication between the MPU and the coprocessor.

Once again, this solution is not simple. There are more methods of

connecting two (or more) concurrently operating processors, which will be

covered in more detail during the specific discussions of the Intel and

Motorola floating point coprocessors.

Motorola Floating Point Coprocessor (FPC) 68882

The designers of the 68000-family coprocessors decided to implement

coprocessors that could work with existing and future generations of

microprocessors with minimal hardware and software overhead. The actual

approach taken by the Motorola engineers was to tightly couple the

coprocessor to the host microprocessor and to treat the coprocessor as a

memory-mapped peripheral lying inside the CPU address space. In effect,

the MPU fetches instructions from memory, and, if an instruction is a

coprocessor instruction, the MPU passes it to the coprocessor by means of

the MPU's asynchronous data transfer bus. By adopting this approach, the

coprocessor does not have to fetch or interpret instructions itself. Thus if the

coprocessor requires data from memory, the MPU must fetch it. There are

advantages and disadvantages to this design. Most notably, the coprocessor

does not have to deal with, for example, bus errors, as all fetching is

performed by the host MPU. On the other hand, the FPC cannot act as a bus

master (it is a non-DMA device), which makes memory accesses by the FPC

slower than if it were directly connected to the address and data bus.

In order for the coprocessor to work as a memory-mapped device, the

designers of the 68000 series of MPUs had to set aside certain bit patterns

to represent opcodes for the FPC. In the 68000 family, the FPC is

accessed through opcodes beginning with 1111(2). This number is the same as 'F' in

hexadecimal notation, so this bit pattern is often referred to as the F-line.

Interface

The 68882 FPC employs a conventional asynchronous bus

interface like all 68000 class devices, and no new signals

are required to connect the unit to an MC 68020 MPU. The

68882 can be configured to run under a variety of different circumstances,

including various data bus widths and clock speeds.

[Figure: connections necessary to connect the 68882 to a 68020 or 68030

MPU using a 32-bit data path.]

As mentioned previously, all instructions for the FPC are of the F-line

format, that is, they begin with the bit pattern 1111(2). A generic coprocessor

instruction has the following format: the first four bits must be 1111. This

identifies the instruction as being for the coprocessor. The next three bits

identify the coprocessor type, followed by three bits representing the

instruction type. The meaning of the remaining bits varies depending on the

specific instruction.
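
The field layout just described can be illustrated with a short Python sketch. The function name and the sample words are assumptions chosen for illustration; only the bit positions (four F-line bits, three coprocessor type bits, three instruction type bits) come from the text above.

```python
def decode_f_line(word):
    """Split a 16-bit opcode word into the generic coprocessor fields.

    Returns None when the word is not an F-line instruction.
    """
    if (word >> 12) & 0xF != 0b1111:
        return None  # ordinary MPU instruction
    return {
        "cp_id": (word >> 9) & 0b111,  # coprocessor type, bits 11-9
        "type": (word >> 6) & 0b111,   # instruction type, bits 8-6
        "rest": word & 0x3F,           # meaning depends on the instruction
    }


# Sample words (assumed for illustration): 0xF200 starts with 1111,
# 0x4E71 does not and is therefore left to the MPU.
print(decode_f_line(0xF200))
print(decode_f_line(0x4E71))
```

The host only needs the top four bits to decide whether the coprocessor is involved at all; the remaining fields matter once the word has been recognised as an F-line instruction.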

Coprocessor Operation

When the MPU detects an F-line instruction, it writes the instruction

into the coprocessor's memory-mapped command register in CPU space.

Having sent a command to the coprocessor, the host processor reads the

reply from the coprocessor's response register. The response could, for

example, instruct the processor to fetch data from memory. Once the host

processor has complied with the demands from the coprocessor, it is free to

continue with instruction processing, that is, both the processor and

coprocessor act concurrently. This is why system speed can be dramatically

improved upon installation of a coprocessor.
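
The command and response dialogue described above can be sketched as a toy model. This is not the real 68882 register protocol: the class, the method names and the response strings are hypothetical, and the model assumes a command that needs exactly one operand from memory.

```python
class ToyCoprocessor:
    """Hypothetical stand-in for the coprocessor's command/response registers."""

    def __init__(self):
        self.pending = []   # responses waiting to be read by the host
        self.operands = []  # operands delivered by the host

    def write_command(self, command):
        # Assumption: every command asks for one operand, then releases the host.
        self.command = command
        self.pending = ["FETCH_OPERAND", "RELEASE"]

    def read_response(self):
        return self.pending.pop(0)

    def write_operand(self, value):
        self.operands.append(value)


def host_dispatch(opcode_word, command, cp, memory, address):
    """Model the host's side of the dialogue for one instruction."""
    if opcode_word >> 12 != 0b1111:
        return "executed by MPU"
    cp.write_command(command)
    while True:
        response = cp.read_response()
        if response == "FETCH_OPERAND":
            # The host, not the coprocessor, performs the memory access.
            cp.write_operand(memory[address])
        elif response == "RELEASE":
            # The host is free to continue with the next instruction.
            return "passed to coprocessor"


cp = ToyCoprocessor()
print(host_dispatch(0xF200, 0x0004, cp, {0x1000: 2.0}, 0x1000))
print(cp.operands)
```

The point of the sketch is the division of labour: the coprocessor never touches memory itself, it only asks the host to do so through its responses.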

MC 68882 Specifics

The MC 68882 floating point coprocessor is basically a very simple

device, though its data manual is nearly as thick as that of the MC 68000.

This complexity is due to the IEEE floating point arithmetic standards rather

than the nature of the FPC. The 68882 contains eight 80-bit floating point

data registers, FP0 to FP7, one 32-bit control register, FPCR, and one 32-bit

status register, FPSR. Because the FPC is memory-mapped in CPU space,

these registers appear to the programmer as an extension of the register

set of the host MPU. In addition to the standard byte, word and longword

operations, the FPC supports four new operand sizes: single precision real

(.S), double precision real (.D), extended precision real (.X) and packed

decimal string (.P). All on-chip calculations take place in extended precision

format and all floating point registers hold extended precision values. The

single real and double real formats are used to input and output operands.

All three real floating point formats comply with the corresponding IEEE

floating point number standards. The FPC has built-in functions to convert

between the various data formats added by the unit, for example a register

move with specified operand type (.P, .B, etc).
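
To make the difference between the three real formats concrete, the sketch below lists their IEEE field widths and shows the rounding that occurs when a value is stored in single precision. The helper names are assumptions; Python's own floats are double precision, so the extended format appears only in the table and is not computed with.

```python
import struct

# (sign bits, exponent bits, significand bits) for each real format.
# The extended format stores its leading integer bit explicitly.
FORMATS = {
    "single": (1, 8, 23),
    "double": (1, 11, 52),
    "extended": (1, 15, 64),
}


def total_bits(name):
    """Total width of a format in bits."""
    return sum(FORMATS[name])


def round_to_single(x):
    """Round a Python float (double precision) to single precision and back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


for name in FORMATS:
    print(name, total_bits(name))

# 0.5 is exactly representable in both formats; 0.1 is not, so storing it
# as a single precision value changes it.
print(round_to_single(0.5) == 0.5)
print(round_to_single(0.1) == 0.1)
```

Keeping intermediate results in the widest format and rounding only on output, as both coprocessors do, limits the accumulation of this kind of rounding error.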

The 68882 FPC has a substantial instruction set designed to satisfy

many number-crunching situations. All instructions native to the FPC start

with the bit pattern 1111(2) to show that the instruction deals with floating

point numbers. Some instructions supported by the FPC include FCOSH,

FETOX, FLOG2, FTENTOX, FADD, FMUL and FSQRT. There are many

more instructions available, but this selection demonstrates the versatility of

the 68882 unit.

One of the registers within the FPC is the status register. It is very

similar in function to the status register in a CPU; it is updated to show the

outcome of the most recently executed instruction. Flags within the status

register of the FPC include divide by zero, infinity, zero, overflow,

underflow and not a number. Some of the conditions signaled by the status

register of the FPC (for example divide by zero) require an exception routine

to be executed, so that the user is informed of the situation. These exception

routines are stored and executed within the host MPU. Because the host can

also test the flags of the FPC, the FPC can

be used to control loops and tests within user programs - further extending

the functionality of the coprocessor.

Intel Math Coprocessor 80387 DX

In many respects, the Intel 80387 math coprocessor (MCP) is very

similar to the MC 68882. Both designs were influenced by such factors as

cost, usability and performance. There are, however, subtle differences in

the designs of the two units.

I shall first discuss the similarities between the designs, followed by the

differences. Like the 68882, the 80387 requires no additional hardware to be

connected to an 80386. It is a non-DMA device, having no direct access to the

address bus of the motherboard. All memory and I/O is handled by the CPU,

which upon detection of an MCP instruction passes it along to the MCP. If

additional memory reads are necessary to load operands or data, the MCP

instructs the CPU to perform these actions. This design, although reducing

MCP performance when compared to a direct connection to the address bus,

significantly decreases complexity of the MCP as no separate address

decoding or error handling logic is necessary. The connection between the

CPU and the MCP is via a synchronous bus, while internal

operation of the MCP can run asynchronously (at a higher clock speed).

Moreover, the three functional units of the MCP can work in parallel to

increase system performance. The CPU can be transferring commands and

data to the MCP bus control logic while the MCP floating unit is executing

the current instruction. Similar to the 68882, the 80387 has a bit pattern

(11011(2)) reserved to identify instructions intended for it. Also, the

internal registers of the MCP are available to programmers through the

coprocessor instructions, as if they belonged to the CPU.

Internally, the 80387 contains three distinct units: the bus

control logic (BCL), the data interface and control unit and the actual

floating point unit. The data interface and control unit directs the data to the

instruction decoder. The instruction decoder decodes the ESC instructions

sent to it by the CPU and generates controls that direct the data flow in the

instruction buffer. It also triggers the microinstruction sequencer that

controls execution of each instruction. If the ESC instruction is FINIT,

FCLEX, FSTSW, FSTSW AX, or FSTCW, the control unit executes it

independently of the FPU and the sequencer. The data interface and control

unit is the unit that generates the BUSY#, PEREQ and ERROR# signals

that synchronize Intel 387 DX MCP activities with the Intel 80386 DX CPU.

It also supports the FPU in all operations that it cannot perform alone (e.g.

exception handling, transcendental operations, etc.).

The FPU executes all instructions that involve the register stack,

including arithmetic, logical, transcendental, constant, and data transfer

instructions. The data path in the FPU is 84 bits wide (68 significand bits, 15

exponent bits, and a sign bit) which allows internal operand transfers to be

performed at very high speeds.

Interface

The MCP is connected to the MPU via a synchronous connection,

while the numeric core can operate at a different clock speed, making it

asynchronous.

[Figure: synchronous CPU interface and asynchronous numeric core of the 80387.]

[Figure: connections necessary between the 80386 MPU and the 80387 MCP.]

A typical coprocessor instruction must begin with the bit pattern

11011(2) to identify the instruction for the coprocessor. The bus control logic

of the MCP (BCL) communicates solely with the CPU using I/O bus cycles.

The BCL appears to the CPU as a special peripheral device. It is special in

one important respect: the CPU uses reserved I/O addresses to communicate

with the BCL. The BCL does not communicate directly with memory. The

CPU performs all memory access, transferring input operands from memory

to the MCP and transferring outputs from the MCP to memory.
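
The recognition step can be sketched in a few lines of Python. The function name is an assumption, and the sketch looks only at a single opcode byte, ignoring any prefix bytes that may precede it in a real instruction stream.

```python
def is_coprocessor_opcode(first_byte):
    """Return True when the top five bits of the opcode byte are 11011."""
    return (first_byte >> 3) == 0b11011


# Every byte from 0xD8 to 0xDF carries the 11011 pattern.
print([hex(b) for b in range(256) if is_coprocessor_opcode(b)])
print(is_coprocessor_opcode(0x90))
```

Because only the top five bits are fixed, eight opcode bytes are reserved for the coprocessor, and the lower three bits help to select the actual operation.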

Coprocessor Operation

When the CPU detects the arrival of a coprocessor instruction, it

passes the instruction to the coprocessor through the reserved I/O

addresses described above. Having sent a command to the coprocessor, the host

processor monitors the coprocessor's BUSY#, PEREQ and ERROR# signals. The response

could, for example, instruct the processor to fetch data from memory. Once

the host processor has complied with the demands from the coprocessor, it is

free to continue with instruction processing, that is, both the processor and

coprocessor act concurrently. This is why system speed can be dramatically

improved upon installation of a coprocessor.

80387 Specifics

Just like the MC 68882 floating point coprocessor, the Intel 80387 is

basically a very simple device. Like any reasonable math coprocessor, it

conforms to the IEEE standards of floating point number representations.

The 80387 contains eight 82-bit floating point data registers (including a

2-bit tag field), R0 to R7, one 16-bit control register, one 16-bit status register

and a tag word (that contains the tag fields for the eight data registers). The

MCP also indirectly uses the 48-bit instruction and data pointer registers of

the 80386 host processor, even though these are external to the unit. Because

the MCP is tightly coupled to the host, these registers appear to the

programmer as an extension of the register set of the host CPU. In

addition to the standard word, short and long (16, 32 and 64-bit) integer

operations, the MCP supports four new operand sizes: single precision real,

double precision real, extended precision real and packed binary coded

decimal strings. All on-chip calculations take place in extended precision

format and all floating point registers hold extended precision values. The

single real and double real formats are used to input and output operands.

All three real floating point formats comply with the corresponding IEEE

floating point number standards. The MCP has built-in functions to convert

between the various data formats added by the unit.

The 80387 has a substantial instruction set designed to satisfy many

number-crunching situations. All instructions native to the MCP start with

the bit pattern 11011(2) to show that the instruction should be directed to the

coprocessor. Some (of the over 70) instructions supported by the MCP are

FCOMP, FDIV, FSQRT, FSINCOS, FINIT. There are many more

instructions available, but this selection demonstrates the versatility of the

80387 unit, which is very similar to that of the 68882 unit.

One of the registers within the MCP is the status register. Just like for

the 68882, the status register shows the outcome of the most recently

executed instruction. Flags within the status register of the MCP include

divide by zero, infinity, zero, overflow, underflow and invalid operation.

Some of the conditions signaled by the status register of the MCP (for

example divide by zero) require an exception routine to be executed by the

host MPU, so that the user is informed of the situation. These exception

routines are stored and executed within the host MPU. Because the host can

also read the status flags, the MCP can

again be used to control loops and tests within user programs - further

extending the functionality of the coprocessor.
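
As an illustration of how a program might inspect the status register, the sketch below decodes a 16-bit status word. The bit positions used (exception flags in bits 0 to 5, the TOP field in bits 11 to 13) follow the x87 status word layout as I understand it and are not stated in this paper, so treat them as an assumption; the function and flag names are hypothetical.

```python
# Exception flags in ascending bit order, starting at bit 0 (assumed layout).
EXCEPTION_FLAGS = [
    "invalid_operation",
    "denormal_operand",
    "zero_divide",
    "overflow",
    "underflow",
    "precision",
]


def decode_status_word(status_word):
    """Return the raised exception flags and the top-of-stack field."""
    raised = [
        name
        for bit, name in enumerate(EXCEPTION_FLAGS)
        if status_word & (1 << bit)
    ]
    top = (status_word >> 11) & 0b111  # TOP occupies three bits
    return {"exceptions": raised, "top": top}


print(decode_status_word(0x0004))  # only the zero divide flag is set
print(decode_status_word(0x3800))  # no exceptions, TOP is 7
```

A host program that stores the status word into a register can branch on such flags, which is how the coprocessor's results come to control loops and tests.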

The Intel 80387 DX MCP register set can be accessed either as a

stack, with instructions operating on the top one or two stack elements, or as

a fixed register set, with instructions operating on explicitly designated

registers. The TOP field in the status word identifies the current top-of-stack

register. A ``push'' operation decrements TOP by one and loads a value into

the new TOP register. A ``pop'' operation stores the value from the current

top register and then increments TOP by one. Like the 80386 DX

microprocessor stacks in memory, the MCP register stack grows ``down''

toward lower-addressed registers. Instructions may address the data registers

either implicitly or explicitly. The explicit register addressing is also relative

to TOP.
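
The push, pop and TOP-relative addressing rules above can be modelled with a short Python class. This is a simplified sketch: the class and method names are hypothetical, and the model ignores the stack overflow and underflow checks that real hardware performs.

```python
class RegisterStack:
    """Eight data registers addressed relative to a TOP pointer."""

    def __init__(self):
        self.regs = [None] * 8  # None marks an empty register
        self.top = 0

    def push(self, value):
        # A push decrements TOP (wrapping from 0 to 7), then loads the value.
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def pop(self):
        # A pop reads the current top register, then increments TOP.
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) % 8
        return value

    def st(self, i):
        # Explicit addressing is relative to TOP: st(0) is the top of stack.
        return self.regs[(self.top + i) % 8]


stack = RegisterStack()
stack.push(1.0)
stack.push(2.0)
print(stack.top, stack.st(0), stack.st(1))
```

After the two pushes TOP has moved from 0 to 7 and then to 6, so the stack has grown toward lower-numbered registers, and st(1) still reaches the value pushed first.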

A notable feature of the 80387 is the addition of a tag field of 2 bits to

each of the eight floating point registers. The tag word marks the content of

each numeric data register, as Figure 2.1 shows. Each two-bit tag represents

one of the eight numeric registers. The principal function of the tag word is

to optimize the MCP's performance and stack handling by making it possible

to distinguish between empty and nonempty register locations. It also

enables exception handlers to check the contents of a stack location without

the need to perform complex decoding of the actual data.
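
A sketch of how the tag word might be unpacked is given below. The paper refers to a figure for the tag values; the encoding used here (00 valid, 01 zero, 10 special, 11 empty, with register 0 in the lowest two bits) is the x87 convention as I understand it and should be read as an assumption, as should the function name.

```python
# Assumed meaning of each two-bit tag value.
TAG_NAMES = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}


def decode_tag_word(tag_word):
    """Return the tag of each of the eight registers, register 0 first."""
    return [TAG_NAMES[(tag_word >> (2 * i)) & 0b11] for i in range(8)]


print(decode_tag_word(0xFFFF))  # every register tagged empty
print(decode_tag_word(0xFFFC))  # register 0 valid, the rest empty
```

With the tags available, an exception handler can tell an empty register from one holding zero by inspecting two bits instead of examining the full 80-bit value.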

Evaluation of the Two Coprocessors

I started this paper thinking that the Motorola math coprocessor had to

be better in design, implementation and features than its Intel counterpart.

Throughout my research I came to realize that my opinions were based on

nothing but myths. In many respects the two coprocessors are very similar to

each other, while in other respects the coprocessors differ radically in design

and implementation. I will sum up the points I consider most important.

1. Intel uses a synchronous bus between the CPU and the MCP, while

the actual internal floating unit can run asynchronously to this.

This increases complexity of the design as synchronization logic

must exist between the two processors, but it allows the floating

point unit to run at a higher clock speed than the CPU upon

installation of a dedicated clock generator.

2. The (logical, not physical) addition of tag fields to the data

registers in the 80387 to signal certain conditions of the data

registers makes certain operations that support tags much faster, as

certain information does not need to be decoded as it is "cached" in

the tag fields.

3. The 80387 can use its registers either in stack mode or absolute

addressing mode. Though some operations require stack

addressing, this feature adds a little more flexibility to the MCP

(even though the stack operations might be a legacy from the 8087

or 80287).

In most other respects, the coprocessors are equals. They have the same

number of data registers, both make their own instruction set and registers

available to programmers in a transparent fashion and both support the same IEEE

numeric representation standards. Both coprocessors probably have similar

processing power at equal clock speed as well. Even though the Motorola

coprocessor seems to be superior by name, I have to admit that the 80387

gets my vote for more flexibility and thoughtful optimizations (tags).



About this resource

This coursework was submitted to us by a student in order to help you with your studies.



    If you use part of this page in your own work, you need to provide a citation, as follows:

    Essay UK, Comparing Motorola And Intel Math Coprocessors. Available from: <https://www.essay.uk.com/coursework/comparing-motorola-and-intel-math-coprocessors.php> [27-05-20].

