INTRODUCTION TO ASSEMBLY LANGUAGE PROGRAMMING

This topic does not discuss the general features of computers, microcomputers, addressing methods, or instruction sets; you should refer to An Introduction to Microcomputers for that information.

THE MEANING OF INSTRUCTIONS

The instruction set of a microprocessor is the set of binary inputs that produce defined actions during an instruction cycle. An instruction set is to a microprocessor what a function table is to a logic device, such as a gate, adder, or shift register. Of course, the actions that the microprocessor performs in response to its instruction inputs are far more complex than the actions that logic devices perform in response to their inputs.

An instruction is a binary bit pattern; it must be available at the data inputs to the microprocessor at the proper time in order to be interpreted as an instruction. For example, when the 6800 microprocessor receives the 8-bit binary pattern 01001111 as the input during an instruction fetch, the pattern means:

"Clear (put zero in) Accumulator A"

Similarly, the pattern 10000110 means:

"Load Accumulator A with the contents of the next word of program memory"

The microprocessor (like any other computer) only recognizes binary patterns as instructions or data; it does not recognize words or octal, decimal, or hexadecimal numbers.
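To illustrate that an instruction is nothing more than a bit pattern, here is a minimal Python sketch that looks up the two 6800 patterns quoted above. The names OPCODES and describe are hypothetical, and the table deliberately contains only those two patterns.

```python
# Hypothetical two-entry table: only the 6800 patterns quoted in the text.
OPCODES = {
    0b01001111: "Clear Accumulator A",
    0b10000110: "Load Accumulator A with the next word of program memory",
}

def describe(pattern: str) -> str:
    """Interpret an 8-character string of 0s and 1s as an instruction fetch."""
    value = int(pattern, 2)  # the processor only ever sees this number
    return OPCODES.get(value, "not in this two-entry table")

print(describe("01001111"))
print(describe("10000110"))
```

The same 8 bits arriving at any other time would simply be treated as data or as part of an address; nothing in the pattern itself marks it as an instruction.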

A COMPUTER PROGRAM

A program is a series of instructions that cause a computer to perform a particular task.

Actually, a computer program includes more than instructions; it also contains the data and memory addresses that the microprocessor needs to accomplish the tasks defined by the instructions. Clearly, if the microprocessor is to perform an addition, it must have two numbers to add and a place to put the result. The computer program must determine the sources of the data and the destination of the result as well as the operation to be performed.

Most microprocessors execute instructions sequentially unless one of the instructions changes the execution sequence or halts the computer; i.e., the processor gets the next instruction from the next higher memory address unless the current instruction specifically directs it to do otherwise.

Ultimately, every program is transformed into a set of binary numbers. For example, this is the 6800 program which adds the contents of memory locations 60₁₆ and 61₁₆ and places the result in memory location 62₁₆:

10010110

01100000

10011011

01100001

10010111

01100010

This is a machine language, or object, program. If this program were entered into the memory of a 6800-based microcomputer, the microcomputer would be able to execute it directly.
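The following Python sketch suggests how a processor steps through this object program, fetching each opcode and the address that follows it. It models only the three opcodes used here (96, 9B and 97 hexadecimal, the direct-addressing forms of load, add and store). The function name run, the load address 0 and the sample operand values are assumptions made for illustration.

```python
# Minimal model of a 6800-like processor; only three opcodes are handled.
LDAA_DIRECT = 0x96  # load Accumulator A from a direct address
ADDA_DIRECT = 0x9B  # add the contents of a direct address to Accumulator A
STAA_DIRECT = 0x97  # store Accumulator A at a direct address

def run(memory: list[int], start: int, count: int) -> int:
    """Execute `count` two-byte instructions from `start`; return Accumulator A."""
    pc = start
    a = 0
    for _ in range(count):
        opcode = memory[pc]        # first word: the instruction
        address = memory[pc + 1]   # second word: the operand address
        pc += 2                    # sequential execution: next higher address
        if opcode == LDAA_DIRECT:
            a = memory[address]
        elif opcode == ADDA_DIRECT:
            a = (a + memory[address]) & 0xFF  # 8-bit result, carry dropped
        elif opcode == STAA_DIRECT:
            memory[address] = a
        else:
            raise ValueError(f"opcode {opcode:02X} is not modeled in this sketch")
    return a

memory = [0] * 256
program = [0b10010110, 0b01100000, 0b10011011,
           0b01100001, 0b10010111, 0b01100010]
memory[0x00:0x06] = program  # load address 0 is an arbitrary choice
memory[0x60] = 0x12          # sample operands (hypothetical values)
memory[0x61] = 0x34
run(memory, start=0x00, count=3)
print(f"{memory[0x62]:02X}")
```

With the sample operands 12 and 34 hexadecimal, location 62 ends up holding 46 hexadecimal.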

THE PROGRAMMING PROBLEM

There are many difficulties associated with creating programs as object, or binary machine language, programs. These are some of the problems:

1) The programs are difficult to understand or debug (binary numbers all look the same, particularly after you have looked at them for a few hours).

2) The programs are slow to enter, since you must determine each bit individually.

3) The programs do not describe the task which you want the computer to perform in anything resembling a human-readable format.

4) The programs are long and tiresome to write.

5) The programmer often makes careless errors that are very difficult to find.

For example, the following version of the addition object program contains a single bit error. Try to find it:

10010110

01100000

10011011

01110001

10010111

01100010

Although the computer handles binary numbers with ease, people do not. People find binary programs long, tiresome, confusing, and meaningless. A programmer may eventually start remembering some of the binary codes, but such effort should be spent more productively.

USING OCTAL OR HEXADECIMAL

We can improve the situation somewhat by writing instructions as octal or hexadecimal, rather than binary, numbers. We will use hexadecimal numbers in this topic because they are shorter, and because they are the standard for the microprocessor industry. Table 1-1 defines the hexadecimal digits and their binary equivalents. The 6800 program to add two numbers now becomes:

96

60

9B

61

97

62

At the very least, the hexadecimal version is shorter to write and not quite so tiring to examine.

Errors are somewhat easier to find in a sequence of hexadecimal digits. The erroneous version of the addition program, in hexadecimal form, becomes:

96

60

9B

71

97

62

The mistake is easier to spot.
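The conversion and the comparison can both be mechanized. The Python sketch below converts the two binary listings to hexadecimal and reports where they differ. The helper names to_hex and find_bit_errors are hypothetical, and bit 0 is taken to be the least significant bit.

```python
correct = ["10010110", "01100000", "10011011",
           "01100001", "10010111", "01100010"]
faulty = ["10010110", "01100000", "10011011",
          "01110001", "10010111", "01100010"]

def to_hex(words):
    """Convert 8-bit binary strings to two-digit hexadecimal strings."""
    return [f"{int(word, 2):02X}" for word in words]

def find_bit_errors(good, bad):
    """Return (word index, bit number) pairs where the listings differ."""
    errors = []
    for index, (g, b) in enumerate(zip(good, bad)):
        difference = int(g, 2) ^ int(b, 2)  # 1 bits mark disagreements
        for bit in range(8):
            if difference & (1 << bit):
                errors.append((index, bit))
    return errors

print(to_hex(correct))
print(to_hex(faulty))
print(find_bit_errors(correct, faulty))
```

The only difference is in the fourth word (index 3), at bit 4: 61 has become 71.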

What do we do with this hexadecimal program? The microprocessor only understands binary instruction codes. The answer is that we must convert the hexadecimal numbers to binary numbers. This conversion is a repetitive, tiresome task. People who attempt it make all sorts of petty mistakes, such as looking at the wrong line, dropping a bit, or transposing a bit or a digit.

This repetitive, grueling task is, however, a perfect job for a computer. The computer never gets tired or bored and never makes silly mistakes. The idea, then, is to write a program which takes hexadecimal numbers, converts them into binary numbers, and places the binary numbers into the microcomputer memory. This is a standard program provided with many microprocessors; it is called a hexadecimal loader.
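The core of a hexadecimal loader is small. Below is a minimal Python sketch, not any manufacturer's loader: it assumes the input is a string of two-digit hexadecimal numbers separated by spaces or line breaks, and the name hex_loader and the use of a bytearray to stand in for memory are choices made for illustration.

```python
def hex_loader(text: str, memory: bytearray, start: int) -> int:
    """Store the bytes named by whitespace-separated hex numbers in memory.

    Returns the address following the last byte loaded.
    """
    address = start
    for token in text.split():
        value = int(token, 16)  # raises ValueError on a non-hex token
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{token} does not fit in one 8-bit word")
        memory[address] = value  # stored internally as a binary number
        address += 1
    return address

memory = bytearray(256)
end = hex_loader("96 60 9B 61 97 62", memory, start=0)
print(end, [f"{byte:08b}" for byte in memory[:end]])
```

A real loader would also have to read its input from a keyboard or tape and decide where in memory the program belongs; those details are omitted here.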

Is a hexadecimal loader worth having? If you are willing to write a program using binary numbers and are prepared to enter the program in its binary form into the computer, then you will not need the hexadecimal loader.

If you choose the hexadecimal loader, you will have to pay a price for it. The hexadecimal loader is itself a program that you must load into memory. Furthermore, the hexadecimal loader will occupy memory that you may want to use in some other way.

The basic tradeoff, therefore, is the cost and memory requirements of the hexadecimal loader versus the savings in programmer time.

A hexadecimal loader is well worth its small cost.

A hexadecimal loader certainly does not solve every programming problem. The hexadecimal version of the program is still difficult to read or understand; for example, it does not distinguish instructions from data or addresses, nor does the program listing provide any suggestion as to what the program does. What does 86 or 3F mean? Memorizing a card full of codes is hardly an appetizing proposition. Furthermore, the codes will be entirely different for a different microprocessor, and the program will require a large amount of documentation.

Table 1-1. Hexadecimal Conversion Table

Hexadecimal Digit    Binary Equivalent    Decimal Equivalent
0                    0000                 0
1                    0001                 1
2                    0010                 2
3                    0011                 3
4                    0100                 4
5                    0101                 5
6                    0110                 6
7                    0111                 7
8                    1000                 8
9                    1001                 9
A                    1010                 10
B                    1011                 11
C                    1100                 12
D                    1101                 13
E                    1110                 14
F                    1111                 15

INSTRUCTION CODE MNEMONICS

An obvious programming improvement is to assign a name to each instruction code. The instruction code name is called a mnemonic, or memory aid. The instruction mnemonic should describe in some way what the instruction does.

In fact, every microprocessor manufacturer (they cannot remember hexadecimal codes either) provides a set of mnemonics for the microprocessor instruction set. You do not have to abide by the manufacturer's mnemonics; there is nothing sacred about them. However, they are standard for a given microprocessor, and therefore understood by all users. These are the instruction names that you will find in manuals, cards, topics, articles, and programs. The problem with selecting instruction mnemonics is that not all instructions have "obvious" names. Some instructions do (e.g., ADD, AND, OR), others have obvious contractions (e.g., SUB for subtraction, XOR for exclusive-OR), while still others have neither. The result is such mnemonics as WMP, PCHL, and even SOB (try to guess what that means!). Most manufacturers come up with some reasonable names and some hopeless ones. However, users who devise their own mnemonics rarely seem to do much better than the manufacturer.

Along with the instruction mnemonics, the manufacturer will usually assign names to the CPU's registers. As with the instruction names, some register names are obvious (e.g., A for Accumulator) while others may have only historical significance. Again, we will use the manufacturer's suggestions simply to promote standardization.

If we use standard 6800 instruction and register mnemonics, as defined by Motorola, our 6800 addition program becomes:

LDAA

60

ADDA

61

STAA

62

The program is still far from obvious, but at least some parts of it are comprehensible. ADDA is a considerable improvement over 9B; LDAA and STAA do suggest loading and storing, respectively. We now know which lines are instructions and which are data or addresses. Such a program is an assembly language program.

THE ASSEMBLER PROGRAM

How do we get the assembly language program into the computer? We have to translate it, either into hexadecimal or into binary numbers. You can translate an assembly language program by hand, instruction by instruction. This is called hand assembly.

Hand assembly of a three-instruction sequence may be illustrated as follows:

Instruction Name    Addressing Method    Hexadecimal Equivalent
LDAA                direct               96
ADDA                direct               9B
STAA                direct               97

As in the case of hexadecimal to binary conversion, hand assembly is a rote task which is uninteresting, repetitive, and subject to numerous minor errors. Picking the wrong line, transposing digits, omitting instructions, and misreading the codes are only a few of the mistakes that you may make. Most microprocessors complicate the task even further by having instructions with different word lengths. Some instructions are one word long, while others are two or three words long. Some instructions require data in the second and third words; others require memory addresses, register numbers, or other information.

Assembly is another rote task that we can assign to the microcomputer. The microcomputer never makes any mistakes when translating codes; it always knows how many words and what format each instruction requires. The program that does this job is an Assembler. The Assembler program translates a user program, or source program written with mnemonics, into a machine language program, or object program, which the microcomputer can execute. The Assembler's input is a source program, and its output is an object program.
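At its simplest, an assembler is a table lookup. The Python sketch below handles only the three direct-addressing instructions from the hand-assembly table. The input format (a mnemonic followed by a hexadecimal address on each line) is a simplification assumed for this sketch and is not Motorola's assembler syntax; the names DIRECT_OPCODES and assemble are hypothetical.

```python
# Opcodes for direct addressing, taken from the hand-assembly table.
DIRECT_OPCODES = {"LDAA": 0x96, "ADDA": 0x9B, "STAA": 0x97}

def assemble(source: str) -> list[int]:
    """Translate lines of 'MNEMONIC hex-address' into object code words."""
    object_code = []
    for line in source.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        mnemonic, operand = fields  # fails if a line is not exactly two fields
        object_code.append(DIRECT_OPCODES[mnemonic])  # KeyError if unknown
        object_code.append(int(operand, 16))
    return object_code

source_program = """
LDAA 60
ADDA 61
STAA 62
"""
print([f"{word:02X}" for word in assemble(source_program)])
```

The output is the same six-word object program shown earlier in hexadecimal. A practical assembler must also handle instructions of differing lengths, other addressing methods, and error reporting.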

The tradeoffs that we discussed in connection with the hexadecimal loader are magnified in the case of the Assembler. Assemblers are more expensive, occupy more memory, and require more peripherals and execution time than do hexadecimal loaders. While users may (and often do) write their own loaders, few care to write their own assemblers.

Assemblers have their own rules that you must learn to abide by. These include the use of certain markers (such as spaces, commas, semicolons, or colons) in appropriate places, correct spelling, the proper control information, and perhaps even the correct placement of names and numbers. These rules typically are a minor hindrance that can be quickly overcome.

ADDITIONAL FEATURES OF ASSEMBLERS

Early assembler programs did little more than translate the mnemonic names of instructions and registers into their binary equivalents. However, most assemblers now provide such additional features as:

1) Allowing the user to assign names to memory locations, input and output devices, and even sequences of instructions.

2) Converting data or addresses from various number systems (e.g., decimal or hexadecimal) to binary and converting characters into their ASCII or EBCDIC binary codes.

3) Performing some arithmetic as part of the assembly process.

4) Telling the loader program where in memory parts of the program or data should be placed.

5) Allowing the user to assign areas of memory as temporary data storage and to place fixed data in areas of program memory.

6) Providing the information required to include standard programs from program libraries, or programs written at some other time, in the current program.

7) Allowing the user to control the format of the program listing and the input and output devices employed.

All of these features, of course, involve additional cost and memory. Microcomputers generally have much simpler assemblers than do larger computers, but the tendency always is for the size of assemblers to increase. You will often have a choice of assemblers. The important criterion is not how many off-beat features the Assembler has, but rather how convenient it is to use in normal practice.
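The first three features in the list above (names for memory locations, number-system conversion, and assembly-time arithmetic) can be sketched in a few lines of Python. The notation is an assumption made for this sketch: a leading $ marks hexadecimal, a leading % marks binary, a character in single quotes stands for its ASCII code, and anything else is decimal. The names parse_number, evaluate and NUMB1 are hypothetical.

```python
def parse_number(text: str) -> int:
    """Convert a constant written in the assumed notation to an integer."""
    if text.startswith("$"):
        return int(text[1:], 16)   # hexadecimal
    if text.startswith("%"):
        return int(text[1:], 2)    # binary
    if len(text) == 3 and text[0] == text[2] == "'":
        return ord(text[1])        # ASCII code of a quoted character
    return int(text, 10)           # decimal

def evaluate(operand: str, symbols: dict[str, int]) -> int:
    """Evaluate a name, a constant, or a sum of them such as NUMB1+2."""
    total = 0
    for term in operand.split("+"):
        term = term.strip()
        total += symbols[term] if term in symbols else parse_number(term)
    return total

symbols = {"NUMB1": 0x60}  # a user-assigned name for memory location 60 hex
print(f"{evaluate('NUMB1+2', symbols):02X}")
```

Here NUMB1+2 evaluates to 62 hexadecimal, so the programmer can refer to the result location relative to a named operand instead of writing out its address.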

DISADVANTAGES OF ASSEMBLY LANGUAGE

The Assembler, like the hexadecimal loader, does not solve all the problems of programming. One problem is the tremendous gap between the microcomputer instruction set and the tasks which the microcomputer is to perform. Computer instructions tend to do things like add the contents of two registers, shift the contents of the Accumulator one bit, or place a new value in the Program Counter. On the other hand, a user generally wants a microcomputer to do something like check if an analog reading has exceeded a threshold, look for and react to a particular command from a teletypewriter, or activate a relay at the proper time. An assembly language programmer must translate such tasks into a sequence of simple computer instructions. The translation can be a difficult, time-consuming job.

Furthermore, if you are programming in assembly language, you must have detailed knowledge of the particular microcomputer that you are using. You must know what registers and instructions the microcomputer has, precisely how the instructions affect the various registers, what addressing methods the computer uses, and a myriad of other information. None of this information is relevant to the task which the microcomputer must ultimately perform.

In addition, assembly language programs are not portable. Each microcomputer has its own assembly language, which reflects its own architecture. An assembly language program written for the 6800 will not run on the 8080, the F8, or the PACE. For example, the addition program written for the 8080 would be:

LDA     60H

MOV     B,A

LDA     61H

ADD     B

STA     62H

The lack of portability not only means that you will not be able to use your assembly language program on another microcomputer; it also means that you will not be able to use any programs that were not specifically written for the microcomputer you are using. This is a particular drawback for microcomputers, since these devices are new and few assembly language programs exist for them. The result, too frequently, is that you are on your own. If you need a program to perform a particular task, you are not likely to find it in the small program libraries that most manufacturers provide. Nor are you likely to find it in an archive, journal article, or someone's old program file. You will probably have to write it yourself.

HIGH-LEVEL LANGUAGES

The solution to many of the difficulties associated with assembly language programs is to use, instead, "high-level" or "procedure-oriented" languages. Such languages allow you to describe tasks in forms that are problem-oriented rather than computer-oriented. Each statement in a high-level language performs a recognizable function; it will generally correspond to many assembly language instructions. A program called a Compiler translates the high-level language source program into object-code or machine-language instructions.

Many different high-level languages exist for different types of tasks. If, for example, you can express what you want the computer to do in algebraic notation, you can write your program in FORTRAN (FORmula TRANslation language), the oldest and most widely used of the high-level languages. Now, if you want to add two numbers, you just tell the computer:

SUM = NUMB1 + NUMB2

That is a lot simpler (and a lot shorter) than either the equivalent machine language program or the equivalent assembly language program. Other high-level languages include COBOL (for business applications), ALGOL and PASCAL (other algebraic languages), PL/1 (a combination of FORTRAN, ALGOL and COBOL), and APL and BASIC (languages that are popular for time-sharing systems).

ADVANTAGES OF HIGH-LEVEL LANGUAGES

Clearly, high-level languages make programs easier and faster to write. A common estimate is that a programmer can write a program about ten times as fast in a high-level language as compared to assembly language. That is just writing the program; it does not include problem definition, program design, debugging, testing, or documentation, all of which become simpler and faster. The high-level language program is, for instance, partly self-documenting. Even if you do not know FORTRAN, you probably could tell what the statement illustrated above does.

High-level languages solve many other problems associated with assembly language programming. The high-level language has its own syntax (usually defined by a national or international standard). The language does not mention the instruction set, registers, or other features of a particular computer. The compiler takes care of all such details. Programmers can concentrate on their own tasks; they do not need a detailed understanding of the underlying CPU architecture. For that matter, they do not need to know anything about the computer they are programming.

Programs written in a high-level language are portable, at least in theory. They will run on any computer that has a standard compiler for that language.

At the same time, all previous programs written in a high-level language for prior computers are available to you when programming a new computer. This can mean thousands of programs in the case of a common language like FORTRAN or BASIC.

DISADVANTAGES OF HIGH-LEVEL LANGUAGES

Well, if all the good things we have said about high-level languages are true, if you can write programs faster and make them portable besides, why bother with assembly languages? Who wants to worry about registers, instruction codes, mnemonics, and all that garbage? As usual, there are disadvantages that balance the advantages.

One obvious problem is that you have to learn the "rules" or "syntax" of any high-level language you want to use. A high-level language has a fairly complicated set of rules. You will find that it takes a lot of time just to get a program that is syntactically correct (and even then it probably will not do what you want). A high-level computer language is like a foreign language. If you have a little talent, you will get used to the rules and be able to turn out programs that the compiler will accept. Still, learning the rules and trying to get the program accepted by the compiler do not contribute directly to doing your job.

Here, for example, are some FORTRAN rules:

- Labels must consist entirely of numbers and must be placed in the first five card columns

- Statements must start in column seven

- Integer variables must start with the letters I, J, K, L, M or N

Another obvious problem is that you need a compiler to translate programs written in a high-level language to machine language. Compilers are expensive and use a large amount of memory. While most assemblers occupy 2K to 16K bytes of memory (1K = 1024), compilers usually occupy much more memory. So, the amount of overhead involved in using the compiler is rather large.

Furthermore, only some compilers will make the implementation of your task simpler. FORTRAN, for example, is well-suited to problems that can be expressed as algebraic formulas. If, however, your problem is controlling a printer, editing a string of characters, or monitoring an alarm system, your problem cannot be easily expressed in algebraic notation. In fact, formulating the solution in algebraic notation may be more awkward and more difficult than formulating it in assembly language. The answer is, of course, to use a more suitable high-level language. Some such languages exist, but they are far less widely used and standardized than FORTRAN. You will not get many of the advantages of high-level languages if you use these so-called system implementation languages.

High-level languages often do not produce very efficient machine language programs. The basic reason for this is that compilation is an automatic process which is riddled with compromises to allow for many possibilities. The compiler works much like a computerized language translator: sometimes the words are right, but the sounds and sentence structures are awkward. A simple compiler cannot know when a variable is no longer being used and can be discarded, when a register should be used rather than a memory location, or when variables have simple relationships. The experienced programmer can take advantage of shortcuts to shorten execution time or reduce memory usage. A few compilers (known as optimizing compilers) can also do this, but such compilers are much larger and slower than regular compilers.

The general advantages and disadvantages of high-level languages are:

Advantages:

- More convenient descriptions of tasks

- Greater programmer productivity

- Easier documentation

- Standard syntax

- Independence of the structure of a particular computer

- Portability

- Availability of library and other programs

Disadvantages:

- Special rules

- Extensive hardware and software support required

- Orientation of common languages to algebraic or business problems

- Inefficient programs

- Difficulty of optimizing code to meet time and memory requirements

- Inability to use special features of a computer conveniently

HIGH-LEVEL LANGUAGES FOR MICROPROCESSORS

Microprocessor users will encounter several special difficulties when using high-level languages. Among these are:

- Few high-level languages exist for microprocessors

- No standard languages are widely available

- Few compilers actually run on microcomputers. Those that do often require very large amounts of memory

- Most microprocessor applications are not well-suited to high-level languages

- Memory costs are often critical in microprocessor applications

The lack of high-level languages is partly a result of the fact that microprocessors are quite new and are the products of semiconductor manufacturers rather than computer manufacturers. Very few high-level languages exist for microprocessors. The most common are the PL/1-type languages such as Intel's PL/M, Motorola's MPL, and Signetics' PLμS.

Even the few high-level languages that exist do not conform to recognized standards, so the microprocessor user cannot expect to gain much program portability, access to program libraries, or use of previous experience or programs. The main advantages remaining are the reduction in programming effort and the smaller amount of detailed understanding of the computer architecture that is necessary.

The overhead involved in using a high-level language with microprocessors is considerable. Microprocessors themselves are better suited to control and slow interactive applications than they are to the character manipulation and language analysis involved in compilation. Therefore, most compilers for microprocessors will not run on a microprocessor-based system. Instead, they require a much larger computer; i.e., they are cross-compilers rather than self-compilers. A user must not only bear the expense of the larger computer, he must also physically transfer the program from the larger computer to the microprocessor-based computer.

A few self-compilers are available. These compilers run on the microcomputer for which they produce object code. Unfortunately, they require large amounts of memory (16K or more), plus special supporting hardware and software.

High-level languages also are not generally well-suited to microprocessor applications. Most of the common languages were devised either to help solve scientific problems or to handle large-scale business data processing. Few microprocessor applications fall in either of these areas. Most microprocessor applications involve sending data and control information to output devices and receiving data and status information from input devices. Often the control and status information consists of a few binary digits with very precise hardware-related meanings. If you try to write a typical control program in a high-level language, you often feel like someone who is trying to eat soup with chopsticks. For tasks in such areas as test equipment, terminals, navigation systems, and business equipment, the high-level languages work much better than they do in instrumentation, communications, peripherals, and automotive applications.

Applications better suited to high-level languages are those which require large memories. If the cost of a single memory chip is important, as in a valve controller, electronic game, appliance controller, or small instrument, then the inefficiency of high-level languages is intolerable. If, on the other hand, the system has many thousands of bytes of memory anyway, as in a terminal or test equipment, the inefficiency of high-level languages is not as important. Clearly, the size of the program and the volume of the product are important factors as well. A large program will greatly increase the advantages of high-level languages. On the other hand, a high-volume application will mean that fixed software development costs are not as important as memory costs that are part of each system.
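The volume argument can be made concrete with a little arithmetic. In the Python sketch below, every figure is hypothetical and chosen only to show the shape of the tradeoff: assembly language is assumed to cost more to develop but to need less memory in each unit built. The names total_cost and cheaper are likewise invented for this illustration.

```python
def total_cost(development_cost: float, memory_cost_per_unit: float,
               units: int) -> float:
    """Fixed software cost plus the memory cost repeated in every unit."""
    return development_cost + memory_cost_per_unit * units

# Hypothetical figures, for illustration only.
assembly = {"development_cost": 40_000, "memory_cost_per_unit": 5}
high_level = {"development_cost": 10_000, "memory_cost_per_unit": 15}

def cheaper(units: int) -> str:
    """Name the approach with the lower total cost at this volume."""
    a = total_cost(units=units, **assembly)
    h = total_cost(units=units, **high_level)
    return "assembly" if a < h else "high-level"

for units in (100, 10_000):
    print(units, cheaper(units))
```

With these made-up numbers the two approaches break even at 3,000 units: below that volume the lower development cost dominates, and above it the per-unit memory saving does.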

WHICH LEVEL SHOULD YOU USE?

That depends on your particular application. Let us briefly note some of the factors which may favor particular levels:

Machine Language

- Virtually no one programs in machine language. Its use cannot be justified considering the low cost of an assembler.

Assembly Language

- Small to moderate-sized programs

- Applications where memory cost is a factor

- Real-time control applications

- Limited data processing

- High-volume applications

- More input/output or control than computation

High-Level Languages

- Large programs

- Low-volume applications requiring long programs

- Applications requiring large memories

- More computation than input/output or control

- Compatibility with similar applications using larger computers

- Availability of specific programs in a high-level language which can be used in the application

Many other factors are also important, such as the availability of a large computer for use in development, experience with particular languages, and compatibility with other applications.

If hardware will ultimately be the largest cost in your application or if speed is critical, you should favor assembly language. But be prepared to spend extra time in software development in exchange for lower memory costs and higher execution speeds. If software will be the largest cost in your application, you should favor a high-level language. But be prepared to spend the extra money required for the supporting hardware and software.

Of course, no one except some theorists will object if you use both assembly and high-level languages. You can write the program originally in a high-level language and then patch some sections in assembly language. However, most users prefer not to do this because of the havoc it creates in debugging, testing and documentation.

HOW ABOUT THE FUTURE?

We expect that the future will tend to favor high-level languages, for the following reasons:

- Programs always seem to accumulate extra features and grow larger

- Hardware and memory are becoming less expensive

- Software and programmers are becoming more expensive

- Memory chips are becoming available in larger sizes at lower "per bit" cost, so actual savings in memory cost are less likely

- More suitable and more efficient high-level languages are being developed

- More standardization of high-level languages will occur

Assembly language programming of microprocessors will not be a dying art any more than it is now for large computers. But longer programs, cheaper memory, and more expensive programmers will make software costs a larger part of most applications. The edge in many applications will therefore go to high-level languages.

WHY THIS TOPIC?

If the future would seem to favor high-level languages, why have a topic on assembly language programming? The reasons are:

1) Most current microcomputer users program in assembly language (almost 2/3, according to one recent survey).

2) Many microcomputer users will continue to program in assembly language, since they need the detailed control that it provides.

3) No suitable high-level language has yet become widely available or standardized.

4) Many applications require the efficiency of assembly language.

5) An understanding of assembly language can help in evaluating high-level languages.

The rest of this topic will deal exclusively with assemblers and assembly language programming. However, we do want readers to know that assembly language is not the only alternative. You should watch for new developments that may significantly reduce programming costs, if such costs are a major factor in your application.
