Chapter 1: Computers and Their Languages
At the end of this Chapter, you will be able to describe the structure of the typical computer and to outline the various levels of language which can be used to program it.
To enable you to do this, you must be able to:
- a. List an advantage of structured program design.
- b. Describe the function of the CPU.
- c. State the function of ROM.
- d. State the meaning of the term Firmware.
- e. State the function of RAM.
- f. Describe the meaning of the term "Bus".
- g. Outline how input and output is enabled.
- h. Describe how a hexadecimal code is derived from a binary string.
- j. Explain the term "Assembly Language".
- k. Explain what is meant by a High Level Language.
- l. Outline the advantages and disadvantages of Interpreters v Compilers.
1.0 Welcome to 'The Utter Novice Guide to GW-BASIC' and the world of computer languages. For those of you with no programming experience behind you - relax! You will find that constructing your own programs will become a fascinating and enjoyable part of your life. It is certainly not the intensely mathematical practice of popular rumour and I shall be avoiding that particular black art wherever possible. Before we go on, however, let me climb up on to my soap box for a moment and expound on a subject dear to my heart - structured programming.
1.1 Structured Programming. Over the past few years, I have seen many 'Teach Yourself SPLODGE 3' types of book come and go. The author leads the acolyte through the mysteries of the language and, at the end, the reader finds himself able to construct his own programs in SPLODGE 3. Does this mean he can write GOOD programs? Not at all. If YOU are to learn to program, you might as well do it properly and by that I mean designing your programs using some sort of methodology before you actually sit down to turn them into BASIC. A good program should perform its specified function correctly and be reliable, simple, maintainable and capable of being tested. For these reasons, I will take you through a design methodology before diving into the depths of GW-BASIC. We will learn "Structured Design Methodology" (SDM), also known as 'Top down' to its friends. There are other methodologies around but I consider this one almost tailor made for GW-BASIC, and it will encourage you to use good programming practices. Don't let the thought of learning SDM put you off - it is extremely easy to learn.
1.2 Format of the Book. Let me turn your attention now to the way in which the book is arranged. Look at the top of this Chapter, where you will find 2 sections: the Training Objectives (TOs) and the Enabling Objectives (EOs). The TOs state the skill(s) you are expected to have acquired by the end of the Chapter. This gives you a specific goal to aim for and the Chapter is designed to meet that goal. However, the TOs give the high level picture. There are many sub skills and items of knowledge which must be mastered to enable you to meet the TOs. These, of course, are the EOs. The Chapter itself is arranged in paragraphs. Each has a number beginning with the Chapter number, a decimal point and then the paragraph number itself. I will make use of these paragraph numbers as we go through the book. They give a very precise way of specifying an exact location for any references. Now scroll down to the end of this Chapter. There you will see the Post Test. The function of this test is to enable you to check whether you have met the EOs and, through them, the TOs. The answers to the questions are to be found in Annex A. Have a look at Annex A. Note that, at the end of each answer, there is a number in brackets. This refers to the paragraph in which the subject of that particular question was introduced. If you give a wrong answer to any of the questions, you should refer back to the paragraphs and study the material again. Re-attempt those questions you got wrong when you are confident that you have the subject under your belt. I cannot stress too strongly that you must religiously attempt the Post Tests. They are your best way of ensuring that you have mastered the subject matter sufficiently to meet the TOs and you must meet the TOs if you are to be in a position to go on to the next Chapter. This is because each Chapter is designed to build upon the material introduced in the previous one.
It is assumed, therefore, that you are in complete mastery of the material before that next area is started. It is no good thinking that any little niggling doubts, or one wrong answer which you leave unclarified, will become clearer later on. A small problem now becomes a big problem soon. More than in most walks of life, Murphy's Law reigns supreme in the 'Land of the Silicon Chip'. So there we are: the format of the book is TO/EO, subject matter arranged in paragraphs and the Post Test. Let's now turn our attention to the main subject of this Chapter - Computers and their Languages. To put GW-BASIC into context, it is necessary to look at the structure of the 'typical computer'. Those of you who feel sufficiently familiar with this area can skip over and go on to the next section of the Chapter but beware the TOs!
THE STRUCTURE OF A TYPICAL COMPUTER
1.3 'Silicon Fred'. Of course, there is no such thing as a 'typical' computer but we can generalise things to illustrate the main components. Quite often, this particular subject rather puts people off. It sounds complicated and technological. However, in essence, the functions of the various parts of a computer are simple enough to understand as long as you have something to hang your hat on. We will draw parallels between the various elements of the computer and their 'equivalents' in the human body. To highlight this comparison, let me introduce you to your electronic counterpart - Silicon Fred.
1.4 Fred's Thinking Powers. I am assured that it is a medical fact that most of us have brains. What do these organs do for us? Well, they do the thinking and they control our bodies. How does this equate to Fred? In all computers there is a very important chip which performs the same sort of functions - the Central Processing Unit (CPU). It controls all the other chips on the computer board and does the arithmetic. Surprisingly, though, its mathematical genius is somewhat limited. Many simple CPUs can really only add and subtract whole numbers less than 256 in a single step. They are also quite good at moving bits of memory around and performing simple logical operations. How, then, does the PC do all the powerful file and number handling? The answer lies in the fact that CPUs can do these simple things very fast indeed. So, complex things can be simulated by splitting them into a number of smaller operations. For example, multiplication can be simulated by repeatedly adding. If this is done fast enough, who is to know the difference?
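The idea of simulating multiplication by repeated addition can be sketched in a few lines. This is only an illustration, written in Python rather than GW-BASIC, and the function name multiply_by_adding is made up for the purpose. It assumes non-negative whole numbers.

```python
def multiply_by_adding(a, b):
    """Multiply two non-negative whole numbers using nothing but addition."""
    total = 0
    for _ in range(b):       # add 'a' to the running total, 'b' times over
        total = total + a
    return total


print(multiply_by_adding(6, 7))  # prints 42
```

A real CPU does the same sort of thing in its own instructions, only very much faster.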
1.5 How Does Fred Remember Things? Despite the speed of Fred's thinking powers, it would all come to nothing if he were not able to remember anything - partial totals, where to put the answer, and so on. He obviously needs his equivalent of our memory. But it is important to note that there are 2 broad classes of memory. Let's look at another type of animal - the duck. When a duckling hatches out of its egg, it can paddle after mum almost immediately and feed itself. How does it do this? No one has taught it these skills. The answer, of course, is instinct. Human beings, on the other hand, have become rather distant from instinct and only the important ones remain, such as the instinct for self preservation, the eye blink reflex and keeping a beer glass level. When you switch Fred on, he knows what type of a computer he is and how to do an immediate check of his memory to ensure that nothing is wrong. He knows how to do maths and store things away. All these things are the result of in-built programs which are stored in an un-erasable form in his memory. They are not capable of being altered by him or you because they are important 'instincts' which must not be tampered with (can you learn to stop the eye blink reflex?). This type of program is called firmware, since it is written and installed into the machine by the manufacturer. The type of memory into which it is placed is called Read Only Memory (ROM). In other words, you cannot write anything into it. An important point to note is that this type of memory retains its contents even when power to the machine is removed. This sort of thing is important for humans, too, or we would forget to breathe when we fall asleep (as happens to many unfortunates who have this 'routine' missing from their brains in real life). A further point worth noting is that a computer's abilities are very much a combination of the native abilities of the CPU and the routines provided in the Firmware. 
You can take a CPU from one kind of computer, add new Firmware and the result is a totally different machine. For example, remember the Sinclair QL? Its CPU was originally designed to control washing machines. Remember the programmable toy called 'Big Trak'? Its CPU started life controlling the Exocet missile! It's all a question of Firmware and suitable peripheral equipment. Of course, while 'instinct' is very important, it is no good if we can't store anything. Fred would be unable to learn anything, since learning implies storing for future reference. Fred therefore needs something like our ordinary memory. He has a number of chips called Random Access Memory (RAM) which means that he can store, retrieve, alter or delete any of the items held in RAM, at random.
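The difference between the 2 classes of memory can be imitated in software. The following is a minimal sketch in Python, not a description of any real chip. The names ROM, RAM, write_ram and write_rom and the 'firmware' bytes are invented for illustration. It leans on the fact that a Python bytes object cannot be altered once made, while a bytearray can.

```python
# Hypothetical 'firmware': a few made-up bytes fixed by the manufacturer.
ROM = bytes([0x4C, 0x00, 0x80])

# 16 cells of ordinary memory, each able to hold a number from 0 to 255.
RAM = bytearray(16)


def write_ram(address, value):
    """Store a value in RAM; any cell can be altered at any time."""
    RAM[address] = value


def write_rom(address, value):
    """Try to store a value in ROM; report whether it worked."""
    try:
        ROM[address] = value      # bytes objects refuse to be changed
    except TypeError:
        return False
    return True


write_ram(3, 250)
print(RAM[3])             # prints 250
print(write_rom(0, 1))    # prints False
print(ROM[0])             # prints 76 (0x4C, unchanged)
```

What the sketch cannot show is the other key point of the paragraph: real ROM keeps its contents when the power is removed.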
1.6 Fred's Nervous System. I mentioned earlier that Fred's CPU, just like our brain, controls the other chips on his board. Now we need to look at how this is effected. Think of lifting your arm. How does it know when to move and how far? We pass the information down the nerves to the appropriate muscles and it all happens. In Fred's case, he actually has 3 nervous systems, all working together. Round his Printed Circuit Board (PCB), he has groups of wires, or lines. These are known as 'Buses'. He has a Control Bus, an Address Bus and a Data Bus. Let's suppose that he wants to store the number 250 in a specific location in his memory. Each RAM chip has a certain number of "cells" which can hold a number less than 256 and each of these cells has its own, private telephone number or 'address'. When a chip (except the CPU) is not actually being used, it falls into a dormant state, not actually switched off but not actually switched on either. This is done to conserve power. Since Fred knows the address of the cell where he wants to store his number, he also knows the specific chip in which that address lies. He must 'wake up' that chip by prodding it with a blip of electricity sent down the Control Bus to tell it that there is incoming data. The chip obediently wakes up and waits for Fred to convert the address of the required cell into binary (this is a code composed of a series of "0"s and "1"s). He then puts this code onto the Address Bus by putting current on each line on the PCB which corresponds to a '1' in the code. All chips are connected to the Address Bus but only the one which has been prodded alive by the Control Bus is awake to hear it. Once the address is passed to the chip, the 'door' of the cell is opened and the number 250, also converted into a binary code, is sent via the Data Bus, to the cell. The buses go quiet and the chip returns to its slumbers. 
If, on the other hand, the CPU wanted data from the chip, it would have told it via the Control Bus that data was to be retrieved, the address would be sent and a copy of the data held in that cell would be passed to the CPU via the Data Bus.
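The store and retrieve sequence described in this paragraph can be sketched as a toy model. It is written in Python and makes no claim to match real hardware. The names RamChip, bus_write, bus_read and board, and the choice of 2 chips of 1024 cells each, are assumptions made purely for illustration.

```python
class RamChip:
    """A toy RAM chip: a run of cells starting at a base address."""

    def __init__(self, base_address, size):
        self.base = base_address
        self.cells = [0] * size
        self.awake = False            # chips doze until they are prodded

    def owns(self, address):
        return self.base <= address < self.base + len(self.cells)


def find_chip(chips, address):
    for chip in chips:
        if chip.owns(address):
            return chip
    raise LookupError("no chip holds that address")


def bus_write(chips, address, value):
    if not 0 <= value <= 255:
        raise ValueError("a cell can only hold a number less than 256")
    chip = find_chip(chips, address)
    chip.awake = True                 # Control Bus: wake chip, data incoming
    offset = address - chip.base      # Address Bus: pick out the cell
    chip.cells[offset] = value        # Data Bus: deliver the number
    chip.awake = False                # the chip returns to its slumbers


def bus_read(chips, address):
    chip = find_chip(chips, address)
    chip.awake = True                 # Control Bus: wake chip, data wanted
    value = chip.cells[address - chip.base]   # a copy comes back on the Data Bus
    chip.awake = False
    return value


board = [RamChip(0, 1024), RamChip(1024, 1024)]
bus_write(board, 0x0500, 250)
print(bus_read(board, 0x0500))        # prints 250
```

Note that the address alone decides which chip is woken, just as Fred works out the chip from the address of the cell.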
1.7 Fred's 'Five' Senses. At this point, we have a Fred who can think and store things in memory by virtue of his Firmware but has no means of acquiring data or telling the world of his genius. In other words, he needs some method of inputting data and outputting it. There is a whole class of chips which fall under the heading of 'Input/Output' (I/O) devices. Let's think for a moment about our senses. Take sight and sound, for example. In the case of sight, the retina receives light waves and converts them into blips of electricity which are transmitted to the centre of sight along the optic nerve. Similarly, the ear receives sound waves and converts them into blips of electricity which are, this time, routed to the centre of hearing. The point is that irrespective of the nature of the external stimulus, it ends up as blips of electricity inside the brain since that is the format needed by the neurones. Of course, specialised 'devices' are required to make this conversion. It would be no good grafting your ear where an eye was. It is not designed to convert light to electricity. All this applies equally to Fred. His equivalent of sight or hearing is his keyboard. He has a chip to convert the keypresses into a code which is then put on the Data Bus and sent to the CPU. Similarly, his equivalent of speech is his display screen. When he wants to tell you something, he will find out the address of the display chip, wake it up to tell it that some data is on its way, pass the exact address on the display where he wants it to go and then push the data to be displayed down the Data Bus.
1.8 Keeping Time Inside Fred. The problem is that all the things I have been describing happen fast and furious inside Fred. It's no good passing data to a chip if it has not yet fully woken up, for such things take a finite time. There has to be some way in which things happen at specific, equally spaced intervals, like a metronome. Fred has a sort of 'heart' in the form of a quartz crystal, like the one in most watches these days, which oscillates at a very specific frequency and sends pulses to the CPU. Each time the CPU receives a pulse, it will carry out an operation and even if it has a number of things queuing up to be done, it will wait for each pulse before executing the next one. I mentioned earlier in this paragraph that chips take a definite time to wake up and open their gates and this is important when considering the clock speed. If Fred was to use a faster quartz oscillator, he would appear to do things correspondingly faster. But there is a limit to this sort of thing. If the clock speed chosen is faster than the response time of the memory chips, the CPU and the other chips will not be able to talk to one another because the CPU will be speaking too fast.
1.9 The Complete Fred. Well, there it is. Fred has a brain (the CPU), instinct (ROM Firmware), memory (RAM), a nervous system (the buses), senses (the I/O chips) and, finally, a heart (the crystal oscillator). He is a complete silicon man. All of this is applicable to the PC itself and, hopefully, you can see how your machine runs, in general terms. This completes our discussion of computer structure and we now turn our attention to computer languages.
1.10 Binary Code. I have already discussed binary code in the context of memory addresses but, in fact, binary code is the fundamental language of computers. Each type of CPU has its own particular grammar and vocabulary expressed in this code. The words are very simple and may include things like 'LDAA &2030', which means "load the Accumulator A (a storage space within the CPU itself) with the contents of the cell whose address is &2030". The code for this instruction might be, for argument's sake, 01011100. When the computer is switched on, it automatically jumps to a specific address in ROM and retrieves the instruction stored there. It will then execute this instruction, retrieve the next one and so on. The computer is most at home in binary because it is a fault tolerant medium. A '1' is represented by a pulse of electricity on the appropriate line of the PCB, while a '0' is represented by no pulse. It matters little, then, whether the actual voltage varies with time. What matters is the presence or absence of a pulse and not the actual value of that pulse in voltage terms. Note that we normally talk about a binary number being composed of a string of 8 digits (called 'bits' after the term Binary digIT). The 8 bits together make up a 'byte'. Before we look more closely at the binary system, let's think about the decimal system. Take, for example, the number 1024. You recognise that number as being 'One Thousand and Twenty Four' but how did you do that? Well, you have been taught that the rightmost figure represents so many units while the next one is so many tens, the next so many 100s and so on. We show 100s in mathematics as 10 ^ 2 (10 x 10), 10s as 10 ^ 1 (just 10) and units as 10 ^ 0 (which equals 1). Each digit in the number is multiplied by the corresponding power of 10. So, the 4 is multiplied by 10 ^ 0 = 1, giving just 4. Next, the 2 is multiplied by 10 ^ 1, giving 20. 
There are no hundreds but there is a 1 further on. What is the power at that position? Yes, it is 10 ^ 3 = 1000. So we have 1 x 1000 = 1000. Add it all up and we get one thousand and twenty four. You, with your incredible brain, immediately leapt to the leftmost number and saw that it was in the 'thousands' column. You were able to do this because you have become skilled over the years in recognising those 'sizes' of numbers which you meet in the normal course of daily life. To prove that this is the case, you must try a much bigger number than you would normally encounter. Try the number 12301234455. The answer is 'Twelve Billion, Three Hundred and One Million, Two Hundred and Thirty Four Thousand, Four Hundred and Fifty Five'. The point is that you had to start at the right hand side and work out what each position in the number stood for in order to work out what the leftmost digit was. As I said, each position in the number is an increasing power of ten (because we are working in decimal). In the case of binary, each position is a power of 2, starting with 2 ^ 0 (which equals 1), then 2 ^ 1, 2 ^ 2 etc. So, working from the right, the number 10101010 represents 0 x 2^0, 1 x 2^1, 0 x 2^2, 1 x 2^3, 0 x 2^4, 1 x 2^5, 0 x 2^6 and, finally, 1 x 2^7. All this together gives 2 + 8 + 32 + 128 = 170. Work it out for yourself. This level of language, written directly in binary, is called 'Machine Code'. The problem is, of course, that while Fred might be very happy playing about with '1's and '0's, humans are decidedly not. Look at the 2 versions of the same program in Fig 1.0. The left hand version is correct while the right hand one has a single erroneous item. Time yourself and see how long it takes you to discover the 'error'.
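The positional arithmetic used in this paragraph to turn a binary string into a decimal number can be sketched as follows. This is an illustration in Python, not GW-BASIC, and the function name binary_to_decimal is invented for the purpose.

```python
def binary_to_decimal(bits):
    """Add up each bit multiplied by the power of 2 for its position."""
    total = 0
    # Work from the right hand end: position 0 is worth 2^0, position 1 is 2^1...
    for position, bit in enumerate(reversed(bits)):
        total = total + int(bit) * 2 ** position
    return total


print(binary_to_decimal("10101010"))  # prints 170
```

Swapping the 2 for a 10 gives exactly the method used above for the decimal number 1024.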
1.11 Hexadecimal Code. As you can see, it is not only very easy to introduce errors into this type of programming, but it is also extremely hard to discover just where the error lies. Moreover, it is very hard for us humans to memorise the various codes in the CPU's vocabulary. To try to get round these problems, hexadecimal code was employed. In decimal, we have 10 separate digits 0-9 (because we have 10 fingers). This means that each digit position in the number line is a factor of 10 greater than the one to the right. In the binary system, each digit position is 2 times greater than the previous position. But, since data is normally arranged in bytes of 8 bits, a number system which fits neatly into 8 bits was required. Hexadecimal has 16 separate digits, 0-9 and A-F. This means that each position is 16 times the previous one. Counting in Hexadecimal (Hex, for short) goes from 0 and ends at F before you carry to the next position. Just think of it as having 16 fingers instead of 10. To convert a byte into hex, all you have to do is to split the byte into 2 groups of 4 bits, called nibbles (well, half a byte just HAS to be a nibble - yes?). Treat each nibble as though it was a separate number. Let's try out the number I used in the last paragraph - 10101010. This gives us 2 separate nibbles - 1010 1010. These equate to 10 and 10 in decimal or A and A in hexadecimal. So, 10101010 is equal to AA in hex. But why go to all this trouble? Well, if you think back to Fig 1.0, you will remember how hard it was to see an error in the program. Imagine, however, if all of these 8 bit bytes were replaced by 2 digit hex numbers. What a difference this would make to finding the errors in the code. Furthermore, the codes for the words in the CPU's vocabulary could much more easily be learned by the programmer. The problem is that, while the computer is quite at home in binary, it needs a little program to convert the hex into binary before it can understand it. 
This is normally included by the manufacturer in his firmware, but by no means always. We can use hex in GW-BASIC and addresses in the computer will normally be specified in this way.
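The nibble method of paragraph 1.11 can be sketched like this. It is a Python illustration only, and the names HEX_DIGITS and byte_to_hex are invented for the purpose.

```python
HEX_DIGITS = "0123456789ABCDEF"   # the 16 hex digits, in counting order


def byte_to_hex(bits):
    """Convert a string of 8 bits into 2 hex digits, one per nibble."""
    if len(bits) != 8:
        raise ValueError("a byte is exactly 8 bits")
    high_nibble = bits[:4]        # the left hand 4 bits
    low_nibble = bits[4:]         # the right hand 4 bits
    # Treat each nibble as a separate number from 0 to 15 and look up its digit.
    return HEX_DIGITS[int(high_nibble, 2)] + HEX_DIGITS[int(low_nibble, 2)]


print(byte_to_hex("10101010"))    # prints AA
```

Each nibble can only be one of 16 values, which is why a byte always comes out as exactly 2 hex digits.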
1.12 Assembly Code. All this is very good in its way, but it takes a long time to become so conversant with the CPU language that you have memorised the entire command set. This learning process would be made much simpler if, instead of a hexadecimal code, we could use a sort of abbreviated 'English' for the instruction and compose programs entirely of such commands. Again, just as in the case of hex, a program is needed to convert these instructions into the native binary of the machine. This type of program is called an 'Assembler' and is normally purchased as an extra. Think back to paragraph 1.10 when I used the command 'LDAA &2030'. This is an example of an assembler command. A point to note is that there is a one-to-one relationship between the hex or binary commands of the CPU and the assembler instructions. It is all a matter of making things easier for the human programmer by interposing a program to translate his instructions into binary. If you want to know a bit more about assembler and actually use a really great one, check out my articles on Assembly and EasyCode.
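To make the one-to-one translation concrete, here is a toy 'assembler' sketched in Python. It is an illustration under stated assumptions and not a real assembler for any CPU. The code 01011100 for LDAA is the 'for argument's sake' value from paragraph 1.10, the STAA entry is invented, '&' is assumed to mark a hex address, and the layout of one code byte followed by 2 address bytes is also an assumption.

```python
# Mnemonic -> instruction code. Both values are illustrative, not real opcodes.
OPCODES = {
    "LDAA": 0b01011100,   # the example code from paragraph 1.10
    "STAA": 0b01011101,   # invented for this sketch
}


def assemble(line):
    """Translate one assembler instruction into 3 bytes of machine code."""
    mnemonic, operand = line.split()
    address = int(operand.lstrip("&"), 16)     # assume '&' marks a hex address
    high_byte = address >> 8                   # top 8 bits of the address
    low_byte = address & 0xFF                  # bottom 8 bits of the address
    return [OPCODES[mnemonic], high_byte, low_byte]


print(assemble("LDAA &2030"))   # prints [92, 32, 48]
```

One line of assembler goes in and one CPU instruction comes out, which is the whole of the relationship described above.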
1.13 High Level Languages. Assembly language might sound just the ticket and, indeed it is very useful, but it is extremely difficult to use (although nothing like binary). You, the programmer, have to know all the addresses of the chips in the computer and the intricate workings of the CPU. Moreover, if you decide to use another type of computer, all your programming skills will be set at naught because that CPU will have a different language of its own, its chips will be different, have different addresses and so on. In other words, assembly language is not 'portable'. In the old days, you and I would never have been allowed near a computer. It was surrounded by its own shell of servants and gurus. Let's imagine you were a university student studying economics and you wanted to use the university computer to work out a financial analysis. You would tell a programmer what you wanted to do, give him the data on which you wanted to work and leave him to it. He would write the program and, with the data, it would be punched onto cards and put in the job queue. When your job came round, it would be loaded into the computer and the results would be output on punched card or tape and returned to you. All this was because computers were very expensive beasts indeed in those days. They had to work at maximum efficiency 24 hours a day if they were to pay. The owners of such machines simply could not afford to have amateurs playing about, using up computer time. Another problem with this system was that it was programmed in assembly language. It would have taken you far too much time to learn it had you wanted to do it yourself. Moreover, the programmers themselves had a hard time of it if the computer system was changed or they moved to another job. Their productivity went to nil for a long time as they wrestled with the new code and system architecture. In the mid 1960s, Kemeny and Kurtz, lecturers at Dartmouth College, USA, decided enough was enough. 
What was needed, they reasoned, was a much bigger program which would allow the user to input his program in instructions which were very close to natural English. The translating program would take each instruction and turn it into the one or more binary instructions which the computer could understand. Moreover, and this is the cunning bit, if the translation part of the program could be altered to turn the 'English' into code for a different CPU, programmers on the first type of machine could move to the second type of machine and program that one just as easily as the first, since the 'English' was just the same. In this way, 'High Level Languages' were born. A high level language is one where each instruction might be translated into many, perhaps hundreds of, individual CPU instructions. In Assembly Language, by contrast, each instruction is equivalent to only one CPU instruction.
The language developed by Kemeny and Kurtz was the Beginner's All-purpose Symbolic Instruction Code, or BASIC, and it soon caught on. But just as English itself has many different dialects, BASIC soon started to change as different computer manufacturers added their own little extras to take advantage of the new hardware they had designed into their system. The introduction of VDUs, for example, brought in a whole new range of commands alien to the original Dartmouth BASIC. Finally, the advent of home micro computers instigated a flush of new variants. All of these dialects have, at their core, most of the original flavour of the Dartmouth version of BASIC but it is important to note that differences in the way statements are expressed can cause problems for the translation program. This is why learning BASIC from a book written for a computer other than the PC would be interesting, to say the least, and was the reason "The Utter Novice Guide to GW-BASIC" was written. Of course, all of the above work in BASIC was not to say that other languages were not being developed, too. BASIC, however, was the only one specifically designed for beginners. So, how does the translation program turn the BASIC code (known as the source code) into the CPU instructions (called the object code)? There are 2 main types of translation programs which do this - Interpreters and Compilers.
1.14 Interpreters. An interpreter is a program which takes your source code, line by line, and turns it into CPU code, executing the instructions as it goes along. It usually includes some form of editor to allow you to enter the source, list it on the screen, edit it and run it. It is all very self contained. Should the program have errors in it, the interpreter kindly indicates where the error might lie and you can display the offending line, make adjustments and then run the program again. Most home computers have a BASIC interpreter included in the firmware and this type of translation program makes for a very easy and interactive way of writing high level programs. However, interpreters suffer from 2 major drawbacks - they are very slow and the interpreter has to reside in memory with the program at all times. Take the example of a program loop containing a piece of source code. Each time the computer runs that loop, the interpreter comes to it and re-converts it into CPU code despite the fact that it has already done so a millisecond before. This means that time is wasted in executing each instruction and the program consequently runs more slowly than it needs to. Furthermore, since the interpreter has to be in memory with the program, a great deal of space is wasted, too. While this is less of a problem today with IBM PCs typically holding a megabyte or more of main memory, it was a very constricting factor only a few years ago.
1.15 Compilers. The second approach to the problem is that of the Compiler. Here, the program translates your source code into object code once and stores it as a file. You can then remove the compiler from memory, call in the object code and run it. Because the instructions are not being interpreted each time, the code runs much faster and, since the compiler and the object code are not co-resident, a great deal of space is saved. Compilers would therefore seem the ideal choice but they do have disadvantages. One of these is the fact that there is no interactive capability. You write your program on a word processor, just like a letter, and store it on file. You then call up the compiler, tell it the name of the source file and it begins to compile it to object code. Should it find an error, it will give an indication of where the problem might lie. You then have to re-enter the word processor, call up the source file again, make alterations and start the process all over again. Very tedious. Moreover, certain source commands will be compiled using sets of pre-written machine code routines called library routines and, in addition to producing the object code, the compiler must also join it up with the routines and resources that the source has said it needs. This is known as linking.
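The speed difference between the 2 approaches comes down to how often each source line is translated. The following toy model, in Python, counts the translations for a loop run both ways. Everything in it is invented for illustration: the 2 word 'language' of ADD and SUB, and the names translate, run_interpreted and run_compiled. Real interpreters and compilers are far more involved.

```python
def translate(line):
    """Stand-in for translating one source line into something runnable."""
    operation, argument = line.split()
    amount = int(argument)
    if operation == "ADD":
        return lambda total: total + amount
    if operation == "SUB":
        return lambda total: total - amount
    raise ValueError("unknown instruction: " + operation)


def run_interpreted(loop_body, repeats):
    """Translate each line afresh every time the loop comes round to it."""
    total, translations = 0, 0
    for _ in range(repeats):
        for line in loop_body:
            instruction = translate(line)
            translations = translations + 1
            total = instruction(total)
    return total, translations


def run_compiled(loop_body, repeats):
    """Translate every line once up front, then just run the results."""
    object_code = [translate(line) for line in loop_body]
    translations = len(object_code)
    total = 0
    for _ in range(repeats):
        for instruction in object_code:
            total = instruction(total)
    return total, translations


source = ["ADD 5", "SUB 2"]
print(run_interpreted(source, 100))   # prints (300, 200)
print(run_compiled(source, 100))      # prints (300, 2)
```

Both runs give the same answer, but the interpreted version does 200 translations where the compiled version does 2.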
1.16 The GW-BASIC Editor. So, what do we have in the case of GW-BASIC? Well, since the language is largely geared towards the beginner, ease of use must be the highest priority. For this reason, GW-BASIC is an interpreted form of BASIC. Certainly, this means that your programs will be relatively slow. 'Slow', however, is relative - you will be able to write programs which can generate in seconds the square roots of 100 numbers. It is only in the areas of mathematically-orientated graphics that the speed penalty of an interpreter becomes apparent. Since this is a field unlikely to be investigated by the beginner, the interactive advantages of an interpreter outweigh the speed drawbacks. Don't despair, however; those of you who really want speed coupled with BASIC can buy compilers to do it - Microsoft's 'QuickBASIC', for example.
1.17 Overview of the Chapter. In this Chapter, I have outlined the general format the book will take. I have stressed the importance of good design as well as correct code. For those of you with no prior knowledge of the subject, I have given an 'Early Learning Centre' version of the structure of the typical computer and tried to relate functional areas of the machine to parts of the human body. The various levels of languages to be found in computers were discussed. These ranged from binary, through hexadecimal to assembly language and from there to the high level language BASIC. The means by which the BASIC translation program produces object code from source code were discussed and the pros and cons of interpreters and compilers outlined. Finally, the GW-BASIC Editor was briefly introduced to complete the Chapter.
1.18 Post Test. That concludes Chapter 1. You should now attempt the Post Test. Read through the Chapter once again if you are unsure of any area and try not to refer back while attempting the test. Check your answers with those contained in Annex A. Pay particular attention to wrong answers. Read the paragraphs relating to these again and re-attempt the problem questions when you feel you have mastered the subject. Do not proceed to the next Chapter until you are satisfied that you are fully conversant with the material contained in this one.
CHAPTER 1 - POST TEST
1. List one advantage of structured program design.
2. What does the abbreviation "CPU" stand for?
3. What is the function of the CPU?
4. What does "ROM" stand for?
5. What is the function of ROM?
6. State the meaning of the term "Firmware".
7. What does "RAM" stand for?
8. What is the function of RAM?
9. How does RAM differ from ROM?
10. What is a "Bus" in the context of a computer?
11. Name the 3 Buses in a typical computer.
12. Describe one way in which the computer acquires data from outside and one way in which it passes data to the outside world.
13. How is a hex number derived from a binary number?
14. What is an "Assembly Language"?
15. In what way is it better than machine code?
16. What is a "High Level Language"?
17. How does a High Level Language differ from Assembly Language?
18. What is "BASIC"?
19. What is meant by "Source Code"?
20. What do we call the code derived from source code, which the CPU can understand?
21. What is an "Interpreter"?
22. What is a "Compiler"?
23. Name one advantage of an Interpreter.
24. Name one disadvantage of a Compiler.