4K BASIC FOR ZX-80
1. ZX-80 user's view

(a) Program Listing

The user inputs, via the keyboard, lines of BASIC for insertion into the program and commands for immediate execution. While he is doing this he sees a display which is divided into two parts: the upper part is a "window" on the program listing, while the lower part displays the line or command he is currently inputting. Normally, the lower part is large enough to hold the whole line (which will take more than one screen line if it contains more than 32 characters), there is a blank line between the two parts, and the upper part occupies the remainder of the screen. If, however, there is insufficient RAM to hold a display of this size (each character on screen occupies 1 byte of RAM*) the upper part of the display will be shrunk line by line until the display file is small enough. When the upper part has disappeared altogether the lower part shrinks character by character.

* RAM stands for random access memory or store.

The computer maintains a "current line number" for editing the program, and the display is always organised so that the line with that number, or the preceding line if no line with that number exists, is on the screen if at all possible. If there is a line with the current number, it is displayed with a symbol consisting of a reverse video [>] between its line number and the text of the line; if there is none then the reverse video [>] will not appear.

Three keys are provided for changing the current line number: "↑" changes it to the line number of the preceding line, "↓" changes it to the line number of the following line, and "HOME" resets it to zero. If there is no preceding line "↑" sets it to the line number of the first line in the program; similarly "↓" will set it to the last line. There are two other ways in which the current line number can change: inserting a line into the program sets it to the line number of that line, and the command "LIST n" will set it to n.
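The effect of the three keys on the current line number can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not the ROM's routine: the function names are invented here, the program is modelled as a sorted list of line numbers, and the behaviour for an empty program (leave the number unchanged) is an assumption.

```python
import bisect


def key_up(lines, current):
    """Move to the preceding line, or to the first line if none precedes."""
    if not lines:
        return current  # assumption: nothing to move to, so no change
    i = bisect.bisect_left(lines, current)  # count of lines numbered below current
    return lines[i - 1] if i > 0 else lines[0]


def key_down(lines, current):
    """Move to the following line, or to the last line if none follows."""
    if not lines:
        return current  # assumption: nothing to move to, so no change
    i = bisect.bisect_right(lines, current)  # index of first line above current
    return lines[i] if i < len(lines) else lines[-1]


def key_home(lines, current):
    """HOME resets the current line number to zero."""
    return 0
```

Note that the current line number need not be the number of an existing line (for instance after "LIST n" or HOME), which is why the sketch searches by value rather than by index.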
When the current line is off the top of the screen, the window moves up so that the current line becomes the first line on the screen. When it is at or just off the bottom, the window moves down a line. If it is well beyond the bottom of the window, the window moves down so that the current line becomes the second line on the screen.

(b) Input area

The lower part of the screen contains the line the user is currently typing in. This line may be a command or a line of program; in the latter case it will begin with a line number (in the range 1 to 9999) and in the former case there should be no number, although zero in practice counts as "no number" here.

Somewhere in the line a "cursor" is displayed. This indicates two things: the position in the line where symbols will be inserted, and whether an unshifted alphabetic key will be treated as a keyword (eg "LIST") or a letter (eg "A"). The cursor is in the form of an inverse video [K] for keywords or [L] for letters. Note that this cursor, although displayed in the line and occupying a character position on the screen, does not form part of the line and is ignored by anything interpreting the line.

A second symbol, similar in principle to the cursor, may also be displayed: this is in the form of an inverse video [S] and indicates that the line is not a syntactically correct BASIC statement. It is positioned such that the part to the left of it could be the beginning (or the whole) of a syntactically correct BASIC statement, eg if "20 LET A = B + 5" is input from left to right then [S] is displayed at the end of
In most cases the [S] symbol is displayed as far to the right as is consistent with the above description; however there are a few circumstances where this is not quite so. For instance, in

LET A = ASC("X")

although "LET A=ASC" is a syntactically correct statement (ASC here being an integer variable), the [S] is not displayed after "ASC" but rather before it. This is because "ASC(...)" has already been identified as a function call, but as no built-in function with the name "ASC" is available it is faulted. Having identified it as a function call, however, the computer does not then consider other possible parses.

The following keys are available to alter the input line:

(i) single-character symbols: letters, digits, punctuation, etc. The symbol is inserted to the left of the cursor.

(ii) multi-character "tokens": "**", "AND", "OR", "NOT", "TO", "THEN", keywords. Each of these is stored in the computer as a single byte which, as in (i), is inserted to the left of the cursor. However, they appear on the screen as more than one character. Those that are alphanumeric (ie all except "**") are preceded and followed by a space, the preceding space being omitted (a) at the beginning of the line, (b) where it follows another alphanumeric token. (This rule means that programs appear well laid out on the screen without using up scarce RAM space for explicit space characters. Inserting an explicit space character before or after an alphanumeric token always inserts one extra space in the displayed form.)

(iii) "RUBOUT" deletes the symbol or token to the left of the cursor.

(iv) cursor control keys "→" and "←" skip the cursor past the next symbol or token to the right and left respectively.

(v) "EDIT" replaces the input line with a copy of the current line from the program. If no line has the current line number, the first line after it is used. If the current line is after the last line in the program, the last line is used. If there are no lines of program at all, then an empty line is used.
Note that any existing input line is lost; "EDIT" followed by "NEWLINE" is in fact the quickest way to get rid of an unwanted line, but beware of typing "EDIT" in mistake for "NEWLINE"!

(vi) "NEWLINE" is ignored if the inverse video [S] symbol is present. Otherwise it marks the end of input and the line or command is submitted to the system (see next section). Note that the whole line is submitted, not just the part to the left of the cursor.

(c) After input

When a line with a nonzero line number is submitted to the system, it is inserted into the program, any existing line with the same number being first deleted. The input area is then cleared. A special case is where the new line consists only of a line number, possibly preceded by spaces: the existing line (if any) is deleted but nothing replaces it and it therefore simply disappears from the listing. The "current line number" is still set to its number, however, so the inverse video [>] disappears also (see (a) above). If the line being inserted has one or more spaces after the line number but no other symbols or tokens, it is still inserted in the program and appears in the listing as a line which is blank except for its number; when the program is run such lines are ignored.

A line which has no line number, or which has line number zero, is a "command" and is obeyed immediately. For as long as it takes to obey the command (which for most commands is very brief) the screen is blank; then on completion the upper part of the display contains any output generated and the lower part contains a display of the form m/n, where m is a single digit and n is "-2" for most commands. If m = 0, execution was successful; if m = 9 a STOP command was executed; otherwise m is an error code (see Appendix I).
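As a small illustration of how the m/n display is read, the following Python sketch splits a report into its two parts and classifies m according to the rule just given. The function name and the returned labels are invented for this example; the meanings of the individual error codes are listed in Appendix I and are not reproduced here.

```python
def decode_report(report):
    """Split an 'm/n' report and classify m as success, stop or error."""
    m_text, n_text = report.split("/", 1)
    m, n = int(m_text), int(n_text)
    if m == 0:
        kind = "success"  # execution was successful
    elif m == 9:
        kind = "stop"     # a STOP command was executed
    else:
        kind = "error"    # any other digit is an error code
    return kind, m, n
```

For example, decode_report("0/-2") classifies the usual report after a successfully obeyed command.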
Where a command (RUN, GO TO, GOSUB, CONTINUE) has caused the program to be entered, n is the line number of the offending instruction if m is an error code (exception: if the error is in a GO TO or GOSUB then n may be the target of the jump), the line number of the STOP if m = 9, and the line number of the last line in the program if m = 0. Except in the case of m = 0 or m = 9, CONTINUE is a jump to line number n (but see 3(c)). If m = 9, CONTINUE is a jump to line number n + 1.

Sometimes only the first digit of n is displayed because there is no room in the RAM for any more display file. For example, beware of confusing line number 240, of which only the first digit is displayed, with line number 2.

A jump to a line number which is beyond the end of the program, or greater than 9999, or negative, gives m = 0, n = the line number jumped to.

The commands are described individually in section 3.

2. Computer's view

N.B. 35h means hexadecimal 35 (35h = 3*16+5 = 53). 0Ah to 0Fh are the decimal numbers 10 to 15.

(a) RAM

The contents of the RAM are as follows.

The first area is fixed in size and contains "system variables" which store various items of information such as the current line number, the line number to which CONTINUE jumps, the seed for the random number generator, etc. Those that could possibly be useful with PEEK etc have been documented elsewhere (Appendix 3). An important subset of the system variables are the five contiguous words labelled VARS to DF_END, which hold pointers into the RAM and define the extent of the remaining areas (apart from the stack).

The program consists of zero or more lines, each beginning with the line number (stored ms byte first, contrary to the usual practice on Z80s) and ending with a newline. The line number is in the range 1 to 9999, so that the ms 2 bits of the first byte are zeroes. The ms 2 bits of the byte pointed to by (VARS) will not both be zeroes; this gives a simple test for end-of-program.
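The layout just described is enough to walk through a stored program. The Python sketch below is a minimal illustration under stated assumptions: the function name is invented, the memory image is passed in as a bytes object starting at the first program line, and the newline code 76h is taken from the description of the display file. It is not a transcription of any ROM routine.

```python
NEWLINE = 0x76  # newline code, as quoted for the display file


def iter_program_lines(memory, start=0):
    """Yield (line_number, text_bytes) for each stored program line.

    Scanning stops at the first byte whose top two bits are not both
    zero, which is the end-of-program test described above.
    """
    pos = start
    while memory[pos] & 0xC0 == 0:
        # Line number is stored most significant byte first.
        number = (memory[pos] << 8) | memory[pos + 1]
        end = memory.index(NEWLINE, pos + 2)  # each line ends with a newline
        yield number, bytes(memory[pos + 2:end])
        pos = end + 1
```

A made-up image such as bytes([0x00, 0x0A, 0x26, 0x27, 0x76, 0x80]) holds a single line numbered 10 followed by a byte with its top bit set, at which the scan stops.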
The program lines are stored in ascending order of line number. The text consists of ordinary characters (codes 0 to 3Fh) and tokens (codes C0h to FFh), although reverse video characters (codes 80h to BFh) have also been allowed for.

The variables take the forms shown below. They are not stored in any particular order; in practice each new variable is added onto the end. When a string variable is assigned to, the old copy is deleted and a new one created at the end. (The new copy is created first, so "LET A$ = A$" does work!) Note that, apart from the ms bit of the first byte, a single-character integer is the same as the controlled variable of a FOR loop. The characters in a name, being all alphanumeric, have 6-bit codes as in the character code table. The first character in a name, being perforce alphabetic (ie in the range 26h to 3Fh), effectively has a 5-bit code. The "variables" area is terminated by a single byte holding 80h (which can't be the name of a string!).

The working space holds the line being input (or edited, hence "E_LINE"), except when statements are being obeyed, when it is used for temporary strings (eg the results of CHR$ and STR$) and any other similar requirements. The subroutine X_TEMP is called after each statement to clear it out, so there is no need to explicitly release space used for these purposes.

The display file always contains 25 newline characters (hex 76); the first and last bytes are always 76h and in between are 24 lines each of from 0 to 32 (inclusive) characters. (DF_EA) points to the start of the lower part of the screen.

The stack (pointed to by register SP) has at the bottom (high-address end) a stack of 2-byte records. GOSUB adds a record to this stack consisting of 1 + its own line number; RETURN removes a record and jumps to the line number stored therein. The last 2 bytes of RAM contain a value which RETURN recognises as not being a line number.
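The GOSUB and RETURN bookkeeping described above can be modelled as follows. This is a sketch only: the class and method names are invented, each 2-byte record is represented as a Python integer, the marker at the bottom of the stack is represented by None because its actual value is not given here, and raising RuntimeError for a RETURN with no matching GOSUB is an assumption standing in for whatever error the machine reports.

```python
SENTINEL = None  # stand-in for the value RETURN recognises as "not a line number"


class ReturnStack:
    """Toy model of the GOSUB record stack."""

    def __init__(self):
        self.records = [SENTINEL]  # bottom-of-stack marker

    def gosub(self, own_line):
        """Record 1 + the line number of the GOSUB statement itself."""
        self.records.append(own_line + 1)

    def ret(self):
        """Remove the top record and return the line number to jump to."""
        if self.records[-1] is SENTINEL:
            raise RuntimeError("RETURN without GOSUB")  # assumed behaviour
        return self.records.pop()
```

Storing 1 + the line number means that the jump made by RETURN lands on whatever line follows the GOSUB, without the address of that line having to be saved.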
The expression evaluator (which is also used to check the syntax of expressions) pushes 4 bytes onto the top of the stack for each intermediate result and pops them again when the appropriate operator is found, eg for

A+B*C**D*E+F*G

where X=B*C**D and Y=A+X*E, the stack develops as follows:

stack    A+    B*    C**   X*    Y+    F*
               A+    B*    A+          Y+
                     A+

Thus the above expression uses a maximum of 12 bytes of stack. Parentheses use an additional 6 bytes each, eg A+(B*C**D)*E+F*G would use 12 + 2 x 6 = 24 bytes. Apart from these two cases, the stack is only used for subroutine calls and for saving registers.

(b) Actions

The actions taken by the computer in response to the user's keystrokes are as follows.

Each time a symbol or token is inserted into or deleted from the input line, and each time the cursor is moved, the change is put into effect in the input line held in working space, after deleting the lower part of the display file (viz that part from (DF_EA) to (DF_END)). Note that during this period the display file may be incomplete in that fewer than 25 newline characters are present, although the display file is never allowed to become so large that there will not be room to add the remaining newline characters.

Then the input line is checked to see if it is syntactically correct. The input line contains an inverse video [K] at the point where the cursor is; the syntax checker notes the address of the cursor in the variable (P_PTR) and sets variable (X_PTR) to point to the first wrong symbol, or to zero if there is none. It also notes whether the cursor should be displayed as [K] or an [L].

Finally the lower part of the display file is rebuilt, inserting the [S] symbol and changing the cursor from [K] to [L] if required, as well as converting tokens into characters.
If there is now insufficient room for the display file, the display file area is cleared and the upper part is remade with fewer lines by re-copying from the program stored in the area RAMBOT to (VARS), again converting tokens into characters as part of this process, and the lower part is then output afresh.

When a line is to be inserted into the program, its line number is converted into binary, space is made at the appropriate place by copying everything else up, and the text of the line (from which the cursor has already been deleted) is copied in. The working space and display file are then re-made, the former now containing just the cursor and a newline.

When a command is executed, it is interpreted in situ in the working-space area. Program lines are of course interpreted in their place in the program.

3. Statements

(a) Expressions

Throughout section 3, "n" will be used to represent any integer expression and "s" to represent any string expression. String expressions are:
Ambiguities in parsing operations are resolved by considering the priority of the operators in question: higher priorities bind tighter, equal priorities associate from the left. Example: -A**B+C*D/E*F-G-H is the same as (((-(A**B))+((C*D)/(E*F)))-G)-H
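The priority rule can be tried out with a short precedence-climbing sketch in Python. Everything named here is hypothetical, and the priority values in particular (** 10, unary minus 9, * 8, / 7, binary + and - 6) are an assumption chosen so that the sketch reproduces the example above; they are not taken from this section. The relational and logical operators are left out for brevity.

```python
import re

# Assumed priorities (not quoted from the text above).
PRIORITY = {"**": 10, "*": 8, "/": 7, "+": 6, "-": 6}
UNARY_MINUS = 9


def tokenize(text):
    """Split an expression into names, numbers, operators and parentheses."""
    return re.findall(r"\*\*|[A-Z][A-Z0-9]*|\d+|[-+*/()]", text)


def parse(tokens, pos=0, min_prio=0):
    """Return (fully bracketed text, next position) by precedence climbing."""
    tok = tokens[pos]
    if tok == "-":                      # unary minus
        operand, pos = parse(tokens, pos + 1, UNARY_MINUS)
        left = "(-" + operand + ")"
    elif tok == "(":                    # parenthesised sub-expression
        left, pos = parse(tokens, pos + 1, 0)
        pos += 1                        # skip the closing ")"
    else:                               # variable name or number
        left, pos = tok, pos + 1
    # Only a strictly higher priority continues, so equal priorities
    # associate from the left.
    while (pos < len(tokens) and tokens[pos] in PRIORITY
           and PRIORITY[tokens[pos]] > min_prio):
        op = tokens[pos]
        right, pos = parse(tokens, pos + 1, PRIORITY[op])
        left = "(" + left + op + right + ")"
    return left, pos


def bracket(text):
    """Bracket an expression fully, dropping the outermost pair."""
    result, _ = parse(tokenize(text))
    return result[1:-1] if result.startswith("(") else result
```

With these assumed priorities, "*" binds tighter than "/", which is why C*D/E*F groups as (C*D)/(E*F) rather than strictly left to right.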
Values of string expressions can be of any length and can contain any codes except 1 (the closing quote). Values of integer expressions must be in the range -32768 to +32767; any value outside this range causes a run-time error (number 6). Note that relations yield -1 for "true" and 0 for "false" and that
so that for instance I AND I>0 OR -I AND I<0 is the same as ABS(I). However, constructions such as A>B>C do not have the obvious effect, being parsed as (A>B)>C, ie as (A>B) AND C<-1 OR (NOT A>B) AND C<0.

(b) Statements

The statements available are:
(c) BREAK

If the BREAK key is found to be pressed at the end of execution of a line, execution does not follow on to the next line but stops, showing 0/n where n is the line number of the next line that would have been executed but for the break-in.
* not available from the keyboard. Codes 38-63 are only available when the cursor is [L], codes 230-255 only when the cursor is [K].