PORTING ASTRONOMICAL SOFTWARE
A Survivor's Tale
by Dennette A. Harrod, Jr

Proceedings of the Lunar Calendar Conference
held at the International Institute of Islamic Thought
Herndon, VA
9-10 Shawwal, 1407 AH
6-7 June, 1987 AC
ISBN 0-912463-22-8

Abstract

Algorithms for predicting new moon crescent visibility have been implemented as computer programs. These programs have been (and will be) distributed to people who may wish to modify them and run them on different computers. The people in question may not be computer professionals, and may be unaware that different compilers for the same language can cause a program to produce different results. This paper discusses the author's experience in attempting to port a "Sun Rise, Sun Set, Moon Rise, Moon Set" computer program from a personal computer to a large main-frame computer.

Introduction

Early Islamic astronomers were keen mathematicians who were able to take the observations of the ancient Greeks and develop algorithms that permitted accurate prediction of astronomical events, such as eclipses. These algorithms have been refined over the centuries, and with the invention of digital computers, the calculations that took many hours by hand can be performed in seconds.

Personal computers that cost much less than one thousand dollars are capable of performing the same kinds of astronomical calculations as the large and expensive computers used by universities, governments, and research institutions. Programs for calculating prayer times for any location on Earth are available for the asking, and their use is becoming more common among affluent Muslims who must travel frequently.

A problem with this technology is often overlooked: people have developed a sense of security in the answers they get from computers because of the assumption that "computers don't make mistakes." However, I shall demonstrate that a computer program written in a "standard" computer language can produce different results when run on different computers.

There are two main culprits: the 'expression evaluator,' which may cause the same statement to produce different results with different compilers, and 'parameter mechanisms,' which can cause different results from procedure calls when a different language is used.

Muslims who do not have formal training in computer science may attempt to revise well-tested astronomical programs (such as new moon predictors) to run on their personal computers. Unless they are aware of these pitfalls, and of the tests that can be made to ensure correct translation, they may contribute to confusion among the Muslim community.

In preparing this paper, I was asked by the conference organizers to "apply it specifically to the problem of lunar calendar software," and to examine a copy of a program submitted by another author and "use it as your illustrative case to make your point." Before I can do this, I must provide some general background and abstract examples, and then I can describe what problems I encountered with this specific program.

To those who may wonder why anyone other than the one who created the program might wish to modify it, I offer an example. The program in question asks the user if they wish to have adjustments performed for Daylight Saving Time (DST), and advises the user that 'DST is in effect from the last Sunday in April to the last Sunday in October.' While this may have been true when the program was written, the Congress of the United States voted last year to change the start of DST to the first Sunday in April.

Since I am a software professional, I cannot help but interject a few criticisms of this program on technical and aesthetic grounds. These comments relate more to the usability and portability of the program than the accuracy of its calculations, but I hope that they will be of use to other individuals who are engaged in the development of lunar calendar software for general distribution.

Finally, as a Muslim concerned with the problems of the lunar calendar, I cannot help but make some comments on the utility of this particular software contribution.

Background

A 'compiler' is a computer program that accepts as input a 'source program' (statements in a human readable programming language such as FORTRAN or Pascal) and generates as output an 'object program' (machine executable instructions in a binary-coded language). Thus, a compiler is a translator, and for each different computer language, there are at least as many compilers as there are processors on which it can run.

A 'processor' is a unique, patented, electronic architecture that forms the 'brain' of a digital computer. Today, it typically is characterized as a single silicon chip known by a number, such as 68000, 8088, Z80, or 6502. For example, the 6502 is used in APPLE-][, Commodore-64, Atari-800, and VIC-20 personal computers; the 68000 is used in the Macintosh, Amiga, and Atari-ST; the 8088 is used in the IBM-PC and all the various 'clones.'

The important thing to understand is that programs in the machine language of one processor will not be understood by another. However, you can write a program in a 'high-level language,' and by using a different compiler for each processor, that program can be executed on many different computers. If all of the compilers for a given language were written by the same individual or group, then all of the programs written in that language would produce more nearly identical results on each of the processors for which objects could be generated. Unfortunately, this is not the case.

What we have today is a cybernetic Tower of Babel, with as many dialects (or 'flavors') of each language as there are people with the knowledge to write a compiler and the money to start marketing it. This has led to a situation where a program written in a popular language might not compile on another machine because the author has taken advantage of 'enhancements' in one compiler that are not available in the other. Sometimes, the implementation of a specific feature of the language is not defined well enough to guarantee that all compilers will perform the same way. This is the area with which this paper is concerned.

My first example comes from 1977, during my first software port. A system of programs written in FORTRAN for a Computer Assisted Design & Drafting department was being moved from an XDS Sigma-9 (a 32-bit machine) to a DEC PDP-11/70 (a 16-bit machine). We had written special translators to turn the XDS-flavored FORTRAN into DEC-flavored FORTRAN and things were progressing very well. Then we started getting complaints about the quality of circles and arcs being plotted; they were too coarse, looking more like polygons than smooth curves.

After the better part of two days' worth of intense debugging, the culprit turned out to be this simple statement:

        K=M/2+0.5

This statement says 'K gets the value of M divided by two, plus one-half.' By FORTRAN conventions, 'K' and 'M' are INTEGER variables, which means that they have no fractional part to them, and can only represent whole numbers. (REAL variables are used to represent non-integer values.) The intent of the statement was to 'round-up' the result of the division by adding one-half (0.5). The following table shows the values of 'K' obtained for different values of 'M' using the two different compilers.

Table 1.
Expression Evaluator Anomaly

Value of M    XDS Value of K    DEC Value of K
    0               0                 0
    1               1                 0
    2               1                 1
    3               2                 1
    4               2                 2
    5               3                 2
    6               3                 3
    7               4                 3

Why the different results? Because the XDS compiler was more robust than the DEC compiler. <One could argue from the other point of view that the DEC compiler was too "anal".> The XDS compiler said, "There's a REAL number on the right-hand side, so I'll convert M into a REAL value, do a REAL divide by 2, add one-half to the intermediate, convert the result to an INTEGER, and assign it to K." The DEC compiler said, "I'll INTEGER divide M by INTEGER 2 and discard the remainder, then convert the intermediate value to REAL and add one-half, convert the result to INTEGER, and assign it to K."
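The two readings can be imitated in a few lines of Python (used here only because it is easy to run; the original statement was FORTRAN). This is a minimal sketch and not the code of either compiler: the function names are hypothetical, and int() stands in for FORTRAN's truncating REAL-to-INTEGER conversion, which behaves the same way for the non-negative values in Table 1.

```python
def k_real_divide(m):
    """XDS reading: convert M to REAL, divide by 2, add 0.5, truncate."""
    return int(float(m) / 2.0 + 0.5)


def k_integer_divide(m):
    """DEC reading: INTEGER divide M by 2 first, then add 0.5 and truncate."""
    return int((m // 2) + 0.5)


# Rebuild the rows of Table 1 as (M, XDS value of K, DEC value of K).
rows = [(m, k_real_divide(m), k_integer_divide(m)) for m in range(8)]
for m, k_xds, k_dec in rows:
    print(m, k_xds, k_dec)
```

The two functions disagree for every odd value of M, which is exactly the pattern in the table.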

Here we see a difference in the EXPRESSION EVALUATOR, the part of a compiler responsible for deciding the order in which arithmetic operations are to be performed. Given the statement:

        A = B + C / D

did we mean

        A = (B + C) / D

or did we mean

        A = B + (C / D)

Most programming languages specify a hierarchy of operators such that multiplication and division are performed before addition and subtraction <This is called the "My Dear Aunt Sally" algorithm, a mnemonic for the order as stated.>, so the second form is implied. However, some languages use a "left to right" rule, so the first form would be used. Obviously, it is important to know which form will be used, because the results will be quite different.
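A short sketch shows how far apart the two readings can be. The values of B, C, and D are arbitrary and chosen only for illustration; Python happens to follow the operator hierarchy, so its unparenthesized result matches the second form.

```python
b, c, d = 2.0, 6.0, 2.0  # arbitrary illustrative values

left_to_right = (b + c) / d   # first form: add, then divide
hierarchy = b + (c / d)       # second form: divide, then add
unparenthesized = b + c / d   # what Python does when no parentheses are given

print(left_to_right, hierarchy, unparenthesized)
```

Writing the parentheses explicitly, as in the first two assignments, removes any dependence on the rule a particular language uses.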

To return to our original example, the two compilers used the same hierarchy of operators, which the FORTRAN language requires all compilers to support. What the language does not specify are the rules under which mixed mode expressions (those with both INTEGER and REAL terms) are evaluated. This has been left to the compiler implementers; XDS and DEC used different methods.

The correct statement, which will work with both compilers, is

        K=M/2.+0.5

By adding a decimal point after the 2, both compilers now say, "Convert M to a REAL, do a REAL divide by REAL 2, add one-half to the intermediate, convert the result to INTEGER, and assign it to K."

Thus we see that a statement can produce different results on different computers because one compiler did what you wanted, but the other compiler did what you said. This kind of mixed-mode expression occurs quite frequently in astronomical programs, especially in Julian date algorithms.

This example has demonstrated a problem when using the same language for the port. A more insidious problem awaits the individual who attempts to re-code an algorithm in another language. This has to do with the use of subprograms and how arguments are passed to them.

There are two popular methods of passing parameters to a procedure, function, or subroutine. The first is called pass by value, which means that the current value of a variable (or expression) is assigned to a local copy of the variable, and the subprogram can modify this copy without altering the original. The second method is called pass by address, which means that the subprogram knows the location of the original variable and can alter the original value.

For example, consider a function called JULIAN that takes the parameters DAY, MONTH, and YEAR, and returns a Julian day-number that corresponds to a calendar date. In order to account for leap years, most algorithms use 13 for January and 14 for February. Therefore, a test is made to see if the value of MONTH is less than 3, and if true, then 12 is added to MONTH. Depending on which method the language uses, the original value of MONTH will either be preserved or destroyed.
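The effect on MONTH can be sketched in Python. Several things here are assumptions made for illustration: the day-number formula is one common textbook formulation for Gregorian dates, not necessarily the one used in any program discussed in this paper; the year is decremented along with the month adjustment because that formulation requires it; and a mutable list is used to imitate pass by address, since Python has no direct equivalent.

```python
def _day_number(day, month, year):
    """Julian day number (at noon) for a Gregorian date; month runs 3..14."""
    a = year // 100
    b = 2 - a + a // 4
    return (int(365.25 * (year + 4716)) + int(30.6001 * (month + 1))
            + day + b - 1524)


def julian_by_value(day, month, year):
    # month and year are local copies; changing them leaves the caller's alone.
    if month < 3:
        month += 12
        year -= 1
    return _day_number(day, month, year)


def julian_by_address(date):
    # date is a list [day, month, year] shared with the caller, so the
    # adjustment below alters the caller's data.
    if date[1] < 3:
        date[1] += 12
        date[2] -= 1
    return _day_number(date[0], date[1], date[2])


day, month, year = 1, 1, 2000
jd_value = julian_by_value(day, month, year)

date = [1, 1, 2000]
jd_address = julian_by_address(date)

print(jd_value, (day, month, year))   # caller's variables are unchanged
print(jd_address, date)               # caller's list now holds month 13
```

Both calls return the same day-number, but only the second one leaves the caller holding a "thirteenth month" of the previous year, which will corrupt any later use of the date.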

Another problem with astronomical algorithms is the use of trigonometric functions. Few languages provide intrinsic operators for SINE, COSINE, and TANGENT, but they often provide these functions in a run-time library. Depending on who wrote the library, and the algorithms they used, a program might experience precision-related errors when ported to another system.

Some systems provide a minimal library of routines. For example, the popular language Turbo Pascal does not provide a TANGENT function, but suggests that you write your own by using the knowledge that TAN(X) equals SIN(X) divided by COS(X). You must also provide your own ARCSINE and ARCCOSINE functions, and they are not as trivial to write as the TANGENT function.
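As a rough illustration of what "write your own" involves, the sketch below builds the missing functions from sine, cosine, arctangent, and square root only. It assumes those four are available in the library at hand; the function names are hypothetical, and the identities used are the standard ones rather than anything taken from a vendor's manual.

```python
import math


def tan(x):
    """TAN(X) = SIN(X) / COS(X); fails where COS(X) is zero."""
    return math.sin(x) / math.cos(x)


def arcsin(x):
    """ARCSIN from ARCTAN, with the end points handled separately."""
    if abs(x) > 1.0:
        raise ValueError("arcsin is undefined outside [-1, 1]")
    if abs(x) == 1.0:
        # The general formula would divide by zero here.
        return math.copysign(math.pi / 2.0, x)
    return math.atan(x / math.sqrt(1.0 - x * x))


def arccos(x):
    """ARCCOS(X) = PI/2 - ARCSIN(X)."""
    return math.pi / 2.0 - arcsin(x)


print(tan(0.5), arcsin(0.5), arccos(0.5))
```

The special cases at plus and minus one are the non-trivial part: a novice who copies only the general formula gets a division by zero precisely at the extreme values.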

Another concern is the problem of numeric representation within the computer. Although floating-point representations are called REAL numbers, the nature of binary coding only allows them to represent a subset of the rational numbers. Irrational numbers such as π or the square root of 2, and even simple fractions such as ⅓, can only be represented as approximations.

Even though we tell the computer

        PI = 3.141592653589793

this may not be the value which will be used for subsequent calculations. This value cannot be represented as a 32-bit (single precision) number. One must take steps to assure that PI will be a 64-bit (double precision) value. Some microcomputers (particularly those based on 8-bit chips such as the 6502 and Z80) do not support this level of precision in the languages available for them.
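The size of the loss can be estimated with a small sketch. It assumes IEEE 754 single and double formats, which not every machine mentioned in this paper uses, so the exact digits on another system may differ; the helper name to_single is hypothetical.

```python
import struct


def to_single(x):
    """Round a 64-bit Python float to the nearest 32-bit IEEE 754 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


pi_double = 3.141592653589793
pi_single = to_single(pi_double)
error = abs(pi_single - pi_double)

print(repr(pi_double))
print(repr(pi_single))
print(error)
```

Under these assumptions the single precision value agrees with the constant to only about seven significant digits.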

There is also the question of how the arithmetic operations are performed. Some computers have special hardware for performing floating-point arithmetic, while others use software routines. The major consideration here is cost; software is cheap but slow, while hardware is fast but expensive. On most well-designed systems, the results are indistinguishable. <Two years after writing this, I upgraded to a 286 processor, which supports a floating-point accelerator chip.>

The binary representation of INTEGER values is often restricted by the hardware architecture. A 32-bit processor can handle positive values in excess of 2 billion, but a 16-bit INTEGER is restricted to just under 33 thousand. Julian day-numbers cannot be represented by 16-bit INTEGER variables. This underscores a problem of porting a program from a large machine to a small machine. There are some processors that use 60-bit INTEGER values, and there is little chance that programs written on them can be successfully ported to personal computers.
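The limits are easy to check. The sketch below assumes two's-complement INTEGER storage, and the helper wrap_int16 is a hypothetical stand-in for what happens when a value is silently truncated to 16 bits; some systems would report an overflow instead.

```python
INT16_MAX = 2**15 - 1   # largest 16-bit signed INTEGER
INT32_MAX = 2**31 - 1   # largest 32-bit signed INTEGER


def wrap_int16(n):
    """Keep only the low 16 bits of n, read as a signed value."""
    return (n + 2**15) % 2**16 - 2**15


jdn = 2446953  # Julian day-number of 6 June 1987

print(INT16_MAX, INT32_MAX)
print(jdn > INT16_MAX, jdn <= INT32_MAX)
print(wrap_int16(jdn))
```

A present-day Julian day-number is roughly 75 times larger than the 16-bit limit, so the truncated result bears no resemblance to the original.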

Method

Since I was asked to discuss the problems of porting astronomical software with reference to a specific program, I shall discuss my experience chronologically. I wanted to simulate the environment of a novice user, working in a vacuum, so I had minimal contact with the author. I apologize in advance to any whom I may offend by my observations, but I did not avail myself of the opportunity to inquire into the background of the contributors.

In my home, I have a DEC Rainbow 100A personal computer. This machine contains two microprocessors, a Z80 and an 8088. It can use either the CP/M or MS-DOS operating systems. I also own a variety of compilers for several languages, in some cases both an 8-bit CP/M version and a 16-bit MS-DOS version.

At work, I have access to a DEC VAX 780 mainframe running the VMS operating system, and a SUN Microsystems workstation, a 68000 based machine running the UNIX operating system. Both machines have compilers for several languages.

My original intention was to recompile the program on as many different machines as I could, document the experience, and compare the accuracy of the output. Given enough time and a small program, I would attempt to re-code it in a different language.

The first problem of porting software is media; you must have a copy of the source in a form that can be read by the target computer. After trying for a week to contact the person who was in possession of the program, I was informed that they were out of the country.

The conference organizers finally sent me what they had received from the other author, which was a magnetic tape and several pages of printout from having executed the program. There was no label on the tape to indicate density, blocking factor, or what encoding scheme (EBCDIC or ASCII) was used to create it. A colleague and I tried several utilities, and were able to determine only that it was written in ASCII, and that it contained FORTRAN source and the same output that was contained in the printout that accompanied the tape.

The conference organizers provided me with the name and telephone number of the other author (Ahmad S. Massasati), and after I contacted him, he was able to provide me with a copy of the program on a 5-1/4 inch floppy disk. He also provided me with a copy of his paper, which contained sample output from the program.

The DEC Rainbow can read single-sided IBM-PC disks, but I was sent a double-sided disk. Using a TI Professional (an MS-DOS personal computer), I copied the data from double-sided to single-sided disks. Because of these various delays and false starts, it was over three weeks from the time I was notified of the acceptance of my paper until I had a readable copy of the program that my paper was to discuss.

Results

The disk is distributed by the Muslim Student Association of the University of Missouri-Rolla. Their address is

MSA-UMR
202 Rolla Bldg.
Rolla, MO 65401

and they ask that you make a contribution to cover their expenses.

The disk contains an executable object named "PR&MOON.EXE", an input data file, several output data files, and several "source" files, which are in fact source listing files. A source listing file is a by-product of compiling a program. It contains page-headings and page-breaks, symbol tables, and line-numbers prefixing each statement. Source listings cannot be used as input to a compiler because of the extraneous data that has been added.

There is no point in porting a non-working program, so my first task was to attempt to run the program on my computer and see if I got the same results as those on the distribution disk.

There was no documentation as to how to operate the program, except a few cryptic references in "batch" files. The input and output data files appeared to be the same that the other author used to illustrate his paper; the calculation of sun rise/set and moon rise/set times for Makka during the month of May, 1986.

After making several back-ups of the disk, I entered the command "PR&MOON", and was rewarded with the first of several screens of introductory material. One of the screens contains the message, "Please DO NOT change the program unless you are ABSOLUTELY sure of what you are doing." Considering the manner in which it is distributed, recreating the program without making changes is a challenge even for a professional like myself.

Finally, I responded to the prompt to begin calculations. Half an hour later, it had calculated sun rise/set and moon rise/set times for Makka during the month of May, 1986. (This would have taken less time if my Rainbow had an 8087 floating-point processor chip, but these chips cost $150.00, and most personal computer owners cannot justify the expense.)

Comparing the output of the program with the output files that were already on the disk showed no difference. The program would apparently work on any MS-DOS machine, and did not rely on any IBM-PC specific hardware or software features. I later confirmed this by executing the program on the aforementioned TI-Professional, which, coincidentally, was equipped with an 8087 chip.

The next step was to attempt to recompile the program on another machine. This constitutes the actual port of the software.

The program consists of three modules containing a total of 2,549 lines of FORTRAN code. This was too much to attempt to re-enter by hand. Since this version was written for the Microsoft FORTRAN compiler, I decided to use the DEC FORTRAN compiler as the target of the port.

For the remainder of this paper, I shall refer to the FORTRAN compilers by their operating systems (MS-DOS and VMS) rather than by their vendors (Microsoft and DEC) because often a vendor will provide compilers for different operating systems. Sometimes these compilers behave the same, but sometimes they were developed by different organizations and are not considered compatible even by the vendor.

Using the KERMIT telecommunications software, I uploaded the source listings from my Rainbow to the VAX, and began the tedious process of editing the listing files to create source files that could be compiled.

A cursory examination of the source revealed several facts. More than one person had worked on the program because there were obvious differences in style; for example, FORMAT statements for output contained a mixture of Hollerith constants and quoted strings. Also, if there were two ways that the same logic could be expressed, both of them were used; for example, the use of a GOTO instead of IF-THEN-ELSE-ENDIF.

Some code had been modified or replaced; the original statements had been commented-out but left in the source. It was also obvious that at least one of the contributors spoke English as a second language; there are several grammatical mistakes, such as the inventive present participle "sightening".

From reading the comments, the history of the program is as follows. Dr. Joseph H. Senne, of the University of Missouri-Rolla, originally wrote the program to calculate the apparent place of stars. This version was made using punched cards, and was written sometime prior to December, 1981. Ahmad S. Massasati added lunar calculations to it in 1983. Finally, M. Kotob ported it (in 1986) from whatever main-frame version of FORTRAN it was written in to MS-DOS FORTRAN-77 for the IBM-PC.

After several hours of work, I was ready to start compiling. The majority of the error messages had to do with FORMAT statements associated with generating output from the program. For example, when prompting the user for input, it is desirable that the cursor remains on the same line as the question (so that the user's answer appears on the same line). In order to do this, the "carriage-return/line-feed" sequence, which normally occurs after each WRITE, must be suppressed. Each flavor of FORTRAN does it a different way. MS-DOS uses the "\" (backslash) character, which VMS considers an illegal character in a FORMAT statement; the VMS equivalent is the "$" (dollar sign) character.

I knew that once I ported the program from MS-DOS to VMS, I would want to bring it back, thus making sure the changes were not incompatible. I noticed that there was no consistency in the use of continuation characters (in column 6), and in fact, the "$" was used for continuation in many sections. Thus, I could not substitute "$" for "\" and expect to re-substitute unless I first changed the "$" used for continuation to some other character. Also, there were many statements using compound expressions that spanned several lines, and the "*" (asterisk) was often used as the continuation character, so I had trouble reading some of the statements. To complicate matters, the "*" is also the multiplication operator.

Wherever a potential for confusion existed, I changed the continuation character to "&" (ampersand). This done, I was able to change the suppression character from "\" to "$". With the next attempt at compilation, the number of error messages had decreased by an order of magnitude.

My next roadblock was a function of not having a source file to begin with. FORTRAN statements are expected to continue to column 72, but the listing file discarded blanks at the end of each line. In several FORMAT statements, the Hollerith constant was used. This is an anachronism in which the length of a character string is explicitly stated, and the following characters are assumed to be part of the character string. When the compiler saw the truncated line, it did not include the "implied" blanks to column 72, but continued counting from after the continuation character on the next line. Thus it assumed that the formatting fields that followed and the terminating ")" (right parenthesis), were part of the character string, and it generated an error message to the effect "END OF FORMAT NOT FOUND."
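The mechanism can be reproduced with a small sketch. The FORMAT statement below is invented for illustration and is not taken from the program; the helper names are hypothetical, and the Hollerith scan is deliberately simplistic (it takes the first count followed by H).

```python
import re


def statement_text(lines, pad):
    """Join columns 7-72 of an initial line and its continuation lines."""
    parts = []
    for line in lines:
        if pad:
            line = line.ljust(72)  # restore the blanks a card image would hold
        parts.append(line[6:72])
    return "".join(parts)


def split_hollerith(text):
    """Return (string, remainder) for the first nH constant in text."""
    match = re.search(r"(\d+)H", text)
    count = int(match.group(1))
    start = match.end()
    return text[start:start + count], text[start + count:]


# An invented two-line FORMAT whose Hollerith string relies on trailing blanks.
listing = [
    "  100 FORMAT(1X,'DATE',5X,'SUN RISE',5X,'SUN SET',5X,16HMOON RISE",
    "     &,2X,I4)",
]

padded_string, padded_rest = split_hollerith(statement_text(listing, pad=True))
cut_string, cut_rest = split_hollerith(statement_text(listing, pad=False))

print(repr(padded_string), ")" in padded_rest)
print(repr(cut_string), ")" in cut_rest)
```

With the blanks restored, the string is the heading followed by blanks and the closing parenthesis is still available. With the truncated line, the count swallows the rest of the statement, parenthesis included. Padding every line of a listing back to 72 columns before compiling is one way to avoid the problem.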

When a similar situation occurred with quoted strings instead of Hollerith constants, the truncation did not cause a compilation error, but it was noticeable in the output because the spaces were used to properly place column headings, and the misalignments were obvious.

When all of the errors that prevented successful compilation had been corrected, there remained the warnings; these are non-fatal errors that do not interfere with correct execution of the program.

There were frequent warnings of the type "THIS STATEMENT CANNOT BE REACHED". The cause was extraneous RETURN statements placed at the end of subroutines, but preceded by a GOTO. The subroutines were intended to prompt the user, validate their input, and repeat the prompt as long as the response is invalid. The actual exit from the routine is a RETURN statement inside of the loop, and the MS-DOS compiler failed to warn the programmer that the statement after the GOTO would never be executed.

The solution is to either delete the extra RETURN statement, or put a label on it. The latter is a "trick", because even though there is no explicit reference to the statement, the compiler no longer considers it unreachable. On the other hand, a smarter compiler would generate an "UNUSED LABEL" warning, so the proper solution is to remove the statement.

The next class of warnings had to do with not specifying file status on OPEN statements. There was confusion as to whether a previously created file was to be used, or a new file should be created. The VMS FORTRAN was smart enough to look first and decide what to do, which is what the MS-DOS version did by default.

These warnings underscore bad programming practices that were tolerated by the MS-DOS compiler. They appear to be related to portions added during the port from the University of Missouri computer to the IBM-PC, and I'm sure that the original compiler would have flagged them.

Finally, I was able to link the compiled objects together and had a working program. When I executed the VMS version, some of the formatting of screen output was different, but that had to do with differences in how the newline suppression works. The output data for sun rise/set and moon rise/set in Makka for May, 1986, appears the same except for some minor cosmetic differences; the VMS version outputs leading zeroes under certain conditions.

While the VMS version of the program was executing, there were several run-time errors caused by an anomaly that I do not completely understand. For the lunar tables, whenever an event occurs near midnight, or near celestial longitude zero degrees, the program substitutes a very large number for the value. Apparently, this is done to prevent "ARITHMETIC OVERFLOW" errors (such as an attempt to divide by zero). The number contains too many digits to fit in the field allocated for it in the output, so "***" (three asterisks) is printed instead of the number. The footnotes on each page of output say that "*** INDICATES NO PHENOMENON", but there is no explanation what that means.

The MS-DOS version does not care that the number will not fit in the field; it prints the asterisks without complaining. The VMS version produces a very complete "trace back" that clearly identifies the output statement that attempted to print an out-of-range number. This is another example of the kind of error detection that is often overlooked by personal computer implementations of languages.
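The asterisk behavior itself is simple to imitate. The sketch below mimics an integer output field of fixed width; the program's actual edit descriptors and its substitute value are not given here, so the width of 3 and the value 99999 are purely hypothetical.

```python
def integer_field(value, width):
    """Right-justify value in width columns, or fill the field with asterisks."""
    digits = str(value)
    if len(digits) > width:
        return "*" * width
    return digits.rjust(width)


print(repr(integer_field(42, 3)))      # fits in the field
print(repr(integer_field(99999, 3)))   # too wide, so asterisks are printed
```

A program that relies on this overflow to print its "no phenomenon" marker is depending on a side effect, which one run-time system may perform silently and another may report as an error.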

My next test was to generate data for Canton, Michigan (my place of residence) for the month of May, 1987, and compare the results of the MS-DOS and VMS versions against each other, and against a different program that is also distributed by MSA-UMR.

The output of the MS-DOS version matched the other program within tolerable limits. However, the VMS version demonstrated a curious anomaly which I have not yet isolated.

Whatever date I entered for the start of calculations, the VMS version insisted on adding a day. If I entered May 1st, then it echoed May 2nd. When I entered May 15th, it echoed May 16th. Since I wanted the calculations to start from the 1st of the month, I thought to myself, "Why not enter April 30th, and it will think that I mean May 1st?" Much to my surprise, it echoed April 31st! Nonetheless, when the program executed, the output started from May 1st.

As a sanity check, I decided to download the VMS FORTRAN source back to the Rainbow and recompile it with the MS-DOS compiler. The only change required was the aforementioned substitution of "\" for "$". The new MS-DOS version did not manifest the "add one day" phenomenon.

There is still a third FORTRAN compiler at my disposal, a 68000-based UNIX version. Although I have not completed the port from VMS to UNIX, there are fewer error messages, and I should be able to report the results at the conference.

Commentary

As I stated in the introduction, I am compelled to comment on the rationality of the effort we are gathered here to discuss. The computer is a wonderful tool that can help us to fulfill our obligations as Muslims, but like any tool, it can be harmful if used improperly. Science does not relieve us of our responsibility to use good judgment and common sense.

Members of the Muslim Community Association of Ann Arbor and Vicinity have often asked me, "Just how accurate are these computer calculations?" My answer always surprises them: "More accurate than your watch!" I then ask them to compare the time on their wristwatch with the time on mine, and that of anyone else who happens to be within arm's reach. Between three people, there is usually at least a 45 second spread.

The best of calculations are of no use if your timepiece is not correct.

I next remind them that Ann Arbor is only a 45-minute drive from Detroit, but the calculated prayer times are 3 minutes different because of the difference in longitude. I myself live 20 minutes away from the Ann Arbor mosque, so I consulted a U.S. Department of Interior Geological Survey map to determine the latitude and longitude of my house, and calculate times to begin and break my fast when at home. I do this even though I know that each clock in the house is a little off from every other one.
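The three-minute figure follows from the rotation of the Earth: 360 degrees in 24 hours, or 4 minutes of time per degree of longitude. The sketch below uses approximate longitudes for the two cities; these coordinates are my own rounded assumptions for illustration and are not taken from any program output.

```python
MINUTES_PER_DEGREE = 24 * 60 / 360  # the Earth turns 360 degrees in 24 hours


def time_offset_minutes(west_longitude_a, west_longitude_b):
    """Difference in local solar time between two longitudes, in minutes."""
    return abs(west_longitude_a - west_longitude_b) * MINUTES_PER_DEGREE


ann_arbor = 83.74  # degrees west, approximate (assumed)
detroit = 83.05    # degrees west, approximate (assumed)

offset = time_offset_minutes(ann_arbor, detroit)
print(round(offset, 2))
```

With these values the difference comes to a little under three minutes, in line with the calculated prayer times.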

So the question of accuracy, while important, is not critical. To calculate the time of the new moon to the fraction of a second is a waste of time. Knowing if the moon sets 5 minutes or 6 minutes after sunset from a given location is not important, especially if you cannot be sure if you are standing at the coordinates used for the calculations, or 10 miles to the East or West.

Recently, a 289-line BASIC program was published in an astronomy magazine. It predicts both lunar and solar eclipses. By definition, it is capable of calculating the time of the new moon for each month, and it does not waste time calculating the moon's position on the 26 intervening days. It is sufficient for the purpose of lunar calendar calculations because it is accurate to within 1 minute, which is more accurate than I can say that my watch is on any given day.

Summary

Any computer program written in a high-level language can be, with varying degrees of effort, made to work on a computer other than the kind on which it was originally written. The effort that the software's author expends to ensure the source will compile with at least one other compiler will minimize the effort required for subsequent ports.

The computer program that I examined, which is distributed by the Muslim Students Association of the University of Missouri-Rolla, is less than optimal for the purpose of determining the start of Islamic months, because it requires a human to compare the data in two distinct tables; this is a function which could have been performed by the program, and the results presented in a single table.

The program is tedious to use (there are many screens of introductory material that could have been included as part of a user manual and kept as a file on the disk) and needlessly accurate. It runs very quickly when there is hardware assistance for the arithmetic computation; otherwise, one must wait a considerable time for results. It is poorly documented, and the user is not warned that regardless of their geographic latitude and longitude, the program assumes that they are at sea level.

Porting this particular program from a personal computer to a mainframe required approximately three man-days of effort, and was 99.5% successful; perhaps the remaining anomalies can be corrected by the time this paper is presented.

While FORTRAN is a language that is well suited to the kind of number-crunching required for lunar calendar software, it is also a language that suffers from thirty years of evolution, and the plethora of compilers for it is rivaled only by Pascal. However, I hesitate to recommend any language as being more appropriate, because of a comment made by Seymour Cray, founder and CEO of Cray Computers:

"I do not know what language will be used to program computers in the 21st century, but whatever it is, it will be called FORTRAN."
