21. Down in the dumps
Before I get into dumps, let’s review what happens when we run a program. The program is somewhere in the computer and each statement or instruction as well as data structures and the data itself could be in the machine. Much of this can be found in an area called working storage, and it is nothing more than a temporary place for a program as it gets executed, or run. If there is a problem as the program runs, we will have the luxury of looking at these areas of working storage to see what the problem is. As you know, a program that has difficulty running is said to abend and the result is a dump, or more specifically a picture of working storage. But it is not a pretty picture.
A dump shows instructions in our program and data in almost unreadable format. After all, the computer processes programs that we wrote as object code or machine language, which is foreign to us. The result is a dump in hexadecimal format, which happens to be base 16. We will need to be able to count in that base in order to read a dump. Fortunately with all the tools and advances in information technology, we really won’t need to worry about reading dumps on a serious level. If you own a PC and sometimes run into unexplainable problems (if you don’t have crashes you are probably from another planet), the result on your screen is a meaningless message, or perhaps nothing is happening. A restart of the PC will remedy that.
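To give a feel for what that hexadecimal picture looks like, here is a minimal sketch in Python, which is not the language used in this book. The function name hex_dump and the sample bytes are my own inventions for illustration, and the sketch assumes ASCII characters, whereas a mainframe dump would more likely use a different character encoding.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as: hex offset, hex byte pairs, printable characters."""
    pad = width * 3 - 1  # room for `width` two-digit pairs plus separators
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show a dot for anything that is not a printable ASCII character.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:05X}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


# A made-up 16-byte slice of working storage.
print(hex_dump(b"ACCOUNT 000123  "))
```

Each output line starts with the position in storage, followed by the contents in hexadecimal and then the same bytes as characters. Even in this tidy form you can see why nobody reads a dump for pleasure.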
The early days of computing forced people to know hexadecimal and how to read dumps. I’ll talk about specifics regarding other base systems in the next chapter, but for now I will summarize what a dump involves. To begin with, you could get a dump if you tried to run a program and that program couldn’t be found on the computer. Maybe you put the executable program in library B but the computer was looking in library A, so it wasn’t found. The solution is either to put the program in library A or to point to library B when you run the program. In the program-not-found scenario we might get a dump along with some kind of system code, which right away tells us that the program was not in the library where it should have been. Thus the dump was not really needed.
You could also get a dump if you tried to create a file but didn’t allow enough space for the records in the file. Once again you would see a familiar system code along with the redundant and unnecessary dump. After getting the same system code, you would easily recognize that your file needed more space. A similar situation would occur for other little problems and in most cases, the system code would tell you all you needed to know without any need for that hexadecimal junk.
However, back in the old days there may have been other times when you had bad data and you got a dump along with a certain system code. You might recognize that the system code indicates bad data but it wouldn’t tell you which record and what field caused the problem. On that occasion you had to dig into the dreaded dump. There was a specific code that warned you of an attempt to divide by zero, which is not allowed since it can’t be done. If you ran into this system code, you could just search your program for the division operation since it was rare in programs and you would have a fairly good idea of your problem. Bad data other than that was another story altogether.
You could actually figure out what was wrong without reading the dump in many cases. Of course you would need to know at about what statement your program abended. If you could figure that it was one of two or three lines, you could see where the difficulty lay. The statement that was the problem was more or less spelled out to you as an interrupt, that is, there would be some statement to the effect
INTERRUPT AT 003A6
but you had to interpret which line that represented in your program. The way to do this involves looking at the listing of your program, which has numbers somewhere relating this number to some line of your program. That is to say, this strange number
003A6
actually points to the statement in your program where the interrupt or problem occurred. The listing that correlates your program statements to another set of numbers is called a PMAP, which stands for procedure map and it ties the statement number to a number in storage.
Of course you could look at the PMAP and not find the
003A6
anywhere but you saw on two successive lines
003A2
003B4
and this would indicate that the statement corresponding to
003A6
was either of the two lines above. This is so because our strange number is a hexadecimal number that falls between 003A2 and 003B4. I don’t expect you to understand that yet, since you probably don’t know how to count in base 16, so take it on faith for now. Since our interrupt is between these two numbers, one of the statements corresponding to them is what caused our program to abend.
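If you would rather not take it on faith, the lookup can be sketched in a few lines of Python, which is not the language used in this book. Only the numbers 003A2, 003B4 and 003A6 come from the discussion above. The other PMAP entries, the statement numbers and the function name candidate_statements are made up for illustration, and a real PMAP will be laid out differently.

```python
from bisect import bisect_right

# Hypothetical PMAP excerpt: (storage offset in hex, statement number).
# Only 003A2 and 003B4 come from the text; the rest is invented.
PMAP = [
    ("00390", 40),
    ("003A2", 41),
    ("003B4", 42),
    ("003C8", 43),
]


def candidate_statements(interrupt_hex, pmap):
    """Return the statement numbers whose offsets bracket the interrupt."""
    offsets = [int(offset, 16) for offset, _ in pmap]  # base 16 to integer
    target = int(interrupt_hex, 16)
    pos = bisect_right(offsets, target)  # count of offsets <= target
    if pos > 0 and offsets[pos - 1] == target:
        return [pmap[pos - 1][1]]  # the interrupt is listed exactly
    low = max(pos - 1, 0)
    return [statement for _, statement in pmap[low:pos + 1]]


# In decimal, 003A2 is 930, 003A6 is 934 and 003B4 is 948.
print(int("003A2", 16), int("003A6", 16), int("003B4", 16))
print(candidate_statements("003A6", PMAP))
```

Converted to decimal, 934 sits between 930 and 948, so the sketch hands back the two neighboring statements, which are the same two lines you would have circled on the listing.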
The complicated procedure above is how I used to track down abends in programs, provided I had a PMAP. I would then look at the statement or two and since only a variable or two was involved, I quickly got to the root of the problem. In addition, the dump would actually show you what was in the variables but you had to find that area of working storage in the dump and it may not have always been easy.
Today we don’t actually need to worry about this technique as some software tool might be very specific in pointing out not only the statement where the problem exists but also the field that’s in error. All you have to do is look at it and proceed from there. If the variable in question wasn’t specifically spelled out, you could look at the troublesome statement and check each variable on that line. If a variable was supposed to be numeric you might easily discover that it had spaces in it and that was the cause of the abend.
Whenever I worked with testing programs, I always felt there were only two possibilities. Either I wrote the program from scratch and so I was very familiar with the goings on of the process, or I made a small change to an existing program. If the program was new and abended, I would probably know without too much trouble what was wrong since I had a good grasp of all that was happening. If the program wasn’t new and involved a change, there was a high probability that the abend occurred at the line that was modified or something related to the change. That thinking usually worked for me.
There are certain abends that can’t be avoided but some should never occur. A zero divide is one and it can be avoided by doing a check before the actual division takes place. If the divisor is zero, don’t perform the operation. As far as bad data goes, do as much as possible to minimize these interruptions. Data comes from input or it can be generated by the system. If the latter, that is, if some program generates the data, you should always have good data. If not, make changes to the programs generating the data so that the data is always valid.
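The zero divide check is simple enough to sketch. This version is in Python rather than the language used in this book, and the function name safe_divide is hypothetical. Returning None when the divisor is zero is just one choice; a real program might substitute a default value or write a message to an error report instead.

```python
def safe_divide(dividend, divisor):
    """Divide only when the divisor is not zero; otherwise return None."""
    if divisor == 0:
        return None  # skip the operation instead of abending
    return dividend / divisor


print(safe_divide(10, 4))
print(safe_divide(10, 0))
```

The caller then has to decide what a missing result means, but that decision is made in the program and not at two in the morning with a dump in hand.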
On the other hand, if the data is keyed into the system, you have less control but you can always do preliminary checking and reject invalid data, especially fields that should be numeric and aren’t. Even if the data is coming from an outside source, you can put in similar verifications before the actual processing. This will mean fewer if any abends due to bad data. Not long ago I had just this situation where a file was originating from outside and we had a program that did initial checking to make sure the fields were what they were supposed to be. If they weren’t, the next program that actually processed the data just didn’t run.
That last action may have been rather drastic but maybe it had to be done. Another alternative might be to skip the record in error and go on to the next good record. This approach will still prevent the possible abend and you’d probably want to put some kind of message on an error report for the record in error. Someone would then see it and take appropriate action.
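The skip-and-report approach might look something like the following sketch, written in Python rather than the language used in this book. The field names, the sample records and the function names is_numeric and screen_records are all assumptions made for illustration.

```python
def is_numeric(field: str) -> bool:
    """True only when the field is non-empty and holds nothing but 0-9."""
    return field.isascii() and field.isdigit()


def screen_records(records, numeric_fields):
    """Split records into good ones and error-report lines for bad ones."""
    good, error_report = [], []
    for record_number, record in enumerate(records, start=1):
        bad = [name for name in numeric_fields if not is_numeric(record[name])]
        if bad:
            # Skip the record but leave a trail for someone to follow up on.
            error_report.append(
                f"record {record_number}: non-numeric {', '.join(bad)}"
            )
        else:
            good.append(record)
    return good, error_report


# Made-up input: the second record has spaces where a number belongs.
RECORDS = [
    {"account": "000123456", "zip": "14201"},
    {"account": "         ", "zip": "14201"},
]

good, report = screen_records(RECORDS, ["account", "zip"])
print(len(good), "good record(s)")
print(report)
```

Only the good records go on to the processing program, and the report gives someone a record number and a field name to chase.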
Other problems, like I/O errors, could result from a tape drive that merely needed cleaning. Space problems and file-not-found errors can be minimized by paying close attention to what should be happening. Obviously you can’t prevent all problems, but taking care of the majority so that they almost never occur will keep everyone happier. A bit of thought and analysis ahead of time will make weekends and evenings virtually trouble-free.
Reading dumps and determining what each system code means will be a part of your learning process when you begin a new job. Fortunately there should be documentation to aid you in these concerns and your fellow workers will be more than happy to guide you in learning as much as you need to know. If your compatriots aren’t helpful, make a batch of brownies with Ex-Lax, remembering not to indulge. Each company will have different problems as well as different tools to solve them and in a matter of time you will be comfortable in these environments.
Let’s get back to the method I suggested before for obtaining zip code data. We’ll use something similar to an array, called a table. For this new program,
accttable,
we’ll steal code from
acctsly
of chapter 13.
program-name: accttable
define file acctfile record account-record status acct-status structure
account-number integer(9)
last-name character(18)
field character(58)
zip-code integer(5)
field character(10)
define table ziptable source prod.copylib(ziptable) record zip-record key zip found t-sub structure
zip integer(5)
city character(15)
state character(2)
define error-msg character(60) value spaces
define list-city character(15) value spaces
define list-state character(2) value spaces
screen erase
screen(1,23) “account number inquiry with table”
screen(4,20) “account number:”
screen(6,20) “last name:”
screen(8,20) “city:”
screen(10,20) “state:”
screen(12,20) “zip code:”
screen(22,20) “to exit, enter 0 for the account number”
input-number: input(4,36) account-number
screen(24,1) erase
if account-number = 0
    go to end-program
end-if
read acctfile
if acct-status = 0
    perform get-city-state
    screen(4,36) account-number
    screen(6,36) last-name
    screen(8,36) list-city
    screen(10,36) list-state
    screen(12,36) zip-code
else
    if acct-status = 5
        screen(24,20) “the account number ” account-number “ is not on the file”
    else
        error-msg = “account file read problem; program ending – press enter”
        go to end-program
    end-if
end-if
go to input-number
get-city-state: search ziptable key = zip-code
if t-sub = 0
    screen(24,20) “that zip code ” zip-code “ is not on the file”
    list-city = spaces
    list-state = spaces
else
    list-city = city(t-sub)
    list-state = state(t-sub)
end-if
end-program: screen(24,1) erase screen(24,20) error-msg input
end
You’ll notice some similarity to what we had in the arrays earlier. New keywords are
table,
source,
and
found.
In the lines,
define table ziptable source prod.copylib(ziptable) record zip-record key zip found t-sub structure
zip integer(5)
city character(15)
state character(2)
the first keyword tells the system a few things:
ziptable
is both a copy member in the production copy library, PROD.COPYLIB (the second occurrence), as well as a table (the first), which enables us to go through it with a search, another keyword. The next keyword,
source
points to the proper library and the right member in it,
ziptable.
The keyword
found
magically places into the field,
t-sub
the number of the row in the table that matches the zip code from the account number file. This gives us the corresponding city and state for displaying on the screen. If there is no match,
t-sub
will be 0. Note that that variable is not defined but based on the size of the table it will be an appropriately sized integer. This occurs because of the combination of the keywords
search
and
found
in the definition of the table. This all takes place because the table is dynamic, meaning it’s current. From the definition of the table, all the values in the copylib member will be available to us. The size of the table results from the copy member.
The statement,
get-city-state: search ziptable key = zip-code
means we don’t have to read the zip code file anymore. There are fewer lines of code in the program and it may run faster than Sly’s program, despite his name. You can see that the
search
keyword is quite powerful as it looks for a match and because of the way the table is defined,
t-sub
is the row number of the match, giving us the city and state corresponding to the zip code. There shouldn’t be any situation where there isn’t a match, because tech support updates the zip code table directly from the zip code file we used before. We’ll display an error message anyway if that does happen.
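For readers who know a mainstream language, the behavior of the search can be imitated with a short Python sketch. It is only my approximation of what the keywords do. The sample rows are a tiny stand-in for the real copy member, and the function names search and get_city_state are hypothetical. Rows are numbered from 1 so that 0 can mean no match, just as with t-sub.

```python
# A tiny stand-in for the zip code table: (zip, city, state).
ZIP_TABLE = [
    (14201, "buffalo", "ny"),
    (60601, "chicago", "il"),
    (90210, "beverly hills", "ca"),
]


def search(table, zip_code):
    """Return the 1-based row number of the matching zip, or 0 if none."""
    for row_number, (zip_value, _city, _state) in enumerate(table, start=1):
        if zip_value == zip_code:
            return row_number
    return 0


def get_city_state(table, zip_code):
    """Mimic get-city-state: blank city and state when there is no match."""
    t_sub = search(table, zip_code)
    if t_sub == 0:
        return "", ""
    _zip, city, state = table[t_sub - 1]  # back to Python's 0-based index
    return city, state


print(search(ZIP_TABLE, 60601))
print(get_city_state(ZIP_TABLE, 60601))
print(get_city_state(ZIP_TABLE, 99999))
```

The sketch walks the rows one at a time, which is fine for a handful of entries. How the search keyword actually finds its match is up to the system, and the point of the table is that we don’t have to care.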