23. Sorting bubbles
We won’t actually be sorting bubbles here, but instead use what is referred to as a bubble sort. One common procedure in computer systems is the sorting of files of data. We might need a report sorted by account number using a particular file and later desire another report from the same file but sorted by a different field, such as last name. Many systems will use the same file but sort it in a different order and produce different reports for different people in an organization. There are also other times when it is advantageous to sort a file before doing any processing.
Fortunately it is not difficult to sort a file of data. In some systems you can merely go into edit mode in the file and enter a simple command and your file will be in a different order. Another possibility is to have some kind of job control language that will result in the file being sorted. We’ll get into those methods later. For now suppose we had a file that was in account number order, but needed to be sorted by zip code. Assume also that we didn’t have any way of sorting the file but had to rely on some technique that we ourselves developed in a computer program. Simply put, we had to sort the file ourselves.
The way we could do this would be by reading the entire file into a table of data and then use some logic to sort the table. Once done, we would write out the table to a new file, which would be in the sorted order we desired, in this case by zip code. Our report could then come from that new file. Another option would be to take the sorted table of data and process it as the file we really wanted in our desired sort order to produce the report.
Here we’ll read the account number file and move the records to a table, sort the table and then create the sorted file from the sorted table. I’ll leave it up to you to produce the needed report. Hint: just run the new file through the program
acctlist.
For simplicity sake we shall assume that our file has no more than 100 records, although we don’t know the exact count. The process to do all this – minus the processing for the report – is accomplished by the following statements:
program-name: bubblesort
define work-array character(10000) element c-element character(100) occurs 100 times
define i-sub integer(3)
define t-sub integer(3)
define process-sw integer
define record-count integer(3)
define file acctfile record account-record status acct-status structure
field character(100)
define file newfile record new-record status new-status structure
field character(100)
define error-msg character(60) value spaces
define hold-line character(100)
process-sw = 0
i-sub = 0
work-array = spaces
account-number = 9
perform load-array until process-sw > 0
perform sort-array varying i-sub from 1 by 1 until i-sub = record-count - 1
perform write-file varying i-sub from 1 by 1 until i-sub = record-count
go to end-program
load-array: i-sub = i-sub + 1
readnext acctfile
if acct-status = 0 then
c-element (i-sub) = account-record
else
if acct-status = 9
process-sw = 1
record-count = i-sub - 1
else
error-msg = “problem reading the file; program aborting – press enter”
go to end-program
end-if
end-if
sort-array: process-sw = 0
t-sub = i-sub
perform match-ssn until process-sw = 1
match-ssn: t-sub = t-sub + 1
if c-element (i-sub)(86:5) > c-element(t-sub)(86:5)
hold-line = c-element (i-sub)
c-element (i-sub) = c-element (t-sub)
c-element (t-sub) = hold-line
end-if
if t-sub = record-count
process-sw = 1
end-if
write-file: new-record = c-element(i-sub)
write newfile
if new-status > 0
error-msg = “problem writing the file; program aborting – press enter”
go to end-program
end-if
end-program: screen(24,20) error-msg input
end
The paragraph
load-array
should be recognizable as logic we’ve done before, just reading in a file of data and loading it to an array. We accomplish the read of the first record because
account-number
was initially set to
9.
Assuming the read is without fail, the first account record is moved to the first element of our array. The variable
i-sub
will be 1 to accomplish this but when we read the second record, it will be 2 and thus we will move that record to the second element of our array. This will continue until we reach the end of the file, which is found when
acct-status
is equal to 9. The lines
if acct-status = 9
process-sw = 1
record-count = i-sub - 1
else
error-msg = “problem reading the file; program aborting – press enter”
go to end-program
end-if
take care of that and note that we set
process-sw
to 1 so that our perform of the paragraph will end. One thing we don’t want is the program to keep looping and never end. We also set the variable
record-count
to 1 less than
i-sub.
We have to subtract that 1 because the variable
i-sub
at this point is 1 more than the number of records in the file since it will be in this paragraph once more after the last record is read. We will use
record-count
later as we sort the table and also when we write out the records to a new file from our sorted table. The final
else
will result if
acct-status
is neither 0 nor 9 and that means that there is something drastically wrong with our file. This shouldn’t happen but we put this code into our program just in case since there will be times when problems can occur that are out of our control. That takes care of the reading of our file and loading to the array.
We will come back to the actual sort in a while but next let us consider
write-file: new-record = c-element(i-sub)
write newfile
if new-status > 0
error-msg = “problem writing the file; program aborting – press enter”
go to end-program
end-if
which will take the sorted array and write it out to a new file,
newfile
one record at a time. This process can be described as a sequential write, since one record after another is being written to the output file. It’s a bit different from our earlier writes. The perform is done by varying
i-sub
from 1 by 1 until it is equal to
record-count.
Thus
i-sub
will start out as 1 and the first element of the table will be written to the file. This process will continue until all the records are written to the file. Thus the loop will continue and if
record-count
were 8, the paragraph would been completed for
i-sub
equal to 8 and that would end the
perform
since
i-sub
would equal
record-count.
There is one note of caution that I must add. Some systems would not write out the last record using our specific logic. We would have to say
perform write-file varying i-sub from 1 by 1 until i-sub > record-count
in the perform statement. That depends on when the variable
i-sub
is checked against
record-count.
In our system it is done after the last line of the paragraph and so we did write out the last record. However, other systems might check before the paragraph starts. You can see that if that is the case, the very last record would not be written since the comparison of
i-sub
to
record-count
would result in an equal condition and the perform would be done without writing out the last record. This is a small point but something that you have to consider. If you write out records to a file and somehow the last record is missing, this could be your problem.
If there is a problem with the
write
of the record, the
new-status
will have a value of something other than 0. We check this field to make sure that the write didn’t have a problem, which it really shouldn’t. However, by doing a check on this status field, if a problem does occur, we have a message to tell us that somehow there was a problem. If we omitted the check and had an abend, someone would have to research the results. By our process we know right away that we had a problem, saving time and frustration.
With the read of the input file and the load of the array, we can now concentrate on the sort. The sort here is referred to as a bubble sort. Just as bubbles float upward, we shall bring the smallest element to the top and the largest will wind up at the bottom. In this case, we’ll look at the zip code and wind up with the smallest zip code at the top and the largest zip code at the bottom. All the zip codes will be sorted the way we expect them to be. For sorting on this field, we’ll start with the first one and compare it to the second. For example, if the first zip code is the same or less than the second, we’ll do nothing. Had it been reversed, we would have swapped the entire first record of data with the data of the second. Note that we are not swapping only zip codes. In either case, the record with the smallest zip code will now be in the first position in the table. We will now repeat the process comparing the zip codes of the first record with the third one. After this compare and swap – if necessary – the record corresponding to the smaller zip code will be in the first position of the table.
We now need to repeat this compare process using the first and fourth records, then the first and the fifth ones and so on until our last compare will be between the first record’s zip code and the last record’s zip code. This checking is occurring within our array. When we finish this first step of the process, we will without any doubt have the record with the smallest zip code in the first position of the array.
We’ll now go on and repeat the same process but this time we will start with the second record and the third – comparing zip codes – then the second and the fourth and so on just as before. Whenever we have to, we will swap rows of the array just as we did earlier – actually swapping records. When we finish this pass and wind up comparing th