Computers for Smart People by Robert S. Swiatek


3. File makeup

 

Before proceeding with a simple computer program, let us look at how data is organized. All information is stored in files or databases, which strictly speaking are one and the same. A file consists of various elements or records. Thus a personnel file will have records that match individuals. Each record consists of fields or variables. Our personnel file might have records that include some identification number such as a social security number or the like, name, address, city, state, zip code, telephone and date of birth. There may be other fields as well.

Each field is a variable, which has a value, and each individual field has some kind of limit. The identification number might be limited to nine numeric digits and nothing else. It cannot be all zeros or all nines and there could be further restrictions. The name will be limited to letters of the alphabet – upper and lower case – the period, apostrophe and hyphen. I don’t know many people who have a name with $, %, a number or @ in it, so I think our restriction is valid. The hyphen accommodates women who marry and want to keep their maiden name as well, and the apostrophe accommodates an Irish name like O’Brien. Granted, there are taxi drivers in New York City who have the letter O with a slash through it in their name, but we won’t concern ourselves with that possibility.

Other fields will have different restrictions. Zip code can be one of a few formats, such as five digits, nine digits or alternating digits and letters to accommodate our neighbors north of the border. Dates have to be in a specific format. They are normally all numeric, but all spaces could also be acceptable, as could an entry of all zeros. Either would accommodate a date to be entered later. Our language will require all dates to be in yyyymmdd format, that is, four digits for the year and two each for the month and day. If the date is neither zeros nor spaces, YYYY, MM and DD have to be such that their combination is a valid one. MM = 02 with DD = 30 would be unacceptable since February 30th is not a valid date. Later we will develop a date check to handle this.
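The date check for this book comes later and will be written in the language we are building. In the meantime, here is a minimal sketch of the same rules in Python, which is not the language of this book. The function name is_valid_date is made up for illustration, and I am assuming the field is always exactly eight characters long.

```python
from datetime import datetime


def is_valid_date(field: str) -> bool:
    """Check a yyyymmdd field; all zeros or all spaces mean 'to be entered later'."""
    if len(field) != 8:                      # assumption: fixed eight-character field
        return False
    if field == "00000000" or field == " " * 8:
        return True                          # placeholder for a date not yet known
    if not (field.isascii() and field.isdigit()):
        return False                         # anything else must be all digits
    try:
        # strptime rejects impossible combinations such as February 30th
        datetime.strptime(field, "%Y%m%d")
    except ValueError:
        return False
    return True
```

With this sketch, "20240229" passes because 2024 is a leap year, while "20240230" fails because there is no February 30th.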

Other fields will have restrictions as well. The state has to be a valid two-character combination, which represents one of the fifty states. City can be no more than fifteen characters and these can only be letters of the alphabet, the hyphen, the period and a space. Amount fields will always be numeric and some, such as a bank balance, can be negative. This is handled by including a sign in the field. Amount fields such as current balance also have decimals in them, so that must be taken care of as well. There will be no need to put the decimal point into any file just as we don’t need to include a dollar sign for a withdrawal or deposit. Since we are talking about money, the $ is assumed.

Having delved into the structure of a file, you can probably see that the makeup is not unlike the book we talked about in the English language. Each has basic elements that make up words or fields. These pieces in turn then get grouped together to form sentences or records. English then combines the sentences to get a book while the combination of our data records makes a file. In each case there are rules that need to be followed. If we fail to follow the rules for either, there will be problems.

The file that we want to consider is a file for checking at the bank. For now it will consist of just a few fields: account number, last name, first name, middle initial, street address, city, state, zip code and balance. Because of identity theft, using someone’s social security number as the account number is not a good idea. In some cases, the computer will generate an account number – and even let the customer know what it is. In our system, the account number will be a nine-digit field whose value is greater than nine.

Both the first and last names must consist of letters of the alphabet, the space, apostrophe, period and hyphen only. This accommodates Billy Bob Thornton, Tom O’Brien, Jill St. John and Olivia Newton-John. The first name is limited to fifteen characters while the last name is restricted to eighteen. That should be enough characters. The middle initial must be A through Z, but it can also be left blank. The street address is limited to twenty-five characters and has the same restrictions as the name, except numbers are also allowed as well as the comma. If you live at 10 ½ Main Street, good luck. City must be no more than fifteen characters and these must consist only of letters of the alphabet, the period, space and hyphen.
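To make these rules concrete, here is a rough sketch of them as Python patterns. Python is not the language of this book, and every name below (FIRST_NAME, is_valid and so on) is invented for the example. I am also assuming three things the text does not settle: the apostrophe is the plain keyboard one, a name or address or city needs at least one character, and the middle initial is upper case.

```python
import re

# Character sets and maximum lengths follow the rules described above.
# Assumptions: ASCII apostrophe, at least one character, upper-case initial.
FIRST_NAME = re.compile(r"[A-Za-z '.\-]{1,15}")
LAST_NAME = re.compile(r"[A-Za-z '.\-]{1,18}")
MIDDLE_INITIAL = re.compile(r"[A-Z ]?")            # one letter, a space, or nothing
STREET_ADDRESS = re.compile(r"[A-Za-z0-9 '.,\-]{1,25}")
CITY = re.compile(r"[A-Za-z .\-]{1,15}")


def is_valid(pattern, value: str) -> bool:
    """Return True only if the entire value fits the pattern."""
    return pattern.fullmatch(value) is not None
```

Under these patterns, O'Brien, St. John and Newton-John are all acceptable last names, while an address of 10 1/2 Main Street is turned away because of the slash.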

The state must be exactly two characters and it must be the valid abbreviation for one of the fifty states. The zip code must be a five-digit numeric field. The balance will be a signed numeric field having eight digits, six to the left of the decimal point and two to the right. If you have a balance of over $999,999, it shouldn’t be in a checking account. In fact this bank may even be more restrictive and caring about the customer – that could happen – as large balances might result in letters being sent out notifying customers that they may want to consider a certificate of deposit or the idea of buying stock.
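Since the decimal point is never stored, a program has to put it back when it reads the balance. The following Python sketch shows one way that might look. It is not the book's language, the function names are made up, and it assumes the sign is a leading + or - character in front of the eight digits, which the text does not spell out.

```python
from decimal import Decimal


def parse_balance(field: str) -> Decimal:
    """Turn a stored balance such as '+00012345' into 123.45."""
    # Assumption: one sign character followed by exactly eight digits.
    digits = field[1:]
    if len(field) != 9 or field[0] not in "+-" or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a signed eight-digit amount: {field!r}")
    cents = int(digits)
    if field[0] == "-":
        cents = -cents
    return Decimal(cents) / Decimal(100)     # the last two digits are the cents


def format_balance(amount: Decimal) -> str:
    """Turn 123.45 back into '+00012345' with no decimal point."""
    cents = int((amount * 100).to_integral_value())   # rounds any extra decimals
    if abs(cents) > 99_999_999:
        raise ValueError("balance does not fit in six digits plus two decimals")
    sign = "-" if cents < 0 else "+"
    return f"{sign}{abs(cents):08d}"
```

The largest balance this layout can hold is 999,999.99, which agrees with the limit mentioned above.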

Our file is the account file and if I want to read it in a program, I will specify the variable

            acctfile

that represents a file which the program can read. How this is done will be shown when we get to the program. For now we need to worry about the fields that make up the file. We have to spell out the names of the fields, their sizes, where they are in the record and what type each field is. To save space one field will follow the other so we’ll define a structure, which will refer to the file, the record and each field.

We’ll define a file and its composition so that we know the makeup of a typical record. That way, we’ll know where each field should be. We certainly don’t want the first record to have the account number at the beginning followed by the last name and then the second record to have the first name directly after the account number. That scenario would make it impossible to process the file. In our account file, the account number will start in position 1 of the record and end in position 9, last name will start in position 10 and end in position 27, first name will begin in position 28 and end in position 42 and so forth until we get to balance, which ends in position 99. This will be the case for each record on the file and it means we can find the data we want where it should be.
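The positions above follow directly from the field sizes, since one field follows the other with nothing in between. Here is a small Python sketch, not the book's language, that works out the start and end of every field and cuts a record apart. The names FIELDS, layout and split_record are invented for the example. The widths come from the descriptions above, with one assumption: the balance takes nine positions, eight digits plus one for the sign, which is what makes it end in position 99.

```python
# (field name, width in characters), in record order
FIELDS = [
    ("account-number", 9),
    ("last-name", 18),
    ("first-name", 15),
    ("middle-initial", 1),
    ("street-address", 25),
    ("city", 15),
    ("state", 2),
    ("zip-code", 5),
    ("balance", 9),   # assumption: eight digits plus one sign position
]


def layout(fields):
    """Return {name: (start, end)} with positions counted from 1, as in the text."""
    positions = {}
    start = 1
    for name, width in fields:
        positions[name] = (start, start + width - 1)
        start += width            # the next field begins right after this one
    return positions


def split_record(record, fields=FIELDS):
    """Cut one fixed-width record into its fields, keeping any padding spaces."""
    expected = sum(width for _, width in fields)
    if len(record) != expected:
        raise ValueError(f"record must be {expected} characters, got {len(record)}")
    return {
        name: record[start - 1:end]          # Python counts from 0, the text from 1
        for name, (start, end) in layout(fields).items()
    }
```

Calling layout(FIELDS) places last-name in positions 10 through 27 and balance in positions 91 through 99, matching the numbers in the paragraph above.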

We could have put commas as separators between the fields and accomplished the same result but what happens when one of the fields has a comma in it? That could mess us up so our method will be better. We start by defining a file and its structure. The account file consists of nine fields. We must then thoroughly describe each field. This gives us some keywords. The first is

            define

and the others are

            structure,

            integer,

            decimal,

            signed

and

            character.

The actual program code to describe the account file record and its makeup is as follows:

 

define file acctfile record account-record structure

account-number integer(9)

last-name character(18)

first-name character(15)

middle-initial character

street-address character(25)

city character(15)

state character(2)

zip-code integer(5)

balance signed decimal(6.2)

 

            Note that the ten lines above are not a program, which we’ll get to in the next chapter. Let us begin with the first line,

            define file acctfile record account-record structure.

The keyword

            define

spells out to the program the name of the file – indicated by what follows the keyword

            file

and what fields make up each record. That’s what the keyword

            record

is for. The field

            account-record

is a variable, as are the nine fields in the record that follow. The record is related to these fields by the keyword

            structure

which says that the variable

            account-record

consists of nine fields. The end of the record is indicated by the next occurrence of the keyword define,

or some other keyword, such as

            read.

The line

            account-number integer(9)

has the variable

            account-number,

which is the first field in our record or structure. Because of the way the structure is defined, this means that the field

            account-number

starts in the very first position of the record. The keyword

            integer(9)

spells out that the field is a number consisting of 9 digits. As you may have guessed

            integer

is another keyword. An integer is a whole number, and it can be 0.

            The next line,

last-name character(18)

            is quite similar except this field is not numeric but rather consists of letters of the alphabet. The keyword

            character

is all encompassing and just about any symbol will be allowed in using it, even a number – even though, as I write this, people don’t have numbers as part of their name. Seinfeld fans, that show is fantasy. Later, we’ll see that numbers in the last name, first name or middle initial aren’t allowed, even though this keyword will include numbers and special characters. Note that this field contains 18 characters maximum. If the last name happened to be

            Smith,

            last-name

would consist of the letters “Smith             ”, that is, those five letters followed by 13 spaces.
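The padding just described is easy to try out. This short Python sketch, again not the book's language and using a made-up function name, fills a value out to the width of its field with trailing spaces and refuses anything that is too long.

```python
def pad_field(value: str, width: int) -> str:
    """Left-justify a value in a fixed-width field, filling the rest with spaces."""
    if len(value) > width:
        raise ValueError(f"{value!r} is longer than {width} characters")
    return value.ljust(width)


last_name = pad_field("Smith", 18)   # five letters followed by 13 spaces
```

Going the other way, last_name.rstrip() drops the trailing spaces and gives back the five letters of the name.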