Data Model: World Cup 2018 by Viji Kumar - HTML preview

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub for a complete version.

Chapter 2 - Definition of Entity Types

Two attributes, the id and the name are sufficient to identify or refer to any competition, team or venue in this model. While these entities have many other attributes that may be of interest to sports fans, e.g. the founding year of a team or the current coach; for the sake of simplicity I have not included such attributes. The id is a number that has been assigned to each of these entities for use as a unique and permanent identifier because the name may change e.g. what was the UEFA Cup is now the UEFA Europa League.

`Image

A competition, team or venue is defined with both its attributes, its name and its id, as keys to denote that they must both be unique. The id is the designated unique identifier and in database jargon is called the primary key. It is used by data management systems to uniquely identify an entity. The name, which may change, must also be unique in this context because users of the databases may wish to search for results and other statistics about a competition, team or venue using the name. To prevent duplicate names, the name attribute is designated as a secondary key. This enables database designers to know that they must check for uniqueness before recording a name or an error may occur. The attributes become table columns.

All league seasons, tournaments and head to heads must be linked to a competition. This is where the diagrams translate into linked tables, i.e. the model manifests itself in the real world. Therefore, in addition to an id and a name, these tables have a third attribute, the id of a competition, linking them to a competition. These are called foreign keys in the tables of league seasons, tournaments and head to heads.

Unlike the other 2 types of competitions, head to head competitions are defined allowing for the two participating teams but the identities of the teams may not be known when a head to head tournament is organised and can be left empty. The names of the league seasons, tournaments and head to heads may not be unique except when coupled with the competition reference.

Image

The secondary key is now a composite key that comprises both, the competition foreign key in addition to the name, to ensure that the names of the league seasons, tournaments and head to heads are unique for each competition.

Image

The model shown below includes the organisational entities used to represent the 3 types of competitions. Links to teams and venues for the different types of games are not shown.

Image

A league season (introduced by William McGregor of Aston Villa in 1888 to English football) has a specific number of games related to it depending on the number of meetings between any pair of the participating teams. The English Premier League has two games (home and away) between each pair of the 20 participating teams with a total of 380 games per season whereas the rugby union Six Nations Championship only has a single game between each pair, a total of 15 games per season. Another round robin format, tournament group, often has four teams playing a total of 6 games when the teams meet once (FIFA World Cup) and 12 games when they meet twice (UEFA Champions League).

There are some competitions that apply rules that require amendments to the usual way outcomes are deduced. The Scottish Premier League has employed an unorthodox mechanism for ensuring that 12 teams play 38 games each during a league season. After three meetings between all the 12 participating teams, the SPL is treated as two separate leagues of 6 teams based on their league positions. There is then a single game between each of the six teams in each of the two leagues, a total of 228 league games for an SPL season. After the split, the SPL season becomes 2 separate leagues of 6 teams playing each other once, but with the teams retaining their points from before the split. The decision as to which of the teams play at home after the split is decided, presumably, from reviewing the 3 previous meetings between any two teams.

The model is used below to describe the last two tournaments of the UEFA European Championships. There were 53 and 54 teams eligible to play in Euro 2012 and Euro 2016 respectively, including the non-qualifying hosts.

Image

Image

The teams in both qualifying group stages were divided into 9 groups; some with 6 participants and some with 5. To decide the best runner-up, group performances are viewed for the top five teams in each group. In both cases there were 8 teams from the qualifying groups that took part in a play-off with the four winners also qualifying. The four play-off matches were home and away fixtures.

Returning to the detail of the 2018 FIFA tournament model; there is an initial group phase of 8 groups. The 4 teams in each group play each other once, a total of 48 (8 x 6) group games. The first two teams in each group qualify for the next phase. The next three rounds are single-leg knockout rounds producing the 4 teams who will contest the two apex games; the final and the third-place game.

Image

The diagram below documents, using MS Access, the attributes of interest.

Image

For this exercise, the collection of the goal scoring data for any game is assumed to be in one of the following three mutually exclusive states; no data (to be played or the score is not known), incomplete data (game in progress with all known scores) and complete data (independently verifiable score at the end of the game). This rudimentary control mechanism (the status attribute) for specifying the extent of the known scoring data about a game should suffice for this explanation but included among its many shortcomings are that it neither allows the recording of abandoned games nor can it record the fact that there may have been temporary breaks in play due to unforeseen events during a game.

Image

The starting time (local time at the venue) and the venue reference are also common attributes for all types of games in this model. The two participating team references (the team one and team two roles) are attributes of group games but are not common to all types of games in the extended model e.g. for head to head games during a series. The group reference is the only attribute specific to a group game. Again, a unique identifier, id, is the prime key. The two team references, the venue reference and the group reference are all foreign keys.

The teams that are in a group can be deduced from reviewing the group game participants, but that relationship is also explicitly recorded in another table (not shown above) and is used when generating the group points table. The table below shows the details for the group A games.

Image

The venue, team and group names will be required to translate the foreign keys in the table of group games to make it meaningful for humans. The names can be found in the tables shown below. The four group A teams, Russia, Uruguay, Egypt and Saudi Arabia, are shown in red.

Image

The group A games table below has the names substituted for the ids of the teams, venues and group.

Image