Design-and-Use-of-Relational-Databases

Content and Goals

In this tutorial we continue the discussion about ER-models by looking into many-to-many relationships, weak entities and designing and drawing an ER model.

In this tutorial we will learn:

Many-to-Many relationship

In the ER-model every entity type is associated with at least one relationship. In practice, if you see (or create) an ER model for relational database, no entity must remain isolated, but each must have at least one connection.

In most cases the relationships are binary (between two entities) but it is also possible to have relationships between multiple entities. It’s often necessarily to explicitly define the cardinalities of the entities associated by a relationship. The cardinality affects both the use and implementation of the database.

The cardinality of a binary relationship dictates the number of instances with which the entities are participating in the relationship.

The cardinality can be subjective and dependant on how the database is expected to be used.

Learnt before: In one-to-one relationships (1:1) every instance of the first entity can relate to at most one instance of the second entity. Similarly every instance of the second entity can relate to at most one instance of the first entity.

Example: Faculty-Adminstration relationship is one-to-one. Faculty is led by one adminstration and one adminstration can lead only one faculty.

In one-to-many relationship (1:N) every instance of the first entity can relate to one or more instances of the second entity but an instance of the second entity can not be related to more than one instance of the first entity.

Example: Faculty-Department relationship is one-to-many. There are multiple departments in a faculty but a department belongs to exactly one faculty.

New:

In many-to-many relationship (M:N) every instance of the first entity can be related to more than one instance of the second entity. Similarly, every instance of the second entity can be related to more than one instance of the first entity..

Example: Student-Course relationship is many-to-many. A student can participate in many courses and a course can be participated by many students.

Let’s consider drawing a many-to-many relationship in crow’s foot notation. As learnt before we begin by drawing the entities. Between the entities we draw a diamond and write a verb inside it to represent the relationship between the entities. We connect both of the entities to the diamond with single lines. Next we consider the left entity’s cardinality. As this is a many-to-many relationship, this many instances of this entity participate in the relationship. So we draw a crow’s foot connected to the left entity. Similarly for the right entity.

In addition, in crow’s foot notation we also need to consider the participation constraints of the relationship. If the relationship is mandatory, we draw a cross line next to the crow’s foot and if the relationship is optional we draw a 0.

In the above diagram, a student does not have to enroll to any course and a course may not necessarily get any enrollments. The justification for optional participation i nthe both ends comes from a real world situation: It could be that the courses has just been published ( a second ago) and the student have not yet enrolled (but will in the next coming weeks).

Another example of M:N relationship:

Reading the ER-diagram above: An employee of the company participates in several projects (e.g. during the year), and several employees participate in the project. The employee does not have to be in any project (e.g. CEO of the company). The project must have at least one employee.

So, in practice, we could record data that (responds our information needs in the following way):

These videos recap previous information

Attributes in relationship

We can add attributes not only to an entity but to any relationship as well. This is quite common especially in many-to-many (M:N) relationships. Next, we will discuss why, and consider its limitations.

Consider the previous diagram about Employee-Project -relationship. It satisfies the need for information about which projects this employee work and which employees work in this project. However, if one would like to know how much (many hours) employees have worked in their projects or vice versa, we are missing an attribute to store that data.

So, let’s add an attribute called ‘Hours’ to the diagram that could store a numerical value denoting the number of working hours done by the employee for the specific project. But where to put it: to the entities or to the relationship?

In the Employee-entity, Hours-attribute would store only one value (e.g. 23h) for each person (employee-instance), which means that, for example John, who works in three different projects would not know hours for his projects separately. Thus, Hours-attribute would store kind of a ‘total hours for all his projects’, which is fine for that purpose, but not in the level of detail required (hours for specific project). The same holds for the Project-entity: Hours-attribute would store one value meaning ‘total hours of all employees working in the project’.

The solution is that the information need lays in the relationship between a specific employee and specific project. Hours-attribute is a characteristic of the relation between the employee and the project. For example, John has worked 23 hours in the project X, 78 hours in the project Y, and 0 hours in the project Z. In the project X, John has worked 23 hours, and Mike has worked 54 hours. Each time the employee and the project are connected we add information about the hours.

Next example describes product orders. In the database, we would like to store our products with their product numbers, names and prices per unit. Considering one specific product (e.g. a bottle of cola drink), that has been ordered many times or none. Many times, as our grocery store has thousands of those cola bottles sold, and none (optional), as the product may be new in our selection (e.g. new cherry-vanilla taste) that we have not sold yet. Each time the product is related to an order, we store data about what was the quantity of the bottles, possible discount percent and the actual selling price of one unit (bottle). From the viewpoint of one Order, many-to-many relationship is defined as mandatory: There is no order without a related product. One order can contain many different products, and for each, the attributes of the relationship tell how many pieces and what was the selling price.

So, we can get information such as

Limitations

Attributes in relationships do not have primary keys. Primary keys are defined only for entities. This means also that specific instances of participating entities can relate to each other only once.

Let’s consider two ER diagrams above. In the first diagram, we stored information that John has worked 23 hours in a Project X. There is no need, or possibility, to store that same information twice. In the same way, the second diagram says that the order number 333 contained 300 bottles of cola. We are not able to relate the same cola bottle products to the same order number twice, and there is no need to do that.

Therefore, many to many relationships and the related attributes are usable in situations when participating instances relate only once. Sometimes this is a limitation, and you should be able to identify situations when you need an entity instead of relationship with attributes. In the next tutorial we are discussing about relational model and how to transform ER model to a relational model. Relational model does not have many to many relationships and all these relationships in ER model are extracted to 1:N or 1:1 relations by adding an entity/relational table between original entities. Some designers consider this limitation, or implementation reality, already when modeling ER diagrams.

Before going into details of relational models, we turn our view on a new concept of weak entity and last creating and drawing ER models.

Weak entity

An entity without its own key attributes is called a weak entity. That is, a weak entity cannot be uniquely identified by its own attributes.

Instances of weak entities can only be identified by association with their identifying or owner entity. The participation constraint of the identifying relationship is always complete for the weak entity. That is, each instance of the weak entity needs to be associated with an instance of the owner entity.

In ER-model the weak entities are indicated by double-lined rectangles. The relationships that identify the weak entities are also indicated as double-lined diamonds.

A weak entity often has a partial key with which its instances can be identified when the owner instance is known. In the worst case, the partial key may constitute of the set of all of weak entity’s own attributes.

The attributes that are part of the partial key are underlined by a dashed line.

Examples of weak entity:

We want to store our employees and their family members and relatives in our database. An employee’s family member can be identified by their name, if we assume that the employee doesn’t have two family members with the same first name. We also need to assume that the same family member is not associated with more than one employee. In this case, the family member’s first name works as a partial key. Dependent is identifiable only by knowing employee’s ssn (social security number). Dependents can have same names but not within the same employee.

Loans have several payments that shorten the loan amount. Payment is defined here as a weak entity, because it is not identifiable without getting information about the specific loan it is a payment of. Payment_number is the partial key, meaning that the database can have e.g. payments numbered from 1 to 25 which relate to a house loan and payments numbered from 1 to 15 which relate to a car loan. Values of partial keys can be the same, but the detailed identification is got from the loan and its number.

Grocery chains have applied bonuscards that give the consumers monthly rewards (discounts etc.). Reward is defined here as a weak entity with a combined partial key (month, year) which separates different rewards from each other under the same bonuscard.

Weak entities in practice

When implementing the model we can sometimes avoid the use of weak entities by adding its attributes as a multivalued composite attribute of its owner entity. However, it is recommended to use the weak entity instead of a multivalued attribute if there are “enough” attributes that are part of the weak entity as it would become increasingly difficult to handle the multivalued attributes. The same applies if the weak entity would have relationships with other than its owner entity.

Use of weak entities may bring unnecessary complexity to ER models, and there are rare situations when weak entities are truly necessary in real life design. A common mistake with weak entities is that weak entity is confused with total / mandatory participation of a relationship and used in that purpose. For example, above weak entities could have been modelled as ‘normal’ entities, of course eventually depending on the contextual information needs: We could attach each dependent an artificial ID, a numeric value that is unique for each dependent, and having that DependentID the Dependent entity, as a normal entity, still has a mandatory relation to one and only one employee. Similarly, Payment with its own primary key has to relate to one and only one loan.

Therefore, we do not recommend using weak entities in the beginning of your database design career, although it is good to know that solutions exist. Weak entities are good for situations where we have a domain concept with several attributes (entity) but need for a primary key in relation to other entities (weak).

Creating and drawing ER diagrams

Summary of ER modeling technique:

ER-model is a conceptual model of a database, created after we are aware of our databases requirements but before the database is implemented. One of its main benefits is that we can use terminology familiar to the client in creating a high-level structured model of the database. In ER model we define the databases entities, their relationships, associations and interactions between entities. These are represented by entities, attributes and relationships. It’s often easier to see the big picture by looking at the diagram. However, interpreting the diagram requires knowledge about the substance being modelled.

Below we list some universal design tips and common difficulties for beginners based on our experience with ER models. In the end, it is your modeling and implementation context that determines how good the plan will be in practice.

Design tips for ER models

Tip 1: Think who is your client, what is the purpose of database, and whose data management you are planning to support. Usually, this is not an entity in your diagram. You need to model only what is ‘inside’ that target domain.

ER design example step by step - The Zoo

The manager of the Zoo wants a database. The manager has sent you an email: “We have different, lovely colored birds in our summer pool. Birds are fed by our employees daily, some even twice. Birds, like Donald the Duck, feels often sick, maybe because of the food given. First, I need to know who, what and when fed which bird. Another interesting information is how much feeding birds’ cost. Would you design a database for the Zoo?

We begin with defining entities and their attributes. We pick substantives and verbs from the description, and fill in rest with your previous understanding and assumptions about the zoo context.

For example,

We continue with defining relationships.

For example,

You continue defining the relationship and focus also on attributes that can help in calculating costs of feeding.

Although our model above holds required data in its attributes and is syntactically correct, we should have noted few alarming design decisions in the above model:

  1. As the manager wants information about the feeding, the Feeds relationship must be an important concept in the Zoo context.

     Feeds relation has many attributes and even more could be added, which indicates this importance.
     Such important concepts and sets of interconnected data are usually modelled as entities.
    
  2. Feeds-relationship is M:N that fits in ‘a permanent but one-time’ situation. This is not suitable solution based on the requirements and assumptions we made about the context.

     Employees feed birds daily. The same employee feeds the same bird not just once but many times during the system lifecycle. M:N relationship does not allow an employee to feed the same bird twice or vice vers We cannot record that “John the Employee fed Donald the Duck today and yesterday”. Note the difference to situations like “the book was written just once by many authors or the order containing many products was submitted just once”.
     Instead of thinking that employees have a relation to many birds, we should think that employees relate to many feeding events and vice versa. Thus, we design a feeding event entity instead of the relationship.
    

The feeding event is ‘one-time situation’ (like the ordering example above), which can target to many birds. There is no feeding event without any bird or any feeding employee. We assume that the feeding event is made by one employee only. If there are two or more employees performing the feeding events at the same time [in the real world], we will record as many different feeding events with different eventIDs into the database. This cardinality is to some extent also a designer’s choice about the level of details in data.

Now we are almost happy with the initial plan of the database. However, when looking closer to the attributes we find that food-related data is dispersed into many places. The feeding event -entity (like the previous relationship) contains the food price and amount given in a certain event. The given food name (i.e. suitable food) is currently recorded into bird’s data. Dispersing data to different locations may cause update problems in the future. In addition, you may think, what if a bird can eat several different foods and not always the same food is given in the feeding events? In that case, we should add an attribute ‘Given food name’ to the feeding event entity. Then suitable food and given food attributes may repeat the same values, and repeating data is not very good solution either. Therefore, following the definition of an entity, it would be better to gather ‘a set of data’ related to food into Food-entity.

This is our final ER model. Few remarks about the changes made:

Drawing ER diagrams

There are many tools that you can use for planning and modelling relational databases. Sometimes ER models are referred in these tools as database models, data models, relational models, ER diagrams, logical models or similar.

Some of the tools are only “drawing tools” and some have further functionality for the whole database implementation. Internet is full of reviews and lists of open-source ER modelling tools. Take a look for example this review: https://www.holistics.io/blog/open-source-data-modeling-tools/

Below, we introduce a light-weight tool that is available online here: https://erdplus.com

Pros: