E-R MODELLING HINT

Hugh Darwen, Region 4

1. At which end of the line do I put the blob? At which end the crow's foot?

I find it helps to keep a clear distinction in your mind between the terms entity and entity type. Much of the literature fails in this respect, using the term entity when entity type is meant, just as much of the Object Orientation literature uses object when object class is meant. But M359, thank goodness, is not so sloppy.

So, an entity type defines a set, whose members (elements) are called entities. We can call anything an entity, just so long as we have identified some set, to which it belongs, as an entity type. For example, an entity named Fido might be a member of an entity type named Dog.

When you are learning the notation used in an E-R diagram, you might find it hard to remember which end of the line is supposed to get the blob, the circle, the crow's foot, or the "no crow's foot", for mandatory participation, optional participation, many, and one, respectively. In that case you are not alone! I have to keep reminding myself too, and I'll probably still get it wrong from time to time when doing whiteboard examples at my tutorials. Let me at least tell you how I work it out in my own mind. If that doesn't work, then, sorry, I can't help you!

So you are looking at two entity type boxes connected by a relationship line. For example, the boxes might be labelled Dog and Person, the connecting line Owns. Look at one of those boxes, say, the one labelled Dog. Focus your mind, not on the entity type but on a typical entity of that type such as Fido, and ask yourself the following questions:

1. Must Fido, by virtue of being an entity of type Dog, be owned by some person? If so, Fido, like every other dog, must be connected to some Person entity, meaning that participation in Owns is mandatory for every dog. In that case the Dog end of the Owns line gets a blob.

2. If Fido is connected to the person Jake, meaning that Jake owns Fido, is it thereby impossible for any other dog to be connected to that same person, Jake? If so, don't put a crow's foot at the Dog end. If instead you think it's possible for other dogs to be owned by Jake, then do put a crow's foot: think of the middle toe as connecting Jake to Fido and the other toes as connecting Jake to other dogs.

Then do the same for the other end of the line. Think of a typical person, such as Jane.

3. Must Jane, by virtue of being an entity of type Person, own at least one dog? If not, then Jane is an example of a person to whom that connecting line, Owns, does not apply—Jane dos not participate in Owns. In that case participation is optional for a person and a circle, not a blob, goes at the Person end of Owns.

4. If Jane is connected to the dog Rover, is it thereby impossible for any other person to be connected to—be an owner of—Rover? If so, don't put a crow's foot at the Person end. If instead you think it's possible for other people as well as Jane to own Rover, then do put a crow's foot: think of the middle toe as connecting Rover to Jane and the other toes as connecting Rover to other people.

When you've drawn a diagram, the next thing to do is the Entity Types section, where you define, for each entity type, certain properties, called attributes, that are common to each entity of that type. And you underline that attribute, or that combination of attributes, that serves as the identifier for entities of that type. I'll say no more about that part of the exercise here, apart from warning you about a nasty trap that will definitely cost you marks in the TMAs unless you study ...

2. Relationships and Foreign Keys

This is about an issue that many students lose marks over every year, including some of my own students, even though I do always try very hard to explain the rather difficult and confusing "rule". How about this year being the first year where none of my students lose marks on it?

Consider the following relationships:

E--R Diagram

Entity types

Employee (EmployeeId, Hiredate, Name, Salary)

Department (DeptId, Budget)

Project (ProjectId, Title)

Engagement (EmployeeId, ProjectId, StartingDate)

Notice that Engagement has a ProjectId attribute, whereas Employee does not have a DeptId attribute. You might ask why Employee doesn't have a DeptId attribute, given the existence of the WorksIn relationship. Don't we need an attribute indicating which department the employee works in? After all, the diagram tells us that every employee does work in some department. Let me try to explain.

The relationship between Employee and Department is represented by the line labelled WorksIn. We don't need an attribute as well as the line. Yes, if we choose to design a relational database from this E-R model, then the corresponding Employee relation does have DepartmentId attribute but that's another matter. Entity types aren't relations.

A DepartmentId attribute in Employee would be redundant. It would (presumably) be expressing the same thing as that WorksIn line. Worse than that, if it does represent that same thing, then we would need to write an entry in the Constraints section, to the effect that the DepartmentId of the Department that an Employee is connected to via WorksIn must be the same as that Employee's DepartmentId.

Now, if you're with me so far, you might now be wondering, in that case, how come there's a ProjectId attribute in Engagement? And an EmployeeId attribute, for that matter. It appears that the Engagement entity type looks exactly like its corresponding relation, in which EmployeeId and ProjectId are foreign keys referencing Employee and Project, respectively.

Well, it so happens that some E-R practitioners do insist on leaving out those "foreign key" attributes, even when they form identifiers, as in Engagement. But the M359 course development team, like their predecessors on the old M357 and M358 courses, belonged to the camp that deems it essential to have an identifier for every entity type. You might or might not agree with that position, but it does have some disadvantages:

1. Students have to remember to leave out the "foreign key" attributes in cases like WorksIn but to put them back in again in cases like Engagement; otherwise they lose marks.

2. Aren't those identifying attributes in Engagement redundant with the relationship lines, just like the outlawed DepartmentId attribute of Employee would be? And don't we therefore need extra constraints to the effect that the ProjectId of an Engagement must be the same as that of the Project to which it is connected and the EmployeeId of an Engagement must be the same as that of the Employee to which it is connected? Good questions! They are addressed in M359 by the notion of weak entity types (see Block 4, page 35) . Our Engagement entity type is one of these. Think of it as "weak" because it can be thought of as representing a set of relationships as well as a set of entities. Indeed, some practitioners wouldn't even put put Engagement into an entity type box. Instead, they would have a many-to-many relationship line, EngagedIn, from Employee to Project, and allow that relationship, EngagedIn, to have attributes and be connected to other entity types or other relationships.

To sum up:

Don't include "foreign key" attributes in entity type definitions, unless they are needed to form identifiers.

Notice, by the way, my quotes around "foreign key". I strongly recommend that you forget about relational terminology altogether when you are doing E-R models. For example, the term primary key doesn't exist in the E‑R world. Instead we have identifiers. And entity types are not relations. Yes, both worlds do have attributes, apparently standing for much the same concept. Sorry about that, but it's a mere coincidence.

***END***