Normalizing with Entity Relationship Diagramming | jingle-bells.info
Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as an integral part of his relational model. The entity-relationship diagram (ERD) is a conceptual model and one of the most widely used modeling techniques: it represents entities and the relationships between them, and it is then converted into a relational model (tables). Normalizing can involve splitting an entity table into two or more tables and defining relationships between them.
How are the entities related to each other? We see that we should store the relationships between orders and clients as well as those between items and orders. In our case, the relationship between clients and orders is one-to-many. Want to store an order? Be prepared to link it to the information about the client who made it, and make sure this information describes exactly one client.
This is the database developer's task. However, a client can have an arbitrary number of orders, and the database should be able to store clients and orders this way.
Again, the entity-relationship model does not prescribe how we should store that data, as long as the storage method satisfies the conditions above. We could store it in two files and relate them using the row numbers:

A customer has registered
A customer has registered
A customer has registered
…

An order is made by the customer described on line 1
An order is made by the customer described on line 2
An order is made by the customer described on line 2

Or we could just keep the information in a single text file. The latter limits us to only 2 clients and 5 orders but, you know, every system has its limitations.
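The two-file layout can be sketched in Python, with one list standing in for each file. This is only an illustration: the client names and the helper functions `client_of` and `orders_of` are hypothetical and do not come from the original example.

```python
# Each list stands in for one text file, one entry per line (lines are 1-based).
# The client names are hypothetical placeholders.
clients = ["Alice", "Bob", "Carol"]   # lines 1, 2 and 3 of the clients file
orders = [1, 2, 2]                    # each entry: line number of the ordering client


def client_of(order_line: int) -> str:
    """Return the single client linked to the order on the given line."""
    return clients[orders[order_line - 1] - 1]


def orders_of(client_line: int) -> list[int]:
    """Return the lines of all orders made by the client on the given line."""
    return [line for line, c in enumerate(orders, start=1) if c == client_line]


print(client_of(3))    # the client who made the third order
print(orders_of(2))    # all orders of the second client
print(orders_of(3))    # a client may also have no orders at all
```

Every order points at exactly one client line, while a client line may be referenced by any number of orders, which is all the model asks for.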
Attributes

Ovals are attributes.
What information is stored in the database? Merely enumerating the clients is nice but serves no purpose. It is much better to know the names of the clients; the orders need to be assigned unique numbers that identify them; and it would be great to record which goods each item contained, so that not only the number of packages but also their contents could be checked.
This is in fact what the database is for: to store not only the links between the entities but the descriptions of the entities too.

Entity-relationship and relational model

Everything above should be squeezed into the relational model, which, as we all know, stores relations.
Ever since relational databases appeared, they have mostly been used to implement ER models. Many database manuals and guides describe relational databases solely from that point of view. Various tools exist to automatically generate a relational structure from a model.
However, an ER model and a relational database are not the same. There is not even a mapping in either direction: because of the way data are stored in a relational model, there is no reliable way to tell attributes, entities and relationships apart by looking only at the relational model.
These terms belong to the ER model. In a relational model, one thing can be implemented as an entity, relationship or an attribute. In this article, I will give several examples. Imagine a simple model as pictured in the diagram on the right. The model requires that the database store fictional characters as the entities.
For each fictional character it should store their name, address, town and state as the attributes. There are no relationships here. As I said earlier, the ER model specifies what should be stored, and the database design (the relational model in this case) decides how.

For example, consider the entity type Student Details shown in Figure 6. The entity identifier is composite because a student has multiple MajorMinor values and is also involved in multiple activities.
The multi-valued dependency affects the key structure. This means that a SID value is associated with multiple values of the MajorMinor and Activity attributes, and together they determine the other attributes. An entity instance of the Student Details entity type is shown in Figure 7.
Each normal form rule and its application are outlined below.

First Normal Form 1NF

The first normal form rule is that there should be no nesting or repeating groups in a table. An entity type in which every attribute holds only one value per entity instance satisfies first normal form.
So in a way any entity type with an entity identifier is by default in first normal form. For example, the entity type Student in Figure 2 is in first normal form.
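As a rough illustration of removing a repeating group, the sketch below flattens a nested record into rows that hold one value per attribute. The sample students and the function name `to_first_normal_form` are hypothetical and are not taken from the figures.

```python
# Hypothetical records with a repeating group: several activities per student.
nested = [
    {"SID": 1, "Name": "Ann", "Activities": ["Chess", "Rowing"]},
    {"SID": 2, "Name": "Ben", "Activities": ["Chess"]},
]


def to_first_normal_form(records):
    """Emit one row per (student, activity) so each attribute holds one value."""
    return [
        {"SID": r["SID"], "Name": r["Name"], "Activity": activity}
        for r in records
        for activity in r["Activities"]
    ]


rows = to_first_normal_form(nested)
for row in rows:
    print(row)
```

The flattened rows satisfy first normal form only; whether they also meet the later normal forms is a separate question.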
Second Normal Form 2NF

The second normal form rule is that the key attributes determine all non-key attributes. A violation of second normal form occurs when there is a composite key and part of the key determines some non-key attributes. In ERD terms, second normal form deals with the situation where the entity identifier contains two or more attributes and a non-key attribute depends on only part of the entity identifier. For example, consider the modified entity type Student shown in Figure 8.
The entity type has a composite entity identifier consisting of the SID and City attributes. An entity instance of this entity type is shown in Figure 9.
Now, if there is a functional dependency City → Status, then the entity type structure will violate the second normal form.
To resolve the violation of the second normal form, a separate entity type City with a one-to-many relationship is created, as shown in the corresponding figure. The relationship cardinalities can be further modified to reflect organizational working. In general, a second normal form violation can be avoided by ensuring that the entity identifier consists of only one attribute.
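A minimal Python sketch of this check and decomposition follows, assuming a Student entity type identified by (SID, City) with a Status attribute. The sample values and the helper `fd_holds` are hypothetical; a dependency that holds in sample data is only evidence, since real dependencies come from requirements analysis.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical instances of Student, identified by (SID, City).
students = [
    {"SID": 1, "City": "Austin", "Status": "Urban"},
    {"SID": 2, "City": "Austin", "Status": "Urban"},
    {"SID": 3, "City": "Marfa", "Status": "Rural"},
]

# Part of the identifier (City) determines a non-key attribute (Status).
partial_dependency = fd_holds(students, ["City"], ["Status"])

# Decompose: City becomes its own entity type, Student only refers to it.
city = sorted({(r["City"], r["Status"]) for r in students})
student = [{"SID": r["SID"], "City": r["City"]} for r in students]

print(partial_dependency)
print(city)
print(student)
```

After the split, each Status value is stored once per city instead of once per student.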
Third normal form is violated when there is a dependency among non-key attributes, in the form of a transitive dependency. For example, consider the entity type Student shown in Figure 4. In this entity type, there is a functional dependency BuildingName → Fee that violates the third normal form. A transitive dependency is resolved by moving the attributes involved in the dependency to a new entity type with a one-to-many relationship. In the new entity type, the determinant of the dependency becomes the entity identifier.
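One way to picture the result is as two tables in an in-memory SQLite database, sketched below with Python's standard `sqlite3` module. The table name Building, the column types and the sample rows are assumptions made for illustration; the source only names the attributes SID, BuildingName and Fee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE Building (
    BuildingName TEXT PRIMARY KEY,   -- determinant becomes the identifier
    Fee          INTEGER NOT NULL
);
CREATE TABLE Student (
    SID          INTEGER PRIMARY KEY,
    BuildingName TEXT NOT NULL REFERENCES Building(BuildingName)
);
""")

# Hypothetical sample data: one building can house many students.
conn.executemany("INSERT INTO Building VALUES (?, ?)",
                 [("North Hall", 500), ("South Hall", 650)])
conn.executemany("INSERT INTO Student VALUES (?, ?)",
                 [(1, "North Hall"), (2, "North Hall"), (3, "South Hall")])

# A fee change is now a single-row update instead of one update per student.
conn.execute("UPDATE Building SET Fee = 550 WHERE BuildingName = 'North Hall'")

fees = conn.execute(
    "SELECT s.SID, b.Fee FROM Student s "
    "JOIN Building b USING (BuildingName) ORDER BY s.SID"
).fetchall()
print(fees)
```

Because Fee lives only in Building, two students in the same building can never disagree about the fee.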
The resolution of the third normal form violation is shown in the corresponding figure.

The Boyce-Codd normal form rule is that every determinant is a candidate key. Even though Boyce-Codd normal form and third normal form generally produce the same result, Boyce-Codd normal form is a stronger definition than third normal form.
Every table in Boyce-Codd normal form is by definition in third normal form. Boyce-Codd normal form considers two special cases not covered by third normal form: part of a composite entity identifier determines part of its attributes, and a non-entity-identifier attribute determines part of the entity identifier.
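The rule that every determinant must be a candidate key can be checked mechanically with the standard attribute-closure computation. The sketch below uses a generic relation with placeholder attributes A, B and C, which are not from the source, and it tests determinants against superkeys (a candidate key or any superset of one), the usual way the rule is applied in practice.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]


# Placeholder relation R(A, B, C) with (A, B) -> C and C -> B.
attributes = {"A", "B", "C"}
fds = [(("A", "B"), ("C",)), (("C",), ("B",))]

violations = bcnf_violations(attributes, fds)
print(violations)
```

Here C determines B but does not determine A, so C is not a key and the dependency C → B is reported, even though the relation satisfies third normal form because B is part of the key (A, B).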
The two special cases are only possible if there is a composite entity identifier and dependencies exist from a non-entity-identifier attribute to part of the entity identifier. For example, consider the entity type StudentConcentration shown in Figure 12. The entity type is in third normal form, but since there is a dependency FacultyName → MajorMinor, it is not in Boyce-Codd normal form. To bring the StudentConcentration entity type into Boyce-Codd normal form, another entity type Faculty with a one-to-many relationship is constructed, as shown in Figure 13.

Fourth Normal Form 4NF

The fourth normal form rule is that there should not be more than one multi-valued dependency in a table.
For example, consider the Student Details entity type shown in Figure 6.
Now, during requirements analysis if it is found that the MajorMinor values of a student are independent of the Activity performed by the student, then the entity type structure will violate the fourth normal form.
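To make the independence condition concrete, the sketch below checks a hypothetical instance of Student Details and then splits it into two projections. The sample rows and the function name `independent` are illustrative assumptions, not data from Figure 7.

```python
from itertools import product

# Hypothetical instance of Student Details: (SID, MajorMinor, Activity).
student_details = {
    (1, "Math", "Chess"),
    (1, "Math", "Rowing"),
    (1, "Physics", "Chess"),
    (1, "Physics", "Rowing"),
    (2, "History", "Choir"),
}


def independent(rows):
    """True if, for each SID, the rows are the full cross product of that
    student's MajorMinor values and Activity values."""
    for sid in {s for s, _, _ in rows}:
        majors = {m for s, m, _ in rows if s == sid}
        activities = {a for s, _, a in rows if s == sid}
        actual = {(m, a) for s, m, a in rows if s == sid}
        if actual != set(product(majors, activities)):
            return False
    return True


# Split the two independent multi-valued facts into separate projections.
student_focus = {(s, m) for s, m, _ in student_details}
student_activity = {(s, a) for s, _, a in student_details}

# Joining the projections on SID gives back the original rows.
rejoined = {
    (s, m, a)
    for s, m in student_focus
    for s2, a in student_activity
    if s == s2
}

print(independent(student_details))
print(rejoined == student_details)
```

The single table needs five rows to express what the two projections express in three rows each, and the gap widens as a student adds majors or activities.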
To resolve the violation of the fourth normal form, separate weak entity types with identifying relationships are created, as shown in the corresponding figure. The StudentFocus and StudentActivity entity types are weak entity types. It is now presumed that the Student entity type has a functional dependency with SID as the determinant.

Because of the similarity between the notions of an entity type and a relation, normalization concepts, when explained or applied to an ERD, may generate a richer model. Such an application also enables a better representation of user working requirements.
This application now results in the specification of additional guidelines for refining an ERD. These guidelines can be stated as follows:

1. There should be only one dependency in each entity type where the determinant is the entity identifier.
2. There should not be any additional dependency among the non-entity-identifier attributes. Any such additional dependency should be represented by a new entity type with a one-to-many relationship.
3. If there is a composite entity identifier of three or more attributes, it should be ensured that there is only one multi-valued dependency among them.
Studying the dependencies among attributes during requirements analysis assists in identifying entity types and specifying cardinalities. Since an ERD represents a relational model schema, a normalized ERD improves the modeling effort, thereby facilitating a better fit with organizational working.