Sabtu, 25 April 2009

DATABASE NORMALIZATION

Diposting oleh namayana

Database plan process (review)
Gather need user/business
Develop e-r model based on need user/business
Conversion e-r model to customer collection (table)
Customer normalization to cause the loss of anomaly
Implementations to database with make table to every customer that normalization

Database normalization
Normalization data base structure formation process so that a large part ambiguity can be caused.
Normalization stage is begun from lightest stage (1nf) up to tightest (5nf)
Usually only come up with level 3nf or BCNF because enough to produce tables have a certain quality good

why done normalization?
Optimalitation table structures
increase speed
cause the loss of data entering same
efficient in storage media use
decrease redundancy
avoid anomaly (insertion anomalies, deletion anomalies, update anomalies).
data integrity that increased
a table is said good (efficient) or normal if fulfill 3 criteria’s as follows:
if there decomposition table, so decomposition must be guaranteed safe (lossless-join decomposition). mean, after table elaborated / decomposition be new tables, new tables can produce table at first with same exactly.
take care functional dependence at the (time) of data change (dependency preservation).
doesn't break boyce-code normal form (BCNF)
if third criteria (BCNF) can not be fulfilled, so at least table doesn't break normal form third stage (3rd normal form / 3nf).


functional dependency
functional dependency describe connection attributes in a customer
a attribute said functionally dependant in other if we use attribute value to determine other attribute value.
symbol that used -> for representing functional dependency.
a read functionally determine
notation: a -> b
a and b attribute from a table. mean functionally a determine b or b depend on a, if and only if there 2 data lines with value a same, so value b also same
notation: a b or a b
the contrary from previous notation.
The concept of Functional Dependency is central to normalisation theory. FD is a semantic concept.It is concerned with a particular semantic relationship between the attributes of a relationship (usually has to be defined by an expert on the data). Finding all the FDs will result in a better model of the real world being modelled. Can be defined as a constraint on the set of legal relations.


A. Defintions

A B C D
a1 b1 c1 d1
a1 b2 c1 d2
a2 b2 c2 d2
a2 b3 c2 d3
a3 b3 c2 d4
In the relation above :

It can be stated that

A -> C is satisfied
C is functionally dependant on A or A determines C
As all A tuples that have the same value have the same C value

C->A isn’t satisfied

Other FDs are also satisfied

Functional dependency is trivial if satisfied by all tuples
ie A->A
In general, a functional dependency of the form X->Y is trivial if Y* X

FDs can be stated as holds – when every possible attribute combination complies
FDs are said to be satisfied when all stated attribute instances comply

B. Closure of a set of FDs

List all FDS that must always hold
Then some FDs can be logically implied using Inference Rules – Armstrongs Axioms

A (universal) relation U is used

1. Reflexivity Rule
If Y * X
Then X -> Y holds for all X,Y * U

2. Augmentation Rule
If X->Y and Z * U
Then Zx -> Zy for all X,Y * U


3. Transitivity Rule
If X-> Y and Y-> Z then X-> Z for X,Y,Z * U

Additional rules (which can be derived from above axioms)
4. Union Rule
If X-> Y and X-> Z then X-> YZ

5. Decomposition Rule
If X-> YZ then X-> Y and X-> Z

6. Pseudo Transitivity Rules
If X-> Y and TY -> Z then TX -> Z

C. Inference Rule example use
Given:
R= (A,B,C,G,H,I)

The set F of FDs:
A ->B
A-> C
CG -> H
CG -> I
B-> H

then the following can be inferred:
A-> H (transitivity A-> B, B-> H)

CG-> HI (Union CG -> H, CG -> I)

AG -> I ( A-> C given, AG-> CG augmentation, CG -> I given)

The set of all FDs that hold is denoted as F + and this is called the closure of the set of FDS

D. Identifying candidate keys and 3NF
FDs are a useful means of identifying candidate keys.
A superkey exists when each combination of values for a particular set of attributes can occur only once.
A candidate key is an attribute , or a set of attributes that gives a unique identity to each tuple in a relation. A candidate key is a minimal superkey.
If those relations with more than one candidate key, one is designated as the primary key
In the example above CG is identified as a candidate key

A relation R is in 3NF if for all FDs that hold on R of the form X-> Y, where X * R
And Y * R, at least one of the following holds :
• X -> Y is a trivial FD
• X is a superkey for R
• Each attribute y in Y is contained in a candidate key for R



second normal form (second normal form - 2nf)
normal form 2nf has been fulfilled in a table if fulfill form 1nf, and all attributes besides primary key, according to intact has functional dependency in primary key
a table doesn't fulfills 2nf, if there attribute the dependence (functional dependency) has only partial (only depend on some of primary key)
if found attribute doesn't has dependence towards primary key, so attribute must move or caused
functional dependence x -> y is said full if wipe off a attribute a from x mean y not again depend on functional.
functional dependence x -> y is said partial if wipe off a attribute a from x mean y still to depend on functional.
customer scheme r in the form of 2nf if every attribute non primary key A Є R depend on full according to functional in primary key r.

third normal form (third normal form - 3nf)
normal form 3nf has been fulfilled if fulfill form 2nf, and otherwise there attribute non primary key that has dependence towards attribute non primary key the other (dependence transitive).
boyce-codd normal form (BCNF)
Boyce-Codd normal form has compulsion stronger than third normal form. to be BNCF, customer must in the form of normal one and every attribute is forced to base on function in super attribute key.
in example under this is found seminar customer, primary key NPM + seminar.
student may take one or two seminars. every seminar wants 2 guides and every student is guided by one of [the] between 2 seminar guides. every only may take one seminar. in this example is NPM and seminar shows a guide.

normal form fourth and fifth
customer in the form of normal fourth (4 NF) if customer in BCNF and not full dependence many values. to cause the loss of dependence many values from one customer, we divide customer is two new customers. each full customer two attributes that has connection many values.
customer in the form of normal fifth (5nf) deal with property that called join without existence loses information (lossless join). normal form fifth (5 NF also called PJNF (projection join normal form). this case is very rare appears and difficult to detected according to practice.


references

http://en.wikipedia.org/wiki/Functional_dependency
www.doc.mmu.ac.uk/STAFF/P.Quick/fdnotes.doc

0 komentar: