The Basic Relation Calculation in Database
    Aug 25, 2024

    This article mainly introduces the basic operations and examples of relational algebra such as union, difference, selection, projection and join.

    Set

    Set Union

    The union operation in relational algebra combines all tuples from two relations (i.e., tables) to form a new relation. The union operator is typically denoted by . Its mathematical definition is as follows:

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(A1,A2,,An)S(A_1, A_2, \ldots, A_n), their union RSR ∪ S is defined as:

    RS={ttR or tS}R ∪ S = \{ t \mid t \in R \text{ or } t \in S \}

    where tt represents a tuple. RR and SS must be union-compatible, meaning they have the same attributes (columns).

    Note: The union operation removes duplicate tuples, ensuring that the tuples in the resulting relation are unique.

    Assume we have two relations RR and SS with the following structure and data:

    Relation RR:

    IDName1Alice2Bob3Carol\begin{array}{|c|c|} \hline \textbf{ID} & \textbf{Name} \\ \hline 1 & \text{Alice} \\ \hline 2 & \text{Bob} \\ \hline 3 & \text{Carol} \\ \hline \end{array}

    Relation SS:

    IDName2Bob3Carol4Dave\begin{array}{|c|c|} \hline \textbf{ID} & \textbf{Name} \\ \hline 2 & \text{Bob} \\ \hline 3 & \text{Carol} \\ \hline 4 & \text{Dave} \\ \hline \end{array}

    The union of relations RR and SS, denoted as RSR ∪ S, results in the following relation:

    Relation RSR ∪ S:

    IDName1Alice2Bob3Carol4Dave\begin{array}{|c|c|} \hline \textbf{ID} & \textbf{Name} \\ \hline 1 & \text{Alice} \\ \hline 2 & \text{Bob} \\ \hline 3 & \text{Carol} \\ \hline 4 & \text{Dave} \\ \hline \end{array}

    In this example, the tuples (2,Bob)(2, Bob) and (3,Carol)(3, Carol) that are present in both relations RR and SS appear only once in the resulting relation RSR ∪ S. The union result includes all distinct tuples from both relations.

    Set Difference

    The difference operation in relational algebra is used to remove tuples present in one relation from another relation. The difference operator is typically denoted by .

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(A1,A2,,An)S(A_1, A_2, \ldots, A_n), their difference RSR - S is defined as:

    RS={ttR and tS}R - S = \{ t \mid t \in R \text{ and } t \notin S \}

    where tt represents a tuple. The result of the difference operation contains only those tuples that are present in RR but not in SS.

    Assume we have two relations RR and SS with the following structure and data:

    Relation RR:

    IDName1Alice2Bob3Carol\begin{array}{|c|c|} \hline \textbf{ID} & \textbf{Name} \\ \hline 1 & \text{Alice} \\ \hline 2 & \text{Bob} \\ \hline 3 & \text{Carol} \\ \hline \end{array}

    Relation SS:

    IDName2Bob3Carol4Dave\begin{array}{|c|c|} \hline \textbf{ID} & \textbf{Name} \\ \hline 2 & \text{Bob} \\ \hline 3 & \text{Carol} \\ \hline 4 & \text{Dave} \\ \hline \end{array}

    The difference operation RSR - S results in the following relation:

    Relation RSR - S:

    IDName1Alice\begin{array}{|c|c|} \hline \textbf{ID} & \textbf{Name} \\ \hline 1 & \text{Alice} \\ \hline \end{array}

    In this example, the tuple (1,Alice)(1, Alice) exists only in RR and not in SS, so it appears in the resulting relation RSR - S. The other tuples (2,Bob)(2, Bob) and (3,Carol)(3, Carol) also exist in SS, so they are excluded from the difference result.

    Projection

    The projection operation in relational algebra is used to select specific columns (attributes) from a relation to create a new relation. The projection operator is typically denoted by π\pi (Greek letter pi).

    Formal Definition:

    Given a relation R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and a subset of attributes {Ai,Aj,,Ak}\{A_i, A_j, \ldots, A_k\}, the projection operation πAi,Aj,,Ak(R)\pi_{A_i, A_j, \ldots, A_k}(R) is defined as:

    πAi,Aj,,Ak(R)={{Ai,Aj,,Ak}(t)tR}\pi_{A_i, A_j, \ldots, A_k}(R) = \{ \{A_i, A_j, \ldots, A_k\}(t) \mid t \in R \}

    where {Ai,Aj,,Ak}(t)\{A_i, A_j, \ldots, A_k\}(t) represents the values of attributes Ai,Aj,,AkA_i, A_j, \ldots, A_k extracted from tuple tt. The projection operation removes duplicate tuples in the result.

    Assume we have a relation RR with the following structure and data:

    Relation RR:

    IDNameAge1Alice302Bob253Carol28\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{Age} \\ \hline 1 & \text{Alice} & 30 \\ \hline 2 & \text{Bob} & 25 \\ \hline 3 & \text{Carol} & 28 \\ \hline \end{array}

    Now, performing projection on relation RR, selecting attributes NameName and AgeAge:

    πName,Age(R)\pi_{Name, Age}(R)

    Resulting Relation:

    NameAgeAlice30Bob25Carol28\begin{array}{|c|c|} \hline \textbf{Name} & \textbf{Age} \\ \hline \text{Alice} & 30 \\ \hline \text{Bob} & 25 \\ \hline \text{Carol} & 28 \\ \hline \end{array}

    In this example, the projection operation selects the NameName and AgeAge columns from RR, resulting in a new relation that includes only the data for these two attributes.

    If the projection result contains duplicate tuples, the projection operation will automatically remove these duplicates, ensuring that the tuples in the resulting relation are unique.

    Selection

    The selection operation in relational algebra is used to select tuples from a relation that satisfy a specific condition. The selection operator is typically denoted by σ\sigma (Greek letter sigma).

    Formal Definition:

    Given a relation R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and a condition θ\theta (such as a comparison or logical expression), the selection operation σθ(R)\sigma_{\theta}(R) is defined as:

    σθ(R)={ttR and t satisfies condition θ}\sigma_{\theta}(R) = \{ t \mid t \in R \text{ and } t \text{ satisfies condition } \theta \}

    where tt represents a tuple and θ\theta is a Boolean expression defining the selection condition.

    Assume we have a relation RR with the following structure and data:

    Relation RR:

    IDNameAge1Alice302Bob253Carol284Dave22\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{Age} \\ \hline 1 & \text{Alice} & 30 \\ \hline 2 & \text{Bob} & 25 \\ \hline 3 & \text{Carol} & 28 \\ \hline 4 & \text{Dave} & 22 \\ \hline \end{array}

    Now, performing selection on relation RR, selecting tuples where AgeAge is greater than 2525:

    σAge>25(R)\sigma_{\text{Age} > 25}(R)

    Resulting Relation:

    IDNameAge1Alice303Carol28\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{Age} \\ \hline 1 & \text{Alice} & 30 \\ \hline 3 & \text{Carol} & 28 \\ \hline \end{array}

    In this example, the selection operation filters the tuples in RR where AgeAge is greater than 2525. The resulting relation includes all tuples that meet this condition.

    The selection operation allows us to extract specific data from a relation by specifying conditions, which is very common in database queries.

    Rename

    The renaming operation in relational algebra is used to change the name of a relation or its attributes (columns). The renaming operator is typically denoted by ρ\rho (Greek letter rho).

    Formal Definition:

    Given a relation R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and a new name S(B1,B2,,Bn)S(B_1, B_2, \ldots, B_n), the renaming operation ρS(B1,B2,,Bn)(R)\rho_{S(B_1, B_2, \ldots, B_n)}(R) is defined as:

    ρS(B1,B2,,Bn)(R)=S(B1,B2,,Bn)\rho_{S(B_1, B_2, \ldots, B_n)}(R) = S(B_1, B_2, \ldots, B_n)

    where S(B1,B2,,Bn)S(B_1, B_2, \ldots, B_n) represents the new relation name and attribute names.

    Assume we have a relation RR with the following structure and data:

    Relation RR:

    IDNameAge1Alice302Bob253Carol28\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{Age} \\ \hline 1 & \text{Alice} & 30 \\ \hline 2 & \text{Bob} & 25 \\ \hline 3 & \text{Carol} & 28 \\ \hline \end{array}

    Now, performing the renaming operation on relation RR, changing the relation name to PersonPerson, and renaming attributes IDID to PersonIDPersonID, NameName to FullNameFullName, and AgeAge to YearsYears:

    ρPerson(PersonID,FullName,Years)(R)\rho_{\text{Person}(\text{PersonID}, \text{FullName}, \text{Years})}(R)

    Resulting Relation:

    PersonIDFullNameYears1Alice302Bob253Carol28\begin{array}{|c|c|c|} \hline \textbf{PersonID} & \textbf{FullName} & \textbf{Years} \\ \hline 1 & \text{Alice} & 30 \\ \hline 2 & \text{Bob} & 25 \\ \hline 3 & \text{Carol} & 28 \\ \hline \end{array}

    In this example, the renaming operation changes the name of the relation RR to PersonPerson and updates the attribute names. Renaming is useful, especially when you need to alias a relation or adjust the output format of query results.

    Inner Join

    Natural Join

    The natural join operation in relational algebra combines two relations based on their common attributes (i.e., attributes with the same name), creating a new relation. The natural join operator is usually denoted by \bowtie.

    Formal Definition:

    Given two relations R(A1,A2,,An,B1,B2,,Bm)R(A_1, A_2, \ldots, A_n, B_1, B_2, \ldots, B_m) and S(B1,B2,,Bm,C1,C2,,Ck)S(B_1, B_2, \ldots, B_m, C_1, C_2, \ldots, C_k), their natural join RSR \bowtie S is defined as:

    RS={tt=(rs) and rR,sS and r[Bi]=s[Bi] for all common attributes Bi}R \bowtie S = \{ t \mid t = (r \cup s) \text{ and } r \in R, s \in S \text{ and } r[B_i] = s[B_i] \text{ for all common attributes } B_i \}

    The result of the natural join includes all combinations of common attribute values that are the same in both RR and SS, while removing duplicate common attributes.

    Assume we have two relations RR and SS with the following structure and data:

    Relation RR:

    IDNameDeptID1Alice1012Bob1023Carol103\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} \\ \hline 1 & \text{Alice} & 101 \\ \hline 2 & \text{Bob} & 102 \\ \hline 3 & \text{Carol} & 103 \\ \hline \end{array}

    Relation SS:

    DeptIDDeptName101HR102IT104Finance\begin{array}{|c|c|} \hline \textbf{DeptID} & \textbf{DeptName} \\ \hline 101 & \text{HR} \\ \hline 102 & \text{IT} \\ \hline 104 & \text{Finance} \\ \hline \end{array}

    Now, performing the natural join RSR \bowtie S on their common attribute DeptIDDeptID:

    RSR \bowtie S

    Resulting Relation:

    IDNameDeptIDDeptName1Alice101HR2Bob102IT\begin{array}{|c|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} & \textbf{DeptName} \\ \hline 1 & \text{Alice} & 101 & \text{HR} \\ \hline 2 & \text{Bob} & 102 & \text{IT} \\ \hline \end{array}

    In this example, only tuples with the same DeptIDDeptID values in both RR and SS are combined and appear in the result. The tuple (3,Carol,103)(3, Carol, 103) in RR does not have a matching DeptIDDeptID in SS, so it does not appear in the result. Similarly, the tuple with DeptIDDeptID 104 in SS has no matching DeptIDDeptID in RR, so it is not included in the result.

    The natural join is commonly used in database queries, especially when combining multiple tables based on common fields, as it automatically removes duplicate common attributes, simplifying the query results.

    θ-Join

    The θ-join (Theta Join) in relational algebra is a more general join operation that allows tuples from two relations to be joined based on any condition (θ). The θ-join is denoted by RθSR \bowtie_{\theta} S, where θ\theta is a condition expression that may include comparison operators (such as ==, <<, >>, <=<=, >=>=, <><>, etc.).

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(B1,B2,,Bm)S(B_1, B_2, \ldots, B_m), and a condition θ\theta, the θ-join RθSR \bowtie_{\theta} S is defined as:

    RθS={tt=(rs),rR,sS, and r and s satisfy the condition θ}R \bowtie_{\theta} S = \{ t \mid t = (r \cup s), r \in R, s \in S, \text{ and } r \\ \text{ and } s \text{ satisfy the condition } \theta \}

    In a θ-join, the resulting relation contains all tuples (r,s)(r, s) that satisfy the condition θ\theta and includes all attributes from RR and SS.

    Assume we have two relations RR and SS with the following structure and data:

    Relation RR:

    IDNameAge1Alice302Bob253Carol28\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{Age} \\ \hline 1 & \text{Alice} & 30 \\ \hline 2 & \text{Bob} & 25 \\ \hline 3 & \text{Carol} & 28 \\ \hline \end{array}

    Relation SS:

    DeptIDName101Alice102Bob103Dave\begin{array}{|c|c|} \hline \textbf{DeptID} & \textbf{Name} \\ \hline 101 & \text{Alice} \\ \hline 102 & \text{Bob} \\ \hline 103 & \text{Dave} \\ \hline \end{array}

    Now, performing the θ-join on RR and SS with the condition R.Name=S.NameR.Name = S.Name:

    RR.Name=S.NameSR \bowtie_{R.Name = S.Name} S

    Resulting Relation:

    IDNameAgeDeptID1Alice301012Bob25102\begin{array}{|c|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{Age} & \textbf{DeptID} \\ \hline 1 & \text{Alice} & 30 & 101 \\ \hline 2 & \text{Bob} & 25 & 102 \\ \hline \end{array}

    In this example, the condition for the θ-join is R.Name=S.NameR.Name = S.Name, so only tuples with matching NameName values in RR and SS are joined and appear in the result. Tuples with names that do not match (e.g., CarolCarol and DaveDave) are not included in the result.

    The θ-join is a flexible join operation that can combine two relations based on any condition. Its generality allows it to represent various join operations, including natural joins and equijoins.

    Equijoin

    As mentioned, the equijoin is a special case of the θ-join where the condition θ\theta is equality ==.

    Semi Join

    The semi-join is a relational algebra operation that selects tuples from one relation that match tuples in another relation. Unlike join operations, a semi-join result includes only attributes from the first relation and does not combine all attributes from both relations.

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(B1,B2,,Bm)S(B_1, B_2, \ldots, B_m), the semi-join RSR \ltimes S is defined as:

    RS={tRsS such that t and s satisfy the join condition θ}R \ltimes S = \{ t \in R \mid \exists s \in S \text{ such that } t \\ \text{ and } s \text{ satisfy the join condition } \theta \}

    where θ\theta is the join condition, typically t[Ai]=s[Bj]t[A_i] = s[B_j] (i.e., equality join condition).

    Assume we have two relations RR and SS with the following structure and data:

    Relation RR:

    IDNameDeptID1Alice1012Bob1023Carol1034Dave104\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} \\ \hline 1 & \text{Alice} & 101 \\ \hline 2 & \text{Bob} & 102 \\ \hline 3 & \text{Carol} & 103 \\ \hline 4 & \text{Dave} & 104 \\ \hline \end{array}

    Relation SS:

    DeptIDDeptName101HR102IT105Finance\begin{array}{|c|c|} \hline \textbf{DeptID} & \textbf{DeptName} \\ \hline 101 & \text{HR} \\ \hline 102 & \text{IT} \\ \hline 105 & \text{Finance} \\ \hline \end{array}

    Now, performing the semi-join RSR \ltimes S with the condition R.DeptID=S.DeptIDR.DeptID = S.DeptID:

    RSR \ltimes S

    Resulting Relation:

    IDNameDeptID1Alice1012Bob102\begin{array}{|c|c|c|} \hline \text{ID} & \text{Name} & \text{DeptID} \\ \hline 1 & \text{Alice} & 101 \\ \hline 2 & \text{Bob} & 102 \\ \hline \end{array}

    In this example, the semi-join selects tuples from RR where DeptIDDeptID has a match in SS, returning these matching tuples from RR. Unlike join operations, the result includes only attributes from RR, excluding attributes from SS.

    Summary: The semi-join is used to extract tuples from one relation that have matches in another relation without including the second relation's attributes. It is useful for query optimization and reducing data redundancy.

    Anti Join

    The anti-join is a relational algebra operation used to select tuples from one relation that do not have matches in another relation. In other words, an anti-join returns tuples that do not satisfy the join condition.

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(B1,B2,,Bm)S(B_1, B_2, \ldots, B_m), the anti-join RSR \mathbin{\ltimes} S (or R(RS)R \setminus (R \bowtie S)) is defined as:

    RS={ttR and there is no sS such that t and s satisfy the join condition θ}R \ltimes S = \{ t \mid t \in R \text{ and there is no } s \in S \text{ such that } t \\ \text{ and } s \text{ satisfy the join condition } \theta \}

    where θ\theta is the join condition, typically t[Ai]=s[Bj]t[A_i] = s[B_j] (i.e., equality join condition).

    Assume we have two relations RR and SS with the following structure and data:

    Relation RR:

    IDNameDeptID1Alice1012Bob1023Carol1034Dave104\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} \\ \hline 1 & \text{Alice} & 101 \\ \hline 2 & \text{Bob} & 102 \\ \hline 3 & \text{Carol} & 103 \\ \hline 4 & \text{Dave} & 104 \\ \hline \end{array}

    Relation SS:

    DeptIDDeptName101HR102IT105Finance\begin{array}{|c|c|} \hline \textbf{DeptID} & \textbf{DeptName} \\ \hline 101 & \text{HR} \\ \hline 102 & \text{IT} \\ \hline 105 & \text{Finance} \\ \hline \end{array}

    Now, performing the anti-join RSR \ltimes S with the condition R.DeptID=S.DeptIDR.DeptID = S.DeptID:

    RSR \ltimes S

    Resulting Relation:

    IDNameDeptID3Carol1034Dave104\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} \\ \hline 3 & \text{Carol} & 103 \\ \hline 4 & \text{Dave} & 104 \\ \hline \end{array}

    In this example, the anti-join selects tuples from RR where DeptIDDeptID does not have a corresponding match in SS. Tuples with DeptIDDeptID values 103 and 104 are included in the result because they have no matches in SS.

    Summary: The anti-join is used to find tuples from one relation that do not match any tuples in another relation based on the join condition. It is useful for identifying records that do not meet certain criteria, such as finding entities not included in a specific group or condition.

    Division

    In relational algebra, the division operation is a higher-level operation used to select tuples from one relation that are associated with all tuples in another relation. In other words, the result of a division operation includes those tuples from the dividend relation that match every tuple in the divisor relation.

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(B1,B2,,Bm)S(B_1, B_2, \ldots, B_m), where B1,B2,,BmB_1, B_2, \ldots, B_m is a subset of A1,A2,,AnA_1, A_2, \ldots, A_n (denoted as BiAiB_i \subseteq A_i), the division operation R÷SR \div S is defined as:

    R÷S={t[A1,A2,,Anm]sS,rR, such that r[B1,B2,,Bm]=s}R \div S = \\ \{ t[A_1, A_2, \ldots, A_{n-m}] \mid \forall s \in S, \exists r \in R, \\ \text{ such that } r[B_1, B_2, \ldots, B_m] = s \}

    The resulting relation includes tuples that represent the values of attributes A1,A2,,AnmA_1, A_2, \ldots, A_{n-m} in RR that are associated with every tuple in SS.

    Suppose we have two relations, RR and SS, with the following structures and data:

    Relation RR:

    StudentSubjectAliceMathAlicePhysicsBobMathBobPhysicsCarolMathDaveMath\begin{array}{|c|c|} \hline \textbf{Student} & \textbf{Subject} \\ \hline \text{Alice} & \text{Math} \\ \hline \text{Alice} & \text{Physics} \\ \hline \text{Bob} & \text{Math} \\ \hline \text{Bob} & \text{Physics} \\ \hline \text{Carol} & \text{Math} \\ \hline \text{Dave} & \text{Math} \\ \hline \end{array}

    Relation SS:

    SubjectMathPhysics\begin{array}{|c|} \hline \textbf{Subject} \\ \hline \text{Math} \\ \hline \text{Physics} \\ \hline \end{array}

    Now, performing the division operation R÷SR \div S yields the students who have taken all the subjects listed in SS:

    R÷SR \div S

    Resulting Relation:

    StudentAliceBob\begin{array}{|c|} \hline \textbf{Student} \\ \hline \text{Alice} \\ \hline \text{Bob} \\ \hline \end{array}

    In this example, RR is a relation that includes students and the subjects they have taken, while SS is a list of subjects of interest. The result of the division operation is the set of students who have taken every subject in SS, so only AliceAlice and BobBob appear in the result, as they have taken both MathMath and PhysicsPhysics.

    Conclusion: The division operation is particularly useful in database queries when looking for tuples that meet all specified conditions. It helps in finding a subset of tuples that fulfill a particular requirement.

    Outer Join

    In relational algebra, an outer join is an extended join operation that preserves tuples that do not satisfy the join condition. Outer joins are classified into three types: Left Outer Join, Right Outer Join, and Full Outer Join. The result of these operations includes not only the tuples that satisfy the join condition but also the unmatched tuples, with NULLNULL used to fill in missing attribute values.

    Left Outer Join

    Formal Definition:

    Given two relations R(A1,A2,,An)R(A_1, A_2, \ldots, A_n) and S(B1,B2,,Bm)S(B_1, B_2, \ldots, B_m), the Left Outer Join RSR⟕S is defined as:

    RS={tt=(rs) and rR,sS satisfy the join condition θ}{rrR and there is no sS that satisfies the join condition θ}R⟕S = \{ t \mid t = (r \cup s) \text{ and } r \in R, s \in S \text{ satisfy the join condition } \theta \} \\ \cup \{ r \mid r \in R \text{ and there is no } s \in S \text{ that satisfies the join condition } \theta \}

    The Left Outer Join returns all tuples from RR. If a tuple in RR does not have a match in SS, the corresponding SS attributes in the result are filled with NULLNULL.

    Relation RR:

    IDNameDeptID1Alice1012Bob1023Carol1034Dave104\begin{array}{|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} \\ \hline 1 & \text{Alice} & 101 \\ \hline 2 & \text{Bob} & 102 \\ \hline 3 & \text{Carol} & 103 \\ \hline 4 & \text{Dave} & 104 \\ \hline \end{array}

    Relation SS:

    DeptIDDeptName101HR102IT105Finance\begin{array}{|c|c|} \hline \textbf{DeptID} & \textbf{DeptName} \\ \hline 101 & \text{HR} \\ \hline 102 & \text{IT} \\ \hline 105 & \text{Finance} \\ \hline \end{array}

    Left Outer Join RSR⟕S Result:

    IDNameDeptIDDeptName1Alice101HR2Bob102IT3Carol103NULL4Dave104NULL\begin{array}{|c|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} & \textbf{DeptName} \\ \hline 1 & \text{Alice} & 101 & \text{HR} \\ \hline 2 & \text{Bob} & 102 & \text{IT} \\ \hline 3 & \text{Carol} & 103 & \text{NULL} \\ \hline 4 & \text{Dave} & 104 & \text{NULL} \\ \hline \end{array}

    In this example, CarolCarol and DaveDave have DeptIDDeptIDs that do not match any DeptIDDeptID in SS, so the DeptNameDeptName column is NULLNULL for these rows in the result.

    Right Outer Join

    Formal Definition:

    The Right Outer Join is similar to the Left Outer Join, except that it preserves all tuples from the second relation. If a tuple in the second relation has no match in the first relation, the corresponding attributes from the first relation are filled with NULLNULL.

    RS={tt=(rs) and rR,sS satisfy the join condition θ}{ssS and there is no rR that satisfies the join condition θ}R⟖S = \{ t \mid t = (r \cup s) \text{ and } r \in R, s \in S \text{ satisfy the join condition } \theta \} \\ \cup \{ s \mid s \in S \text{ and there is no } r \in R \text{ that satisfies the join condition } \theta \}

    Right Outer Join RSR⟖S Result:

    IDNameDeptIDDeptName1Alice101HR2Bob102ITNULLNULL105Finance\begin{array}{|c|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} & \textbf{DeptName} \\ \hline 1 & \text{Alice} & 101 & \text{HR} \\ \hline 2 & \text{Bob} & 102 & \text{IT} \\ \hline \text{NULL} & \text{NULL} & 105 & \text{Finance} \\ \hline \end{array}

    In this example, the tuple with DeptIDDeptID 105 in SS has no matching tuple in RR, so the IDID and NameName columns are NULLNULL for this row in the result.

    Full Outer Join

    Formal Definition:

    A Full Outer Join returns all tuples from both relations, including those that do not match. If a tuple in one relation has no corresponding match in the other, the attributes from the other relation are filled with NULLNULL.

    RS=(RS)(RS)R⟗S = (R⟕S) \cup (R⟖S)

    Full Outer Join RSR⟗S Result:

    IDNameDeptIDDeptName1Alice101HR2Bob102IT3Carol103NULL4Dave104NULLNULLNULL105Finance\begin{array}{|c|c|c|c|} \hline \textbf{ID} & \textbf{Name} & \textbf{DeptID} & \textbf{DeptName} \\ \hline 1 & \text{Alice} & 101 & \text{HR} \\ \hline 2 & \text{Bob} & 102 & \text{IT} \\ \hline 3 & \text{Carol} & 103 & \text{NULL} \\ \hline 4 & \text{Dave} & 104 & \text{NULL} \\ \hline \text{NULL} & \text{NULL} & 105 & \text{Finance} \\ \hline \end{array}

    In a Full Outer Join, the result includes all tuples from both RR and SS, with unmatched portions filled with NULLNULL.

    Conclusion:

    • Left Outer Join preserves all tuples from the left table, filling unmatched parts with NULLNULL.
    • Right Outer Join preserves all tuples from the right table, filling unmatched parts with NULLNULL.
    • Full Outer Join preserves all tuples from both tables, filling unmatched parts with NULLNULL.

    These outer join operations are particularly useful in database queries when we want to retain unmatched records in the result.

    Share with the post url and description