SQL Joins Interview Questions

SQL Joins

SQL Joins Interview Questions: What are different types of SQL JOIN? Why do we require SQL JOINs? What is a HASH JOIN? How to use it?

Nervous about your interview days? Giving your first interview ever for Oracle Database Manager? Do not worry. After reading this article, you will go through some important theories that will be included in fifty percent of your interview trivia rounds. This can be asked in university as well as job interviews.

SQL Joins Interview Questions

BASIC LEVEL:

  1. What is a JOIN in SQL?

In SQL, JOIN is a clause used to combine data/rows from two or more tables depending upon a matching column between them. Let us walk through an example. 

Customers
CustomerID *
CustomerName
ContactNumber
Orders
CustomerID *
OrderID
OrderDate

Note:  *  besides the column name means PRIMARY KEY.

Both tables have a common field, CustomerID which is used to relate the tables with each other. Suppose we want all the CustomerName and their OrderID, we will perform the following Query:

SELECT Customers.CustomerName, Orders.OrderId

FROM Customers INNER JOIN Orders

ON Customer.CustomerID = Orders.CustomerID

  1. Why do we require SQL JOINs?

A JOIN is used to logically combine data from two or more columns in the database systems to extract meaningful data. Since the database can be of colossal size and we do not want to see all the columns while dealing with a particular requirement, the JOIN clause is integral. Therefore, employing the matching column, we make a logically bonding and extract data efficiently.

  1. What are different types of SQL JOIN?

Here are some basic JOIN types:

  • INNER JOIN: Returns records ONLY that are matching in both tables.
  • LEFT (OUTER) JOIN: Returns all the records in table1 AND records of table2 that match with table1.
  • RIGHT (OUTER) JOIN: Returns all the records in table2 AND records of table1 that match with table2.
  • FULL OUTER JOIN: Returns ALL the records of table1 AND table2 irrespective of their repetition.

Look at the following diagram to understand the JOIN operations precisely:

Simple demonstration of SQL query to perform RIGHT OUTER JOIN:

SELECT Customers.CustomerName, Orders.OrderId

FROM Customers RIGHT OUTER JOIN Orders

ON Customer.CustomerID = Orders.CustomerID

INTERMEDIATE LEVEL:

  1. Explain CROSS JOIN, EQUI JOIN and SEMI JOIN. 
  • CROSS JOIN:

In CROSS JOIN, the result displays a number of rows of table1 MULTIPLIED by a number of rows of table2. The CROSS JOIN, if used without WHERE clause it displays result just as CARTESIAN PRODUCT.

Suppose table1 has eight rows and table2 has five rows, then the CROSS JOIN without WHERE  clause will include 8×5=40 rows; thus, each record in table1 has five occurrences in the final result.

While, WHERE is used, it functions like INNER JOIN.

An alternative way to achieve the CROSS JOIN result is by explicitly mentioning the names of columns after SELECT and names of tables after FROM.

  • EQUI JOIN:

In EQUI JOIN, the result displays the rows of two tables that match with each other provided the matching criteria specified. This can be achieved in two ways.

First, an equal sign (=) is used in WHERE clause. 

In other way, use JOIN clause followed by ON keyword and specify the column name with associated table names.

  • SEMI JOIN:

In SEMI JOIN, the yield is the tuples in the table1 having the same matching column values in table2. SEMI JOIN is achieved by using either IN or EXISTS keyword in the SQL Query. As it is a bit tricky to comprehend, let us use our Customers and Orders table.

Customers(CustomerID, CustomerName, ContactNumber)

Orders(CustomerID, OrderID, OrderDate)

Let the query be:

SELECT Customers.CustomerName

FROM Customers WHERE

Customer.CustomerID IN (

SELECT Orders.CustomerID

FROM Orders )

Thus, this will display Customer Names from the Customers table having to match CustomerID with the Orders table.

  1. Explain NATURAL JOIN.

In EQUI JOIN, using comparison operation (=) one can display column(s) values of associated tables as per the matching condition or column.

Similarly,  NATURAL JOIN is performed but the only difference is that the unique names of columns will appear associated with each table.

Certain points are necessary to keep in mind while using NATURAL JOIN:

  • The tables must have one or more pairs of IDENTICALLY named columns.
  • Try not to use ON keyword while using NATURAL JOIN.
  • The datatypes of columns must align.

Syntax:

SELECT *

FROM Customers NATURAL JOIN Orders

  1. Explain MULTIPLE JOIN.

In Multiple JOIN, the same different types of JOINs are present more than once in the query used to retrieve data. This helps us to overcome issues of Relational DBMS.

Suppose there are three tables: Customers, Orders and Payment and we need to find the records of customers that have made placed orders and made payments in cash. Well, it makes a hell if we go and sought each record of each database. Thus, multiple JOIN clauses aids in time-saving and effortlessly display results as required.

 The Venn diagram vividly explains the idea behind it.

ADVANCED LEVEL:

  1. Explain MERGE JOIN in SQL.

In MERGE JOIN ( or Sort-Merge Join), two requirements need to be fulfilled in order to perform it:

  • The rows from both the tables must be sorted on join keys/merge columns.
  • It requires at least one EQUI JOIN (=) expression.

Mainly, it matches one row of table1 with the row of table2, and if they match, the row appears in the result, and this process goes on till all rows are processed. The lesser of one row is ignored and did not appear in the result.

Since all the rows are pre-sorted the result is processed faster on large datasets.

  1. What is a HASH JOIN? How to use it?

 HASH JOIN is typically used when inputs are large, unindexed, and unsorted. They have two types of inputs: build input and probe input. The query optimizer distributes the role so that the smaller of the two inputs is the build input.

One of the hash join variants can perform duplicate removal and grouping viz. 

SUM( payment) GROUP BY PaymentMode

The query is a type of HASH JOIN:

SELECT Customers.CustomerName, Orders.OrderDate

FROM Customers JOIN Orders

ON Customers.CustomerID = Orders.CustomerID

ORDER BY Orders.OrderDate

  1. What is the best time to use CROSS JOIN?

A CROSS JOIN is nothing but a CARTESIAN PRODUCT of each row of associated tables. Thus, the combination of the rows can be helpful in many scenarios, like suppose one runs a company and want to make sure each minute a task is carried out in the firm. Therefore, one has a row named MINUTE when tasks were carried out and a table with an HOUR row. To monitor, we will use CROSS JOIN as follows:

SELECT H.hours, M.minutes

FROM Hours H CROSS JOIN Minutes M 

  1.  Create an example scenario (table with column names) of a one-to-many relationship and describe steps to have column names present in table1 OR table2 but NOT IN BOTH.

Step 1: Create tables having one-to-many relationship.

In the tables created above: Customers and Orders, one customer can have many orders. However, one order cannot belong to multiple customers. Thus we will use them here.

Customers
CustomerID *
CustomerName
ContactNumber
Orders
CustomerID *
OrderID
OrderDate

Step 2: Logic to present the result : In Customers OR Orders, but NOT IN both.

In simple words, it is like a ‘bite from the middle’ of two tables.

  1. FULL OUTER JOIN on A and B – new A ( A’).
  2. LEFT OUTER JOIN on A’ and B – new B (B’).
  3. Use NOT IN (one column match) or NOT EXISTS ( multiple column match) to include A’ data but not the one in B.’

To explain simply, let us say:

A(1,2,3) and B(3,4,5)

Now, by applying above steps we get:

  1. A’(1,2,3,4,5) 
  2. B’(1,2,3)
  3. Result – A’(1,2)
  1. Explain ANSI and NON ANSI JOIN.

There are times when outer tables have NULL values and providing filters in WHERE clause creates confusion as well as provide wrong information. 

Suppose in the Customers table we need names of customers that placed orders on a date after 2020-02-02, i.e., date>= ‘2020-02-02’. There might be columns in Orders with NULL values, and such records are ignored in the result. Thus, to overcome these problems, ANSI and NON-ANSI JOIN methods are bliss.

  • ANSI JOIN:

In ANSI JOIN syntax, the filters on columns of the outer joined table ( here, Orders) are mentioned in the WHERE clause:

SELECT Customers.CustomerName, Orders.OrderDate

FROM Customers LEFT OUTER JOIN Orders 

ON Customer.CustomerID=Orders.CustomerID AND OrderDate >= ‘2020-02-02’ 

ORDER BY Customers.CustomerName

  • NON-ANSI JOIN:

In NON-ANSI JOIN syntax, the ‘+’ is used to represent that the outer joined table column might have NULL values in it.

SELECT Customers.CustomerName, Orders.OrderDate

FROM Customers, Orders 

WHERE Customer.CustomerID=Orders.CustomerID AND OrderDate >= ‘2020-02-02’ (+)

ORDER BY Customers.CustomerName

The above questions are challenging yet the most basic ones that a programmer must know when going for an interview. In interviews, one might be asked to give a simple example of tables or write queries to perform any JOIN, thus prepare accordingly. Strengthen your coding skills and flex the coding muscles on the interview day!

Also read: Coding Salary: How Much Do Coders Make?

SQL Joins Interview Questions

Leave a Reply

Your email address will not be published. Required fields are marked *

Scroll to top