With so much choice out there, it can be easy to concentrate on emerging technologies rather than the basics. But from time-time, going over the basics can be beneficial to get the best performance. It's a term used a lot in sport, and it's no different with Software Development.
In this article, I'm going to go over some basic SQL statements, in particular, joining tables together. The examples will be written for SQL Server, but should also work for MySQL.
The Example
The example below shows part of an eCommerce database. There are two tables. A Product table and a OrderProduct table. The Product table has a one-to-many relationship with the OrderProduct Table.
Product
ProductId
Name
Price
1
T-Shirt
10.00
2
Shoes
25.00
3
Hat
4.00
4
Tie
2.00
5
Jumper
20.00
OrderProduct
OrderProductId
ProductId
Name
Price
1
1
T-Shirt
10.00
2
1
T-Shirt
10.00
3
1
T-Shirt
10.00
4
2
Shoes
25.00
5
2
Shoes
25.00
6
5
Jumper
20.00
Joining
Both tables join each other by using the ProductId. The ProductId on the Product table joins the OrderProduct table by the exact same key.
Inner Join
Inner join allows you to return the results of both tables, as long as the joining column exists in both table records. If the joining column record does not exist on either table record, it will not be returned.
In the example below, we are inner joining both Product and OrderProduct together and returning the columns from the Product table in the SELECT statement. Because ProductId 3 and 4 do not exist in the OrderProduct table, they do not get returned.
SELECT Product.ProductId, Product.[Name], Product.Price FROM dbo.Product
INNER JOIN dbo.OrderProduct ON Product.ProductId=OrderProduct.ProductId
ORDER BY Product.ProductId
ProductId
Name
Price
1
T-Shirt
10.00
1
T-Shirt
10.00
1
T-Shirt
10.00
2
Shoes
25.00
2
Shoes
25.00
5
Jumper
20.00
But wait, this is returning duplicate records, how do we only return unique records from the Product table? By using GROUP BY. This allows us to group the results by particular columns. The example below shows us grouping the results by ProductId and Name.
SELECT Product.ProductId, Product.[Name] FROM dbo.Product
INNER JOIN dbo.OrderProduct ON Product.ProductId=OrderProduct.ProductId
GROUP BY Product.ProductId, Product.[Name]
ORDER BY Product.ProductId
ProductId
Name
1
T-Shirt
2
Shoes
5
Jumper
Now, we can amend this further by working how much was spent for each product by using the SUM function. As you can probably guess, the SUM function adds up all the columns and returns the total. We can also use the COUNT function to work out the quantity for each product. You can only use these functions when grouping data together.
SELECT Product.ProductId, Product.[Name], COUNT(OrderProduct.OrderProductId) AS Quantity, SUM(OrderProduct.Price) AS TotalPrice FROM dbo.Product
INNER JOIN dbo.OrderProduct ON Product.ProductId=OrderProduct.ProductId
GROUP BY Product.ProductId, Product.[Name]
ORDER BY Product.ProductId
ProductId
Name
Quantity
TotalPrice
1
T-Shirt
3
30.00
2
Shoes
2
50.00
5
Jumper
1
50.00
Left Join and Right Join
Left join (or left outer join) allows you to join two tables, but the initial table will return all records, regardless of whether the key exists in the joining table.
SELECT Product.ProductId, Product.[Name] FROM dbo.Product
LEFT JOIN dbo.OrderProduct ON Product.ProductId=OrderProduct.ProductId
GROUP BY Product.ProductId, Product.[Name]
ORDER BY Product.ProductId
ProductId
Name
1
T-Shirt
2
Shoes
3
Hat
4
Tie
5
Jumper
Right join does the exact opposite. It will return all results from the joining table, even if the join doesn't exist in the initial table. But it won't return results from the initial table if it does not exist in the joining table.
SELECT Product.ProductId, Product.[Name] FROM dbo.Product
RIGHT JOIN dbo.OrderProduct ON Product.ProductId=OrderProduct.ProductId
GROUP BY Product.ProductId, Product.[Name]
ORDER BY Product.ProductId
ProductId
Name
1
T-Shirt
2
Shoes
5
Jumper
Again, with left and right joins, you can take advantage of grouping to get the totals in the exact same way that you would do it for inner joins.
Cross Join
Cross join is an interesting one. For each record in the initial table, it will return all the results in the joining table. This leads to a large number of results being returned.
As I don't want a large number of results returned, I've put in a where clause that only returns a Product if the ID is 1 or 2.
SELECT Product.ProductId, Product.[Name] FROM dbo.Product
CROSS JOIN dbo.OrderProduct
WHERE Product.ProductId IN (1,2)
ORDER BY Product.ProductId
ProductId
Name
1
T-Shirt
1
T-Shirt
1
T-Shirt
1
T-Shirt
1
T-Shirt
1
T-Shirt
2
Shoes
2
Shoes
2
Shoes
2
Shoes
2
Shoes
2
Shoes
I haven't used cross joins on too many occasions in my career. The only real example that I can think off where it could be beneficial is if you were creating a sports league and you wanted to output all the fixtures. If you had a Team table, you could cross join it with the same table and exclude records where both ID's are the same.
Conclusion
It's worth noting that joins can be expensive to the performance of your SQL Server. In our example above, we are only working with a very limited number of tables and records. But the more tables you join and the more records you have within those tables, the longer it will return the results.
When planning out your database architecture, always have this in your mind!
About the author
David Grace
Senior .NET web developer | ASP.NET Core | C# | Software developer