Table of Contents
Aggregate Functions
SUM
AVG
COUNT
GROUP BY and HAVING
GROUP BY
HAVING
DISTINCT and NULL Handling
DISTINCT
NULL
Math Functions and SQL Arithmetic
Math Functions
SQL Arithmetic
CASE Statements
Simple CASE
Searched CASE
JOIN Operations
INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL OUTER JOIN
Sample test data files for Hands-on
Employee: https://github.com/vipinputhanveetil/ctb_blog_demo_files/blob/main/databricks_files/employees.csv
Orders: https://github.com/vipinputhanveetil/ctb_blog_demo_files/blob/main/databricks_files/orders.csv
Products: https://github.com/vipinputhanveetil/ctb_blog_demo_files/blob/main/databricks_files/products.csv
Sales: https://github.com/vipinputhanveetil/ctb_blog_demo_files/blob/main/databricks_files/sales.csv
1. Aggregate Functions
SUM
The SUM
function calculates the sum of a set of values.
SELECT SUM(amount) AS total_sales FROM sales;
This query calculates the total sales amount from the sales
table.
AVG
The AVG
function calculates the average of a set of values.
SELECT AVG(amount) AS avg_amount FROM sales;
This query calculates the average amount from the sales
table.
COUNT
The COUNT
function counts the number of rows returned by a query.
SELECT COUNT(*) AS total_orders FROM sales;
This query counts the total number of orders in the sales
table.
2. GROUP BY and HAVING
GROUP BY
The GROUP BY
clause groups rows that have the same values into summary rows.
SELECT region, SUM(amount) AS total_sales FROM sales GROUP BY region;
This query groups sales by region and calculates total sales for each region.
HAVING
The HAVING
clause is used to filter records returned by the GROUP BY
clause.
SELECT region, SUM(amount) AS total_sales FROM sales GROUP BY region HAVING SUM(amount) > 500;
This query selects regions with total sales greater than 500.
3. DISTINCT and NULL Handling
DISTINCT
The DISTINCT
keyword is used to return unique values in a specified column.
SELECT DISTINCT region FROM sales;
This query selects distinct regions from the sales
table.
NULL
Handling NULL values using IS NULL
and IS NOT NULL
.
SELECT product, amount FROM sales WHERE region IS NOT NULL;
This query selects products with NULL region values.
4. Math Functions and SQL Arithmetic
Math Functions
SQL supports standard math functions like ROUND
, CEIL
, FLOOR
, ABS
, etc.
SELECT product, ROUND(amount/1.1, 4) AS rounded_amount FROM sales;
This query rounds amount
to four decimal places.
SQL Arithmetic
Basic arithmetic operations (+, -, *, /) can be performed in SQL.
SELECT product, amount * 1.1 AS new_amount FROM sales;
This query calculates 10% increase in the amount
column.
5. CASE Statements
Simple CASE
The CASE
statement evaluates a list of conditions and returns one of multiple possible results.
SELECT product,
CASE region
WHEN 'North' THEN 'N'
WHEN 'South' THEN 'S'
ELSE 'Other'
END AS region_code
FROM sales;
This query assigns region codes based on the region value.
Searched CASE
The CASE
statement can also be used for searched conditions.
SELECT product,
CASE
WHEN amount > 200 THEN 'High'
WHEN amount > 100 THEN 'Medium'
ELSE 'Low'
END AS amount_category
FROM sales;
This query categorizes products based on the amount.
6. JOIN Operations
INNER JOIN
An INNER JOIN
returns rows when there is at least one match in both tables.
SELECT s.product, s.amount, p.category
FROM sales s
INNER JOIN products p
ON s.product_id = p.product_id;
This query joins sales
and products
tables based on product_id
.
LEFT JOIN
A LEFT JOIN
returns all rows from the left table and matching rows from the right table.
SELECT s.product, s.amount, p.category
FROM sales s
LEFT JOIN products p
ON s.product_id = p.product_id;
This query retrieves all sales and matches them with products.
RIGHT JOIN
A RIGHT JOIN
returns all rows from the right table and matching rows from the left table.
SELECT s.product, s.amount, p.category
FROM sales s
RIGHT JOIN products p
ON s.product_id = p.product_id;
This query retrieves all products and matches them with sales.
FULL OUTER JOIN
A FULL OUTER JOIN
returns all rows when there is a match in either table.
SELECT s.product, s.amount, p.category
FROM sales s
FULL OUTER JOIN products p
ON s.product_id = p.product_id;
This query retrieves all sales and products, with matching rows from both tables.
Conclusion
Mastering these SQL queries will empower you to manipulate and analyze your data effectively. Whether you are a beginner or an experienced SQL user, understanding these fundamental concepts is essential for advanced data analysis and reporting.