Recursive Common Table Expressions

Recursion (in SQL) is a process in which a query executes itself repeatedly for a given number of time to obtain a result set.

Recursive Query: When a query references a recursive CTE, it is referred to as Recursive Query. A common use of recursive queries is to return hierarchical data.

Recursive Common Table Expression: When an initial CTE is repeatedly executed to return subsets of data until the complete result set is obtained, it becomes a recursive Common Table Expression.

The code required to run a recursive query within a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement can be greatly simplified by the use of recursive CTE.

Structure of a Recursive CTE

The structure of a recursive CTE is similar to recursive routines in other programming languages. A recursive CTE can return multiple rows whereas a recursive routine in other languages returns a scalar value.

A recursive CTE consists of three elements:
  • Invocation – The invocation of the recursive CTE consists of one or more CTE_query_definitions joined by UNION ALL, UNION, EXCEPT, or INTERSECT operators.
    These query definitions are referred to as anchor members because they form the base result set of the CTE structure.
    CTE_query_definitions are considered anchor members unless they reference the CTE itself. All anchor-member query definitions must be positioned before the first recursive member definition, and a UNION ALL operator must be used to join the last anchor member with the first recursive member.
  • Recursive Invocation – Recursive invocation includes one or more CTE_query_definitions joined by UNION ALL operators that reference the CTE itself. These query definitions are referred to as recursive members.
  • Termination check – The termination check is not explicit, it is implicit. When no rows are returned from the previous invocation, recursion stops.

Common Table Expressions (CTE) in SQL

A Common Table Expression (CTE) is a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE can only be referenced by the statement that immediately follows it. A CTE allows access to functionality within that single SQL statement that was previously only available through use of functions, temp tables, cursors, and so on.

A CTE can also be thought of a disposable view. No indexes or additional statistics are stored for a CTE, and it functions as a shorthand for a sub-query.

A CTE is not stored as an object and lasts only for the duration of the query. So, it is similar to a derived table. However, unlike a derived table, a CTE can be self-referencing (recursive CTE) and referenced multiple times in the same query.

A CTE can be defined in user-defined routines, such as functions, stored procedures, triggers, or views.

When could a Common Table Expression be used?

A CTE could be used to fulfill the following requirements:
  • To create a recursive query.
  • When the general use of a view is not required, a CTE can be substituted for a view; that is, you do not have to store the definition in metadata.
  • A CTE can be used to enable grouping by a column that is derived from a scalar sub-select, or a function that is either not deterministic or has external access.
  • A CTE can be used to reference the resulting table multiple times in the same statement.

Structure of a CTE

WITH expression_name [ ( column_name [,...n] ) ]
AS
( CTE_query_definition )
SELECT <column_list>
FROM expression_name;

Example

;WITH WebsiteDetails AS -- notice the semicolon here
(
SELECT ROW_NUMBER() OVER (ORDER BY in_website_id) AS [No.],
in_website_id, vc_website_domain_name
FROM dbo.website
WHERE bt_active = 1
)
SELECT * FROM WebsiteDetails

Note: The semicolon before WITH is used basically to terminate previous SQL statements to avoid errors.

SELECT statement – Order of Execution of SQL Clauses

SELECT statements retrieves rows from the database and enables the selection of one or many rows or columns from one or many tables. They do not modify data in any way and are simply a method of looking at the data stored in a single table or in many related tables.

SELECT statements do not generate a transaction in Microsoft SQL Server either.

Syntax of the SELECT statement

SELECT [ ALL | DISTINCT ]
[TOP ( expression ) [PERCENT] [ WITH TIES ] ]
< select_list >
[ INTO new_table ]
[FROM {table_name | view_name}[(optimizer_hints)]
[[, {table_name2 | view_name2}[(optimizer_hints)]
[..., {table_name16 | view_name16}[(optimizer_hints)]]]
[WHERE <search_condition>]
[GROUP BY clause]
[HAVING <search_condition>]
[ORDER BY clause]
[COMPUTE clause]
[FOR BROWSE]

Logical Processing Order of the SELECT statement


The order execution of the clauses in the SELECT statement is significant. Given below is the order in which SQL clauses get executed:
  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP

Differentiate between local and global temporary tables in SQL

In SQL Server, temporary tables can be created in runtime and can do all sorts of things which can be achieved by a normal table. However, since these are temporary tables, their scope is limited. There are two types of temporary tables:
  • Local temporary tables
  • Global temporary tables
Both of these temporary tables are physical tables which are created within the Temporary Tables folder in tempdb database. Temporary tables are automatically dropped when they go out of scope, unless they are explicitly dropped using DROP TABLE.
  • Local temporary tables are visible only in the current session, i.e; are only available to the current connection for the user.
  • Local temporary tables are deleted or destroyed after the user disconnects from the instance of SQL Server.
  • Local temporary tables are are automatically destroyed at the termination of the stored procedure or session that created them.
  • Local temporary table name is prefixed with hash ("#") sign.
  • Global temporary tables are visible to all sessions and all users.
  • Global temporary tables are deleted or dropped when all users referencing the table disconnect from the instance of SQL Server.
  • Global temporary tables are dropped automatically when the last session using the temporary table has completed.
  • Global temporary table name is prefixed with a double hash ("##") sign.

Table creation statements
  • Table variables (DECLARE @tmp_surgeon TABLE) are visible only to the connection that creates it, and are deleted when the batch or stored procedure ends.
  • Local temporary tables (CREATE TABLE #tmp_surgeon) are visible only to the connection that creates it, and are deleted when the connection is closed.
  • Global temporary tables (CREATE TABLE ##tmp_surgeon) are visible to everyone, and are deleted when all connections that have referenced them have closed.
  • Tempdb permanent tables (USE tempdb CREATE TABLE tmp_surgeon) are visible to everyone, and are deleted when the server is restarted.

Local temporary table – Usage

CREATE TABLE #tmp_local (in_user_id int, vc_username varchar(50), vc_address varchar(150))

INSERT INTO #tmp_local VALUES (1, 'Rohan', 'India');

SELECT * FROM #tmp_local

After execution of all these statements, if you close the query window and again execute "INSERT" or "SELECT" Command, it will throw the following error:

Msg 208, Level 16, State 0, Line 1
Invalid object name '#tmp_local'.

Global temporary table – Usage

CREATE TABLE ##tmp_global (in_user_id int, vc_username varchar(50), vc_address varchar(150))

INSERT INTO ##tmp_global VALUES (1, 'Rohan', 'India');

SELECT * FROM ##tmp_global

Global temporary tables are visible to all SQL Server connections. Once you create a global temporary table, it becomes visible to all the users.

Count the number of occurrence of a character in a given string – SQL

Count the number of occurrence of a 'o' in the given string: Ram is a good boy.

DECLARE @vc_string varchar(40)
SET @vc_string = 'Ram is a good boy.'

SELECT LEN(@vc_string) - LEN(REPLACE(@vc_string, 'o', '')) AS in_count

Output: 3

Differentiate between @@IDENTITY, SCOPE_IDENTITY, IDENT_CURRENT in SQL

Identity columns, also known as AUTO INCREMENT column, is the column whose value is auto incremented by SQL Server on each insert. Data insertion in an IDENTITY column is automatic (not manual - we do not insert any data manually in the identity column, the data is inserted for that IDENTITY column automatically by SQL Server, depending upon the Identity Increment value).

When we need to retrieve the latest inserted row information through SQL Query, we can use SCOPE_IDENTITY, IDENT_CURRENT, and @@IDENTITY functions. All three functions return the last-generated identity values, however, these functions differ on the scope and session on which last is defined.

  • @@IDENTITY – It returns the last identity value generated for any table in the current session, across all scopes. @@IDENTITY is not limited to a specific scope.
    Ex: Suppose we create an insert trigger on table which inserts a row in another table which generates an identity column. Then @@IDENTITY returns that identity record which is created by trigger.
  • SCOPE_IDENTITYIt returns the last identity value generated for any table in the current session and the current scope.
    Ex: Suppose we create an insert trigger on table which inserts a row in another table which generates an identity column, then SCOPE_IDENTITY result is not affected but if a trigger or a user defined function is affected on the same table that produced the value returns that identity record then SCOPE_IDENTITY returns that identity record which is created by trigger or a user defined function.
  • IDENT_CURRENT – It returns the last identity value generated for a specific table in any session and any scope. It is not affected by scope and session and only depends on a particular table. IDENT_CURRENT returns the identity value for a specific table in any session or any scope.

Example: Lets assume that there are two tables, X and Y and both have identity columns. Define an INSERT trigger on X. When a row is inserted to X, the trigger fires and inserts a new row in Y. We can identify two scopes here:
  1. INSERT on X
  2. INSERT on Y by the trigger

@@IDENTITY and SCOPE_IDENTITY will return different values at the end of an INSERT statement on X. @@IDENTITY will return the last identity column value inserted across any scope in the current session. So, this is the value inserted in Y.

SCOPE_IDENTITY() will return the IDENTITY value inserted in X. This was the last insert that occurred in the same scope. The SCOPE_IDENTITY() function will return null value if the function is invoked before any INSERT statements into an identity column occur in the scope.

Note:
  1. Failed statements and transactions can change the current identity for a table and create gaps in the identity column values.
  2. The identity value is never rolled back even though the transaction that tried to insert the value into the table is not committed. For example, if an INSERT statement fails because of an IGNORE_DUP_KEY violation, the current identity value for the table is still incremented.

Example:

CREATE TABLE X(id int IDENTITY);
CREATE TABLE Y(id int IDENTITY(100,1));

GO

CREATE TRIGGER XIns ON X FOR INSERT
AS
BEGIN
INSERT Y DEFAULT VALUES
END;

GO
--End of trigger definition

SELECT id FROM X;
--id is empty

SELECT id FROM Y;
--ID is empty

--Do the following in Session 1
INSERT X DEFAULT VALUES;
SELECT @@IDENTITY;
/*Returns the value 100. This was inserted by the trigger.*/

SELECT SCOPE_IDENTITY();
/* Returns the value 1. This was inserted by the
INSERT statement - INSERT X DEFAULT VALUES;*/

SELECT IDENT_CURRENT('Y');
/* Returns value inserted into Y, that is in the trigger.*/

SELECT IDENT_CURRENT('X');
/* Returns value inserted into X.
This was the INSERT statement - INSERT X DEFAULT VALUES;*/

-- Do the following in Session 2.

SELECT @@IDENTITY;
/* Returns NULL because there has been no INSERT action
up to this point in this session.*/

SELECT SCOPE_IDENTITY();
/* Returns NULL because there has been no INSERT action
up to this point in this scope in this session.*/

SELECT IDENT_CURRENT('Y');
/* Returns the last value inserted into Y.*/