Block nested loop
Encyclopedia
A block-nested loop is an algorithm
used to join
two relations in a relational database
.
This algorithm is a variation on the simple nested loop join
used to join two relations and (the "outer" and "inner" join operands, respectively). Suppose . In a traditional nested loop join, will be scanned once for every tuple of . If there are many qualifying tuples, and particularly if there is no applicable index for the join key on , this operation will be very expensive.
The block nested loop join algorithm improves on the simple nested loop join by only scanning once for every group of tuples. For example, one variant of the block nested loop join reads an entire page
of tuples into memory and loads them into a hash table
. It then scans , and probes the hash table to find tuples that match any of the tuples in the current page of . This reduces the number of scans of that are necessary.
A more aggressive variant of this algorithm loads as many pages of as can be fit in the available memory, loading all such tuples into a hash table, and then repeatedly scans . This further reduces the number of scans of that are necessary. In fact, this algorithm is essentially a special-case of the classic hash join
algorithm.
The block nested loop runs in I/Os where is the number of available pages of internal memory and and is size of and respectively in pages. Note
that block nested loop runs in I/Os if fits in the available internal memory.
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
used to join
Join (SQL)
An SQL join clause combines records from two or more tables in a database. It creates a set that can be saved as a table or used as is. A JOIN is a means for combining fields from two tables by using values common to each. ANSI standard SQL specifies four types of JOINs: INNER, OUTER, LEFT, and RIGHT...
two relations in a relational database
Relational database
A relational database is a database that conforms to relational model theory. The software used in a relational database is called a relational database management system . Colloquial use of the term "relational database" may refer to the RDBMS software, or the relational database itself...
.
This algorithm is a variation on the simple nested loop join
Nested loop join
A nested loop join is a naive algorithm that joins two relations R and S by making two nested loops: For each tuple r in R do For each tuple s in S do If r and s satisfy the join condition Then output the tuple...
used to join two relations and (the "outer" and "inner" join operands, respectively). Suppose . In a traditional nested loop join, will be scanned once for every tuple of . If there are many qualifying tuples, and particularly if there is no applicable index for the join key on , this operation will be very expensive.
The block nested loop join algorithm improves on the simple nested loop join by only scanning once for every group of tuples. For example, one variant of the block nested loop join reads an entire page
Page (computing)
A page, memory page, or virtual page is a fixed-length contiguous block of virtual memory that is the smallest unit of data for the following:* memory allocation performed by the operating system for a program; and...
of tuples into memory and loads them into a hash table
Hash table
In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...
. It then scans , and probes the hash table to find tuples that match any of the tuples in the current page of . This reduces the number of scans of that are necessary.
A more aggressive variant of this algorithm loads as many pages of as can be fit in the available memory, loading all such tuples into a hash table, and then repeatedly scans . This further reduces the number of scans of that are necessary. In fact, this algorithm is essentially a special-case of the classic hash join
Hash join
The Hash join is an example of a join algorithm and is used in the implementation of a relational database management system.The task of a join algorithm is to find, for each distinct value of the join attribute, the set of tuples in each relation which have that value.Hash joins require an...
algorithm.
The block nested loop runs in I/Os where is the number of available pages of internal memory and and is size of and respectively in pages. Note
that block nested loop runs in I/Os if fits in the available internal memory.
Further reading
- Block Nested-Loop Joins in the MySQL 5.6 Reference Manual.