Which SQL aggregate function finds the maximum value. Using aggregate SQL functions

The GROUP BY clause (of the SELECT statement) allows you to group rows by the value of one or more columns or expressions. The result is a set of summary rows, one per group.

Each column in the select list must be present in the GROUP BY clause, except for constants and columns that are operands of aggregate functions.

A table can be grouped by any combination of its columns.

Aggregate functions are used to get a single summary value from a group of rows. All aggregate functions perform calculations on a single argument, which can be either a column or an expression. The result of any aggregate function is a constant value displayed in a separate result column.

Aggregate functions are specified in the column list of the SELECT statement, which can also contain a GROUP BY clause. If there is no GROUP BY clause in the SELECT statement, and the select column list contains at least one aggregate function, then it must not contain simple columns. On the other hand, a column select list may contain column names that are not arguments to an aggregate function if those columns are arguments to the GROUP BY clause.

If the query contains a WHERE clause, the aggregate functions are calculated only over the rows that satisfy its condition.

The aggregate functions MIN and MAX calculate the smallest and largest value of a column, respectively. Arguments can be numbers, strings, and dates. All NULL values are removed before the calculation (that is, they are not taken into account).

The aggregate function SUM calculates the total of the column values. Arguments can only be numbers. The DISTINCT option eliminates all duplicate values in the column before SUM is applied. Likewise, all NULL values are removed before this aggregate function is applied.

The AVG aggregate function returns the average of all values in a column. Its arguments can also only be numbers, and all NULL values are removed before evaluation.

The aggregate function COUNT has two different forms:

  • COUNT(col_name) - counts the number of values in the col_name column; NULL values are ignored
  • COUNT(*) - counts the number of rows in the table; rows are counted whether or not they contain NULL values

If the DISTINCT keyword is used in a query, any duplicate column values are removed before the COUNT function is applied.

The COUNT_BIG function is similar to the COUNT function. The only difference between them is the type of result they return: COUNT_BIG always returns BIGINT values, while COUNT returns INTEGER values.

The HAVING clause defines a condition that applies to a group of rows. It has the same meaning for groups of rows as the WHERE clause has for the rows of the underlying table (WHERE is applied before grouping, HAVING after).

This lesson covers renaming a column (field) in SQL using the AS keyword; aggregate functions in SQL are also considered. Specific query examples are analyzed.

Column names in queries can be renamed. This makes the results more readable.

In SQL, fields are renamed with the AS keyword, which assigns a new name (alias) to a field in the result set.

Syntax:

SELECT <field name> AS <alias> FROM …

Consider an example of renaming in SQL:

An example of the database "Institute": Display the names of teachers and their salaries, for those teachers whose salary is below 15000, rename the zarplata field to "low_wage"


✍ Solution:
SELECT name, zarplata AS low_wage FROM teachers WHERE zarplata < 15000;

Renaming columns in SQL is often necessary when calculating values from several fields of a table. Consider an example:

An example of the database "Institute": From the teachers table, display the name field and calculate the sum of the salary and the bonus, naming the field "zarplata_premia"


✍ Solution:
SELECT name, (zarplata+premia) AS zarplata_premia FROM teachers;
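
To make the aliasing concrete, here is a minimal sketch that runs the query above through Python's built-in sqlite3 module. The sample rows are invented for illustration; only the table and column names (teachers, name, zarplata, premia) come from the text, and SQLite is assumed to treat AS the same way as the DBMS used in the lesson.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (name TEXT, zarplata INTEGER, premia INTEGER)")
conn.executemany(
    "INSERT INTO teachers VALUES (?, ?, ?)",
    [("Ivanov", 12000, 1000), ("Petrov", 18000, 2500)],
)

cur = conn.execute(
    "SELECT name, (zarplata + premia) AS zarplata_premia FROM teachers ORDER BY name"
)
# The alias becomes the column name of the result set.
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()
print(column_names)
print(rows)
```

Without the alias, the name of the computed column would be left to the DBMS, which is why AS is usually paired with expressions like this one.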

Aggregate functions in SQL

Aggregate functions in SQL are used to obtain summary values and to evaluate expressions.

All aggregate functions return a single value.

The COUNT, MIN, and MAX functions apply to any data type.

The SUM and AVG functions are used only for numeric fields.
There is a difference between COUNT(*) and COUNT(field): the latter does not take NULL values into account when counting.

Important: when working with aggregate functions in SQL, the AS keyword is used to name the resulting column.


An example of the database "Institute": Get the value of the highest salary among teachers, display the total as "max_zp"


✍ Solution:
SELECT MAX(zarplata) AS max_zp FROM teachers;

Consider a more complex example of using aggregate functions in sql.


✍ Solution:

GROUP BY clause in SQL

The GROUP BY clause in SQL is usually used together with aggregate functions.

Without grouping, aggregate functions are calculated over all rows returned by the query. If the query contains a GROUP BY clause, each set of rows defined by that clause forms a group, and the aggregate functions are calculated for each group separately.

Consider an example with the lessons table:

Example:

Important: as a result of using GROUP BY, all output rows of the query are divided into groups, each characterized by the same combination of values in the grouping columns (that is, aggregate functions are calculated for each group separately).

Note that when grouping by a field that contains NULL values, all such records fall into one group.
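
The grouping behaviour described above, including the single group formed by NULL values, can be checked with a small sketch. It uses Python's built-in sqlite3 module; the items table, its columns, and its rows are hypothetical and are not part of the lesson's databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: two rows have no category (NULL).
conn.execute("CREATE TABLE items (category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("A", 10.0), ("A", 30.0), ("B", 5.0), (None, 7.0), (None, 9.0)],
)

# One output row per distinct category; COUNT and AVG run once per group.
rows = conn.execute(
    "SELECT category, COUNT(*), AVG(price) FROM items GROUP BY category"
).fetchall()
groups = {category: (n, avg) for category, n, avg in rows}
print(groups)
```

Both rows with a NULL category end up under the key None, so the query returns three groups rather than four.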

For each type of printer (that is, separately for laser, inkjet, and dot-matrix), determine the average cost and the quantity. Use aggregate functions. The result should look like this:

HAVING clause in SQL

The HAVING clause in SQL is needed to check values obtained from an aggregate function after grouping (that is, after GROUP BY is applied). Such a check cannot be placed in a WHERE clause.

Example: DB Computer Store. Calculate the average price of computers with the same processor speed. Run the calculation only for those groups whose average price is less than 30,000.



  • Aggregate functions are used like field names in a SELECT statement, with one exception: they take a field name as an argument. Only numeric fields can be used with the SUM and AVG functions. Both numeric and character fields can be used with COUNT, MAX, and MIN. When applied to character fields, MAX and MIN compare the values by their character codes (for example, ASCII), which in effect processes them in alphabetical order. Some DBMSs allow nested aggregates, but this is a departure from the ANSI standard, with all the consequences that implies.


For example, you can calculate the number of students who took exams in each discipline. To do this, execute a query grouped by the "Discipline" field and output the name of the discipline and the number of rows in the group for that discipline. Using the * character as the argument of the COUNT function means counting all the rows in the group.

SELECT R1.Discipline, COUNT(*)

FROM R1

GROUP BY R1.Discipline;

SELECT R1.Discipline, COUNT(*)

FROM R1

WHERE R1.Score IS NOT NULL

GROUP BY R1.Discipline;

In this case, rows with an undefined score will not fall into the set of tuples before grouping, so the number of tuples in the group for the discipline "Information Theory" will be 1 less.

A similar result can be obtained if the query is written in the following way:

SELECT R1.Discipline, COUNT(R1.Score)

FROM R1

GROUP BY R1.Discipline;

The function COUNT(attribute name) counts the number of defined (non-NULL) values in a group, unlike COUNT(*), which counts the number of rows in a group. Indeed, in the group for the discipline "Information Theory" there are 4 rows, but only 3 defined values of the "Score" attribute.
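
The difference between the two forms can be reproduced with a short sketch using Python's built-in sqlite3 module. The student names and scores are invented; the table name R1 and the columns Discipline and Score follow the text, while FullName is an assumed spelling of the student-name column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R1 (FullName TEXT, Discipline TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO R1 VALUES (?, ?, ?)",
    [
        ("Student A", "Information Theory", 5),
        ("Student B", "Information Theory", 4),
        ("Student C", "Information Theory", 3),
        ("Student D", "Information Theory", None),  # score not defined
        ("Student A", "Databases", 5),
        ("Student B", "Databases", 4),
    ],
)

# COUNT(*) counts rows; COUNT(Score) counts only non-NULL scores.
rows = conn.execute(
    "SELECT Discipline, COUNT(*), COUNT(Score) FROM R1 GROUP BY Discipline"
).fetchall()
counts = {discipline: (all_rows, defined) for discipline, all_rows, defined in rows}
print(counts)
```

For "Information Theory" the two counts differ by one, which is exactly the row with the undefined score.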


Rules for Handling NULL Values ​​in Aggregate Functions

If some values in a column are NULL, they are excluded when the result of the function is calculated.

If all values in a column are NULL, then MAX, MIN, SUM, and AVG return NULL, and COUNT of that column returns 0 (zero).

If the table is empty, COUNT(*) = 0.

Aggregate functions can also be used without a pre-grouping operation, in which case the entire relation is treated as one group, and one value per group can be calculated for this group.
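
These rules are easy to verify. The sketch below uses Python's built-in sqlite3 module with two hypothetical tables, one containing only NULL values and one with no rows at all; SQLite is assumed to follow the same NULL rules as the DBMS discussed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_null (x INTEGER)")
conn.executemany("INSERT INTO all_null VALUES (?)", [(None,), (None,)])
conn.execute("CREATE TABLE empty_table (x INTEGER)")

# Every value is NULL: MAX, MIN, SUM, AVG give NULL, COUNT(x) gives 0,
# while COUNT(*) still counts the two rows.
all_null_result = conn.execute(
    "SELECT MAX(x), MIN(x), SUM(x), AVG(x), COUNT(x), COUNT(*) FROM all_null"
).fetchone()

# No rows at all: COUNT(*) is 0 and SUM is NULL.
empty_result = conn.execute("SELECT COUNT(*), SUM(x) FROM empty_table").fetchone()

print(all_null_result)
print(empty_result)
```

NULL is returned to Python as None, so the first result is four None values followed by 0 and 2.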

Rules for Interpreting Aggregate Functions

Aggregate functions can be included in the output list, and then they are applied to the entire table.

SELECT MAX(Score) FROM R1 gives the highest score in the session;

SELECT SUM(Score) FROM R1 gives the sum of all scores in the session;

SELECT AVG(Score) FROM R1 gives the average score over the entire session.


Referring again to the "Session" database (table R1), let us find the number of successfully passed exams:

SELECT COUNT(*) AS Passed_exams

FROM R1

WHERE Score > 2;

Aggregate functions can take individual columns of tables as arguments. To calculate, for example, the number of distinct values of a certain column in a group, the DISTINCT keyword must be used together with the column name. Let's calculate the number of different scores received in each discipline:

SELECT R1.Discipline, COUNT(DISTINCT R1.Score)

FROM R1

WHERE R1.Score IS NOT NULL

GROUP BY R1.Discipline;

The same result is obtained if the explicit condition in the WHERE part is excluded, in which case the query will look like this:

SELECT R1.Discipline, COUNT(DISTINCT R1.Score)

FROM R1

GROUP BY R1.Discipline;

The function COUNT(DISTINCT R1.Score) counts only defined (non-NULL), distinct values.
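
That the explicit IS NOT NULL condition is redundant here can be confirmed with a small sketch in Python's built-in sqlite3 module. The rows are hypothetical, and FullName is an assumed spelling of the student-name column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R1 (FullName TEXT, Discipline TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO R1 VALUES (?, ?, ?)",
    [
        ("Student A", "Databases", 5),
        ("Student B", "Databases", 5),   # duplicate score
        ("Student C", "Databases", 4),
        ("Student D", "Databases", None),  # undefined score
    ],
)

with_where = conn.execute(
    "SELECT Discipline, COUNT(DISTINCT Score) FROM R1 "
    "WHERE Score IS NOT NULL GROUP BY Discipline"
).fetchall()
without_where = conn.execute(
    "SELECT Discipline, COUNT(DISTINCT Score) FROM R1 GROUP BY Discipline"
).fetchall()
print(with_where, without_where)
```

Both queries report two distinct scores (5 and 4), because COUNT with a column argument skips NULL values on its own.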

An average computed over an integer column may itself be returned as an integer, with the fractional part discarded (this is the case in SQL Server, for example). To obtain the desired result, the "Score" column must first be converted to a real (decimal) type, so that the computed average is not truncated to an integer. The query then looks like this:


SELECT R2.Group, R1.Discipline, COUNT(*) AS Total, AVG(CAST(R1.Score AS DECIMAL(3,1))) AS Average_Score

FROM R1, R2

WHERE R1.FullName = R2.FullName AND R1.Score IS NOT NULL

AND R1.Score > 2

GROUP BY R2.Group, R1.Discipline;

Here the CAST() function converts the Score column to a real (decimal) data type.


You cannot use aggregate functions in a WHERE clause because the conditions in this section are evaluated in terms of a single row, while aggregate functions are evaluated in terms of groups of rows.
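
The restriction can be observed directly. In the sketch below (Python's built-in sqlite3 module, hypothetical rows, FullName as an assumed column name), SQLite rejects the aggregate in WHERE with an error, while the same condition placed in HAVING works. Other DBMSs are assumed to reject the first query as well, though with a different error message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R1 (FullName TEXT, Discipline TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO R1 VALUES (?, ?, ?)",
    [
        ("Student A", "Databases", 5),
        ("Student B", "Databases", 5),
        ("Student A", "Physics", 3),
        ("Student B", "Physics", 4),
    ],
)

try:
    # Invalid: WHERE is evaluated row by row, before any groups exist.
    conn.execute(
        "SELECT Discipline FROM R1 WHERE AVG(Score) > 4 GROUP BY Discipline"
    ).fetchall()
    where_rejected = False
except sqlite3.OperationalError as exc:
    where_rejected = True
    print("rejected:", exc)

# Valid: HAVING is evaluated once per group, after grouping.
having_rows = conn.execute(
    "SELECT Discipline FROM R1 GROUP BY Discipline HAVING AVG(Score) > 4"
).fetchall()
print(having_rows)
```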

The GROUP BY clause lets you split the rows into subsets that share the same value in a given field and apply an aggregate function to each subset. This makes it possible to combine fields and aggregate functions in a single SELECT statement. Aggregate functions can be used both in the output list of the SELECT statement and in the HAVING condition that filters the groups formed. In either case, each aggregate function is calculated separately for each group. The values obtained can be displayed in the result or used in the group selection condition.

Let's build a query that displays the groups in which more than one failing grade (a score of 2) was received in a single discipline in the exams:


SELECT R2.Group

FROM R1, R2

WHERE R1.FullName = R2.FullName AND

R1.Score = 2

GROUP BY R2.Group, R1.Discipline

HAVING COUNT(*) > 1;

We have a database "Bank", consisting of one table F, which stores the relation F, containing information about accounts in branches of a certain bank:

Find the total balance of the accounts in each branch. You could write a separate query for each branch, selecting SUM(Balance) from the table for that branch, but GROUP BY lets you do it all in one command:

SELECT Branch, SUM(Balance)

FROM F

GROUP BY Branch;

GROUP BY applies aggregate functions independently to each group identified by the value of the Branch field. A group consists of the rows with the same value in the Branch field, and the SUM function is applied separately to each such group; that is, the total account balance is calculated separately for each branch. The field to which GROUP BY is applied has, by definition, only one value per output group, just like the result of an aggregate function.


Assume that you select only those branches whose total account balances are greater than $5,000, as well as the total balances for the selected branches. To display branches with total balances over $5,000, use the HAVING clause. The HAVING clause specifies the criteria used to remove certain groups from the output, just like the WHERE clause does for individual rows.

The correct command would be:

SELECT Branch, SUM(Balance)

FROM F

GROUP BY Branch

HAVING SUM(Balance) > 5000;

Arguments in the HAVING clause are subject to the same rules as in a SELECT clause that uses GROUP BY. They must have one value per output group.
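
As an illustration, the sketch below runs the branch query through Python's built-in sqlite3 module. The account rows and the Account column are hypothetical; the table name F and the fields Branch and Balance follow the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE F (Account INTEGER, Branch TEXT, Balance INTEGER)")
conn.executemany(
    "INSERT INTO F VALUES (?, ?, ?)",
    [
        (1, "Pskov", 3000),
        (2, "Pskov", 4000),
        (3, "Uryupinsk", 1000),
        (4, "Uryupinsk", 2500),
    ],
)

# Totals per branch, before any group is filtered out.
totals = conn.execute(
    "SELECT Branch, SUM(Balance) FROM F GROUP BY Branch ORDER BY Branch"
).fetchall()

# HAVING keeps only the groups whose total exceeds 5000.
large = conn.execute(
    "SELECT Branch, SUM(Balance) FROM F GROUP BY Branch HAVING SUM(Balance) > 5000"
).fetchall()
print(totals)
print(large)
```

Note that no single account in the sample exceeds 5000; the Pskov branch qualifies only because the condition is applied to the group total rather than to individual rows.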


The following command is not allowed:

SELECT Branch, SUM(Balance)

FROM F

GROUP BY Branch

HAVING OpenDate = '27/12/2004';

The OpenDate field cannot be used in the HAVING clause because it can have more than one value per output group. To avoid this situation, the HAVING clause should refer only to aggregates and to fields selected by GROUP BY. The correct way to write the above query is:

SELECT Branch, SUM(Balance)

FROM F

WHERE OpenDate = '27/12/2004'

GROUP BY Branch;


The meaning of this query is as follows: to find the sum of balances for each branch of accounts opened on December 27, 2004.

As stated earlier, HAVING can only take arguments that have one value per output group. In practice, references to aggregate functions are the most common, but fields selected using GROUP BY are also valid. For example, we want to see the total balances on the accounts of branches in St. Petersburg, Pskov and Uryupinsk:

SELECT F.Branch, SUM(Balance)

FROM F, Q

WHERE F.Branch = Q.Branch

GROUP BY F.Branch

HAVING F.Branch IN ('St. Petersburg', 'Pskov', 'Uryupinsk');

Therefore, in the predicates of the HAVING clause, only the columns specified as grouping columns in the GROUP BY clause can be referenced directly. All other columns can appear only inside the aggregate functions COUNT, SUM, AVG, MIN, and MAX, which in this case calculate an aggregate value for the whole group of rows. The result of the HAVING clause is a grouped table containing only those groups of rows for which the HAVING condition evaluates to TRUE. In particular, if a HAVING clause is present in a query without GROUP BY, the result is either an empty table or the result of the preceding clauses of the table expression, treated as a single group with no grouping columns.

Consider an example. Suppose we want to display the total balance for all branches, but only if it exceeds $100,000. In this case the query contains no grouping operation, but it does contain a HAVING clause, and looks like this:

SELECT SUM(Balance)

FROM F

HAVING SUM(Balance) > 100000;

If the total balance is more than $100,000, then we will see it in the resulting relation, otherwise we will get an empty relation.


The following subsections describe other SELECT statement clauses that can be used in queries, as well as aggregate functions and set operators. As a reminder, so far we have covered the use of the WHERE clause; this article looks at the GROUP BY, ORDER BY, and HAVING clauses and gives some examples of using them together with the aggregate functions supported in Transact-SQL.

GROUP BY clause

The GROUP BY clause groups the selected set of rows to produce a set of summary rows based on the values of one or more columns or expressions. A simple use of the GROUP BY clause is shown in the example below:

USE SampleDb; SELECT Job FROM Works_On GROUP BY Job;

In this example, employee positions are selected and grouped.

In the example above, the GROUP BY clause creates a separate group for each distinct value (including NULL) of the Job column.

The use of columns in the GROUP BY clause must meet certain conditions. In particular, each column in the query's select list must also appear in the GROUP BY clause. This requirement does not apply to constants and columns that are part of an aggregate function. (Aggregate functions are discussed in the next subsection.) This makes sense because only columns in the GROUP BY clause are guaranteed one value per group.

A table can be grouped by any combination of its columns. The example below demonstrates grouping the rows of the Works_on table by two columns:

USE SampleDb; SELECT ProjectNumber, Job FROM Works_On GROUP BY ProjectNumber, Job;

The result of this query is:

Based on the results of the query, you can see that there are nine groups with different combinations of project number and position. The sequence of column names in the GROUP BY clause need not be the same as in the SELECT column list.

Aggregate functions

Aggregate functions are used to get total values. All aggregate functions can be divided into the following categories:

    ordinary aggregate functions;

    statistical aggregate functions;

    aggregate functions defined by the user;

    analytic aggregate functions.

Here we will look at the first three types of aggregate functions.

Ordinary Aggregate Functions

The Transact-SQL language supports the following six aggregate functions: MIN, MAX, SUM, AVG, COUNT, COUNT_BIG.

All aggregate functions perform calculations on a single argument, which can be either a column or an expression. (The only exception is the second form of the two functions COUNT and COUNT_BIG, namely COUNT(*) and COUNT_BIG(*) respectively.) The result of any aggregate function is a constant value displayed in a separate result column.

Aggregate functions are specified in the SELECT statement's column list, which can also contain a GROUP BY clause. If there is no GROUP BY clause in the SELECT statement, and the select column list contains at least one aggregate function, then it must not contain simple columns (other than columns that serve as arguments to the aggregate function). Therefore, the code in the example below is incorrect:

USE SampleDb; SELECT LastName, MIN(Id) FROM Employee;

Here, the LastName column of the Employee table should not be in the column select list because it is not an aggregate function argument. On the other hand, a column select list may contain column names that are not arguments to an aggregate function if those columns are arguments to the GROUP BY clause.

An aggregate function argument can be preceded by one of two possible keywords:

ALL

Specifies that calculations are performed on all values in the column. This is the default.

DISTINCT

Specifies that only unique column values are used for calculations.

Aggregate functions MIN and MAX

The MIN and MAX aggregate functions calculate the smallest and largest value of a column, respectively. If the query contains a WHERE clause, the MIN and MAX functions return the smallest and largest value of rows that meet the specified criteria. The example below shows the use of the MIN aggregate function:

USE SampleDb;
-- Returns 2581
SELECT MIN(Id) AS "Min Id" FROM Employee;

The result returned in the example above is not very informative. For example, the name of the employee who owns this number is unknown. However, it is not possible to obtain this last name in the usual way, because, as mentioned earlier, explicitly specifying the LastName column is not allowed. In order to get the last name of this employee along with the lowest personnel number of an employee, a subquery is used. The following example shows the use of such a subquery, where the subquery contains the SELECT statement from the previous example:
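
A subquery of this kind might look as follows. This is a minimal sketch run through Python's built-in sqlite3 module rather than the article's own example; the table name Employee and the columns Id and LastName come from the text, while the rows themselves are hypothetical (apart from the smallest Id, 2581, which matches the value mentioned above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Id INTEGER, LastName TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(10102, "Jones"), (2581, "Elsa"), (9031, "Bertoni")],  # hypothetical rows
)

# The inner query finds the smallest Id; the outer query fetches
# the row that owns it, so LastName can be selected legally.
row = conn.execute(
    "SELECT Id, LastName FROM Employee "
    "WHERE Id = (SELECT MIN(Id) FROM Employee)"
).fetchone()
print(row)
```

Because the aggregate lives in the subquery, the outer SELECT contains only plain columns and no longer mixes them with an aggregate function.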

Query execution result:

The use of the MAX aggregate function is shown in the example below:

The MIN and MAX functions can also take strings and dates as arguments. For a string argument, the values are compared using the sort order (collation) in effect. For date and time arguments, the smallest column value is the earliest date and the largest column value is the latest date.

You can use the DISTINCT keyword with the MIN and MAX functions. Before the MIN and MAX aggregate functions are applied, all NULL values are excluded from their argument columns.
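
The behaviour on strings and dates can be sketched with Python's built-in sqlite3 module. The rows and the EnterDate column are hypothetical. SQLite has no dedicated date type, so the dates are stored as ISO 8601 text, an assumption that makes text comparison coincide with chronological order; strings are compared with SQLite's default binary collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (LastName TEXT, EnterDate TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [
        ("Jones", "2007-01-04"),
        ("Bertoni", "2006-10-15"),
        ("Smith", "2008-02-01"),
        (None, None),  # NULL values are skipped by MIN and MAX
    ],
)

result = conn.execute(
    "SELECT MIN(LastName), MAX(LastName), MIN(EnterDate), MAX(EnterDate) "
    "FROM Employee"
).fetchone()
print(result)
```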

Aggregate function SUM

The SUM aggregate function calculates the total of the column values. The argument of this aggregate function must always be of a numeric data type. The use of the SUM aggregate function is shown in the example below:

USE SampleDb; SELECT SUM (Budget) "Summary budget" FROM Project;

This example calculates the total sum of the budgets of all projects. Query execution result:

In this example, the aggregate function groups all project budget values and determines their total. For this reason, the query contains an implicit grouping function (like all similar queries). The implicit grouping function from the example above can be specified explicitly, as shown in the example below:

USE SampleDb; SELECT SUM (Budget) "Total Budget" FROM Project GROUP BY();

Using the DISTINCT option eliminates all duplicate values in the column before the SUM function is applied. Likewise, all NULL values are removed before this aggregate function is applied.

AVG Aggregate Function

The AVG aggregate function returns the arithmetic mean of all values in a column. The argument of this aggregate function must always be of a numeric data type. Before the AVG function is applied, all NULL values are removed from its argument.

The use of the AVG aggregate function is shown in the example below:

USE SampleDb;
-- Returns 133833
SELECT AVG(Budget) "Average budget per project" FROM Project;

Here the arithmetic mean of all project budgets is calculated.

Aggregate Functions COUNT and COUNT_BIG

The COUNT aggregate function has two different forms:

COUNT(col_name)
COUNT(*)

The first form of the function counts the number of values in the col_name column. If the DISTINCT keyword is used in a query, any duplicate column values are removed before the COUNT function is applied. This form of the COUNT function does not take NULL values into account when counting the number of values in a column.

The use of the first form of the COUNT aggregate function is shown in the example below:

USE SampleDb; SELECT ProjectNumber, COUNT(DISTINCT Job) "Works in project" FROM Works_on GROUP BY ProjectNumber;

Here the number of different jobs in each project is counted. The result of this query is:

As you can see from the example query, NULL values were not taken into account by the COUNT function. (The sum of all the values in the job column turned out to be 7, not 11, as it should be.)

The second form of the COUNT function, COUNT(*), counts the number of rows in a table. If the SELECT statement of a query with the COUNT(*) function contains a WHERE clause with a condition, the function returns the number of rows that satisfy that condition. Unlike the first form of the COUNT function, the second form does not ignore NULL values, because it operates on rows, not columns. The example below demonstrates the use of the COUNT(*) function:

USE SampleDb; SELECT Job AS "Type of work", COUNT(*) "Need workers" FROM Works_on GROUP BY Job;

Here the number of positions in all projects is counted. Query execution result:

The COUNT_BIG function is similar to the COUNT function. The only difference between them is the type of result they return: COUNT_BIG always returns BIGINT values, while COUNT returns INTEGER values.

Statistical aggregate functions

The following functions make up a group of statistical aggregate functions:

VAR

Calculates the sample variance of all values in a column or expression.

VARP

Calculates the population variance of all values in a column or expression.

STDEV

Calculates the sample standard deviation (the square root of the corresponding variance) of all values in a column or expression.

STDEVP

Calculates the population standard deviation of all values in a column or expression.
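
The difference between the sample and population variants is only the divisor (n - 1 versus n). The sketch below illustrates this with Python's standard statistics module on a hypothetical list of values; it does not call SQL Server, and the mapping of each Python function to a Transact-SQL function is given in the comments as an analogy.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical column values, mean is 5

var_sample = statistics.variance(values)       # analogous to VAR: divides by n - 1
var_population = statistics.pvariance(values)  # analogous to VARP: divides by n
stdev_sample = statistics.stdev(values)        # analogous to STDEV
stdev_population = statistics.pstdev(values)   # analogous to STDEVP

print(var_sample, var_population, stdev_sample, stdev_population)
```

For these values the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7; each standard deviation is the square root of the matching variance.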

User Defined Aggregate Functions

The Database Engine also supports the implementation of user-defined functions. This capability allows users to augment system aggregate functions with functions that they can implement and install themselves. These functions represent a special class of user-defined functions and are discussed in more detail later.

HAVING clause

The HAVING clause defines a condition that applies to a group of rows. Thus, this clause has the same meaning for groups of rows as the WHERE clause has for the rows of the underlying table. The syntax of the HAVING clause is as follows:

HAVING condition

Here, the condition parameter represents a condition and contains aggregate functions or constants.

The use of the HAVING clause with the COUNT(*) aggregate function is shown in the example below:

USE SampleDb;
-- Returns "p3"
SELECT ProjectNumber FROM Works_on GROUP BY ProjectNumber HAVING COUNT(*) < 4;

In this example, using the GROUP BY clause, the system groups all rows based on the values in the ProjectNumber column. After that, the number of rows in each group is counted, and the groups containing fewer than four rows (three or fewer) are selected.

The HAVING clause can also be used without aggregate functions, as shown in the example below:

USE SampleDb;
-- Returns "Consultant"
SELECT Job FROM Works_on GROUP BY Job HAVING Job LIKE 'C%';

This example groups the rows of the Works_on table by job and eliminates those jobs that do not start with the letter "C".

The HAVING clause can also be used without the GROUP BY clause, although this is not common practice. In this case, all table rows are treated as a single group.

ORDER BY clause

The ORDER BY clause defines the sort order of the rows in the result set returned by the query. This clause has the following syntax:

ORDER BY {col_name | col_number} [ASC | DESC] [, ...]

The sort order is specified in the col_name parameter. The col_number parameter is an alternative sort order specifier that identifies columns by their position in the select list of the SELECT statement (1 is the first column, 2 is the second column, and so on). The ASC parameter specifies sorting in ascending order, and the DESC parameter in descending order. The default is ASC.

The column names in the ORDER BY clause do not have to be in the select column list. But this does not apply to SELECT DISTINCT queries, because in such queries, the column names specified in the ORDER BY clause must also be specified in the select column list. In addition, this clause cannot contain column names from tables not listed in the FROM clause.

As you can see from the syntax of the ORDER BY clause, the result set can be sorted on multiple columns. This sorting is shown in the example below:

This example selects department numbers and employee last names and first names for employees whose personnel number is less than 20,000, and sorts by last name and first name. The result of this query is:

The columns in the ORDER BY clause can be specified not by their names but by their position in the select list. Accordingly, the clause in the example above can be rewritten as follows:

This alternate way of specifying columns by their position instead of their names applies if the ordering criterion contains an aggregate function. (Another way is to use the column names, which then appear in the ORDER BY clause.) However, in the ORDER BY clause, it is recommended that columns be specified by their names rather than by numbers, to make it easier to update the query if columns need to be added or removed from the select list. Specifying the columns in the ORDER BY clause by their numbers is shown in the example below:

USE SampleDb; SELECT ProjectNumber, COUNT(*) "Number of employees" FROM Works_on GROUP BY ProjectNumber ORDER BY 2 DESC;

Here, for each project, the project number and the number of employees participating in it are selected, sorting the result in descending order by the number of employees.
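
The query above can be tried out with a minimal sketch in Python's built-in sqlite3 module. The rows and the EmpId column are hypothetical; SQLite is assumed to interpret a positional ORDER BY the same way as Transact-SQL for this query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Works_on (EmpId INTEGER, ProjectNumber TEXT)")
conn.executemany(
    "INSERT INTO Works_on VALUES (?, ?)",
    [(1, "p1"), (2, "p1"), (3, "p1"), (4, "p2"), (5, "p3"), (6, "p3")],
)

# "ORDER BY 2" refers to the second item of the select list, COUNT(*).
rows = conn.execute(
    'SELECT ProjectNumber, COUNT(*) AS "Number of employees" '
    "FROM Works_on GROUP BY ProjectNumber ORDER BY 2 DESC"
).fetchall()
print(rows)
```

Sorting by the alias instead (ORDER BY "Number of employees" DESC) would give the same order and survives changes to the select list better than a position number.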

Transact-SQL puts NULL values ​​at the beginning of the list when sorting in ascending order, and at the end of the list when sorting in descending order.

Using the ORDER BY Clause to Paginate Results

The display of query results on the current page can either be implemented in the user application, or the database server can be instructed to do so. In the first case, all database rows are sent to the application, whose task is to select the required rows and display them. In the second case, only the rows required for the current page are selected and displayed from the server side. As you might expect, server-side page generation usually provides better performance because only the rows needed for display are sent to the client.

To support server-side paging, SQL Server 2012 introduces two new SELECT statement clauses: OFFSET and FETCH. The use of these two clauses is demonstrated in the example below. Here, from the AdventureWorks2012 database (which you can find in the sources), the business ID, job title, and birth date of all female employees are retrieved, sorted by job title in ascending order. The resulting set of rows is split into 10-row pages and the third page is displayed:

The OFFSET clause specifies the number of result rows to skip. This number is applied after the rows have been sorted by the ORDER BY clause. The FETCH NEXT clause specifies how many of the rows that satisfy the WHERE condition and have been sorted are to be returned. The parameter of this clause can be a constant, an expression, or the result of another query. The FETCH NEXT clause is equivalent to the FETCH FIRST clause.
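
The paging arithmetic (skip (page - 1) × page size rows, then take one page) can be sketched as follows. This is not the AdventureWorks2012 example: it uses Python's built-in sqlite3 module, a hypothetical Employee table, and SQLite's LIMIT ... OFFSET syntax as a stand-in for the Transact-SQL OFFSET ... FETCH NEXT clauses, which SQLite does not support. The helper name fetch_page is also an assumption.

```python
import sqlite3

def fetch_page(conn, page_number, page_size):
    """Return one page of employees sorted by job title (pages start at 1)."""
    offset = (page_number - 1) * page_size
    # Transact-SQL equivalent of the paging part (not run here):
    #   ORDER BY JobTitle OFFSET <offset> ROWS FETCH NEXT <page_size> ROWS ONLY
    return conn.execute(
        "SELECT Id, JobTitle FROM Employee ORDER BY JobTitle, Id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Id INTEGER PRIMARY KEY, JobTitle TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(i, f"Job {i:02d}") for i in range(1, 26)],  # 25 hypothetical rows
)

first_page = fetch_page(conn, 1, 10)
third_page = fetch_page(conn, 3, 10)
print(third_page)
```

With 25 rows and a page size of 10, the third page holds only the last five rows. Adding Id as a second sort key keeps the page boundaries stable when several rows share a job title.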

The main goal of server-side paging is to be able to implement generic paging forms using variables. This can be accomplished with a SQL Server batch.

SELECT statement and IDENTITY property

The IDENTITY property allows you to define the values of a specific table column as an automatically incrementing counter. Columns of numeric data types such as TINYINT, SMALLINT, INT, and BIGINT can have this property. For such a column, the Database Engine automatically generates sequential values starting from the specified initial value. Thus, the IDENTITY property can be used to generate unique numeric values for the selected column.

A table can contain only one column with the IDENTITY property. The owner of the table has the ability to specify the initial value and increment, as shown in the example below:

USE SampleDb; CREATE TABLE Product (Id INT IDENTITY(10000, 1) NOT NULL, Name NVARCHAR(30) NOT NULL, Price MONEY) INSERT INTO Product(Name, Price) VALUES ("Item1", 10), ("Item2", 15) , ("Item3", 8), ("Item4", 15), ("Item5", 40); -- Returns 10004 SELECT IDENTITYCOL FROM Product WHERE Name = "Product5"; -- Similar to the previous statement SELECT $identity FROM Product WHERE Name = "Product5";

This example first creates the Product table, which contains an Id column with the IDENTITY property. The values in the Id column are generated automatically by the system, starting at 10,000 and increasing by one for each subsequent value: 10,000, 10,001, 10,002, and so on.

Several system functions and variables are associated with the IDENTITY property. For example, the code above uses the $identity system variable. As the output of this code shows, this variable automatically refers to the column that has the IDENTITY property. You can also use IDENTITYCOL instead.

The initial value and increment of a column with the IDENTITY property can be obtained with the IDENT_SEED and IDENT_INCR functions, respectively. These functions are used as follows:

USE SampleDb;

SELECT IDENT_SEED('Product'), IDENT_INCR('Product');

As already mentioned, IDENTITY values are set automatically by the system. However, the user can explicitly specify values for particular rows by setting the IDENTITY_INSERT option to ON before inserting the explicit value:

SET IDENTITY_INSERT table_name ON

Because the IDENTITY_INSERT option makes it possible to insert any value into a column with the IDENTITY property, including a duplicate value, the IDENTITY property by itself does not enforce uniqueness of the column values. Therefore, UNIQUE or PRIMARY KEY constraints should be used to enforce uniqueness.

When you insert rows into a table after IDENTITY_INSERT has been set to ON, the system generates the next value of the IDENTITY column by incrementing the largest current value of that column.
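The behavior just described can be sketched as follows, assuming the Product table from the IDENTITY example above (with its Id IDENTITY(10000, 1) column) exists; the rows 'Item6' and 'Item7' and the value 20000 are made up for illustration:

```sql
USE SampleDb;

SET IDENTITY_INSERT Product ON;

-- While the option is ON, the column list must name the identity column explicitly.
INSERT INTO Product (Id, Name, Price) VALUES (20000, 'Item6', 12);

SET IDENTITY_INSERT Product OFF;

-- Id is generated again; it continues from the largest current value, giving 20001.
INSERT INTO Product (Name, Price) VALUES ('Item7', 9);

SELECT Id, Name FROM Product WHERE Name IN ('Item6', 'Item7');
```

Only one table in a session can have IDENTITY_INSERT set to ON at a time, so the option is switched off again as soon as the explicit insert is done.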

CREATE SEQUENCE statement

Using the IDENTITY property has several significant disadvantages, the most important of which are the following:

    the property applies only to the table in which it is defined;

    a new value of the column cannot be obtained in any other way than by using it;

    the IDENTITY property can only be specified when a column is created.

For these reasons, SQL Server 2012 introduces sequences, which have the same semantics as the IDENTITY property but without the drawbacks listed above. In this context, a sequence is a database object that generates counter values for different database objects, such as columns and variables.

Sequences are created with the CREATE SEQUENCE statement. This statement is defined in the SQL standard and is supported by other relational database systems such as IBM DB2 and Oracle.

The example below shows how to create a sequence in SQL Server:

USE SampleDb;

CREATE SEQUENCE dbo.Sequence1
    AS INT
    START WITH 1
    INCREMENT BY 5
    MINVALUE 1
    MAXVALUE 256
    CYCLE;

In the example above, the values of Sequence1 are generated automatically by the system, starting at 1 and increasing by 5 for each subsequent value. Thus, the START WITH clause specifies the initial value and the INCREMENT BY clause specifies the step. (The step can be either positive or negative.)

The next two optional clauses, MINVALUE and MAXVALUE, specify the minimum and maximum values of the sequence object. (Note that the MINVALUE value must be less than or equal to the start value, and the MAXVALUE value cannot be greater than the upper limit of the data type specified for the sequence.) The CYCLE clause indicates that the sequence restarts from the beginning when the maximum value (or the minimum value, for a sequence with a negative step) is exceeded. The default is NO CYCLE, which means that exceeding the maximum or minimum value of the sequence raises an exception.

The main feature of sequences is their independence from tables, i.e. they can be used with any database object such as table columns or variables. (This property has a positive effect on storage and, therefore, on performance. A specific sequence does not need to be stored; only its last value is stored.)

New sequence values are generated with the NEXT VALUE FOR expression, the use of which is shown in the example below:

USE SampleDb;

-- Returns 1
SELECT NEXT VALUE FOR dbo.Sequence1;

-- Returns 6 (the next step)
SELECT NEXT VALUE FOR dbo.Sequence1;

You can use the NEXT VALUE FOR expression to assign the result of a sequence to a variable or column cell. The example below shows the use of this expression to assign results to a column:

USE SampleDb;

CREATE TABLE Product
    (Id INT NOT NULL,
     Name NVARCHAR(30) NOT NULL,
     Price MONEY);

INSERT INTO Product VALUES (NEXT VALUE FOR dbo.Sequence1, 'Product1', 10);
INSERT INTO Product VALUES (NEXT VALUE FOR dbo.Sequence1, 'Product2', 15);
-- ...

The example above first creates a Product table with three columns. Next, two INSERT statements insert two rows into this table. Because the previous example already consumed the values 1 and 6, the first two cells of the Id column receive the values 11 and 16.
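The same expression can feed a variable instead of a column. A minimal sketch, assuming dbo.Sequence1 from the earlier example still exists (the variable name @NextId is made up for illustration):

```sql
USE SampleDb;

DECLARE @NextId INT;

-- Takes the next value from the sequence and stores it in the variable.
SET @NextId = NEXT VALUE FOR dbo.Sequence1;

SELECT @NextId AS NextId;
```

Unlike an IDENTITY value, the number is known before any row is inserted, so it can be reused in several statements.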

The example below shows the use of the catalog view sys.sequences to view the current value of a sequence without using it:
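Such a query might look like this (a sketch that assumes the sequence was created under the name Sequence1, as in the earlier example):

```sql
USE SampleDb;

-- Reading the catalog view does not advance the sequence.
SELECT name, current_value
FROM sys.sequences
WHERE name = 'Sequence1';
```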

Typically, the NEXT VALUE FOR expression is used in an INSERT statement to force the system to insert the generated values. This expression can also be used as part of a multi-row query using the OVER clause.
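As an illustration of the OVER form, the following sketch numbers the rows of the Product table with values from dbo.Sequence1 in alphabetical order of Name. It assumes both objects from the earlier examples exist; the alias SeqValue is made up:

```sql
USE SampleDb;

-- One sequence value is taken per returned row, assigned in the order given by OVER.
SELECT NEXT VALUE FOR dbo.Sequence1 OVER (ORDER BY Name) AS SeqValue,
       Name
FROM Product;
```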

To change a property of an existing sequence, use the ALTER SEQUENCE statement. One of its most important uses is the RESTART WITH option, which resets the specified sequence. The following example uses ALTER SEQUENCE to reset almost all properties of Sequence1:

USE SampleDb;

ALTER SEQUENCE dbo.Sequence1
    RESTART WITH 100
    INCREMENT BY 50
    MINVALUE 50
    MAXVALUE 200
    NO CYCLE;

A sequence is deleted with the DROP SEQUENCE statement.

Set Operators

In addition to the operators discussed earlier, Transact-SQL supports three more set operators: UNION, INTERSECT, and EXCEPT.

UNION operator

The UNION operator combines the results of two or more queries into a single result set that includes all rows belonging to any of the queries in the union. Accordingly, the result of the union of two tables is a new table containing all the rows that appear in one of the original tables or in both of them.

The general form of the UNION operator looks like this:

select_1 UNION [ALL] select_2 {[UNION [ALL] select_3]}...

The operands select_1, select_2, ... are the SELECT statements that form the union. If the ALL option is used, all rows are returned, including duplicates. In the UNION operator, the ALL option has the same meaning as in a SELECT list, with one difference: in a SELECT list it is the default, whereas for the UNION operator it must be specified explicitly.

In its original form, the SampleDb database is not suitable for demonstrating the use of the UNION operator. Therefore, this section creates a new EmployeeEnh table that is identical to the existing Employee table, but has an additional City column. This column indicates where employees live.

Creating the EmployeeEnh table gives us an opportunity to demonstrate the INTO clause of the SELECT statement. The SELECT INTO statement performs two operations. First, a new table is created with the columns listed in the SELECT list. Then the rows of the original table are inserted into the new table. The name of the new table is specified in the INTO clause, and the name of the source table is specified in the FROM clause.

The example below shows the creation of the EmployeeEnh table from the Employee table:

USE SampleDb;

SELECT * INTO EmployeeEnh FROM Employee;

ALTER TABLE EmployeeEnh ADD City NCHAR(40) NULL;

In this example, the SELECT INTO statement creates the EmployeeEnh table and inserts all the rows from the source Employee table into it, and then the ALTER TABLE statement adds the City column to the new table. The added City column does not yet contain any values. Values can be entered into this column through Management Studio or with the following code:

USE SampleDb;

UPDATE EmployeeEnh SET City = 'Kazan' WHERE Id = 2581;
UPDATE EmployeeEnh SET City = 'Moscow' WHERE Id = 9031;
UPDATE EmployeeEnh SET City = 'Yekaterinburg' WHERE Id = 10102;
UPDATE EmployeeEnh SET City = 'Saint Petersburg' WHERE Id = 18316;
UPDATE EmployeeEnh SET City = 'Krasnodar' WHERE Id = 25348;
UPDATE EmployeeEnh SET City = 'Kazan' WHERE Id = 28559;
UPDATE EmployeeEnh SET City = 'Perm' WHERE Id = 29346;

We are now ready to demonstrate the use of the UNION operator. The example below shows a query that forms the union of the EmployeeEnh and Department tables:

USE SampleDb; SELECT City AS "City" FROM EmployeeEnh UNION SELECT Location FROM Department;

The result of this query is a single City column that lists every distinct value occurring in EmployeeEnh.City or Department.Location, because UNION without ALL removes duplicates.

Only compatible tables can be combined with the UNION operator. Compatible means that both select lists contain the same number of columns and that the corresponding columns have compatible data types. (For example, the data types INT and SMALLINT are compatible.)

The result of a union can be ordered only by an ORDER BY clause in the last SELECT statement, as shown in the example below. GROUP BY and HAVING clauses can be used in the individual SELECT statements, but not for the union itself.

The query in this example fetches employees who either work in department d1 or started working on the project before January 1, 2008.
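The query just described could be written roughly as follows. This is a sketch under stated assumptions: only the Employee table and its Id column appear in the text, while the column DepartmentNumber, the table Works_on, and its columns EmpId and EnterDate are assumed names:

```sql
USE SampleDb;

-- DepartmentNumber, Works_on, EmpId and EnterDate are assumed names.
SELECT Id
FROM Employee
WHERE DepartmentNumber = 'd1'
UNION
SELECT EmpId
FROM Works_on
WHERE EnterDate < '20080101'
ORDER BY 1;  -- sorts the whole union, so it follows the last SELECT
```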

The UNION operator supports the ALL option. When this option is used, duplicates are not removed from the result set. You can use the OR operator instead of the UNION operator if all SELECT statements connected by one or more UNION operators refer to the same table. In this case, the set of SELECT statements is replaced by a single SELECT statement with a set of OR conditions.

INTERSECT and EXCEPT operators

The two other set operators, INTERSECT and EXCEPT, compute the intersection and the difference, respectively. The intersection here is the set of rows that belong to both tables. The difference of two tables is the set of all rows that belong to the first table and are not present in the second. The example below shows the use of the INTERSECT operator:
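One possible illustration, reusing only the EmployeeEnh.City and Department.Location columns introduced above (a sketch, not necessarily the query of the original listing):

```sql
USE SampleDb;

-- Cities in which at least one employee lives AND at least one department is located.
SELECT City FROM EmployeeEnh
INTERSECT
SELECT Location FROM Department;
```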

Transact-SQL does not support the ALL option with either the INTERSECT or the EXCEPT operator. The use of the EXCEPT operator is shown in the example below:
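A comparable sketch for EXCEPT, again built only on the EmployeeEnh.City and Department.Location columns from the earlier examples:

```sql
USE SampleDb;

-- Cities in which employees live but no department is located.
SELECT City FROM EmployeeEnh
EXCEPT
SELECT Location FROM Department;
```

Unlike UNION and INTERSECT, EXCEPT is not symmetric: swapping the two SELECT statements returns the department locations in which no employee lives.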

Keep in mind that the set operators differ in precedence: INTERSECT has the highest precedence, while EXCEPT and UNION have equal precedence and are evaluated from left to right. Parentheses can be used to change this order. Ignoring precedence when combining different set operators can lead to unexpected results.
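The effect of precedence can be seen in a small self-contained sketch that uses inline VALUES lists instead of real tables (the aliases a, b, c and the column v are made up for illustration):

```sql
-- INTERSECT binds tighter than UNION, so this is a UNION (b INTERSECT c).
-- b INTERSECT c = {3}; {1, 2} UNION {3} returns the rows 1, 2, 3.
SELECT v FROM (VALUES (1), (2)) AS a(v)
UNION
SELECT v FROM (VALUES (2), (3)) AS b(v)
INTERSECT
SELECT v FROM (VALUES (3), (4)) AS c(v);

-- Parentheses force the UNION to be evaluated first.
-- a UNION b = {1, 2, 3}; intersected with {3, 4} it returns the single row 3.
(SELECT v FROM (VALUES (1), (2)) AS a(v)
 UNION
 SELECT v FROM (VALUES (2), (3)) AS b(v))
INTERSECT
SELECT v FROM (VALUES (3), (4)) AS c(v);
```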

CASE expressions

In database application programming, it is sometimes necessary to modify the representation of data. For example, people can be classified by encoding their group with the values 1, 2, and 3, denoting males, females, and children, respectively. This technique can reduce the time required to implement a program. The Transact-SQL CASE expression makes this type of encoding easy to implement.

Unlike most programming languages, CASE is not a statement, but an expression. Therefore, a CASE expression can be used almost anywhere that the Transact-SQL language allows the use of expressions. The CASE expression has two forms:

    a simple CASE expression;

    a searched CASE expression.

The syntax for a simple CASE expression is as follows:

CASE expression_1 {WHEN expression_2 THEN result_1} ... [ELSE result_n] END

A simple CASE expression searches the expressions in the WHEN clauses, in order, for the first one that matches expression_1, and returns the result of the corresponding THEN clause. If no expression in the WHEN list matches, the result of the ELSE clause is returned.
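A simple CASE expression applied to the 1, 2, 3 encoding mentioned earlier could look like this. The sketch is self-contained: the codes come from an inline VALUES list, and the names Codes, PersonCode, and PersonGroup are made up for illustration:

```sql
SELECT PersonCode,
       CASE PersonCode
           WHEN 1 THEN 'Male'
           WHEN 2 THEN 'Female'
           WHEN 3 THEN 'Child'
           ELSE 'Unknown'      -- used for any code not listed above, here 4
       END AS PersonGroup
FROM (VALUES (1), (2), (3), (4)) AS Codes(PersonCode);
```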

The syntax for a searched CASE expression is:

CASE {WHEN condition_1 THEN result_1} ... [ELSE result_n] END

Here the first condition that evaluates to true is found, and the result of the corresponding THEN clause is returned. If none of the conditions is satisfied, the result of the ELSE clause is returned. The use of the searched CASE expression is shown in the example below:

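USE SampleDb;

-- The source listing is truncated: the upper-bound operators, the THEN values,
-- and everything after the third WHEN are reconstructed assumptions.
SELECT ProjectName,
    CASE
        WHEN Budget > 0 AND Budget <= 100000 THEN 1
        WHEN Budget > 100000 AND Budget <= 150000 THEN 2
        WHEN Budget > 150000 THEN 3
        ELSE 4
    END "Budget Weight"
FROM Project;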

This example weights the budgets of all projects and displays the calculated weights along with the corresponding project names.

The example below shows another way to use a CASE expression, where the WHEN clause contains subqueries that are part of the expression:

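USE SampleDb;

-- The first WHEN branch was lost in the source and is a reconstructed assumption.
SELECT ProjectName,
    CASE
        WHEN p1.Budget < (SELECT AVG(p2.Budget) FROM Project p2) THEN 'Below Average'
        WHEN p1.Budget > (SELECT AVG(p2.Budget) FROM Project p2) THEN 'Above Average'
    END "Budget Category"
FROM Project p1;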

The result of this query lists every project name together with its budget category.

To summarize the information contained in the database, SQL provides aggregate functions. An aggregate function takes an entire column of data as an argument and returns a single value that summarizes that column in some way.

For example, the AVG() aggregate function takes a column of numbers as an argument and calculates their average.

To calculate the average per capita income of a resident of Zelenograd, you need the following query:

SELECT 'AVERAGE INCOME=', AVG(SUMD) FROM PERSON

SQL has six aggregate functions that allow you to get different kinds of summary information (Figure 1):

– SUM() calculates the sum of all values contained in the column;

– AVG() calculates the average of the values contained in the column;

– MIN() finds the smallest of all values contained in the column;

– MAX() finds the largest of all values contained in the column;

– COUNT() counts the number of values contained in a column;

– COUNT(*) counts the number of rows in the query result table.
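All six functions can be tried side by side on a tiny table. The sketch below is an illustration only: the table PERSON_DEMO and its three rows are hypothetical, and only the column names NOM, FIO, and SUMD are borrowed from the examples in this article. One income is left NULL to show that the column functions skip NULL values while COUNT(*) counts rows:

```sql
-- PERSON_DEMO and its rows are hypothetical sample data.
CREATE TABLE PERSON_DEMO (
    NOM  INT,
    FIO  VARCHAR(60),
    SUMD DECIMAL(10, 2)
);

INSERT INTO PERSON_DEMO (NOM, FIO, SUMD) VALUES (1, 'Ivanov Ivan Ivanovich', 1000.00);
INSERT INTO PERSON_DEMO (NOM, FIO, SUMD) VALUES (2, 'Petrov Petr Petrovich', 3000.00);
INSERT INTO PERSON_DEMO (NOM, FIO, SUMD) VALUES (3, 'Sidorov Sidor Sidorovich', NULL);

SELECT SUM(SUMD)   AS TOTAL_INCOME,    -- 1000 + 3000 = 4000
       AVG(SUMD)   AS AVERAGE_INCOME,  -- 4000 / 2 = 2000 (the NULL row is skipped)
       MIN(SUMD)   AS MIN_INCOME,      -- 1000
       MAX(SUMD)   AS MAX_INCOME,      -- 3000
       COUNT(SUMD) AS KNOWN_INCOMES,   -- 2 non-NULL values
       COUNT(*)    AS ALL_ROWS         -- 3 rows
FROM PERSON_DEMO;
```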

The aggregate function argument can be a simple column name, as in the previous example, or an expression, as in the following query that specifies the calculation of the per capita tax:

SELECT AVG(SUMD*0.13) FROM PERSON

This query creates a temporary column containing the values (SUMD*0.13) for each row in the PERSON table, and then calculates the average of the temporary column.

The sum of incomes of all residents of Zelenograd can be calculated using the SUM aggregate function:

SELECT SUM(SUMD) FROM PERSON

An aggregate function can also be used to calculate totals for a result table obtained by joining several source tables. For example, you can calculate the total amount of income that residents received from a source called "Scholarship":

SELECT SUM(MONEY)

FROM PROFIT, HAVE_D

WHERE PROFIT.ID=HAVE_D.ID

AND PROFIT.SOURCE='Scholarship'

The MIN() and MAX() aggregate functions find the smallest and largest values in a column, respectively. The column can contain numeric, string, date, or time values.

For example, you can define:

(a) the lowest total income received by residents and the highest tax payable:

SELECT MIN(SUMD), MAX(SUMD*0.13) FROM PERSON

(b) the dates of birth of the oldest and youngest residents:

SELECT MIN(RDATE), MAX(RDATE) FROM PERSON

(c) the surnames, first names, and patronymics of the first and last residents in an alphabetically sorted list:

SELECT MIN(FIO), MAX(FIO) FROM PERSON

When applying these aggregate functions, remember that numeric data is compared according to arithmetic rules, dates are compared chronologically (earlier dates are considered smaller than later ones), and time intervals are compared by their duration.

When the MIN() and MAX() functions are used with string data, the result of comparing two strings depends on the character encoding table (collation) in use.

The COUNT() aggregate function counts the number of values in a column of any type:

(a) how many apartments are there in the 1st microdistrict?

SELECT COUNT(ADR) FROM FLAT WHERE ADR LIKE '%, 1__-%'

(b) how many residents have sources of income?

SELECT COUNT(DISTINCT NOM) FROM HAVE_D

(c) how many sources of income are used by residents?

SELECT COUNT(DISTINCT ID) FROM HAVE_D

(The DISTINCT keyword specifies that only distinct values in the column are counted.)

The special aggregate function COUNT(*) counts the rows in the result table, not the data values:

(a) how many apartments are there in the 2nd microdistrict?

SELECT COUNT(*) FROM FLAT WHERE ADR LIKE '%, 2__-%'

(b) how many sources of income does Ivanov Ivan Ivanovich have?

SELECT COUNT(*) FROM PERSON, HAVE_D WHERE FIO='Ivanov Ivan Ivanovich' AND PERSON.NOM=HAVE_D.NOM

(c) how many people live in an apartment at a particular address?

SELECT COUNT(*) FROM PERSON WHERE ADR='Zelenograd, 1001-45'

One way to understand how summary queries with aggregate functions are executed is to think of the query execution as split into two parts. First, it is determined how the query would work without aggregate functions, returning multiple rows of results. The aggregate functions are then applied to the query results, returning a single summary row.

For example, consider the following complex query: find the average per capita total income, the sum of the total incomes of the residents, and the average income from a source as a percentage of the resident's total income. The answer is given by the statement

SELECT AVG(SUMD), SUM(SUMD), (100*AVG(MONEY/SUMD)) FROM PERSON, PROFIT, HAVE_D WHERE PERSON.NOM=HAVE_D.NOM AND HAVE_D.ID=PROFIT.ID

Without aggregate functions, the query would look like this:

SELECT SUMD, SUMD, MONEY/SUMD FROM PERSON, PROFIT, HAVE_D WHERE PERSON.NOM=HAVE_D.NOM AND HAVE_D.ID=PROFIT.ID

and would return one row of results for each resident and specific source of income. Aggregate functions use the columns of the query's result table to produce a one-row table with summary results.

You can specify an aggregate function in the select list in place of any column name. For example, it can be part of an expression that adds or subtracts the values of two aggregate functions:

SELECT MAX(SUMD)-MIN(SUMD) FROM PERSON

However, an aggregate function cannot be an argument to another aggregate function, i.e. nested aggregate functions are prohibited.

Also, you cannot use aggregate functions and regular column names at the same time in the select list, because the combination is meaningless, for example:

SELECT FIO, SUM(SUMD) FROM PERSON

Here, the first element of the list tells the DBMS to create a table that contains one row for each resident. The second element asks the DBMS to return a single value, the sum of the values in the SUMD column. These two instructions contradict each other, which results in an error.

The foregoing does not apply to cases of processing subqueries and queries with grouping.
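Two short sketches illustrate these exceptions, using only the PERSON and HAVE_D tables and their columns from the examples above (the aliases TOTAL_INCOME, SOURCE_COUNT, and PER_RESIDENT are made up):

```sql
-- With grouping, a plain column may appear next to an aggregate function,
-- provided it is listed in GROUP BY: one summary row per distinct FIO.
SELECT FIO, SUM(SUMD) AS TOTAL_INCOME
FROM PERSON
GROUP BY FIO;

-- An aggregate cannot be nested directly, as in MAX(COUNT(*)), but a subquery
-- in FROM can produce the inner aggregate first: the largest number of
-- income sources held by any single resident.
SELECT MAX(SOURCE_COUNT)
FROM (SELECT NOM, COUNT(*) AS SOURCE_COUNT
      FROM HAVE_D
      GROUP BY NOM) AS PER_RESIDENT;
```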
