Available index types
For more information, see Unique Index Design Guidelines in this guide. Examine data distribution in the column. Frequently, a long-running query is caused by indexing a column with few unique values, or by performing a join on such a column. This is a fundamental problem with the data and query, and generally cannot be resolved without identifying this situation.
For example, a physical telephone directory sorted alphabetically on last name will not expedite locating a person if all people in the city are named Smith or Jones. For more information about data distribution, see Statistics. Consider using filtered indexes on columns that have well-defined subsets, for example sparse columns, columns with mostly NULL values, columns with categories of values, and columns with distinct ranges of values.
A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce storage costs.
Consider the order of the columns if the index will contain multiple columns. Additional columns should be ordered based on their level of distinctness, that is, from the most distinct to the least distinct. Consider indexing computed columns.
For more information, see Indexes on Computed Columns. After you have determined that an index is appropriate for a query, you can select the type of index that best fits your situation. Index characteristics include whether the index is clustered or nonclustered, unique or nonunique, single-column or multicolumn, ascending or descending on its key columns, and, for nonclustered indexes, full-table or filtered. Also, you can determine the index storage location by using filegroups or partition schemes to optimize performance. As you develop your index design strategy, you should consider the placement of the indexes on the filegroups associated with the database.
Careful selection of the filegroup or partition scheme can improve query performance. By default, indexes are stored in the same filegroup as the base table on which the index is created. A nonpartitioned clustered index and the base table always reside in the same filegroup.
However, you can create nonclustered indexes on a filegroup other than the filegroup of the base table or clustered index. By creating the nonclustered index on a different filegroup, you can achieve performance gains if the filegroups are using different physical drives with their own controllers. Data and index information can then be read in parallel by the multiple disk heads.
If a query reads only the table and does not reference the index on the other filegroup, only one filegroup is used. This creates no performance gain. Because you cannot predict what type of access will occur and when it will occur, it could be a better decision to spread your tables and indexes across all filegroups. This would guarantee that all disks are being accessed because all data and indexes are spread evenly across all disks, regardless of which way the data is accessed.
This is also a simpler approach for system administrators. You can also consider partitioning clustered and nonclustered indexes across multiple filegroups.
Partitioned indexes are partitioned horizontally, or by row, based on a partition function. The partition function defines how each row is mapped to a set of partitions based on the values of certain columns, called partitioning columns. A partition scheme specifies the mapping of the partitions to a set of filegroups.
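The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Database Engine code: the boundary values, the filegroup names, and the `partition_number` helper are all hypothetical. It assumes a range-style partition function in which each boundary value belongs either to the partition on its left or to the one on its right.

```python
from bisect import bisect_left, bisect_right


def partition_number(value, boundaries, range_right=False):
    """Return the 1-based partition number for a partitioning-column value.

    `boundaries` must be sorted ascending. With range_right=False a
    boundary value belongs to the partition on its left; with
    range_right=True it belongs to the partition on its right.
    """
    locate = bisect_right if range_right else bisect_left
    return locate(boundaries, value) + 1


# Hypothetical partition function: two boundaries give three partitions.
boundaries = [20230101, 20240101]

# Hypothetical partition scheme: partition number -> filegroup name.
scheme = {1: "fg_archive", 2: "fg_2023", 3: "fg_current"}

for key in (20221231, 20230101, 20230615, 20240102):
    p = partition_number(key, boundaries)
    print(key, "-> partition", p, "on filegroup", scheme[p])
```

The function plays the role of the partition function (row to partition), and the dictionary plays the role of the partition scheme (partition to filegroup).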
Partitioning an index can provide the following benefits. It can provide scalable systems that make large indexes more manageable. OLTP systems, for example, can implement partition-aware applications that deal with large indexes.
It can also make queries run faster and more efficiently. When queries access several partitions of an index, the query optimizer can process individual partitions at the same time and exclude partitions that are not affected by the query.
When defining indexes, you should consider whether the data for the index key column should be stored in ascending or descending order.
Ascending is the default and maintains compatibility with earlier versions of SQL Server. Specifying the order in which key values are stored in an index is useful when queries referencing the table have ORDER BY clauses that specify different directions for the key column or columns in that index.
In these cases, the index can remove the need for a SORT operator in the query plan, which makes the query more efficient. For example, the buyers in the Adventure Works Cycles purchasing department have to evaluate the quality of products they purchase from vendors. The buyers are most interested in finding products sent by these vendors with a high rejection rate. Retrieving the data to meet these criteria requires the RejectedQty column in the Purchasing.PurchaseOrderDetail table to be sorted in descending order (large to small) and the ProductID column to be sorted in ascending order (small to large). After a nonclustered index with matching key directions is created and the query is executed again, the execution plan shows that the SORT operator has been eliminated and the newly created nonclustered index is used.
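The effect of matching index key directions to the ORDER BY clause can be reproduced in miniature. The sketch below is an analogy, not the original example: it uses SQLite through Python's standard sqlite3 module rather than SQL Server, the table is a simplified stand-in with made-up rows, and the index name is hypothetical. SQLite reports a sort step as a temporary B-tree in its query plan, which stands in here for the SORT operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PurchaseOrderDetail ("
    " PurchaseOrderDetailID INTEGER PRIMARY KEY,"
    " ProductID INTEGER NOT NULL,"
    " RejectedQty REAL NOT NULL)"
)
# Made-up data: 200 rows with rejection quantities between 0 and 10.
conn.executemany(
    "INSERT INTO PurchaseOrderDetail (ProductID, RejectedQty) VALUES (?, ?)",
    [(p, float((p * 7) % 11)) for p in range(1, 201)],
)

QUERY = (
    "SELECT RejectedQty, ProductID FROM PurchaseOrderDetail "
    "ORDER BY RejectedQty DESC, ProductID ASC"
)


def plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(QUERY)  # no suitable index yet, so a sort step is needed

# Key directions match the ORDER BY clause: descending, then ascending.
conn.execute(
    "CREATE INDEX IX_RejectedQty_DESC_ProductID_ASC "
    "ON PurchaseOrderDetail (RejectedQty DESC, ProductID ASC)"
)
plan_after = plan(QUERY)   # rows can now be read from the index in order

rows = conn.execute(QUERY).fetchall()
print(plan_before)
print(plan_after)
```

Comparing the two printed plans shows whether the sort step disappears once the index exists; the exact plan wording depends on the SQLite version.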
The Database Engine can move equally efficiently in either direction. Sort order can be specified only for key columns.
Clustered indexes sort and store the data rows in the table based on their key values. There can only be one clustered index per table, because the data rows themselves can only be sorted in one order.
With few exceptions, every table should have a clustered index. By default, the index created for a PRIMARY KEY constraint is clustered; however, you can specify a nonclustered index when you create the constraint.
When it is required, the Database Engine automatically adds a uniqueifier value to a row to make each key unique. This column and its values are used internally and cannot be seen or accessed by users. Each page in an index B-tree is called an index node. The top node of the B-tree is called the root node.
The bottom nodes in the index are called the leaf nodes. Any index levels between the root and the leaf nodes are collectively known as intermediate levels. In a clustered index, the leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list.
Clustered indexes have one row in sys. By default, a clustered index has a single partition. When a clustered index has multiple partitions, each partition has a B-tree structure that contains the data for that specific partition.
For example, if a clustered index has four partitions, there are four B-tree structures; one in each partition. Depending on the data types in the clustered index, each clustered index structure will have one or more allocation units in which to store and manage the data for a specific partition.
The pages in the data chain and the rows in them are ordered on the value of the clustered index key. All inserts are made at the point where the key value in the inserted row fits in the ordering sequence among existing rows.
Before you create clustered indexes, understand how your data will be accessed. Consider using a clustered index for queries that return a range of values. After the row with the first value is found by using the clustered index, rows with subsequent indexed values are guaranteed to be physically adjacent.
For example, if a query retrieves records between a range of sales order numbers, a clustered index on the column SalesOrderNumber can quickly locate the row that contains the starting sales order number, and then retrieve all successive rows in the table until the last sales order number is reached.
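A rough analogy for this range access can be run with SQLite from Python's standard library. In SQLite, a table with an INTEGER PRIMARY KEY is stored in a B-tree ordered by that key, which resembles a clustered index; this is an assumption of the sketch, not a statement about SQL Server internals. The table layout, the integer order numbers, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The INTEGER PRIMARY KEY makes SQLite store rows ordered by this column.
conn.execute(
    "CREATE TABLE SalesOrderHeader ("
    " SalesOrderNumber INTEGER PRIMARY KEY,"
    " CustomerID INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO SalesOrderHeader VALUES (?, ?)",
    [(n, n % 50) for n in range(43659, 44659)],  # made-up order numbers
)

RANGE_QUERY = (
    "SELECT SalesOrderNumber, CustomerID FROM SalesOrderHeader "
    "WHERE SalesOrderNumber BETWEEN 43700 AND 43710"
)

# The plan shows a search on the primary key rather than a full scan.
range_plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + RANGE_QUERY)]
range_rows = conn.execute(RANGE_QUERY).fetchall()

print(range_plan)
print(len(range_rows), "rows in range")
```

The engine seeks to the first key in the range and then reads adjacent entries until the upper bound is passed, which is the access pattern the paragraph describes.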
This improves query performance. Generally, you should define the clustered index key with as few columns as possible. Consider columns that are unique or contain many distinct values. For example, an employee ID uniquely identifies employees.
Alternatively, a clustered index could be created on LastName, FirstName, MiddleName because employee records are frequently grouped and queried in this way, and the combination of these columns would still provide a high degree of difference.
Also consider columns that are accessed sequentially. For example, a product ID uniquely identifies products in the Production.Product table in the AdventureWorks database. A sequential search on that column benefits from a clustered index because the rows would be stored in sorted order on that key column. When a column is used frequently to sort the data retrieved from a table, it can be a good idea to cluster, that is physically sort, the table on that column to save the cost of a sort operation every time the column is queried.
Clustered indexes are not a good choice for columns that undergo frequent changes. A change to the key causes the whole row to move, because the Database Engine must keep the data values of a row in physical order. This is an important consideration in high-volume transaction processing systems in which data is typically volatile.
Wide keys are also a poor choice for a clustered index. Wide keys are a composite of several columns or several large-size columns. The key values from the clustered index are used by all nonclustered indexes as lookup keys. Any nonclustered indexes defined on the same table will be significantly larger because the nonclustered index entries contain the clustering key and also the key columns defined for that nonclustered index.
A nonclustered index contains the index key values and row locators that point to the storage location of the table data.
You can create multiple nonclustered indexes on a table or indexed view. Generally, nonclustered indexes should be designed to improve the performance of frequently used queries that are not covered by the clustered index. Similar to the way you use an index in a book, the query optimizer searches for a data value by searching the nonclustered index to find the location of the data value in the table and then retrieves the data directly from that location.
This makes nonclustered indexes the optimal choice for exact match queries because the index contains entries describing the exact location in the table of the data values being searched for in the queries. For example, to query the HumanResources.
The query optimizer can quickly find all entries in the index that match the specified ManagerID. Each index entry points to the exact page and row in the table, or clustered index, in which the corresponding data can be found. After the query optimizer finds all entries in the index, it can go directly to the exact page and row to retrieve the data.
Nonclustered indexes have the same B-tree structure as clustered indexes, except for the following significant differences. The data rows of the underlying table are not sorted and stored in order based on their nonclustered keys. The row locators in nonclustered index rows are either a pointer to a row or a clustered index key for a row. If the table is a heap, which means it does not have a clustered index, the row locator is a pointer to the row. The pointer is built from the file identifier (ID), page number, and number of the row on the page. If the table has a clustered index, or the index is on an indexed view, the row locator is the clustered index key for the row.
Nonclustered indexes have one row in sys. By default, a nonclustered index has a single partition. When a nonclustered index has multiple partitions, each partition has a B-tree structure that contains the index rows for that specific partition.
For example, if a nonclustered index has four partitions, there are four B-tree structures, with one in each partition. Depending on the data types in the nonclustered index, each nonclustered index structure will have one or more allocation units in which to store and manage the data for a specific partition.
Databases or tables with low update requirements, but large volumes of data can benefit from many nonclustered indexes to improve query performance. Consider creating filtered indexes for well-defined subsets of data to improve query performance, reduce index storage costs, and reduce index maintenance costs compared with full-table nonclustered indexes. Decision Support System applications and databases that contain primarily read-only data can benefit from many nonclustered indexes. The query optimizer has more indexes to choose from to determine the fastest access method, and the low update characteristics of the database mean index maintenance will not impede performance.
Online Transaction Processing applications and databases that contain heavily updated tables should avoid over-indexing. Additionally, indexes should be narrow, that is, with as few columns as possible.
Before you create nonclustered indexes, you should understand how your data will be accessed. Consider using a nonclustered index for queries that use join or grouping operations, that return a well-defined subset of rows rather than a large result set, or that contain columns frequently involved in exact-match search conditions.
For join and grouping operations, create multiple nonclustered indexes on the columns involved, and a clustered index on any foreign key columns. Create filtered indexes to cover queries that return a well-defined subset of rows from a large table. Index columns that are frequently involved in search conditions of a query, such as a WHERE clause, that return exact matches. Performance gains are achieved when the index contains all columns in the query.
Use an index with included columns to add covering columns instead of creating a wide index key. If the table has a clustered index, the column or columns defined in the clustered index are automatically appended to the end of each nonclustered index on the table. This can produce a covered query without specifying the clustered index columns in the definition of the nonclustered index. For example, if a table has a clustered index on column C, a nonclustered index on columns B and A will have as its key values columns B, A, and C.
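The B, A, C example can be tried out with SQLite as a stand-in, using Python's standard sqlite3 module. This is an analogy under stated assumptions: in SQLite, every secondary index entry carries the table's integer primary key, which behaves much like the appended clustering key described above. The table T, its extra column D, the data, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# C is the INTEGER PRIMARY KEY, so it plays the role of the clustering key.
conn.execute("CREATE TABLE T (C INTEGER PRIMARY KEY, A TEXT, B TEXT, D TEXT)")
conn.executemany(
    "INSERT INTO T (A, B, D) VALUES (?, ?, ?)",
    [(f"a{i % 7}", f"b{i % 13}", f"d{i}") for i in range(500)],
)
# The index is declared on B and A only; C is not listed.
conn.execute("CREATE INDEX IX_T_B_A ON T (B, A)")


def plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Only B, A, and C are referenced: the index alone can answer the query.
covered_plan = plan("SELECT B, A, C FROM T WHERE B = 'b3'")

# D is not in the index, so the base table must also be read.
uncovered_plan = plan("SELECT B, A, C, D FROM T WHERE B = 'b3'")

print(covered_plan)
print(uncovered_plan)
```

The first plan reports a covering index even though C never appears in the index definition, while the second does not because D has to be fetched from the table.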
Consider columns with lots of distinct values, such as a combination of last name and first name, if a clustered index is used for other columns. If there are very few distinct values, such as only 1 and 0, most queries will not use the index because a table scan is generally more efficient. For this type of data, consider creating a filtered index on a distinct value that only occurs in a small number of rows. For example, if most of the values are 0, the query optimizer might use a filtered index for the data rows that contain 1.
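The 0/1 case can be illustrated with SQLite's partial indexes, which are a close analogue of filtered indexes. The sketch below uses Python's standard sqlite3 module rather than SQL Server; the Tasks table, the IsOpen flag, the roughly 1% ratio of flagged rows, and the index name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tasks ("
    " TaskID INTEGER PRIMARY KEY,"
    " IsOpen INTEGER NOT NULL,"
    " Title TEXT NOT NULL)"
)
# Made-up data: about 1% of rows have IsOpen = 1, the rest have 0.
conn.executemany(
    "INSERT INTO Tasks (IsOpen, Title) VALUES (?, ?)",
    [(1 if i % 100 == 0 else 0, f"task {i}") for i in range(10_000)],
)

# The index only contains entries for rows that satisfy the predicate.
conn.execute("CREATE INDEX FIX_Tasks_Open ON Tasks (IsOpen) WHERE IsOpen = 1")


def plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# The query predicate matches the index predicate, so the index is usable.
open_plan = plan("SELECT TaskID FROM Tasks WHERE IsOpen = 1")

# Rows with IsOpen = 0 are not in the index, so it cannot be used here.
closed_plan = plan("SELECT TaskID FROM Tasks WHERE IsOpen = 0")

open_count = conn.execute(
    "SELECT COUNT(*) FROM Tasks WHERE IsOpen = 1"
).fetchone()[0]
print(open_plan, closed_plan, open_count)
```

Only the rare value is indexed, so the index stays small and is maintained only when rows with IsOpen = 1 are affected.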
You can extend the functionality of nonclustered indexes by adding nonkey columns to the leaf level of the nonclustered index. By including nonkey columns, you can create nonclustered indexes that cover more queries. This is because nonkey columns are not considered by the Database Engine when calculating the number of index key columns or index key size. An index with included nonkey columns can significantly improve query performance when all columns in the query are included in the index either as key or nonkey columns.
When an index contains all the columns referenced by the query it is typically referred to as covering the query. While key columns are stored at all levels of the index, nonkey columns are stored only at the leaf level. You can include nonkey columns in a nonclustered index to avoid exceeding the current index size limitations of a maximum of 16 key columns and a maximum index key size of bytes. The Database Engine does not consider nonkey columns when calculating the number of index key columns or index key size.
For example, assume that you want to index the following columns in the Document table. The following statement creates such an index. All data types are allowed except text, ntext, and image. Computed columns that are deterministic and either precise or imprecise can be included columns. As with key columns, computed columns derived from image, ntext, and text data types can be nonkey included columns as long as the computed column data type is allowed as a nonkey index column.
At least one key column must be defined. The maximum number of nonkey columns is columns. This is the maximum number of table columns minus 1. Index key columns, excluding nonkeys, must follow the existing index size restrictions of 16 key columns maximum, and a total index key size of bytes. When you modify a table column that has been defined as an included column, restrictions apply. One change that is permitted is to increase the length of varchar, nvarchar, or varbinary columns.
Redesign nonclustered indexes with a large index key size so that only columns used for searching and lookups are key columns. Make all other columns that cover the query included nonkey columns.
In this way, you will have all columns needed to cover the query, but the index key itself is small and efficient. To cover the query, each column must be defined in the index. Although you could define all columns as key columns, the key size would be bytes. Because the only column actually used as search criteria is the PostalCode column, having a length of 30 bytes, a better index design would define PostalCode as the key column and include all other columns as nonkey columns.
Avoid adding unnecessary columns. Adding too many index columns, key or nonkey, can have the following performance implications. Fewer index rows will fit on a page. More disk space will be required to store the index. In particular, adding varchar(max), nvarchar(max), varbinary(max), or xml data types as nonkey index columns may significantly increase disk space requirements. This is because the column values are copied into the index leaf level. Therefore, they reside in both the index and the base table.
Index maintenance may increase the time that it takes to perform modifications, inserts, updates, or deletes, to the underlying table or indexed view.
You will have to determine whether the gains in query performance outweigh the effect on performance during data modification and the additional disk space requirements.
A unique index guarantees that the index key contains no duplicate values and therefore every row in the table is in some way unique. Specifying a unique index makes sense only when uniqueness is a characteristic of the data itself. For example, if a unique index is defined on an identifying column of an employee table and the user tries to enter the same value in that column for more than one employee, an error message is displayed and the duplicate value is not entered.
With multicolumn unique indexes, the index guarantees that each combination of values in the index key is unique. For example, if a unique index is created on a combination of LastName , FirstName , and MiddleName columns, no two rows in the table could have the same combination of values for these columns.
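The multicolumn guarantee can be demonstrated with a small sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for SQL Server; the Person table, the index name, the sample names, and the `try_insert` helper are hypothetical. The columns are declared NOT NULL on purpose, because database engines differ in how a unique index treats NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person ("
    " PersonID INTEGER PRIMARY KEY,"
    " LastName TEXT NOT NULL,"
    " FirstName TEXT NOT NULL,"
    " MiddleName TEXT NOT NULL)"
)
# Uniqueness applies to the combination of the three columns.
conn.execute(
    "CREATE UNIQUE INDEX AK_Person_Name "
    "ON Person (LastName, FirstName, MiddleName)"
)


def try_insert(last, first, middle):
    """Insert one row; return False if the unique index rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Person (LastName, FirstName, MiddleName) "
                "VALUES (?, ?, ?)",
                (last, first, middle),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Smith", "Ann", "B"),  # first occurrence of this combination
    try_insert("Smith", "Ann", "C"),  # differs in MiddleName, so it is allowed
    try_insert("Smith", "Ann", "B"),  # exact duplicate of the first row
]
print(results)
```

Individual columns may repeat freely; only a repeat of the whole combination is rejected.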
Both clustered and nonclustered indexes can be unique. Provided that the data in the column is unique, you can create both a unique clustered index and multiple unique nonclustered indexes on the same table.
There are no significant differences between creating a UNIQUE constraint and creating a unique index independent of a constraint.
Data validation occurs in the same manner and the query optimizer does not differentiate between a unique index created by a constraint or manually created. However, when data integrity is the objective, create a UNIQUE or PRIMARY KEY constraint on the column. By doing this the objective of the index will be clear.
If the data is unique and you want uniqueness enforced, creating a unique index instead of a nonunique index on the same combination of columns provides additional information for the query optimizer that can produce more efficient execution plans. A unique nonclustered index can contain included nonkey columns.
For more information, see Index with Included Columns.
A filtered index is an optimized nonclustered index, especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce index storage costs compared with full-table indexes. A well-designed filtered index improves query performance and execution plan quality because it is smaller than a full-table nonclustered index and has filtered statistics.
The filtered statistics are more accurate than full-table statistics because they cover only the rows in the filtered index.
An index is maintained only when data manipulation language (DML) statements affect the data in the index. A filtered index reduces index maintenance costs compared with a full-table nonclustered index because it is smaller and is only maintained when the data in the index is affected.
It is possible to have a large number of filtered indexes, especially when they contain data that is affected infrequently. Similarly, if a filtered index contains only the frequently affected data, the smaller size of the index reduces the cost of updating the statistics. Creating a filtered index can reduce disk storage for nonclustered indexes when a full-table index is not necessary.
You can replace a full-table nonclustered index with multiple filtered indexes without significantly increasing the storage requirements.