The Netscape Directory Server Deployment Guide introduced the concept of indexing, the costs and benefits, and different types of index shipped with Netscape Directory Server (Directory Server). This chapter begins with a description of the searching algorithm itself, so as to place the indexing mechanism in context, and then describes how to create, delete, and manage indexes. This chapter contains the following sections:
This section provides an overview of indexing in Directory Server. It contains the following topics:
Indexes are stored in files in the directory's databases. The names of the files are based on the indexed attribute, not the type of index contained in the file. Each index file may contain multiple types of indexes if multiple indexes are maintained for the specific attribute. For example, all indexes maintained for the common name attribute are contained in the cn.db3 file.
Directory Server supports the following types of index:
When you install Directory Server, a set of default and system indexes is created per database instance. To maintain these indexes, the directory uses standard indexes.
The default indexes can be modified depending on your indexing needs, although you should ensure that no server plug-ins or other servers in your enterprise depend on an index before you remove it.
Table 10-1 lists the default indexes installed with the directory.
Table 10-1 Default Indexes
| Improves the performance of the most common types of user directory searches. |
| Improves the performance of the most common types of user directory searches. |
| Improves the performance of the most common types of user directory searches. |
| Improves Netscape server performance. This index is also used by the Referential Integrity Plug-in. See Maintaining Referential Integrity for more information. |
| Improves Netscape server performance. This index is also used by the Referential Integrity Plug-in. See Maintaining Referential Integrity for more information. |
| Improves Netscape server performance. This index is also used by the Referential Integrity Plug-in. See Maintaining Referential Integrity for more information. |
| Improves the performance of the most common types of user directory searches. |
| Improves the performance of the most common types of user directory searches. |
| Improves Netscape server performance. This index is also used by the Referential Integrity Plug-in. See Maintaining Referential Integrity for more information. |
System indexes are indexes that cannot be deleted or modified. They are required by the directory to function properly. Table 10-2 lists the system indexes included with the directory.
Table 10-2 System Indexes
Because of the need to maintain default indexes and other internal indexing mechanisms, the Directory Server also maintains certain standard index files. The following standard indexes exist by default, and you do not need to generate them:
Indexes are used to speed up searches. To understand how the directory uses indexes, it helps to understand the searching algorithm. Each index contains a list of values for an attribute (such as the cn, common name, attribute) and pointers to the entries corresponding to each value. Directory Server processes a search request as follows:
See Netscape Directory Server Configuration, Command, and File Reference for further information about these attributes.
In addition, the directory uses a variation of the metaphone phonetic algorithm to perform searches on an approximate index. Each value is treated as a sequence of words, and a phonetic code is generated for each word.
The metaphone phonetic algorithm in Directory Server supports only US-ASCII letters. Therefore, use approximate indexing only with English values.
Values entered on an approximate search are similarly translated into a sequence of phonetic codes. An entry is considered to match a query if both of the following are true:
Before you create new indexes, balance the benefits of maintaining indexes against the costs. Keep in mind that:
The following example illustrates exactly how time-consuming indexes can become. Consider the procedure for creating a specific attribute:
For example, suppose the Directory Server is asked to add the entry:
dn: cn=John Doe, ou=People, o=example.com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
cn: John Doe
cn: John
sn: Doe
ou: Manufacturing
ou: people
telephonenumber: 408 555 8834
description: Manufacturing lead for the Z238 line of widgets.
Further suppose that the Directory Server is maintaining the following indexes:
Then to add this entry to the directory, the Directory Server must perform these steps:
This section describes how to create presence, equality, approximate, substring, and international indexes for specific attributes using the Directory Server Console and the command line.
Using the Directory Server Console, you can create presence, equality, approximate, substring, and international indexes for specific attributes.
Do not click on the Database Settings node because this will take you to the Default Index Settings window and not the window for configuring indexes per database.
You can create presence, equality, approximate, substring, and international indexes for specific attributes from the command-line.
Creating indexes from the command-line involves two steps:
You cannot create new system indexes because system indexes are hard-coded in Directory Server.
Use ldapmodify to add the new index attributes to your directory. If you want to create a new index that will become one of the default indexes, add the new index attributes to the cn=default indexes,cn=config,cn=ldbm database,cn=plugins,cn=config entry.
To create a new index for a particular database, add it to the cn=index,cn=database_name,cn=ldbm database,cn=plugins,cn=config entry, where cn=database_name corresponds to the name of the database.
For information on the LDIF update statements required to add entries, see LDIF Update Statements.
For example, assume you want to create presence, equality, and substring indexes for the sn (surname) attribute in the Example1 database.
First, type the following to change to the directory containing the utility:
Run the ldapmodify command-line utility as follows:
ldapmodify -a -h server -p 389 -D "cn=directory manager" -w password
The ldapmodify utility binds to the server and prepares it to add an entry to the configuration file.
Next, you add the following entry for the new indexes:
dn: cn=sn,cn=index,cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:nsIndex
cn:sn
nsSystemIndex:false
nsIndexType:pres
nsIndexType:eq
nsIndexType:sub
nsMatchingRule:2.16.840.1.113730.3.3.2.3.1
The cn attribute contains the name of the attribute you want to index, in this example, the sn attribute. The entry is a member of the nsIndex object class. The nsSystemIndex attribute is false, indicating that the index is not essential to Directory Server operations. The multi-valued nsIndexType attribute specifies the presence (pres), equality (eq) and substring (sub) indexes. Each keyword has to be entered on a separate line. The nsMatchingRule attribute specifies the OID of the Bulgarian collation order.
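The structure of the index entry described above can be illustrated with a short Python sketch that assembles the same LDIF text. This is only an illustration: build_index_ldif is a hypothetical helper, not a tool shipped with Directory Server, and it simply formats the attributes named in this section.

```python
def build_index_ldif(attribute, database, index_types, matching_rules=()):
    """Return LDIF text for an nsIndex configuration entry (hypothetical helper)."""
    lines = [
        # The entry lives under cn=index of the named database.
        f"dn: cn={attribute},cn=index,cn={database},"
        "cn=ldbm database,cn=plugins,cn=config",
        "objectClass: top",
        "objectClass: nsIndex",
        f"cn: {attribute}",
        "nsSystemIndex: false",
    ]
    # nsIndexType is multi-valued: each keyword goes on its own line.
    lines += [f"nsIndexType: {index_type}" for index_type in index_types]
    # Optional collation order OIDs for international indexes.
    lines += [f"nsMatchingRule: {oid}" for oid in matching_rules]
    return "\n".join(lines) + "\n"


ldif = build_index_ldif(
    "sn", "Example1", ["pres", "eq", "sub"], ["2.16.840.1.113730.3.3.2.3.1"]
)
print(ldif)
```

The printed text matches the sn example entry and could be supplied as input to the ldapmodify -a command shown earlier.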
Specifying an index entry with no value in the nsIndexType attribute results in all indexes (except international) being maintained for the specified attribute. For example, suppose instead that you specify the following entry for your new sn indexes:
dn: cn=sn,cn=index,cn=database_name,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:nsIndex
cn:sn
nsSystemIndex:false
nsIndexType:
This new entry results in all indexes (except international) being maintained for the sn (surname) attribute.
You can use the keyword none in the nsIndexType attribute to specify that no indexes are to be maintained for the attribute. For example, suppose you want to temporarily disable the sn indexes you just created on the Example1 database. You change the nsIndexType to none as follows:
dn: cn=sn,cn=index,cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:nsIndex
cn:sn
nsSystemIndex:false
nsIndexType:none
For a complete list of collation orders and their OIDs, see Appendix D, "Internationalization."
For more information about the index configuration attributes, see the Netscape Directory Server Configuration, Command, and File Reference.
For more information about the ldapmodify command-line utility, refer to the Netscape Directory Server Configuration, Command, and File Reference.
You should always use the attribute's primary name (not the attribute's alias) when creating indexes. The primary name of the attribute is the first name listed for the attribute in the schema; for example, uid for the userid attribute. See Table 10-3 for a list of all primary and alias attribute names.
Once you have created an indexing entry or added additional index types to an existing indexing entry, run the db2index.pl script to generate the new set of indexes to be maintained by the Directory Server. Once you run the script, the new set of indexes is active for any new data you add to your directory and any existing data in your directory.
To run the db2index.pl Perl script:
Two examples of generating indexes using the db2index.pl follow:
Windows batch file (you need to run the script from the ..\bin\slapd\admin\bin\perl directory as shown in the example):
..\bin\slapd\admin\bin\perl db2index.pl -D "cn=Directory Manager" -w password -n ExampleServer -t sn
db2index.pl -D "cn=Directory Manager" -w password -n ExampleServer -t sn
The following table describes the db2index.pl options used in the examples:
| -n | Specifies the name of the database to be indexed. |
For more information about the db2index.pl Perl script, see the Netscape Directory Server Configuration, Command, and File Reference.
To create a browsing index or virtual list view (VLV) index using the Directory Server Console:
By default, access to VLV information is allowed for any user who has authenticated. If a site requires anonymous users to use the VLV information, modify the access control set for cn: VLV Request Control in the Directory Server's configuration.
Creating a browsing index or virtual list view (VLV) index from the command-line involves these steps:
The following sections describe the steps involved in creating browsing indexes.
The type of browsing index entry you want to create depends on the type of ldapsearch attribute sorting you want to accelerate. It is important to take the following into account:
For example, suppose you want to create a browsing index to accelerate an ldapsearch on the entry "dc=example,dc=com" held in the Example1 database where:
First, type the following to change to the directory containing the utility:
Run the ldapmodify command-line utility as follows:
ldapmodify -a -h server -p 389 -D "cn=directory manager" -w password
The ldapmodify utility binds to the server and prepares it to add an entry to the configuration file.
Next, you need to add two browsing index entries which define your browsing index.
The first entry you add specifies the base, scope, and filter of the browsing index:
dn: cn="dc=example,dc=com",cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:vlvSearch
cn:"dc=example,dc=com"
vlvbase:"dc=example,dc=com"
vlvscope:one
vlvfilter:(|(objectclass=*)(objectclass=ldapsubentry))
The cn contains the browsing index identifier, which specifies the entry on which you want to create the browsing index, in this example the "dc=example,dc=com" entry. We recommend you use the dn of the entry for your browsing index identifier, which is the approach adopted by the Directory Server Console, to prevent identical browsing indexes from being created. The entry is a member of the vlvSearch object class.
The vlvbase attribute value specifies the entry on which you want to create the browsing index, in this example the "dc=example,dc=com" entry (the browsing index identifier).
The vlvscope attribute is one, indicating that the scope for the search you want to accelerate is one. A search scope of one means that only the immediate children of the entry specified in the cn attribute, and not the entry itself, will be searched.
The vlvfilter specifies the filter to be used for the search, in this example (|(objectclass=*)(objectclass=ldapsubentry)).
The second entry you add specifies the sorting order you want for the returned attributes:
dn: cn=sort_cn_givenname_o_ou_sn,cn="dc=example,dc=com",cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:vlvIndex
cn:cn=sort_cn_givenname_o_ou_sn
vlvsort:cn givenname o ou sn
The cn contains the browsing index sort identifier. We recommend you use a sort identifier which clearly identifies the search sorting order for the browsing index you create, such as the explicit sort identifier cn=sort_cn_givenname_o_ou_sn in this example. The entry is a member of the vlvIndex object class.
The vlvsort attribute value specifies the order in which you want your attributes to be sorted, in this example cn, givenname, o, ou, and then sn.
The first browsing index entry must be added to the cn=database_name,cn=ldbm database,cn=plugins,cn=config directory tree node, and the second entry must be a child of the first entry.
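The parent and child relationship between the two browsing index entries can be made concrete with a small Python sketch that generates both LDIF entries from the search parameters. The helper build_vlv_ldif is hypothetical (it is not part of Directory Server), and the sort identifier naming scheme simply mirrors the sort_cn_givenname_o_ou_sn example above.

```python
def build_vlv_ldif(base_dn, database, scope, search_filter, sort_attrs):
    """Return LDIF for a vlvSearch entry and its child vlvIndex entry (hypothetical helper)."""
    # First entry: identified by the DN of the entry being browsed.
    parent_dn = f'cn="{base_dn}",cn={database},cn=ldbm database,cn=plugins,cn=config'
    # Sort identifier that spells out the sort order, as recommended above.
    sort_id = "sort_" + "_".join(sort_attrs)
    sort_order = " ".join(sort_attrs)

    search_entry = [
        f"dn: {parent_dn}",
        "objectClass: top",
        "objectClass: vlvSearch",
        f'cn: "{base_dn}"',
        f'vlvbase: "{base_dn}"',
        f"vlvscope: {scope}",
        f"vlvfilter: {search_filter}",
    ]
    # Second entry: must be a child of the first entry.
    index_entry = [
        f"dn: cn={sort_id},{parent_dn}",
        "objectClass: top",
        "objectClass: vlvIndex",
        f"cn: cn={sort_id}",
        f"vlvsort: {sort_order}",
    ]
    return "\n".join(search_entry) + "\n\n" + "\n".join(index_entry) + "\n"


vlv_ldif = build_vlv_ldif(
    "dc=example,dc=com",
    "Example1",
    "one",
    "(|(objectclass=*)(objectclass=ldapsubentry))",
    ["cn", "givenname", "o", "ou", "sn"],
)
print(vlv_ldif)
```

Because the second DN is built by prefixing the first, the child entry always sits directly beneath the vlvSearch entry it belongs to.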
Once you have created the two browsing index entries or added additional attribute types to existing browsing index entries, run the vlvindex script to generate the new set of browsing indexes to be maintained by the Directory Server. After you run the script, the new set of browsing indexes is active for any new data you add to your directory and any existing data in your directory.
Two examples of generating browsing indexes using the vlvindex script follow.
Windows batch file (you need to run the script from the ..\bin\slapd\admin\bin\perl directory as shown in the example):
..\bin\slapd\admin\bin\perl vlvindex -n Example1 -T "dc=example,dc=com"
vlvindex -n Example1 -T "dc=example,dc=com"
The following table describes the vlvindex options used in the examples:
| -T | Browsing index identifier to use to create browsing indexes. |
For more information about the vlvindex script, see the Netscape Directory Server Configuration, Command, and File Reference.
By default, access to the VLV index information is allowed for any user who has authenticated. If a site requires anonymous users to use the VLV index information, modify the access control set for cn: VLV Request Control in the Directory Server's configuration.
This section describes how to delete presence, equality, approximate, substring, international, and browsing indexes for specific attributes.
As the procedure for deleting browsing indexes is different, it is covered in a separate section. This section contains the following procedures:
You must not delete system indexes because deleting them can significantly affect Directory Server performance. System indexes are located in the cn=index,cn=instance,cn=ldbm database,cn=plugins,cn=config entry and the cn=default indexes,cn=config,cn=ldbm database,cn=plugins,cn=config entry. Take care when deleting default indexes since this can also affect how Directory Server works. For further information on system and default indexes, see the Netscape Directory Server Deployment Guide.
Using the Directory Server Console you can delete indexes you have created, indexes used by other Netscape servers (such as Netscape Messaging Server or Netscape Calendar Server), and default indexes. You cannot delete system indexes.
To delete indexes using the Directory Server Console:
You can delete presence, equality, approximate, substring, and international indexes using the ldapdelete command-line utility as follows:
The following sections describe the steps involved in deleting an index.
Use the ldapdelete command-line utility to delete either the entire indexing entry or the unwanted index types from an existing entry.
If you want to delete the indexes for a particular database, you remove your index entry from the cn=index,cn=database_name,cn=ldbm database,cn=plugins,cn=config entry, where cn=database_name corresponds to the name of the database.
To delete a default index, remove it from the cn=default indexes,cn=config,cn=ldbm database,cn=plugins,cn=config entry.
For example, suppose you want to delete the presence, equality, and substring indexes for the sn attribute on the database named Example1.
You want to delete the following entry:
dn: cn=sn,cn=index,cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:nsIndex
cn:sn
nsSystemIndex:false
nsIndexType:pres
nsIndexType:eq
nsIndexType:sub
nsMatchingRule:2.16.840.1.113730.3.3.2.3.1
To run the ldapdelete command-line utility, type the following to change to the directory containing the utility:
Perform the ldapdelete as follows:
ldapdelete -D "cn=Directory Manager" -w password -h ExampleServer -p 845 "cn=sn,cn=index,cn=Example1,cn=ldbm database,cn=plugins,cn=config"
The following table describes the ldapdelete options used in the example:
For full information on ldapdelete options, refer to the Netscape Directory Server Configuration, Command, and File Reference.
Once you have deleted this entry, the presence, equality, and substring indexes for the sn attribute will no longer be maintained by the Example1 database.
Once you have deleted an indexing entry or deleted some of the index types from an indexing entry, run the db2index.pl script to generate the new set of indexes to be maintained by the Directory Server. Once you run the script, the new set of indexes is active for any new data you add to your directory and any existing data in your directory.
To run the db2index.pl Perl script:
Two examples of generating the new set of indexes to be maintained by the server using db2index.pl follow:
Windows batch file (you need to run the script from the ..\bin\slapd\admin\bin\perl directory as shown in the example):
..\bin\slapd\admin\bin\perl db2index.pl -D "cn=Directory Manager" -w password -n Example1
db2index.pl -D "cn=Directory Manager" -w password -n Example1
The following table describes the db2index.pl options used in the examples:
| -n | Specifies the name of the database to be indexed. |
For more information about the db2index.pl Perl script, see the Netscape Directory Server Configuration, Command, and File Reference.
Using Directory Server Console you can delete browsing indexes.
To delete a browsing index using the Directory Server Console:
Deleting a browsing index, or virtual list view (VLV) index, from the command-line involves two steps:
The following sections describe the steps involved in deleting browsing indexes.
Use the ldapdelete command-line utility to either delete browsing indexing entries or edit existing browsing index entries.
To delete browsing indexes for a particular database, you remove your browsing index entries from the cn=database_name,cn=ldbm database,cn=plugins,cn=config entry, where cn=database_name corresponds to the name of the database.
For example, suppose you want to delete a browsing index for accelerating ldapsearch operations on the entry "dc=example,dc=com" held in the Example1 database where the search base is "dc=example,dc=com", the search filter is (|(objectclass=*)(objectclass=ldapsubentry)), the scope is one, and the sorting order for the returned attributes is cn, givenname, o, ou, and sn.
To delete this browsing index, you need to delete the two corresponding browsing index entries which follow:
dn: cn="dc=example,dc=com",cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:vlvSearch
cn:"dc=example,dc=com"
vlvbase:"dc=example,dc=com"
vlvscope:one
vlvfilter:(|(objectclass=*)(objectclass=ldapsubentry))
dn: cn=sort_cn_givenname_o_ou_sn,cn="dc=example,dc=com",cn=Example1,cn=ldbm database,cn=plugins,cn=config
objectClass:top
objectClass:vlvIndex
cn:cn=sort_cn_givenname_o_ou_sn
vlvsort:cn givenname o ou sn
To run the ldapdelete command-line utility, type the following to change to the directory containing the utility:
Perform the ldapdelete as follows:
ldapdelete -D "cn=Directory Manager" -w password -h ExampleServer -p 845 "cn=sort_cn_givenname_o_ou_sn,cn=\"dc=example,dc=com\",cn=Example1,cn=ldbm database,cn=plugins,cn=config" "cn=\"dc=example,dc=com\",cn=Example1,cn=ldbm database,cn=plugins,cn=config"
The following table describes the ldapdelete options used in the example:
For full information on ldapdelete options, refer to the Netscape Directory Server Configuration, Command, and File Reference.
Once you have deleted these two browsing index entries, the Example1 database no longer maintains the browsing index for the search described at the start of this example: search base "dc=example,dc=com", search filter (|(objectclass=*)(objectclass=ldapsubentry)), scope one, and sorting order cn, givenname, o, ou, and sn.
Once you have deleted browsing indexing entries or deleted unwanted attribute types from existing browsing indexing entries, run the vlvindex script to generate the new set of browsing indexes to be maintained by the Directory Server. Once you run the script, the new set of browsing indexes is active for any new data you add to your directory and any existing data in your directory.
Two examples of regenerating browsing indexes using vlvindex follow:
Windows batch file (you need to run the script from the ..\bin\slapd\admin\bin\perl directory as shown in the example):
..\bin\slapd\admin\bin\perl vlvindex -n Example1 -T "dc=example,dc=com"
vlvindex -n Example1 -T "dc=example,dc=com"
The following table describes the vlvindex options used in the examples.
| -T | Browsing index identifier to use to create browsing indexes. |
For more information about the vlvindex script, see the Netscape Directory Server Configuration, Command, and File Reference.
Each index that the directory uses is composed of a table of index keys and matching entry ID lists. This entry ID list is used by the directory to build a list of candidate entries that may match a client application's search request (see About Indexes for details).
In the 7.0 release of Directory Server, the secondary index structure has been redesigned, greatly improving write, search, and indexing operations. The following sections examine enhanced indexing operations and the BerkeleyDB design, searching and the old All IDs Threshold, and migrating and compatibility with previous versions of Directory Server.
To improve write performance, Directory Server 7.0 has a redesigned secondary index structure.
Previous versions of Directory Server achieved extremely high read performance, but write performance was limited by the number of bytes per second that could be written into the storage manager's transaction log file. Large log files were generated for each LDAP write operation; in fact, "log file verbosity" could easily be 100 times the corresponding number of bytes changed in the Directory Server. The majority of the contents in the log files are related to index changes (ID insert and delete operations).
The secondary index structure was separated into two levels in the old design:
Because it had no insight into the internal structure of the ID lists, the storage manager had to treat ID lists as opaque byte arrays. From the storage manager's perspective, when the content of an ID list changed, the entire list had changed. For a single ID that was inserted or deleted from an ID list, the corresponding number of bytes written to the transaction log was the maximum configured size for that ID list, about 8Kbytes. Also, every database page on which the list was stored was marked as dirty, since the "entire" list had changed.
In version 7.0, the storage manager now has visibility into the fine-grain index structure, which optimizes transaction logging so that only the number of bytes actually changed need to be logged for any given index modification. The BerkeleyDB feature provides ID list semantics, which are implemented by the storage manager. The Berkeley API was enhanced to support the insertion and deletion of individual IDs stored against a common key, with support for duplicate keys, and an optimized mechanism for the retrieval of the complete ID list for a given key.
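The difference in transaction log volume between the two designs can be estimated with simple arithmetic. The sketch below is a rough illustration only: the 8 Kbyte figure comes from the description of the old design, while the 4-byte entry ID size and the omission of per-record log overhead are assumptions made for this example, not figures from Directory Server.

```python
OLD_ID_LIST_BYTES = 8 * 1024  # old design: about 8 Kbytes logged per ID list change
ASSUMED_ID_BYTES = 4          # assumption: size of one entry ID (not stated in this guide)


def logged_bytes_old(num_id_changes):
    """Old design: the whole opaque ID list is logged for every single-ID change."""
    return num_id_changes * OLD_ID_LIST_BYTES


def logged_bytes_new(num_id_changes, id_bytes=ASSUMED_ID_BYTES):
    """New design: only the changed bytes are logged (log record overhead ignored)."""
    return num_id_changes * id_bytes


# Hypothetical write that inserts one entry ID into ten different index keys.
id_changes = 10
old_bytes = logged_bytes_old(id_changes)
new_bytes = logged_bytes_new(id_changes)
print(old_bytes, new_bytes)
```

Under these assumptions the old design logs several orders of magnitude more index data for the same write, which is the effect the redesigned index structure is meant to remove.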
The storage manager has direct knowledge of the application's intent when changes are made to ID lists. As a result:
For each entry ID list, there is a size limit that is globally applied to all index keys managed by the server. In previous versions of Directory Server, this limit was called the All IDs Threshold. Because maintaining large ID lists in memory can affect performance, the All IDs Threshold set a limit on how large a single entry ID list could get. When a list hit a certain pre-determined size, the search would treat it as if the index contained the entire directory.
The difficulty in setting the All IDs Threshold hurt performance. If the threshold was too low, too many searches examined every entry in the directory. If it was too high, too many large ID lists had to be maintained in memory.
The problems addressed by the All IDs Threshold are no longer present because of the efficiency of entry insertion, modification, and deletion in the BerkeleyDB design. The All IDs Threshold is removed for database write operations, and every ID list is now maintained accurately.
Since loading a long ID list from the database can significantly reduce search performance, the configuration parameter nsslapd-idlistscanlimit sets a limit on the number of IDs that are read before a key is considered to match the entire primary index. This mechanism is analogous to the All IDs Threshold, but it only applies to the behavior of the server's search code, not the content of the database.
When the server uses indexes in the processing of a search operation, it is possible that one index key matches a large number of entries. For example, consider a search for 'objectclass=inetorgperson' in a directory that contained one million inetorgperson entries. When the server reads the inetorgperson index key in the objectclass index, it would find one million matching entries. In cases like this, it is usually more efficient simply to conclude in the index lookup phase of the search operation processing that all the entries in the database match the query. This causes subsequent search processing to scan the entire database content, checking each entry as to whether it matches the search filter. The idea is that the time required to fetch the index keys is not worthwhile; the search operation is likely to be processed more efficiently by omitting the index lookup.
Directory Server implements this search optimization as follows: when examining an index, if more than a certain number of entries are found, the server will stop reading the index and mark the search as unindexed with respect to that particular index.
The threshold number of entries is called the idlistscanlimit and is configured with the nsslapd-idlistscanlimit configuration attribute. The default value is 4000, which is designed to give good performance for a common range of database sizes and access patterns. Typically, it is not necessary to change this value. However, in rare circumstances it may be possible to improve search performance with a different value. For example, lowering the value will improve performance for searches that will otherwise eventually hit the default limit of 4000. This might, of course, reduce performance for other searches that would benefit from indexing. Conversely, increasing the limit could improve performance for searches that were previously hitting the limit. With a higher limit, these searches could benefit from indexing where previously they did not.
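The scan limit behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the server's actual code: the index is represented as a plain dictionary, and the names read_id_list, candidate_ids, and ALLIDS are hypothetical.

```python
ALLIDS = object()  # sentinel meaning "treat this key as matching every entry"


def read_id_list(index, key, idlistscanlimit=4000):
    """Read the entry IDs stored under one index key, up to the scan limit."""
    ids = []
    for entry_id in index.get(key, ()):
        ids.append(entry_id)
        if len(ids) > idlistscanlimit:
            # Too many IDs: stop reading and mark the lookup as unindexed.
            return ALLIDS
    return ids


def candidate_ids(index, key, all_entry_ids, idlistscanlimit=4000):
    """Return the entries that must still be checked against the search filter."""
    ids = read_id_list(index, key, idlistscanlimit)
    if ids is ALLIDS:
        return list(all_entry_ids)  # scan the whole database
    return ids


# Toy objectclass index for a directory of 10,005 entries (hypothetical data).
all_ids = range(1, 10006)
objectclass_index = {
    "inetorgperson": range(1, 10001),
    "organizationalunit": [10001, 10002],
}

large = read_id_list(objectclass_index, "inetorgperson")
small = read_id_list(objectclass_index, "organizationalunit")
print(large is ALLIDS, small)
```

With the default limit, the inetorgperson lookup gives up and every entry becomes a candidate, while the short organizationalunit list is returned in full. Passing a larger idlistscanlimit to the same call keeps the inetorgperson search indexed, which is the trade-off discussed above.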
For more information on search limits for the server, see Overview of the Searching Algorithm.
While Directory Server 7.0 contains code to support the old database design, only the new design is supported for this and later releases of Directory Server.
Upon startup, the server will read the database version from the DBVERSION file, which contains the text Netscape-ldbm/6.2 (old database format) or Netscape-ldbm/7.0 (new database format). If the file indicates that the old format is used, then the old code is selected for the database. Because the DBVERSION file stores everything per-backend, it is theoretically possible to have different database formats for different individual backends. However, the old database format is not recommended.
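The startup check described above amounts to a simple dispatch on the version string. The following Python sketch illustrates the idea only; it assumes the DBVERSION content is exactly one of the two strings quoted above, and the function name select_database_code and the backend names are hypothetical.

```python
OLD_FORMAT = "Netscape-ldbm/6.2"
NEW_FORMAT = "Netscape-ldbm/7.0"


def select_database_code(dbversion_text):
    """Choose which database code path to use from the DBVERSION text."""
    version = dbversion_text.strip()
    if version == NEW_FORMAT:
        return "new"
    if version == OLD_FORMAT:
        return "old"
    raise ValueError(f"unrecognized database version: {version!r}")


# The version is recorded per backend, so each backend is checked separately.
backend_versions = {
    "Example1": "Netscape-ldbm/7.0\n",
    "Example2": "Netscape-ldbm/6.2\n",
}
selected = {name: select_database_code(text) for name, text in backend_versions.items()}
print(selected)
```

Because the decision is made per backend, the two hypothetical backends here end up on different code paths, which is the mixed-format situation the text describes as possible but not recommended.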
All databases must be migrated to Directory Server 7.0 when the system is upgraded. Migration is supported for Directory Server 6.x versions. Version 7.0 can be installed "over" the previous releases, resulting in a working server that supports the new database design. Migrating the databases is a lengthy operation; do not do this at the time of installation.
For releases earlier than version 6.11, it is recommended that the databases be dumped, and Directory Server installed fresh.
For more information on migrating databases, see chapter 6, "Migrating from Previous Versions," in the Netscape Directory Server Installation Guide.
Also, the index sizes can be larger than in previous releases, so you may want to increase your database cache size. To reconfigure your cache size, refer to "nsslapd-dbcachesize" in the Netscape Directory Server Configuration, Command, and File Reference.
Table 10-3 lists all attributes which have a primary or real name as well as an alias. When creating indexes, be sure to use the primary name.
Table 10-3 Attribute Name Quick Reference Table