16 DBMS_SEARCH Package
The DBMS_SEARCH PL/SQL package provides procedures and functions to create, manage, and query a ubiquitous search index.
Instead of creating various individual indexes or manually defining the USER_DATASTORE or MULTI_COLUMN_DATASTORE procedures along with materialized views, you can create a ubiquitous search index to automatically synthesize multiple tables or views into a single index. This index lets you perform full-text, vector-based semantic search, hybrid text-vector search and range-based searches across multiple objects within an entire schema.
Note:
In addition to the examples provided for each of these APIs, you can run various end-to-end example scenarios, as demonstrated in Oracle Text Application Developer's Guide.| Name | Description |
|---|---|
|
Creates a ubiquitous search index. |
|
|
Adds a table, view, or Duality view to the index as data source. |
|
|
Removes a data source and all its associated data from the index. |
|
|
Removes the index and all its associated data from the database. |
|
|
Returns a virtual indexed JSON document for the specified source metadata. |
|
|
Retrieves a hitlist, and facets an aggregations of JSON documents based on the specified filter conditions. |
16.1 CREATE_INDEX
The DBMS_SEARCH.CREATE_INDEX procedure creates a ubiquitous search index (or DBMS_SEARCH index) to perform full-text and range-based searches across multiple schema objects.
Syntax
The index type is a JSON search index enabled with a predefined set of preferences and settings to perform full text searches on tables, views, and Duality views.
DBMS_SEARCH.CREATE_INDEX(
index_name VARCHAR2,
tablespace VARCHAR2 DEFAULT NULL,
datatype VARCHAR2 DEFAULT NULL,
lexer VARCHAR2 DEFAULT NULL,
stoplist VARCHAR2 DEFAULT NULL,
wordlist VARCHAR2 DEFAULT NULL,
vectorizer VARCHAR2 DEFAULT NULL
);Note:
TheDBMS_SEARCH.CREATE_INDEX procedure has been upgraded in version 23.9 to support lexer, stoplist, wordlist, and vectorizer parameters. These parameters are not available for use in earlier release versions. Additionally, starting from version 23.9 DBMS_SEARCH index will not get created with default indexing preference, WILDCARD_INDEX.
- index_name
-
Specify the name of the
DBMS_SEARCHindex to create. You can also specify the schema owner name along with the index name as:[schema].index_name - tablespace
-
Specify the name of the tablespace to contain the index or index partitions.
- datatype
-
Specify the datatype of the
DATAcolumn on which to create the index. The allowed values areJSONandOSON.The default value is set to
JSON, so you need to specify thedatatypeargument only if you want to override this default. - lexer
-
Specify the name of your lexer or multilexer preference. Use the lexer preference to identify the language of your text and how text is tokenized for indexing. See Lexer Types
- wordlist
-
Specify the name of your wordlist preference. Use the wordlist preference to enable features such as fuzzy, stemming, and prefix indexing for better wildcard searching. See Wordlist Type
- stoplist
-
Specify the name of your stoplist. Use stoplist to identify words that are not to be indexed. See Stoplists
- vectorizer
-
Specify the name of your vectorizer preference. Use the vectorizer preference to customize vector search parameters of a hybrid vector indexing pipeline. The goal of a vectorizer preference is to provide you with a straightforward way to configure how to chunk and embed your documents and create a vector index, without requiring a deep understanding of various chunking or embedding strategies.
A vectorizer preference is a JSON object that collectively holds all indexing parameters related to chunking (
UTL_TO_CHUNKSorVECTOR_CHUNKS), embedding (UTL_TO_EMBEDDING,UTL_TO_EMBEDDINGS, orVECTOR_EMBEDDING), and vector index (distance,accuracy, orvector_idxtype). You use theDBMS_VECTOR_CHAIN.CREATE_PREFERENCEPL/SQL function to create a vectorizer preference. To create a vectorizer preference, see DBMS_VECTOR_CHAIN.CREATE_PREFERENCE. After creating a vectorizer preference, you can use thevectorizerparameter to pass the preference name.
Notes
-
You can define which tables or views should be indexed by adding them as data sources into your index. All the columns of those tables or views are indexed. Use the
DBMS_SEARCH.ADD_SOURCEandDBMS_SEARCH.REMOVE_SOURCEprocedures to manage data sources. -
The
DBMS_SEARCHindex is created with the following default indexing preferences:Preference Description WILDCARD_INDEXEnables wildcard indexing for a fast wildcard search.
BASIC_WORDLISTEnables stemming and fuzzy matching.
SEARCH_ONAllows both full-text and range-search queries for a specific data type. The supported data types are
NUMBER(for indexing numeric values) andTIMESTAMP(for indexing date-time values).SYNCandOPTIMIZECreates background jobs at predefined intervals to automatically synchronize the DML changes and optimize the index using the
AUTO_DAILYmode on all data sources.You do not need to explicitly run any
SYNC_INDEXandOPTIMIZE_INDEXoperations on this index. -
You can query this index using the
CONTAINS(),JSON_TEXTCONTAINS(), andJSON_EXISTSoperators on theINDEX_NAMEtable. -
You can use the following
DBMS_SEARCHdictionary views to examine these indexes:-
USER_DBMS_SEARCH_INDEXES: To query information about theDBMS_SEARCHindexes that are created in a user's schema. -
ALL_DBMS_SEARCH_INDEXES: To query information about all existingDBMS_SEARCHindexes, corresponding to each index owner. -
USER_DBMS_SEARCH_INDEX_SOURCES: To query information about the data sources that are added to theDBMS_SEARCHindexes, created in a user's schema. -
ALL_DBMS_SEARCH_INDEX_SOURCES: To query information about all existing data sources added to theDBMS_SEARCHindexes, corresponding to each index owner.
-
Example
This example specifies the index_name, tablespace, and datatype arguments. Here, the schema owner name is specified along with the index name as SCOTT.MYINDEX.
CREATE TABLESPACE tbs_02 DATAFILE 'dt.dbf' size 100MB segment space management auto;
exec DBMS_SEARCH.CREATE_INDEX('SCOTT.MYINDEX','tbs_02','JSON');16.2 ADD_SOURCE
The DBMS_SEARCH.ADD_SOURCE procedure adds one or more data sources (tables or views) from different schemas to the DBMS_SEARCH index.
Syntax
DBMS_SEARCH.ADD_SOURCE (
index_name VARCHAR2,
source_name VARCHAR2,
memory VARCHAR2 DEFAULT NULL,
parallel_degree NUMBER DEFAULT NULL);Note:
The DBMS_SEARCH.ADD_SOURCE procedure has been upgraded in version 23.9 to support memory and parallel_degree parameters. These parameters are not available for use in earlier release versions.
- index_name
- Specify name of the index to which you want to add the table or view. You can also specify
[schema].index_name. - source_name
- Specify name of the table, view, or Duality view to add to the index. You can also specify
[schema].table_or_view_name. - memory
- Specify the amount of memory allocated for indexing of the added data source.
Note:
Total memory usage is determined by multiplying thememoryparameter by theparallel_degree. - parallel_degree
- Specify the degree of parallelism used for indexing the added data source.
Notes
-
To add a data source, the index owner must have
SELECTandDMLaccess to the source. The user also must have theCTXAPPprivilege. -
All the data sources (such as table, view, or each table in the view definition) that are added to the
DBMS_SEARCHindex must include at least onePrimary Keycolumn. Each table that is part of a view source having a foreign key must also have theForeign Keyconstraint, referencing the relevant primary keys defined on the table.If the source table does not have a primary key, then a
ROWIDis used instead. However, Oracle strongly recommends defining a primary key. -
The
DBMS_SEARCHindex stores all supported SQL data types (includingObject Typecolumns) in JSON objects, except for theXMLTYPEandLONGdata types. This means that you cannot add a table or view as a data source to the index if it has a column with theXMLTYPEorLONGdata type. The maximum allowed length of a JSON data type is 32 megabytes. -
You can use the
USER_DBMS_SEARCH_INDEX_SOURCESandALL_DBMS_SEARCH_INDEX_SOURCESdictionary views to query information about the data sources that are added to yourDBMS_SEARCHindexes.
Examples
exec DBMS_SEARCH.ADD_SOURCE('MYINDEX','MYTABLE');exec DBMS_SEARCH.ADD_SOURCE('MYINDEX','MYVIEW');exec DBMS_SEARCH.ADD_SOURCE('DOCUSER.MYINDEX','DOCUSER.MYTABLE');Related Topics
16.3 REMOVE_SOURCE
The DBMS_SEARCH.REMOVE_SOURCE procedure removes one or more data sources (tables or views) from the DBMS_SEARCH index.
When run, this procedure deletes all indexed data and stops further indexing or maintenance operations on the associated data sources (tables or views).
Syntax
DBMS_SEARCH.REMOVE_SOURCE(
index_name VARCHAR2,
source_name VARCHAR2);
Example
exec DBMS_SEARCH.REMOVE_SOURCE('MYINDEX','MYTABLE');Related Topics
16.4 DROP_INDEX
The DBMS_SEARCH.DROP_INDEX procedure removes a DBMS_SEARCH index and all its associated data from the database.
Syntax
DBMS_SEARCH.DROP_INDEX(
INDEX_NAME VARCHAR2);
Example
exec DBMS_SEARCH.DROP_INDEX('MYINDEX');Related Topics
16.5 GET_DOCUMENT
The DBMS_SEARCH.GET_DOCUMENT procedure returns a virtual indexed JSON document as is indexed in the JSON search index for a particular row of an indexed data source (table or view).
Syntax
The DBMS_SEARCH index references associated data source tables to dynamically create a virtual indexed document. This document contains a JSON representation for each indexed row of a table or view that is added as a data source to this index. In this way, you can view all the contents extracted from the original base tables.
DBMS_SEARCH.GET_DOCUMENT(
index_name VARCHAR2,
metadata JSON
);
Example
SELECT DBMS_SEARCH.GET_DOCUMENT('MYINDEX',METADATA) from MYINDEX;Related Topics
16.6 FIND
The DBMS_SEARCH.FIND procedure retrieves a hitlist, and facets an aggregations of JSON documents based on the specified query-by-example (QBE) filter conditions.
You can compute aggregations on different fields of the JSON data. The query lists search results in the JSON Results Set Interface, which supports faceted navigation and aggregations.
Syntax
DBMS_SEARCH.FIND(
index_name VARCHAR2,
search_QBE JSON);- index_name
-
Specify name of the index on which you want to perform the query.
- search_QBE
-
Specify the
result_set_descriptorparameter value in JSON. It describes what the result set should contain.The JSON format input result set descriptor consists of the$query,$search, and$facetparts:{ "$query":text query and filter conditions, "$search":search result specifications, "$facet":faceted result specifications }For details on each of these JSON objects, see The JSON Format Input Result Set Descriptor.
The JSON format output result set descriptor consists of the following parts:
"$count":number "$hit":[ hit_object_1, ..., hit_object_i , ... ] "$facet":[ facet_object_1, ..., facet_object_i, ...]For details on each of these JSON objects, see The JSON Format Result Set Output.
Example
SET FEEDBACK 1
SET NUMWIDTH 10
SET LINESIZE 80
SET TRIMSPOOL ON
SET TAB OFF
SET PAGESIZE 100
connect sys/knl_example as sysdba;
Connected.
grant connect,resource, unlimited tablespace, ctxapp
to u1 identified by u1;
Grant succeeded.
connect u1/u1;
Connected.
create table tbl(id number primary key, jsn_col clob check(jsn_col is json));
Table created.
INSERT INTO tbl
VALUES (1,'{ "zebra" : { "price" : [2000,1000],
"name" : "Marty",
"stripes" : ["Dark","Light"],
"handler" : "Bob", "sold" : true }}');
1 row created.
INSERT INTO tbl
VALUES (2,'{ "zebra" : { "rating": 5, "price" : 1000,
"name" : "Zigby",
"stripes" : ["Light","Grey"],
"handler" : "Handy Marty", "sold" : "true" }}');
1 row created.
INSERT INTO tbl
VALUES (3,'{ "zebra" : { "rating": 4.5, "price" : 3000,
"name" : "Zigs",
"stripes" : ["Grey","Dark"],
"handler" : "Handy Marty", "sold" : false }}');
1 row created.
INSERT INTO tbl
VALUES (4,'{ "zebra" : { "rating": "4.5", "price" : "3000",
"name" : "Zigs",
"stripes" : ["Grey","Dark"],
"handler" : "Handy Marty", "sold" : null }}');
1 row created.
commit;
Commit complete.DBMS_SEARCH index using the DBMS_SEARCH.CREATE_INDEX procedure, and add a source table to the index: SQL> exec DBMS_SEARCH.CREATE_INDEX('JIDX');
PL/SQL procedure successfully completed.
SQL> exec DBMS_SEARCH.ADD_SOURCE('JIDX','TBL');
PL/SQL procedure successfully completed.
SQL> DBMS_SEARCH.FIND procedure:Query: All zebras having name starting with Zig or having name Marty and
having a price greater than equal to 2000
Facets: For all zebras that satisfy the query, do the following
-- 1. Get the count of zebras per zebra handler
-- 2. Get the minimum zebra rating
-- 3. Get the count of zebras for each unique stripe color
select DBMS_SEARCH.FIND('JIDX',JSON('
{
"$query": { "$and" : [
{ "U1.TBL.JSN_COL.zebra.name" : { "$contains" : "Zig% or Marty" } },
{ "U1.TBL.JSN_COL.zebra.price" : { "$gte" : 2000 } }
]
},
"$facet" : [
{ "$uniqueCount" : "U1.TBL.JSN_COL.zebra.handler" },
{ "$min" : "U1.TBL.JSN_COL.zebra.rating" },
{ "$uniqueCount" : "U1.TBL.JSN_COL.zebra.stripes" }
]
}'));
FIND_RESULT
--------------------------------------------------------------------------------
{
"$count" : 3,
"$facet" :
[
{
"U1.TBL.JSN_COL.zebra.handler" :
[
{
"value" : "Handy Marty",
"$uniqueCount" : 2
},
{
"value" : "Bob",
"$uniqueCount" : 1
}
]
},
{
"U1.TBL.JSN_COL.zebra.rating" :
{
"$min" : 4.5
}
},
{
"U1.TBL.JSN_COL.zebra.stripes" :
[
{
"value" : "Dark",
"$uniqueCount" : 3
},
{
"value" : "Grey",
"$uniqueCount" : 2
},
{
"value" : "Light",
"$uniqueCount" : 1
}
]
}
]
}
1 row selected.
connect sys/knl_example as sysdba;
Connected.
drop user u1 cascade;
User dropped.