4.7 Creating a CTXRULE Index
To build a document classification application, create a CTXRULE index on a table of queries. The queries define your categories, and each incoming document is classified by matching its content against them. Use the MATCHES operator to classify a single document.
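As a minimal sketch of single-document classification, the following query assumes the myqueries table and the CTXRULE index on its query column that are built in the steps of this section. The document text is an invented sample, passed to MATCHES as a literal.

```sql
-- Assumes: myqueries(queryid, category, query) exists and has a
-- CTXRULE index on the query column, as built later in this section.
-- The document text below is an invented sample, not from the source.
SELECT category
  FROM myqueries
 WHERE MATCHES(query,
               'The Democrat and Republican leaders met for a debate.') > 0;
```

MATCHES returns a value greater than zero for each stored query that the document satisfies, so the result set lists every category that applies. With the sample rules used in this section, the 'US Politics' rule is the one expected to match this text.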
To create a CTXRULE index and a simple document classification application:
-
Create a table of queries.
Create a myqueries table to hold the category name and query text, and then populate the table with the classifications and the queries that define each classification.
CREATE TABLE myqueries (
  queryid  NUMBER PRIMARY KEY,
  category VARCHAR2(30),
  query    VARCHAR2(2000)
);
For example, consider a classification for the US Politics, Music, and Soccer subjects:
INSERT INTO myqueries VALUES(1, 'US Politics', 'democrat or republican');
INSERT INTO myqueries VALUES(2, 'Music', 'ABOUT(music)');
INSERT INTO myqueries VALUES(3, 'Soccer', 'ABOUT(soccer)');
Tip:
You can also generate a table of rules (or queries) with the CTX_CLS.TRAIN procedure, which takes a document training set as input.
-
Create the CTXRULE index.
Use the CREATE INDEX statement to create the CTXRULE index, and specify lexer, storage, section group, and wordlist parameters if needed.
CREATE INDEX myruleindex ON myqueries(query)
  INDEXTYPE IS CTXRULE
  PARAMETERS ('lexer lexer_pref storage storage_pref section group section_pref wordlist wordlist_pref');
-
Classify a document.
Use the MATCHES operator to classify a document.
Assume that incoming documents are stored in the table news:
CREATE TABLE news (
  newsid  NUMBER,
  author  VARCHAR2(30),
  source  VARCHAR2(30),
  article CLOB
);
If you want, create a "before insert" trigger with MATCHES to route each document to a news_route table based on its classification:
BEGIN
  -- find matching queries
  FOR c1 IN (SELECT category
               FROM myqueries
              WHERE MATCHES(query, :new.article) > 0)
  LOOP
    INSERT INTO news_route(newsid, category)
    VALUES (:new.newsid, c1.category);
  END LOOP;
END;
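The trigger text in step 3 is only the PL/SQL body. The sketch below shows one way the surrounding pieces might look. The news_route column definitions and the trigger name news_classify_trg are assumptions for illustration, because this section names the news_route table but does not define it.

```sql
-- Assumed definition: the source uses news_route(newsid, category)
-- but does not show its DDL, so these column types are a guess that
-- mirrors news.newsid and myqueries.category.
CREATE TABLE news_route (
  newsid   NUMBER,
  category VARCHAR2(30)
);

-- Hypothetical trigger name. The body is the one shown in step 3.
CREATE OR REPLACE TRIGGER news_classify_trg
  BEFORE INSERT ON news
  FOR EACH ROW
BEGIN
  -- Find every stored query that the incoming article satisfies
  -- and record one routing row per matching category.
  FOR c1 IN (SELECT category
               FROM myqueries
              WHERE MATCHES(query, :new.article) > 0)
  LOOP
    INSERT INTO news_route (newsid, category)
    VALUES (:new.newsid, c1.category);
  END LOOP;
END;
/

-- Invented sample row to exercise the trigger.
INSERT INTO news (newsid, author, source, article)
VALUES (1, 'J. Doe', 'Example Wire',
        'The Democrat and Republican leaders met for a debate.');

-- Inspect the routing rows written by the trigger.
SELECT newsid, category FROM news_route;
```

Because the loop inserts one row per matching rule, a document that satisfies several queries is routed to several categories, and a document that matches none produces no news_route rows.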
See Also:
-
Classifying Documents in Oracle Text for more information on document classification and the CTXRULE index
-
Oracle Text Reference for more information on CTX_CLS.TRAIN