8.1 Highlighting Query Terms
In text query applications, you can present selected documents with query terms highlighted for text queries or with themes highlighted for ABOUT queries.
You can generate three types of output associated with highlighting:
- A marked-up version of the document
- Query offset information for the document
- A concordance of the document, in which occurrences of the query term are returned with their surrounding text
This section contains the following topics:
8.1.1 Text Highlighting
For text highlighting, you supply the query, and Oracle Text highlights words in the document that satisfy the query. You can obtain plain-text or HTML highlighting.
8.1.2 Theme Highlighting
For ABOUT queries, the CTX_DOC procedures highlight and mark up words or phrases that best represent the ABOUT query.
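As a minimal sketch of theme highlighting, the following anonymous block passes an ABOUT query to CTX_DOC.MARKUP. It assumes a CONTEXT index named idx_search_table with theme indexing available and a document whose key is '1'. The key and the theme politics are hypothetical values chosen for illustration.

```sql
-- Sketch only: mark up the words that best represent a theme query.
-- Assumptions: index idx_search_table exists, document key '1' exists,
-- and the theme 'politics' is just an illustrative value.
set serveroutput on
declare
  v_marked clob;  -- temporary CLOB allocated by Oracle Text
begin
  ctx_doc.markup(
    index_name => 'idx_search_table',
    textkey    => '1',
    text_query => 'about(politics)',
    restab     => v_marked,
    plaintext  => true,
    starttag   => '<<<',
    endtag     => '>>>');

  -- Print the first 2000 characters of the marked-up text
  dbms_output.put_line(dbms_lob.substr(v_marked, 2000, 1));

  -- Release the temporary CLOB
  dbms_lob.freetemporary(v_marked);
end;
/
```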
8.1.3 CTX_DOC Highlighting Procedures
These are the highlighting procedures in CTX_DOC:
- CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP
- CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT
- CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET
The POLICY and non-POLICY versions of the procedures are equivalent, except that the POLICY versions do not require an index.
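To illustrate the difference, the sketch below marks up a string that is not stored in any indexed table. It assumes you have the privileges to run CTX_DDL and creates a policy with default preferences. The policy name my_policy and the sample sentence are hypothetical.

```sql
-- Sketch only: highlighting without an index, using a policy.
-- Assumptions: my_policy is a hypothetical name; default preferences are used.
set serveroutput on
begin
  ctx_ddl.create_policy(policy_name => 'my_policy');
end;
/
declare
  v_marked clob;  -- temporary CLOB allocated by Oracle Text
begin
  ctx_doc.policy_markup(
    policy_name => 'my_policy',
    document    => 'Antique tansu chests are prized by collectors.',
    text_query  => 'tansu',
    restab      => v_marked,
    plaintext   => true,
    starttag    => '<<<',
    endtag      => '>>>');

  dbms_output.put_line(dbms_lob.substr(v_marked, 2000, 1));
  dbms_lob.freetemporary(v_marked);
end;
/
```

The document is passed directly as an argument, so no textkey is needed. When the policy is no longer required, it can be removed with ctx_ddl.drop_policy.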
Note:
SNIPPET can also be generated using the Result Set Interface.
See Also:
Oracle Text Reference for information on CTX_QUERY.RESULT_SET
This section contains these topics:
8.1.3.1 Markup Procedure
The CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP procedures take a document reference and a query, and return a marked-up version of the document.
The output can be either marked-up plain text or marked-up HTML. For example, you can specify that a marked-up document be returned with the query term surrounded by angle brackets (<<<tansu>>>) or by HTML bold tags (<b>tansu</b>).
CTX_DOC.MARKUP and CTX_DOC.POLICY_MARKUP are equivalent, except that CTX_DOC.POLICY_MARKUP does not require an index.
You can customize the markup sequence for HTML navigation.
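One way to get navigable markup is the predefined HTML_NAVIGATE tagset, sketched below. The index name matches the example in this section, and the document key '1' is a hypothetical value. Instead of the tagset, you can supply your own starttag, endtag, prevtag, and nexttag values; see Oracle Text Reference for the substitution variables they accept.

```sql
-- Sketch only: HTML markup with previous/next links between highlights.
-- Assumptions: index idx_search_table exists and document key '1' exists.
set serveroutput on
declare
  v_marked clob;  -- temporary CLOB allocated by Oracle Text
begin
  ctx_doc.markup(
    index_name => 'idx_search_table',
    textkey    => '1',
    text_query => 'tansu',
    restab     => v_marked,
    tagset     => 'HTML_NAVIGATE');

  dbms_output.put_line(dbms_lob.substr(v_marked, 2000, 1));
  dbms_lob.freetemporary(v_marked);
end;
/
```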
CTX_DOC.MARKUP Example
The following example is taken from the web application described in CONTEXT Query Application. The showDoc procedure takes a document key and a query, calls CTX_DOC.MARKUP to wrap each query term in red italic markup, and writes the result to an in-memory CLOB. It then reads the CLOB in chunks and uses htp.print to send each chunk to the browser.
procedure showDoc (p_id in varchar2, p_query in varchar2) is
  v_clob_selected CLOB;             -- receives the marked-up document
  v_read_amount   integer;          -- characters requested per read
  v_read_offset   integer;          -- current read position in the CLOB
  v_buffer        varchar2(32767);
begin
  htp.p('<html><title>HTML version with highlighted terms</title>');
  htp.p('<body bgcolor="#ffffff">');
  htp.p('<b>HTML version with highlighted terms</b>');
  begin
    -- Wrap every hit for p_query in red italic markup
    ctx_doc.markup (index_name => 'idx_search_table',
                    textkey    => p_id,
                    text_query => p_query,
                    restab     => v_clob_selected,
                    starttag   => '<i><font color=red>',
                    endtag     => '</font></i>');
    v_read_amount := 32767;
    v_read_offset := 1;
    begin
      -- Stream the CLOB to the browser in chunks;
      -- dbms_lob.read raises no_data_found once the end is reached
      loop
        dbms_lob.read(v_clob_selected, v_read_amount, v_read_offset, v_buffer);
        htp.print(v_buffer);
        v_read_offset := v_read_offset + v_read_amount;
        v_read_amount := 32767;
      end loop;
    exception
      when no_data_found then
        null;
    end;
    -- Release the temporary CLOB allocated by ctx_doc.markup
    dbms_lob.freetemporary(v_clob_selected);
  exception
    when others then
      null; --showHTMLdoc(p_id);
  end;
end showDoc;
8.1.3.2 Highlight Procedure
CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT take a query and a document, and return offset information for the query terms in either the plain-text or the HTML version of the document. You can use this offset information to write your own custom routines for displaying documents.
CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT are equivalent, except that CTX_DOC.POLICY_HIGHLIGHT does not require an index.
With offset information, you can render highlights in your own style, such as different font types or colors, instead of using the standard plain-text markup obtained from CTX_DOC.MARKUP.
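The following sketch shows one way to retrieve the offsets into an in-memory table and list them. It assumes the index idx_search_table from the earlier example and a hypothetical document key '1'. A real application would use each offset and length pair to apply its own styling.

```sql
-- Sketch only: list the offset and length of every hit in a document.
-- Assumptions: index idx_search_table exists and document key '1' exists.
set serveroutput on
declare
  v_hits ctx_doc.highlight_tab;  -- in-memory table of offset/length records
  i      binary_integer;
begin
  ctx_doc.highlight(
    index_name => 'idx_search_table',
    textkey    => '1',
    text_query => 'tansu',
    restab     => v_hits,
    plaintext  => true);

  -- Walk the table with first/next so no particular start index is assumed
  i := v_hits.first;
  while i is not null loop
    dbms_output.put_line('offset=' || v_hits(i).offset ||
                         ' length=' || v_hits(i).length);
    i := v_hits.next(i);
  end loop;
end;
/
```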
See Also:
Oracle Text Reference for more information about using CTX_DOC.HIGHLIGHT and CTX_DOC.POLICY_HIGHLIGHT
8.1.3.3 Concordance
CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET produce a concordance of the document, in which occurrences of the query term are returned with their surrounding text. This result is sometimes known as Key Word in Context (KWIC) because, instead of returning the entire document (with or without the query term highlighted), it returns the query term in text fragments, allowing a user to see it in context. You can control how the query term is highlighted in the returned fragments.
CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET are equivalent, except that CTX_DOC.POLICY_SNIPPET does not require an index. Both procedures include two new parameters: radius specifies the approximate desired length of each segment, and max_length puts an upper bound on the combined length of all segments.
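As a minimal sketch, the block below requests a concordance with explicit radius and max_length values. It assumes the index idx_search_table and a hypothetical document key '1'. The values 40 and 200 are arbitrary and chosen only for illustration.

```sql
-- Sketch only: keyword-in-context fragments for a query term.
-- Assumptions: index idx_search_table exists and document key '1' exists;
-- the radius and max_length values are arbitrary.
set serveroutput on
declare
  v_snippet varchar2(4000);
begin
  v_snippet := ctx_doc.snippet(
    index_name => 'idx_search_table',
    textkey    => '1',
    text_query => 'tansu',
    starttag   => '<b>',
    endtag     => '</b>',
    radius     => 40,    -- approximate desired length of each segment
    max_length => 200);  -- upper bound on the combined length of all segments

  dbms_output.put_line(v_snippet);
end;
/
```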
See Also:
Oracle Text Reference for more information about CTX_DOC.SNIPPET and CTX_DOC.POLICY_SNIPPET