CREATE_LANG_DATA
Use the DBMS_VECTOR_CHAIN.CREATE_LANG_DATA chunker helper procedure to load your own language data file into the database.
Purpose
To create custom language data for your chosen language (specified using the language chunking parameter).
A language data file contains language-specific abbreviation tokens. You can supply this data to the chunker to help it accurately determine the sentence boundaries of chunks, using knowledge of the input language's end-of-sentence (EOS) punctuation, abbreviations, and contextual rules.
Usage Notes
- All supported languages are distributed with default language-specific abbreviation dictionaries. You can create language data based on the abbreviation tokens loaded in the schema.table.column, using a user-specified language data name (PREFERENCE_NAME).
- After loading your language data, you can use language-specific chunking by specifying the language chunking parameter with VECTOR_CHUNKS or UTL_TO_CHUNKS.
- You can query these data dictionary views to access existing language data:
  - ALL_VECTOR_LANG displays all available language data.
  - USER_VECTOR_LANG displays language data from the schema of the current user.
  - ALL_VECTOR_ABBREV_TOKENS displays abbreviation tokens from all available language data.
  - USER_VECTOR_ABBREV_TOKENS displays abbreviation tokens from the language data owned by the current user.
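As a minimal sketch, the data dictionary views listed above can be inspected with plain queries. `SELECT *` is used here because this section does not describe the column layout of these views, so no column names are assumed.

```sql
-- All available language data
SELECT * FROM ALL_VECTOR_LANG;

-- Language data from the schema of the current user
SELECT * FROM USER_VECTOR_LANG;

-- Abbreviation tokens from all available language data
SELECT * FROM ALL_VECTOR_ABBREV_TOKENS;

-- Abbreviation tokens from language data owned by the current user
SELECT * FROM USER_VECTOR_ABBREV_TOKENS;
```

Running the USER_ views before and after a CREATE_LANG_DATA call is one simple way to check that your language data was registered.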
Syntax
DBMS_VECTOR_CHAIN.CREATE_LANG_DATA (
PARAMS IN JSON default NULL
);
PARAMS
{
table_name,
column_name,
language,
preference_name
}
Table 12-19 Parameter Details
| Parameter | Description | Required | Default Value |
|---|---|---|---|
| table_name | Name of the table (along with the optional table owner) in which you want to load the language data | Yes | No value |
| column_name | Column name in the language data table in which you want to load the language data | Yes | No value |
| language | Any supported language name, as listed in Supported Languages and Data File Locations | Yes | No value |
| preference_name | User-specified preference name for this language data | Yes | No value |
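Before you call the procedure, the table and column named by table_name and column_name need to hold your abbreviation tokens. The following is a sketch only: the table name eos_data_1 and the column name token are taken from the Example in this topic, while the NVARCHAR2(64) column type and the sample Indonesian abbreviations are assumptions chosen for illustration.

```sql
-- Assumed layout: one abbreviation token per row.
-- The column type is an assumption, not a documented requirement.
CREATE TABLE eos_data_1 (token NVARCHAR2(64));

-- Illustrative abbreviation tokens (sample values only)
INSERT INTO eos_data_1 VALUES ('jend.');
INSERT INTO eos_data_1 VALUES ('kpd.');
INSERT INTO eos_data_1 VALUES ('dll.');

COMMIT;
```

Each token is an abbreviation ending in a period, so the chunker can be told not to treat that period as an end-of-sentence marker.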
Example
declare
params CLOB := '{"table_name" : "eos_data_1",
"column_name" : "token",
"language" : "indonesian",
"preference_name" : "my_lang_1"}';
begin
DBMS_VECTOR_CHAIN.CREATE_LANG_DATA(
JSON (params));
end;
/
End-to-end example:
To run an end-to-end example scenario using this procedure, see Create and Use Custom Language Data.
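As a rough sketch of the step that typically follows loading, the query below passes the language chunking parameter to UTL_TO_CHUNKS so that sentence splitting uses the Indonesian language data. The input sentence and the values chosen for by, max, overlap, split, and normalize are assumptions for illustration; adjust them to your own data.

```sql
-- Chunk a short literal string by sentence, using language-specific rules.
-- The sample text and all parameter values except "language" are illustrative.
SELECT C.column_value AS chunk
FROM   DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
         'Surat itu ditujukan kpd. Jend. Sudirman. Balasan tiba esok hari.',
         JSON('{"by"        : "words",
                "max"       : "10",
                "overlap"   : "0",
                "split"     : "sentence",
                "language"  : "indonesian",
                "normalize" : "all"}')
       ) C;
```

Each returned row is a JSON document describing one chunk. If the abbreviation tokens were loaded as intended, periods inside abbreviations such as kpd. should not be treated as sentence boundaries.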