Federated Distributed Database Reference
SYNC SCHEMA Operations
DDL Synchronization
DDL synchronization is an operation that SYNC SCHEMA runs
just after the deployment of the shards in a federated distributed database configuration.
The goal of this operation is to import the object definitions from all of the shards, compare the definitions across the shards, and generate DDLs for the objects that exist on all of the shards (common objects). Once the DDLs are run and the objects are created, you can reference these objects in multi-shard queries.
Import Users
A user or schema is a candidate for import by SYNC SCHEMA
if it exists on all of the shards and owns importable schema objects.
You can narrow the list of users to be imported by passing a list of users
in the -SCHEMA parameter. For example,
gdsctl> sync schema -schema scottgdsctl> sync schema -schema scott,myschemaFor case-sensitive schemas use quoted identifiers.
gdsctl> sync schema -schema "O'Brien",scottTo include all non-Oracle schemas, use the value ALL in the
SCHEMA parameter.
gdsctl> sync schema -schema allBefore importing the users, SYNC SCHEMA verifies that any
discovered users exist on all shards, and no user already exists on the shard catalog
with the same name. The users are then created on the shard catalog as local users and
they are locked. Because these are local users, they only share the same name with
shards and are essentially the same as any other user that may have the same name across
different databases. Note that these users are not able to login and issue queries
because they are not all shard users. To issue multi-shard queries, an all shard
user must be created.
Note:
Only users local to a PDB are imported. Common CDB users are not imported.Grant User Roles and Priviledges
For the imported users, SYNC SCHEMA compares users'
privileges.
SYNC SCHEMA grants only the privileges that are granted on
all of the shards (common grants). A user A who has a DBA role on
shard1, but does not have DBA role on shard2, is not granted the DBA role in
the shard catalog.
Import Object Definitions
The objects compared and imported by SYNC SCHEMA to the
shard catalog are the objects that will be referenced in multi-shard queries or used by
multi-shard query processing.
These objects are:
- Tables
- Views and Materialized Views (exported as tables)
- Check Constraints
- Object Types
- Synonyms
Running SYNC SCHEMA does not import objects related to
storage, or objects that have no impact on multi-shard query processing, such as
tablespaces, indexes, indextypes, directories, or zone maps.
Schema Object Comparison
The objects, from one shard to another, can have different definitions.
SYNC SCHEMA compares the different definitions and creates a common
definition to enable multi-shard queries against imported objects.
SYNC SCHEMA detects the objects' differences at two levels:
number of objects, and object definitions.
First, SYNC SCHEMA considers the number of objects. It is
likely that, during an application upgrade, some objects are added to the schemas. Only
objects that are on all of the shards will be imported into the shard catalog.
Second, the object definitions from one shard to another can have different
attributes. For the objects that SYNC SCHEMA imports, the following
differences are noted:
Differences in Tables
When comparing objects in a federated distributed database configuration, some differences in tables have an impact on multi-shard queries and some do not.
Column Differences
Only column differences have an impact on multi-shard queries. SYNC
SCHEMA addresses only this difference.
-
The number of columns can be different.
-
The data type of a given column can be different.
-
The default value of a given column can be different.
-
The expression of a virtual column can be different
When a table has a different numbers of columns, SYNC
SCHEMA will opt for the creation of a table that contains the union of all
of the columns. Taking the union of all of the columns, compared to just taking the
intersection, will spare you from re-writing multi-shard queries in case of an
incremental deploy, when the added shard has fewer columns than indicated in the shard
catalog.
When a column has different data types, SYNC SCHEMA defines
it as the highest (largest) datatype.
When a column has different data types, and one of the columns is a user-defined object type, then that column is not imported into the shard catalog.
When a column has different default values, SYNC SCHEMA
sets NULL as the default value.
Nested table columns are not imported into the shard catalog.
Example: a Customer table is defined on shard1 and shard2 as shown here.
On shard1:
Customer( Cust_id number, Name varchar(30),
Address varchar(50),Zip_code number)
On shard2:
Customer( Cust_id varchar(20), Name varchar(30),
Address varchar(50),Zip_code number,
Country_code number)
Note that the column Cust_id is a number on shard1 and
a varchar(20) on shard2. Also, note that Country_code exists on
shard2 but does not exist on shard1.
The Customer table created by SYNC SCHEMA in the shard
catalog has all of the columns, including Country_code, and the
Cust_id type is varchar(20).
Customer( Cust_id varchar(20), Name varchar(30),
Address varchar(50),Zip_code number,
Country_code number)
SYNC SCHEMA keeps track of these differences between
schemas in the shard catalog. A query issued on the catalog database that accesses
these heterogeneous columns is rewritten to address the differences before it is
sent to the shards. On the shard, if there is a data type mismatch, the data is CAST
into the "superior" data type as created on the catalog. If the column is missing on
the shard, the default value is returned as set on the catalog.
Partition Scheme Differences
Note that this difference has no impact on multi-shard queries, and is ignored.
- Partitioning column can be different.
- Partition type can be different.
- Number of partitions can be different.
Storage Attribute Differences
Note that this difference has no impact on multi-shard queries, and is ignored.
- Tablespaces, on which the table is created, are different.
- The encryption can be different.
- The
INMEMORYattribute can be different.
Differences in Views
Views on shards are created and handled as tables in the shard catalog. The same restrictions that apply to tables also apply to views.
Troubleshooting a Federated Distributed Database
Solve common federated distributed database issues with these troubleshooting tips.
ORA-03851: Operation not supported in federated database
ORA-03701: Invalid parameter combination: federated database and ...
Some of the operations and command options that apply to a traditional distributed database are not applicable to a federated distributed database. This is because:
-
There is no concept of a chunk in a federated distributed database. Any chunk-related operation is invalid, for example
SPLIT CHUNKandMOVE CHUNK. -
The Data Guard broker configuration is not set up or managed by the system in federated distributed database, because the existing shards may already have been set up with their own high availability configurations. Operations such as
SET DATAGUARD_PROPERTYorMODIFY SHARDSPACEare not supported. -
The
CREATE SHARDcommand is not supported.
ORA-03885: Some primary shards are undeployed or unavailable
The SYNC SCHEMA operation requires that all primary
shards be available. Check the output of the CONFIG
SHARD command, and check the status of all primary
shards. Fix any issues and retry the operations when the shards
become available.
ORA-03871: Some DDL statements are not applied to the catalog
The SYNC SCHEMA operation cannot import
object definitions from the shards when some statements from the
previous issuance are still not applied on the shard catalog. Run
SYNC SCHEMA with the
-apply option to run these statements.
Handling Errors During Multi-Shard Queries
If a multi-shard query fails with this error due to a
mismatch of the object definition on the shard and the catalog, make
sure that the shard catalog has the latest schema changes imported.
Any time there are schema changes in the federated distributed database, you must run SYNC SCHEMA to import any changes
in the schemas on the shards.
Note that subsequent runs of SYNC SCHEMA
will not drop and recreate the object, but will generate
ALTER statements to incorporate the
definition changes. This ensures that if there are queries already
running during the SYNC SCHEMA operation, they
won't fail with invalid object errors.
Handling Errors During DDL Processing Phase
If DDL fails on the shard catalog, the status of each DDL can
be examined with the SYNC SCHEMA
-show option.
gdsctl> sync schema -show
Note: The SYNC SCHEMA -show command is different
from the command SHOW DDL. SHOW
DDL lists DDL statements run by an all-shard user
that are first run on the catalog and then propagated to the shards,
whereas SYNC SCHEMA -show DDL statements are
generated from the objects imported from shards.
By default, SYNC SCHEMA -show lists a fixed number of
the latest DDLs. The -count and
-ddl options can be used to inspect
specific range of DDLs. For example,
gdsctl> sync schema -show -count 20
gdsctl> sync schema -show -count 20 -ddl 5
To check the complete DDL text and error message, if any, use the
-ddl option.
gdsctl> sync schema -show -ddl 5
To list only the failed DDL statements, use the
-failed_only option.
gdsctl> sync schema –failed_only
Based on the error message of the failed DDL, fix the cause of the error and perform the apply phase.
gdsctl> sync schema -apply
The SYNC SCHEMA command also has a
-restart option to perform the complete
operation from the beginning as if it were run for the first time.
This option will DROP all existing schemas imported
during all previous runs of SYNC SCHEMA and any
related metadata. Be aware that this will cause any running queries
on these objects to fail.
gdsctl> sync schema -restart