Column Collation and Case Insensitive Database
Oracle Database – Enterprise Edition – Version 22.214.171.124 and later
Information in this document applies to any platform.
Explain the column level collation option available in 12cR2. This feature can be extended further to have a Case Insensitive Database.
This document explains the enhancements made to linguistic sorting and searching of character data or strings in Oracle Database 12cR2.
The process of determining the mutual ordering of strings (character values) is called a collation or sort ordering. There are many possible collations that can be applied to strings to determine their ordering.
Binary collation orders strings based on their binary representation (character encoding), treating each string as a simple sequences of bytes. Collations that take into consideration the standards and customs of spoken languages are called linguistic collations. They order strings in the same way as dictionaries, phone directories, and other text lists written in a given language. Oracle Database provides case-insensitive and accent-insensitive options for collation. Examples of binary named collation are: BINARY, BINARY_CI (case-insensitive binary collation), and BINARY_AI (accent-insensitive and case-insensitive binary collation). Examples of linguistic named collation are: GENERIC_M, GENERIC_M_AI, FRENCH, POLISH, UCA0700_CFRENCH, and so on. For details, refer to the document: Linguistic Sorting – Frequently Asked Questions Note 227335.1.
Column Level Collation
From 12cR2, collations can be declared at a column-level and Oracle supports case-insensitive and accent-insensitive collations. By assigning such collation to a column, you can easily force all comparisons of column values to be case-insensitive or accent-insensitive or both. This type of collation, which is associated with a particular data, is called the data-bound collation .
The only allowed collation for CLOB and NCLOB columns is USING_NLS_COMP.
To enable the data-bound collation feature, set the following database initialization parameter values:
In 12cR2, data-bound collation can be specified for:
- Table columns
- Cluster columns
- Schemas through the owning user
- Views and materialized views
- PL/SQL units, such as procedures, functions, packages, types, and triggers
- SQL expressions
How to Specify Column Level Collation
A data-bound collation can be explicitly specified for columns of character data types VARCHAR2, CHAR, LONG, CLOB, NVARCHAR2, NCHAR, and NCLOB using COLLATE clause of a standard or a virtual column definition in a CREATE TABLE or ALTER TABLE statement.
1. Example: Adding a column with collation declaration
2. Example: Changing the collation of a column
3. Example: Creating table with default collation set for columns
CREATE TABLE my_table
my_column VARCHAR2(100) COLLATE POLISH,
Case Insensitive Database
The column level collation explained in the previous section is extended to table level and user level too. . The column collation, if not specified explicitly, is inherited from a table default collation, which in turn is inherited from a schema default collation. This way, you can easily declare all character columns of an application in a database as case-insensitive.
Table Default Collation.
You can specify a default data-bound collation for a table using the
DEFAULT COLLATION clause in the
CREATE TABLE and
ALTER TABLE statements.
1. Example: Applying a default collation to a table while creating a table
CREATE TABLE employees
emp_code VARCHAR2(10) PRIMARY KEY,
job_code VARCHAR2(5) COLLATE BINARY,
DEFAULT COLLATION BINARY_CI
— other CREATE TABLE clauses
The columns emp_code, first_name, and last_name inherit the table default collation BINARY_CI. The column job_code is declared explicitly with the collation BINARY.
2. Example: Changing the default collation of a table
This statement changes the default collation of the table employees to the pseudo-collation USING_NLS_COMP. Any new VARCHAR2, CHAR, NVARCHAR2, NCHAR, and LONG columns added to the table after the ALTER TABLE statement is executed, inherits the new collation, unless these columns are declared with an explicit collation or belong to a foreign key. The collations of the existing columns are not affected.
Schema level Collation
You can specify a default data-bound collation for a schema using the
DEFAULT COLLATION clause in the
CREATE USER and
ALTER USER statements. The schema default collation determines the effective schema default collation that is assigned as the default collation for all the tables, views, materialized views, PL/SQL units, and user-defined types (UDTs) created in that schema, if these database objects do not have explicitly declared default collations.
If a schema default collation is not specified explicitly in the
CREATE USER statement, then it is set to
USING_NLS_COMP collation. You can change the schema default collation with the
ALTER USER statement.
1. Example: Applying a default collation to a schema
CREATE USER hrsys
IDENTIFIED BY password
DEFAULT TABLESPACE hr_ts_1
DEFAULT COLLATION BINARY
— the clauses after password can be in any order
This statement creates a new database user hrsys with its schema. The default collation of the schema is set to BINARY. All database objects created in the schema that do not contain the DEFAULT COLLATION clause have their default collation set to BINARY, unless the session parameter DEFAULT_COLLATION overrides it.
2. Example: Changing the default collation of a schema
- If the
DEFAULT_COLLATIONparameter is specified for a session, then it overrides the default collation of a schema referenced in that session.
In short, for Case insentive database, the collation can be specified in the User (or application) level, table level or in the Colum level.
- If collation for a column is not specified explicitly in the DDL statement that creates the column (in the CREATE TABLE or ALTER TABLE ADD statement), then the containing table’s default collation is used for the column.
- If the DDL statement creating a table does not specify a default collation, then the default collation of the schema owning the table is used as the default collation for the table.
- If the CREATE USER statement does not specify the default collation for a schema, then the collation USING_NLS_COMP is used.