Skip to content

Column Collation and Case Insensitive Database

APPLIES TO:

Oracle Database – Enterprise Edition – Version 12.2.0.1 and later
Information in this document applies to any platform.

PURPOSE

Explain the column level collation option available in 12cR2. This feature can be extended further to have a Case Insensitive Database.

SCOPE

This document explains the enhancements made to linguistic sorting and searching of character data or strings in Oracle Database 12cR2.

DETAILS

The process of determining the mutual ordering of strings (character values) is called a collation or sort ordering. There are many possible collations that can be applied to strings to determine their ordering.

Binary collation orders strings based on their binary representation (character encoding), treating each string as a simple sequences of bytes. Collations that take into consideration the standards and customs of spoken languages are called linguistic collations. They order strings in the same way as dictionaries, phone directories, and other text lists written in a given language. Oracle Database provides case-insensitive and accent-insensitive options for collation. Examples of binary named collation are: BINARY, BINARY_CI (case-insensitive binary collation), and BINARY_AI (accent-insensitive and case-insensitive binary collation). Examples of linguistic named collation are: GENERIC_M, GENERIC_M_AI, FRENCH, POLISH, UCA0700_CFRENCH, and so on. For details, refer to the document: Linguistic Sorting – Frequently Asked Questions Note 227335.1.

Column Level Collation

From 12cR2, collations can be declared at a column-level and Oracle supports case-insensitive and accent-insensitive collations. By assigning such collation to a column, you can easily force all comparisons of column values to be case-insensitive or accent-insensitive or both. This type of collation, which is associated with a particular data, is called the data-bound collation .

A data-bound collation can be applied only to the values of character data types — VARCHAR2, CHAR, LONG, NVARCHAR2, NCHAR, CLOB, and NCLOB.

The only allowed collation for CLOB and NCLOB columns is USING_NLS_COMP.

To enable the data-bound collation feature, set the following database initialization parameter values:

MAX_STRING_SIZE=EXTENDED

COMPATIBLE>=12.2

In 12cR2, data-bound collation can be specified for:

  • Table columns
  • Cluster columns
  • Tables
  • Schemas through the owning user
  • Views and materialized views
  • PL/SQL units, such as procedures, functions, packages, types, and triggers
  • SQL expressions

How to Specify Column Level Collation

A data-bound collation can be explicitly specified for columns of character data types VARCHAR2, CHAR, LONG, CLOB, NVARCHAR2, NCHAR, and NCLOB using COLLATE clause of a standard or a virtual column definition in a CREATE TABLE or ALTER TABLE statement.

1. Example: Adding a column with collation declaration

ALTER TABLE employees ADD gender VARCHAR2(1) COLLATE BINARY_CI;

2. Example: Changing the collation of a column

ALTER TABLE employees MODIFY job_code COLLATE BINARY_CI;

3. Example: Creating table with default collation set for columns

CREATE TABLE my_table

(

my_column VARCHAR2(100) COLLATE POLISH,

);

 

Case Insensitive Database

The column level collation explained in the previous section is extended to table level and user level too. . The column collation, if not specified explicitly, is inherited from a table default collation, which in turn is inherited from a schema default collation. This way, you can easily declare all character columns of an application in a database as case-insensitive.

Table Default Collation.

You can specify a default data-bound collation for a table using the DEFAULT COLLATION clause in the CREATE TABLE and ALTER TABLE statements.

1. Example: Applying a default collation to a table while creating a table

CREATE TABLE employees

(

emp_code VARCHAR2(10) PRIMARY KEY,

first_name VARCHAR2(100),

last_name VARCHAR2(200),

job_code VARCHAR2(5) COLLATE BINARY,

dep_code NUMBER

)

DEFAULT COLLATION BINARY_CI

— other CREATE TABLE clauses

/

The columns emp_code, first_name, and last_name inherit the table default collation BINARY_CI. The column job_code is declared explicitly with the collation BINARY.

2. Example: Changing the default collation of a table

ALTER TABLE employees DEFAULT COLLATION USING_NLS_COMP;

This statement changes the default collation of the table employees to the pseudo-collation USING_NLS_COMP. Any new VARCHAR2, CHAR, NVARCHAR2, NCHAR, and LONG columns added to the table after the ALTER TABLE statement is executed, inherits the new collation, unless these columns are declared with an explicit collation or belong to a foreign key. The collations of the existing columns are not affected.

Schema level Collation

You can specify a default data-bound collation for a schema using the DEFAULT COLLATION clause in the CREATE USER and ALTER USER statements. The schema default collation determines the effective schema default collation that is assigned as the default collation for all the tables, views, materialized views, PL/SQL units, and user-defined types (UDTs) created in that schema, if these database objects do not have explicitly declared default collations.

If a schema default collation is not specified explicitly in the CREATE USER statement, then it is set to USING_NLS_COMP collation. You can change the schema default collation with the ALTER USER statement.

1. Example: Applying a default collation to a schema

CREATE USER hrsys

IDENTIFIED BY password

DEFAULT TABLESPACE hr_ts_1

DEFAULT COLLATION BINARY

ACCOUNT LOCK

— the clauses after password can be in any order

/

This statement creates a new database user hrsys with its schema. The default collation of the schema is set to BINARY. All database objects created in the schema that do not contain the DEFAULT COLLATION clause have their default collation set to BINARY, unless the session parameter DEFAULT_COLLATION overrides it.

2. Example: Changing the default collation of a schema

ALTER USER hrsys DEFAULT COLLATION USING_NLS_COMP;
  • If the DEFAULT_COLLATION parameter is specified for a session, then it overrides the default collation of a schema referenced in that session.

 

A schema default collation cannot be changed for an Oracle-supplied database user.

 

In short, for Case insentive database, the collation can be specified in the User (or application) level, table level or in the Colum level.

  • If collation for a column is not specified explicitly in the DDL statement that creates the column (in the CREATE TABLE or ALTER TABLE ADD statement), then the containing table’s default collation is used for the column.
  • If the DDL statement creating a table does not specify a default collation, then the default collation of the schema owning the table is used as the default collation for the table.
  •  If the CREATE USER statement does not specify the default collation for a schema, then the collation USING_NLS_COMP is used.