PostgreSQL pattern matching indexes

The two basic types of indexes in PostgreSQL are B-trees and Hash. Personally I have only used the B-trees, but recently a friend of mine asked me a question about the way PostgreSQL handles indexes on text fields internally. The question was: “Assume that you have created an index on a varchar field and then you issue a query to retrieve some results. Does your query use the index with the LIKE operator?”

The answer to that actually was No.

I had never thought of testing that. So we decided to make a little test. Our test involves one table a with an id and a varchar field. So our table is:

create table a (
id integer primary key,
name varchar(30) );

Now let us create an index on the Name field.

create index name on a(name);

Finally we insert two rows in the table.

insert into a values(1, 'eleni');
insert into a values(2, 'elena');

Next we are going to test the usage of the indexes created (one for the primary key of the table which is integer and the one we created for the Name field) we are going to issue some simple queries and use the postgresql command explain.

Test 1: Use the primary key to retrieve a single matching record.

explain select * from a where id = 1;

The result for this query is shown in the following image. The index that is by default created for the primary key of our table is used to retrieve the desired record.

Test 1 for PostgreSQL images

Test 2: Issue a query on the field name to retrieve some records. The match should be exact (search for the string exactly as we enter it).

explain select * from a where name = 'eleni';

The result for this query is shown in the following image. The index we created is used to retrieve the matching records.

Test 2 for PostgreSQL Indexes

Test 3: Issue a query that will use pattern matching to retrieve all matching records.

explain select * from a where name like 'elen%';

The result for this query is shown in the following image. The index is not used in this case. The whole table is traversed to retrieve the matching results.

Test 3 for PostgreSQL Indexes

Why?

The thing is that the default type of index used by PostgreSQL is the B-tree. For these indexes the supported operations are < , <= , = , >= , > (from the documentation of PostgreSQL http://www.postgresql.org/docs/9.0/static/indexes-types.html). This means that pattern matching operations are not (by default) supported by our B-trees.

In order to create an index that supports pattern matching operations (with some limitations) for text fields (text, varchar, char) we should create the index a bit differently by adding an operator class. Fore the three types of text fields mentioned:

  • text => text_pattern_ops
  • varchar => varchar_pattern_ops
  • char => bpchar_pattern_ops

(from the documentation of PostgreSQL http://www.postgresql.org/docs/9.0/static/indexes-opclass.html).

So how would we use these operators in the above example? We should first drop and recreate the table.

drop table a;

create table a (
id integer primary key,
name varchar(30)
);

Then we would create the index as follows

create index name on a(name varchar_pattern_ops);

After inserting the same two lines used in the examples above we can now re-execute the last explain query to see the results of our changes (there is no difference in the first two explains).

explain select * from a where name like 'elen%';
Test 4 for PostgreSQL queries

As we can see now the index is being used for the pattern matching operation.

Important Note:

  • If you also want to use the operators < , >=, = etc. you should create 2 indexes (one as you would normally do and one with the varchar_pattern_ops.

One thought on “PostgreSQL pattern matching indexes

  1. Correction on your final “important note”: you DO have to create 2 indexes if you want to be able to use the “=” operators, however you absolutely DON’T need an extra index for just the “=” operator – the varchar_pattern_ops index works fine for this purpose. As per the documentation:

    Note that you should also create an index with the default operator class if you want queries involving ordinary <, , or >= comparisons to use an index. Such queries cannot use the xxx_pattern_ops operator classes. (Ordinary equality comparisons can use these operator classes, however.)

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s