The behavior of a custom text search configuration can easily become confusing. The functions described in this section are useful for testing text search objects. You can test a complete configuration, or test parsers and dictionaries separately.
   The function ts_debug allows easy testing of a
   text search configuration.
  
ts_debug([configregconfig, ]documenttext, OUTaliastext, OUTdescriptiontext, OUTtokentext, OUTdictionariesregdictionary[], OUTdictionaryregdictionary, OUTlexemestext[]) returns setof record
   ts_debug displays information about every token of
   document as produced by the
   parser and processed by the configured dictionaries.  It uses the
   configuration specified by config,
   or default_text_search_config if that argument is
   omitted.
  
   ts_debug returns one row for each token identified in the text
   by the parser.  The columns returned are
    
       alias text — short name of the token type
      
       description text — description of the
       token type
      
       token text — text of the token
      
       dictionaries regdictionary[] — the
       dictionaries selected by the configuration for this token type
      
       dictionary regdictionary — the dictionary
       that recognized the token, or NULL if none did
      
       lexemes text[] — the lexeme(s) produced
       by the dictionary that recognized the token, or NULL if
       none did; an empty array ({}) means it was recognized as a
       stop word
      
Here is a simple example:
SELECT * FROM ts_debug('english','a fat  cat sat on a mat - it ate a fat rats');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes 
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | a     | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | fat   | {english_stem} | english_stem | {fat}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | cat   | {english_stem} | english_stem | {cat}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | sat   | {english_stem} | english_stem | {sat}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | on    | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | a     | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | mat   | {english_stem} | english_stem | {mat}
 blank     | Space symbols   |       | {}             |              | 
 blank     | Space symbols   | -     | {}             |              | 
 asciiword | Word, all ASCII | it    | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | ate   | {english_stem} | english_stem | {ate}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | a     | {english_stem} | english_stem | {}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | fat   | {english_stem} | english_stem | {fat}
 blank     | Space symbols   |       | {}             |              | 
 asciiword | Word, all ASCII | rats  | {english_stem} | english_stem | {rat}
   For a more extensive demonstration, we
   first create a public.english configuration and
   Ispell dictionary for the English language:
  
CREATE TEXT SEARCH CONFIGURATION public.english ( COPY = pg_catalog.english );
CREATE TEXT SEARCH DICTIONARY english_ispell (
    TEMPLATE = ispell,
    DictFile = english,
    AffFile = english,
    StopWords = english
);
ALTER TEXT SEARCH CONFIGURATION public.english
   ALTER MAPPING FOR asciiword WITH english_ispell, english_stem;SELECT * FROM ts_debug('public.english','The Brightest supernovaes');
   alias   |   description   |    token    |         dictionaries          |   dictionary   |   lexemes   
-----------+-----------------+-------------+-------------------------------+----------------+-------------
 asciiword | Word, all ASCII | The         | {english_ispell,english_stem} | english_ispell | {}
 blank     | Space symbols   |             | {}                            |                | 
 asciiword | Word, all ASCII | Brightest   | {english_ispell,english_stem} | english_ispell | {bright}
 blank     | Space symbols   |             | {}                            |                | 
 asciiword | Word, all ASCII | supernovaes | {english_ispell,english_stem} | english_stem   | {supernova}   In this example, the word Brightest was recognized by the
   parser as an ASCII word (alias asciiword).
   For this token type the dictionary list is
   english_ispell and
   english_stem. The word was recognized by
   english_ispell, which reduced it to the noun
   bright. The word supernovaes is
   unknown to the english_ispell dictionary so it
   was passed to the next dictionary, and, fortunately, was recognized (in
   fact, english_stem is a Snowball dictionary which
   recognizes everything; that is why it was placed at the end of the
   dictionary list).
  
   The word The was recognized by the
   english_ispell dictionary as a stop word (Section 12.6.1) and will not be indexed.
   The spaces are discarded too, since the configuration provides no
   dictionaries at all for them.
  
You can reduce the width of the output by explicitly specifying which columns you want to see:
SELECT alias, token, dictionary, lexemes
FROM ts_debug('public.english','The Brightest supernovaes');
   alias   |    token    |   dictionary   |   lexemes   
-----------+-------------+----------------+-------------
 asciiword | The         | english_ispell | {}
 blank     |             |                | 
 asciiword | Brightest   | english_ispell | {bright}
 blank     |             |                | 
 asciiword | supernovaes | english_stem   | {supernova}
The following functions allow direct testing of a text search parser.
ts_parse(parser_nametext,documenttext, OUTtokidinteger, OUTtokentext) returnssetof recordts_parse(parser_oidoid,documenttext, OUTtokidinteger, OUTtokentext) returnssetof record
   ts_parse parses the given document
   and returns a series of records, one for each token produced by
   parsing. Each record includes a tokid showing the
   assigned token type and a token which is the text of the
   token.  For example:
SELECT * FROM ts_parse('default', '123 - a number');
 tokid | token
-------+--------
    22 | 123
    12 |
    12 | -
     1 | a
    12 |
     1 | number
ts_token_type(parser_nametext, OUTtokidinteger, OUTaliastext, OUTdescriptiontext) returnssetof recordts_token_type(parser_oidoid, OUTtokidinteger, OUTaliastext, OUTdescriptiontext) returnssetof record
   ts_token_type returns a table which describes each type of
   token the specified parser can recognize.  For each token type, the table
   gives the integer tokid that the parser uses to label a
   token of that type, the alias that names the token type
   in configuration commands, and a short description.  For
   example:
SELECT * FROM ts_token_type('default');
 tokid |      alias      |               description                
-------+-----------------+------------------------------------------
     1 | asciiword       | Word, all ASCII
     2 | word            | Word, all letters
     3 | numword         | Word, letters and digits
     4 | email           | Email address
     5 | url             | URL
     6 | host            | Host
     7 | sfloat          | Scientific notation
     8 | version         | Version number
     9 | hword_numpart   | Hyphenated word part, letters and digits
    10 | hword_part      | Hyphenated word part, all letters
    11 | hword_asciipart | Hyphenated word part, all ASCII
    12 | blank           | Space symbols
    13 | tag             | XML tag
    14 | protocol        | Protocol head
    15 | numhword        | Hyphenated word, letters and digits
    16 | asciihword      | Hyphenated word, all ASCII
    17 | hword           | Hyphenated word, all letters
    18 | url_path        | URL path
    19 | file            | File or path name
    20 | float           | Decimal notation
    21 | int             | Signed integer
    22 | uint            | Unsigned integer
    23 | entity          | XML entity
    The ts_lexize function facilitates dictionary testing.
   
ts_lexize(dictregdictionary,tokentext) returnstext[]
    ts_lexize returns an array of lexemes if the input
    token is known to the dictionary,
    or an empty array if the token
    is known to the dictionary but it is a stop word, or
    NULL if it is an unknown word.
   
Examples:
SELECT ts_lexize('english_stem', 'stars');
 ts_lexize
-----------
 {star}
SELECT ts_lexize('english_stem', 'a');
 ts_lexize
-----------
 {}
     The ts_lexize function expects a single
     token, not text. Here is a case
     where this can be confusing:
SELECT ts_lexize('thesaurus_astro','supernovae stars') is null;
 ?column?
----------
 t
     The thesaurus dictionary thesaurus_astro does know the
     phrase supernovae stars, but ts_lexize
     fails since it does not parse the input text but treats it as a single
     token. Use plainto_tsquery or to_tsvector to
     test thesaurus dictionaries, for example:
SELECT plainto_tsquery('supernovae stars');
 plainto_tsquery
-----------------
 'sn'