Current File : //usr/share/doc/postgresql-9.2.24/html/unaccent.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>unaccent</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REV="MADE"
HREF="mailto:pgsql-docs@postgresql.org"><LINK
REL="HOME"
TITLE="PostgreSQL 9.2.24 Documentation"
HREF="index.html"><LINK
REL="UP"
TITLE="Additional Supplied Modules"
HREF="contrib.html"><LINK
REL="PREVIOUS"
TITLE="tsearch2"
HREF="tsearch2.html"><LINK
REL="NEXT"
TITLE="uuid-ossp"
HREF="uuid-ossp.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="stylesheet.css"><META
HTTP-EQUIV="Content-Type"
CONTENT="text/html; charset=ISO-8859-1"><META
NAME="creation"
CONTENT="2017-11-06T22:43:11"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="5"
ALIGN="center"
VALIGN="bottom"
><A
HREF="index.html"
>PostgreSQL 9.2.24 Documentation</A
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
TITLE="tsearch2"
HREF="tsearch2.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
HREF="contrib.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="60%"
ALIGN="center"
VALIGN="bottom"
>Appendix F. Additional Supplied Modules</TD
><TD
WIDTH="20%"
ALIGN="right"
VALIGN="top"
><A
TITLE="uuid-ossp"
HREF="uuid-ossp.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="UNACCENT"
>F.39. unaccent</A
></H1
><P
>  <TT
CLASS="FILENAME"
>unaccent</TT
> is a text search dictionary that removes accents
  (diacritic signs) from lexemes.
  It's a filtering dictionary, which means its output is
  always passed to the next dictionary (if any), unlike the normal
  behavior of dictionaries.  This allows accent-insensitive processing
  for full text search.
 </P
><P
>  The current implementation of <TT
CLASS="FILENAME"
>unaccent</TT
> cannot be used as a
  normalizing dictionary for the <TT
CLASS="FILENAME"
>thesaurus</TT
> dictionary.
 </P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="AEN152712"
>F.39.1. Configuration</A
></H2
><P
>   An <TT
CLASS="LITERAL"
>unaccent</TT
> dictionary accepts the following options:
  </P
><P
></P
><UL
><LI
><P
>     <TT
CLASS="LITERAL"
>RULES</TT
> is the base name of the file containing the list of
     translation rules.  This file must be stored in
     <TT
CLASS="FILENAME"
>$SHAREDIR/tsearch_data/</TT
> (where <TT
CLASS="LITERAL"
>$SHAREDIR</TT
> means
     the <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> installation's shared-data directory).
     Its name must end in <TT
CLASS="LITERAL"
>.rules</TT
> (which is not to be included in
     the <TT
CLASS="LITERAL"
>RULES</TT
> parameter).
    </P
></LI
></UL
><P
>   The rules file has the following format:
  </P
><P
></P
><UL
><LI
><P
>     Each line represents a pair, consisting of a character with accent
     followed by a character without accent.  The first is translated into
     the second.  For example,
</P><PRE
CLASS="PROGRAMLISTING"
>&Agrave;        A
&Aacute;        A
&Acirc;        A
&Atilde;        A
&Auml;        A
&Aring;        A
&AElig;        A</PRE
><P>
    </P
></LI
></UL
><P
>   A more complete example, which is directly useful for most European
   languages, can be found in <TT
CLASS="FILENAME"
>unaccent.rules</TT
>, which is installed
   in <TT
CLASS="FILENAME"
>$SHAREDIR/tsearch_data/</TT
> when the <TT
CLASS="FILENAME"
>unaccent</TT
>
   module is installed.
  </P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="AEN152734"
>F.39.2. Usage</A
></H2
><P
>   Installing the <TT
CLASS="LITERAL"
>unaccent</TT
> extension creates a text
   search template <TT
CLASS="LITERAL"
>unaccent</TT
> and a dictionary <TT
CLASS="LITERAL"
>unaccent</TT
>
   based on it.  The <TT
CLASS="LITERAL"
>unaccent</TT
> dictionary has the default
   parameter setting <TT
CLASS="LITERAL"
>RULES='unaccent'</TT
>, which makes it immediately
   usable with the standard <TT
CLASS="FILENAME"
>unaccent.rules</TT
> file.
   If you wish, you can alter the parameter, for example

</P><PRE
CLASS="PROGRAMLISTING"
>mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');</PRE
><P>

   or create new dictionaries based on the template.
  </P
><P
>   To test the dictionary, you can try:
</P><PRE
CLASS="PROGRAMLISTING"
>mydb=# select ts_lexize('unaccent','H&ocirc;tel');
 ts_lexize
-----------
 {Hotel}
(1 row)</PRE
><P>
  </P
><P
>   Here is an example showing how to insert the
   <TT
CLASS="FILENAME"
>unaccent</TT
> dictionary into a text search configuration:
</P><PRE
CLASS="PROGRAMLISTING"
>mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
mydb=# ALTER TEXT SEARCH CONFIGURATION fr
        ALTER MAPPING FOR hword, hword_part, word
        WITH unaccent, french_stem;
mydb=# select to_tsvector('fr','H&ocirc;tels de la Mer');
    to_tsvector
-------------------
 'hotel':1 'mer':4
(1 row)

mydb=# select to_tsvector('fr','H&ocirc;tel de la Mer') @@ to_tsquery('fr','Hotels');
 ?column?
----------
 t
(1 row)

mydb=# select ts_headline('fr','H&ocirc;tel de la Mer',to_tsquery('fr','Hotels'));
      ts_headline
------------------------
 &lt;b&gt;H&ocirc;tel&lt;/b&gt; de la Mer
(1 row)</PRE
><P>
  </P
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="AEN152749"
>F.39.3. Functions</A
></H2
><P
>  The <CODE
CLASS="FUNCTION"
>unaccent()</CODE
> function removes accents (diacritic signs) from
  a given string.  Basically, it's a wrapper around the
  <TT
CLASS="FILENAME"
>unaccent</TT
> dictionary, but it can be used outside normal
  text search contexts.
 </P
><PRE
CLASS="SYNOPSIS"
>unaccent([<SPAN
CLASS="OPTIONAL"
><TT
CLASS="REPLACEABLE"
><I
>dictionary</I
></TT
>, </SPAN
>] <TT
CLASS="REPLACEABLE"
><I
>string</I
></TT
>) returns <TT
CLASS="TYPE"
>text</TT
></PRE
><P
>  For example:
</P><PRE
CLASS="PROGRAMLISTING"
>SELECT unaccent('unaccent', 'H&ocirc;tel');
SELECT unaccent('H&ocirc;tel');</PRE
><P>
 </P
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="tsearch2.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="uuid-ossp.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>tsearch2</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="contrib.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>uuid-ossp</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>