New Phonetic Design
- To: openldap-devel@OpenLDAP.org
- Subject: New Phonetic Design
- From: Alexandre PAUZIES <apauzies@linagora.com>
- Date: Wed, 22 Sep 2004 12:02:49 +0200
- Organization: LINAGORA - http://www.linagora.com
- User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.3.50 (gnu/linux)
Hi,
As you know, OpenLDAP uses phonetic functions for approximate
matching. I have written a new phonetic mechanism for OpenLDAP 2.2.17.
For now it ships with one set of French rules, but new sets of rules
can easily be added.
Why do we need a new design?
############################
- each new phonetic algorithm/language requires implementing a new
function, adding a #define, etc.
- phonetic functions are not easy to understand or implement
- using strcmp for matching does not allow a flexible match (see
SLAPD_PHONETIC_V2_PRECISION)
- using #define does not allow switching algorithm/language at
runtime (so it cannot be used with language codes)
What does this design look like?
################################
- A new language can easily be added with a new entry (lang, rules,
post-rules) in the phonetic language table.
- A new algorithm can easily be added by writing a simple table of
rules.
- Each rule is an action (find/replace...) with a set of conditions (is
preceded by...), which are easy to implement.
- Each set of post-phonetic rules is a simple table of ordered
characters.
- The precision of this phonetic mechanism can easily be changed.
- The default phonetic language can be set in the config file.
How does it work?
#################
1) The phonetic rules
---------------------
- You need to write the phonetic rules for your own language/algorithm.
Here I define rules for the French language and the phonex algorithm
(by Frederic BROUARD):
static rule_t phonetic_rules_fr_phonex[] =
{
}
A rule is defined by an action (e.g. FIND_REPLACE) with its arguments
(e.g. "h" -> "") and by a set of conditions (e.g. NOT PRECEDED BY 'c' OR
's' OR 'p').
This example:
{ {FIND_REPLACE, {"h", ""}}, {{PRECEDED, "csp", NOT|OR}} },
deletes every character 'h' that is not preceded by the character 'c',
's' or 'p'.
You can write rules with more than one condition, like this:
{ {FIND_REPLACE, {"s", "z"}}, {{FOLLOWED, "aeiou1234", OR},
{PRECEDED, "aeiou1234", OR}} },
Another example: to delete the character 't' when it ends the word:
{ {FIND_REPLACE, {"t", ""}}, {{FOLLOWED, ALL, AND|NOT}} },
etc.
You can find more examples in "phonetic.h".
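To make the effect of the 'h' rule concrete, here is a minimal
standalone sketch in C. It is not the code from the patch:
delete_unless_preceded is a hypothetical helper that hard-codes one
action (delete a character) and one condition (NOT PRECEDED by any of a
set of characters), whereas the patch drives the same behaviour from
the rule table.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not part of the patch).
 * Deletes every occurrence of `target` in `word`, in place, unless the
 * character just before it is one of `guards`. */
static void delete_unless_preceded(char *word, char target, const char *guards)
{
    char *src = word;   /* read cursor */
    char *dst = word;   /* write cursor, never ahead of src */
    char prev = '\0';   /* previous character read; '\0' at word start */

    assert(word != NULL && guards != NULL);

    for (; *src; src++) {
        /* strchr() would also match the terminating NUL, so the start
         * of the word is tested explicitly. */
        int guarded = (prev != '\0') && (strchr(guards, prev) != NULL);

        if (*src != target || guarded)
            *dst++ = *src;      /* keep this character */
        prev = *src;
    }
    *dst = '\0';
}
```

For instance, applied to the buffer "thomas" with target 'h' and guards
"csp", the helper leaves "tomas", while "chat" is left unchanged.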
2) The post-phonetic rules
--------------------------
At this point you have a phonetic function that returns a phonetic copy
of the word (like the old function), but you cannot choose how
permissive/flexible the match will be. That is what the post_phonetic
function is for.
You define post-phonetic rules by assigning an integer (the position of
the character in the "char tab[]") to each character that your phonetic
algorithm does not replace or delete.
These rules are used to convert the phonetic copy of the word into a
string representing a float value.
Example:
static char phonetic_post_rules_fr_phonex[22] =
{
'1', '2', '3', '4', '5', 'e', 'f', 'g', 'h', 'i', 'k',
'l', 'n', 'o', 'r', 's', 't', 'u', 'w', 'x', 'y', 'z'
};
This assigns number 0 to char '1' ... and number 21 to char 'z'.
In this example, each number is treated as a base-22 digit: the digit at
position i (counting from 0) is weighted by 22^-(i+1), and the sum of
the weighted digits is a float. This float is stored in a string.
So, to set the precision/flexibility of this new phonetic mechanism, set
SLAPD_PHONETIC_V2_PRECISION (in schema_init.c) to the number of
significant figures to compare in the float (string) value. Note that
strncmp counts characters, so the leading "0." is included.
The match is then done with
strncmp(word, post_phonetic_word, SLAPD_PHONETIC_V2_PRECISION).
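The conversion and the prefix comparison can be sketched as follows.
This is a simplified standalone illustration, not the patch itself:
phonetic_key and PRECISION are hypothetical names, the arithmetic is
done in double rather than with powf, and a character missing from the
table is reported as an error (an assumption of this sketch).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PRECISION 7  /* plays the role of SLAPD_PHONETIC_V2_PRECISION */

/* The 22 ordered symbols of the French example table. */
static const char post_rules[] = "12345efghiklnorstuwxyz";

/* Hypothetical helper: writes the base-22 fraction of `word` into `out`
 * as a decimal string. Returns 0 on success, -1 on error. */
static int phonetic_key(const char *word, char *out, size_t outlen)
{
    const double base = (double)(sizeof(post_rules) - 1);   /* 22 */
    double weight = 1.0 / base;     /* weight of the first digit */
    double sum = 0.0;
    int n;

    assert(word != NULL && out != NULL);

    for (; *word; word++, weight /= base) {
        const char *hit = strchr(post_rules, *word);
        if (hit == NULL)
            return -1;              /* character not in the table */
        sum += (double)(hit - post_rules) * weight;
    }

    n = snprintf(out, outlen, "%.20f", sum);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a precision of 7, the keys of "norn" and "nornoz" compare equal
under strncmp even though the full strings differ, while "ton" and "fon"
already differ on the first digit after the decimal point.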
3) The phonetic language table
------------------------------
Once you have defined your phonetic and post-phonetic rules, you need to
add them for your language to phonetic_lang[]:
static phonetic_t phonetic_lang[] =
{
{"fr", phonetic_rules_fr_phonex, phonetic_post_rules_fr_phonex},
{NULL, NULL, NULL},
};
4) slapd.conf
-------------
Set the default "lang" option in your slapd.conf like this:
lang fr
so that the phonetic function knows which rules to use for your
language.
5) Enable the new phonetic mechanism
------------------------------------
Finally, pass the "--enable-phonetic2" option to your configure script.
To do:
######
More actions/conditions may need to be added to this new mechanism to
suit all languages.
The LDAP_UTF8_APPROX flag passed to UTF8bvnormalize could be a problem
(i.e. I cannot run actions or check conditions on accented characters).
The "lang" option in the config file should be the default language and
not the only one: for attributes with language codes we should select
the corresponding phonetic rules if there are any, and otherwise the
default ones (defined in the config file).
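The fallback described in the last item could look roughly like this.
It is only a sketch of the idea, not something the patch implements:
lang_entry_t, lang_table, find_lang and select_lang are hypothetical
names, the rule tables are replaced by a plain string, and the
attribute's language code is assumed to be already reduced to a bare
code such as "fr".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an entry of the phonetic language table. */
typedef struct lang_entry_s {
    const char *lang;        /* language code, NULL terminates the table */
    const char *rules_name;  /* placeholder for the real rule tables */
} lang_entry_t;

static const lang_entry_t lang_table[] = {
    { "fr", "phonex" },
    { NULL, NULL }
};

/* Returns the entry for `code`, or NULL if the language is unknown. */
static const lang_entry_t *find_lang(const char *code)
{
    size_t i;

    assert(code != NULL);

    for (i = 0; lang_table[i].lang != NULL; i++)
        if (strcmp(lang_table[i].lang, code) == 0)
            return &lang_table[i];
    return NULL;
}

/* Prefers the attribute's own language, then the configured default.
 * Either argument may be NULL. */
static const lang_entry_t *select_lang(const char *attr_lang,
                                       const char *default_lang)
{
    const lang_entry_t *entry = NULL;

    if (attr_lang != NULL)
        entry = find_lang(attr_lang);
    if (entry == NULL && default_lang != NULL)
        entry = find_lang(default_lang);
    return entry;
}
```

With this table, select_lang("de", "fr") falls back to the "fr" entry,
and select_lang("de", NULL) returns NULL, so the caller can tell that no
rules are available.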
Any comments will be appreciated.
Best regards,
Alexandre.
--- openldap-2.2.17/configure.in 2004-07-26 20:15:05.000000000 +0200
+++ openldap-2.2.17-phonetic2/configure.in 2004-09-20 16:17:56.805716056 +0200
@@ -194,6 +194,7 @@
OL_ARG_ENABLE(slapi,[ --enable-slapi enable SLAPI support (experimental)], no)dnl
OL_ARG_ENABLE(slp,[ --enable-slp enable SLPv2 support], no)dnl
OL_ARG_ENABLE(wrappers,[ --enable-wrappers enable tcp wrapper support], no)dnl
+OL_ARG_ENABLE(phonetic2,[ --enable-phonetic2 enable new phonetic system for approx], no)dnl
dnl ----------------------------------------------------------------
dnl SLAPD Backend Options
@@ -1990,6 +1991,30 @@
dnl fi
dnl ----------------------------------------------------------------
+dnl PHONETIC2
+ol_link_math=no
+if test $ol_enable_phonetic2 != no ; then
+ AC_CHECK_HEADERS(math.h)
+ if test $ac_cv_header_math_h != yes ; then
+ AC_MSG_ERROR([could not locate <math.h>])
+ fi
+
+ AC_CHECK_LIB(m,powf,[have_m=yes],[have_m=no])
+ if test $have_m = yes ; then
+ ol_link_math="yes"
+ fi
+
+ if test $ol_link_math != no ; then
+ ac_save_LIBS="$LIBS"
+ LIBS="$LIBS -lm"
+
+ elif test $ol_enable_phonetic2 != auto ; then
+ AC_MSG_ERROR([could not locate Math library])
+ fi
+fi
+
+
+dnl ----------------------------------------------------------------
dnl SQL
ol_link_sql=no
if test $ol_enable_sql != no ; then
@@ -2401,6 +2426,9 @@
if test "$ol_enable_aci" != no ; then
AC_DEFINE(SLAPD_ACI_ENABLED,1,[define to support per-object ACIs])
fi
+if test "$ol_enable_phonetic2" != no ; then
+ AC_DEFINE(SLAPD_PHONETIC_V2,1,[define to support new phonetic system])
+fi
if test "$ol_link_modules" != no ; then
AC_DEFINE(SLAPD_MODULES,1,[define to support modules])
--- openldap-2.2.17/include/portable.h.in 2004-07-16 20:35:03.000000000 +0200
+++ openldap-2.2.17-phonetic2/include/portable.h.in 2004-09-20 16:25:52.775357672 +0200
@@ -977,6 +977,9 @@
/* define to support SHELL backend */
#undef SLAPD_SHELL
+/* define to support new Phonetic system */
+#undef SLAPD_PHONETIC_V2
+
/* define to support SQL backend */
#undef SLAPD_SQL
--- openldap-2.2.17/servers/slapd/schema_init.c 2004-08-30 18:18:31.000000000 +0200
+++ openldap-2.2.17-phonetic2/servers/slapd/schema_init.c 2004-09-20 16:39:35.628265016 +0200
@@ -61,6 +61,7 @@
#define IA5StringApproxIndexer approxIndexer
#define IA5StringApproxFilter approxFilter
+
static int
inValidate(
Syntax *syntax,
@@ -1400,6 +1401,11 @@
# define SLAPD_APPROX_WORDLEN 1
#endif
+#if defined(SLAPD_PHONETIC_V2)
+# define SLAPD_PHONETIC_V2_PRECISION 7
+#endif
+
+
static int
approxMatch(
int *matchp,
@@ -1412,6 +1418,8 @@
struct berval *nval, *assertv;
char *val, **values, **words, *c;
int i, count, len, nextchunk=0, nextavail=0;
+ char *tmp;
+
/* Yes, this is necessary */
nval = UTF8bvnormalize( value, NULL, LDAP_UTF8_APPROX, NULL );
@@ -1442,7 +1450,14 @@
values = (char **)ch_malloc( count * sizeof(char *) );
for ( c = nval->bv_val, i = 0; i < count; i++, c += strlen(c) + 1 ) {
words[i] = c;
+#if defined(SLAPD_PHONETIC_V2)
+ tmp = phonetic_v2(c);
+ values[i] = post_phonetic_v2(tmp);
+ printf("[%s] -> [%s] -> [%s]\n", c, tmp, values[i]);
+ ch_free(tmp);
+#else
values[i] = phonetic(c);
+#endif
}
/* Work through the asserted value's words, to see if at least some
@@ -1467,11 +1482,22 @@
else {
/* Isolate the next word in the asserted value and phonetic it */
assertv->bv_val[nextchunk+len] = '\0';
+#if defined(SLAPD_PHONETIC_V2)
+ tmp = phonetic_v2(assertv->bv_val + nextchunk);
+ val = post_phonetic_v2(tmp);
+ printf("[%s] -> [%s] -> [%s...]\n", assertv->bv_val+nextchunk, tmp, val);
+ ch_free(tmp);
+#else
val = phonetic( assertv->bv_val + nextchunk );
+#endif
/* See if this phonetic chunk is in the remaining words of *value */
for( i=nextavail; i<count; i++ ){
+#if defined(SLAPD_PHONETIC_V2)
+ if( !strncmp( val, values[i], SLAPD_PHONETIC_V2_PRECISION ) ){
+#else
if( !strcmp( val, values[i] ) ){
+#endif
nextavail = i+1;
break;
}
@@ -1521,6 +1547,7 @@
void *ctx )
{
char *c;
+ char *tmp;
int i,j, len, wordcount, keycount=0;
struct berval *newkeys;
BerVarray keys=NULL;
@@ -1551,7 +1578,13 @@
for( c = val.bv_val, i = 0; i < wordcount; c += len + 1 ) {
len = strlen( c );
if( len < SLAPD_APPROX_WORDLEN ) continue;
+#if defined (SLAPD_PHONETIC_V2)
+ tmp = phonetic_v2(c);
+ ber_str2bv( post_phonetic_v2( tmp ), 0, 0, &keys[keycount] );
+ ch_free(tmp);
+#else
ber_str2bv( phonetic( c ), 0, 0, &keys[keycount] );
+#endif
keycount++;
i++;
}
@@ -1576,6 +1609,7 @@
void *ctx )
{
char *c;
+ char *tmp;
int i, count, len;
struct berval *val;
BerVarray keys;
@@ -1607,7 +1641,13 @@
for( c = val->bv_val, i = 0; i < count; c += len + 1 ) {
len = strlen(c);
if( len < SLAPD_APPROX_WORDLEN ) continue;
+#if defined (SLAPD_PHONETIC_V2)
+ tmp = phonetic_v2(c);
+ ber_str2bv( post_phonetic_v2( tmp ), 0, 0, &keys[i] );
+ ch_free(tmp);
+#else
ber_str2bv( phonetic( c ), 0, 0, &keys[i] );
+#endif
i++;
}
--- openldap-2.2.17/servers/slapd/phonetic.c 2004-01-01 19:16:34.000000000 +0100
+++ openldap-2.2.17-phonetic2/servers/slapd/phonetic.c 2004-09-20 18:09:57.158067208 +0200
@@ -23,6 +23,13 @@
* software without specific prior written permission. This software
* is provided ``as is'' without express or implied warranty.
*/
+/* Portions Copyright (c) 2004 Alexandre PAUZIES <apauzies@linagora.com>.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms are permitted
+ * provided that this notice is preserved and that due credit is given
+ * to Alexandre PAUZIES.
+ */
#include "portable.h"
@@ -32,10 +39,13 @@
#include <ac/string.h>
#include <ac/socket.h>
#include <ac/time.h>
+#include <math.h>
#include "slap.h"
+#include "phonetic.h"
+
-#if !defined(SLAPD_METAPHONE) && !defined(SLAPD_PHONETIC)
+#if !defined(SLAPD_METAPHONE) && !defined(SLAPD_PHONETIC) && !defined(SLAPD_PHONETIC_V2)
#define SLAPD_METAPHONE
#endif
@@ -180,6 +190,197 @@
return( ch_strdup( phoneme ) );
}
+
+#elif defined(SLAPD_PHONETIC_V2)
+/* Portions Copyright (c) 2004 Alexandre PAUZIES <apauzies@linagora.com>.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms are permitted
+ * provided that this notice is preserved and that due credit is given
+ * to Alexandre PAUZIES.
+ */
+
+
+static int is_followed(char *start, char *pos, condition_t *condition)
+{
+ char *p;
+
+ if (*(++pos))
+ {
+ if (condition->flag & OR)
+ {
+ if (strchr(condition->param, *pos) != NULL)
+ return ((condition->flag & NOT) ? 0 : 1);
+ }
+ else if (condition->flag & AND)
+ {
+ for (p = condition->param;
+ *p && *pos && *p == *pos; p++, pos++);
+ if (!*p)
+ return ((condition->flag & NOT) ? 0 : 1);
+ }
+ }
+ return ((condition->flag & NOT) ? 1 : 0);
+}
+
+
+static int is_repeated(char *start, char *pos, condition_t *condition)
+{
+ if ((*(pos+1)) && *pos == (*(pos + 1)))
+ return ((condition->flag & NOT) ? 0 : 1);
+ return ((condition->flag & NOT) ? 1 : 0);
+}
+
+
+static int is_preceded(char *start, char *pos, condition_t *condition)
+{
+ int i;
+
+ if (pos > start)
+ {
+ pos--;
+ if (condition->flag & OR)
+ {
+ if (strchr(condition->param, *pos) != NULL)
+ return ((condition->flag & NOT) ? 0 : 1);
+ }
+ else if (condition->flag & AND)
+ {
+ for (i = strlen(condition->param) - 1;
+ i >= 0 && pos >= start && condition->param[i] == *pos;
+ i--, pos--);
+ if (i < 0)
+ return ((condition->flag & NOT) ? 0 : 1);
+ }
+ }
+ return ((condition->flag & NOT) ? 1 : 0);
+}
+
+
+static int check_conditions(char *start, char *pos, rule_t *rule)
+{
+ int i;
+ int j;
+
+ for (i = 0; rule->conditions[i].name; i++)
+ for (j = 0; checks[j].name; j++)
+ if (checks[j].name == rule->conditions[i].name)
+ switch (rule->conditions[i].name)
+ {
+ case FOLLOWED:
+ if (!checks[j].try(start, pos+strlen(rule->action.params[0])-1,
+ &rule->conditions[i]))
+ return 0;
+ default:
+ if (!checks[j].try(start, pos, &rule->conditions[i]))
+ return 0;
+ }
+ return 1;
+}
+
+
+static char *replace(char *start, char *pos, rule_t *rule)
+{
+ int str_len;
+ int look_for_len;
+ int change_to_len;
+ int diff_len;
+
+ str_len = strlen(pos);
+ look_for_len = strlen(rule->action.params[0]);
+ if (!look_for_len)
+ look_for_len++;
+ change_to_len = strlen(rule->action.params[1]);
+ diff_len = look_for_len - change_to_len;
+
+ if (diff_len < 0) // Do we really need this ?
+ pos = ch_realloc(pos, (size_t)(str_len - diff_len + 1));
+ memmove(pos + change_to_len, pos + look_for_len, str_len - diff_len + 1);
+ if (change_to_len)
+ memcpy(pos, rule->action.params[1], change_to_len);
+
+ return pos;
+}
+
+
+static void *find_replace(char *start, char *pos, rule_t *rule)
+{
+ if (!*pos)
+ return NULL;
+ if (*rule->action.params[0])
+ if ((pos = strstr(pos, rule->action.params[0])) == NULL)
+ return NULL;
+
+ if (!check_conditions(start, pos, rule))
+ find_replace(start, ++pos, rule);
+ else if (*(pos = replace(start, pos, rule)))
+ find_replace(start, pos, rule);
+
+ return NULL;
+}
+
+
+char *phonetic_v2(char *word)
+{
+ int i;
+ int j;
+ char *s;
+ rule_t *rules;
+
+ for (i = 0; phonetic_lang[i].lang != NULL &&
+ strcmp(phonetic_lang[i].lang, lang); i++);
+ if (phonetic_lang[i].lang == NULL)
+ return NULL; // Error, no phonetic rules found for this lang
+
+ rules = phonetic_lang[i].rules;
+ s = ch_strdup(word);
+
+ for (i = 0; rules[i].action.name; i++)
+ for (j = 0; commands[j].name; j++)
+ if (rules[i].action.name == commands[j].name)
+ commands[j].run(s, s, &rules[i]);
+
+ return s;
+}
+
+
+char *post_phonetic_v2(char *word)
+{
+ int *tab;
+ int i;
+ int j;
+ double res;
+ char *res_str;
+ char *p;
+ char *post_rules;
+
+ for (i = 0; phonetic_lang[i].lang != NULL &&
+ strcmp(phonetic_lang[i].lang, lang); i++);
+ if (phonetic_lang[i].lang == NULL)
+ return NULL; // Error, no post phonetic rules found for this lang
+
+ post_rules = phonetic_lang[i].post_rules;
+
+ tab = ch_malloc(sizeof(int) * strlen(word) + 1);
+ for (i = 0, p = word; *p; p++, i++)
+ for (j = 0; post_rules[j]; j++)
+ if (*p == post_rules[j])
+ tab[i] = j;
+
+ for (j = 0; post_rules[j]; j++);
+
+ for (res = 0.0, i = 0; i < strlen(word); i++)
+ res += tab[i] * powf(j, 0 -i -1);
+
+ if (tab)
+ ch_free (tab);
+
+ res_str = ch_malloc(sizeof(char) * 26);
+ sprintf(res_str, "%4.20f", res);
+
+ return res_str;
+}
+
#elif defined(SLAPD_METAPHONE)
/*
--- openldap-2.2.17/servers/slapd/proto-slap.h 2004-09-12 22:22:39.000000000 +0200
+++ openldap-2.2.17-phonetic2/servers/slapd/proto-slap.h 2004-09-20 14:18:23.046293256 +0200
@@ -901,6 +901,9 @@
* phonetic.c
*/
LDAP_SLAPD_F (char *) phonetic LDAP_P(( char *s ));
+LDAP_SLAPD_F (char *) phonetic_v2 LDAP_P(( char *s ));
+LDAP_SLAPD_F (char *) post_phonetic_v2 LDAP_P(( char *s ));
+
/*
* referral.c
@@ -1259,6 +1262,8 @@
LDAP_SLAPD_V (unsigned long) num_ops_initiated_[SLAP_OP_LAST];
#endif /* SLAPD_MONITOR */
+LDAP_SLAPD_V (char *) lang;
+
LDAP_SLAPD_V (char *) slapd_pid_file;
LDAP_SLAPD_V (char *) slapd_args_file;
LDAP_SLAPD_V (time_t) starttime;
--- openldap-2.2.17/servers/slapd/phonetic.h 1970-01-01 01:00:00.000000000 +0100
+++ openldap-2.2.17-phonetic2/servers/slapd/phonetic.h 2004-09-20 17:53:03.952097928 +0200
@@ -0,0 +1,194 @@
+/* phonetic.h - routines to do phonetic matching */
+/* This work is part of OpenLDAP Software <http://www.openldap.org/>.
+ *
+ * Copyright 1998-2004 The OpenLDAP Foundation.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted only as authorized by the OpenLDAP
+ * Public License.
+ *
+ * A copy of this license is available in the file LICENSE in the
+ * top-level directory of the distribution or, alternatively, at
+ * <http://www.OpenLDAP.org/license.html>.
+ */
+/* Portions Copyright (c) 2004 Alexandre PAUZIES <apauzies@linagora.com>.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms are permitted
+ * provided that this notice is preserved and that due credit is given
+ * to Alexandre PAUZIES.
+ */
+
+#ifndef _SLAP_PHONETIC_H_
+#define _SLAP_PHONETIC_H_
+
+#define NONE 0
+
+#define FOLLOWED 1
+#define REPEATED 2
+#define PRECEDED -1
+
+#define OR 1
+#define AND 2
+#define NOT 4
+#define ALL ""
+
+#define FIND_REPLACE 1
+
+
+
+#define MAX_CONDITIONS 4
+#define MAX_PARAMS 3
+
+
+
+typedef struct check_s
+{
+ int name;
+ int (*try)();
+} check_t;
+
+typedef struct command_s
+{
+ int name;
+ void *(*run)();
+} command_t;
+
+typedef struct condition_s
+{
+ int name;
+ char *param;
+ int flag;
+} condition_t;
+
+typedef struct action_s
+{
+ int name;
+ char *params[MAX_PARAMS];
+} action_t;
+
+typedef struct rule_s
+{
+ action_t action;
+ condition_t conditions[MAX_CONDITIONS];
+} rule_t;
+
+
+typedef struct phonetic_s
+{
+ char *lang;
+ rule_t *rules;
+ char *post_rules;
+} phonetic_t;
+
+
+static void *find_replace(char *start, char *pos, rule_t *rule);
+static char *replace(char *start, char *pos, rule_t *rule);
+static int check_conditions(char *start, char *pos, rule_t *rule);
+static int is_followed(char *start, char *pos, condition_t *condition);
+static int is_preceded(char *start, char *pos, condition_t *condition);
+static int is_repeated(char *start, char *pos, condition_t *condition);
+
+
+static command_t commands[] =
+ {
+ { FIND_REPLACE, find_replace },
+ { NONE, NULL },
+ };
+
+static check_t checks[] =
+ {
+ { PRECEDED, is_preceded },
+ { FOLLOWED, is_followed },
+ { REPEATED, is_repeated },
+ { NONE, NULL },
+ };
+
+
+
+/* This is the phonex rules, by Frederic BROUARD
+ (http://sqlpro.developpez.com/Soundex/SQL_AZ_soundex.html) */
+
+static rule_t phonetic_rules_fr_phonex[] =
+ {
+ { {FIND_REPLACE, {"y", "i"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"h", ""}}, {{PRECEDED, "csp", NOT|OR}} },
+ { {FIND_REPLACE, {"ph", "f"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"gan", "kan"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"gam", "kam"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"gain", "kain"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"gaim", "kaim"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ain", "yn"}}, {{FOLLOWED, "aeiou", OR}} },
+ { {FIND_REPLACE, {"ein", "yn"}}, {{FOLLOWED, "aeiou", OR}} },
+ { {FIND_REPLACE, {"aim", "yn"}}, {{FOLLOWED, "aeiou", OR}} },
+ { {FIND_REPLACE, {"eim", "yn"}}, {{FOLLOWED, "aeiou", OR}} },
+ { {FIND_REPLACE, {"eau", "o"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"oua", "2"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ein", "4"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ain", "4"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"eim", "4"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"aim", "4"}}, {{NONE, NULL, NONE}} },
+ /* { "é", "y", {{NONE, NULL, NONE}} }, */ // Could not be use
+ /* { "è", "y", {{NONE, NULL, NONE}} }, */ // (APPROX flag to
+ /* { "ê", "y", {{NONE, NULL, NONE}} }, */ // normalize())
+ { {FIND_REPLACE, {"ai", "y"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ei", "y"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"er", "yr"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"et", "yt"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ess", "yss"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"an", "1"}}, {{FOLLOWED, "aeiou1234", OR|NOT}} },
+ { {FIND_REPLACE, {"am", "1"}}, {{FOLLOWED, "aeiou1234", NOT|OR}} },
+ { {FIND_REPLACE, {"en", "1"}}, {{FOLLOWED, "aeiou1234", NOT|OR}} },
+ { {FIND_REPLACE, {"em", "1"}}, {{FOLLOWED, "aeiou1234", NOT|OR}} },
+ { {FIND_REPLACE, {"in", "4"}}, {{FOLLOWED, "aeiou1234", NOT|OR}} },
+ { {FIND_REPLACE, {"s", "z"}}, {{FOLLOWED, "aeiou1234", OR},
+ {PRECEDED, "aeiou1234", OR}} },
+ { {FIND_REPLACE, {"oe", "e"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"eu", "e"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"au", "o"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"oi", "2"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"oy", "2"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ou", "3"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"sch", "5"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ch", "5"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"sh", "5"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ss", "s"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"sc", "s"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"c", "s"}}, {{FOLLOWED, "ei", OR}} },
+ { {FIND_REPLACE, {"c", "k"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"q", "k"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"qu", "k"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"gu", "k"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"ga", "ka"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"go", "ko"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"gy", "ky"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"a", "o"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"d", "t"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"p", "t"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"j", "g"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"b", "f"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"v", "f"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"m", "n"}}, {{NONE, NULL, NONE}} },
+ { {FIND_REPLACE, {"t", ""}}, {{FOLLOWED, ALL, AND|NOT}} },
+ { {FIND_REPLACE, {"x", ""}}, {{FOLLOWED, ALL, AND|NOT}} },
+ { {FIND_REPLACE, {ALL, ""}}, {{REPEATED, NULL, NONE}} },
+ { {NONE, {NULL}}, {{NONE, NULL, NONE}} },
+};
+
+
+static char phonetic_post_rules_fr_phonex[22] =
+ {
+ '1', '2', '3', '4', '5', 'e', 'f', 'g', 'h', 'i', 'k',
+ 'l', 'n', 'o', 'r', 's', 't', 'u', 'w', 'x', 'y', 'z'
+ };
+
+
+static phonetic_t phonetic_lang[] =
+ {
+ {"fr", phonetic_rules_fr_phonex, phonetic_post_rules_fr_phonex},
+ {NULL, NULL, NULL},
+ };
+
+
+#endif /* _SLAP_PHONETIC_H_ */
--- openldap-2.2.17/servers/slapd/config.c 2004-09-12 22:22:38.000000000 +0200
+++ openldap-2.2.17-phonetic2/servers/slapd/config.c 2004-09-20 12:01:04.084805784 +0200
@@ -91,6 +91,8 @@
char *strtok_quote_ptr;
+char *lang = NULL;
+
int use_reverse_lookup = 0;
#ifdef LDAP_SLAPI
@@ -631,6 +633,24 @@
} else if ( strcasecmp( cargv[0], "replica-argsfile" ) == 0 ) {
/* ignore */ ;
+ /* get default lang for approx */
+ } else if ( strcasecmp( cargv[0], "lang" ) == 0 ) {
+ if ( cargc < 2 ) {
+#ifdef NEW_LOGGING
+ LDAP_LOG( CONFIG, CRIT,
+ "%s: line %d missing lang name in \"lang <language>\" "
+ "line.\n", fname, lineno, 0 );
+#else
+ Debug( LDAP_DEBUG_ANY,
+ "%s: line %d: missing lang name in \"lang <language>\" line\n",
+ fname, lineno, 0 );
+#endif
+
+ return( 1 );
+ }
+
+ lang = ch_strdup( cargv[1] );
+
/* default password hash */
} else if ( strcasecmp( cargv[0], "password-hash" ) == 0 ) {
if ( cargc < 2 ) {
--
Alexandre PAUZIES <apauzies@linagora.com>
LINAGORA - http://www.linagora.com/