Re: FW: profiling

At 03:55 PM 2001-09-21, Stig Venaas wrote:
>One thing I noticed when looking at Unicode normalization is that we
>often normalize the same string several times,

Yes, especially cached attribute values.

>and often we try to normalize strings that already are normalized.

This should be minimized.  In particular, I note that filter code is
designed to avoid this.  And we also maintain both non-normalized and
normalized DNs in the cache (because we know we'll need both).

>One possibility would
>be for each string to be a struct that contains a pointer to a
>normalized copy. It should be NULL until the string is normalized.

That's one approach.  But note that keeping these copies will decrease
the amount of memory available for caching and that could have a
negative impact on performance.  Likely a behavior which should be
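
For illustration, the struct Stig describes might look roughly like the
sketch below. This is not OpenLDAP code: lazy_str, toy_normalize and the
other names are hypothetical, and the "normalizer" is only an ASCII
lowercasing stand-in for real Unicode normalization. The point is the
shape: the normalized pointer stays NULL until first use, after which the
cached copy is returned. The memory cost mentioned above is visible here
too, since every normalized string holds a second allocation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a string plus a lazily computed normalized copy. */
typedef struct lazy_str {
    char *raw;   /* original value, owned by this struct */
    char *norm;  /* normalized copy; NULL until first requested */
} lazy_str;

/* Stand-in normalizer (ASCII lowercasing only). Real code would call
 * the actual Unicode normalization routine here. */
char *toy_normalize(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        out[i] = (char)tolower((unsigned char)s[i]);
    out[n] = '\0';
    return out;
}

/* Copy s into the struct; the normalized form is not computed yet. */
int lazy_str_init(lazy_str *ls, const char *s)
{
    assert(ls != NULL && s != NULL);
    ls->raw = malloc(strlen(s) + 1);
    if (ls->raw == NULL)
        return -1;
    strcpy(ls->raw, s);
    ls->norm = NULL;
    return 0;
}

/* Return the normalized form, computing it at most once.
 * Returns NULL only if the allocation fails. */
const char *lazy_str_norm(lazy_str *ls)
{
    assert(ls != NULL);
    if (ls->norm == NULL)
        ls->norm = toy_normalize(ls->raw);
    return ls->norm;
}

/* Release both copies. */
void lazy_str_free(lazy_str *ls)
{
    free(ls->raw);
    free(ls->norm);
    ls->raw = NULL;
    ls->norm = NULL;
}
```

A second call to lazy_str_norm() on the same struct returns the same
pointer without normalizing again.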

>Also the normalization code should probably be reworked to allocate
>larger chunks.  It currently uses a lot of reallocs; normalizing
>a single string may take several of them.

Or use a malloc library which avoids small incremental allocations...
(or build this into ber allocation routines).
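
As a rough sketch of the "larger chunks" idea (again hypothetical, not
the existing normalization or ber code), an output buffer can grow
geometrically so that appending n bytes one at a time costs O(log n)
reallocs instead of one per append. The names chunk_buf and
chunk_buf_append, and the 64-byte starting size, are assumptions made
for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer. Zero-initialize before use. */
typedef struct chunk_buf {
    char *data;
    size_t len;        /* bytes in use */
    size_t cap;        /* bytes allocated */
    unsigned reallocs; /* number of realloc calls, for illustration */
} chunk_buf;

/* Append n bytes from src, doubling the capacity when space runs out.
 * Returns 0 on success, -1 on overflow or allocation failure. */
int chunk_buf_append(chunk_buf *b, const char *src, size_t n)
{
    assert(b != NULL);
    if (n == 0)
        return 0;
    if (n > b->cap - b->len) {
        size_t need = b->len + n;
        if (need < b->len)          /* size_t overflow */
            return -1;
        size_t cap = b->cap ? b->cap : 64;
        while (cap < need) {
            if (cap > (size_t)-1 / 2) {
                cap = need;         /* cannot double any further */
                break;
            }
            cap *= 2;
        }
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
        b->reallocs++;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

With these assumptions, appending 1000 single bytes to an empty buffer
triggers five reallocs (64, 128, 256, 512, 1024 bytes) rather than 1000.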

>Anyway, I would like to get more testing of the current code before optimizing it.