[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#6437) sl_malloc issues



h.b.furuseth@usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: HEAD
> OS: Linux x86_64
> URL:
> Submission from: (NULL) (129.240.6.233)
> Submitted by: hallvard
>
>
> Here are a number of bugs and other issues with slapd/sl_malloc.c, and
> one for ber_memcalloc.  Some for RE24, likely some for RE25, but I don't
> know where to cut off for RE24?  (Waiting for someone to say...)

Go ahead and commit. In working pieces if necessary. We'll pick and choose...
>
> * It claims to align on 2*sizeof(int) boundaries, but it looks like
>    stack:realloc and pool:mem_create can un-align to (2n+1)*sizeof(int)
>    boundaries on 32-bit hosts, where sizeof(ber_len_t) == sizeof(int).
>
>    A slap_sl_realloc() fix for the stack implementation is to replace
> 	size += pad + sizeof( ber_len_t );
> 	size&= ~pad;
>    with
> 	size = ((size + pad)&  ~pad) + sizeof( ber_len_t );
>    however see the optimizations and tweaks below.

Fine.

>    For the pool implementation, maybe slap_sl_mem_create() can just set
> 	so->so_ptr = (char *) sh->sh_base + sizeof(ber_len_t) % (pad+1);
>    Unless that messes up its expectations of how much space is left, I
>    haven't checked.  Should mem_create size be increased accordingly?

Nobody uses the pool code. I would suggest you profile it against the stack 
code before spending any time fixing it.

>    Unfortunately sl_malloc.c exports its internals, so I don't know what
>    kind of fix would affect it:
>
> * Why is there a slap_sl_mem_detach() when there is no way to reattach
>    it?  As far as I can tell all the caller can do is to inspect/clean up
>    the returned slab_heap itself.

It was originally implemented in 2.2 to support psearch for sync replication. 
It was obsoleted when the syncprov overlay was written in 2.3, we just never 
deleted it. Go ahead and delete it. (And remember, the point of sl_malloc is 
that a heap here doesn't need any special cleanup - just free it and forget it.)

> * slap_sl_mem_create() switching implementation stack<->pool on an old
>    memctx slab can crash or leak, it runs the wrong cleanup code.
>
>    Fix: Remove code duplication, rearrange.  The pool cleanup duplicates
>    slap_sl_mem_destroy(), except the final frees.  Let destroy(key=NULL,)
>    mean don't free and call that.  Also the create/destroy pool branches
>    duplicate the stack branches.
>
> * However, will the pool (i.e. non-stack) implementation ever be used?
>    If not, a simpler fix to some of this is to delete it.  OTOH if it
>    someday may be used, hopefully some of these cleanups will be leaving
>    it a bit more correct, easier to read, and faster.  Though I haven't
>    gone to the drastic step of studying it to see how it works.
>
>    Or we could #ifdef it out by default.  It's unclear to me why this is
>    a run-time argument to the create function instead of instead of a
>    -D<compiler flag>, why would the caller care?

Dunno. I guess it was for quicker turnaround during testing. But afaik only 
Jong ever tested/used it, and no one else has spent any time on it. #if'ing it 
all out is fine with me. Of course, if you find that it's actually better 
performing than the stack version, then we can #if out the stack code instead.

>
>    #define SLAP_SLAB_MODE<undef or 1:stack, 0:pool: -1:support both>.
>    #define SLAP_SLAB_ISSTACK(stack)<stack if both, else SLAP_SLAB_MODE>
>
> * Regarding code duplication, the #ifdef SLAP_NO_SL_MALLOC code
>    duplicates the !ctx code.  Make that if(<enum No_sl_malloc>  &&  !ctx)
>    and the #ifdefs can be killed.

I would personally prefer no runtime checks, and all compile-time checks.

> * Also we can throw the #ifdef NO_THREADS/#else/#endif code out to two
>    macros #define GET_MEMCTX()/SET_MEMCTX().

right.

> * The "enough room?" test (ptr + size<  endptr) is invalid, can e.g.
>    wrap around, when there is not enough room.  Use (size<  endptr-ptr).

ok.

> * Some failures assert(0)/exit() like ch_malloc.c, but without giving a
>    Debug() message first.

don't really care...

> * The slap_sl_malloc() pool implementation can return NULL on failure
>    at comment "FIXME: missing return; guessing...".  For consistency it
>    should exit on failure like the rest of sl_malloc.c and ch_malloc.c.

don't care, unless we decide to keep the pool code.

Go ahead and commit.

> * Realloc(last block) when the following free space is not big enough,
>    does not move sh_last and reclaim free space properly.
>
> * slap_sl_calloc()/ber_memcalloc() do not catch<count>*<size>  overflow:
>      ber_len_t total = n * size;
>      if (!n&&  total/n == size)<allocate...>;
>
>    That can be optimized a bit, if anyone cares:
>
>    #define LIM_SQRT(t) /* some value<  sqrt(max value of unsigned type t) */ \
> 	((0UL|(t)-1)>>31>>31>  1 ? ((t)1<<32) - 1 : \
> 	 (0UL|(t)-1)>>31 ? 65535U : (0UL|(t)-1)>>15 ? 255U : 15U)
>    ...
>    /* sqrt test = slight optimization: often avoids the division */
>    if ((n | size)<= LIM_SQRT(ber_len_t) || !n || total/n == size) ...
>
> * slap_sl_<malloc/realloc>  increase size before proceeding, they should
>    reduce it before falling back to ch_malloc and for Debug messages.
>
> Space optimizations/cleanup:
>
> * To save space on 64-bit hosts: We can store store sizes (head, tail)
>    in 4-byte typedef ber_uint_t slap_sl_size_t instead of a ber_len_t.
>
>    After fixing the misalignment issue above, however.
>
> * Also the stack implementation does not need to allocate the tail.
>    It can store the "block is freed" bit in the head of the _next_ block,
>    and only store a tail in free blocks (i.e. if that bit is set).
>
>    I think that does add a bit of code, though it is convenient for the
>    "reclaim freed blocks at sh_last" code in slap_sl_free().
>
> * Ignoring last point, the "reclaim" code can in any case be simplified:
>          /* mark it free */
>          p[-1] = size |= 1;
>          /* reclaim free space off tail */
>          if (sh->sh_last == p) {
>              do {
>                  p = (slap_sllen_t*) ((char*) p - size + 1) - 1;
>                  size = p[-1];
>              } while (size&  1);
>              sh->sh_last = p;
>          }
>
> * Realloc can check if previous or next block is free before giving up
>    and doing malloc/copy/free.  That does multiply its code size though.
>    Dunno if it's a good idea.  Needs testing/timing, I guess.
>
> * Aligning with (size&  ~pad) is wrong for non-two's complement signed
>    pad, it should be (size&  -(pad + 1)) aka (size&  -Alignment).
>
>    All right, so I'm nitpicking.  I just dislike unportable code which
>    isn't even simpler than the portable variant.


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/