[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: Binding mdb_set_compare with Go



Bryan Matsuo wrote:
After digging it seems that it will continue to be possible for users to
safely pass static Go function references. It is quite a burden on the user.
But I will continue to think about it.

Jay, noted. I am open to exploring that direction. Though as was pointed out
earlier a library of static functions can be made more useful (if somewhat
slower) when a context object can configure their behavior. Before reading
that suggestion I was uncertain how much useful functionality could be exposed
as a library. I am writing general purpose bindings, so I would prefer a
function library be fairly generic.

Howard, do you have thoughts on the proposal from Juerg regarding a
compound-key comparison function implemented using a context value?

I remain unconvinced. key-comparison is still per-DB; a comparison specifier saves some space but at the expense of time - more compare ops, more branching per key compare. Dedicated functions are still the better way to go. Plus, naive constructs like in the emailed example are easy to get wrong - his example will never terminate because he only breaks from the switch statement, nothing breaks from the while(1) loop. It is attempting to be too clever, when a more straightforward approach will be faster and obviously bug-free.

On Thu, Oct 29, 2015 at 8:49 PM Jay Booth <jaybooth@gmail.com
<mailto:jaybooth@gmail.com>> wrote:

     From the peanut gallery:  Small set of static C functions is probably the
    way to go.  If I understand correctly, which I probablay don't, the
    mismatch between green threads and OS threads means there's a lot of
    expensive stack-switching involved in go->C->go execution.

    On Thu, Oct 29, 2015 at 5:28 PM, Bryan Matsuo <bryan.matsuo@gmail.com
    <mailto:bryan.matsuo@gmail.com>> wrote:

        Juerg,

        That is is interesting proposal. As an alternative to letting users
        hook up arbitrary Go function for comparison, I have also thought
        about the possibility of providing a small set of static C functions
        usable for comparison. A flexible compound key comparison function
        like this could fit well into that idea.

        Howard,

        Sorry I did not find the issues mentioned in previous searches.

        I understand the concern about such a hot code path. I'm not sure that
        Go would see acceptable performance.

        But, Go is not an interpreted language (though there is glue). And
        while I'm not positive about the performance of Go in this area you
        seem to dismiss comparison functions in any other language. Is it
        unreasonable to think that comparison functions written in other
        compiled languages like Rust, Nim, or ML variants would also be
        impractically slow?

        I also believe you have misunderstood the practical problems of
        passing Go function pointers to C. But to be fair, I think the wording
        of that quoted paragraph could be better.

        >Sorry but there is no other Go function for the mdb_cmp() function to
        call, the only one it knows about is the function pointer that you pass.

        It may be of benefit to see how the I've used the context argument in
        a binding being developed for the mdb_reader_list function.

        https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb/lmdbgo.c

        The callback passed to mdb_reader_list is always the same static
        function because correctly calling a Go function from C requires an
        annotated static Go function. The context argument allows dispatch to
        the correct Go function that was configured at runtime. I believe that
        is the "other" Go function you mentioned.

        The implementation would be similar for mdb_set_compare. The callback
        would always be the same static function which handles the dynamic
        dispatch.

        Cheers,
        - Bryan

        On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher
        <juerg.bircher@helmedica.com <mailto:juerg.bircher@helmedica.com>> wrote:

            Actually I’m not commenting on binding Go but I’m voting for a
            context passed to the compare function.

            I fully agree that the compare function is part of the critical
            path. But as I need to define custom indexes with compound keys
            the compare functions varies and it would be impractical to
            predefine for any compound key combination a c function.

            The compare context would be stored on the struct MDB_dbx.

            typedef struct MDB_dbx {
                     MDB_val         md_name;                /**< name of the
            database */
                     MDB_cmp_func    *md_cmp;        /**< function for
            comparing keys */
                     void            *md_cmpctx; /** user-provided context for
            md_cmp **/
                     MDB_cmp_func    *md_dcmp;       /**< function for
            comparing data items */
                     void            *md_dcmpctx;/** user-provided context for
            md_dcmp **/
                     MDB_rel_func    *md_rel;        /**< user relocate
            function */
                     void            *md_relctx;             /**<
            user-provided context for md_rel */
            } MDB_dbx;

            The following is a draft (not tested yet) of a generic compare
            function. The context contains a compare specification which is a
            null terminated list of <type><order> pairs.

            // compareSpec <type><order>...<null>
            int key_comp_generic(const MDB_val *a, const MDB_val *b, char
            *compareSpec) {
                 int result = 0;
                 char *pa = a->mv_data;
                 char *pb = b->mv_data;

                 while (1) {
                     switch (*compareSpec++) {
                        case 0:
                             break;
                        case INT32_KEY :
                        {
                             unsigned int va = *(unsigned int *)pa;
                             unsigned int vb = *(unsigned int *)pb;

                             if (*compareSpec++ == ASCENDING_ORDER) {
                                 result = (va < vb) ? -1 : va > vb;
                             }
                             else {
                                 result = (va > vb) ? -1 : va < vb;
                             }

                             if (result != 0) {
                                 break;
                             }
                             else {
                                 pa += 4;
                                 pb += 4;
                             }
                         }
                         case INT64_KEY :
                         {
                             unsigned long long va = *(unsigned long long *)pa;
                             unsigned long long vb = *(unsigned long long *)pb;

                             if (*compareSpec++ == ASCENDING_ORDER) {
                                 result = (va < vb) ? -1 : va > vb;
                             }
                             else {
                                 result = (va > vb) ? -1 : va < vb;
                             }

                             if (result != 0) {
                                 break;
                             }
                             else {
                                 pa += 8;
                                 pb += 8;
                             }
                         }
                         case STRING_KEY :
                         {
                             unsigned int la = *(unsigned int *)pa;
                             unsigned int lb = *(unsigned int *)pb;

                             pa += 4;
                             pb += 4;

                             if (*compareSpec++ == ASCENDING_ORDER) {
                                 result = strncmp(pa, pb, (la < lb) ? la : lb);
                                 if (result != 0) {
                                     break;
                                 }
                                 else {
                                     result = (la < lb) ? -1 : la > lb;
                                 }
                             }
                             else {
                                 result = strncmp(pb, pa, (la < lb) ? la : lb);
                                 if (result != 0) {
                                     break;
                                 }
                                 else {
                                     result = (la > lb) ? -1 : la < lb;
                                 }
                             }

                             if (result != 0) {
                                 break;
                             }
                             else {
                                 pa += la;
                                 pb += lb;
                             }
                         }
                     }
                 }

                 return result;
            }

            Regards
            Juerg


            On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu"
            <openldap-technical-bounces@openldap.org
            <mailto:openldap-technical-bounces@openldap.org> on behalf of
            hyc@symas.com <mailto:hyc@symas.com>> wrote:

             >Bryan Matsuo wrote:
             >> openldap-technical,
             >>
             >> I am working on some Go (golang) bindings[1] for the LMDB
            library and I have
             >> some interest in exposing the functionality of mdb_set_compare
            (and
             >> mdb_set_dupsort). But it is proving difficult and I have a
            question about the
             >> function(s).
             >>
             >> Calling mdb_set_compare from the Go runtime is challenging.
            Using C APIs with
             >> callbacks comes with restrictions[2][3]. I believe it
            impossible to bind these
             >> functions way that is flexible, as one would expect. A
            potential change to
             >> LMDB that would make binding drastically easier is having
            MDB_cmp_func to take
             >> a third "context" argument with type void*. Then a binding
            could safely use an
             >> arbitrary Go function for comparisons.
             >>
             >> Is it possible for future versions of LMDB to add a third
            argument to the
             >> MDB_cmp_func signature? Otherwise would it be acceptable for a
            variant API to
             >> be added using a different function type, one accepting three
            arguments?
             >>
             >> Thanks for the consideration.
             >>
             >> Cheers,
             >> - Bryan
             >>
             >> [1] Go bindings -- https://github.com/bmatsuo/lmdb-go
             >> [2] Cgo pointer restrictions --
             >>
            https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
             >> [3] Cgo documentation -- https://golang.org/cmd/cgo/
             >
             >I see nothing in these restrictions that requires extra
            information to be
             >passed from Go to C or from C to Go.
             >
             >There is a vague mention in [2]
             >
             >"A particular unsafe area is C code that wants to hold on to Go
            func and
             >pointer values for future callbacks from C to Go. This works
            today but is not
             >permitted by the invariant. It is hard to detect. One safe
            approach is: Go
             >code that wants to preserve funcs/pointers stores them into a
            map indexed by
             >an int. Go code calls the C code, passing the int, which the C
            code may store
             >freely. When the C code wants to call into Go, it passes the int
            to a Go
             >function that looks in the map and makes the call."
             >
             >But it's nonsense in this case - you want to pass a Go function
            pointer to C,
             >but the only way for C to use it is to call some *other* Go
            function? Sorry
             >but there is no other Go function for the mdb_cmp() function to
            call, the only
             >one it knows about is the function pointer that you pass.
             >
             >If this is what you're referring to, adding a context pointer
            doesn't achieve
             >anything. If this isn't what you're referring to, then please
            explain exactly
             >what you hope to achieve with this context pointer.
             >
             >--
             >   -- Howard Chu
             >   CTO, Symas Corp. http://www.symas.com
             >   Director, Highland Sun http://highlandsun.com/hyc/
             >   Chief Architect, OpenLDAP http://www.openldap.org/project/




--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/