[Pharo-dev] You can cheat in FFI as long as you don't get caught...

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Thu Nov 9 09:28:39 EST 2017


2017-11-09 12:24 GMT+01:00 Igor Stasenko <siguctua at gmail.com>:

>
> On 9 November 2017 at 12:03, Esteban Lorenzano <estebanlm at gmail.com>
> wrote:
>
>> yeah, but this is because LGitDiff, LGitRepository and LGitTree (and many
>> others) are defined as FFIExternalObjects while they should be
>> FFIOpaqueObjects. Then, the problem is that external objects are *already*
>> pointers (they have arity 1).
>>
>>
> hmm but the external objects were already designed with thought that it
> holding some opaque handle to some external object. it does not neccessary
> should be a pointer .. it could be some handle,
> that doesn't points directly to some memory location (like in windows api
> etc).
> they only difference is only in what function(s) expecting, e.g. if it
> takes a handle value - you should pass it by value (the handle itself), and
> when it asks for pointer to the handle (to fill a return value , for
> instance) - you passing a pointer to it.
>
>
>

Agree that should not make a difference, a handle can be anything, a
pointer, an int, ...

Technically, the git_tree are not opaque, they are struct.
BUT we don't care at all of their layout/fields from within Pharo and want
to treat them as Opaque.
- First reason: declaring fields would make the Pharo code fragile versus
future evolution of libgit2 because any change in the layout would create a
dangerous mismatch.
- Second reason: we never have to allocate space for those struct, the
library cares for us.
- Third reason: libgit2 APi will never deal with a git_tree, only git_tree*
or git_tree**

So I think that you made a good choice to use a sort-of-handle (git_tree*).
I am now certain that it was the exact intention of FFIExternalObject

Maybe we could rename superclass to FFIExternalObjectPointer to make
explicit that it adds a pointer arity?
We must also provide examples in class comment (or link to a Help page...)

Example: I have an external object bar from library foo (a foo_bar).

    C signature: void foo_create_bar(foo_bar **bar);
    FFIExternalObjectPointer subclass: FooBar.

a Smalltalk FooBar now represents a pointer to a foo_bar (a foo_bar *).
And the library will be invoked with:

    self call: #(void foo_create_bar(FooBar *self))

If the library function return a pointer to a foo_bar

    C signature: foo_bar *foo_bar_alloc(size_t size);

then we interface it like that:

    handle pointerAt: 1 put: (self call: #(FooBar foo_bar_alloc( size_t
size))).

Note that the handle has one more indirection (it is equivalent to a
foo_bar**)
And when wanting to use a the foo_bar:

    C signature: void foo_use_bar(foo_bar *bar);
    self call: #(void foo_use_bar(FooBar self)).

Still it's not obvious to distinguish why LGitTree has different pointer
arity than LGitDiffOption at first sight...
Naming them LGitTreeHandle or LGitTreePointer is maybe too much?


this are remaining of old nativeboost and we created the opaque objects to
>> not need this “cheating”… just, I wanted to change them but is a lot of
>> work and I didn’t have the time.
>>
>> I agree it stinks.
>>
>> Esteban
>>
>> On 8 Nov 2017, at 19:06, Nicolas Cellier <nicolas.cellier.aka.nice at gmai
>> l.com> wrote:
>>
>> I was trying to inquire about my latest vm crash in libgit2
>>
>> https://pharo.fogbugz.com/f/cases/20655/vm-crash-in-libgit-
>> git_tree_free-throw-suspiscion-on-potential-double-free-problem
>>
>> when I found this bizarre prototype:
>>
>> ^ self
>>         callUnchecked:
>>             #(LGitReturnCodeEnum git_diff_tree_to_tree #(LGitDiff * diff
>> , LGitRepository repo , LGitTree old_tree , LGitTree new_tree ,
>> LGitDiffOptions * opts))
>>         options: #()
>>
>> While the libgit2 signature differs in indirection levels:
>> https://libgit2.github.com/libgit2/#v0.19.0/group/diff/git_
>> diff_tree_to_tree
>>
>> int git_diff_tree_to_tree(git_diff_list **diff, git_repository *repo,
>> git_tree *old_tree, git_tree *new_tree, const git_diff_options *opts);
>>
>> The fact that it does not differs for opts, made me think of a bug...
>> But no.
>>
>> opts is allocated on external c heap with:
>>
>> callUnchecked: #(LGitReturnCodeEnum git_diff_init_options(LGitDiffOptions
>> * self, LGitOptionsVersionsEnum version))
>>
>> int git_diff_init_options(git_diff_options *opts, unsigned int version);
>>
>> What you see this time is that signatures match...
>> So the LGitDiffOptions handle will be an ExternalAddress that will really
>> point on a git_diff_options
>> In other words, the handle is a git_diff_options *...
>> IOW, our LGitDiffOptions object points to an external it_diff_options so
>> it is what it promises to be.
>>
>> For the other structures not so.
>> We are lying on the level of indirection, so our LGitTree handle are not
>> really a git_tree *. They are a git_tree **. (an ExternalAddress on a
>> pointer).
>>
>> As long as we lie consistently, it's OK, because it avoids too much
>> packToArity: unpackToArity: dance.
>>
>> That's the "beauty" of C pointers.
>> We can cast them to whatever or pretend they are handle_for_anything * in
>> the intermediate level, as long as original producer and final consumer
>> agree on what it really points to.
>>
>> But from code quality perspective, it really stinks. Anyone like me
>> opening a page to get the exact signature of the function will be
>> scratching head and loose precious time. Especially when tracking vm crash
>> down.
>>
>> I'm not sure how well it's documented, I presume it's a well known
>> conscious hack from the original developpers, but such practice really
>> ought to be discussed here.
>>
>>
>>
>>
>>
>
>
> --
> Best regards,
> Igor Stasenko.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.pharo.org/pipermail/pharo-dev_lists.pharo.org/attachments/20171109/4b202899/attachment-0002.html>


More information about the Pharo-dev mailing list