[Pharo-dev] OpalCompiler evaluate speed

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Thu Nov 23 15:41:36 EST 2017


2017-11-22 0:31 GMT+01:00 Ben Coman <btc at openinworld.com>:

>
>
> On 22 November 2017 at 05:49, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
>
>>
>>
>> 2017-11-21 14:19 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>>
>>> I have an ArbitraryPrecisionFloatTests doing an exhaustive test for
>>> printing and reevaluating all positive half precision floats.
>>>
>>> That's about 2^15, or approximately 32k, loop iterations, each of which
>>> evaluates a snippet like
>>>
>>>     (ArbitraryPrecisionFloat readFrom: '1.123' readStream numBits: 10)
>>>
>>> The test was naively written with Compiler evaluate: and was using the
>>> legacy Compiler.
>>>
>>> If I rewrite it with self class compiler evaluate:, the test times out.
>>> Let's see what increase is necessary:
>>>
>>>     [ ArbitraryPrecisionFloatTest new testPrintAndEvaluate  ] timeToRun.
>>>     -> 3s with legacy Compiler
>>>     -> 14s with OpalCompiler
>>>
>>> It's not unexpected that intermediate representation (IR) reification
>>> has a cost, but here the 4.5x is a bit too much...
>>> This test did account for 1/4 of total test duration already (3s out of
>>> 12s).
>>> With Opal, the total test duration doubles... (14s out of 23s)
>>>
>>> So let's analyze the hot spot with:
>>>
>>>     MessageTally spyOn: [ ArbitraryPrecisionFloatTest new testPrintAndEvaluate ].
>>>
>>> (I didn't use AndreasSystemProfiler because its output seems a bit
>>> garbled; no matter, since the primitives do not account for that much, a
>>> MessageTally will do the job)
>>>
>>> I first see a hot spot which does not seem that necessary:
>>>
>>>       |    |24.6% {3447ms} RBMethodNode(RBProgramNode)>>formattedCode
>>>
>>> From the comments I understand that AST-based stuff requires a pattern
>>> (DoIt) and an explicit return (^), but this expensive formatting seems too
>>> much for just evaluating. I think that we should change that.
>>>
>>> Then comes:
>>>
>>>       |    |20.7% {2902ms} RBMethodNode>>generate:
>>>
>>> which is split in two halves, AST->IR and IR->bytecode
>>>
>>>       |    |  |9.3% {1299ms} RBMethodNode>>generateIR
>>>
>>>       |    |  |  |11.4% {1596ms} IRMethod>>generate:
>>>
>>> But then I see this cost a 2nd time, which also leaves room for progress:
>>>
>>>       |                |10.9% {1529ms} RBMethodNode>>generateIR
>>>
>>>       |                |  |12.9% {1814ms} IRMethod>>generate:
>>>
>>> The first is in RBMethodNode>>generateWithSource, the second in
>>> OpalCompiler>>compile
>>>
>>> Last comes the parse time (sourceCode -> AST)
>>>
>>>       |                  13.2% {1858ms} OpalCompiler>>parse
>>>
>>> Along with semantic analysis
>>>
>>>       |                  6.0% {837ms} OpalCompiler>>doSemanticAnalysis
>>>
>>> -----------------------------------
>>>
>>> For comparison, the legacy Compiler decomposes into:
>>>
>>>       |        |61.5% {2223ms} Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>>>
>>> which more or less covers parse time + semantic analysis time.
>>> That means that Opal does a fair job for this stage.
>>>
>>> Then, the direct AST->byteCode phase is:
>>>
>>>      |      16.9% {609ms} MethodNode>>generate
>>>
>>> IR costs almost 5x on this phase, but we know it's the price to pay
>>> for the additional features that it potentially offers. If only we did
>>> it once...
>>>
>>> And that's all for the legacy one...
>>>
>>> --------------------------------------
>>>
>>> This little exercise shows that a 2x acceleration of OpalCompiler
>>> evaluate seems achievable:
>>> - simplify the uselessly expensive formatted code
>>> - generate bytecodes once, not twice
>>>
>>> Then it will be a bit more than 2x slower than legacy, which is a better
>>> trade for the yet-to-come superior features potentially brought by Opal.
>>>
>>> It would be interesting to carry out the same analysis on method compilation.
>>>
>>
>> Digging further, here is what I find:
>>
>> compile sends generate: and answers a CompiledMethod.
>> translate sends compile but throws the CompiledMethod away, and just
>> answers the AST.
>> Most senders of translate will also send generate: (thus we generate: twice
>> quite often, losing a 2x factor in compilation).
>>
>> A 2x gain is a huge gain when installing big code bases, especially if
>> the custom is to throw the image away and reconstruct it.
>> No matter if a bot does the job, it does it for twice as many watts and at
>> the end, we're waiting for the bot.
>>
>> However, before changing anything, further clarification is required:
>> translate does one more thing: it catches ReparseAfterSourceEditing and
>> retries compilation (once).
>> So my question: are there some cases when generate: will cause
>> ReparseAfterSourceEditing?
>>
>
> I don't know the full answer about other cases, but I can provide the
> background why ReparseAfterSourceEditing was added.
>
> IIRC, a few years ago with the move to an AST based system, there was a
> problem with syntax highlighting where
> the AST referenced its original source, which caused highlighting offsets
> when the referenced source was modified in the editor.
> Trying to work backwards from modified source to update the source location
> of every AST element proved an intractable problem.
> The workaround I found was to move only in a forward direction,
> regenerating the AST from source on every keystroke.
> Performance was acceptable so this became the permanent solution.
>
> I don't have access to an image to check, but you should find
> ReparseAfterSourceEditing raised only in one location near editor #changed:
> Maybe this should activate only for interactively modified code, and be
> disabled/bypassed when bulk loading code.
> For testing purposes commenting it out should not harm the system, just
> produce visual artifacts in syntax highlighting.
>
>
>
>> That could happen in the generation phase if some byte code limit is
>> exceeded, and an interactive handler corrects the code...
>> I did not see any such condition, but the code base is huge...
>>
>
> At worst, the impact should only be a temporary visual artifact. Corrected
> on the next keystroke.
> (unless ReparseAfterSourceEditing has been adopted for something other than
> its original purpose, but I'd guess not)
>
> cheers -ben
>
Hi Ben,
Thanks for the information.
We must keep ReparseAfterSourceEditing; it does its job.

But it just sounds like we have an inversion:

translate (source code->AST) calls compile (source code->AST->bytecode
in a CompiledMethod)

I would expect the other way around: if we want to compile, we need to
translate first.
If we want to translate, we don't really need to compile, unless there's a
hidden reason...
Thus my question: is there a hidden reason?
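
To make the expected ordering concrete, here is a minimal playground sketch. It is a toy model only: the blocks named translate, generate and compile are hypothetical stand-ins, not the actual OpalCompiler methods, and generate merely counts how often it runs instead of producing bytecode. The only real API assumed is RBParser parseExpression:.

```smalltalk
"Toy model only: these blocks are hypothetical stand-ins, not OpalCompiler methods."
| generateCount translate generate compile |
generateCount := 0.

"translate: source code -> AST, with no bytecode generation."
translate := [ :source | RBParser parseExpression: source ].

"generate: stand-in for AST -> IR -> bytecode; it only counts its invocations."
generate := [ :ast | generateCount := generateCount + 1. ast ].

"compile: translate first, then generate exactly once."
compile := [ :source | generate value: (translate value: source) ].

translate value: '3 + 4'.   "generateCount stays at 0"
compile value: '3 + 4'.     "generateCount is now 1"
generateCount
```

With this ordering, a caller that only needs the AST never pays for generation, and a caller that needs a CompiledMethod pays for it once.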