The curious case of constant blocks

MD
Marcus Denker
Sat, May 20, 2023 9:02 AM

You might have come across code like this:

minHeight
	"answer the receiver's minHeight"
	^ self
		valueOfProperty: #minHeight
		ifAbsent: [2]

In the case the #minHeight property is not set, it returns 2.

Code like this is quite common. Another example is the empty ifAbsent: block:

someDictionary remove: anObject ifAbsent: []

If we analyse the system, we can easily find all of them. The best way to do this is to use the AST:

allBlocks := Smalltalk globals methods flatCollect: [:method | method ast blockNodes ].
allBlocks size. "86805"

nonInlinedBlocks := allBlocks select: [:blockNode | blockNode isInlined not].
nonInlinedBlocks size.  "36661"

"the blocks that are actually just constant"
constantBlocks := nonInlinedBlocks select: [:blockNode | blockNode isConstant].
constantBlocks size. "2572" 

So there are 2572 constant (literal) blocks. You can inspect constantBlocks to explore them.
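One way to explore the result without the inspector is to group the blocks by their source text. This is a minimal sketch, assuming that sourceCode on AST nodes and Bag>>#sortedByCount behave as in current Pharo images; the temporaries are only illustrative, and blocks that differ only in whitespace end up in different groups.

```smalltalk
| constantBlocks sources |
"Recompute the constant blocks so that the snippet runs on its own"
constantBlocks := (Smalltalk globals methods
	flatCollect: [ :method | method ast blockNodes ])
		select: [ :blockNode | blockNode isInlined not and: [ blockNode isConstant ] ].

"Group by source text: how often does each constant block occur?"
sources := (constantBlocks collect: [ :blockNode | blockNode sourceCode ]) asBag.

"The ten most frequent constant blocks, as count -> source associations"
sources sortedByCount first: 10.
```

The exact counts depend on the image and the loaded packages.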

Constant or empty blocks ([] is just [nil]) do not feel like something to think too much about.

After all, they just return the literal when you send #value to them. What could be the problem?

But they are blocks, and in a system without clean blocks they are full blocks, which means they are created at runtime on every execution of the code that contains the block. And because they are blocks, a CompiledBlock is created for each one; sending #value executes its bytecode, and the JIT has to generate machine code for it.

For Morph>>#minHeight the bytecode would be:

 "'49 <4C> self
50 <20> pushConstant: #minHeight
51 <F9 01 00> fullClosure:a CompiledBlock: [2] NumCopied: 0
54 <A2> send: valueOfProperty:ifAbsent:
55 <5C> returnTop'"

This is expensive! [2] is the same as 2 (the only thing we can do with the block is to send #value, and we can do that with the literal directly).

[ 2 value ] bench.
[ [2] value ] bench

218625362.000/25750416.833 "8.490167884188363"

So there is a difference of more than a factor of 8 between the two for "create and evaluate"!
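The comparison can also be reproduced with a fixed iteration count instead of reading the numbers off two bench results. This is only a rough sketch: the iteration count is arbitrary, the loop itself adds the same overhead to both runs (so the ratio will be lower than the one above), and on an image where constant blocks are already optimised the gap shrinks further.

```smalltalk
| iterations literalTime blockTime |
iterations := 10000000.

"Send #value to the literal directly"
literalTime := Time millisecondsToRun: [ iterations timesRepeat: [ 2 value ] ].

"Create the block [ 2 ] and evaluate it on every iteration"
blockTime := Time millisecondsToRun: [ iterations timesRepeat: [ [ 2 ] value ] ].

"Ratio of the two timings; the max: guards against a zero divisor"
blockTime / (literalTime max: 1).
```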

This has led people to actually rewrite code to use the literal directly, e.g. we could just change it to:

minHeight
	"answer the receiver's minHeight"
	^ self
		valueOfProperty: #minHeight
		ifAbsent: 2

I am guilty of doing this sometimes when optimizing for performance, but it does not feel nice. It is yet another performance rule to think about, and the number of constant blocks in the system shows that this is not how people want to write code. And, most importantly, it only works for zero-argument constant blocks, as literals understand #value, but not #value:, #value:value: and so on.
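The limitation is easy to see in a playground. The sketch below assumes that Dictionary>>#at:ifAbsent: simply sends #value to its second argument, which is what makes the literal trick work in the first place; the temporaries are illustrative.

```smalltalk
| withBlock withLiteral failedSelector |
"A block and a plain literal are interchangeable as a zero-argument ifAbsent: argument"
withBlock := Dictionary new at: #missing ifAbsent: [ 2 ].
withLiteral := Dictionary new at: #missing ifAbsent: 2.

"A literal does not understand #value:, so it cannot replace a one-argument block"
failedSelector := [ 2 value: #ignored ]
	on: MessageNotUnderstood
	do: [ :error | error message selector ].
```

Both withBlock and withLiteral should end up as 2, while failedSelector should be #value:, the message the literal could not answer.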

So what can we do? The first thing (and I am sure you are thinking about it already) is the idea of clean blocks. Clean blocks are blocks that, to be created, only need information that the compiler has statically at compile time. You can look at RBProgramNode>>#isClean and the overrides RBBlockNode>>#isClean and RBVariableNode>>#isClean to see the exact cases, but here all you need to know is that a constant block, as it accesses nothing, is of course the trivial case of a clean block.
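The same AST query used earlier can be extended to see how many blocks would qualify as clean. This is a sketch that only relies on the #isClean, #isInlined and #isConstant queries mentioned in this post; the temporaries are illustrative and the numbers depend on the image.

```smalltalk
| nonInlinedBlocks cleanBlocks |
nonInlinedBlocks := (Smalltalk globals methods
	flatCollect: [ :method | method ast blockNodes ])
		reject: [ :blockNode | blockNode isInlined ].

"Ask each block node whether the compiler could create it statically"
cleanBlocks := nonInlinedBlocks select: [ :blockNode | blockNode isClean ].
cleanBlocks size.

"Constant blocks are the trivial case, so each of them should also report isClean"
(nonInlinedBlocks select: [ :blockNode | blockNode isConstant ])
	allSatisfy: [ :blockNode | blockNode isClean ].
```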

If we compile them as clean blocks, we immediately move creation to compile time, and the runtime cost of creating them becomes the same as for a literal, with the added benefit that constant blocks with arguments are supported, too.

But using 2 instead of [2] is not only faster for creation, it is faster when evaluating, too. The reason is that "2 value" sends #value, which executes Object>>#value, which is:

value

	^self

Which is a Quick return self method, aka a primitive:

self symbolic   "'Quick return self'"

This is very fast. Even as a clean block, however, every block has its own method (a CompiledBlock) that the VM has to execute and thus create code for:

self symbolic

"'25 <20> pushConstant: 2
26 <5E> blockReturn'"

It seems that, for the JIT, it does not make much of a difference that one is a quick return and the other a push/return; it matters more for the interpreter. But the JIT has to create code for every constant block, and #value means executing BlockClosure>>#value, which triggers execution of that CompiledBlock.

We thus have to execute two methods, not one. And the JIT has to cache all the generated code.

So can we do better? It is actually easy to implement a class ConstantBlockClosure, a subclass of CleanBlockClosure, that implements all the #value methods to just return the constant value:

value
	^literal

Thus we get the same as with sending #value to the literal directly: we send #value, we execute one method that is a quick return.
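The idea can be tried out with a small stand-in class. This is a hypothetical sketch, not the actual ConstantBlockClosure: the class name SketchConstantBlock, the package name and the #literal: setter are made up for illustration, and the class inherits from Object rather than CleanBlockClosure.

```smalltalk
| sketchClass constantBlock |
"Hypothetical stand-in class, created only for this experiment"
sketchClass := Object subclass: #SketchConstantBlock
	instanceVariableNames: 'literal'
	classVariableNames: ''
	package: 'ConstantBlockSketch'.

"Setter for the constant, and #value answering it directly"
sketchClass compile: 'literal: anObject
	literal := anObject'.
sketchClass compile: 'value
	^ literal'.

constantBlock := sketchClass new.
constantBlock literal: 2.
constantBlock value.

"A method that only returns an instance variable should be compiled as a quick return"
(sketchClass >> #value) isQuick.
```

The snippet installs a class in the image; sending removeFromSystem to sketchClass cleans it up afterwards.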

And the good news: there is #optionConstantBlockClosure in the compiler, and it is enabled by default in Pharo11!

The reason why we can turn on constant blocks without problems is that they are never on the stack, so we do not need to fix all the tools to know how to deal with them.
(Constant blocks actually do have a CompiledBlock, so that e.g. “senders of” can check the literals just as it would for a normal clean block; it is just never executed.)

If we go back to our method #minHeight, this means the bytecode now looks like this:

self symbolic "'49 <4C> self
50 <20> pushConstant: #minHeight
51 <21> pushConstant: [2]
52 <A2> send: valueOfProperty:ifAbsent:
53 <5C> returnTop'"

Thus, in Pharo 11, the execution path of all the more than 2500 constant blocks ends up in one of the #value methods of ConstantBlockClosure. To get all the exceptions correct when sending e.g. #value to a 1-arg block, there are subclasses for 1, 2 and 3 arguments; we do not support the 4-arg case for now (there are not many).
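A quick playground check could look like the sketch below. It assumes a Pharo 11 image with the option enabled and that playground code is compiled with the same option; the exact error class raised for the argument mismatch is not stated here, so the handler catches Error in general.

```smalltalk
| block result |
block := [ 2 ].

"With the option enabled, this is expected to answer ConstantBlockClosure"
block class.
block value.

"Sending #value to a one-argument constant block must still fail like a normal block"
result := [ [ :each | 2 ] value. #noError ]
	on: Error
	do: [ :error | error class name ].
```

If result is still #noError, the argument check did not fire; otherwise it holds the name of the error class that was signalled.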

If you want to check that this really works, go to ConstantBlockClosure>>#value and add a Counter via the Debug menu. It is called a lot!

Marcus
SD
stephane ducasse
Sat, May 20, 2023 1:53 PM

Thanks, Marcus.
I will turn this post into a doc for P12.

MD
Marcus Denker
Mon, May 22, 2023 12:12 PM

I have put a slightly improved version here. At the end, it adds a discussion showing that the mapping still
works (you can inspect ConstantBlockClosure allSubInstances, and it can even show the block highlighted
in the home method).

https://blog.marcusdenker.de/constant-blocks-in-pharo11

I will put this in the Queue for the Pharo Dev blog next.

Marcus

SD
stephane ducasse
Mon, May 22, 2023 8:15 PM

I love the idea of a blog post.
I like being able to read about it whenever I want.

S

On 22 May 2023, at 14:12, Marcus Denker marcus.denker@inria.fr wrote:

I have put a slightly improved version here. At the end, it adds a discussion showing that the mapping still
works (you can inspect ConstantBlockClosure allSubInstances, and it can even show the block highlighted
in the home method).

https://blog.marcusdenker.de/constant-blocks-in-pharo11

I will put this in the Queue for the Pharo Dev blog next.

Marcus
