Collection>>reduce naming

Richard O'Keefe
Sat, Apr 15, 2023 2:28 PM

Let
initial :: a
combine :: a -> b -> a
finish  :: a -> c
then
(finish . foldl combine initial) :: Foldable t => t b -> c

This appears to be analogous to your 'reduce with completion'.
What can be done with composition generally should be done
with composition; I really don't see any advantage in
defining
reduce finish combine initial = finish . foldl combine initial

As for growing a collection and then finally trimming it to its
desired size, I've never known anything like that to be a good
idea.  There's a reason why my library includes
Set                  BoundedSet
IdentitySet          BoundedIdentitySet
Deque                BoundedDeque
Heap                BoundedHeap
I use BoundedHeap a lot.  If I want to find the k best things
out of n, using BoundedHeap lets me do it in O(n.log k) time
and O(k) space instead of O(n.log n) time and O(n) space.
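
For illustration, the same k-best-of-n idea can be emulated in stock
Pharo with a plain Heap capped at k elements (BoundedHeap itself is
library-specific; this sketch assumes the elements respond to #< and #>):

| kBest |
kBest := [:aCollection :k |
    | heap |
    heap := Heap sortBlock: [:a :b | a < b]. "min-heap: worst survivor on top"
    aCollection do: [:each |
        heap size < k
            ifTrue: [heap add: each]
            ifFalse: [each > heap first ifTrue: [heap removeFirst. heap add: each]]].
    heap asSortedCollection: [:a :b | a > b]].
(kBest value: #(5 1 9 3 7 8 2) value: 3) asArray. "=> #(9 8 7)"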

Your example showing how much better #reduceLeft: is when
expressed using transducers is not well chosen, because the
implementation of #reduceLeft: in Pharo is (how to say this
politely?) not up to the standards we expect from Pharo.
Here is how it stands in my compatibility library:

Enumerable
methods for: 'summarising'
reduceLeft: aBlock
  <compatibility: #pharo>
  |n|
  ^(n := aBlock argumentCount) > 2
     ifTrue: [
       |i a|
       a := Array new: n.
       i := 0.
       self do: [:each |
         a at: (i := i + 1) put: each.
         i = n ifTrue: [
           a at: (i := 1) put: (aBlock valueWithArguments: a)]].
       i = 1 ifTrue: [a at: i] ifFalse: [
         self error: 'collection/block size mismatch']]
     ifFalse: [
       "If n = 0 or 1 and the receiver has two or more elements,
        this will raise an exception passing two arguments to aBlock.
        That's a good way to complain about it."
       |r f|
       r := f := nil.
       self do: [:each |
         r := f ifNil: [f := self. each]
                ifNotNil: [aBlock value: r value: each]].
       r == f ifTrue: [CollectionTooSmall collection: self]
              ifFalse: [r]]

No OrderedCollection.  And exactly one traversal of the collection.
(Enumerables can only be traversed once.  Collections can be traversed
multiple times.)  The only method the receiver needs to provide
is #do:, so this method doesn't really need to be in Enumerable.
It could, for example, be in Block.

#reduceRight: I've placed in AbstractSequence because it needs
#reverseDo:, but it equally makes sense as a method of Block
(as it knows more about what Blocks can do than it does about
what sequences can do).  For example, I have SortedSet and
SortedBag which can be traversed forwards and backwards but only
#reduceLeft: is available to them; #reduceRight: is not, despite
them sensibly supporting #reverseDo:.  For that matter, Deques
support #reverseDo: but are not sequences...
To the limited extent that I understand the Transducers package,
this view that (s reduce{Left,Right}: b) should really be
(b reduce{Left,Right}) applyTo: s
seems 100% in the spirit of Transducers.

In fact, now that I understand a bit about Transducers,
<collection> reduce: <transducer>
(a) seems back to front compared with
<reducer> applyTo: <collection>
(b) seems like a misnomer because in general much may be
generated, transformed, and retained, with nothing
getting smaller, so why 'reduce'?
It seems like a special case of applying a function-like
object (the transducer) to an argument (the collection),
so why not
transducer applyTo: dataSource
transducer applyTo: dataSource initially: initial

Did you look at Richard Waters' "obviously synchronizable series
expressions"?  (MIT AI Memo 958A and 959A)  Or his later SERIES
package?  (Appendix A of Common Lisp the Language, 2nd edition)

On Fri, 14 Apr 2023 at 22:32, Steffen Märcker merkste@web.de wrote:

Hi Richard!

Thanks for sharing your thoughts.

There's a reason why #inject:into: puts the block argument
last.  It works better to have "heavy" constituents on the
right in an English sentence, and it's easier to indent
blocks when they come last.

Nice, I never thought of it this way. I always appreciate a historical
background.

Let me try to respond to the rest with a focus on the ideas. First of all,
the point of the transducers framework is to cleanly separate between
iteration over sequences, processing of the elements and accumulation of a
result. It enables easy reuse of the concepts common to different data
structures.

  1. What do I mean by completion? If we iterate over a sequence of objects,
    we sometimes want to do a final step after all elements have been seen to
    complete the computation. For instance, after copying elements to a new
    collection, we may want to trim it to its actual size:

distinct := col inject: Set new into: #add.
distinct trim.

For some cases it turns out to be useful to have an object that knows how
to do both:

distinct := col reduce: (#add completing: #trim) init: Set new.

#reduce:init: knows how to deal with both these new objects and ordinary
blocks. For now I will call both variants a "reducing function". Note,
completion is completely optional and the implementation is literally
#inject:into: plus completion if required.
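
Sketched in plain Pharo terms, #reduce:init: could look roughly like
this (the completion selector #completeValue: is purely illustrative,
not the actual Transducers code; for a plain block it would simply
answer the result unchanged):

reduce: aReducingFunction init: anInitialValue
  "Behaves like #inject:into:, then gives the reducing function a
   chance to perform its optional completion step on the result."
  | result |
  result := anInitialValue.
  self do: [:each | result := aReducingFunction value: result value: each].
  ^aReducingFunction completeValue: result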

  2. What is a reduction? In some cases, it turns out to be useful to pair
    up a reducing function with an (initial) value. You called it a magma and
    often it's indeed the neutral element of a mathematical operator, e.g., +
    and 0. But we can use a block that yields the initial value, too. For
    instance:

sum := #+ init: 0.
result := numbers reduce: sum.

toSet := (#add completing: #trim) initializer: [Set new].
distinct := col reduce: toSet.

#reduce: is just a shorthand that passes the function and the value to
#reduce:init:. Maybe #reduceMagma: is a reasonable name?
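
Sketched in code, the shorthand could read roughly as follows (the
accessors #function and #initialValue are illustrative names only, not
the actual selectors):

reduce: aReduction
  "Shorthand: unpack the reduction into its function and initial value."
  ^self reduce: aReduction function init: aReduction initialValue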

  3. The reason I implemented #reduce:init: directly on collections,
    streams, etc. is that these objects know best how to efficiently iterate
    over their elements. And if a data structure knows how to #reduce:init: we
    can use it with all the transducers functions, e.g., for partitioning,
    filtering etc. Other useful methods could then be added to the behaviour
    with a trait, e.g., #transduce:reduce:init: which first applies a transducer
    and then reduces. As traits are not available in plain VW 8.3, I did not
    try this approach, though.

  4. Let's take #reduceLeft: as an example and reimplement the method using
    transducers, shall we? The following code works for each
    sequence/collection/stream that supports #transduce:reduce:init:

reduceLeft: aBlock
  | head rest arity |
  head := self transduce: (Take number: 1) reduce: [:r :e | e] init: nil.
  rest := Drop number: 1.

  arity := aBlock arity.
  ^arity = 2
    ifTrue: [self transduce: rest reduce: aBlock init: head]
    ifFalse: [
      | size arguments |
      size := arity - 1.
      rest := rest * (Partition length: size)
                   * (Remove predicate: [:part | part size < size]).
      arguments := Array new: arity.
      arguments at: 1 put: head.
      self
        transduce: rest
        reduce: ([:args :part |
          args
            replaceFrom: 2 to: arity with: part;
            at: 1 put: (aBlock valueWithArguments: args);
            yourself] completing: [:args | args first])
        init: arguments]

This code is both more general and faster: It does not create an
intermediate OrderedCollection and it treats the common case of binary
blocks efficiently. Note that the implementation could be more compact and
optimized if it were specialized in a certain class. For instance,
SequenceableCollection allows accessing elements by index, which turns the
first line into a simple "self first".

Thanks for staying with me for this long reply. I hope I did not miss a
point. I do not insist on the existing names but will appreciate any ideas.

Best, Steffen

Richard O'Keefe wrote on Friday, 14 April 2023 09:43:32 (+02:00):

#reduce: aReduction
Are you saying that aReduction is an object from which
a dyadic block and an initial value can be derived?
That's going to confuse the heck out of Dolphin and Pharo
users (like me, for example).  And in my copy of Pharo,
#reduce: calls #reduceLeft:, not #foldLeft:.
The sad thing about #reduceLeft: in Pharo is that in order
to provide extra generality I have no use for, it fails to
provide a fast path for the common case of a dyadic block.

reduceLeft: aBlock
  aBlock argumentCount = 2 ifTrue: [
    |r|
    r := self first.
    self from: 2 to: self size do: [:each |
      r := aBlock value: r value: each].
    ^r].
  ... everything else as before ...

Adding up a million floats takes half the time using the
fast path (67 msec vs 137 msec).  Does your #reduce:
also perform "a completion action"?  If so, it definitely
should not be named after #inject:into:.
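
A quick way to reproduce such a measurement in stock Pharo (timings
will of course vary with machine and image version):

| floats |
floats := Array new: 1000000 withAll: 1.0.
[floats inject: 0.0 into: [:a :b | a + b]] timeToRun.  "seeded binary fold"
[floats reduce: [:a :b | a + b]] timeToRun.            "the seedless variant under discussion"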

At any rate, if it does something different, it should have
a different name, so #reduce: is no good.

#reduce:init:
There's a reason why #inject:into: puts the block argument
last.  It works better to have "heavy" constituents on the
right in an English sentence, and it's easier to indent
blocks when they come last.

Which of the arguments here specifies the 'completion action'?
What does the 'completion action' do?  (I can't tell from the name.)

I think the answer is clear:

  • choose new intention-revealing names that do not clash.

If I have understood your reduce: aReduction correctly,
a Reduction specifies

  • a binary operation (not necessarily associative)
  • a value which can be passed to that binary operation
    which suggests that it represents a magma with identity.
    By the way, it is not clear whether
    {x} reduce: <<ident. binop>>
    answers x or binop value: ident value: x.
    It's only when ident is an identity for binop that you
    can say 'it doesn't matter' (see the example below).
    I don't suppose you could bring yourself to call
    aReduction aMagmaWithIdentity?
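
A concrete case where the two readings disagree, using subtraction
(0 is only a right identity for it):

#(5) inject: 0 into: [:a :b | a - b].  "reading (b): 0 - 5 = -5"
"reading (a) would answer the sole element, 5"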

Had you considered
aMagmaWithIdentity reduce: aCollection
where the #reduce: method is now in your class so
can't technically clash with anything else?
All you really need from aCollection is #do: so
it could even be a stream.

MagmaWithIdentity

identity
combine:with:
reduce: anEnumerable

  |r|
  r := self identity.
  anEnumerable do: [:each | r := self combine: r with: each].
  ^r

MagmaSansIdentity

combine:with:
reduce: anEnumerable

  |r f|
  f := r := nil.
  anEnumerable do: [:each |
    r := f ifNil: [f := self. each] ifNotNil: [self combine: r with: each]].
  f ifNil: [anEnumerable error: 'is empty'].
  ^r

On Fri, 14 Apr 2023 at 05:02, Steffen Märcker merkste@web.de wrote:

The reason I came up with the naming question in the first place is that
I (finally!) finished my port of Transducers to Pharo. But currently, I am
running into a name clash. Maybe you have some good ideas how to resolve
the following situation in a pleasant way.

  • #fold: exists in Pharo and is an alias of #reduce:
  • #reduce: exists in Pharo and calls #foldLeft: which also deals with
    more than two block arguments

Neither of which is present in VW. Hence, I used the following messages
in VW with no name clash:

  • #reduce: aReduction "= block + initial value"
  • #reduce:init: is similar to #inject:into: but executes an additional
    completion action

Some obvious ways to avoid a clash in Pharo are:

  1. Make #reduce: distinguish between a reduction and a simple block (e.g.
    by double dispatch)
  2. Rename the transducers #reduce: to #injectInto: and adapt
    #inject:into: to optionally do the completion
  3. Find another selector that is not too counter-intuitive

All three approaches have some downsides in my opinion:

  1. Though straightforward to implement, both flavors behave quite
    differently, especially with respect to the number of block arguments. The
    existing one creates a SequenceableCollection and partitions it according
    to the required number of args. Transducers' #reduce: considers binary
    blocks as the binary fold case but ternary blocks as fold with indexed
    elements.
  2. This is a real extension of #inject:into: but requires touching
    multiple implementations of that message. Something I consider undesirable.
  3. Currently, I cannot think of a good name that is not too far away from
    what we're familiar with.

Do you have some constructive comments and ideas?

Kind regards,
Steffen

Steffen Märcker wrote on Thursday, 13 April 2023 17:11:15 (+02:00):

:-D I don't know how compress made it onto that site. There is not even an
example in the list of language examples where fold/reduce is named
compress.

Richard O'Keefe wrote on Thursday, 13 April 2023 16:34:29 (+02:00):

OUCH.  Wikipedia is as reliable as ever, I see.
compress and reduce aren't even close to the same thing.
Since the rank of the result of compression is the same
as the rank of the right operand, and the rank of the
result of reducing is one lower, they are really quite
different.  compress is Fortran's PACK.
https://gcc.gnu.org/onlinedocs/gfortran/PACK.html

On Fri, 14 Apr 2023 at 01:34, Steffen Märcker merkste@web.de wrote:

Hi Richard and Sebastian!

Interesting read. I obviously was not aware of the variety of meanings
for fold/reduce. Thanks for pointing this out. Also, in some languages it
seems the same name is used for both reductions with and without an initial
value. There's even a list on WP on the matter:
https://en.wikipedia.org/wiki/Fold_%28higher-order_function%29#In_various_languages

Kind regards,
Steffen

Richard O'Keefe wrote on Thursday, 13 April 2023 13:16:28 (+02:00):

The standard prelude in Haskell does not define anything
called "fold".  It defines fold{l,r}{,1} which can be
applied to any Foldable data (see Data.Foldable).  For
technical reasons having to do with Haskell's
non-strict evaluation, foldl' and foldr' also exist.
But NOT "fold".

https://hackage.haskell.org/package/base-4.18.0.0/docs/Data-Foldable.html#laws

On Thu, 13 Apr 2023 at 21:17, Sebastian Jordan Montano <
sebastian.jordan@inria.fr> wrote:

Hello Steffen,

Let's take Kotlin documentation (
https://kotlinlang.org/docs/collection-aggregate.html#fold-and-reduce)

The difference between the two functions is that fold() takes an
initial value and uses it as the accumulated value on the first step,
whereas the first step of reduce() uses the first and the second elements
as operation arguments on the first step.

Naming is not so consistent across programming languages; they mix
up the names "reduce" and "fold". For example, in Haskell "fold" does not
take an initial value, so it is like a "reduce" in Kotlin. In Kotlin, Java,
Scala and other OO languages "reduce" does not take an initial value while
"fold" does. Pharo aligns with those languages (except that our fold is
called #inject:into:)
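
In Pharo terms the same contrast reads:

#(1 2 3) inject: 10 into: [:a :b | a + b].  "=> 16, seeded like Kotlin's fold()"
#(1 2 3) reduce: [:a :b | a + b].           "=> 6, seedless like Kotlin's reduce()"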

So for me the Pharo methods #reduce: and #inject:into: represent well
what they are doing and they are well named.

Cheers,
Sebastian

----- Original Message -----

From: "Steffen Märcker" <merkste@web.de>
To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
Sent: Wednesday, 12 April 2023 19:03:01
Subject: [Pharo-users] Collection>>reduce naming

Hi!

I wonder whether there was a specific reason to name this method #reduce:?
I would have expected #fold: as this is the more common term for what it
does. And in fact, even the comment reads "Fold the result of the receiver
into aBlock." Whereas #reduce: is the common term for what we call with
#inject:into: .

I am asking not to annoy anyone but out of curiosity. I figured this out
only by some weird behaviour after porting some code that (re)defines
#reduce .

Ciao!
Steffen

--
Sent with Vivaldi Mail. Download Vivaldi for free from vivaldi.com.

Steffen Märcker
Mon, Apr 17, 2023 6:26 PM

Dear Richard,

sorry for the delayed reply and thanks for your thoughts. I wrapped my head around your suggestions and tried a few things to see how they play out in code. I tend towards the following names.

  • Reducing function (a role) is a binary block / object that responds to #value:value: and #numArgs; commonly used with #inject:into: (see the snippet after this list)

  • Completing function (a role) is a unary block / object that responds to #value:.

  • Reducer is a pair (rfn, cfn) of a reducing function and a completing function.

  • Reduction is a pair (red, val) of a reducer and an initial value.
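
In stock Pharo, an ordinary binary block already fills the reducing-function role:

[:acc :each | acc + each] numArgs.                     "=> 2"
[:acc :each | acc + each] value: 3 value: 4.           "=> 7"
#(1 2 3) inject: 0 into: [:acc :each | acc + each].    "=> 6"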

I like your idea to invert the message flow and implement #reduce: in Reduction. It reads well in my eyes in actual code. I do not think that this is a misnomer, since it indeed reduces a sequence of elements to a single object. A reducer knows how to apply first its rfn to a sequence and then cfn to the result.

negatedSum := (#+ completing: #negate) init: 0.
result := negatedSum reduce: (1 to: 10).

asSet := (#add completing: #trim) initializer: [Set new].
distinct := asSet reduce: aCollection

I understand your concerns that we usually either do not need cfn or we apply it manually to the result of the reduction. Yes, most often cfn = identity and I took this into account. However, in some cases I actually need to pass them around as a single object.

I also attempted to find names closer to #inject:into: but did not come up with any good ideas yet.

Does this make sense so far?

I deliberately did not mention transducers above to focus on the naming first. I heard about "obviously synchronizable series expressions" before and had a look at the paper just now. In short, transducers are in the same spirit and offer similar nice properties. But maybe we save the details for another thread?

Regarding the examples: yes, they are simple, stupid, and the reimplementation of #reduce: is far from optimal. The goal was to show how the objects/messages are used rather than coming up with real world examples or showing off. ;-)

Kind regards,
Steffen

Richard O'Keefe wrote on Saturday, 15 April 2023 16:28:09 (+02:00):

Let
initial :: a
combine :: a -> b -> a

finish  :: a -> c
then
(finish . foldl combine initial) :: Foldable t => t b -> c

This appears to be analogous to your 'reduce with completion'.
What can be done with composition generally should be done
with composition; I really don't see any advantage in
defining
reduce finish combine initial = finish . foldl combine initial

As for growing a collection and then finally trimming it to its
desired size, I've never known anything like that to be a good
idea.  There's a reason why my library includes
Set                  BoundedSet
IdentitySet          BoundedIdentitySet
Deque                BoundedDeque
Heap                BoundedHeap
I use BoundedHeap a lot.  If I want to find the k best things
out of n, using BoundedHeap lets me do it in O(n.log k) time
and O(k) space instead of O(n.log n) time and O(n) space.

Your example showing how much better #reduceLeft: is when
expressed using transducers is not well chosen, because the
implementation of #reduceLeft: in Pharo is (how to say this
politely?) not up to the standards we expect from Pharo.
Here is how it stands in my compatibility library:

Enumerable
methods for: 'summarising'

reduceLeft: aBlock
  <compatibility: #pharo>
  |n|
  ^(n := aBlock argumentCount) > 2
     ifTrue: [
       |i a|
       a := Array new: n.
       i := 0.
       self do: [:each |
         a at: (i := i + 1) put: each.
         i = n ifTrue: [
           a at: (i := 1) put: (aBlock valueWithArguments: a)]].
       i = 1 ifTrue: [a at: i] ifFalse: [
         self error: 'collection/block size mismatch']]
     ifFalse: [
       "If n = 0 or 1 and the receiver has two or more elements,
        this will raise an exception passing two arguments to aBlock.
        That's a good way to complain about it."
       |r f|
       r := f := nil.
       self do: [:each |
         r := f ifNil: [f := self. each]
                ifNotNil: [aBlock value: r value: each]].
       r == f ifTrue: [CollectionTooSmall collection: self]
              ifFalse: [r]]

No OrderedCollection.  And exactly one traversal of the collection.
(Enumerables can only be traversed once.  Collections can be traversed
multiple times.)  The only method the receiver needs to provide
is #do:, so this method doesn't really need to be in Enumerable.
It could, for example, be in Block.

#reduceRight: I've placed in AbstractSequence because it needs
#reverseDo:, but it equally makes sense as a method of Block
(as it knows more about what Blocks can do than it does about
what sequences can do).  For example, I have SortedSet and
SortedBag which can be traversed forwards and backwards but only
#reduceLeft: is available to them; #reduceRight: is not, despite
them sensibly supporting #reverseDo:.  For that matter, Deques
support #reverseDo: but are not sequences...

To the limited extent that I understand the Transducers package,
this view that (s reduce{Left,Right}: b) should really be
(b reduce{Left,Right}) applyTo: s
seems 100% in the spirit of Transducers.

In fact, now that I understand a bit about Transducers,
<collection> reduce: <transducer>
(a) seems back to front compared with
<reducer> applyTo: <collection>
(b) seems like a misnomer because in general much may be
generated, transformed, and retained, with nothing
getting smaller, so why 'reduce'?
It seems like a special case of applying a function-like
object (the transducer) to an argument (the collection),
so why not
transducer applyTo: dataSource
transducer applyTo: dataSource initially: initial

Did you look at Richard Waters' "obviously synchronizable series
expressions"?  (MIT AI Memo 958A and 959A)  Or his later SERIES
package?  (Appendix A of Common Lisp the Language, 2nd edition)

On Fri, 14 Apr 2023 at 22:32, Steffen Märcker merkste@web.de wrote:

Hi Richard!

Thanks for sharing your thoughts.

There's a reason why #inject:into: puts the block argument
last.  It works better to have "heavy" constituents on the
right in an English sentence, and it's easier to indent
blocks when they come last.

Nice, I never thought of it this way. I always appreciate a historical background.

Let me try to respond to the rest with a focus on the ideas. First of all, the point of the transducers framework is to cleanly separate between iteration over sequences, processing of the elements and accumulation of a result. It enables easy reuse of the concepts common to different data structures.

  1. What do I mean by completion? If we iterate over a sequence of objects, we sometimes want to do a final step after all elements have been seen to complete the computation. For instance, after copying elements to a new collection, we may want to trim it to its actual size:

    distinct := col inject: Set new into: #add.
    distinct trim.

For some cases it turns out to be useful to have an object that knows how to do both:

distinct := col reduce: (#add completing: #trim) init: Set new.

#reduce:init: knows how to deal with both these new objects and ordinary blocks. For now I will call both variants a "reducing function". Note, completion is completely optional and the implementation is literally #inject:into: plus completion if required.

  2. What is a reduction? In some cases, it turns out to be useful to pair up a reducing function with an (initial) value. You called it a magma and often it's indeed the neutral element of a mathematical operator, e.g., + and 0. But we can use a block that yields the initial value, too. For instance:

    sum := #+ init: 0.
    result := numbers reduce: sum.

    toSet := (#add completing: #trim) initializer: [Set new].
    distinct := col reduce: toSet.

#reduce: is just a shorthand that passes the function and the value to #reduce:init:. Maybe #reduceMagma: is a reasonable name?

  3. The reason I implemented #reduce:init: directly on collections, streams, etc. is that these objects know best how to efficiently iterate over their elements. And if a data structure knows how to #reduce:init: we can use it with all the transducers functions, e.g., for partitioning, filtering etc. Other useful methods could then be added to the behaviour with a trait, e.g., #transduce:reduce:init: which first applies a transducer and then reduces. As traits are not available in plain VW 8.3, I did not try this approach, though.

  4. Let's take #reduceLeft: as an example and reimplement the method using transducers, shall we? The following code works for each sequence/collection/stream that supports #transduce:reduce:init:

    reduceLeft: aBlock

    | head rest arity |
    head := self transduce: (Take number: 1) reduce: [:r :e | e] init: nil.
    rest := Drop number: 1.

    arity := aBlock arity.
    ^arity = 2
    ifTrue: [self transduce: rest reduce: aBlock init: head]
    ifFalse: [
    | size arguments |
    size := arity - 1.
    rest := rest * (Partition length: size) * (Remove predicate: [:part | part size < size]).
    arguments := Array new: arity.
    arguments at: 1 put: head.
    self
    transduce: rest
    reduce: ([:args :part |
    args
    replaceFrom: 2 to: arity with: part;
    at: 1 put: (aBlock valueWithArguments: args);
    yourself] completing: [:args | args first])
    init: arguments]

This code is both more general and faster: It does not create an intermediate OrderedCollection and it treats the common case of binary blocks efficiently. Note that the implementation could be more compact and optimized if it were specialized in a certain class. For instance, SequenceableCollection allows accessing elements by index, which turns the first line into a simple "self first".

Thanks for staying with me for this long reply. I hope I did not miss a point. I do not insist on the existing names but will appreciate any ideas.

Best, Steffen

Richard O'Keefe wrote on Friday, 14 April 2023 09:43:32 (+02:00):

#reduce: aReduction
Are you saying that aReduction is an object from which
a dyadic block and an initial value can be derived?
That's going to confuse the heck out of Dolphin and Pharo
users (like me, for example).  And in my copy of Pharo,
#reduce: calls #reduceLeft:, not #foldLeft:.
The sad thing about #reduceLeft: in Pharo is that in order
to provide extra generality I have no use for, it fails to
provide a fast path for the common case of a dyadic block.

reduceLeft: aBlock
  aBlock argumentCount = 2 ifTrue: [
    |r|
    r := self first.
    self from: 2 to: self size do: [:each |
      r := aBlock value: r value: each].
    ^r].
  ... everything else as before ...

Adding up a million floats takes half the time using the
fast path (67 msec vs 137 msec).  Does your #reduce:
also perform "a completion action"?  If so, it definitely
should not be named after #inject:into:.

At any rate, if it does something different, it should have
a different name, so #reduce: is no good.

#reduce:init:
There's a reason why #inject:into: puts the block argument
last.  It works better to have "heavy" constituents on the
right in an English sentence, and it's easier to indent
blocks when they come last.

Which of the arguments here specifies the 'completion action'?
What does the 'completion action' do?  (I can't tell from the name.)

I think the answer is clear:

  • choose new intention-revealing names that do not clash.

If I have understood your reduce: aReduction correctly,
a Reduction specifies

  • a binary operation (not necessarily associative)
  • a value which can be passed to that binary operation
    which suggests that it represents a magma with identity.
    By the way, it is not clear whether
    {x} reduce: <<ident. binop>>
    answers x or binop value: ident value: x.
    It's only when ident is an identity for binop that you
    can say 'it doesn't matter'.

I don't suppose you could bring yourself to call
aReduction aMagmaWithIdentity?

Had you considered
aMagmaWithIdentity reduce: aCollection
where the #reduce: method is now in your class so
can't technically clash with anything else?
All you really need from aCollection is #do: so
it could even be a stream.

MagmaWithIdentity

identity
combine:with:
reduce: anEnumerable

 |r|
 r := self identity.
 anEnumerable do: [:each | r := self combine: r with: each].
 ^r

MagmaSansIdentity

combine:with:
reduce: anEnumerable

 |r f|
 f := r := nil.
 anEnumerable do: [:each |
   r := f ifNil: [f := self. each] ifNotNil: [self combine: r with: each]].
 f ifNil: [anEnumerable error: 'is empty'].
 ^r

On Fri, 14 Apr 2023 at 05:02, Steffen Märcker merkste@web.de wrote:

The reason I came up with the naming question in the first place is that I (finally!) finished my port of Transducers to Pharo. But currently, I am running into a name clash. Maybe you have some good ideas how to resolve the following situation in a pleasant way.

  • #fold: exists in Pharo and is an alias of #reduce:
  • #reduce: exists in Pharo and calls #foldLeft: which also deals with more than two block arguments

Neither of which is present in VW. Hence, I used the following messages in VW with no name clash:

  • #reduce: aReduction "= block + initial value"
  • #reduce:init: is similar to #inject:into: but executes an additional completion action

Some obvious ways to avoid a clash in Pharo are:

  1. Make #reduce: distinguish between a reduction and a simple block (e.g. by double dispatch)
  2. Rename the transducers #reduce: to #injectInto: and adapt #inject:into: to optionally do the completion
  3. Find another selector that is not too counter-intuitive

All three approaches have some downsides in my opinion:

  1. Though straightforward to implement, both flavors behave quite differently, especially with respect to the number of block arguments. The existing one creates a SequenceableCollection and partitions it according to the required number of args. Transducers' #reduce: considers binary blocks as the binary fold case but ternary blocks as fold with indexed elements.
  2. This is a real extension of #inject:into: but requires touching multiple implementations of that message. Something I consider undesirable.
  3. Currently, I cannot think of a good name that is not too far away from what we're familiar with.

Do you have some constructive comments and ideas?

Kind regards,
Steffen

Steffen Märcker wrote on Thursday, 13 April 2023 17:11:15 (+02:00):

:-D I don't know how compress made it onto that site. There is not even an example in the list of language examples where fold/reduce is named compress.

Richard O'Keefe wrote on Thursday, 13 April 2023 16:34:29 (+02:00):

OUCH.  Wikipedia is as reliable as ever, I see.
compress and reduce aren't even close to the same thing.
Since the rank of the result of compression is the same
as the rank of the right operand, and the rank of the
result of reducing is one lower, they are really quite
different.  compress is Fortran's PACK.
https://gcc.gnu.org/onlinedocs/gfortran/PACK.html

On Fri, 14 Apr 2023 at 01:34, Steffen Märcker merkste@web.de wrote:

Hi Richard and Sebastian!

Interesting read. I obviously was not aware of the variety of meanings for fold/reduce. Thanks for pointing this out. Also, in some languages it seems the same name is used for both reductions with and without an initial value. There's even a list on WP on the matter: https://en.wikipedia.org/wiki/Fold_%28higher-order_function%29#In_various_languages

Kind regards,
Steffen

Richard O'Keefe wrote on Thursday, 13 April 2023 13:16:28 (+02:00):

The standard prelude in Haskell does not define anything
called "fold".  It defines fold{l,r}{,1} which can be
applied to any Foldable data (see Data.Foldable).  For
technical reasons having to do with Haskell's
non-strict evaluation, foldl' and foldr' also exist.
But NOT "fold".

https://hackage.haskell.org/package/base-4.18.0.0/docs/Data-Foldable.html#laws

On Thu, 13 Apr 2023 at 21:17, Sebastian Jordan Montano sebastian.jordan@inria.fr wrote:

Hello Steffen,

Let's take Kotlin documentation (https://kotlinlang.org/docs/collection-aggregate.html#fold-and-reduce)

The difference between the two functions is that fold() takes an initial value and uses it as the accumulated value on the first step, whereas the first step of reduce() uses the first and the second elements as operation arguments on the first step.

Naming is not so consistent across programming languages; they mix up the names "reduce" and "fold". For example, in Haskell "fold" does not take an initial value, so it is like a "reduce" in Kotlin. In Kotlin, Java, Scala and other OO languages "reduce" does not take an initial value while "fold" does. Pharo aligns with those languages (except that our fold is called #inject:into:)

So for me the Pharo methods #reduce: and #inject:into: represent well what they are doing and they are well named.

Cheers,
Sebastian

----- Original Message -----

From: "Steffen Märcker" <merkste@web.de>
To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
Sent: Wednesday, 12 April 2023 19:03:01
Subject: [Pharo-users] Collection>>reduce naming

Hi!

I wonder whether there was a specific reason to name this method #reduce:?
I would have expected #fold: as this is the more common term for what it
does. And in fact, even the comment reads "Fold the result of the receiver
into aBlock." Whereas #reduce: is the common term for what we call with
#inject:into: .

I am asking not to annoy anyone but out of curiosity. I figured this out
only by some weird behaviour after porting some code that (re)defines
#reduce .

Ciao!
Steffen

--
Sent with Vivaldi Mail. Download Vivaldi for free from vivaldi.com.

RO
Richard O'Keefe
Thu, Apr 20, 2023 8:12 AM

I've always called the block in #inject:into: the combining function.
Perhaps the reason I don't think of it as a "reducing" function is that
setOfSets inject: Set new into: [:acc :each | acc union: each]
has a block that combines its arguments in a way that makes the result
bigger, not smaller.  For what it's worth, Backus used "insert":

insert-right  /f   where  /f:〈x〉             =  x
                   and    /f:〈x1,x2,...,xn〉  =  f:〈x1, /f:〈x2,...,xn〉〉
                   and    /f:〈 〉             =  unit f

insert-left   \f   where  \f:〈x〉             =  x
                   and    \f:〈x1,x2,...,xn〉  =  f:〈\f:〈x1,...,xn-1〉, xn〉
                   and    \f:〈 〉             =  unit f
--- https://en.wikipedia.org/wiki/FP_(programming_language)
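
(In Smalltalk terms, insert-right is a right fold with no initial value.
A minimal sketch, as a hypothetical method on a sequence class that
supports #reverseDo: — where Backus answers unit f for the empty case,
this sketch simply signals an error:)

insertRight: aBlock
	"Fold from the right: f:〈x1, f:〈x2, ... xn〉...〉."
	| acc seen |
	acc := nil.
	seen := false.
	self reverseDo: [:each |
		acc := seen
			ifTrue: [aBlock value: each value: acc]
			ifFalse: [seen := true. each]].
	seen ifFalse: [self error: 'empty sequence'].
	^acc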

Every day spent debating the words is another day the software isn't
getting used,
so I'm sorry about prolonging this.  Some names are just too confusing.
Some years ago, I found it useful to add
{value:}...{optionalValue:}...
to my library for passing arguments that the receiver may
choose to ignore.  (The methods are exactly as direct as
the {value:}... methods, no extra allocations or anything.)
About the same time, Pharo independently added {cull:}...
but it was years before I noticed, because "cull" to me
means either (a) to slaughter a large proportion of a
population of wild or domestic animals, or (b) to gather
information from a large number of sources.  (The concept
"reap" is lurking somewhere in the background.)  I don't
know who came up with that name for pass-optional-argument,
but it grates so much I can't bring myself to use it and
just make do without optional parameters in Pharo.  In
fact it was this example that was at the back of my mind.
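
(For readers who have not met it, Pharo's #cull: evaluates the block,
passing the argument only when the block declares a parameter:)

[ 42 ] cull: 7.         "=> 42; the zero-argument block ignores 7"
[:x | x * 2] cull: 7.   "=> 14; the one-argument block receives 7"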

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.
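
(As a concrete reference point, a single-pass generic definition could
look like the sketch below; the selector and the ifNone: protocol are
hypothetical, and only #do: is assumed of the receiver:)

successor: target ifNone: exceptionBlock
	"Answer the smallest element larger than target, in one traversal."
	| best |
	best := nil.
	self do: [:each |
		(target < each and: [best isNil or: [each < best]])
			ifTrue: [best := each]].
	^best ifNil: exceptionBlock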

A filter + a reduce.  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

SM
Steffen Märcker
Thu, Apr 20, 2023 9:08 AM

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my list
of possible variants. I think we can come back to naming after the initial
port is done and everyone can play with it. Generally, I have learned to be
careful with names, since it is too easy to alienate others or to trigger
wrong assumptions.

New  topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am
having difficulties understanding what exactly these operations do with a
sequence of elements. Can you give an example or some pseudo/Smalltalk code
from your use case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce .  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

RO
Richard O'Keefe
Fri, Apr 21, 2023 3:33 AM

successor of x in c = the smallest element of c that is larger than x
min {y | y in c . y > x}
predecessor of x in c = the largest element of c that is smaller than x
max {y | y in c . y < x}
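
(For example, with the #successor:/#predecessor: protocol of Richard's
library as described earlier in the thread — not stock Pharo:)

c := #(1 3 5 7) asSortedCollection.
c successor: 4.    "=> 5, the smallest element larger than 4"
c predecessor: 4.  "=> 3, the largest element smaller than 4"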

On Thu, 20 Apr 2023 at 21:08, Steffen Märcker merkste@web.de wrote:

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my list
of possible variants. I think we can come back to naming after the initial
port is done and everyone can play with it. Generally, I have learned to be
careful with names, since it is too easy to alienate others or to trigger
wrong assumptions.

New  topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am
having difficulties understanding what exactly these operations do with a
sequence of elements. Can you give an example or some pseudo/Smalltalk code
from your use case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce .  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

SM
Steffen Märcker
Fri, Apr 21, 2023 8:35 AM

Hi Richard,

Now that's much clearer to me:
min{y | y in c . y > x} "strict supremum"
max{y | y in c . y < x} "strict infimum"

For the general case of a sequence (not sorted) of elements we can do

strictSupremumOf: x in: sequence
	"Answer the smallest element of sequence that is larger than x, or nil."
	^(sequence transduce filter: [:y | y > x])		"virtual sequence"
		inject: nil
		into: [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]]

I just picked a variant of minimum that answers nil if no element is found. Other variants would work, too.
The focus of transducers is on re-use and composition of processing steps. We can break this up into steps if needed:

minimum := [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil.	"reduction"
upperBounds := Filter predicate: [:y | y > x].				"transducer"
strictSup := minimum transduce: upperBounds.			"transformed reduction"
^strictSup reduce: sequence
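
(A small usage sketch to make the data flow concrete, using exactly the
Transducers selectors assumed above, with x = 4:)

| x sequence minimum upperBounds strictSup |
x := 4.
sequence := #(7 1 5 3).
minimum := [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil.
upperBounds := Filter predicate: [:y | y > x].
strictSup := minimum transduce: upperBounds.
strictSup reduce: sequence.	"=> 5: 7 and 5 pass the filter; 5 is the smaller"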

We can also use a different notation similar to a data flow:

minimum <~ upperBounds <~ sequence

Of course, if we know how the sequence is sorted, we should use another algorithm. Assuming an ascending order with no random access, we'd change minimum to stop early; the first element that passes the filter is then already the answer:

minimum := [:min :b | Stop result: b].

Kind regards,
Steffen

Richard O'Keefe wrote on Friday, 21 April 2023 05:33:44 (+02:00):

successor of x in c = the smallest element of c that is larger than x
min {y | y in c . y > x}

predecessor of x in c = the largest element of c that is smaller than x
max {y | y in c . y < x}

On Thu, 20 Apr 2023 at 21:08, Steffen Märcker merkste@web.de wrote:

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my list of possible variants. I think we can come back to naming after the initial port is done and everyone can play with it. Generally, I have learned to be careful with names, since it is too easy to alienate others or to trigger wrong assumptions.

New  topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am having difficulties understanding what exactly these operations do with a sequence of elements. Can you give an example or some pseudo/Smalltalk code from your use case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce .  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

--
Sent with Vivaldi Mail. Download Vivaldi for free at vivaldi.com

RO
Richard O'Keefe
Fri, Apr 21, 2023 10:51 PM

I'm sorry, it appears that I failed to explain the question
well enough.  I thought I'd explained earlier.

successor: target
^(self select: [:each | target < each]) min

is trivial.  What's wrong with it is that it allocates
an intermediate collection and takes two passes.
FIXING that is also trivial.

successor: target
^(self virtualSelect: [:each | target < each]) min
^^^^^^^

This does allocate something, but it's just a few words,
and a single traversal suffices.

In other languages/contexts we'd be talking about
loop fusion/listless transformation/deforestation.
It is my understanding that using Transducers would
get me this level of improvement.

The problem is that this is still a linear-time
algorithm.  If you take advantage of the order in
a SortedCollection or SortedSet, it can take logarithmic
time.  When SortedSet is implemented as a splay tree
-- as it is in my library -- iterating over all its
elements using #successor: is amortised CONSTANT time
per element.  So we need THREE algorithms:

  • worst case O(n):     select + min
  • worst case O(lg n):  binary search
  • amortised O(1):      splaying

and we want the algorithm selection to be

    A U T O M A T I C.

That's the point of using an object-oriented language.
I say what I want done and the receiver decides how to
do it.  Anything where I have to write different
calling code depending on the structure of the receiver
doesn't count as a solution.

Now we come to the heart of the problem.
The binary search algorithm is NOT a special case
of the linear search algorithm.  It is not made of
pieces that can be related to the parts of the linear
search algorithm.
The splaying algorithm is NOT a special case of the
linear search algorithm OR the binary search algorithm.
It is not made of pieces that can be related to their
parts.

So IF I want automatic selection of an appropriate
algorithm, then I have to rely on inheritance and
overriding, and in order to do that I have to have
a named method that can be overridden, and at that
point I'm no longer building a transducer out of
pluggable pieces.
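
(To make the overriding concrete, a rough sketch of what the O(lg n)
case could look like as a class-specific override — hypothetical code,
assuming a default ascending sort order, and not claimed to be the
library's actual implementation:)

SortedCollection >> successor: target
	"Binary search for the smallest element > target: O(lg n)
	 instead of the generic O(n) select + min."
	| lo hi mid |
	lo := 1.
	hi := self size.
	[lo <= hi] whileTrue: [
		mid := (lo + hi) // 2.
		target < (self at: mid)
			ifTrue: [hi := mid - 1]
			ifFalse: [lo := mid + 1]].
	lo > self size ifTrue: [^self error: 'no successor'].
	^self at: lo

(Callers keep writing "c successor: target" and the receiver picks the
algorithm; the cost is that the selector must exist as a named,
overridable method, which is exactly the tension described here.)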

So that's the point of this exercise.
How do we get
(a) composition of transducers out of pluggable parts
AND
(b) automatic selection of appropriate algorithms

On Fri, 21 Apr 2023 at 20:35, Steffen Märcker merkste@web.de wrote:

Hi Richard,

Now that's much clearer to me:
min{y | y in c . y > x} "strict supremum"
max{y | y in c . y < x} "strict infimum"

For the general case of a sequence (not sorted) of elements we can do

strictSupremumOf: x in: sequence

^(sequence transduce filter: [:y | y > x]) "virtual sequence"
inject: nil
into: [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]]

I just picked a variant of minimum that answers nil if no element is
found. Other variants would work, too.
The focus of transducers is on re-use and composition of processing steps.
We can break this up into steps if needed:

minimum := [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil.
"reduction"
upperBounds := Filter predicate: [:y | y > x]. "transducer"
strictSup := minimum transduce: upperBounds. "transformed reduction"
^strictSup reduce: sequence

We can also use a different notation similar to a data flow:

minimum <~ upperBounds <~ sequence

Of course, if we know how the sequence is sorted, we should use another
algorithm. Assuming an ascending order with no random access, we'd change
minimum to stop early:

minimum := [:min :b | Stop result: b].

Kind regards,
Steffen

Richard O'Keefe wrote on Friday, 21 April 2023 05:33:44 (+02:00):

successor of x in c = the smallest element of c that is larger than x
min {y | y in c . y > x}
predecessor of x in c = the largest element of c that is smaller than x
max {y | y in c . y < x}

On Thu, 20 Apr 2023 at 21:08, Steffen Märcker merkste@web.de wrote:

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my list
of possible variants. I think we can come back to naming after the initial
port is done and everyone can play with it. Generally, I have learned to be
careful with names, since it is too easy to alienate others or to trigger
wrong assumptions.

New  topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am
having difficulties understanding what exactly these operations do with a
sequence of elements. Can you give an example or some pseudo/Smalltalk code
from your use case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce .  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

--
Sent with Vivaldi Mail. Download Vivaldi for free at vivaldi.com

BP
Bernhard Pieber
Sat, Apr 22, 2023 6:56 AM

Hi Richard,

I really liked your concise description of the point of using an object-oriented language. It made my day.

I searched the Web for your Smalltalk library and found this link:
http://www.cs.otago.ac.nz/staffpriv/ok/astc-1711.tar.gz

Is there a newer version available somewhere?

Cheers,
Bernhard

On 22.04.2023 at 00:51, Richard O'Keefe raoknz@gmail.com wrote:

I'm sorry, it appears that I failed to explain the question
well enough. I thought I'd explained earlier.

successor: target
^(self select: [:each | target < each]) min

is trivial. What's wrong with it is that it allocates
an intermediate collection and takes two passes.
FIXING that is also trivial.

successor: target
^(self virtualSelect: [:each | target < each]) min
^^^^^^^

This does allocate something, but it's just a few words,
and a single traversal suffices.

In other languages/contexts we'd be talking about
loop fusion/listless transformation/deforestation.
It is my understanding that using Transducers would
get me this level of improvement.

The problem is that this is still a linear-time
algorithm. If you take advantage of the order in
a SortedCollection or SortedSet, it can take logarithmic
time. When SortedSet is implemented as a splay tree
-- as it is in my library -- iterating over all its
elements using #successor: is amortised CONSTANT time
per element. So we need THREE algorithms:

  • worst case O(n):     select + min
  • worst case O(lg n):  binary search
  • amortised O(1):      splaying

and we want the algorithm selection to be

A U T O M A T I C.

That's the point of using an object-oriented language.
I say what I want done and the receiver decides how to
do it. Anything where I have to write different
calling code depending on the structure of the receiver
doesn't count as a solution.

Now we come to the heart of the problem.
The binary search algorithm is NOT a special case
of the linear search algorithm. It is not made of
pieces that can be related to the parts of the linear
search algorithm.
The splaying algorithm is NOT a special case of the
linear search algorithm OR the binary search algorithm.
It is not made of pieces that can be related to their
parts.

So IF I want automatic selection of an appropriate
algorithm, then I have to rely on inheritance and
overriding, and in order to do that I have to have
a named method that can be overridden, and at that
point I'm no longer building a transducer out of
pluggable pieces.

So that's the point of this exercise.
How do we get
(a) composition of transducers out of pluggable parts
AND
(b) automatic selection of appropriate algorithms

On Fri, 21 Apr 2023 at 20:35, Steffen Märcker merkste@web.de wrote:

Hi Richard,

Now that's much clearer to me:
min{y | y in c . y > x} "strict supremum"
max{y | y in c . y < x} "strict infimum"

For the general case of a sequence (not sorted) of elements we can do

strictSupremumOf: x in: sequence

^(sequence transduce filter: [:y | y > x]) "virtual sequence"
inject: nil
into: [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]]

I just picked a variant of minimum that answers nil if no element is found. Other variants would work, too.
The focus of transducers is on re-use and composition of processing steps. We can break this up into steps if needed:

minimum := [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil. "reduction"
upperBounds := Filter predicate: [:y | y > x]. "transducer"
strictSup := minimum transduce: upperBounds. "transformed reduction"
^strictSup reduce: sequence

We can also use a different notation similar to a data flow:

minimum <~ upperBounds <~ sequence

Of course, if we know how the sequence is sorted, we should use another algorithm. Assuming an ascending order with no random access, we'd change minimum to stop early:

minimum := [:min :b | Stop result: b].

Kind regards,
Steffen

Richard O'Keefe wrote on Friday, 21 April 2023 05:33:44 (+02:00):

successor of x in c = the smallest element of c that is larger than x
min {y | y in c . y > x}
predecessor of x in c = the largest element of c that is smaller than x
max {y | y in c . y < x}

On Thu, 20 Apr 2023 at 21:08, Steffen Märcker merkste@web.de wrote:

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my list of possible variants. I think we can come back to naming after the initial port is done and everyone can play with it. Generally, I have learned to be careful with names, since it is too easy to alienate others or to trigger wrong assumptions.

New topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am having difficulties understanding what exactly these operations do with a sequence of elements. Can you give an example or some pseudo/Smalltalk code from your use case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor, predecessor :: Ord a => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets. But they make
sense for any Enumerable. So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce . Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection. That much
I see how to do with transducers. But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet. So what
should one do?

--
Sent with Vivaldi Mail. Download Vivaldi for free at vivaldi.com

RO
Richard O'Keefe
Tue, Apr 25, 2023 2:11 AM

There is a much newer version.  I've made some minor corrections today.
I really must put it up on github.
Let me get back to you about that.

On Sat, 22 Apr 2023 at 18:57, Bernhard Pieber bernhard@pieber.com wrote:

Hi Richard,

I really liked your concise description about what is the point of using
an object-oriented language. It made my day.

I searched the Web for your Smalltalk library and found this link:
http://www.cs.otago.ac.nz/staffpriv/ok/astc-1711.tar.gz

Is there a newer version available somewhere?

Cheers,
Bernhard

On 22.04.2023 at 00:51, Richard O'Keefe raoknz@gmail.com wrote:

I'm sorry, it appears that I failed to explain the question
well enough.  I thought I'd explained earlier.

successor: target
^(self select: [:each | target < each]) min

is trivial.  What's wrong with it is that it allocates
an intermediate collection and takes two passes.
FIXING that is also trivial.

successor: target
^(self virtualSelect: [:each | target < each]) min
^^^^^^^

This does allocate something, but it's just a few words,
and a single traversal suffices.

In other languages/contexts we'd be talking about
loop fusion/listless transformation/deforestation.
It is my understanding that using Transducers would
get me this level of improvement.

The problem is that this is still a linear-time
algorithm.  If you take advantage of the order in
a SortedCollection or SortedSet, it can take logarithmic
time.  When SortedSet is implemented as a splay tree
-- as it is in my library -- iterating over all its
elements using #successor: is amortised CONSTANT time
per element.  So we need THREE algorithms:

  • worst case O(n):     select + min
  • worst case O(lg n):  binary search
  • amortised O(1):      splaying

and we want the algorithm selection to be

    A U T O M A T I C.

That's the point of using an object-oriented language.
I say what I want done and the receiver decides how to
do it.  Anything where I have to write different
calling code depending on the structure of the receiver
doesn't count as a solution.

Now we come to the heart of the problem.
The binary search algorithm is NOT a special case
of the linear search algorithm.  It is not made of
pieces that can be related to the parts of the linear
search algorithm.
The splaying algorithm is NOT a special case of the
linear search algorithm OR the binary search algorithm.
It is not made of pieces that can be related to their
parts.

So IF I want automatic selection of an appropriate
algorithm, then I have to rely on inheritance and
overriding, and in order to do that I have to have
a named method that can be overridden, and at that
point I'm no longer building a transducer out of
pluggable pieces.

So that's the point of this exercise.
How do we get
(a) composition of transducers out of pluggable parts
AND
(b) automatic selection of appropriate algorithms

On Fri, 21 Apr 2023 at 20:35, Steffen Märcker merkste@web.de wrote:

Hi Richard,

Now that's much clearer to me:
min{y | y in c . y > x} "strict supremum"
max{y | y in c . y < x} "strict infimum"

For the general case of a sequence (not sorted) of elements we can do

strictSupremumOf: x in: sequence

^(sequence transduce filter: [:y | y > x]) "virtual sequence"
inject: nil
into: [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]]

I just picked a variant of minimum that answers nil if no element is
found. Other variants would work, too.
The focus of transducers is on re-use and composition of processing
steps. We can break this up into steps if needed:

minimum := [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil.
"reduction"
upperBounds := Filter predicate: [:y | y > x]. "transducer"
strictSup := minimum transduce: upperBounds. "transformed reduction"
^strictSup reduce: sequence

We can also use a different notation similar to a data flow:

minimum <~ upperBounds <~ sequence

Of course, if we know how the sequence is sorted, we should use another
algorithm. Assuming an ascending order with no random access, we'd change
minimum to stop early:

minimum := [:min :b | Stop result: b].

Kind regards,
Steffen

Richard O'Keefe wrote on Friday, 21 April 2023 05:33:44 (+02:00):

successor of x in c = the smallest element of c that is larger than x
min {y | y in c . y > x}
predecessor of x in c = the largest element of c that is smaller than x
max {y | y in c . y < x}

On Thu, 20 Apr 2023 at 21:08, Steffen Märcker merkste@web.de wrote:

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my
list of possible variants. I think we can come back to naming after the
initial port is done and everyone can play with it. Generally, I have
learned to be careful with names, since it is too easy to alienate
others or to trigger wrong assumptions.

New  topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am
having difficulties understanding what exactly these operations do with a
sequence of elements. Can you give an example or some pseudo/Smalltalk code
from your use case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce .  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

--
Sent with Vivaldi Mail. Download Vivaldi for free at vivaldi.com

SM
Steffen Märcker
Tue, Apr 25, 2023 5:36 PM

Dear Richard,

that would be nice. Your answers certainly made me curious about your Smalltalk. :-D

Regarding strict supremum/infimum, I think that transducers do not lend themselves naturally to the complete problem. How did you solve it in your library?

Out of curiosity, I came up with the following non-optimized hybrid solution. As I do not have an image at hand right now, I have not tested the code snippets yet. Let's assume the following class hierarchy:

Collection <- SortedCollection <- SplayTree

Collection>>strictSupremumOf: x
"Answer the least element of the receiver that is strictly greater than x"
^self minimum <~ (self greaterThan: x)

Collection>>minimum
"Compute the minimum element of a sequence in the order of the receiver"
^[:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil

Collection>>greaterThan: x
"Answer a sequence of elements > x in the order of the receiver"
^[:y | y > x] filter <~ self

SortedCollection>>minimum
"The first element of a sorted sequence is the minimum, so stop right away"
^[:min :b | Stop result: b] init: nil

SortedCollection>>greaterThan: x
"Take advantage of binary search (assuming #indexOf: performs one on the sorted receiver)"
| index |
index := self indexOf: x.
"1. x not found: fall back to the generic linear filter"
index = 0 ifTrue: [^super greaterThan: x].
"2. x found: skip any duplicates of x, then map the remaining indices to elements"
^[:i | self at: i] map
	<~ [:i | (self at: i) = x] dropWhileTrue
	<~ ((index + 1) to: self size)

SplayTree>>greaterThan: x
"Take advantage of the splay operation, which moves x (or a neighbour of x) to the root"
| top |
top := (self splay: x) value.
"1. x found: the elements > x are exactly the right subtree"
top = x ifTrue: [^root rightChild].
"2. x not found, root > x: the root plus its right subtree are the elements > x;
the left subtree holds only elements < x and must be left out
(#withoutLeftChild is a hypothetical helper)"
top > x ifTrue: [^root withoutLeftChild].
"3. x not found, no elements > x"
^SplayTree empty

Note that #map, #filter and #dropWhileTrue are (optional) shorthands that create the respective transducers. All collections are expected to implement #inject:into:. The strict infimum can be computed analogously, provided we can reverse the order in #lessThan: efficiently.
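
For illustration, such a shorthand can be a one-line extension on BlockClosure. The sketch below assumes transducer classes Filter, Map and DropWhile; only Filter>>predicate: appeared earlier in this thread, the other class and constructor names are illustrative, not necessarily the package's actual API:

BlockClosure>>filter
	"Answer a transducer that keeps the elements for which I answer true."
	^Filter predicate: self

BlockClosure>>map
	"Answer a transducer that replaces each element with my value for it."
	^Map block: self

BlockClosure>>dropWhileTrue
	"Answer a transducer that skips leading elements for which I answer true."
	^DropWhile predicate: self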

Kind regards,
Steffen

Richard O'Keefe wrote on Tuesday, 25 April 2023 04:11:47 (+02:00):

There is a much newer version.  I've made some minor corrections today.
I really must put it up on github.
Let me get back to you about that.

On Sat, 22 Apr 2023 at 18:57, Bernhard Pieber bernhard@pieber.com wrote:

Hi Richard,

I really liked your concise description about what is the point of using an object-oriented language. It made my day.

I searched the Web for your Smalltalk library and found this link:
http://www.cs.otago.ac.nz/staffpriv/ok/astc-1711.tar.gz

Is there a newer version available somewhere?

Cheers,
Bernhard

Am 22.04.2023 um 00:51 schrieb Richard O'Keefe raoknz@gmail.com:

I'm sorry, it appears that I failed to explain the question
well enough.  I thought I'd explained earlier.

successor: target
^(self select: [:each | target < each]) min

is trivial.  What's wrong with it is that it allocates
an intermediate collection and takes two passes.
FIXING that is also trivial.

successor: target
^(self virtualSelect: [:each | target < each]) min
^^^^^^^

This does allocate something, but it's just a few words,
and only a single traversal is needed.
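
The idea behind such a virtual select is a small wrapper that captures the receiver and the predicate and filters lazily inside #do:. A minimal sketch (not my library's actual code; the names are illustrative):

Object subclass: #VirtualSelection
	instanceVariableNames: 'source predicate'
	classVariableNames: ''
	package: 'Sketch'

VirtualSelection class>>on: aCollection select: aBlock
	^self new setSource: aCollection predicate: aBlock

VirtualSelection>>setSource: aCollection predicate: aBlock
	source := aCollection.
	predicate := aBlock

VirtualSelection>>do: aBlock
	"Enumerate the selected elements; no intermediate collection is built."
	source do: [:each |
		(predicate value: each) ifTrue: [aBlock value: each]]

Everything defined in terms of #do: (#min, #inject:into:, and so on) then works on the wrapper unchanged.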

In other languages/contexts we'd be talking about
loop fusion/listless transformation/deforestation.
It is my understanding that using Transducers would
get me this level of improvement.

The problem is that this is still a linear-time
algorithm.  If you take advantage of the order in
a SortedCollection or SortedSet, it can take logarithmic
time.  When SortedSet is implemented as a splay tree
-- as it is in my library -- iterating over all its
elements using #successor: is amortised CONSTANT time
per element.  So we need THREE algorithms:

- worst case O(n) select+min
- worst case O(lg n) binary search
- amortised O(1) splaying

and we want the algorithm selection to be

A U T O M A T I C.

That's the point of using an object-oriented language.
I say what I want done and the receiver decides how to
do it.  Anything where I have to write different
calling code depending on the structure of the receiver
doesn't count as a solution.

Now we come to the heart of the problem.
The binary search algorithm is NOT a special case
of the linear search algorithm.  It is not made of
pieces that can be related to the parts of the linear
search algorithm.
The splaying algorithm is NOT a special case of the
linear search algorithm OR the binary search algorithm.
It is not made of pieces that can be related to their
parts.

So IF I want automatic selection of an appropriate
algorithm, then I have to rely on inheritance and
overriding, and in order to do that I have to have
a named method that can be overridden, and at that
point I'm no longer building a transducer out of
pluggable pieces.

So that's the point of this exercise.
How do we get
(a) composition of transducers out of pluggable parts
AND
(b) automatic selection of appropriate algorithms
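
Concretely, the named-method shape I mean looks like this (a sketch; the selectors and the binary search are illustrative, and the override assumes the default ascending order):

Collection>>successor: target ifNone: exceptionBlock
	"Generic algorithm: one linear pass, no intermediate collection."
	| best |
	best := nil.
	self do: [:each |
		(target < each and: [best isNil or: [each < best]])
			ifTrue: [best := each]].
	^best ifNil: exceptionBlock

SortedCollection>>successor: target ifNone: exceptionBlock
	"Override: binary search for the first element > target, O(lg n)."
	| lo hi mid |
	lo := 1.
	hi := self size.
	[lo <= hi] whileTrue: [
		mid := (lo + hi) // 2.
		target < (self at: mid)
			ifTrue: [hi := mid - 1]
			ifFalse: [lo := mid + 1]].
	^lo > self size ifTrue: [exceptionBlock value] ifFalse: [self at: lo]

The call site never changes; the receiver's class picks the algorithm. That is the override point a transducer pipeline has to leave room for.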

On Fri, 21 Apr 2023 at 20:35, Steffen Märcker merkste@web.de wrote:

Hi Richard,

Now that's much clearer to me:
min{y | y in c . y > x} "strict supremum"
max{y | y in c . y < x} "strict infimum"

For the general case of a sequence (not sorted) of elements we can do

strictSupremumOf: x in: sequence


	^(sequence transduce filter: [:y | y > x])		"virtual sequence"
		inject: nil
		into: [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]]

I just picked a variant of minimum that answers nil if no element is found. Other variants would work, too.
The focus of transducers is on re-use and composition of processing steps. We can break this up into steps if needed:

minimum := [:min :b | min ifNil: [b] ifNotNil: [:a | a min: b]] init: nil.	"reduction"
upperBounds := Filter predicate: [:y | y > x].				"transducer"
strictSup := minimum transduce: upperBounds.			"transformed reduction"
^strictSup reduce: sequence

We can also use a different notation similar to a data flow:

minimum <~ upperBounds <~ sequence

Of course, if we know how the sequence is sorted, we should use another algorithm. Assuming an ascending order with no random access, we'd change minimum to stop early:

minimum := [:min :b | Stop result: b].
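
(Under the hood, early termination can work roughly like this; a sketch only, the Reduction class here and the actual mechanism in the package may differ:)

Object subclass: #Stop
	instanceVariableNames: 'result'
	classVariableNames: ''
	package: 'Sketch'

Stop class>>result: anObject
	"Wrap anObject as the final result of a reduction."
	^self new setResult: anObject

Stop>>setResult: anObject
	result := anObject

Stop>>result
	^result

Reduction>>reduce: aCollection
	"Fold aCollection with my block, starting from my initial value;
	answer early as soon as the block wraps its result in a Stop."
	| acc |
	acc := init.
	aCollection do: [:each |
		acc := block value: acc value: each.
		(acc isKindOf: Stop) ifTrue: [^acc result]].
	^acc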

Kind regards,
Steffen

Richard O'Keefe wrote on Friday, 21 April 2023 05:33:44 (+02:00):

successor of x in c = the smallest element of c that is larger than x
min {y | y in c . y > x}

predecessor of x in c = the largest element of c that is smaller than x
max {y | y in c . y < x}

On Thu, 20 Apr 2023 at 21:08, Steffen Märcker merkste@web.de wrote:

Dear Richard,

thanks for that additional piece. I'll put insert-<left/right> on my list of possible variants. I think we can come back to naming after the initial port is done and everyone can play with it. Generally, I have learned to be careful with names, since it is too easy to alienate others or to trigger wrong assumptions.

New topic! (quote below)

Honestly, my knowledge of Haskell is rather limited and rusty. Hence, I am having difficulties understanding what exactly these operations do with a sequence of elements. Can you give an example or some pseudo/Smalltalk code from your use-case and library?

Kind regards

Changing the subject a wee bit, there's an operation family
in my library, and I wonder how it would fit into Transducers?
To avoid bias, here's a specification in Haskell (for lists,
because I haven't had any luck installing Data.Witherable).

successorBy, predecessorBy :: (a -> a -> Ordering) -> a -> [a] -> a
successor,  predecessor  :: Ord a                => a -> [a] -> a

successor = successorBy compare

successorBy cmp x = minimumBy cmp . filter (\y -> cmp x y == LT)

predecessor = predecessorBy compare

predecessorBy cmp = successorBy (flip cmp)

The reason these operations exist is to pick neighbouring
elements in SortedCollections and SortedSets.  But they make
sense for any Enumerable.  So there are "generic"
definitions with overrides for those two classes.

A filter + a reduce.  Traditionally, a #select:thenFold:ifNone:
in order to avoid building an intermediate collection.  That much
I see how to do with transducers.  But you can't get the desired
override for #successor:[sortBlock:][ifNone:] by overriding
#select:thenFold:ifNone: in SortedCollection or SortedSet.  So what
should one do?

--
Sent with Vivaldi Mail. Download Vivaldi for free at vivaldi.com
