[Pharo-users] Set >> collect:thenDo:

Richard O'Keefe raoknz at gmail.com
Mon Sep 9 18:55:30 EDT 2019


An automatic refactoring rule that converts
  (a collect: b) do: c
to
  (a collect: b thenDo: c)
would change the semantics of the code in all of
Squeak, Pharo, and Smalltalk/X, whichever definition
of #collect:thenDo: is taken, and in Squeak would
make the code slower, not faster.

Using a different name *would* reduce confusion.

However, there is an alternative.
My library copied the #virtual{Collect,Select,Reject}:
interface from Strongtalk.  As the comment puts it,
 The idea is to do #select:, #reject:, #collect:, and some
 other collection->collection operations lazily.  As long
 as you just use #do: &c, the operations will be referred
 back to the original collection, so that there is less
 need for fusion methods like #select:thenCollect:.
 ...
 You can do
        (((( coll
            virtualSelect: [..])
            virtualCollect: [...])
            virtualSelect: [..])
            virtualCollect: [...])
          asCollection
  and only *one* new collection is actually built.

So
  (aCollection virtualCollect: transformBlock)
    do: actionBlock
*does* build an intermediate object (with two instance
variables) but *doesn't* build a whole intermediate
collection.  And that combination *could* be fused by
an automatic rule without breaking anything.

Does that sound like a way forward?
I could convert my implementation of this interface to
Pharo if people would like me to.


On Tue, 10 Sep 2019 at 01:51, Steffen Märcker <merkste at web.de> wrote:

> I think this thread indicates that the selector #collect:thenDo: may
> indeed cause some missunderstanding given standard semantics of #collect:.
> Is there any reason not to use a different one, such as #map:thenDo:, to
> make the difference clearer?
>
> A nice side effect could be a new automatic refactoring rule that
> consolidates chained #collect: and #do: sends into a faster
> #collect:thenDo: send without breaking existing code.
>
> Am .09.2019, 07:46 Uhr, schrieb Richard O'Keefe <raoknz at gmail.com>:
>
> > (1) If you want (aSet collect: collectBlock) do: doBlock
> >     you can write exactly that.  Nothing stops you, and it will
> >     be as clear and reliable as any use of Set>>collect:, which
> >     is to say NOT VERY CLEAR OR RELIABLE AT ALL.
> >
> > (2) #collect:then{Do:Select:Reject:} had no other purpose than
> >     to avoid creating an intermediate and otherwise useless
> >     collection.  If you are not trying to involve an intermediate
> >     set then it just plain wrong to use #collect:thenDo:.
> >
> > (3) Oddly enough, the reason that #collect:thenDo: exists in my
> >     library is that I copied it from Squeak, at a time when it had
> >     the same definition as Pharo and ST/X.  Had I known of the change
> >     in Squeak I would have bitterly opposed it.  The comment in the
> >     current Squeak code, that it is for readability, is 100% the
> >     reverse of the truth.  Using the version with parentheses is WAY
> >     clearer than using the portmanteau method.  Sigh.  I see they
> >     broke #collect:thenSelect: in the same way.
> >
> > (4) Let me offer you another member of the #collect:then* family.
> >
> >     Collection
> >       collect: collectBlock thenInject: initial into: injectBlock
> >         |r|
> >         r := initial.
> >         self do: [:each |
> >           r := injectBlock value: r value: (collectBlock value: each)].
> >         ^r
> >
> >     #(now is the hour for us to say goodbye) asSet
> >       collect: [:each | each size]
> >       thenInject: 0 into: [:acc :each | acc + each]
> >  => 29
> >     (#(now is the hour for us to say goodbye) asSet
> >       collect: [:each | each size])
> >       inject: 0 into: [:acc :each | acc + each]
> >  => 16
> >
> >    That is, it would be WRONG to implement #collect:thenInject:into:
> >    as #collect: followed by #inject:into:.  The point is NOT to
> >    coalesce things that the receiver might (in general, incorrectly)
> >    regard as equal.
> >
> > (5) The bottom line is that if #collect:thenDo: and its relatives did
> >     not have their present semantics in Pharo (and ST/X), they would
> >     need to be reinvented, with names that were not as clear.
> >
> > (1) Just to repeat for emphasis, if you *want* (_ collect: _) do: _
> >     then that is exactly what you should write.  There is no
> >     excuse for using #collect:thenDo: in that case.  It is NOT
> >     clearer to do so.  And you should imagine me jumping up and
> >     down screaming that sending #collect: to a set is a bad code
> >     smell which demands very explicit documentation as to why you
> >     (for example) want a Set to answer a Set but an IdentitySet
> >     to also answer a Set, not the expected IdentitySet.  (I have
> >     been bitten by that more than once.)
> >
> >
> >
> > On Mon, 9 Sep 2019 at 01:33, Herby Vojčík <herby at mailbox.sk> wrote:
> >
> >> On 8. 9. 2019 14:28, Peter Kenny wrote:
> >> > Two comments:
> >> > First, the method comment for Collection>>collect:thenDo: is "Utility
> >> method
> >> > to improve readability", which is exactly the same as for
> >> > collect:thenSelect: and collect:thenReject:. This suggests that the
> >> > *intention* of the method is not to introduce new behaviour, but
> >> simply
> >> to
> >> > provide a shorthand for the version with parentheses. For other kinds
> >> of
> >>
> >> I had that same impression.
> >>
> >> > collection this is true; just the deduping makes Set different. If we
> >> want
> >>
> >> I would be more defensive here and say that generic collection should
> >> have (collect:)do: implementation and only sequenceable collections have
> >> the optimized one (if it indeed is the case that it is a shotrhand for
> >> parenthesized one).
> >>
> >> > the different behaviour, this should be indicated by method name and
> >> > comment.
> >> > Second, if we remove asSet from the second snippet, the output is
> >> exactly
> >> > the same. It will be the same as long as the original collection has
> >> no
> >> > duplicates. Somehow the effect is to ignore the asSet. It just smells
> >> wrong.
> >> >
> >> > Peter Kenny
> >>
> >> Herby
> >>
> >> > Kasper Osterbye wrote
> >> >> The first version:
> >> >>
> >> >> (#(1 2 3) asSet collect: #odd)
> >> >> do: [ :each | Transcript show: each; cr ]
> >> >>
> >> >> is rather straight forward I believe, as collect: and do: has been
> >> around
> >> >> forever (relatively speaking).
> >> >>
> >> >>
> >> >> #(1 2 3) asSet collect: #odd
> >> >> thenDo: [ :each | Transcript show: each; cr ]
> >> >>
> >> >>
> >> >> On 8 September 2019 at 09.13.36, Richard Sargent (
> >> >
> >> >> richard.sargent@
> >> >
> >> >> ) wrote:
> >> >>
> >> >>   I am skeptical of one that relies on a specific implementation
> >> rather
> >> >> than
> >> >> a specific definition.
> >> >>
> >> >> I share your feeling. I am not sure where such a definition would
> >> come
> >> >> from. In Squeak it is defined as:
> >> >>
> >> >> collect: collectBlock thenDo: doBlock
> >> >>
> >> >> ^ (self collect: collectBlock) do: doBlock
> >> >>
> >> >> In pharo as:
> >> >>
> >> >> collect: collectBlock thenDo: doBlock
> >> >>
> >> >> ^ self do: [ :each | doBlock value: (collectBlock value: each)]
> >> >>
> >> >> I might have called the method collect:eachDo:, but we each have
> >> slightly
> >> >> different styles. What I like about the pharo version is that is a
> >> >> shorthand for something which is not achieved by mere parentheses.
> >> >>
> >> >> Best,
> >> >>
> >> >> Kasper
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
> >> >
> >>
> >>
> >>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.pharo.org/pipermail/pharo-users_lists.pharo.org/attachments/20190910/32830ee9/attachment-0001.html>


More information about the Pharo-users mailing list