[Pharo-dev] Tell me about your workflow

GOUBIER Thierry thierry.goubier at cea.fr
Mon Dec 16 14:16:41 EST 2013


Yes!

Ok, you're probably reaching over 60% of the time spent writing to disk. I'll try to think over the problem and see if I can come up with something simple and efficient... either within the current filetree format, or with another (future) format.

At the moment, this would mean that, as a rule of thumb, splitting a large package over a few sub-packages could be a good idea once we reach about 200 ~ 300 classes in a package.
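
The rule of thumb can be checked against an image with a short workspace expression. This is a minimal sketch, assuming `MCWorkingCopy allManagers` is available to enumerate the loaded packages and that `packageName` answers the package name; the `packageInfo classes size` chain is the one used further down in this thread, and the threshold of 200 is simply the lower bound of the range mentioned above.

```smalltalk
"Sketch only: list loaded packages at or above a class-count threshold."
| threshold |
threshold := 200.   "lower bound of the 200 ~ 300 range; adjust as needed"
(MCWorkingCopy allManagers
    select: [:wc | wc packageInfo classes size >= threshold])
    collect: [:wc | wc packageName -> wc packageInfo classes size]
```

Evaluating this with 'print it' or 'inspect it' answers a collection of name -> class count associations, which are the candidates for splitting into sub-packages.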

Thierry
________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Monday, 16 December 2013 14:14
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

like that?

[inline image]

On Dec 16, 2013, at 11:00 AM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

I'm happy with your test drive :)

I see that with your largest test, you are at ~6000ms for writing overall in the profiler (basicStoreVersion). Can you drill down a bit in it to reach places where it says FileReference>>writeStream (or something similar)? This is where the actual writing takes place.

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Monday, 16 December 2013 13:00
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

Okay, I'm impressed :D

filetree performs quite a bit better and

gitfiletree in 3.0 really feels great

my incentive to upgrade went up a lot
<Screen Shot 2013-12-16 at 9.57.04 AM.png>

On Dec 14, 2013, at 1:35 PM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

To install gitfiletree in 3.0:

Install OSProcess from the configuration browser, or 'do it' the following in a workspace:

Gofer new
    url: 'http://smalltalkhub.com/Pharo/MetaRepoForPharo30/main/';
    package: 'ConfigurationOfOSProcess';
    load.
((Smalltalk at: #ConfigurationOfOSProcess) project version: #stable) load.

Gofer new
    url: 'http://smalltalkhub.com/mc/ThierryGoubier/Alt30/main';
    package: 'MonticelloFileTree-Git';
    load

Then, in your Monticello browser, add repositories of type gitfiletree: in exactly the same way you would use filetree:. Beware: all save, copy, push and pull buttons on the GUI really do work and will call git on your behalf. If you only browse the repository, nothing will be committed or pushed.

The code to profile the writing time is the following:

| package dir mcR pV pVInfo |
"Use a fresh temporary directory as the filetree repository."
dir := 'temp' asFileReference.
self assert: dir exists not.
dir ensureCreateDirectory.
self assert: dir isWritable.
"Working copy of the package to save, and the version info of its most recent ancestor."
pV := MCWorkingCopy forPackage: (package := MCPackage named: 'Roassal').
pVInfo := pV ancestry ancestors first.
mcR := MCFileTreeRepository new directory: dir.
"Profile taking the snapshot and writing the package to disk."
TimeProfiler spyAllOn: [mcR
        basicStoreVersion:
            (MCVersion new
                setPackage: package
                info: pVInfo
                snapshot: pV snapshot
                dependencies: #())].
"Clean up the temporary directory."
dir deleteAll

Replace Roassal with the name of the package you want to test. The code saves the package, profiles the save, and then deletes the temporary directory.
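
If the full profiler tree is more than needed, a single wall-clock number can be taken instead. This is a minimal sketch that reuses the calls from the profiling code above and assumes `Time millisecondsToRun:` for the measurement; the directory name 'temp-timing' is made up for illustration. Unlike the profiled block, the snapshot is built before the timer starts, so only the write itself is measured.

```smalltalk
"Sketch only: time the filetree write of one package, excluding the snapshot."
| package dir repo wc version elapsed |
dir := 'temp-timing' asFileReference.   "hypothetical directory name"
self assert: dir exists not.
dir ensureCreateDirectory.
package := MCPackage named: 'Roassal'.
wc := MCWorkingCopy forPackage: package.
"Build the version (snapshot included) outside the timed block."
version := MCVersion new
    setPackage: package
    info: wc ancestry ancestors first
    snapshot: wc snapshot
    dependencies: #().
repo := MCFileTreeRepository new directory: dir.
elapsed := Time millisecondsToRun: [repo basicStoreVersion: version].
dir deleteAll.
elapsed   "milliseconds spent in basicStoreVersion:"
```

Running it a few times and comparing the numbers gives a rough idea of how much of a save is the write alone.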

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Saturday, 14 December 2013 16:10
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow


On Dec 14, 2013, at 5:53 AM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

I used to do a FileTree save and then git commands in a Terminal, and slipped up a few times:
- Multiple version saves overwriting each other... because I forgot the git commit
- Inability to explore the git history

So, when I discussed with Dale, the author of FileTree, I saw the possibility to just add the git commands I needed on top of FileTree... the result is gitfiletree.

Performance improvements... I'm wondering. From the profiling data I have, if I make writing 4x faster, I will only make a save 1.04x faster (with gitfiletree and Roassal). If to write less I have to compute a lot, do file reads and compares, md5 hashes, a diff... I'll probably replace a lot of writes by a lot of CPU time in Pharo and maybe be slower overall (not to mention more error prone).
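
The 1.04x figure above is consistent with Amdahl's law. As a rough check (the write fraction p below is back-computed from the two numbers quoted, not measured separately): if effective disk writing takes a fraction p of a save and is made s = 4 times faster, the overall speedup S is

```latex
S = \frac{1}{(1 - p) + \dfrac{p}{s}}
\qquad\Longrightarrow\qquad
p = \frac{1 - 1/S}{1 - 1/s} = \frac{1 - 1/1.04}{1 - 1/4} \approx \frac{0.0385}{0.75} \approx 0.051
```

So the quoted figures imply that effective writing is only about 5% of a gitfiletree save of Roassal; even an infinitely fast write would cap the gain at roughly 1/(1 - 0.051), or about 1.05x.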

Would you be ready to run some profiling code on your set? I can make a simple test case. I also wonder if writing less could make the git commit faster, in which case it could be valuable.

sure, send me the instructions


Also, if you want to try gitfiletree and your repository, what you can try is just browse your repository and browse versions of your packages (browse, not load: a browse will load your package code components but not compile the code).


ok, how should I install it?

(I noticed that the use of git archive makes gitfiletree slower at package read time, but, since this is dwarfed by the Pharo compilation time when loading such large packages, I don't care :)).

Thierry
________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Saturday, 14 December 2013 01:54
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

get it.

In my workflow I don't do git operations from the image.

I do those from the terminal itself in the server and using SourceTree on development

As far as I know, the place that has most space for improvement is when filetree does the writes







On Dec 13, 2013, at 6:28 PM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

Hi Sebastian,

sorry for the missing instructions... Had to board a train :)

In fact, it's not an alternative to filetree; it is an extension of filetree (and integrated into filetree, but not completely yet: no real configuration support, no Windows support).

But yes, I should do a blog post somewhere about it :)

Meanwhile, in the train, I tested on Roassal and I profiled the writes...

Git is slow: on Roassal,
    Waiting for the git commit: 67%
    Writing the package to disk: 20% (writing the version file: 0.4%)
        -> No real cost associated with the metadata
        -> git is a lot more than I expected.
When I remove git (i.e. pure filetree)
    Making a snapshot of Roassal: 28.9%
    basicStoreVersion: 67%
        writeBasicDefinitions: 30% (of which half is reordering some stuff, not writing)
        writeMethodHolderDefinitions: 27.8%
            Of which effectively writing to disk is 19.7%

Conclusion: gitfiletree will look slow because git is slow on such large commits :( And I don't see much gain to be made on the writing (i.e. a format change won't help much), but a lot to lose (code complexity, bugs). Unless a format change may make it faster for git somehow?

Shall I keep profiling, or give you the code so you can profile yourself?

Thierry
________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Friday, 13 December 2013 16:56
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

so you are working on an alternative to filetree

I'm pretty sure if you set up a page (or blog post) somewhere with clear and easy instructions to follow many people will try to use it

and, if proven good and reliable, adopt it


sebastian<https://about.me/sebastianconcept>

o/





On Dec 13, 2013, at 12:04 PM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

gitfiletree can be had this way (filetree is already in 3.0, gitfiletree is in the filetree configuration for Pharo 3.0 but is a bit hard to load).

    pharo/pharo pharo/Pharo.image config http://smalltalkhub.com/Pharo/MetaRepoForPharo30/main/ ConfigurationOfOSProcess --install=stable
    pharo/pharo pharo/Pharo.image eval --save Gofer new url:\'http://smalltalkhub.com/mc/ThierryGoubier/Alt30/main\'\; package: \'MonticelloFileTree-Git\'\; load

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Friday, 13 December 2013 14:57
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

I can try a forced load on a 3.0 ignoring requisites for testing purposes

but I'm using this:
https://github.com/dalehenrich/filetree

I'm not sure what you are using

can you clarify?



On Dec 13, 2013, at 11:43 AM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

Ok. I'll try Roassal on a slow netbook to see. I don't see a factor of 10 difference between yours and Roassal, so I'll have a look.

Thierry
________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Friday, 13 December 2013 14:32
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

A bit.

This is from today's current version (and is not all, it's only the two biggest packages):

(MCPackage named: 'flow') workingCopy packageInfo classes size. 363.
(MCPackage named: 'flow') workingCopy packageInfo coreMethods size. 4585.

(MCPackage named: 'airflowing') workingCopy packageInfo classes size. 377.
(MCPackage named: 'airflowing') workingCopy packageInfo coreMethods size. 5818.







On Dec 13, 2013, at 11:25 AM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

Roassal: 3493

Number of versions in the package history: 733. Size of the version file: 202796.

Is that a lot lower than your count?

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Friday, 13 December 2013 13:34
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

how many coreMethods?




On Dec 13, 2013, at 7:00 AM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

Bad news. The Roassal package directory has 355 entries (343 classes + a few extensions) and I don't see much of a slowdown (on 3.0). It's not instantaneous, but with a bit of feedback, it doesn't seem long.

I'll do some profiling.

Thierry
________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of GOUBIER Thierry
Sent: Thursday, 12 December 2013 17:07
To: Pharo Development List
Subject: [PROVENANCE INTERNET] Re: [Pharo-dev] Tell me about your workflow

Thanks for the pointers.

I'll look at Seaside/Moose/Mondrian and Roassal, because I need code I can load and save in an image without destroying the very image I use to test (which would happen if I load Pharo10 stuff in a 3.0 image ;) ).

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Yuriy Tymchuk [yuriy.tymchuk at me.com]
Sent: Thursday, 12 December 2013 16:24
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

So if you want something big and with a lot of commits you can use Pharo* in general. Pharo10 has the most versions and Pharo30Inbox is the largest one. If you want some other projects then you have to take a look at Seaside30, Mondrian, Moose, Glamour or Roassal.

Uko

On 12 Dec 2013, at 16:20, Yuriy Tymchuk <yuriy.tymchuk at me.com> wrote:

Pharo10 on SmalltalkHub is humongous. You can definitely do a stress test with it :)

Uko

On 12 Dec 2013, at 15:43, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

I would need a large project, composed of one or more packages, with more than 150~200 classes, which triggers the slow read and write times Sebastian experiences. And, probably, to be complete, a long and complex commit history in git (> 100 commits).

I'll keep in mind the idea of creating one randomly ;)

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Yuriy Tymchuk [yuriy.tymchuk at me.com]
Sent: Thursday, 12 December 2013 15:37
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

Are you interested in a package or a project? I can provide you information based on size, etc…

Uko

On 12 Dec 2013, at 15:30, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

I gave up running gitfiletree on 1.4 :(

It's possible to use gitfiletree from a 2.0 or a 3.0 image to browse your git repository, but testing the writing will be an issue.

My best chance would be to find a large enough package I can use on 2.0 or 3.0 to test and profile. Does anybody have a large enough package which could fit? Anything that doesn't require an NDA to read it, of course. Is Roassal large enough?

Thierry

________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Thursday, 12 December 2013 12:12
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

gee the big code package is airflowing which I have, quite conservatively, running on #14438 images

 I load filetree like this:

Gofer new
      url: 'http://ss3.gemstone.com/ss/FileTree';
      package: 'ConfigurationOfFileTree';
      load.
((Smalltalk at: #ConfigurationOfFileTree) project version: #'stable') load.

and it never complained

let me know





On Dec 12, 2013, at 3:53 AM, GOUBIER Thierry <thierry.goubier at cea.fr> wrote:

If you would be ready to profile a package save on your repository, it would be great. In the meantime, I'll make available a special gitfiletree package to test. Which version of Pharo are you using? 2.0 or 3.0?

Regards,

Thierry


________________________________
From: Pharo-dev [pharo-dev-bounces at lists.pharo.org] on behalf of Sebastian Sastre [sebastian at flowingconcept.com]
Sent: Wednesday, 11 December 2013 17:09
To: Pharo Development List
Subject: Re: [Pharo-dev] Tell me about your workflow

ok, if saving is dumping all, then 3 is confirmed? After the first commit, I'd say so.

about 2, I don't know. I'm available to make tests and measure results

have a nice trip, keep us tuned about any progress







On Dec 11, 2013, at 2:09 PM, Goubier Thierry <thierry.goubier at cea.fr> wrote:

Yes, you're right in the general case.

But a solution to that general problem will take time to be implemented (time I lack at the moment, sadly) and if the main gain is a few % because it's writing the version file and the metadata for methods which are the "slow" factors, then we'll have worked hard for nothing.

If you want to help, I'd really like to see either 2- or 3- confirmed. I can produce a special gitfiletree that skips writing the metadata, which you can try on a temporary copy of a large project; if the slow writing (and reading) is still there, then this is 3-

(But I'm leaving on a trip tomorrow early, so I have no idea of when I'll have the time to do that :( ).

Thierry

On 11/12/2013 16:44, Sebastian Sastre wrote:
Without going into details, one cause of slow package writes is dumping
everything every time.

For that strategy, we already have the image save which is magically fast.

So, if we make something to scan the code and write only when it's
different from what's on disk, then we would be preventing tons of
redundant writes
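
The idea of writing only what differs from disk can be sketched in a workspace as follows. This is an illustration under stated assumptions, not filetree code: `writeIfChanged` is a hypothetical block, the file name is made up, and only basic `FileReference` messages (`exists`, `delete`, `readStreamDo:`, `writeStreamDo:`) are assumed.

```smalltalk
"Sketch only: writeIfChanged is a hypothetical block, not part of filetree."
| writeIfChanged file written |
writeIfChanged := [:fileRef :newContents |
    "Compare with what is already on disk; the read is the price of skipping the write."
    (fileRef exists
        and: [(fileRef readStreamDo: [:in | in contents]) = newContents])
        ifTrue: [false]
        ifFalse: [
            "Delete first so that shorter new contents do not leave stale bytes behind."
            fileRef exists ifTrue: [fileRef delete].
            fileRef writeStreamDo: [:out | out nextPutAll: newContents].
            true]].

file := 'write-if-changed-demo.st' asFileReference.   "hypothetical file name"
written := OrderedCollection new.
written add: (writeIfChanged value: file value: 'foo ^ 42').   "missing or different: written"
written add: (writeIfChanged value: file value: 'foo ^ 42').   "identical: skipped"
written add: (writeIfChanged value: file value: 'foo ^ 43').   "changed: rewritten"
file delete.
written   "one boolean per call, true where a write happened"
```

Each skipped write still costs a read and a string comparison, so whether this is a net gain on a large package would have to be measured.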

sebastian <https://about.me/sebastianconcept>

o/





On Dec 11, 2013, at 1:43 PM, Goubier Thierry <thierry.goubier at cea.fr> wrote:



On 11/12/2013 16:27, Esteban Lorenzano wrote:
ah, and IMHO the problem is not about reading... it is about writing (if it
has to write the metadata each time...).

But, personally, I don't know if this is the reason for the lack of
performance...

I have three hypotheses for Sebastian's problem:
1 - Slow read time for version metadata
- Confirmed because of the 16 seconds wait time for reading the
package metadata in the repository browser.
2 - Slow metadata write
3 - Slow package write

I have an implemented solution for 1-, a very easy one to implement for
2-, and none yet for 3-

So I'd really like to check if 3- is confirmed ;)

Thierry


Esteban


On Wed, Dec 11, 2013 at 4:24 PM, Esteban Lorenzano
<estebanlm at gmail.com> wrote:

  Thierry, I know there is a working version... let me search...

  (5 mins later)


  here:

https://github.com/rjsargent/CypressReferenceImplementation

  Dale says Richard made a metadata-less version.

  We should take a look at that.

  Esteban


  On Wed, Dec 11, 2013 at 4:28 PM, Goubier Thierry
  <thierry.goubier at cea.fr> wrote:

      Esteban, Sebastian,

      In the filetree code, you will find a format without metadata,
      but it's not in use anymore.

      If you use gitfiletree, it will write the metadata for
      compatibility reasons with filetree, but it will never read it
      back.

      I'm pushing code to make filetree robust to the absence of
      metadata, but I haven't worked on it for a while.

      gitfiletree has solved the problem of a slow metadata read. It
      does not solve any performance problem associated with
      writing, yet.

      Thierry

      On 11/12/2013 16:12, Esteban Lorenzano wrote:

          I know there is a version of filetree without metadata (more
          compelling for projects that will never use other formats).
          Dale told me that there was a preview somewhere, but I
          haven't tested it yet (lack of time) and now I cannot find
          the mail...
          Dale, can you re-send the link?

          cheers,
          Esteban


          On Wed, Dec 11, 2013 at 4:08 PM, Sebastian Sastre
          <sebastian at flowingconcept.com> wrote:

               I should breathe before I type, but you probably already
               got that I meant /redundant writes/ (not reads)...


               Anyway.. I was talking with Esteban and he mentions
               some kind of compatibility metadata.

               If I'm going to take a leap of faith on filetree repos
               to save code, why should I care about mcz compatibility?
               Paying a toll for no reason is evil.

               Maybe we could make that optional so those who don't
               extract value from that feature can opt out?

               sebastian <https://about.me/sebastianconcept>


               o/





               On Dec 11, 2013, at 12:44 PM, Sebastian Sastre
               <sebastian at flowingconcept.com> wrote:

                   Hi Thierry

                   On Dec 11, 2013, at 12:43 PM, Goubier Thierry
                   <thierry.goubier at cea.fr> wrote:


                           I have packages (in the order of hundreds
                           of classes) and save delays and package
                           click delays are starting to demand patience
                           in a way that doesn't feel like the right path


                       Which operations? I don't remember noticing
                       much with 179 classes on a laptop without an SSD.


                   Choose one. Just for clicking the package that will
                   show you the UUID, version and author I need to wait
                   ~16 seconds. Sounds like a lot of overhead for reading
                   a small .json file.

                   But the write is the most worrisome


                           All that is with an SSD disk, otherwise save
                           delays would be /way/ beyond unacceptable


                       I'd like to know more, and understand the
                       reason, for sure. As far as I know, filetree will
                       rewrite the whole package to disk every time...
                       and maybe optimising that could be the solution.


                   Well, that explains a lot. Writing all every time
                   is the lazy thing that's okay for a prototype and
                   temporary code in a proof of concept, but those massive
                   redundant reads certainly don't sound like pro
                   software. Especially for SSDs, which have a limited
                   number of writes


                       Thierry

                           sebastian <https://about.me/sebastianconcept>

                           o/






                       --
                       Thierry Goubier
                       CEA list
                       Laboratoire des Fondations des Systèmes Temps Réel Embarqués
                       91191 Gif sur Yvette Cedex
                       France
                       Phone/Fax: +33 (0) 1 69 08 32 92 / 83 95













-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2013-12-16 at 11.13.10 AM.png
Type: image/png
Size: 139673 bytes
Desc: Screen Shot 2013-12-16 at 11.13.10 AM.png
URL: <http://lists.pharo.org/pipermail/pharo-dev_lists.pharo.org/attachments/20131216/75cb54f3/attachment.png>
