[Pharo-users] Understanding the role of the sources file

Ben Coman btc at openinworld.com
Wed Jan 13 11:25:09 EST 2016


> On Wed, Jan 13, 2016 at 4:58 PM Werner Kassens <wkassens at libello.com> wrote:
>>
>> Hi Dimitris,
>> your formulation "...Pharo bytecode...and convert it to machine code..."
>> is confusing to me insofar as "convert it to machine code" would
>> suggest to me that a compiler is at work here. David's "executing Pharo
>> byte-code" seems more understandable to me here.

On Wed, Jan 13, 2016 at 11:27 PM, Dimitris Chloupis
<kilon.alios at gmail.com> wrote:
> That's correct, it's a compiler, a byte compiler: it compiles bytecode to
> machine code, and it does so while the code executes. This is why it's called
> JIT, which stands for Just In Time compilation, meaning that
> machine code is compiled just before the code is executed, so several
> optimizations can be applied that could not be known before the execution of
> the code. Similar to Java's JIT compiler.
>
> Note here that a compiler is not just something that produces machine code,
> a compiler for example can take one language and compile it to another
> language.

Indeed.  The OpalCompiler takes Smalltalk and produces bytecode.

However, I think Sven and Werner were pointing out that much Pharo code
is not JITed, but *merely* interpreted (IIUC).  See the section "Not so
smart questions" here...
https://clementbera.wordpress.com/2014/01/09/the-sista-chronicles-i-an-introduction-to-adaptive-recompilation/
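
To make the distinction concrete, here is a toy sketch in Python (not
Pharo, and certainly not how Cog is actually implemented). The opcodes,
the HOT_THRESHOLD value and the use of a Python lambda as a stand-in for
"machine code" are all my own assumptions for illustration. A method is
interpreted bytecode-by-bytecode at first, and only translated once it
has been called often enough:

```python
# Toy model only. Opcodes, HOT_THRESHOLD and the lambda-based "machine code"
# are hypothetical; a real JIT emits native instructions instead.
PUSH, ADD, MUL, RETURN = range(4)
HOT_THRESHOLD = 2  # arbitrary call count after which a method is translated


def interpret(bytecode):
    """Classic interpreter: read and execute byte codes in a loop."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        if op == PUSH:
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op in (ADD, MUL):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == ADD else a * b)
            pc += 1
        elif op == RETURN:
            return stack.pop()


def jit_compile(bytecode):
    """Stand-in for a JIT: translate the bytecode once into another
    executable form (here a Python lambda) that can be cached and reused."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        if op == PUSH:
            stack.append(str(bytecode[pc + 1]))
            pc += 2
        elif op in (ADD, MUL):
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {'+' if op == ADD else '*'} {b})")
            pc += 1
        elif op == RETURN:
            return eval(f"lambda: {stack.pop()}")


class Method:
    def __init__(self, bytecode):
        self.bytecode = bytecode
        self.calls = 0
        self.compiled = None  # cached translation, if any


def run(method):
    """Use the cached translation if present, otherwise interpret."""
    if method.compiled is not None:
        return method.compiled()
    method.calls += 1
    if method.calls >= HOT_THRESHOLD:
        method.compiled = jit_compile(method.bytecode)
    return interpret(method.bytecode)
```

With m = Method([PUSH, 3, PUSH, 4, ADD, PUSH, 2, MUL, RETURN]), the first
calls of run(m) go through interpret() and later ones through the cached
lambda, and the result is the same either way. A method that is called
only once is never translated at all.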

cheers -ben

>>
>> On 01/13/2016 02:22 PM, Dimitris Chloupis wrote:
>> > I assume you have never read an introduction to C++ then :D
>> >
>> > here is the final addition for the vm
>> >
>> > (VM) is the only component that is different for each operating system.
>> > The main purpose of the VM is to take Pharo bytecode that is generated
>> > each time the user accepts a piece of code and convert it to machine code in
>> > order to be executed, but also to generally handle low level
>> > functionality like interpreting code, handling OS events (mouse and
>> > keyboard), calling C libraries etc. Pharo 4 comes with the Cog VM, a very
>> > fast JIT VM.
>> >
>> > I think it's clear, precise and does not leave much room for confusion.
>> > Personally I think it's very important for the absolute beginner to have
>> > a strong foundation in the fundamentals of Pharo, and not for
>> > things to appear magical and "don't touch this".
>> >
>> > On Wed, Jan 13, 2016 at 2:54 PM Sven Van Caekenberghe <sven at stfx.eu
>> > <mailto:sven at stfx.eu>> wrote:
>> >
>> >
>> >      > On 13 Jan 2016, at 13:42, Dimitris Chloupis
>> >     <kilon.alios at gmail.com <mailto:kilon.alios at gmail.com>> wrote:
>> >      >
>> >      > I mentioned bytecode because I don't want the user to see bytecode
>> >     at some point and say "What the hell is that". I want the reader to
>> >     feel confident that they at least understand the basics of Pharo. Also,
>> >     I have seen very brief explanations of bytecode in similar Python
>> >     tutorials. Obviously I don't want to go any deeper than that, because
>> >     the user won't have to worry about the technical details on a daily
>> >     basis anyway.
>> >      >
>> >      > I agree that I could add a bit more to the VM description, similar
>> >     to what you posted. I am curious though: won't even the interpreter
>> >     generate machine code in order to execute the code, or does it use
>> >     existing machine code inside the VM binary?
>> >
>> >     No, a classic interpreter does not 'generate' machine code, it is
>> >     just a program that reads and executes byte codes in a loop, the
>> >     interpreter 'is' machine code.
>> >
>> >     No offence, but you see why I think it is important to not try to
>> >     use or explain too many complex concepts in the 1st chapter.
>> >
>> >     Learning to program is hard. It should first be done abstractly.
>> >     Think about Scratch. The whole idea of Smalltalk is to create a
>> >     world of interacting objects. (Even byte code is not a necessary
>> >     concept at all, for example, in Pharo, you can compile (translate)
>> >     to AST and execute that, I believe. There are Smalltalk
>> >     implementations that compile directly to C or JavaScript). Hell,
>> >     even 'compile' is not necessary, just 'accept'. See ?
>> >
>> >      > On Wed, Jan 13, 2016 at 2:25 PM Sven Van Caekenberghe
>> >     <sven at stfx.eu <mailto:sven at stfx.eu>> wrote:
>> >      > Sounds about right.
>> >      >
>> >      > Now, I would swap 1 and 4, as the image is the most important
>> >     abstraction.
>> >      >
>> >      > There is also a bit too much emphasis on (byte|source)code. This
>> >     is already pretty technical (it assumes you know what compilation is
>> >     and so on). But I understand it must be explained here, and you did
>> >     it well.
>> >      >
>> >      > However, I would start by saying that the image is a snapshot of
>> >     the object world in memory that is effectively a live Pharo system.
>> >     It contains everything that is available and that exists in Pharo.
>> >     This includes any objects that you created yourself, windows,
>> >     browsers, open debuggers, executing processes, all meta objects as
>> >     well as all representations of code.
>> >      >
>> >      > <sidenote>
>> >      > The fact that there is a sources and changes file is an
>> >     implementation artefact, not something fundamental. There are ideas
>> >     to change this in the future (but you do not have to mention that).
>> >      > </sidenote>
>> >      >
>> >      > Also, the VM not only executes code, it maintains the object
>> >     world, which includes the ability to load and save it from and to an
>> >     image. It creates a portable (cross platform) abstraction that
>> >     isolates the image from the particular details of the underlying
>> >     hardware and OS. In that role it implements the interface with the
>> >     outside world. I would mention that second part before mentioning
>> >     the code execution.
>> >      >
>> >      > The sentence "The purpose of the VM is to take Pharo bytecode that
>> >     is generated each time user accepts a piece of code and convert it
>> >     to machine code in order to be executed." is not 100% correct. It is
>> >     possible to execute the byte code without converting it. This is
>> >     called interpretation. JIT is a faster technique that includes
>> >     converting (some often used) byte code to machine code and caching
>> >     that.
>> >      >
>> >      > I hope this helps (it is hard to write a 'definitive explanation'
>> >     as there are so many aspects to this and it depends on the
>> >     context/audience).
>> >      >
>> >      > > On 13 Jan 2016, at 12:58, Dimitris Chloupis
>> >     <kilon.alios at gmail.com <mailto:kilon.alios at gmail.com>> wrote:
>> >      > >
>> >      > > So I am correct that the image does not store the source code,
>> >     and that the source code is stored in sources and changes. The only
>> >     difference is that the objects have a source variable that points to
>> >     the right place for finding the source code.
>> >      > >
>> >      > > This is the final text; if you find anything incorrect, please
>> >     correct me.
>> >      > >
>> >      > > ---------------
>> >      > >
>> >      > > 1. The virtual machine (VM) is the only component that is
>> >     different for each operating system. The purpose of the VM is to
>> >     take Pharo bytecode that is generated each time the user accepts a piece
>> >     of code and convert it to machine code in order to be executed.
>> >     Pharo 4 comes with the Cog VM, a very fast JIT VM. The VM executable
>> >     is named:
>> >      > >
>> >      > > • Pharo.exe for Windows;
>> >      > > • pharo for Linux; and
>> >      > > • Pharo for OSX (inside a package also named Pharo.app).
>> >      > >
>> >      > > The other components below are portable across operating
>> >     systems, and can be copied and run on any appropriate virtual machine.
>> >      > >
>> >      > > 2. The sources file contains source code for parts of Pharo
>> >     that don’t change frequently. The sources file is important because the
>> >     image file format stores only the bytecode of live objects and not
>> >     their source code. Typically a new sources file is generated once
>> >     per major release of Pharo. For Pharo 4.0, this file is named
>> >     PharoV40.sources.
>> >      > >
>> >      > > 3. The changes file logs all source code modifications since
>> >     the .sources file was generated. This facilitates a per-method
>> >     history for diffs or reverting. That means that even if you don't
>> >     manage to save the image file before a crash, or you just forgot, you can
>> >     recover your changes from this file. Each release provides a near
>> >     empty file named for the release, for example Pharo4.0.changes.
>> >      > >
>> >      > > 4. The image file provides a frozen-in-time snapshot of a
>> >     running Pharo system. This is the file where the Pharo bytecode is
>> >     stored, and as such it's a cross-platform format. This is the heart of
>> >     Pharo, containing the live state of all objects in the system
>> >     (including classes and methods, since they are objects too). The
>> >     file is named for the release (like Pharo4.0.image).
>> >      > >
>> >      > > The .image and .changes files provided by a Pharo release are
>> >     the starting point for a live environment that you adapt to your
>> >     needs. Essentially the image file contains the compiler of the
>> >     language (not the VM), the language parser, the IDE tools and many
>> >     libraries, and acts a bit like a virtual operating system that runs
>> >     on top of a Virtual Machine (VM), similarly to ISO files.
>> >      > >
>> >      > > As you work in Pharo, these files are modified, so you need to
>> >     make sure that they are writable. The .image and .changes files are
>> >     intimately linked and should always be kept together, with matching
>> >     base filenames. Never edit them directly with a text editor, as
>> >     the .image file holds your live object runtime memory, which indexes into
>> >     the .changes file for the source. It is a good idea to keep a
>> >     backup copy of the downloaded .image and .changes files so you can
>> >     always start from a fresh image and reload your code. However, the
>> >     most efficient way of backing up code is to use a version control
>> >     system, which provides an easier and more powerful way to back up and
>> >     track your changes.
>> >      > >
>> >      > > The four main component files above can be placed in the same
>> >     directory, although it’s also possible to put the Virtual Machine
>> >     and sources file in a separate directory where everyone has
>> >     read-only access to them.
>> >      > >
>> >      > > If more than one image file is present in the same directory,
>> >     Pharo will prompt you to choose the image file you want to load.
>> >      > >
>> >      > > Do whatever works best for your style of working and your
>> >     operating system.
>> >      > >
>> >      > >
>> >      > >
>> >      > >
>> >      > >
>> >      > > On Wed, Jan 13, 2016 at 12:13 PM Sven Van Caekenberghe
>> >     <sven at stfx.eu <mailto:sven at stfx.eu>> wrote:
>> >      > >
>> >      > > > On 13 Jan 2016, at 10:57, Dimitris Chloupis
>> >     <kilon.alios at gmail.com <mailto:kilon.alios at gmail.com>> wrote:
>> >      > > >
>> >      > > > I was adding a short description of the sources file to the
>> >     UPBE. I always thought that the sources file is the file that
>> >     contains the source code of the image, because the image file itself
>> >     stores only the bytecode.
>> >      > > >
>> >      > > > However, it just came to my attention that the sources file
>> >     does not contain code that was recently installed in the image.
>> >      > > >
>> >      > > > So how exactly does the sources file work, and what is it?
>> >      > >
>> >      > > The main perspective is from the object point of view: methods
>> >     are just objects like everything else. In order to be executable
>> >     they know their byte codes (which might be JIT compiled on
>> >     execution, but that is an implementation detail) and they know their
>> >     source code.
>> >      > >
>> >      > > Today we would probably just store the source code strings in
>> >     the image (maybe compressed) as memory is pretty cheap. But way back
>> >     when Smalltalk started, that was not the case. So they decided to
>> >     map the source code out to files.
>> >      > >
>> >      > > So method source code is a magic string (RemoteString) that
>> >     points to some position in a file. There are 2 files in use: the
>> >     sources file and the changes file.
>> >      > >
>> >      > > The sources file is a kind of snapshot of the source code of
>> >     all methods at the point of release of a major new version. That is
>> >     why there is a Vxy in their name. The sources file never changes once
>> >     created or renewed (a process called generating the sources, see
>> >     PharoSourcesCondenser).
>> >      > >
>> >      > > While developing and creating new versions of methods, the new
>> >     source code is appended to another file called the changes file,
>> >     much like a transaction log. This is also a safety mechanism to
>> >     recover 'lost' changes.
>> >      > >
>> >      > > The changes file can contain multiple versions of a method.
>> >     This can be reduced in size using a process called condensing the
>> >     changes, see PharoChangesCondenser.
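
As a rough illustration of the scheme Sven describes (source kept outside
the image, reached through a pointer into one of two files), here is a
small Python model. It is only a sketch: SourcePointer, SourceFiles and
condense_changes are hypothetical names of mine, not Pharo's classes, the
two files are modelled as in-memory strings, and storing an explicit
length is a simplification.

```python
# Hypothetical model of "source code lives in files, methods hold pointers".
# None of these names are Pharo API; the files are plain in-memory strings.
from dataclasses import dataclass


@dataclass
class SourcePointer:
    file_index: int  # assumed numbering: 0 = sources file, 1 = changes file
    position: int    # offset of the method source within that file
    length: int      # simplification: length is stored explicitly


class SourceFiles:
    def __init__(self, sources_text=""):
        # files[0] is fixed at release time, files[1] only ever grows
        self.files = [sources_text, ""]

    def append_change(self, source):
        """Log a new method version at the end of the changes file."""
        position = len(self.files[1])
        self.files[1] += source
        return SourcePointer(1, position, len(source))

    def source_at(self, pointer):
        """Follow a pointer back to the source text it refers to."""
        text = self.files[pointer.file_index]
        return text[pointer.position:pointer.position + pointer.length]


def condense_changes(store, methods):
    """Rewrite the changes file keeping only each method's current version."""
    condensed = ""
    for name, pointer in methods.items():
        if pointer.file_index == 1:
            source = store.source_at(pointer)
            methods[name] = SourcePointer(1, len(condensed), len(source))
            condensed += source
    store.files[1] = condensed
```

Accepting a method twice appends two versions to the modelled changes
file while the method's pointer moves to the latest one; the older text
stays in the log until condense_changes drops it. The sources string is
never written to after construction.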
>> >      > >
>> >      > > On a new release, the changes file will be (almost) empty.
>> >      > >
>> >      > > HTH,
>> >      > >
>> >      > > Sven
>> >      > >
>> >      > >
>> >      > >
>> >      >
>> >      >
>> >
>> >
>>
>



