[Pharo-users] Understanding the role of the sources file

Sven Van Caekenberghe sven at stfx.eu
Wed Jan 13 08:34:39 EST 2016


> On 13 Jan 2016, at 14:22, Dimitris Chloupis <kilon.alios at gmail.com> wrote:
> 
> I assume you have never read an introduction to C++ then :D

I have and they are too complex.

> here is the final addition for the vm
> 
> The VM is the only component that is different for each operating system. The main purpose of the VM is to take the Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed, but also to handle low-level functionality in general, such as interpreting code, handling OS events (mouse and keyboard), calling C libraries, etc. Pharo 4 comes with the Cog VM, a very fast JIT VM.

You added more technical stuff. I tried to make my point, but you are writing it, not me, so I won't continue.

> I think it's clear and precise, and does not leave much room for confusion. Personally, I think it's very important for the absolute beginner to have a strong foundation in the fundamentals of Pharo, and for things not to appear magical and "don't touch this".

Nobody can understand everything at the same time; even experts work with partial abstractions while ignoring most other details.

A beginner has to be guided to the most important concepts first. Explaining something in a simple way is very hard (and I fail most of the time doing that).

Pharo offers curious minds unlimited opportunity to explore, while staying in the same system, but that does not mean everything should be mentioned immediately.

> On Wed, Jan 13, 2016 at 2:54 PM Sven Van Caekenberghe <sven at stfx.eu> wrote:
> 
> > On 13 Jan 2016, at 13:42, Dimitris Chloupis <kilon.alios at gmail.com> wrote:
> >
> > I mentioned bytecode because I don't want the user to see bytecode at some point and say "What the hell is that?" I want the reader to feel confident that they at least understand the basics of Pharo. I have also seen very brief explanations of bytecode in similar Python tutorials. Obviously I don't want to go any deeper than that, because the user won't have to worry about the technical details on a daily basis anyway.
> >
> > I agree that I could add a bit more to the VM description, similar to what you posted. I am curious though: won't even the interpreter generate machine code in order to execute the code, or does it use existing machine code inside the VM binary?
> 
> No, a classic interpreter does not 'generate' machine code. It is just a program that reads and executes byte codes in a loop; the interpreter itself 'is' machine code.
> 
> No offence, but you see why I think it is important not to try to use or explain too many complex concepts in the first chapter.
> 
> Learning to program is hard. It should first be done abstractly. Think about Scratch. The whole idea of Smalltalk is to create a world of interacting objects. (Even byte code is not a necessary concept at all. For example, in Pharo you can compile (translate) to an AST and execute that, I believe. There are Smalltalk implementations that compile directly to C or JavaScript.) Hell, even 'compile' is not necessary, just 'accept'. See?
> 
> > On Wed, Jan 13, 2016 at 2:25 PM Sven Van Caekenberghe <sven at stfx.eu> wrote:
> > Sounds about right.
> >
> > Now, I would swap 1 and 4, as the image is the most important abstraction.
> >
> > There is also a bit too much emphasis on (byte|source)code. This is already pretty technical (it assumes you know what compilation is and so on). But I understand it must be explained here, and you did it well.
> >
> > However, I would start by saying that the image is a snapshot of the object world in memory that is effectively a live Pharo system. It contains everything that is available and that exists in Pharo. This includes any objects that you created yourself, windows, browsers, open debuggers, executing processes, all meta objects as well as all representations of code.
> >
> > <sidenote>
> > The fact that there is a sources and changes file is an implementation artefact, not something fundamental. There are ideas to change this in the future (but you do not have to mention that).
> > </sidenote>
> >
> > Also, the VM not only executes code, it maintains the object world, which includes the ability to load and save it from and to an image. It creates a portable (cross platform) abstraction that isolates the image from the particular details of the underlying hardware and OS. In that role it implements the interface with the outside world. I would mention that second part before mentioning the code execution.
> >
> > The sentence "The purpose of the VM is to take Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed." is not 100% correct. It is possible to execute the byte code without converting it. This is called interpretation. JIT is a faster technique that includes converting (some often used) byte code to machine code and caching that.
> >
> > I hope this helps (it is hard to write a 'definitive explanation' as there are so many aspects to this and it depends on the context/audience).
> >
> > > On 13 Jan 2016, at 12:58, Dimitris Chloupis <kilon.alios at gmail.com> wrote:
> > >
> > > So I am correct that the image does not store the source code, and that the source code is stored in sources and changes. The only difference is that the objects have a source variable that points to the right place for finding the source code.
> > >
> > > This is the final text; if you find anything incorrect, please correct me.
> > >
> > > ---------------
> > >
> > > 1. The virtual machine (VM) is the only component that is different for each operating system. The purpose of the VM is to take the Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed. Pharo 4 comes with the Cog VM, a very fast JIT VM. The VM executable is named:
> > >
> > > • Pharo.exe for Windows;
> > > • pharo for Linux; and
> > > • Pharo for OSX (inside a package also named Pharo.app).
> > >
> > > The other components below are portable across operating systems, and can be copied and run on any appropriate virtual machine.
> > >
> > > 2. The sources file contains source code for parts of Pharo that don’t change frequently. The sources file is important because the image file format stores only the bytecode of live objects and not their source code. Typically a new sources file is generated once per major release of Pharo. For Pharo 4.0, this file is named PharoV40.sources.
> > >
> > > 3. The changes file logs all source code modifications since the .sources file was generated. This facilitates a per-method history for diffs or reverting. That means that even if you don't manage to save the image file before a crash, or you simply forgot to, you can recover your changes from this file. Each release provides a nearly empty file named for the release, for example Pharo4.0.changes.
> > >
> > > 4. The image file provides a frozen-in-time snapshot of a running Pharo system. This is the file where the Pharo bytecode is stored, and as such it is a cross-platform format. This is the heart of Pharo, containing the live state of all objects in the system (including classes and methods, since they are objects too). The file is named for the release (like Pharo4.0.image).
> > >
> > > The .image and .changes files provided by a Pharo release are the starting point for a live environment that you adapt to your needs. Essentially, the image file contains the compiler of the language (not the VM), the language parser, the IDE tools and many libraries, and acts a bit like a virtual operating system that runs on top of a virtual machine (VM), similar to ISO files.
> > >
> > > As you work in Pharo, these files are modified, so you need to make sure that they are writable. The .image and .changes files are intimately linked and should always be kept together, with matching base filenames. Never edit them directly with a text editor, as the .image file holds your live object runtime memory, which indexes into the .changes file for the source. It is a good idea to keep a backup copy of the downloaded .image and .changes files so you can always start from a fresh image and reload your code. However, the most efficient way of backing up code is to use a version control system, which provides an easier and more powerful way to back up and track your changes.
> > >
> > > The four main component files above can be placed in the same directory, although it’s also possible to put the Virtual Machine and sources file in a separate directory where everyone has read-only access to them.
> > >
> > > If more than one image file is present in the same directory, Pharo will prompt you to choose which image file to load.
> > >
> > > Do whatever works best for your style of working and your operating system.
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Jan 13, 2016 at 12:13 PM Sven Van Caekenberghe <sven at stfx.eu> wrote:
> > >
> > > > On 13 Jan 2016, at 10:57, Dimitris Chloupis <kilon.alios at gmail.com> wrote:
> > > >
> > > > I was adding a short description of the sources file to the UPBE. I always thought that the sources file is the file that contains the source code of the image, because the image file itself stores only the bytecode.
> > > >
> > > > However, it just came to my attention that the sources file does not contain code that was recently installed in the image.
> > > >
> > > > So how exactly does the sources file work, and what is it?
> > >
> > > The main perspective is from the object point of view: methods are just objects like everything else. In order to be executable they know their byte codes (which might be JIT compiled on execution, but that is an implementation detail) and they know their source code.
> > >
> > > Today we would probably just store the source code strings in the image (maybe compressed) as memory is pretty cheap. But way back when Smalltalk started, that was not the case. So they decided to map the source code out to files.
> > >
> > > So method source code is a magic string (RemoteString) that points to some position in a file. There are 2 files in use: the sources file and the changes file.
> > >
> > > The sources file is a kind of snapshot of the source code of all methods at the point of release of a major new version. That is why there is a Vxy in its name. The sources file never changes once created or renewed (a process called generating the sources, see PharoSourcesCondenser).
> > >
> > > While developing and creating new versions of methods, the new source code is appended to another file called the changes file, much like a transaction log. This is also a safety mechanism to recover 'lost' changes.
> > >
> > > The changes file can contain multiple versions of a method. This can be reduced in size using a process called condensing the changes, see PharoChangesCondenser.
> > >
> > > On a new release, the changes file will be (almost) empty.
> > >
> > > HTH,
> > >
> > > Sven
> > >
> > >
> > >
> >
> >
> 
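For readers who want something concrete, the "reads and executes byte codes in a loop" description quoted above can be sketched in a few lines of Python. This is only an illustration of the general idea of a classic interpreter. The instruction names (`push`, `add`, `return`) and the tuple encoding are invented for the example and have nothing to do with Pharo's real bytecode set or the Cog VM.

```python
def interpret(bytecodes):
    """Toy interpreter loop: fetch one instruction, execute it, repeat.

    No machine code is generated here; the loop itself is the program
    that is running. The instruction set is hypothetical.
    """
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(bytecodes):
        op, arg = bytecodes[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "return":
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    return None


# A made-up "compiled method" that computes 3 + 4.
program = [("push", 3), ("push", 4), ("add", None), ("return", None)]
result = interpret(program)
```

A JIT, by contrast, would translate frequently executed sequences like `program` into machine code once and reuse the cached result instead of walking this loop every time.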
> 
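The sources/changes mechanism described in the oldest message above (a method's source is a pointer to a position in one of two files, and new versions are appended to the changes file like a transaction log) can be modelled roughly as follows. This is a simplified sketch in Python using in-memory files, not Pharo's actual implementation: the `SourceFiles` class, the `(file index, position, length)` pointer layout and the `image` dictionary are all assumptions made for illustration.

```python
import io

SOURCES, CHANGES = 0, 1  # hypothetical indices for the two files


class SourceFiles:
    """Toy model: a sources file that is never written, plus an append-only changes log."""

    def __init__(self, sources_text=b""):
        self.files = [io.BytesIO(sources_text), io.BytesIO()]

    def append_change(self, source):
        """Append new method source to the changes log and return a pointer to it."""
        data = source.encode("utf-8")
        log = self.files[CHANGES]
        position = log.seek(0, io.SEEK_END)  # always write at the end
        log.write(data)
        return (CHANGES, position, len(data))

    def read(self, pointer):
        """Follow a (file index, position, length) pointer back to the source text."""
        file_index, position, length = pointer
        f = self.files[file_index]
        f.seek(position)
        return f.read(length).decode("utf-8")


# The "image" side keeps only pointers, never the source strings themselves.
files = SourceFiles(sources_text=b"size ^ 0")
image = {"size": (SOURCES, 0, 8)}                        # as shipped in the release

image["size"] = files.append_change("size ^ count")      # first edit
image["size"] = files.append_change("size ^ count + 1")  # second edit

current_source = files.read(image["size"])
```

After the two edits, the changes log holds both versions of the method while the image points only at the latest one. That is why the log can be used to recover lost changes, and why it can later be condensed.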




