[Pharo-dev] help about codeImporter

Stéphane Ducasse stephane.ducasse at inria.fr
Fri Dec 6 15:35:25 EST 2013


On Dec 6, 2013, at 3:15 PM, Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com> wrote:

> But MC should work better now that sources are UTF8 encoded (as they have been for a few months).
> 
> The problem with old Squeak/Pharo/MC is that the encoding switched from iso-8859-1 (latin1) to UTF32 whenever a wide character was encountered...
> But this wasn't done properly: with the ugly text converters, basicNextPut: et al., the generated stuff was indeed UTF32, but only N bytes were written instead of N characters!!! That means that you only stored (and can retrieve) the first 1/4 of the source...
> But you may have more luck, because the ugliness did not stop there: it's possible that the first buffers (4096 bytes) were already sent in latin1 encoding, and the next ones in UTF32 (with the size bug). In which case you can retrieve a bit more of your sources.
> I have a prototype to decode such messy sources, but did not publish it, since you can't recover the whole code anyway.
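
To make the size bug described above concrete, here is a minimal sketch in Python (not the actual Squeak/Pharo converter code). It assumes big-endian UTF32 without a byte-order mark and models only the "N bytes instead of N characters" truncation, not the mixed latin1/UTF32 buffer case. The names buggy_store and recover are hypothetical, chosen for illustration.

```python
def buggy_store(source: str) -> bytes:
    """Simulate the size bug: encode as UTF-32 but write only N bytes for N characters."""
    # Assumption: big-endian UTF-32 with no BOM, 4 bytes per character.
    encoded = source.encode("utf-32-be")
    # The bug: the byte count to write is taken from the character count.
    return encoded[: len(source)]


def recover(stored: bytes) -> str:
    """Decode whatever complete 4-byte code units survived the truncation."""
    usable = len(stored) - len(stored) % 4  # drop a trailing partial code unit
    return stored[:usable].decode("utf-32-be")


# Hypothetical source string containing one wide (non-latin1) character.
source = "greeting ^ 'h\u00f4tel \u03bb' , self name printString"
recovered = recover(buggy_store(source))

# Only about the first quarter of the characters can be read back.
print(len(source), len(recovered))
print(repr(recovered))
```

Under these assumptions, whatever comes back is always a prefix of the original, about a quarter of its length, which matches the "first 1/4 of the source" observation.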

arghhhhhhh (deep sound of Stef falling from a cliff :)
If somebody has the time and knowledge to radically fix that, please shoot.

> If you ever have a problem with recent MC and improper UTF8, please, please report it.

For the moment I just have a problem with importing old VisualWorks code into Pharo via fileIn :)

> 
> 
> 
> 2013/12/6 Stephan Eggermont <stephan at stack.nl>
> Ben wrote:
> >who put a ô in the code in the first place? :P
> 
> Doesn’t happen often, I’m happy to observe. Strings in code
> with interesting characters are a much more common problem,
> though. Made it impossible to import MCs into Gemstone.
> 
> Stephan
> 
