[Pharo-users] [Lse-pharo4pharo] Alignment visualization performance

Hernán Morales Durand hernan.morales at gmail.com
Sat Feb 22 17:26:42 EST 2020


Done here : https://github.com/pharo-project/pharo/issues/5737

Cheers,

Hernán

On Sat, 22 Feb 2020 at 5:22, Stéphane Ducasse
(<stephane.ducasse at inria.fr>) wrote:

> Could you open a bug entry? We will tag it for large images.
>
> On 21 Feb 2020, at 06:40, Hernán Morales Durand <hernan.morales at gmail.com>
> wrote:
>
> Hello guys.
>
> I want to visualize DNA sequence alignments in Pharo 8. For this task most
> bioinformatics applications set a background color for each letter. But in
> Pharo the Inspector is too slow to open, even for a single small sequence of
> 1 kb. Consider that there are now about 37k sequences of COVID-19 and each
> genome contains about 30k letters, so viewing and scrolling through them
> (and zooming) should be fast.
>
> But have a look at this script, which takes about 6 seconds to open an
> Inspector. The script uses BioSmalltalk, and the code could surely be
> improved, but that is not relevant to my visualization performance problem:
>
> [
>   | text attributes |
>   " Generate a Text object from a random sequence "
>   text := ((BioSequence forAlphabet: BioDNAAlphabet) randomLength: 1000) sequence asText.
>   " Setup an array for each nucleotide background color "
>   attributes := Array new: text size.
>   1 to: text size do: [ :index |
>     attributes
>       at: index
>       put: { (TextBackgroundColor color: (BioDNAAlphabet colorMap at: (text at: index))) } ].
>   text runs: (RunArray newFrom: attributes).
>   text inspect
> ] timeToRun asString  "'0:00:00:05.911'"
>
> Also, resizing the opened Inspector takes 2-3 seconds to refresh.
> You can see the output here: https://imgur.com/a/xUlBeVY
>
> I should say that without the #inspect the code runs without performance
> issues: "'0:00:00:00.009'"
>
> So I ran the script again for different sequence sizes:
>
> String streamContents: [ :stream |
>   100 to: 2000 by: 100 do: [ :sl |
>     stream
>       nextPutAll: ([
>         | text attributes |
>         " Generate a Text object from a random sequence "
>         text := ((BioSequence forAlphabet: BioDNAAlphabet) randomLength: sl) sequence asText.
>         " Setup an array for each nucleotide background color "
>         attributes := Array new: text size.
>         1 to: text size do: [ :index |
>           attributes
>             at: index
>             put: { (TextBackgroundColor color: (BioDNAAlphabet colorMap at: (text at: index))) } ].
>         text runs: (RunArray newFrom: attributes).
>         text inspect
>       ] timeToRun asString);
>       cr ] ]
>
> And these are the results:
>
> 0:00:00:00.147
> 0:00:00:00.28
> 0:00:00:00.568
> 0:00:00:00.993
> 0:00:00:01.776
> 0:00:00:02.123
> 0:00:00:03.111
> 0:00:00:04.084
> 0:00:00:04.574
> 0:00:00:06.192
> 0:00:00:07.214
> 0:00:00:07.915
> 0:00:00:10.382
> 0:00:00:12.725
> 0:00:00:12.359
> 0:00:00:17.357
> 0:00:00:17.147
> 0:00:00:20.651
> 0:00:00:20.392
> 0:00:00:23.238
>
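> As a rough check on how the time grows with sequence length, the timings
> above can be compared pairwise: the i-th entry is the time for i * 100
> letters, so entry 2 * i is the time for twice as many. This is only a
> sketch over the single run listed above, with the values retyped by hand
> as seconds; the variable names are made up for illustration:
>
> ```smalltalk
> | seconds ratios |
> "Timings from the list above, in seconds (retyped by hand).
>  Entry i corresponds to a sequence of i * 100 letters."
> seconds := #(0.147 0.28 0.568 0.993 1.776 2.123 3.111 4.084 4.574 6.192
>              7.214 7.915 10.382 12.725 12.359 17.357 17.147 20.651 20.392 23.238).
> "For each length n = i * 100, divide the time for 2n letters by the time for n."
> ratios := (1 to: seconds size // 2) collect: [ :i |
>     (seconds at: 2 * i) / (seconds at: i) ].
> ratios
> ```
>
> Most of the ratios come out near 4, i.e. doubling the sequence length
> roughly quadruples the time to open the Inspector, which suggests growth
> closer to quadratic than linear in this range.
>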
> At first I thought it was a problem with the Glamour text renderer for
> Rubric Text, but profiling a single pass of the snippet for 2000 letters
> shows a couple of methods in the Rubric scanner, after some DNU sends, that
> consume a lot of the time:
> RubCharacterBlockScanner(RubCharacterBlockScanner) >>
> characterBlockAtPoint:index:in: and
> RubCharacterBlockScanner(RubCharacterBlockScanner) >> endOfRun. I attached
> the full profiler report so you can have a look if you like. The
> summary is:
>
> **Leaves**
> 37.4% {8800ms} RubCompositionScanner(RubCharacterScanner)>>basicScanCharactersFrom:to:in:rightX:stopConditions:kern:
> 6.3% {1476ms} Dictionary>>at:ifAbsentPut:
> 6.1% {1425ms} Context>>unwindComplete
> 4.6% {1082ms} Semaphore>>criticalReleasingOnError:
> 4.2% {991ms} Dictionary>>at:ifAbsent:
> 3.3% {785ms} Context>>aboutToReturn:through:
> 2.2% {527ms} Context>>resume:through:
> 2.0% {470ms} ExternalAddress>>isNull
> 1.8% {421ms} BlockClosure>>on:do:
> 1.7% {402ms} RubCharacterBlockScanner(RubCharacterScanner)>>setConditionArray:
> 1.6% {378ms} FreeTypeFace>>validate
> 1.6% {376ms} Dictionary>>scanFor:
> 1.5% {364ms} Context>>unwindComplete:
> 1.5% {344ms} Context>>unwindBlock
> 1.4% {323ms} Array(SequenceableCollection)>>do:
> 1.3% {299ms} Dictionary(HashedCollection)>>findElementOrNil:
> 1.2% {293ms} RunArray>>at:setRunOffsetAndValue:
> 1.2% {289ms} FreeTypeCache>>atFont:charCode:type:ifAbsentPut:
> 1.1% {252ms} FreeTypeCacheLinkedList>>moveDown:
>
> **Memory**
> old +0 bytes
> young -1,485,272 bytes
> used -1,485,272 bytes
> free +1,485,272 bytes
>
> **GCs**
> full 0 totalling 0ms (0.0% uptime)
> incr 947 totalling 1,576ms (7.0% uptime), avg 2.0ms
> tenures 0
> root table 0 overflows
>
> So my question is: are there any other text rendering backends to try? By
> backends I mean ones that don't use Rubric.
>
> Cheers,
>
> Hernán
>
> <Profile_DNABgColoring.txt>
>
>
> --------------------------------------------
> Stéphane Ducasse
> http://stephane.ducasse.free.fr / http://www.pharo.org
> 03 59 35 87 52
> Assistant: Julie Jonas
> FAX 03 59 57 78 50
> TEL 03 59 35 86 16
> S. Ducasse - Inria
> 40, avenue Halley,
> Parc Scientifique de la Haute Borne, Bât.A, Park Plaza
> Villeneuve d'Ascq 59650
> France
>
>