[Pharo-dev] FloatArray

Jimmie Houchin jlhouchin at gmail.com
Tue May 21 06:50:16 EDT 2019


Not a problem. I greatly respective other peoples time and priorities 
and their personal lives.

Just for the record I am using 64bit Pharo on a fast i7, 16gb ram, 
laptop running Xubunut 18.04 64bit.

I do not remember any problems loading. And within the small amount of 
experimenting that I did, it seemed to operate fine.

Again, thanks for your contribution. I know it is a lot of work and a 
pretty large area to cover. Python/Numpy has armies of people working on 
this.

Jimmie


On 5/21/19 2:54 AM, Nicolas Cellier wrote:
> Hi Jimmie,
> I didn't take time yesterday to analyze your specific example because 
> it was quite late, but here are some remarks:
> 1) First, I recommend using 64bits Pharo, because number crunching and 
> Float operations will be faster (not FloatArray though).
> 2) it would be nice to use a profiler to analyze where time is spent
>   I would not be amazed that (Float readFrom:...) takes a non 
> neglectable percentage of it
> 3) ExternalDoubleArray only add overhead if no bulk-operation is performed
>   (like reading raw binary data or serving as storage area passed to 
> Lapack/blas primitives)
>   it does not provide accelerated features by itself indeed.
>   I think that it is too low level to serve as a primary interface.
> 4) LapackXXXMatrix sum has effectively not been optimized to use BLAS, 
> and this can be easily corrected, thanks for giving this example.
> With some cooperation, we could easily make some progress, there are 
> low hanging fruits.
> But I understand if you prefer to stick with more mature numpy solution.
> Thanks for trying. At least, you were able to load and use Smallapack 
> in Pharo, and this is already a good feedback.
> If you have time, I'll publish a small enhancement for accelerating 
> sum, and ask you to retry.
> Thanks again
>
>
> Le mar. 21 mai 2019 à 05:13, Serge Stinckwich 
> <serge.stinckwich at gmail.com <mailto:serge.stinckwich at gmail.com>> a écrit :
>
>     There is another solution with my TensorFlow Pharo binding:
>     https://github.com/PolyMathOrg/libtensorflow-pharo-bindings
>
>     You can do a matrix multiplication like that :
>
>     | graph t1 t2 c1 c2 mult session result |
>     graph := TF_Graph create.
>     t1 := TF_Tensor fromFloats: (1 to:1000000) asArray shape:#(1000 1000).
>     t2 := TF_Tensor fromFloats: (1 to:1000000) asArray shape:#(1000 1000).
>     c1 := graph const: 'c1' value: t1.
>     c2 := graph const: 'c2' value: t2.
>     mult := c1 * c2.
>     session := TF_Session on: graph.
>     result := session runOutput: (mult output: 0).
>     result asNumbers
>
>     Here I'm doing a multiplication between 2 matrices of 1000X1000
>     size in 537 ms on my computer.
>
>     All operations can be done in a graph of operations that is run
>     outside Pharo, so could be very fast.
>     Operations can be done on CPU or GPU. 32 bits or 64 bits float
>     operations are possible.
>
>     This is a work in progress but can already be used.
>     Regards,
>
>
>
>     On Tue, May 21, 2019 at 6:54 AM Jimmie Houchin
>     <jlhouchin at gmail.com <mailto:jlhouchin at gmail.com>> wrote:
>
>         I wasn't worried about how to do sliding windows. My problem
>         is that using LapackDGEMatrix in my example was 18x slower
>         than FloatArray, which is slower than Numpy. It isn't what I
>         was expecting.
>
>         What I didn't know is if I was doing something wrong to cause
>         such a tremendous slow down.
>
>         Python and Numpy is not my favorite. But it isn't uncomfortable.
>
>         So I gave up and went back to Numpy.
>
>         Thanks.
>
>
>
>         On 5/20/19 5:17 PM, Nicolas Cellier wrote:
>>         Hi Jimmie,
>>         effectively I did not subsribe...
>>         Having efficient methods for sliding window average is
>>         possible, here is how I would do it:
>>
>>         "Create a vector with 100,000 rows filles with random values
>>         (uniform distrubution in [0,1]"
>>         v := LapackDGEMatrix randUniform: #(100000 1).
>>
>>         "extract values from rank 10001 to 20000"
>>         w1 := v atIntervalFrom: 10001 to: 20000 by: 1.
>>
>>         "create a left multiplier matrix for performing average of w1"
>>         a := LapackDGEMatrix nrow: 1 ncol: w1 nrow withAll: 1.0 / w1
>>         size.
>>
>>         "get the average (this is a 1x1 matrix from which we take
>>         first element)"
>>         avg1 := (a * w1) at: 1.
>>
>>         [ "select another slice of same size"
>>         w2 := v atIntervalFrom: 15001 to: 25000 by: 1.
>>
>>         "get the average (we can recycle a)"
>>         avg2 := (a * w2) at: 1 ] bench.
>>
>>         This gives:
>>          '16,500 per second. 60.7 microseconds per run.'
>>         versus:
>>         [w2 sum / w2 size] bench.
>>          '1,100 per second. 908 microseconds per run.'
>>
>>         For max and min, it's harder. Lapack/Blas only provide max of
>>         absolute value as primitive:
>>         [w2 absMax] bench.
>>          '19,400 per second. 51.5 microseconds per run.'
>>
>>         Everything else will be slower, unless we write new
>>         primitives in C and connect them...
>>         [w2 maxOf: [:each | each]] bench.
>>          '984 per second. 1.02 milliseconds per run.'
>>
>>         Le dim. 19 mai 2019 à 14:58, Jimmie <jlhouchin at gmail.com
>>         <mailto:jlhouchin at gmail.com>> a écrit :
>>
>>             On 5/16/19 1:26 PM, Nicolas Cellier wrote:> Any feedback
>>             on this?
>>              > Did someone tried to use Smallapack in Pharo?
>>              > Jimmie?
>>              >
>>
>>             I am going to guess that you are not on pharo-users. My bad.
>>             I posted this in pharo-users as I it wasn't Pharo
>>             development question.
>>
>>             I probably should have posted here or emailed you directly.
>>
>>             All I really need is good performance with a simple array
>>             of floats. No
>>             matrix math. Nothing complicated. Moving Averages over a
>>             slice of the
>>             array. A variety of different averages, weighted, etc.
>>             Max/min of the
>>             array. But just a single simple array.
>>
>>             Any help greatly appreciated.
>>
>>             Thanks.
>>
>>
>>             On 4/28/19 8:32 PM, Jimmie Houchin wrote:
>>             Hello,
>>
>>             I have installed Smallapack into Pharo 7.0.3. Thanks
>>             Nicholas.
>>
>>             I am very unsure on my use of Smallapack. I am not a
>>             mathematician or
>>             scientist. However the only part of Smallapack I am
>>             trying to use at the
>>             moment is something that would  be 64bit and compare to
>>             FloatArray so
>>             that I can do some simple accessing, slicing, sum, and
>>             average on the array.
>>
>>             Here is some sample code I wrote just to play in a
>>             playground.
>>
>>             I have an ExternalDoubleArray, LapackDGEMatrix, and a
>>             FloatArray
>>             samples. The ones not in use are commented out for any run.
>>
>>             fp is a download from
>>             http://ratedata.gaincapital.com/2018/12%20December/EUR_USD_Week1.zip
>>             and unzipped to a directory.
>>
>>             fp := '/home/jimmie/data/EUR_USD_Week1.csv'
>>             index := 0.
>>             pricesSum := 0.
>>             asum := 0.
>>             ttr := [
>>                  lines := fp asFileReference contents lines allButFirst.
>>                  a := ExternalDoubleArray new: lines size.
>>                  "la := LapackDGEMatrix allocateNrow: lines size ncol: 1.
>>                  a := la columnAt: 1."
>>                  "a := FloatArray new: lines size."
>>                  lines do: [ :line || parts price |
>>                      parts := ',' split: line.
>>                      index := index + 1.
>>                      price := Float readFrom: (parts last).
>>                      a at: index put: price.
>>                      pricesSum := pricesSum + price.
>>                      (index rem: 100) = 0 ifTrue: [
>>                          asum := a sum.
>>                   ]]] timeToRun.
>>             { index. pricesSum. asum. ttr }.
>>               "ExternalDoubleArray an Array(337588 383662.5627699992
>>             383562.2956199993 0:00:01:59.885)"
>>               "FloatArray  an Array(337588 383662.5627699992
>>             383562.2954441309
>>             0:00:00:06.555)"
>>
>>             FloatArray is not the precision I need. But it is over
>>             18x faster.
>>
>>             I am afraid I must be doing something badly wrong.
>>             Python/Numpy is over
>>             4x faster than FloatArray for the above.
>>
>>             If I am using Smallapack incorrectly please help.
>>
>>             Any help greatly appreciated.
>>
>>             Thanks.
>>
>>
>
>
>     -- 
>     Serge Stinckwic
>     ​h​
>
>     Int. Research Unit
>      on Modelling/Simulation of Complex Systems (UMMISCO)
>     ​Sorbonne University
>      (SU)
>     French National Research Institute for Sustainable Development (IRD)​
>     U
>     ​niversity of Yaoundé I​, Cameroon
>     "Programs must be written for people to read, and only
>     incidentally for machines to execute."
>     https://twitter.com/SergeStinckwich
>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.pharo.org/pipermail/pharo-dev_lists.pharo.org/attachments/20190521/0dd6db21/attachment.html>


More information about the Pharo-dev mailing list