[Pharo-dev] FloatArray

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Tue May 21 04:05:41 EDT 2019


Hi Serge,
This is good news; having TensorFlow bindings is also a must!
Here is what I get in Smallapack with a pure-CPU, unaccelerated BLAS (no MKL
or ATLAS, just the plain netlib reference code):

| a b |
a := LapackDGEMatrix randNormal: #(1000 1000).
b := LapackDGEMatrix randNormal: #(1000 1000).
[a * b] timeToRun
 783

| a b |
a := LapackSGEMatrix randNormal: #(1000 1000).
b := LapackSGEMatrix randNormal: #(1000 1000).
[a * b] timeToRun
 448

Intel(R) Xeon(R) CPU E3-1245 v3 @ 3.40GHz
So I think we can do much better with an accelerated library!
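
As a rough sanity check on those numbers, here is a small playground sketch of the implied throughput. It assumes that timeToRun reports milliseconds here and that a dense n x n product costs about 2 * n^3 floating-point operations; the variable names are only illustrative.

```smalltalk
"Implied throughput of the two runs above, under the stated assumptions."
| n flops gflopsDouble gflopsSingle |
n := 1000.
flops := 2 * (n raisedTo: 3).            "about 2e9 operations per product"
gflopsDouble := flops / 0.783 / 1e9.     "about 2.55 GFLOP/s (LapackDGEMatrix, 783 ms)"
gflopsSingle := flops / 0.448 / 1e9.     "about 4.46 GFLOP/s (LapackSGEMatrix, 448 ms)"
{ gflopsDouble. gflopsSingle }
```

Both figures are modest for this class of CPU, which is consistent with the reference netlib code being the limiting factor.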

On Tue, May 21, 2019 at 05:13, Serge Stinckwich <serge.stinckwich at gmail.com>
wrote:

> There is another solution with my TensorFlow Pharo binding:
> https://github.com/PolyMathOrg/libtensorflow-pharo-bindings
>
> You can do a matrix multiplication like this:
>
> | graph t1 t2 c1 c2 mult session result |
> graph := TF_Graph create.
> t1 := TF_Tensor fromFloats: (1 to:1000000) asArray shape:#(1000 1000).
> t2 := TF_Tensor fromFloats: (1 to:1000000) asArray shape:#(1000 1000).
> c1 := graph const: 'c1' value: t1.
> c2 := graph const: 'c2' value: t2.
> mult := c1 * c2.
> session := TF_Session on: graph.
> result := session runOutput: (mult output: 0).
> result asNumbers
>
> Here I'm multiplying two 1000x1000 matrices in 537 ms on my computer.
>
> All operations are expressed as a graph of operations that runs outside
> Pharo, so it can be very fast.
> Operations can run on CPU or GPU, with 32-bit or 64-bit floats.
>
> This is a work in progress but can already be used.
> Regards,
>
>
>
> On Tue, May 21, 2019 at 6:54 AM Jimmie Houchin <jlhouchin at gmail.com>
> wrote:
>
>> I wasn't worried about how to do sliding windows. My problem is that
>> using LapackDGEMatrix in my example was 18x slower than FloatArray, which
>> is slower than Numpy. It isn't what I was expecting.
>>
>> What I didn't know is whether I was doing something wrong to cause such a
>> tremendous slowdown.
>>
>> Python and Numpy are not my favorite, but they aren't uncomfortable.
>>
>> So I gave up and went back to Numpy.
>>
>> Thanks.
>>
>>
>>
>> On 5/20/19 5:17 PM, Nicolas Cellier wrote:
>>
>> Hi Jimmie,
>> indeed, I am not subscribed...
>> Efficient sliding-window averages are possible; here is how I would do
>> it:
>>
>> "Create a vector with 100,000 rows filled with random values (uniform
>> distribution in [0,1])"
>> v := LapackDGEMatrix randUniform: #(100000 1).
>>
>> "extract values from rank 10001 to 20000"
>> w1 := v atIntervalFrom: 10001 to: 20000 by: 1.
>>
>> "create a left multiplier matrix for performing average of w1"
>> a := LapackDGEMatrix nrow: 1 ncol: w1 nrow withAll: 1.0 / w1 size.
>>
>> "get the average (this is a 1x1 matrix from which we take first element)"
>> avg1 := (a * w1) at: 1.
>>
>> [ "select another slice of same size"
>> w2 := v atIntervalFrom: 15001 to: 25000 by: 1.
>>
>> "get the average (we can recycle a)"
>> avg2 := (a * w2) at: 1 ] bench.
>>
>> This gives:
>>  '16,500 per second. 60.7 microseconds per run.'
>> versus:
>> [w2 sum / w2 size] bench.
>>  '1,100 per second. 908 microseconds per run.'
>>
>> For max and min, it's harder. LAPACK/BLAS only provides the maximum
>> absolute value as a primitive:
>> [w2 absMax] bench.
>>  '19,400 per second. 51.5 microseconds per run.'
>>
>> Everything else will be slower, unless we write new primitives in C and
>> connect them...
>> [w2 maxOf: [:each | each]] bench.
>>  '984 per second. 1.02 milliseconds per run.'
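>>
>> For comparison, here is a minimal plain-Pharo sketch of a moving average
>> that keeps a running sum instead of re-summing every window. It uses only
>> core collections, not the Smallapack API; data and windowSize are
>> illustrative names.
>>
>> ```smalltalk
>> "Moving average with a running sum: one addition and one subtraction per step."
>> | data windowSize runningSum averages |
>> data := #(1.0 2.0 3.0 4.0 5.0 6.0).
>> windowSize := 3.
>> runningSum := 0.0.
>> averages := OrderedCollection new.
>> 1 to: data size do: [ :i |
>>     runningSum := runningSum + (data at: i).
>>     "drop the element that just left the window"
>>     i > windowSize ifTrue: [ runningSum := runningSum - (data at: i - windowSize) ].
>>     "emit an average once the first full window is available"
>>     i >= windowSize ifTrue: [ averages add: runningSum / windowSize ] ].
>> averages asArray
>> ```
>>
>> On long series the running sum can accumulate rounding error, so
>> recomputing it from scratch every so often may be worthwhile.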
>>
>> On Sun, May 19, 2019 at 14:58, Jimmie <jlhouchin at gmail.com> wrote:
>>
>>> On 5/16/19 1:26 PM, Nicolas Cellier wrote:
>>>  > Any feedback on this?
>>>  > Did someone try to use Smallapack in Pharo?
>>>  > Jimmie?
>>>  >
>>>
>>> I am going to guess that you are not on pharo-users. My bad.
>>> I posted this in pharo-users as it wasn't a Pharo development question.
>>>
>>> I probably should have posted here or emailed you directly.
>>>
>>> All I really need is good performance with a simple array of floats. No
>>> matrix math. Nothing complicated. Moving Averages over a slice of the
>>> array. A variety of different averages, weighted, etc. Max/min of the
>>> array. But just a single simple array.
>>>
>>> Any help greatly appreciated.
>>>
>>> Thanks.
>>>
>>>
>>> On 4/28/19 8:32 PM, Jimmie Houchin wrote:
>>> Hello,
>>>
>>> I have installed Smallapack into Pharo 7.0.3. Thanks, Nicolas.
>>>
>>> I am very unsure about my use of Smallapack. I am not a mathematician or
>>> scientist. However, the only part of Smallapack I am trying to use at the
>>> moment is something that would be 64-bit and comparable to FloatArray, so
>>> that I can do some simple accessing, slicing, sum, and average on the
>>> array.
>>>
>>> Here is some sample code I wrote just to play in a playground.
>>>
>>> I have an ExternalDoubleArray, LapackDGEMatrix, and a FloatArray
>>> samples. The ones not in use are commented out for any run.
>>>
>>> fp is a download from
>>> http://ratedata.gaincapital.com/2018/12%20December/EUR_USD_Week1.zip
>>> and unzipped to a directory.
>>>
>>> fp := '/home/jimmie/data/EUR_USD_Week1.csv'.
>>> index := 0.
>>> pricesSum := 0.
>>> asum := 0.
>>> ttr := [
>>>      lines := fp asFileReference contents lines allButFirst.
>>>      a := ExternalDoubleArray new: lines size.
>>>      "la := LapackDGEMatrix allocateNrow: lines size ncol: 1.
>>>      a := la columnAt: 1."
>>>      "a := FloatArray new: lines size."
>>>      lines do: [ :line || parts price |
>>>          parts := ',' split: line.
>>>          index := index + 1.
>>>          price := Float readFrom: (parts last).
>>>          a at: index put: price.
>>>          pricesSum := pricesSum + price.
>>>          (index rem: 100) = 0 ifTrue: [
>>>              asum := a sum.
>>>       ]]] timeToRun.
>>> { index. pricesSum. asum. ttr }.
>>>   "ExternalDoubleArray an Array(337588 383662.5627699992
>>> 383562.2956199993 0:00:01:59.885)"
>>>   "FloatArray  an Array(337588 383662.5627699992 383562.2954441309
>>> 0:00:00:06.555)"
>>>
>>> FloatArray is not the precision I need. But it is over 18x faster.
>>>
>>> I am afraid I must be doing something badly wrong. Python/Numpy is over
>>> 4x faster than FloatArray for the above.
>>>
>>> If I am using Smallapack incorrectly please help.
>>>
>>> Any help greatly appreciated.
>>>
>>> Thanks.
>>>
>>>
>>>
>
> --
> Serge Stinckwich
>
> Int. Research Unit on Modelling/Simulation of Complex Systems (UMMISCO)
> Sorbonne University (SU)
> French National Research Institute for Sustainable Development (IRD)
> University of Yaoundé I, Cameroon
> "Programs must be written for people to read, and only incidentally for
> machines to execute."
> https://twitter.com/SergeStinckwich
>