[Pharo-users] [ANN] NeoCSV
Sven Van Caekenberghe
sven at beta9.be
Tue Jun 26 08:45:47 EDT 2012
Doru,
On 26 Jun 2012, at 14:21, Tudor Girba wrote:
> I was just teasing :).
Now that I met you in real life, I really can't imagine you would do something like that, teasing people just for the fun of it ;-)
Actually, just yesterday I was further optimizing NeoCSV on an actual example, reading a 3.3 MB file with 140.000 entries like
16777216,17301503,AU
17367040,17432575,MY
17435136,17435391,AU
17498112,17563647,KR
17563648,17825791,CN
17825792,18087935,KR
18153472,18219007,JP
The old code simply did this:
readFrom: filename
    "Read from a CSV with field start,stop,code as in 3651886848,3651887103,BE"
    "self readFrom: '/Users/sven/Tmp/geo-ip-country/GeoIPCountry.csv'."
    | instance data |
    instance := self new.
    data := OrderedCollection new: 145000.
    FileStream oldFileNamed: filename do: [ :stream |
        [ stream atEnd ] whileFalse: [ | tokens range |
            tokens := stream nextLine findTokens: ','.
            range := IPAddressRangeCountry
                from: tokens first asNumber
                to: tokens second asNumber
                country: tokens third asSymbol.
            data add: range ] ].
    instance data: data asArray.
    ^ instance
The new code using NeoCSV is this:
readFrom: filename
    "Read from a CSV with field start,stop,code as in 3651886848,3651887103,BE"
    "self readFrom: '/Users/sven/Tmp/geo-ip-country/GeoIPCountry.csv'."
    ^ self new
        data: (FileStream oldFileNamed: filename do: [ :stream |
            (NeoCSVReader on: stream)
                recordClass: IPAddressRangeCountry;
                addIntegerField: #start: ; addIntegerField: #stop: ; addSymbolField: #country: ;
                upToEnd ]);
        yourself
The new code is simpler, does more internally, and is faster (2.5 seconds vs 3.5 seconds for the old code).
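To try the reader without the geo-IP file or the IPAddressRangeCountry class, a minimal sketch along the following lines may help. It parses two of the sample lines from an in-memory string. The selectors addIntegerField and addFieldConverter: do not appear in the message above and are assumptions about the NeoCSV API in the version you have loaded, as is the idea that records come back as Arrays when no recordClass: is given.

```smalltalk
"Sketch only: parse two sample records from a String instead of a file.
 Assumes NeoCSVReader understands addIntegerField and addFieldConverter:,
 and that records are returned as Arrays when recordClass: is not set."
| input records |
input := '16777216,17301503,AU' , String cr , '17367040,17432575,MY'.
records := (NeoCSVReader on: input readStream)
    addIntegerField;                                    "start of range"
    addIntegerField;                                    "end of range"
    addFieldConverter: [ :string | string asSymbol ];   "country code"
    upToEnd.
records inspect.
```

Declaring the field types up front is what lets the reader do the conversions itself, so the calling code no longer needs findTokens:, asNumber or asSymbol.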
Yes I am happy, and looking for users ;-)
Now I am going back to struggling with Metacello.
Sven
--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill