[Pharo-users] Web browsing automation

Cédrick Béler cdrick65 at gmail.com
Wed May 4 09:51:44 EDT 2016


Soup is quite easy to use (even if I would prefer an API that uses jQuery/CSS-like navigation selectors).
The easiest way is to explore it interactively (in a debugger).

Just to give an example so that you can start quickly, I just wrote something to scrape info from this web site, db-ip.
The info is in a table and I want its content:
https://db-ip.com/8.8.8.8
So here is the sample code (Soup has lots of examples too):
self cache "a dictionary to avoid scraping again"
    at: anIPString
    ifAbsent: [
        soup := Soup fromUrl: 'https://db-ip.com/', anIPString.
        soupTable := soup findAllTags: 'table'. "ensure size = 1"
        geolocDict := Dictionary new.
        soupTable first childTagsDo: [ :tag |
            geolocDict
                at: (tag children first next contents)
                put: (tag children second next contents) ].
        self cache at: anIPString put: geolocDict ].
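
As a side note, the lookup-then-store pattern used for the cache above can probably be written more compactly with Dictionary>>at:ifAbsentPut:, which stores and answers the block's value when the key is missing. The following is only a minimal playground sketch of that caching behaviour, not the original method: the names cache and scrapeCount and the placeholder dictionary are hypothetical, and the block merely stands in for the Soup scraping code.

```smalltalk
| cache scrapeCount first second |
cache := Dictionary new.
scrapeCount := 0.

"Hypothetical stand-in for the Soup scraping code: it only counts how often
 it runs and answers a placeholder dictionary."
first := cache at: '8.8.8.8' ifAbsentPut: [
    scrapeCount := scrapeCount + 1.
    Dictionary new at: 'placeholder key' put: 'placeholder value'; yourself ].

"Second lookup with the same key: the block is not evaluated again,
 the cached dictionary is answered instead."
second := cache at: '8.8.8.8' ifAbsentPut: [
    scrapeCount := scrapeCount + 1.
    Dictionary new ].
```

With this variant the explicit self cache at: anIPString put: geolocDict at the end of the block would no longer be needed, provided the block answers geolocDict as its last expression.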

Cheers, Cédrik