Sunday, September 15, 2013

Wordle... so nicely done

Discovered Wordle this morning and pointed it at my blog... guess my recent posts really haven't been about Java much :)

Wordle: My blog - recent posts

Saturday, September 7, 2013

Sourcing Twitter data, based on search terms

I started messing about with sourcing data from Twitter, looking to use it with NLTK and maybe Solr sometime in the future. I created a simple IPython Notebook on how to grab data from a Twitter search stream; all the details are included in the notebook.

Unfortunately I couldn't find a simple way to embed the notebook in Blogger, and not wanting to waste time on that, I just hosted it as a Gist. It can be viewed here: NBViewer

Wednesday, September 4, 2013

Review: Learning IPython for Interactive Computing and Data Visualization

I have just finished working through Learning IPython for Interactive Computing and Data Visualization.

Having seen references to IPython ever since my first Google search for 'python', I somehow managed to disregard it with the sentiment of: who works in a console?? Or a browser notebook? What is that? ...
I need an IDE with folders / modules / files / projects... what a shame I wasted so much time...
I blame too many years in Visual Studio, Eclipse, JetBrains IDEs and Xcode for making me ignore it for this long.
Thankfully I have gotten past that, and this book helps you get there fast... < 150 pages fast.

IPython, and especially the IPython Notebook, are great tools. I can see them being awesome for a whole range of tasks:

  • learning python and working through books and tutorials
  • running data mining brainstorming sessions
  • showing people the latest and greatest stuff you've come up with
  • quick Cython implementations & performance experiments
  • processing across multiple cores / servers
  • I even saw Harvard now uses it for homework assignments.

That list could go on and on, but coming back to the book: it targets Python 2.7. Obviously I didn't listen and worked through it in Python 3.3, but thankfully only a couple of very minor changes were needed:

The book uses urllib2 in a couple of examples; in Python 3 that can be replaced with:

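import urllib.request  # Python 3: the submodule must be imported explicitly
r = urllib.request.urlopen('...')  # '...' stands for the URL used in the book's example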
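
If the same notebook has to run under both 2.7 (the book's target) and 3.x, a guarded import is one way to handle the urllib2 change. This is only a minimal sketch; fetch_bytes is a hypothetical helper name that does not come from the book.

```python
# Compatibility import: prefer Python 3's urllib.request, fall back to Python 2's urllib2.
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7, as targeted by the book


def fetch_bytes(url, timeout=10):
    """Hypothetical helper: return the raw response body for `url` as bytes."""
    response = urlopen(url, timeout=timeout)
    try:
        return response.read()
    finally:
        # Close explicitly; Python 2 responses are not context managers.
        response.close()
```

The rest of the notebook can then call fetch_bytes (or urlopen directly) without caring which Python version is running.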


For the networkx example there was also a slight change:

sg = nx.connected_component_subgraphs(g)

This returned a list of graphs, not a single graph, so I just looped over it as follows:

for grp in sg:
    nx.draw_networkx(grp, node_size...
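
To make "a list of graphs, one per connected component" concrete without depending on a particular networkx version, here is a small standard-library sketch. It is not networkx's implementation; the function name, the adjacency-dict format and the toy graph are all assumptions made for illustration.

```python
from collections import deque


def connected_components(adjacency):
    """Yield one set of nodes per connected component of an undirected graph.

    `adjacency` maps each node to an iterable of its neighbours.
    """
    seen = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        yield component


# Hypothetical toy graph with two separate groups of nodes.
toy_graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "x": ["y"], "y": ["x"],
}

for group in connected_components(toy_graph):
    # In the notebook, this is the point where each group's subgraph gets drawn.
    print(sorted(group))
```

Each pass through the loop handles one component, which is why a single draw call on the whole result does not work and a loop is needed.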


Then for the maps exercise I did not have all the dependencies.
I needed to install GEOS, and used MacPorts for that:
sudo port install geos

Then in my .bash_profile I added:
export GEOS_DIR=/opt/local

To refresh the profile:
source ~/.bash_profile
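
Before kicking off a build, it can be worth confirming from Python that the variable really is visible and points at a GEOS install. The sketch below assumes the usual layout where the C header lives at include/geos_c.h under the prefix; check_geos_dir is a hypothetical helper, not part of Basemap.

```python
import os


def check_geos_dir(environ=None):
    """Return (ok, message) describing whether GEOS_DIR looks usable."""
    environ = os.environ if environ is None else environ
    geos_dir = environ.get("GEOS_DIR")
    if not geos_dir:
        return False, "GEOS_DIR is not set in this shell"
    # Assumption: the GEOS C header sits in <prefix>/include/geos_c.h.
    header = os.path.join(geos_dir, "include", "geos_c.h")
    if not os.path.isfile(header):
        return False, "no geos_c.h under " + os.path.join(geos_dir, "include")
    return True, "found " + header


if __name__ == "__main__":
    ok, message = check_geos_dir()
    print(message)
```

If this reports that GEOS_DIR is not set, the profile has most likely not been re-sourced in the current terminal.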

Then for Basemap, I downloaded the zip, here.
Followed by (in the basemap-1.0.7 directory):
python setup.py install

That's about it: a concise intro to a great product.

Now to really put it to the test, the next book I am working through is:
Building Machine Learning Systems with Python

