Chapter 5 of Programming Collective Intelligence (PCI) deals with optimisation problems.
"Optimisation finds the best solution to a problem by trying many different solutions and scoring them to determine their quality. Optimisation is typically used in cases where there are too many possible solutions to try them all."
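To make the "try many solutions and score them" idea concrete, here is a minimal sketch of the simplest possible optimiser: pure random search. The names `random_optimise` and `toy_cost` and the toy problem are my own assumptions for illustration, not code from the book.

```python
import random


def random_optimise(domain, cost_fn, tries=1000, rng=None):
    """Guess random solutions and keep the cheapest one seen.

    domain  -- list of (low, high) integer bounds, one pair per variable
    cost_fn -- function scoring a solution; lower is better
    """
    rng = rng or random.Random()
    best, best_cost = None, float("inf")
    for _ in range(tries):
        # Build one candidate by picking a random value for every variable.
        candidate = [rng.randint(lo, hi) for lo, hi in domain]
        cost = cost_fn(candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


def toy_cost(solution):
    """Hypothetical cost function: squared distance from all 3s."""
    return sum((x - 3) ** 2 for x in solution)


domain = [(0, 9)] * 4
best, best_cost = random_optimise(domain, toy_cost, tries=500, rng=random.Random(0))
print(best, best_cost)
```

Random search ignores everything it has already learnt about the problem, which is why the smarter strategies in the chapter are worth the effort.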
Before embarking on this chapter I decided that it would be best to quickly learn Python. There just seems to be a lot of Python around as soon as you start learning and reading about machine learning and data analysis, so it can't really be ignored.
(I'm still not sure why this is the case, but I set out to get up and running with Python in one weekend.)
Some of the resources I used:
As a developer, learning the basics of Python really isn't very difficult. To be honest, it probably took me longer to find a development environment I was happy with; consoles and text editors just don't do it for me.
The main ones I investigated were:
Ninja IDE (Free)
Eclipse + PyDev (Free)
I spent quite a bit of time playing with Ninja IDE and Eclipse, but there were little things that kept bugging me, from strange shortcuts to correct code and syntax being highlighted as incorrect.
10 minutes after installing PyCharm, I was sold. To be fair, I am probably not the best person to judge.
I code in IntelliJ daily and actually ended up converting all the Java developers in my department from Eclipse to IntelliJ. I also did the majority of my Objective-C work in AppCode. In other words, I am a JetBrains fanboy, happy to hand over my money for an awesome tool.
Getting back to PCI, there were a couple of issues with the code in this chapter, which caused me (a person who had just learnt Python) a little bit of pain. I figured the code had to be right and that I was just doing something wrong. Eventually I went searching and found:
With that I corrected the issues in the hillclimb and genetic algorithm functions:
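For readers without the book to hand, a hill climb in this style can be sketched roughly as follows. This is only an illustration of the technique, not the corrected listing itself; the names `hill_climb` and `toy_cost` and the (low, high) domain format are assumptions of mine.

```python
import random


def hill_climb(domain, cost_fn, rng=None):
    """Move to the best neighbouring solution until none is an improvement.

    domain  -- list of (low, high) integer bounds, one pair per variable
    cost_fn -- function scoring a solution; lower is better
    """
    rng = rng or random.Random()
    current = [rng.randint(lo, hi) for lo, hi in domain]
    current_cost = cost_fn(current)
    while True:
        # Neighbours differ from the current solution by one step in one variable.
        neighbours = []
        for i, (lo, hi) in enumerate(domain):
            if current[i] > lo:
                neighbours.append(current[:i] + [current[i] - 1] + current[i + 1:])
            if current[i] < hi:
                neighbours.append(current[:i] + [current[i] + 1] + current[i + 1:])
        if not neighbours:
            break
        best = min(neighbours, key=cost_fn)
        best_cost = cost_fn(best)
        if best_cost >= current_cost:
            break  # no neighbour is better: we are at a (possibly local) minimum
        current, current_cost = best, best_cost
    return current, current_cost


def toy_cost(solution):
    """Hypothetical cost function with a single minimum at all 3s."""
    return sum((x - 3) ** 2 for x in solution)


result = hill_climb([(0, 9)] * 4, toy_cost, rng=random.Random(0))
print(result)
```

On a cost function with a single minimum like this toy one, the climb always reaches it. On a bumpier function it can stop at a local minimum, which is the weakness the other algorithms in the chapter try to address.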
The Java implementation of the 3 functions ended up twice as long and looks like:
Unlike my previous posts on PCI, I didn't use a whole bunch of open source libraries; I only added one.
The whole Chapter 5 Optimisation solution is in my Blog Github repo. The concepts used in both the Simulated Annealing and Genetic Algorithm implementations could easily be adapted and reused if you are looking for a simple example of either.
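As a taste of how small the core idea is, here is a rough Python sketch of simulated annealing. It is not the code from the repo or the book: `anneal`, `toy_cost`, the default temperature and cooling rate, and the choice to remember the best solution seen so far are all assumptions I made for the example.

```python
import math
import random


def anneal(domain, cost_fn, temp=10000.0, cool=0.95, step=1, rng=None):
    """Simulated annealing over integer variables bounded by domain.

    A worse candidate is accepted with probability exp(-delta / temp),
    so bad moves are common while hot and rare once cooled.
    """
    rng = rng or random.Random()
    current = [rng.randint(lo, hi) for lo, hi in domain]
    current_cost = cost_fn(current)
    best, best_cost = current, current_cost
    while temp > 0.1:
        # Nudge one randomly chosen variable, clamped to its bounds.
        i = rng.randrange(len(domain))
        lo, hi = domain[i]
        candidate = list(current)
        candidate[i] = min(hi, max(lo, candidate[i] + rng.randint(-step, step)))
        candidate_cost = cost_fn(candidate)
        delta = candidate_cost - current_cost
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= cool  # cool down a little after every iteration
    return best, best_cost


def toy_cost(solution):
    """Hypothetical cost function: squared distance from all 3s."""
    return sum((x - 3) ** 2 for x in solution)


best, best_cost = anneal([(0, 9)] * 4, toy_cost, rng=random.Random(0))
print(best, best_cost)
```

The willingness to accept a worse solution early on is what lets annealing escape the local minima that trap a plain hill climb.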
Now for Chapter 6 ... Document Filtering...