I just finished working through Chapter 7 of Programming Collective Intelligence (PCI). This chapter demonstrates how, when and why you should use the decision tree construct. The method described is the CART (Classification and Regression Trees) technique.
The basic summary is: in a decision tree, each branch node represents a choice between a number of alternatives, and each leaf node represents a decision (or classification). This makes the decision tree another supervised machine learning algorithm that is useful for classifying information.
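To make that structure concrete, here is a minimal sketch of such a tree in Java. The names (`TreeNodeSketch`, `leaf`, `branch`, `classify`) are assumptions made up for illustration rather than the actual implementation, and the sketch assumes binary true/false splits on string values only.

```java
import java.util.Map;

// Hypothetical node type: a node is either a leaf (label set) or a branch (column/value set).
class TreeNodeSketch {
    private final String label;               // leaf only: the classification
    private final String column;              // branch only: attribute to test
    private final String value;               // branch only: value to compare against
    private final TreeNodeSketch trueBranch;  // followed when the attribute equals value
    private final TreeNodeSketch falseBranch; // followed otherwise

    private TreeNodeSketch(String label, String column, String value,
                           TreeNodeSketch trueBranch, TreeNodeSketch falseBranch) {
        this.label = label;
        this.column = column;
        this.value = value;
        this.trueBranch = trueBranch;
        this.falseBranch = falseBranch;
    }

    static TreeNodeSketch leaf(String label) {
        return new TreeNodeSketch(label, null, null, null, null);
    }

    static TreeNodeSketch branch(String column, String value,
                                 TreeNodeSketch trueBranch, TreeNodeSketch falseBranch) {
        return new TreeNodeSketch(null, column, value, trueBranch, falseBranch);
    }

    // Walk down the tree: each branch makes one choice, the leaf gives the answer.
    String classify(Map<String, String> observation) {
        if (label != null) {
            return label;
        }
        // A missing attribute yields null, which never equals value, so it goes to falseBranch.
        TreeNodeSketch next = value.equals(observation.get(column)) ? trueBranch : falseBranch;
        return next.classify(observation);
    }
}
```

Classifying an observation is then just a walk from the root to a leaf, which is also why the result is so easy to read back as a set of rules.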
The main problem to overcome in building a decision tree is how to identify the best split of the data points. To find it you need to go through every column and every value in the data set, identify which one gives you the best split (the highest gain), and start from there.
For some more technical information about this split / gain:
http://en.wikipedia.org/wiki/Information_gain_in_decision_trees
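As a rough illustration of that search, the sketch below scores every "column equals value" split by entropy-based information gain and keeps the best one. It is a simplified sketch under a few assumptions: rows are `String[]` with the class label in the last column, only equality tests on categorical values are tried (no numeric thresholds), and the names `SplitGainSketch`, `entropy`, `gain` and `bestSplit` are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SplitGainSketch {

    // Shannon entropy (in bits) of the class labels, read from the last column of each row.
    static double entropy(List<String[]> rows) {
        if (rows.isEmpty()) {
            return 0.0;
        }
        Map<String, Integer> counts = new HashMap<>();
        for (String[] row : rows) {
            counts.merge(row[row.length - 1], 1, Integer::sum);
        }
        double result = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / rows.size();
            result -= p * (Math.log(p) / Math.log(2));
        }
        return result;
    }

    // Gain of splitting on "column equals value": parent entropy minus weighted child entropy.
    static double gain(List<String[]> rows, int col, String value) {
        List<String[]> trueRows = new ArrayList<>();
        List<String[]> falseRows = new ArrayList<>();
        for (String[] row : rows) {
            (row[col].equals(value) ? trueRows : falseRows).add(row);
        }
        double p = (double) trueRows.size() / rows.size();
        return entropy(rows) - p * entropy(trueRows) - (1 - p) * entropy(falseRows);
    }

    // Tries every (column, value) pair seen in the data; expects at least one row.
    // Returns {column index as text, value} for the highest gain, or null if no split helps.
    static String[] bestSplit(List<String[]> rows) {
        double bestGain = 0.0;
        String[] best = null;
        int featureCount = rows.get(0).length - 1; // last column is the label
        for (int col = 0; col < featureCount; col++) {
            for (String[] candidate : rows) {
                double g = gain(rows, col, candidate[col]);
                if (g > bestGain) {
                    bestGain = g;
                    best = new String[] {String.valueOf(col), candidate[col]};
                }
            }
        }
        return best;
    }
}
```

Building the full tree would then mean applying `bestSplit` recursively to the two halves until no split gives a positive gain.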
The biggest advantages I see in using a decision tree are:
It's easy to interpret and visualise.
The data doesn't need to be normalised or scaled to something between -1 and 1.
Decision trees, however, can't be used effectively on large datasets with a large number of possible results.
As with my previous Classifiers post, I ended up using an in-memory SQLite database, as it's such a pleasure to use. I did venture into using LambdaJ, but it ended up being such an ugly line of code that I left it and simply did it manually. I have not looked at the Java 8 implementation of lambdas yet; I just hope it doesn't end in code like this (with a whole bunch of static imports):
falseList.add(filter(not(having(on(List.class).get(col).toString(), equalTo((String) value))), asList(rows)));
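For comparison, doing the same filter by hand comes down to a plain loop along the lines of the sketch below. This is an illustration only: the class and method names (`DivideSketch`, `divide`) and the `List<Object>` row type are assumptions, not the actual code from the repository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DivideSketch {

    // Splits rows in two: index 0 holds the rows whose column matches value, index 1 the rest.
    static List<List<List<Object>>> divide(List<List<Object>> rows, int col, String value) {
        List<List<Object>> trueList = new ArrayList<>();
        List<List<Object>> falseList = new ArrayList<>();
        for (List<Object> row : rows) {
            if (row.get(col).toString().equals(value)) {
                trueList.add(row);
            } else {
                falseList.add(row);
            }
        }
        return Arrays.asList(trueList, falseList);
    }
}
```

It is a few more lines than the LambdaJ version, but both halves come out of a single pass and it reads top to bottom.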
So my Java implementation of the PCI decision tree ended up looking like this (all code is on GitHub):
(Once again, about 50% more code :) ) I'm really beginning to enjoy Python, and I can see myself using it as a first choice for all future AI / ML type work.