Saturday, May 11, 2013

Similarity Score Algorithms

As per my previous post, I am working through Programming Collective Intelligence. The first couple of algorithms described in the book are for finding a similarity score; the methods it works through are Euclidean Distance and the Pearson Correlation Coefficient. The Manhattan distance score is also mentioned, but from what I could find it is just the sum of the absolute differences of the coordinates, instead of the squared differences (Math.pow 2) used in Euclidean distance.

I worked through this and wrote/found some Java equivalents for future use:

Euclidean Distance:
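
A minimal sketch of how the Euclidean distance, and for comparison the Manhattan distance mentioned above, might look in Java. The class name DistanceScores and its method names are made up for illustration, and turning a distance into a score with 1 / (1 + distance) is just one common convention, not necessarily the exact form used in the book.

```java
final class DistanceScores {

    // Euclidean distance: square root of the summed squared differences.
    static double euclidean(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(a[i] - b[i], 2);
        }
        return Math.sqrt(sum);
    }

    // Manhattan distance: sum of the absolute differences.
    static double manhattan(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    // Assumed convention: maps a distance of 0 to a score of 1,
    // and larger distances to scores approaching 0.
    static double similarity(double distance) {
        return 1.0 / (1.0 + distance);
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Inputs must have the same length");
        }
    }
}
```

For the points (0, 0) and (3, 4), euclidean gives 5 while manhattan gives 7, which shows the difference between squaring and taking absolute values.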

Pearson Correlation Coefficient:
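
A minimal sketch of the Pearson correlation coefficient in Java, using the usual sums-based formula. The class name PearsonScore is an assumption for illustration, and returning 0 when the denominator is not positive (for example when one series is constant) is a choice made here, not something the text above prescribes.

```java
final class PearsonScore {

    // Returns a value between -1 and 1; 1 means the two series rise and
    // fall together perfectly, -1 means they move in opposite directions.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Inputs must have the same length");
        }
        int n = x.length;
        if (n == 0) {
            return 0.0;
        }

        double sumX = 0.0, sumY = 0.0;
        double sumXX = 0.0, sumYY = 0.0, sumXY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
            sumXX += x[i] * x[i];
            sumYY += y[i] * y[i];
            sumXY += x[i] * y[i];
        }

        double numerator = sumXY - (sumX * sumY) / n;
        double denominator = Math.sqrt(
                (sumXX - (sumX * sumX) / n) * (sumYY - (sumY * sumY) / n));

        // Assumed fallback: no measurable correlation if a series has no variance.
        if (!(denominator > 0.0)) {
            return 0.0;
        }
        return numerator / denominator;
    }
}
```

Unlike the distance-based scores, this one is not thrown off when one series is consistently higher than the other, as long as the two move together.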

Friday, May 3, 2013

Venture into AI, Machine Learning and all those algorithms that go with it.

It's been 4 months since my last blog entry. I took it easy for a little while, as we all need to do from time to time... but before long my brain got these nagging ideas and questions:

How hard can AI and Machine learning actually be?
How does it work?
I bet people are just overcomplicating it.
How are they currently trying to solve it?
Is it actually that difficult?
Could it be done differently?

So off I went to search the internet. Some of the useful sites I came across:
http://www.ai-junkie.com
Machine-learning Stanford Video course
Genetic algorithm example

I also ended up buying 2 books on Amazon:

Firstly, from many different recommendations:
Programming Collective Intelligence

I will be "working" through this book. While reading, I will be translating the algorithms it defines (in Python) into Java, implementing them and blogging about them, along with any algorithms it only mentions, which I will research separately. This is mainly for my own understanding and for the benefit of reusing them later, and an excuse to play with Java 7.

However, since I want to work through that book practically, I needed another for some "light" reading before sleep. I found another book via an MIT Technology Review article, Deep Learning. A bit that caught my eye was:


For all the advances, not everyone thinks deep learning can move artificial intelligence toward something rivaling human intelligence. Some critics say deep learning and AI in general ignore too much of the brain’s biology in favor of brute-force computing.
One such critic is Jeff Hawkins, founder of Palm Computing, whose latest venture, Numenta, is developing a machine-learning system that is biologically inspired but does not use deep learning. Numenta’s system can help predict energy consumption patterns and the likelihood that a machine such as a windmill is about to fail. Hawkins, author of On Intelligence, a 2004 book on how the brain works and how it might provide a guide to building intelligent machines, says deep learning fails to account for the concept of time. Brains process streams of sensory data, he says, and human learning depends on our ability to recall sequences of patterns: when you watch a video of a cat doing something funny, it’s the motion that matters, not a series of still images like those Google used in its experiment. “Google’s attitude is: lots of data makes up for everything,” Hawkins says.



So the second book I purchased was On Intelligence.
So far (I am only up to page 54) 2 things from this book have embedded themselves in my brain:
"Complexity is a symptom of confusion, not a cause" - so so common in the software development world.
&
"AI defenders also like to point out historical instances in which the engineering solution differs radically from nature's version"
...
"Some philosophers of mind have taken a shine to the metaphor of the cognitive wheel, that is, an AI solution to some problem that although entirely different from how the brain does it is just as good"

Jeff himself believes we need to look deeper into the brain for a better understanding, but could it be possible to take a completely different approach to solving the "intelligence" problem?

Thursday, January 3, 2013

Weblogic JNDI & Security Contexts

Quite often when using multiple services / EJBs from different internal teams we have run into WebLogic context / security errors. We always deduced the issue was how WebLogic handles its contexts, and I finally found WebLogic's explanation in their documentation:

JNDI Contexts and Threads

When you create a JNDI Context with a username and password, you associate a user with a thread. When the Context is created, the user is pushed onto the context stack associated with the thread. Before starting a new Context on the thread, you must close the first Context so that the first user is no longer associated with the thread. Otherwise, users are pushed down in the stack each time a new context is created. This is not an efficient use of resources and may result in the incorrect user being returned by ctx.lookup() calls. This scenario is illustrated by the following steps:
  1. Create a Context (with username and credential) called ctx1 for user1. In the process of creating the context, user1 is associated with the thread and pushed onto the stack associated with the thread. The current user is now user1.
  2. Create a second Context (with username and credential) called ctx2 for user2. At this point, the thread has a stack of users associated with it. User2 is at the top of the stack and user1 is below it in the stack, so user2 is used as the current user.
  3. If you do a ctx1.lookup("abc") call, user2 is used as the identity rather than user1, because user2 is at the top of the stack. To get the expected result, which is to have the ctx1.lookup("abc") call performed as user1, you need to do a ctx2.close() call. The ctx2.close() call removes user2 from the stack associated with the thread, so that a ctx1.lookup("abc") call now uses user1 as expected.
Note: When the weblogic.jndi.enableDefaultUser flag is enabled, there are two situations where a close() call does not remove the current user from the stack and this can cause JNDI context problems. For information on how to avoid JNDI context problems, see How to Avoid Potential JNDI Context Problems.
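
The steps above can be sketched with a toy model in plain Java. This is not WebLogic code and makes no claim about how WebLogic implements it internally: the class ThreadIdentityStack and its methods are hypothetical, and they only mimic the described behaviour of a per-thread stack of users, where creating a Context pushes a user and close() removes it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ThreadIdentityStack {

    // One stack of user names per thread, as in the description above.
    private static final ThreadLocal<Deque<String>> USERS =
            new ThreadLocal<Deque<String>>() {
                @Override
                protected Deque<String> initialValue() {
                    return new ArrayDeque<String>();
                }
            };

    // Stands in for creating a Context with a username:
    // the user is pushed onto the current thread's stack.
    static void openContext(String user) {
        USERS.get().push(user);
    }

    // Stands in for ctx.close(): the user is removed from the stack.
    static void closeContext(String user) {
        USERS.get().removeFirstOccurrence(user);
    }

    // The identity any lookup on this thread would run as,
    // or null when no context is open.
    static String currentUser() {
        return USERS.get().peek();
    }
}
```

In this model, opening contexts for user1 and then user2 leaves user2 as the current user, and only closing the user2 context makes user1 current again. That is the reason for closing each Context (ideally in a finally block) before creating the next one on the same thread.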

How to Avoid Potential JNDI Context Problems

Issuing a close() call usually behaves as described in JNDI Contexts and Threads. However, the following is an exception to the expected behavior that occurs when the weblogic.jndi.enableDefaultUser flag is enabled:
Last Used
When using IIOP, an exception to expected behavior arises when there is one Context on the stack and that Context is removed by a close(). The identity of the last context removed from the stack determines the current identity of the user. This scenario is described in the following steps:
  1. Create a Context (with username and credential) called ctx1 for user1. In the process of creating the context, user1 is associated with the thread and stored in the stack, that is, the current identity is set to user1.
  2. Do a ctx1.close() call.
  3. Do a ctx1.lookup() call. The current identity is user1.
  4. Create a Context (with username and credential) called ctx2 for user2. In the process of creating the context, user2 is associated with the thread and stored in the stack, that is, the current identity is set to user2.
  5. Do a ctx2.close() call.
  6. Do a ctx2.lookup() call. The current identity is user2.

Link to the source Weblogic Docs: Weblogic JNDI

Wednesday, October 17, 2012

Setting up and playing with Apache Solr on Tomcat

A while back I had a little time to play with Solr, and was instantly blown away by the performance we could achieve on some of our bigger datasets.
Here are some of my initial setup and configuration notes, to maybe help someone get it up and running a little faster.
I am starting with setting both up on Windows.

Download and extract Apache Tomcat and Solr and copy into your working folders.
Tomcat Setup
If you want Tomcat as a service, install it using the following:
bin\service.bat install
Edit the Tomcat users under conf (tomcat-users.xml):

If you are going to query Solr using international characters (>127) using HTTP-GET, you must configure Tomcat to conform to the URI standard by accepting percent-encoded UTF-8. Add URIEncoding="UTF-8" to the Connector element in conf/server.xml.
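
To see what percent-encoded UTF-8 looks like from the client side, here is a small sketch using only the JDK. The class SolrQueryUrl and the method selectUrl are hypothetical helpers, and the base URL is assumed to be the Solr webapp root served by Tomcat. Note that URLEncoder applies form encoding, so spaces become + signs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SolrQueryUrl {

    // Hypothetical helper: builds a select URL with the q parameter
    // percent-encoded as UTF-8.
    static String selectUrl(String baseUrl, String query) {
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

A query for café is sent as caf%C3%A9. Without the URIEncoding setting, Tomcat would not decode those two bytes as UTF-8 and the query would not match the intended character.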

Copy the contents of example\solr to your Solr home directory, for example D:\Java\apache-solr-3.6.0\home
Create the context fragment at $CATALINA_HOME/conf/Catalina/localhost/solr.xml, pointing to your Solr home.

Start up Tomcat, log in, and deploy the solr.war.

Solr Setup
It should be available at http://localhost:8080/solr/admin/

To create a quick test using SolrJ that creates and reads data:
Grab the following Maven libs:
JUnit test:

Adding data directly from the DB
Firstly you need to add the relevant DB libs to the classpath.
Then create data-config.xml as below. If you require custom fields, those can be specified under the fields tag in the schema.xml shown below the data-config.xml.
A custom field in the schema.xml:
In the solrconfig.xml, make sure to point to the data-config.xml; the handler has to be registered in the solrconfig.xml as follows.
Once that is all set up, a full import can be done with the following:
http://localhost:8080/solr/admin/dataimport?command=full-import
Then you should be good to go with some lightning fast data retrieval.

Sunday, September 9, 2012

Android App : iBoincStats


A while back I did an iOS app iBoincStats which has since been downloaded about 2300 times.

I have recently submitted another tiny game to Apple, and in the doldrums that is the App Store approval process I set myself a little challenge: download, learn, write and publish iBoincStats for Android before the other application gets approved.

I have to give Android full credit: if you are a Java developer, developing for Android is really simple. I tried getting it all up and running a couple of years back, but with the simulator taking 20+ minutes to start up, I deleted it very quickly. This time, with the latest SDK and IntelliJ 11, it was just a little slower than the iOS environment and much more usable.

The default "look and feel" on Android really takes a lot more work to make it look as good as an iOS app, and I didn't really spend enough time on that.
If anyone actually downloads it, I'll dedicate a little more time to it.
  

iBoincStats (For Android)

This is a simple stats client to view your BOINC project processing statistics.
Enter your cross project id and access your latest stats.
Some of the popular BOINC projects include:
Seti@home
climateprediction.net
Einstein@home
POEM@home
rosetta@home

More information regarding the BOINC project can be found at:
BOINC home
Wikipedia - Berkeley Open Infrastructure for Network Computing
Screen Shots:


