Pages

Senin, 07 Juli 2014

Distributing and Parallelizing Probabilistic Logic Networks Reasoning

During discussion about making Probabilistic Logic Networks (PLN) allow parallel reasoning using distributed AtomSpace, Dr. Ben Goertzel noted:
As for parallelism, I believe that logic chaining can straightforwardly be made parallel, though this involves changes to the algorithms and their behavior as heuristics.   For example, suppose one is backward chaining and wishes to apply deduction to obtain A --> C.   One can then evaluate multiple B potentially serving the role here, e.g.

A --> B1, B1 --> C  |-  A -->C
A --> B2, B2 --> C  |-  A -->C
...

Potentially, each Bi could be explored in parallel, right?    Also, in exploring each of these, the two terms could be backward chained on in paralle, so that e.g.

A --> B1

and

B1 --> C

could be explored in parallel...

In this way the degree of parallelism exploited by the backward chainer would expand exponentially during the course of a single chaining exploration, until reaching the natural limit imposed by the infrastructure.

This will yield behavior that is conceptually similar, though not identical, to serial backward chaining.
I haven't learned much about PLN yet, but I hope it can be made to work by distributing computation across AtomSpace nodes.

Other than strictly parallel, each path can be assigned a priority or heuristic, the tasks become a distributed priority queue. So a task finishing earlier can insert more tasks into the prority queue, and these new tasks don't have to be at the very end, but can be in the middle etc.


If we'd like to explore 1000 paths, and each of those generates another 1000 paths, ideally we'd have 1 million nodes. The first phase will be executed by 1000 nodes in parallel, the next 1 million paths will be executed by 1 million nodes in parallel, and we have a result in 2 ms. :) In practice we probably only can afford a few nodes, and limited time, can PLN use heuristic discovery?

For example, if the AI participates in "Are you smarter than a 5th grader", the discovery paths would be different than "calculate the best company strategy, I'll give you two weeks and detailed report". In a quiz, the AI would need to come up with a vague answer quickly, then refine the answer progressively until time runs out. i.e. when requested 10 outputs, the quiz one will try to get 10 answers as soon as possible even if many of them are incorrect; and the business one will try to get 1 answer correct, even if it means the other 9 is left unanswered.

Does PLN do this? If so, the distributed AtomSpace architecture would evolve hand-in-hand with (distributed) PLN. An app or modules shouldn't be required to be distributed to use AtomSpace, however a module (like PLN) that's aware that AtomSpace is both a distributed data grid and a distributed compute grid, can take advantage of this architecture and make its operations much faster/scalable. It's akin to difference between rendering 3D scenes by CPU vs. using OpenGL-accelerated graphics. However, a computer usually have only 1 or 2 graphics card and fixed, where an AtomSpace cluster can have dynamic number of nodes and you can throw more at it at any time. i.e. for expensive computation you can launch 100 EC2 instances for several hours then turn it off when done.