Knowledge Base Lumen Robot Friend


Friday, 11 July 2014

My First Merged Pull Request to OpenCog RelEx

https://github.com/opencog/relex/pull/109

Feels good. :-)

Tuesday, 08 July 2014

Natural Language Processing - Note from Dr. Linas Vepstas

The process should be computationally feasible. We are writing code for it. However, there's an endless list of practicalities, everything from a labor shortage to buggy code, missing infrastructure, poorly expressed and misunderstood ideas, as well as plenty of open questions and research to be done.

Practically, at this time, the big road-blocks are:

1. not having a fully-functional PLN, and
2. not having a large database of common-sense experience/knowledge.

Monday, 07 July 2014

Adding Distributed Indexes to Hypergraph Database for Horizontal Scaling of Semantic Reasoning

While discussing the distributed AtomSpace architecture in the OpenCog group, Dr. Linas Vepstas noted:

Reference resolution, reasoning and induction might be fairly local as well: when reading and trying to understand a Wikipedia article, it seems as if it's related to a million different things. A single CPU with 16GB RAM can hold 100 million atoms in RAM, requiring no disk or network access.

The only reason for a database then becomes as a place to store, over long periods of time, the results of the computation. It's quite possible that fast performance of the database won't actually be important. Which would mean that the actual database architecture might not be very important. Maybe.

Based on the experiments, while processing (i.e. reasoning over) 200,000 "atoms" in 3 seconds on a single host isn't too bad, searching for a few atoms out of 200,000 (or even 1 billion) on a single host should be very fast (i.e. ~1 ms or less).

So I guess these are two distinct tasks. Searching would use (distributed) indexing, while processing/reasoning can be done by MindAgents combining data-to-compute and compute-to-data, with consideration to data affinity.

For processing that requires non-local data, which is Dr. Vepstas's concern: when using a compute+data grid such as GridGain, the compute grid automatically acts as a cache, so all required non-local data is cached automatically. This may or may not be sufficient, depending on the algorithm.

For searches, it seems we need to create a separate index for each purpose, with each index sharded/partitioned appropriately to distribute the compute load. This means the AtomSpace data grid will have redundancy in many ways. The AtomSpace can probably be "split" into 3 parts:

the hypergraph part (can be stored in HyperGraphDB or Neo4j)
the eager index parts, always generated for the entire hypergraph, required for searches (can be stored in Cassandra or Solr or ElasticSearch)
the lazy index parts, the entries are calculated on demand then stored for later usage (can be stored in Cassandra or Solr or ElasticSearch)

The hypergraph would be good when you already know the handles, and for traversing. But when the task is "which handles A are B of the handles C assuming D is E?" an index is needed to answer this (particular task) quickly. Hopefully this takes ~1 ms on each grid node, so 100 nodes working in parallel will generate 100 sets of answers in that same ~1 ms.

Today, a node with 16 GB RAM and 2 TB of SATA storage is probably a typical configuration (an SSD would also work, but for the sake of the thought experiment a spinning disk raises more performance concerns). The node holds a partition of the distributed AtomSpace, and is expected to answer any search (i.e. give me the handles of atoms in your node that match criteria X, Y, Z) within 1 ms, and to do processing over a selected set of atoms (i.e. for handles [A, B, C, ... N] perform this closure) within 1 second.

To achieve these goals:

For quick searches within that partition, all atom data needs to be indexed in multiple ways, with an index for each purpose.
For quick updates to the index (triggered by updates to data), the index and data are colocated on the same host to avoid network I/O, although they can be in different stores (e.g. data in HyperGraphDB and index in Cassandra). The partitioning/sharding needs to accommodate this. So for 2 TB of storage, we can put perhaps 100 GB of data and 1 TB of indexes.
For quick lookups and updates of a subset of data, the RAM is used as a read-through & write-through cache by the data grid.
For non-local search/update/lookup/processing, the node uses the data grid and caches results locally in RAM, which can overflow to disk. We still have 900 GB of space left, so we can use it for this purpose.
For quick processing of a subset of data, local lookups are performed (which should take near-constant time, even with disk drives) and are much faster if the requested data is already in cache. Processing is then done using the CPU or GPGPU (via OpenCL, e.g. the Encog neural network library uses OpenCL to accelerate calculations). Results are then sent back over the network.
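The read-through & write-through behaviour mentioned in the list above can be illustrated without any grid at all. The following is only a minimal single-JVM sketch: the class name, the LRU eviction policy, and the use of a plain Map as the "persistent store" are all assumptions made for illustration, not how GridGain implements its cache.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: RAM as a bounded read-through / write-through cache over a slower store. */
final class ReadWriteThroughCacheSketch {
    private final Map<String, String> backingStore;   // stands in for the on-disk store
    private final LinkedHashMap<String, String> ram;  // bounded, least-recently-used eviction
    int storeReads = 0;                               // counts how often we had to go to the store

    ReadWriteThroughCacheSketch(Map<String, String> backingStore, final int capacity) {
        this.backingStore = backingStore;
        // accessOrder=true makes the LinkedHashMap track recency of access
        this.ram = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Read-through: on a miss, load from the store and keep the value in RAM. */
    String get(String key) {
        String value = ram.get(key);
        if (value == null) {
            value = backingStore.get(key);
            storeReads++;
            if (value != null) {
                ram.put(key, value);
            }
        }
        return value;
    }

    /** Write-through: update the store and the RAM copy together. */
    void put(String key, String value) {
        backingStore.put(key, value);
        ram.put(key, value);
    }

    boolean isCached(String key) {
        return ram.containsKey(key);
    }
}
```

With a capacity of 2, reading keys a, a, b, c hits the store three times and leaves only b and c in RAM; the caller's code looks the same whether the value came from RAM or from the store.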

For question-answering, given the label (e.g. Ibnu Sina), the possible concept types (Person), and optionally the discussion contexts (Islam, religion, social, medicine), find the ConceptNodes which have that label and that type, along with the confidence value for each context. And I want it done in 1 ms. :D

YAGO has 15,372,313 labels (1.1 GB dataset) for 10+ million entities. The entire YAGO is 22 GB. Assuming the entities with labels are stored in AtomSpace, selecting the matching labels without index would take ~150 seconds on a single host and ~50 seconds on 3 nodes (extrapolating my previous results). With indexes this should be 1ms.

The first index would give the concepts for a given label and types, with a structure like:

label -> type -> [concept, concept, concept, ...]

type -> [concept, concept, concept, ...]

The second index would give the confidence for a given concept and contexts, with sample data like:

Ibnu_Sina1 -> { Islam: 0.7, medicine: 0.9, social: 0.3, ... }

Ibnu_Sina2 -> { Islam: 0.1, medicine: 0.3, social: 0.9, ... }
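To make the two index shapes above concrete, here is a minimal in-memory sketch using plain Java maps. Everything in it is an assumption for illustration: the class and method names are hypothetical, labels are lower-cased for matching, and candidates are ranked by the average confidence over the requested contexts, which is just one possible scoring rule. A real deployment would keep these structures in a sharded store rather than in a single HashMap.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the two indexes: label -> type -> concepts, and concept -> context confidences. */
final class ConceptIndexSketch {
    // First index: label -> type -> [concept, concept, ...]
    private final Map<String, Map<String, Set<String>>> byLabelAndType = new HashMap<>();
    // Second index: concept -> { context: confidence, ... }
    private final Map<String, Map<String, Double>> confidenceByConcept = new HashMap<>();

    void addConcept(String label, String type, String conceptId, Map<String, Double> contextConfidence) {
        byLabelAndType
            .computeIfAbsent(label.toLowerCase(Locale.ROOT), k -> new HashMap<>())
            .computeIfAbsent(type, k -> new LinkedHashSet<>())
            .add(conceptId);
        confidenceByConcept.put(conceptId, new HashMap<>(contextConfidence));
    }

    /** Picks the candidate with the highest average confidence over the given contexts (assumed scoring rule). */
    Optional<String> resolve(String label, String type, Collection<String> contexts) {
        Set<String> candidates = byLabelAndType
            .getOrDefault(label.toLowerCase(Locale.ROOT), Collections.emptyMap())
            .getOrDefault(type, Collections.emptySet());
        String best = null;
        double bestScore = -1.0;
        for (String conceptId : candidates) {
            Map<String, Double> conf = confidenceByConcept.getOrDefault(conceptId, Collections.emptyMap());
            double sum = 0.0;
            for (String ctx : contexts) {
                sum += conf.getOrDefault(ctx, 0.0);
            }
            double score = contexts.isEmpty() ? 0.0 : sum / contexts.size();
            if (score > bestScore) {
                bestScore = score;
                best = conceptId;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Loaded with the sample data above, a query for label "Ibnu Sina", type Person and contexts medicine and Islam resolves to Ibnu_Sina1, while the context social resolves to Ibnu_Sina2. Both lookups are hash lookups followed by a scan over a handful of candidates, which is what makes the ~1 ms target plausible.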

Indexes change constantly: for each atom change, multiple indexes must be updated, and index updates would take more resources than updating the atoms themselves. So index updates are asynchronous and eventually consistent. (I guess this also happens with humans: when humans learn new information, they don't immediately "understand" it. I mean, we now know a new fact, but it takes time [or even sleep] to make sense of that new fact and its implications/correlations.)
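A rough way to picture "asynchronous and eventually consistent" index updates is a write path that updates the atom immediately and hands the index maintenance to a background thread. The sketch below is a single-JVM illustration under that assumption; the class name, the single indexer thread, and the one label index are all hypothetical simplifications.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: atoms are written synchronously, the label index catches up later. */
final class AsyncIndexSketch {
    private final Map<String, String> atoms = new ConcurrentHashMap<>();               // handle -> label
    private final Map<String, Set<String>> handlesByLabel = new ConcurrentHashMap<>();  // label -> handles
    // A single thread applies index updates in submission order
    private final ExecutorService indexer = Executors.newSingleThreadExecutor();

    /** The atom write returns immediately; the index update is only queued. */
    void putAtom(String handle, String label) {
        String previous = atoms.put(handle, label);
        indexer.submit(() -> {
            if (previous != null) {
                Set<String> old = handlesByLabel.get(previous);
                if (old != null) {
                    old.remove(handle);
                }
            }
            handlesByLabel.computeIfAbsent(label, k -> ConcurrentHashMap.newKeySet()).add(handle);
        });
    }

    /** May return stale results while index updates are still queued. */
    Set<String> findByLabel(String label) {
        return new TreeSet<>(handlesByLabel.getOrDefault(label, Collections.emptySet()));
    }

    /** Waits until all queued index updates have been applied, then stops the indexer. */
    void drainAndStop() {
        indexer.shutdown();
        try {
            indexer.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Between putAtom returning and the indexer thread running, a search can miss the new atom or still see the old label. That window is the price paid for keeping atom writes cheap.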

We should agree on a set of a priori indexes. (As new concepts are learned and OpenCog gets queries that take a long time processing too many atoms, the AI may learn to make new indexes or tune existing ones... although this is probably too meta and distant future. :D )

Experimental Performance Test using GridGain for Distributed Natural Language Processing

I did an experimental performance test using GridGain to simulate AtomSpace processing. This is related to the discussion in the OpenCog group about the AtomSpace architecture.

Disclaimer: This is not a benchmark, please don't treat it as such!

First I loaded up 212,351 YAGO labels (from MongoDB, but the actual backend doesn't matter here) for resources starting with letter M :

13:13:43.178 [main] INFO i.a.i.e.l.l.yago.YagoLabelCacheStore - Loading 212351 labels...

13:13:45.595 [main] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - [23%] 50000 labels loaded, 162351 more to go...

13:13:47.571 [main] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - [47%] 100000 labels loaded, 112351 more to go...

13:13:49.139 [main] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - [70%] 150000 labels loaded, 62351 more to go...

13:13:50.608 [main] DEBUG i.a.i.e.l.l.yago.YagoLabelCacheStore - [94%] 200000 labels loaded, 12351 more to go...

13:13:50.914 [main] INFO i.a.i.e.l.l.yago.YagoLabelCacheStore - Loaded 212351 labels...

13:13:50.917 [main] INFO id.ac.itb.ee.lskk.lumen.yago.Worker - For yagoLabel, I have 100000 primary out of 100000 entries + 112351 swap

To make it somewhat more realistic, the grid data for a node is capped at 100,000 entries. The cache is configured as partitioned, so with all 3 nodes running the entire dataset should be held in memory. Then I started two more nodes, and the last node to start searches for the resource IDs that have the label "Muhammad". So it's basically a reverse hashmap lookup, which could perfectly well be done using an index. But I'm treating the entries as atoms, just for the sake of doing distributed-parallel computation on them.

    Collection<Set<String>> founds = labelCache.queries().createScanQuery(null).execute(
        new GridReducer<Entry<String, String>, Set<String>>() {
            // Resource IDs with a matching label, collected where this reducer runs
            Set<String> ids = new HashSet<>();

            @Override
            public boolean collect(Entry<String, String> e) {
                if (e.getValue().equalsIgnoreCase(upLabel)) {
                    ids.add(e.getKey());
                }
                return true; // keep scanning the remaining entries
            }

            @Override
            public Set<String> reduce() {
                return ids;
            }
        }).get();

The results, using my workstation i5-3570K @ 4x 3.40GHz, 3 nodes at 1 GB heap each:

[13:29:40] GridGain node started OK (id=03a07172)

[13:29:40] Topology snapshot [ver=5, nodes=3, CPUs=4, heap=3.0GB]

13:29:40.043 [main] INFO i.a.i.e.l.l.yago.YagoLabelLookupCli2 - Finding resource for label 'Muhammad'...

13:29:43.131 [main] INFO i.a.i.e.l.l.yago.YagoLabelLookupCli2 - Found for Muhammad: [[Muhammad_Khalil_al-Hukaymah, Muhammad_S._Eissa, Muhammad_Musa, Muhammad_Okil_Musalman, Muhammad_Loutfi_Goumah, Muhammad_Sadiq, Muhammad_Salih, Muhammad_Ismail_Agha, Muhammad_Yusuf_Hashmi, Mustafah_Muhammad, Muhammad_Mahbubur_Rahman, Muhammad_Ahmad_Said_Khan_Chhatari, Muhammad_Jamiruddin_Sarkar, Muhammad_Ibrahim_Joyo, Muhammad_bin_Tughluq, Muhammad_Sohail_Anwar_Choudhry, Muhammad_Tariq_Tarar], [Muhammad_Salman, Muhammad_Jailani_Abu_Talib, Muhammad_Qutb], [Muhammad_Ibrahim_Kamel, Muhammad_Amin_Khan_Turani, Muhammad_Ali_Pate, Muhammad_Rafi_Usmani, Muhammad_Faisal, Muhammad, Muhammad_Ilham, Muhammad_Kurd_Ali, Muhammad_Umar, Muhammad_Shahidullah, Muhammad_Anwar_Khan, Muhammad_Saifullah, Muhammad_Saqlain]]

[13:29:43] GridGain node stopped OK [uptime=00:00:03:865]

Searched 212,351 entries in 3088 ms, using 3 nodes × 4 threads = 12 total threads on a single host. So the rate is ~68,766 entries/second.

To be fair, GridGain is giving performance hints, so for a serious benchmark these should be tuned:

[13:29:40] ^-- Decrease number of backups (set 'keyBackups' to 0)

[13:29:40] ^-- Disable fully synchronous writes (set 'writeSynchronizationMode' to PRIMARY_SYNC or FULL_ASYNC)

[13:29:40] ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)

[13:29:40] ^-- Disable query index (set 'queryIndexEnabled' to false)

[13:29:40] ^-- Disable peer class loading (set 'peerClassLoadingEnabled' to false)

[13:29:40] ^-- Disable grid events (remove 'includeEventTypes' from configuration)

Of course, 12 threads running on a single host isn't optimal, and there are no network saturation effects since all nodes are on the same host.

From how GridGain works, the performance should be (much?) better when there are 3 actual nodes/processors to work on. The key thing is that the calculation (map/reduce) is done on each node, so the "Mind Agent" (node 3) here only does roughly 33% of the job. The other 2 "AtomSpace" nodes aren't just serving data; they're also processing the data they already have, so there's no need to move these bits around the network.
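The "each node scans only what it holds" idea can be imitated in plain Java, with threads standing in for grid nodes. This is only a sketch under simplifying assumptions: the class name is hypothetical, entries are assigned to partitions by key hash modulo the node count, and nothing is sent over a network.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: a label scan where every "node" processes only its own partition. */
final class PartitionedScanSketch {
    /** Splits id -> label entries into partitions by key hash (assumes nodes >= 1). */
    static List<Map<String, String>> partition(Map<String, String> labelsById, int nodes) {
        List<Map<String, String>> parts = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            parts.add(new HashMap<>());
        }
        for (Map.Entry<String, String> e : labelsById.entrySet()) {
            parts.get(Math.floorMod(e.getKey().hashCode(), nodes)).put(e.getKey(), e.getValue());
        }
        return parts;
    }

    /** Runs one scan per partition in parallel and returns one set of matching IDs per partition. */
    static List<Set<String>> findIds(List<Map<String, String>> parts, String label) {
        ExecutorService pool = Executors.newFixedThreadPool(parts.size());
        try {
            List<Future<Set<String>>> futures = new ArrayList<>();
            for (Map<String, String> part : parts) {
                futures.add(pool.submit(() -> {
                    Set<String> ids = new TreeSet<>();
                    for (Map.Entry<String, String> e : part.entrySet()) {
                        if (e.getValue().equalsIgnoreCase(label)) {
                            ids.add(e.getKey());
                        }
                    }
                    return ids;
                }));
            }
            List<Set<String>> results = new ArrayList<>();
            for (Future<Set<String>> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException(ex);
        } finally {
            pool.shutdown();
        }
    }
}
```

As in the GridGain run, the caller gets back one result set per partition rather than a single merged set, and only the matching IDs travel back to it, not the scanned entries.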

Since the closure is code (Java code), it's possible to use OpenCL/GPU for certain tasks, which should increase performance for math-intensive processing.

Fault tolerance also works very well: you can kill and rearrange nodes at will, and the grid will stay up as long as at least 1 node is running.

Distributed Natural Language Parsing using GridGain as Compute and Data Grid

From the discussion in the OpenCog group about the AtomSpace architecture, Dr. Ben Goertzel notes:

Section 5.3 of my distributed AtomSpace design from June 2012
http://wiki.opencog.org/wikihome/images/e/ea/Distributed_AtomSpace_Design_Sketch_v6.pdf

is titled "Importance Dynamics" and deals with problem of handling STI and LTI (attention) values in a distributed OpenCog system.... It is brief and only gives a general approach, as I figured it would be best to work out the details after the distributed Atomspace was in the detailed design phase. Recall that document was written after long discussions with you and others.

I've been experimenting with GridGain and it seems to be ticking most if not all of the performance requirements you need, plus with Neo4j as the persistent graph store which would allow intuitive querying and visual exploring of the AtomSpace.

My (very) simple use case is NLP parsing of a sentence to match a question-answer pattern.

Currently I have 34 rules (imagine that this is the number of Atoms). The core code to process them is (Java 8):

    Collection<GridFuture<MatchedYagoRule>> matchers = Collections2.transform(ruleIds, (ruleId) ->
        // Run the job on whichever node holds the rule with this ID
        grid.compute().affinityCall(cache.name(), ruleId,
            new GridCallable<MatchedYagoRule>() {
                @Override
                public MatchedYagoRule call() throws Exception {
                    final YagoRule rule = cache.get(ruleId);
                    Pattern pattern = Pattern.compile(rule.questionPattern_en, Pattern.CASE_INSENSITIVE);
                    Matcher matcher = pattern.matcher(msg);
                    if (matcher.matches()) {
                        log.info("MATCH {} Processing rule #{} {}", matcher, ruleId, rule.property);
                        return new MatchedYagoRule(rule, matcher.group("subject"));
                    } else {
                        log.info("not match Processing rule #{} {}", ruleId, rule.property);
                        return null;
                    }
                }
            }));

This probably needs explanation for someone unfamiliar with in-memory data grids, but the whole experiment does very sophisticated things with very little code and setup, and it should be scalable (I can only find this 2010 article for a comparison benchmark, but I'm sure GridGain is much improved today).
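Leaving the grid aside, the work done per rule is just a regular expression match with a named capturing group, followed by filling in an answer template. Here is a minimal standalone sketch of that part. The class and method names are hypothetical; the pattern and template strings are the ones that appear in the wasBornIn rule in the log output, and the `{{...}}` placeholders are filled with a plain string replace, which is an assumption about how the templates are rendered.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a single question rule: regex in, subject out, template-based answer. */
final class QuestionRuleSketch {
    final String property;
    final Pattern questionPattern;
    final String answerTemplate;

    QuestionRuleSketch(String property, String questionRegex, String answerTemplate) {
        this.property = property;
        this.questionPattern = Pattern.compile(questionRegex, Pattern.CASE_INSENSITIVE);
        this.answerTemplate = answerTemplate;
    }

    /** Returns the text captured by the "subject" group, or empty if the whole message does not match. */
    Optional<String> matchSubject(String msg) {
        Matcher m = questionPattern.matcher(msg);
        return m.matches() ? Optional.of(m.group("subject")) : Optional.empty();
    }

    /** Fills the {{subject}} and {{object}} placeholders with literal text. */
    String renderAnswer(String subject, String object) {
        return answerTemplate.replace("{{subject}}", subject).replace("{{object}}", object);
    }

    public static void main(String[] args) {
        QuestionRuleSketch rule = new QuestionRuleSketch(
            "wasBornIn",
            "Where was (?<subject>.+) born\\?",
            "{{subject}} was born in {{object}}.");
        Optional<String> subject = rule.matchSubject("Where was Michael Jackson born?");
        // The object would come from a knowledge base lookup; here it is hard-coded for illustration
        subject.ifPresent(s -> System.out.println(rule.renderAnswer(s, "Gary, Indiana")));
    }
}
```

Each of the 34 rules is an independent job of this kind, which is why the rules can be spread over nodes and threads without any coordination between them.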

How it works: the compute task (triggered by node2) for the 34 rules is distributed across nodes and across threads (cores) inside each node. For this example I use 2 nodes on the same machine; the output for node2 is:

...

06:18:46.470 [gridgain-#5%pub-null%] INFO i.a.i.e.l.l.yago.AnswerYagoFactTests - not match Processing rule hasHeight How tall is (?<subject>.+)\?

06:18:46.473 [gridgain-#7%pub-null%] INFO i.a.i.e.l.l.yago.AnswerYagoFactTests - not match Processing rule hasEconomicGrowth How much is the economic growth of (?<subject>.+)\?

06:18:46.477 [gridgain-#6%pub-null%] INFO i.a.i.e.l.l.yago.AnswerYagoFactTests - not match Processing rule isMarriedTo Who did (?<subject>.+) marry\?

06:18:46.485 [gridgain-#10%pub-null%] INFO i.a.i.e.l.l.yago.AnswerYagoFactTests - Found matcher: MatchedYagoRule [rule=YagoRule [property=wasBornIn, questionPattern_en=Where was (?<subject>.+) born\?, questionPattern_id=Di mana (?<subject>.+) dilahirkan\?, answerTemplateHtml_en={{subject}} was born in {{object}}., answerTemplateHtml_id={{subject}} lahir di {{object}}.], subject=Michael Jackson]

06:18:46.485 [gridgain-#10%pub-null%] INFO i.a.i.e.l.l.yago.AnswerYagoFactTests - Subject: MatchedYagoRule [rule=YagoRule [property=wasBornIn, questionPattern_en=Where was (?<subject>.+) born\?, questionPattern_id=Di mana (?<subject>.+) dilahirkan\?, answerTemplateHtml_en={{subject}} was born in {{object}}., answerTemplateHtml_id={{subject}} lahir di {{object}}.], subject=Michael Jackson]

[06:18:46] GridGain node stopped OK [uptime=00:00:00:989]

However, the match didn't actually happen on node2; it happened on node1:

06:18:46.436 [gridgain-#7%pub-null%] INFO i.a.i.e.l.l.yago.AnswerYagoFactTests - MATCH java.util.regex.Matcher[pattern=Where was (?<subject>.+) born\? region=0,31 lastmatch=Where was Michael Jackson born?] Processing rule wasBornIn Where was (?<subject>.+) born\?

node1 and node2 hold different partitions of the 34 rules. So what happens is that node2, the node that triggers the job, distributes the job (literally sending the Java closure bytecode over the network) to the other nodes, based on affinity to the requested rule. node1 processes that closure/job over the entries/rules that it holds. In my example node2 also does the same, since it also holds a partition of the rules, but it doesn't hold the matching rule. All jobs send back their result (map), which is then reduced, and we get the output.
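The routing decision described above, "send the job to whichever node owns this key", can be sketched in a few lines. Note the assumptions: the class is hypothetical, and ownership is decided by key hash modulo the number of nodes, which is a deliberate simplification. A real data grid uses a more elaborate affinity function so that adding or removing a node moves as few keys as possible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: deciding which node should run the job for each rule ID. */
final class AffinityRoutingSketch {
    private final List<String> nodes;

    AffinityRoutingSketch(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    /** Maps a key to the node that owns it (simplified: hash modulo node count). */
    String ownerOf(Object key) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    /** Groups rule IDs by owning node, i.e. which jobs would be shipped where. */
    Map<String, List<Integer>> plan(Collection<Integer> ruleIds) {
        Map<String, List<Integer>> jobsByNode = new TreeMap<>();
        for (Integer id : ruleIds) {
            jobsByNode.computeIfAbsent(ownerOf(id), k -> new ArrayList<>()).add(id);
        }
        return jobsByNode;
    }

    public static void main(String[] args) {
        AffinityRoutingSketch routing = new AffinityRoutingSketch(Arrays.asList("node1", "node2"));
        List<Integer> ruleIds = new ArrayList<>();
        for (int i = 0; i < 34; i++) {
            ruleIds.add(i);
        }
        // Prints which rule IDs each node would process locally
        routing.plan(ruleIds).forEach((node, ids) -> System.out.println(node + " -> " + ids));
    }
}
```

The point of routing by key is that the job travels to the data instead of the data travelling to the job; only the small result goes back to the node that asked.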

Also, the rules are held in persistent storage, which in my simple case is actually a CSV file. In reality this would be a data store such as Neo4j, PostgreSQL, or Cassandra. This means the maximum AtomSpace capacity equals the sum of the hard drives (depending on the replication factor). During processing, each node's RAM is used to process the data closest to it, based on affinity.

We get several nice properties:

distribution/partitioning of data, which means:
increased storage space, and
distribution of compute, which is enabled by
affinity-based computation, i.e. a node processes requests based on the atoms it already has
with pluggable persistent storage, the "API" (so to speak) to process atoms remains the same even if we process 100 GB of (total) atoms with only 4 GB of (total) RAM, since GridGain manages the read-through & write-through based on the GridCacheStore implementation
GridGain allows indexes on data (which work in-memory); if used, they can complement the datastore indexes and provide flexible querying (i.e. other than fetching by key) while retaining performance
GridGain is Apache licensed, with commercial support :) so companies using OpenCog have the option of GridGain's consulting & commercial support to tune their systems

Thursday, 03 July 2014

YAGO2s Knowledge Base for Testing Robot Knowledge

The YAGO2s semantic knowledge base will be used as the fact data for the Lumen Knowledge Base.

To keep application development focused and its evaluation measurable, test data needs to be created.

For FitNesse acceptance testing later on, some examples of the generated test data are shown below, in the form of question-answer pairs in two languages, English and Indonesian. These will test capabilities in language detection, natural language parsing, natural language generation, localization, and semantic queries for direct facts (not inference, without reasoning).

English	Bahasa Indonesia
What is the airport code of Freeman Municipal Airport? Airport code of Freeman Municipal Airport is 'SER'.	Apa kode bandara Freeman Municipal Airport? Kode bandara Freeman Municipal Airport adalah 'SER'.
What is Huayna Picchu's latitude? Huayna Picchu's latitude is -13.158°.	Berapa lintang Huayna Picchu? Lintang Huayna Picchu adalah -13,158°.
When was Vampire Lovers destroyed? Vampire Lovers was destroyed on year 1990.	Kapan Vampire Lovers dihancurkan? Vampire Lovers dihancurkan pada tahun 1990.
What is the gini index of Republica De Nicaragua? Gini index of Republica De Nicaragua is 52.3%.	Berapa indeks gini Republica De Nicaragua? Indeks gini Republica De Nicaragua adalah 52,3%.
What movies did Anand Milind write the music for? Anand Milind wrote music for Jeevan Ki Shatranj.	Anand Milind menciptakan lagu untuk film apa? Anand Milind menciptakan lagu untuk Jeevan Ki Shatranj.
How many people live in Denton, Montana? Denton, Montana's population is 301 people.	Berapa populasi Denton, Montana? Populasi Denton, Montana adalah 301 orang.
How much is the GDP of Беларусь? GDP of Беларусь is $55,483,000,000.00.	Berapa PDB Беларусь? PDB Беларусь adalah USD55.483.000.000,00.
Where is House of Flora's website? House of Flora's website is at http://houseofflora.bigcartel.com/products.	Di mana alamat website House of Flora? Alamat website House of Flora ada di http://houseofflora.bigcartel.com/products.
How much is the revenue of Scientific-Atlanta? Revenue of Scientific-Atlanta is $1,900,000,000.00.	Berapa pendapatan Scientific-Atlanta? Pendapatan Scientific-Atlanta adalah USD1.900.000.000,00.
Who are the children of Rodney S. Webb? Children of Rodney S. Webb are Todd Webb.	Siapa saja anak Rodney S. Webb? Anak Rodney S. Webb adalah Todd Webb.
What is the currency of Kyrgzstan? Currency of Kyrgzstan is Kyrgyzstani som.	Apa mata uang Kyrgzstan? Mata uang Kyrgzstan adalah Kyrgyzstani som.
Where did Siege of Candia happen? Siege of Candia happened in Ηράκλειο.	Di mana Siege of Candia terjadi? Siege of Candia terjadi di Ηράκλειο.
What is the citizenship of Amanda Mynhardt? Amanda Mynhardt is a citizen of Republic of South Africa.	Amanda Mynhardt warganegara mana? Amanda Mynhardt adalah warganegara Republic of South Africa.
When was Stelios born? Stelios was born on Tuesday, November 15, 1977.	Kapan Stelios dilahirkan? Stelios lahir pada Selasa 15 November 1977.
Where did Henry Hallett Dale die? Henry Hallett Dale died in Grantabridge.	Di mana Henry Hallett Dale meninggal dunia? Henry Hallett Dale meninggal dunia di Grantabridge.
Who did Diefenbaker marry? Diefenbaker is married to John Diefenbaker.	Siapa pasangan Diefenbaker? Diefenbaker menikahi John Diefenbaker.
How tall is Calpine Center? Calpine Center's height 138.074 m.	Berapa tinggi Calpine Center? Tinggi Calpine Center adalah 138,074 m.
What does Pearlette lead? Pearlette is a leader of St.lucia.	Pearlette memimpin apa? Pearlette adalah pemimpin St.lucia.
Where does Ty Tryon live? Ty Tryon lives in Orlando, Fla..	Ty Tryon tinggal di mana? Ty Tryon tinggal di Orlando, Fla..
What movies did Markowitz direct? Markowitz directed Murder in the Heartland.	Markowitz menyutradarai film apa? Markowitz menyutradarai Murder in the Heartland.
What did Thalía create? Thalía created I Want You/Me Pones Sexy.	Apa yang dibuat Thalía? Thalía membuat I Want You/Me Pones Sexy.
How much is Pōtītī's inflation? Pōtītī's inflation is 1.1 %.	Berapa inflasi Pōtītī? Inflasi Pōtītī adalah 1,1 %.
What is the capital city of Kingdom of Bavaria? Capital city of Kingdom of Bavaria is Minga.	Apa ibu kota Kingdom of Bavaria? Ibu kota Kingdom of Bavaria adalah Minga.
How much does Mária Mohácsik weight? Mária Mohácsik weights 70,000 g.	Berapa berat Mária Mohácsik? Berat Mária Mohácsik adalah 70.000 g.
What is the language code of Gujarati (India)? The language code of Gujarati (India) is 'gu'.	Apa kode bahasa dari Gujarati (India)? Kode bahasa dari Gujarati (India) adalah 'gu'.
What movies star Raaj Kumar? Raaj Kumar acted in Pakeezah.	Film apa saja yang dibintangi Raaj Kumar? Raaj Kumar membintangi Pakeezah.

Once the test data is ready, the next step is of course to get the running application to pass all of the tests above. :-) Amen.

Tuesday, 01 July 2014

Using BabelNet 1.1.1 Multilingual Dictionary & Word Sense Disambiguation (Tutorial)

As of July 1st 2014, BabelNet v2.5 does not provide downloadable path indexes, so I'm using BabelNet v1.1.1 for this tutorial.

How to Install BabelNet version 1.1.1

Extract BabelNet API 1.1.1 as ~/babelnet-api-1.1.1
Extract babelnet core lucene 1.1.1 to ~/babelnet-1.1.1
Edit ~/babelnet-api-1.1.1/config/babelnet.var.properties :
babelnet.dir=/home/ceefour/babelnet-1.1.1
BabelNet demo requires WordNet 3.0 in /usr/local/share/wordnet-3.0/dict (by default).
Download WordNet-3.0.tar.bz2 (8.6 MB) from http://wordnet.princeton.edu/wordnet/download/current-version/. Extract it to your home directory so it will create ~/WordNet-3.0 directory.
Edit ~/babelnet-api-1.1.1/config/jlt.var.properties:
jlt.wordnetPrefix=/home/ceefour/WordNet
Using shell, go to ~/babelnet-api-1.1.1 and run:
./run-babelnetdemo.sh

Example: (output is very long, so this is not complete output)

ceefour@amanah:~/babelnet-api-1.1.1 > ./run-babelnetdemo.sh
[ INFO ] BabelNetConfiguration - Loading babelnet.properties FROM /home/ceefour/babelnet-api-1.1.1/config/babelnet.properties
[ INFO ] BabelNet - OPENING BABEL LEXICON FROM: /home/ceefour/babelnet-1.1.1/lexicon
[ INFO ] BabelNet - OPENING BABEL DICTIONARY FROM: /home/ceefour/babelnet-1.1.1/dict
[ INFO ] BabelNet - OPENING BABEL GLOSSES FROM: /home/ceefour/babelnet-1.1.1/gloss
[ INFO ] BabelNet - OPENING BABEL GRAPH FROM: /home/ceefour/babelnet-1.1.1/graph
SYNSETS WITH English word: "bank"
[ INFO ] Configuration - Loading jlt.properties FROM /home/ceefour/babelnet-api-1.1.1/config/jlt.properties
=>(bn:00008363n) SOURCE: WIKIWN; TYPE: Concept; WN SYNSET: [09213565n];
MAIN LEMMA: bank#n#1;
IMAGES: [<a href="http://upload.wikimedia.org/wikipedia/commons/8/8c/Kuekenhoff_Canal_002.jpg">Kuekenhoff_Canal_002.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/9/93/Namoi-River-sand-bank.jpg">Namoi-River-sand-bank.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/1/10/Skawa_River,_Poland,_flood_2001.jpg">Skawa_River,_Poland,_flood_2001.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/c/c6/RanelvaSelfors08.JPG">RanelvaSelfors08.JPG</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/d/d6/Albertville_Voie_sur_berge.JPG">Albertville_Voie_sur_berge.JPG</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/6/6b/Wheeling_Creek_Ohio.jpg">Wheeling_Creek_Ohio.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/1/12/Regge_river_P3260276.JPG">Regge_river_P3260276.JPG</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/5/55/2Kanal_bei_Tritolwerk.jpg">2Kanal_bei_Tritolwerk.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/8/89/Shirakara_Canal,_Gion,_Kyoto.jpg">Shirakara_Canal,_Gion,_Kyoto.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/2/24/Shukugawa03s3200.jpg">Shukugawa03s3200.jpg</a>, <a href="http://upload.wikimedia.org/wikipedia/commons/a/af/Damaged_Park_Road_at_Carbon.jpg">Damaged_Park_Road_at_Carbon.jpg</a>];
CATEGORIES: [BNCAT:EN:Hydrology, BNCAT:EN:Geomorphology, BNCAT:EN:Limnology, BNCAT:EN:Freshwater_ecology, BNCAT:EN:Fluvial_landforms, BNCAT:EN:Riparian, BNCAT:EN:Rivers, BNCAT:EN:Water_streams, BNCAT:EN:Water_and_the_environment, BNCAT:FR:Cours_d'eau];
SENSES (German): { WIKITR:DE:bank_0.55000_11_20 WIKITR:DE:streamside_1.00000_1_1 WIKITR:DE:flussufern_0.40000_2_5 WIKITR:DE:ufer_0.40000_2_5 WIKITR:DE:strom_bank_0.42857_3_7 WIKITR:DE:streambanks_1.00000_1_1 WIKITR:DE:ufer_0.42857_9_21 WNTR:DE:bank_0.53846_7_13 }
-----
EDGE gdis bn:00110761a { WN:EN:sloping }
EDGE gdis bn:00046303n { WN:EN:slope, WN:EN:incline, WN:EN:side }
EDGE gdis bn:00011766n { WIKIRED:EN:Water(molecule), WIKIRED:EN:Hydrogen_oxide, WIKIRED:EN:Water_(liquid), WIKIRED:EN:H₂O, WIKIRED:EN:Hydroxilic_acid, WIKIRED:EN:Oxygen_dihydride, WIKIRED:EN:Water_body, WIKIRED:EN:Chemical_water, WIKIRED:EN:Μ-oxido_dihydrogen, WIKIRED:EN:Hydric_oxide, WIKIRED:EN:Water_(molecule), WIKIRED:EN:Dihydrogenoxide, WIKIRED:EN:Diprotium_oxide, WIKIRED:EN:Hydroxylic_acid, WIKIRED:EN:OH2, WIKIRED:EN:Bodies_of_water, WIKIRED:EN:Density_of_water, WIKI:EN:Properties_of_water, WIKIRED:EN:Hydrogen_Hydroxide, WIKIRED:EN:Hydroxic_acid, WIKIRED:EN:Dihydrogen_oxide, WIKI:EN:Body_of_water, WIKIRED:EN:Waterbodies, WIKIRED:EN:Hydrogen_hydroxide, WIKIRED:EN:Water_(properties), WIKIRED:EN:Water_(Molecule), WIKIRED:EN:Unique_properties_of_water, WIKIRED:EN:Μ-oxido_hydrogen, WIKIRED:EN:H1.5O, WIKIRED:EN:Hydroxyl_monohydride, WIKIRED:EN:Hydrohydroxic_acid, WIKIRED:EN:Hydrogen_monoxide, WIKIRED:EN:Waterbody, WIKIRED:EN:Water_molecule, WIKIRED:EN:Water_bodies, WN:EN:body_of_water, WN:EN:water }
...
EDGE r bn:01001902n { WIKIRED:EN:Spa_(Belgium), WIKI:EN:Spa,_Belgium }
-----
=>(bn:03802146n) SOURCE: WIKI; TYPE: Concept; WN SYNSET: [];
MAIN LEMMA: WIKI:EN:Ocean_bank_(topography);
IMAGES: [];
CATEGORIES: [BNCAT:EN:Physical_oceanography, BNCAT:EN:Fishing_banks, BNCAT:EN:Undersea_banks];
SENSES (German): { WIKITR:DE:ozean_bank_1.00000_1_1 WIKITR:DE:bank_0.50000_1_2 WIKITR:DE:ufer_0.50000_1_2 WIKITR:DE:bank_0.94444_17_18 WIKITR:DE:fischerei_bank_0.25000_5_20 WIKITR:DE:fishing_bank_0.25000_5_20 }
-----
EDGE r bn:03180024n { WIKIRED:EN:Carbonate_mound, WIKIRED:EN:Platform_carbonate, WIKI:EN:Carbonate_platform, WIKIRED:EN:Carbonate_platforms }
EDGE r bn:00175840n { WIKIRED:EN:Coastal_upwelling, WIKI:EN:Upwelling }
EDGE r bn:00009026n { WIKI:EN:Continental_margin, WIKI:EN:Bathyal_zone, WIKIRED:EN:Continental-margin, WIKIRED:EN:Continental_slope, WIKIRED:EN:Bathypelagic, WIKIRED:EN:Midnight_Zone, WIKIRED:EN:Bathyal_Zone, WIKIRED:EN:Bathyal, WIKIRED:EN:Passive_continental_margin, WIKIRED:EN:Active_continental_margin, WN:EN:continental_slope, WN:EN:bathyal_zone, WN:EN:bathyal_district }
EDGE r bn:00047612n { WIKIRED:EN:Volcanic_isles, WIKIRED:EN:Islands, WIKIRED:EN:Volcanic_islands, WIKIRED:EN:IslandS, WIKIRED:EN:Ocean_islands, WIKIRED:EN:Former_island, WIKI:EN:Island, WIKIRED:EN:Eilean, WIKIRED:EN:Pulau, WN:EN:island }
EDGE r bn:02811607n { WIKIRED:EN:Grand_Banks, WIKI:EN:Grand_Banks_of_Newfoundland, WIKIRED:EN:Great_Banks }
EDGE r bn:00025408n { WIKIRED:EN:Seafloor, WIKIRED:EN:Seafloor_exploration, WIKIRED:EN:Sea_floor, WIKIRED:EN:Marine_floor, WIKIRED:EN:Underwater_seafloor_exploration, WIKI:EN:Seabed, WIKI:EN:Davy_Jones_(racing_driver), WIKIRED:EN:Ocean_floor, WIKIRED:EN:Davy_Jones_(driver), WIKIRED:EN:Sea_bed, WN:EN:ocean_floor, WN:EN:sea_floor, WN:EN:ocean_bottom, WN:EN:seabed, WN:EN:sea_bottom, WN:EN:Davy_Jones's_locker, WN:EN:Davy_Jones }
EDGE r bn:03225190n { WIKI:EN:Oceanic_plateau, WIKIRED:EN:Submarine_Plateau, WIKIRED:EN:Oceanic_Plateau, WIKIRED:EN:Submarine_plateau }
EDGE r bn:00383615n { WIKIRED:EN:Peñasco_Quebrado, WIKIRED:EN:Middle_Farallon_Island, WIKIRED:EN:Maintop_Island, WIKIRED:EN:Farallon_Island_Nuclear_Waste_Dump, WIKIRED:EN:Drunk_Uncle_Islets, WIKIRED:EN:Sugarloaf_Island, WIKIRED:EN:Aulone_Island, WIKIRED:EN:Farallon_Island, WIKIRED:EN:Great_Arch_Rock, WIKIRED:EN:Farallones, WIKIRED:EN:Seal_Rock,_Farallon_Islands, WIKIRED:EN:Piedra_Guadalupe, WIKIRED:EN:Farallón_Islands, WIKIRED:EN:Farallon_Islands_National_Wildlife_Refuge, WIKIRED:EN:Farallon_National_Wildlife_Refuge, WIKIRED:EN:Farallon_Wilderness, WIKI:EN:Farallon_Islands, WIKIRED:EN:North_Farallon_Island, WIKIRED:EN:Seal_Rock_(Farallon_Islands), WIKIRED:EN:Island_of_St._James, WIKIRED:EN:Farallone_Islands, WIKIRED:EN:Farallón_Viscaíno, WIKIRED:EN:Southeast_Farallon_Island }
EDGE r bn:00069946n { WIKI:EN:Sea, WIKIRED:EN:Worlds_seas, WIKIRED:EN:ทะเล, WN:EN:sea }
EDGE r bn:00049842n { WIKI:EN:Landmass, WIKI:EN:Land_mass, WN:EN:landmass, WN:EN:land_mass }
EDGE r bn:00070032n { WIKI:EN:Seamount, WIKIRED:EN:Sea_mount, WIKIRED:EN:Seamounts, WN:EN:seamount }
EDGE r bn:03433800n { WIKIRED:EN:Sedimented, WIKIRED:EN:Sedimentary_soil, WIKIRED:EN:Sedements, WIKIRED:EN:Detrital_sediment, WIKIRED:EN:Sea_Sediment, WIKI:EN:Sediment, WIKIRED:EN:Bomb_sag, WIKIRED:EN:Sediments, WIKIRED:EN:Sedimentary_layer }
EDGE r bn:01303244n { WIKI:EN:Wachusett_Reef, WIKIRED:EN:Wachusett_Bank }
EDGE r bn:00077192n { WIKIRED:EN:Tidal_flow, WIKIRED:EN:Tidal_current, WN:EN:tidal_flow, WN:EN:tidal_current }
EDGE r bn:00071161n { WIKIRED:EN:Sandbank, WIKIRED:EN:Sand_bank, WIKIRED:EN:Shoals, WIKIRED:EN:Longshore_bar, WIKIRED:EN:Offshore_bar, WIKIRED:EN:Barrier_beach, WIKIRED:EN:Barrier_bar, WIKIRED:EN:Sandbars, WIKI:EN:Shoal, WIKIRED:EN:Bar_(landform), WIKIRED:EN:Sand_banks, WN:EN:shoal }
EDGE r bn:00735063n { WIKIRED:EN:Dogger_bank, WIKIRED:EN:Dogger_Hills, WIKI:EN:Dogger_Bank, WIKIRED:EN:Doggerbank, WIKIRED:EN:Doggersbank }
EDGE r bn:00080211n { WIKIRED:EN:Dormant_volcanoes, WIKIRED:EN:Volcanos, WIKIRED:EN:Extinct_volcanoes, WIKIRED:EN:How_volcanoes_are_formed, WIKIRED:EN:Volcano_eruption, WIKIRED:EN:Valcano, WIKIRED:EN:Volcanoe_facts, WIKIRED:EN:Volcanicity, WIKIRED:EN:Active_Volcano, WIKIRED:EN:Volcanic_vent, WIKIRED:EN:Volcanic_activity, WIKIRED:EN:Volcano_(geological_landform), WIKIRED:EN:Erupt, WIKIRED:EN:🌋, WIKIRED:EN:Extinct_Volcano, WIKIRED:EN:Volcanic_mountains, WIKIRED:EN:Volcanoes, WIKIRED:EN:Volcanoe, WIKIRED:EN:Volcanic_mountain, WIKIRED:EN:All_about_Volcanos, WIKI:EN:Volcano, WIKIRED:EN:Volcanic_aerosols, WIKIRED:EN:Volcanic, WIKIRED:EN:Last_eruption, WIKIRED:EN:Crater_Row, WIKIRED:EN:Valcanos, WN:EN:volcano }
EDGE r bn:03087586n { WIKIRED:EN:Deep-sea, WIKIRED:EN:Ocean_depths, WIKIRED:EN:Deep_ocean, WIKIRED:EN:Deep_layer, WIKI:EN:Deep_sea }
EDGE r bn:00006813n { WIKIRED:EN:Faru, WIKIRED:EN:Darwin_point, WIKI:EN:Atoll, WIKIRED:EN:Coral_atoll, WIKIRED:EN:Atolls, WIKIRED:EN:Atoll_reef, WN:EN:atoll }
-----
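
Each line of the dump above has a regular shape: the literal `EDGE`, what appears to be a relation code, a BabelNet synset ID, and a brace-delimited list of `SOURCE:LANG:lemma` senses. Below is a minimal Python sketch for parsing one such line, assuming that shape holds for every line. The function name `parse_edge` is hypothetical and is not part of the BabelNet API.

```python
import re

# Matches: EDGE <relation> <synset id> { sense, sense, ... }
EDGE_RE = re.compile(r"^EDGE\s+(\S+)\s+(bn:\d+[a-z])\s+\{\s*(.*?)\s*\}$")


def parse_edge(line):
    """Split one EDGE line into relation, synset ID and sense tuples."""
    match = EDGE_RE.match(line.strip())
    if match is None:
        raise ValueError("not an EDGE line: %r" % line)
    relation, synset_id, body = match.groups()
    senses = []
    for item in body.split(", "):
        if not item:
            continue  # empty sense list
        # Each sense looks like SOURCE:LANG:lemma; the lemma may contain ':'
        source, lang, lemma = item.split(":", 2)
        senses.append((source, lang, lemma))
    return relation, synset_id, senses


relation, synset_id, senses = parse_edge(
    "EDGE r bn:01303244n { WIKI:EN:Wachusett_Reef, WIKIRED:EN:Wachusett_Bank }"
)
print(relation, synset_id, senses)
```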

About WordNet 3.0 Ubuntu package

The WordNet 3.0 Ubuntu package cannot be used with BabelNet, but it is useful for learning and reference.
To install the WordNet 3.0 Ubuntu package (6.5 MB):

sudo aptitude install wordnet

WordNet 3.0 will be installed at /usr/share/wordnet.
The package also provides a wordnet executable, e.g. "wordnet bird -over":

ceefour@amanah:~ > wordnet bird -over

Overview of noun bird

The noun bird has 5 senses (first 2 from tagged texts)

1. (29) bird -- (warm-blooded egg-laying vertebrates characterized by feathers and forelimbs modified as wings)
2. (1) bird, fowl -- (the flesh of a bird or fowl (wild or domestic) used as food)
3. dame, doll, wench, skirt, chick, bird -- (informal terms for a (young) woman)
4. boo, hoot, Bronx cheer, hiss, raspberry, razzing, razz, snort, bird -- (a cry or noise made to express displeasure or contempt)
5. shuttlecock, bird, birdie, shuttle -- (badminton equipment consisting of a ball of cork or rubber with a crown of feathers)

Overview of verb bird

The verb bird has 1 sense (no senses from tagged texts)

1. bird, birdwatch -- (watch and study birds in their natural habitat)
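
The overview output follows a fixed pattern per sense: a sense number, an optional tagged-text count in parentheses, a comma-separated list of lemmas, then the gloss in parentheses. Below is a minimal Python sketch that parses one sense line, assuming the layout shown above. The function name `parse_sense_line` is hypothetical.

```python
import re

# "<n>. (<tag count>) <lemma, lemma, ...> -- (<gloss>)"; the count is optional
SENSE_RE = re.compile(r"^(\d+)\.\s+(?:\((\d+)\)\s+)?(.+?)\s+--\s+\((.*)\)$")


def parse_sense_line(line):
    """Return a dict for one numbered sense line, or None if it does not match."""
    match = SENSE_RE.match(line.strip())
    if match is None:
        return None
    number, tag_count, lemmas, gloss = match.groups()
    return {
        "number": int(number),
        "tag_count": int(tag_count) if tag_count else 0,
        "lemmas": [lemma.strip() for lemma in lemmas.split(",")],
        "gloss": gloss,
    }


sense = parse_sense_line(
    "2. (1) bird, fowl -- (the flesh of a bird or fowl (wild or domestic) used as food)"
)
print(sense["lemmas"], sense["tag_count"])
```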

Indonesian Dictionary for Link Grammar Parser

I have been experimenting with a simple Indonesian dictionary for Link Grammar Parser version 4.7.4 (the version packaged in Ubuntu/Linux Mint 17). The project is shared at https://github.com/ceefour/link-grammar-id under the open source MIT license (matching the license of the English link-grammar dictionary).

The dictionary I used:

kuda gajah unta kucing anjing tikus burung: S+ or O- or O+;
Prabowo Jokowi saya: (S+ or O+) or (Oh- & {W-});

makan memakan lari berlari: S- & {W-};

apa apakah: Ss+;
siapa: Oh+;

itu: (Ss- & O+ & {W-}) or (O- & Ss- & {W-}) or (O- & Ss- & {W-});

LEFT-WALL: W+ & {Xp+};

"?": Xp-;

The results are as follows:

link-grammar: Info: Library version link-grammar-4.7.4. Enter "!help" for help.
linkparser> siapa Jokowi?
Found 1 linkage (1 had no P.P. violations)
Unique linkage, cost vector = (UNUSED=0 DIS=0 FAT=0 AND=0 LEN=3)

    +--------Xp--------+
    +-------W------+   |
    |       +--Oh--+   |
    |       |      |   |
LEFT-WALL siapa Jokowi ?

linkparser> apa itu kuda?
Found 1 linkage (1 had no P.P. violations)
Unique linkage, cost vector = (UNUSED=0 DIS=0 FAT=0 AND=0 LEN=4)

    +--------Xp--------+
    +-----W----+       |
    |      +-Ss+--O-+  |
    |      |   |    |  |
LEFT-WALL apa itu kuda ?

linkparser> apa itu kucing?
Found 1 linkage (1 had no P.P. violations)
Unique linkage, cost vector = (UNUSED=0 DIS=0 FAT=0 AND=0 LEN=4)

    +---------Xp---------+
    +-----W----+         |
    |      +-Ss+--O--+   |
    |      |   |     |   |
LEFT-WALL apa itu kucing ?

linkparser> apa kucing itu?
Found 1 linkage (1 had no P.P. violations)
Unique linkage, cost vector = (UNUSED=0 DIS=0 FAT=0 AND=0 LEN=6)

    +---------Xp---------+
    +--------W--------+  |
    |      +----Ss----+  |
    |      |     +--O-+  |
    |      |     |    |  |
LEFT-WALL apa kucing itu ?

linkparser> siapa saya?
Found 1 linkage (1 had no P.P. violations)
Unique linkage, cost vector = (UNUSED=0 DIS=0 FAT=0 AND=0 LEN=3)

    +-------Xp-------+
    +------W------+  |
    |       +--Oh-+  |
    |       |     |  |
LEFT-WALL siapa saya ?

linkparser> kuda makan
Found 1 linkage (1 had no P.P. violations)
Unique linkage, cost vector = (UNUSED=0 DIS=0 FAT=0 AND=0 LEN=1)

  +--S-+
  |    |
kuda makan
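
The linkages above come from pairing connectors: a `+` connector on one word links to a `-` connector of the same type on a word to its right. Below is a deliberately simplified Python sketch of that pairing for two words. It flattens each dictionary entry into a set of alternatives, compares connector names literally, and ignores `&`, optional `{...}` parts and subscript matching, so it is not how Link Grammar itself parses. The names `CONNECTORS` and `possible_links` are hypothetical.

```python
# Flattened alternatives taken from the dictionary above (simplification:
# real entries are formulas with "&", "or" and optional "{...}" parts).
CONNECTORS = {
    "kuda": {"S+", "O-", "O+"},
    "makan": {"S-"},
    "siapa": {"Oh+"},
    "Jokowi": {"S+", "O+", "Oh-"},
}


def possible_links(left, right):
    """Return link types where `left` offers X+ and `right` offers X-."""
    links = []
    for connector in sorted(CONNECTORS[left]):
        if connector.endswith("+"):
            link_type = connector[:-1]
            if link_type + "-" in CONNECTORS[right]:
                links.append(link_type)
    return links


print(possible_links("kuda", "makan"))    # ['S']
print(possible_links("siapa", "Jokowi"))  # ['Oh']
print(possible_links("makan", "kuda"))    # []
```

Word order matters in this sketch, as it does in the dictionary: "makan kuda" yields no link because makan only offers a leftward `S-` connector.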

I think that for the needs of the Lumen Robot Friend Knowledge Base alone, namely recognizing sentences at playgroup/kindergarten level, using Link Grammar is reasonable. But building a full Indonesian Link Grammar dictionary could be a thesis topic of its own ;-) (and would certainly require formal knowledge of Indonesian linguistics and grammar).

Link Grammar Parser

Link Grammar Parser is natural language processing (NLP) software that analyzes the links between words and the structure of words in a sentence. Link Grammar Parser serves as the basis for the RelEx semantic relationships extractor in OpenCog.

Link Grammar is currently hosted under the AbiWord project and maintained by Dom Lachowicz and by Dr. Linas Vepstas of the OpenCog project.

To install the packaged Link Grammar Parser on Ubuntu / Linux Mint:

sudo aptitude install link-grammar link-grammar-dictionaries-en

The latest version of Link Grammar Parser as of April 2014 is 5.0.8, but the Ubuntu repositories still carry version 4.7.4 as of 1 July 2014. Link Grammar currently supports English, Russian, Persian, Arabic, German, Lithuanian, Hebrew, and Turkish, plus French through the Luthor project.

Link Grammar support for Tagalog (Philippines) is being developed by Lareina Milambiling.

To add support for a language, see the short primer for creating dictionaries for new languages.
What about adding Indonesian support to Link Grammar for use by Lumen Robot Friend (simple and incomplete, of course)? Would that take a lot of effort?

An example of using link-grammar with English:

ceefour@hendy:~ > link-parser
link-grammar: Info: Dictionary found at /usr/share/link-grammar/en/4.0.dict
link-grammar: Info: Dictionary version 4.7.4.
link-grammar: Info: Library version link-grammar-4.7.4. Enter "!help" for help.
linkparser> i like an elephant which eats orange cookies.
No complete linkages found.
Found 3 linkages (3 had no P.P. violations) at null count 1
Linkage 1, cost vector = (UNUSED=1 DIS=0 FAT=0 AND=0 LEN=12)

    +------------------------------Xp------------------------------+
    |            +-----Os----+------Bs------+-------Op-------+     |
    +-----Wi-----+    +--Ds--+---R---+--RS--+       +----A---+     |
    |            |    |      |       |      |       |        |     |
LEFT-WALL [i] like.v an elephant.n which eats.v orange.a cookies.n .

Press RETURN for the next linkage.

linkparser>
Linkage 2, cost vector = (UNUSED=1 DIS=0 FAT=0 AND=0 LEN=12)

    +------------------------------Xp------------------------------+
    |            +-----Os----+------Bs------+-------Op-------+     |
    +-----Wi-----+    +--Ds--+---R---+--RS--+       +---AN---+     |
    |            |    |      |       |      |       |        |     |
LEFT-WALL [i] like.v an elephant.n which eats.v orange.s cookies.n .

Press RETURN for the next linkage.

linkparser>
Linkage 3, cost vector = (UNUSED=1 DIS=0 FAT=0 AND=0 LEN=12)

    +-------------------------------Xp-------------------------------+
    |            +-----Os----+------Bs------+--------Op--------+     |
    +-----Wi-----+    +--Ds--+---R---+--RS--+        +----AN---+     |
    |            |    |      |       |      |        |         |     |
LEFT-WALL [i] like.v an elephant.n which eats.v orange.n-u cookies.n .

Monday, 30 June 2014

Adding BabelNet as a Dependency in a Maven Project

Maven POM

Repository configuration:

<repositories>
  <repository>
    <id>bippo-nexus-public</id>
    <url>http://nexus.bippo.co.id/nexus/content/groups/public/</url>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>

Dependencies:

<dependency>
  <groupId>commons-configuration</groupId>
  <artifactId>commons-configuration</artifactId>
</dependency>
<dependency>
  <!-- ... -->
  <artifactId>jung-algorithms</artifactId>
</dependency>
<dependency>
  <!-- ... -->
</dependency>
<dependency>
  <groupId>net.sourceforge.owlapi</groupId>
  <artifactId>owlapi-distribution</artifactId>
</dependency>
<dependency>
  <groupId>it.uniroma1.lcl</groupId>
  <!-- ... -->
</dependency>
<dependency>
  <groupId>org.apache.jena</groupId>
  <!-- ... -->
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
</dependency>
<dependency>
  <groupId>org.babelnet</groupId>
  <artifactId>babelnet-api</artifactId>
</dependency>

Sample Code

TODO: BabelNet 2.5 cannot yet be used for WSD (word sense disambiguation). It requires BabelNet 1.0.1 + path indexes v1.0.1.

Required to run `id.ac.itb.ee.lskk.relexid.core.BabelNetTest`

1. Extract [BabelNet-API-2.5.zip](http://babelnet.org/download.jsp) to `$HOME/BabelNet-API-2.5`

2. Extract the indexes to `$HOME` (this will create subdirectories inside `$HOME/BabelNet-2.5`). For testing you can use the small indexes only:

a. babelnet-2.5-APACHE-20-index.tar.bz2

b. babelnet-2.5-CC-BY-30-index.tar.bz2

c. babelnet-2.5-CC-BY-NC-SA-30-index.tar.bz2

d. babelnet-2.5-CECILL-C-index.tar.bz2

3. Download BabelNet API v1.0.1 + path indexes v1.0.1:

a. http://lcl.uniroma1.it/babelnet/data/babelnet-api-1.0.1.tar.gz

b. http://lcl.uniroma1.it/babelnet/data/babelnet-1.0.1-core-lucene.tar.bz2

See [Ciarán Ó Duibhín's article](http://www.smo.uhi.ac.uk/~oduibhin/oideasra/interfaces/winbabelnet.htm) for the reason.

4. Edit `$HOME/BabelNet-API-2.5/config/babelnet.var.properties` and set `babelnet.dir` to `${user.home}/BabelNet-2.5`.

5. Edit `$HOME/BabelNet-API-2.5/config/knowledge.var.properties` and set `knowledge.graph.pathIndex` to `${user.home}/BabelNet-1.0.1`.

Reference: https://groups.google.com/d/msg/babelnet-kb/2EIKgvDVE2c/eKnHT65JN-IJ
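
Steps 4 and 5 above are both single-key edits to Java properties files. Below is a minimal Python sketch of such an edit, assuming the files use plain `key=value` lines. The helper name `set_property` and the example text are hypothetical. The sketch works on a string so it can be tried without touching the real config files.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value` (replaced or appended)."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines without a key=value pair
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = "%s=%s" % (key, value)
            break
    else:
        # Key not present yet: append it at the end
        lines.append("%s=%s" % (key, value))
    return "\n".join(lines) + "\n"


# Hypothetical starting content for babelnet.var.properties
example = "# BabelNet location\nbabelnet.dir=/some/old/path\n"
updated = set_property(example, "babelnet.dir", "${user.home}/BabelNet-2.5")
print(updated)
```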

(Self-note) Deploying babelnet-api (and several dependency JARs) to a Maven repository

First download the BabelNet Java API.

Then extract the BabelNet Java API distribution.

Create the sources JAR:

jar cvf babelnet-api-2.5-sources.jar -C src .

Upload the JAR files to the Nexus Maven repository:

mvn deploy:deploy-file -Dfile=lib/jlt-1.0.0.jar -DgroupId=it.uniroma1.lcl -DartifactId=jlt -Dversion=1.0.0 -Dpackaging=jar -Durl=http://nexus.bippo.co.id/nexus/content/repositories/soluvas-public-thirdparty/ -DrepositoryId=soluvas-public-thirdparty

mvn deploy:deploy-file -Dfile=babelnet-api-2.5.jar -DgroupId=org.babelnet -DartifactId=babelnet-api -Dversion=2.5 -Dpackaging=jar -Durl=http://nexus.bippo.co.id/nexus/content/repositories/soluvas-public-thirdparty/ -DrepositoryId=soluvas-public-thirdparty

mvn deploy:deploy-file -Dfile=babelnet-api-2.5-sources.jar -DgroupId=org.babelnet -DartifactId=babelnet-api -Dversion=2.5 -Dpackaging=jar -Dclassifier=sources -Durl=http://nexus.bippo.co.id/nexus/content/repositories/soluvas-public-thirdparty/ -DrepositoryId=soluvas-public-thirdparty

(Deploying the sources JAR will return 400 Bad Request, but that is okay.)
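
The three `mvn deploy:deploy-file` commands differ only in the file, coordinates and optional classifier. Below is a minimal Python sketch that builds those argument lists from a small table, in case more JARs need deploying. The helper name `deploy_command` is hypothetical. The sketch only prints the commands and does not run Maven.

```python
import shlex

# Repository settings copied from the commands above
REPO_ID = "soluvas-public-thirdparty"
REPO_URL = ("http://nexus.bippo.co.id/nexus/content/repositories/"
            "soluvas-public-thirdparty/")


def deploy_command(jar, group_id, artifact_id, version, classifier=None):
    """Build the argument list for one `mvn deploy:deploy-file` call."""
    args = [
        "mvn", "deploy:deploy-file",
        "-Dfile=" + jar,
        "-DgroupId=" + group_id,
        "-DartifactId=" + artifact_id,
        "-Dversion=" + version,
        "-Dpackaging=jar",
    ]
    if classifier:
        args.append("-Dclassifier=" + classifier)
    args += ["-Durl=" + REPO_URL, "-DrepositoryId=" + REPO_ID]
    return args


commands = [
    deploy_command("lib/jlt-1.0.0.jar", "it.uniroma1.lcl", "jlt", "1.0.0"),
    deploy_command("babelnet-api-2.5.jar", "org.babelnet", "babelnet-api", "2.5"),
    deploy_command("babelnet-api-2.5-sources.jar", "org.babelnet",
                   "babelnet-api", "2.5", classifier="sources"),
]
for command in commands:
    # shlex.quote keeps each argument safe to paste into a shell
    print(" ".join(shlex.quote(arg) for arg in command))
```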

Monday, 23 June 2014

Category-theory-based visual diagrams for analyzing Indonesian sentences

One of the harder problems in natural language processing (NLP) is word sense disambiguation: working out which sense of a word the speaker intends, based on the context of the sentence or the conversation.

Example: gajah (elephant-n) can mean:

the elephant as an animal (102506148-n) in a general context
the symbol of the Republican Party (106894712-n) in a political context
the gajah chess piece, i.e. the bishop (102847294-n), in the context of chess

Visual diagrams based on category theory, developed by Bob Coecke, can help solve this problem.

Put simply, each word/sense is given several links, each of which can connect to another sense that has a link of the matching category. For the example above I can declare (pseudo DSL):

menunggang
-> Animal
102506148-n
<- Animal

With that information, the link from the verb menunggang (to ride) to the word gajah-n resolves to sense 102506148-n (the elephant as an animal).
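
The pseudo DSL can be read as a typed lookup: the verb expects a category, and only senses offering that category can link to it. Below is a minimal Python sketch of that reading. Only the `Animal` links come from the example; the category names for the other two senses and all identifiers (`SENSE_OFFERS`, `VERB_EXPECTS`, `disambiguate`) are assumptions made for illustration.

```python
# Category each sense offers ("<- Category" in the pseudo DSL).
# Only the Animal entry comes from the example; the others are assumed labels.
SENSE_OFFERS = {
    "102506148-n": "Animal",       # the elephant as an animal
    "106894712-n": "PartySymbol",  # assumed category name
    "102847294-n": "ChessPiece",   # assumed category name
}

# Category each verb expects from its object ("-> Category").
VERB_EXPECTS = {"menunggang": "Animal"}

# Candidate senses per word, in the order listed in the example.
WORD_SENSES = {"gajah": ["102506148-n", "106894712-n", "102847294-n"]}


def disambiguate(verb, noun):
    """Keep only the senses of `noun` whose category matches what `verb` expects."""
    expected = VERB_EXPECTS[verb]
    return [s for s in WORD_SENSES[noun] if SENSE_OFFERS[s] == expected]


print(disambiguate("menunggang", "gajah"))  # ['102506148-n']
```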

This method was also originally intended for visualizing formulas in quantum theory.

References:

Discussion on the OpenCog mailing list
Related paper: Mathematical Foundations for a Compositional Distributional Model of Meaning
Related paper: Lambek vs. Lambek: Functorial Vector Space Semantics and String Diagrams for Lambek Calculus

Saturday, 21 June 2014

Recognizing adjectives and adjective satellites in Indonesian sentences

The relex-id grammar relationship extractor can now recognize adjectives and adjective satellites (what is the Indonesian term? Indonesian seems to treat them as adjectives, "kata sifat", as well) in a given sentence.

Example output containing an adjective, for the input "Hati ibu lembut.":

Sentence structure:
(S (NP (NP mother-n) heart-n) (AP kind-a) . )
Sentence in English:
Mother heart kind.
Sentence in Indonesian:
Kebaikan hati ibu penyayang.

In the example above, mother-n is a possessive noun (though this is not modeled yet), and kind-a is an adjective.

relex-id can also recognize four-word sentences consisting of a pronoun, verb, noun, and adjective satellite, for example "Aku menyayangi unta jingga.":

Sentence structure:
(S (PP i) (VP love-v (NP (SP orange-s) camel-n)) . )
Sentence in English:
I love orange camel.

Sentence in Indonesian:
Aku sayang unta jingga.

Another test, for the sentence "Aku menginginkan tas biru.":

Sentence structure:
(S (PP i) (VP want-v (NP (SP blue-s) bag-n)) . )
Sentence in English:
I want blue bag.

Sentence in Indonesian:
Aku hendak tas biru.

I deliberately chose the example sentences above so that the results are "fairly sensible": the WordNet database is large (WordNet 3.0 lists 155,287 unique words across 117,659 synsets), and relex-id does not yet compute a relevance score for the synsets that match a word. The first matching synset is therefore chosen, which sometimes makes the result a little odd.
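
The first-match behavior can be sketched in a few lines of Python. This is an illustration only, not the relex-id code: the function name `choose_first_sense` is hypothetical, and the sense IDs are the ones reported for 'ibu' in the log further down.

```python
def choose_first_sense(word, senses):
    """Pick the first candidate sense, warning when other senses were ignored."""
    if not senses:
        return None
    if len(senses) > 1:
        print("WARN: '%s' chose the first sense %s but matched %d senses"
              % (word, senses[0], len(senses)))
    return senses[0]


# Candidate senses for 'ibu', in the order shown in the log
ibu_senses = [
    "wn31:105843616-n",
    "wn31:110352574-n",
    "wn31:110297825-n",
    "wn31:110352098-n",
    "wn31:110352666-n",
]
chosen = choose_first_sense("ibu", ibu_senses)
print(chosen)
```

Because no score is involved, the result depends entirely on the order in which the candidates are returned.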

For now I am quite happy with the progress of relex-id :) . It is reasonably usable as a base for the next stage: refining the grammatical relationship extractor and the semantic relationship extractor.

The lex rules used are as follows:

<?xml version="1.0" encoding="ASCII"?>
<relexid:LexRules xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:relexid="http://relexid/1.0"
xsi:schemaLocation="http://relexid/1.0 relexid.ecore">
<rules>
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="noun" captureGroup="1">
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="noun" captureGroup="2" />
</replacements>
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="adjective" captureGroup="3" />
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="noun" />
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="noun" />
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="adjective" />
</rules>
<rules>
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="noun" captureGroup="1">
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="adjective_satellite" captureGroup="2" />
</replacements>
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="noun" />
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="adjective_satellite" />
</rules>
<rules>
<replacements xsi:type="relexid:PronounReplacement"
partOfSpeech="pronoun" />
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="verb" captureGroup="2">
<replacements xsi:type="relexid:PronounReplacement"
partOfSpeech="pronoun" person="second" case="object" />
</replacements>
<matchers xsi:type="relexid:LiteralMatcher">
<literals>aku</literals>
</matchers>
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="verb" />
<matchers xsi:type="relexid:LiteralMatcher">
<literals>kamu</literals>
</matchers>
</rules>
<rules>
<replacements xsi:type="relexid:PronounReplacement"
partOfSpeech="pronoun" />
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="verb" captureGroup="2">
<replacements xsi:type="relexid:RecognizedReplacement"
capturingGroup="3" />
</replacements>
<matchers xsi:type="relexid:LiteralMatcher">
<literals>aku</literals>
</matchers>
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="verb" />
<matchers xsi:type="relexid:RecognizedMatcher" partOfSpeech="noun" />
</rules>
<rules>
<replacements xsi:type="relexid:PronounReplacement"
partOfSpeech="pronoun" />
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="verb" captureGroup="2">
<replacements xsi:type="relexid:ResourceReplacement"
partOfSpeech="noun" captureGroup="3" />
</replacements>
<matchers xsi:type="relexid:LiteralMatcher">
<literals>aku</literals>
</matchers>
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="verb" />
<matchers xsi:type="relexid:PartOfSpeechMatcher"
partOfSpeech="noun" />
</rules>
<rules>
<replacements xsi:type="relexid:PunctuationReplacement" />
<matchers xsi:type="relexid:LiteralMatcher">
<literals>.</literals>
</matchers>
</rules>
</relexid:LexRules>
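
Each rule above pairs a sequence of matchers with replacements: when the matchers match a run of consecutive words, that run is replaced by a phrase. Below is a minimal Python sketch of the second rule (noun followed by adjective_satellite). It is a sketch under assumptions and not the relex-id implementation: the toy `LEXICON` stands in for the WordNet lookup, whitespace tokens are left out, and the function name is hypothetical.

```python
# Toy lexicon: word -> (part of speech, English lemma). Hypothetical stand-in
# for the WordNet lookup that relex-id performs.
LEXICON = {
    "unta": ("noun", "camel-n"),
    "jingga": ("adjective_satellite", "orange-s"),
    "tas": ("noun", "bag-n"),
    "biru": ("adjective_satellite", "blue-s"),
}


def apply_noun_adjsat_rule(tokens):
    """Replace the first noun + adjective_satellite run with an NP phrase."""
    for i in range(len(tokens) - 1):
        first = LEXICON.get(tokens[i])
        second = LEXICON.get(tokens[i + 1])
        if (first and second
                and first[0] == "noun"
                and second[0] == "adjective_satellite"):
            # Indonesian puts the modifier after the noun; the phrase lists it first
            phrase = "(NP (SP %s) %s)" % (second[1], first[1])
            return tokens[:i] + [phrase] + tokens[i + 2:]
    return tokens  # no match: leave the tokens unchanged


result = apply_noun_adjsat_rule(["Aku", "menyayangi", "unta", "jingga", "."])
print(result)
```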

Log:

20:25:06.293 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Initializing WordNet 3.1 TDB database at /home/ceefour/wn31_tdb
20:25:06.552 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading LexRules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.LexRules.xmi
20:25:06.786 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.LexRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
20:25:06.854 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.LexRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi
20:25:06.856 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading RelationRules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.RelationRules.xmi
20:25:06.856 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.RelationRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
20:25:06.860 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.RelationRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.RelationRules.xmi
20:25:06.861 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Tokens: [Hati, , ibu, , lembut, .]
20:25:06.862 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 4 walls for ['Hati', ' ', 'ibu', ' ', 'lembut', '.'] » [0, 2, 4, 5]
20:25:06.987 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'Hati' chose the first sense wn31:104632183-n but matched 9 senses: [wn31:104632183-n, wn31:104864721-n, wn31:105392877-n, wn31:105929717-n, wn31:105927857-n, wn31:107667661-n, wn31:107667514-n, wn31:113888525-n, wn31:114158105-n]
20:25:06.996 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}104632183-n] for #0: 'Hati'
20:25:07.004 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'ibu' chose the first sense wn31:105843616-n but matched 5 senses: [wn31:105843616-n, wn31:110352574-n, wn31:110297825-n, wn31:110352098-n, wn31:110352666-n]
20:25:07.004 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}105843616-n] for #2: 'ibu'
20:25:07.027 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for adjective 'lembut' chose the first sense wn31:301374976-a but matched 11 senses: [wn31:301374976-a, wn31:300228210-a, wn31:300644180-a, wn31:301156249-a, wn31:300709335-a, wn31:301510813-a, wn31:301159626-a, wn31:302455719-a, wn31:302244586-a, wn31:302457962-a, wn31:301160432-a]
20:25:07.028 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «adjective» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}301374976-a] for #4: 'lembut'
20:25:07.035 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» MATCH for [0‥4]: ['Hati', ' ', 'ibu', ' ', 'lembut']
20:25:07.047 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 2 parts at index #0: [(NP (NP mother-n) heart-n), (AP kind-a)]
20:25:07.048 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 1 walls for [(NP (NP mother-n) heart-n), (AP kind-a), '.'] » [2]
20:25:07.048 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Resetting rule iterator due to matching rule
20:25:07.051 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: '.'
20:25:07.051 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of [(NP (NP mother-n) heart-n), (AP kind-a), '.']
20:25:07.055 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: '.'
20:25:07.055 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of [(NP (NP mother-n) heart-n), (AP kind-a), '.']
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of [(NP (NP mother-n) heart-n), (AP kind-a), '.']
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb +noun» NOT match for any sublist of [(NP (NP mother-n) heart-n), (AP kind-a), '.']
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb noun» NOT match for any sublist of [(NP (NP mother-n) heart-n), (AP kind-a), '.']
20:25:07.056 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'.'» MATCH 1 [null] for #2: '.'
20:25:07.057 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'.'» MATCH for [2‥2]: ['.']
20:25:07.058 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #2: [.]
20:25:07.058 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 0 walls for [(NP (NP mother-n) heart-n), (AP kind-a), .] » []
20:25:07.058 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(NP (NP mother-n) heart-n), (AP kind-a)]
20:25:07.058 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(AP kind-a), .]
20:25:07.059 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(NP (NP mother-n) heart-n), (AP kind-a)]
20:25:07.059 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(AP kind-a), .]
20:25:07.059 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Deduced 0 relations from 3 parts [(NP (NP mother-n) heart-n), (AP kind-a), .] >> []
20:25:07.059 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Deduced 0 relations for sentence 'null': []
20:25:07.059 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence structure: (S (NP (NP mother-n) heart-n) (AP kind-a) . )
20:25:07.073 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in English: Mother heart kind.
20:25:07.109 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in Indonesian: Kebaikan hati ibu penyayang.

20:41:07.783 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Initializing WordNet 3.1 TDB database at /home/ceefour/wn31_tdb
20:41:08.070 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading LexRules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.LexRules.xmi
20:41:08.327 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.LexRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
20:41:08.515 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.LexRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi
20:41:08.517 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading RelationRules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.RelationRules.xmi
20:41:08.517 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.RelationRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
20:41:08.522 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.RelationRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.RelationRules.xmi
20:41:08.522 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Tokens: [Aku, , menyayangi, , unta, , jingga, .]
20:41:08.523 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 5 walls for ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.'] » [0, 2, 4, 6, 7]
20:41:08.643 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:41:08.643 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.655 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menyayangi'
20:41:08.655 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.672 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}102439767-n] for #4: 'unta'
20:41:08.684 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'jingga' chose the first sense wn31:104972356-n but matched 2 senses: [wn31:104972356-n, wn31:115015777-n]
20:41:08.691 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}104972356-n] for #6: 'jingga'
20:41:08.695 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «adjective» NOT match for #7: '.'
20:41:08.696 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.699 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'jingga' chose the first sense wn31:104972356-n but matched 2 senses: [wn31:104972356-n, wn31:115015777-n]
20:41:08.700 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}104972356-n] for #6: 'jingga'
20:41:08.703 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #7: '.'
20:41:08.704 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.707 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #7: '.'
20:41:08.707 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.710 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:41:08.711 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.714 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menyayangi'
20:41:08.715 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', 'unta', ' ', 'jingga', '.']
20:41:08.718 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}102439767-n] for #4: 'unta'
20:41:08.723 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «adjective_satellite» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}300379954-s] for #6: 'jingga'
20:41:08.731 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» MATCH for [4‥6]: ['unta', ' ', 'jingga']
20:41:08.742 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #4: [(NP (SP orange-s) camel-n)]
20:41:08.742 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 3 walls for ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.'] » [0, 2, 5]
20:41:08.742 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Resetting rule iterator due to matching rule
20:41:08.745 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:41:08.745 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.750 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menyayangi'
20:41:08.750 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.753 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #5: '.'
20:41:08.753 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.756 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:41:08.756 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.760 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menyayangi'
20:41:08.760 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.763 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #5: '.'
20:41:08.763 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.764 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» MATCH 1 [null] for #0: 'Aku'
20:41:08.774 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for verb 'menyayangi' chose the first sense wn31:201779085-v but matched 3 senses: [wn31:201779085-v, wn31:201779456-v, wn31:201781131-v]
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «verb» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}201779085-v] for #2: 'menyayangi'
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'kamu'» NOT match for #4: (NP (SP orange-s) camel-n)
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: 'menyayangi'
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #5: '.'
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n), '.']
20:41:08.774 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» MATCH 1 [null] for #0: 'Aku'
20:41:08.778 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for verb 'menyayangi' chose the first sense wn31:201779085-v but matched 3 senses: [wn31:201779085-v, wn31:201779456-v, wn31:201781131-v]
20:41:08.778 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «verb» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}201779085-v] for #2: 'menyayangi'
20:41:08.778 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «+noun» MATCH 1 [(NP (SP orange-s) camel-n)] for #4: (NP (SP orange-s) camel-n)
20:41:08.778 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb +noun» MATCH for [0‥4]: ['Aku', ' ', 'menyayangi', ' ', (NP (SP orange-s) camel-n)]
20:41:08.799 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 2 parts at index #0: [(PP i), (VP love-v (NP (SP orange-s) camel-n))]
20:41:08.799 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 1 walls for [(PP i), (VP love-v (NP (SP orange-s) camel-n)), '.'] » [2]
20:41:08.799 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Resetting rule iterator due to matching rule
20:41:08.802 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: '.'
20:41:08.802 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of [(PP i), (VP love-v (NP (SP orange-s) camel-n)), '.']
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: '.'
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of [(PP i), (VP love-v (NP (SP orange-s) camel-n)), '.']
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of [(PP i), (VP love-v (NP (SP orange-s) camel-n)), '.']
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb +noun» NOT match for any sublist of [(PP i), (VP love-v (NP (SP orange-s) camel-n)), '.']
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb noun» NOT match for any sublist of [(PP i), (VP love-v (NP (SP orange-s) camel-n)), '.']
20:41:08.805 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'.'» MATCH 1 [null] for #2: '.'
20:41:08.806 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'.'» MATCH for [2‥2]: ['.']
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #2: [.]
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 0 walls for [(PP i), (VP love-v (NP (SP orange-s) camel-n)), .] » []
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher pronoun matches (PP i)
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun] against [(NP (SP orange-s) camel-n)]
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 1 for matchers [pronoun, verb] against [(PP i), (VP love-v (NP (SP orange-s) camel-n))]
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(VP love-v (NP (SP orange-s) camel-n)), .]
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher pronoun matches (PP i)
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher noun matches (NP (SP orange-s) camel-n)
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 1 for matchers [noun] against [(NP (SP orange-s) camel-n)]
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher verb matches (VP love-v (NP (SP orange-s) camel-n))
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 2 for matchers [pronoun, verb] against [(PP i), (VP love-v (NP (SP orange-s) camel-n))]
20:41:08.807 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Relation rule [pronoun verb => _subj(2, 1) || _obj(2, 2/1)] matches 0..1 [(PP i), (VP love-v (NP (SP orange-s) camel-n))]
20:41:08.809 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(VP love-v (NP (SP orange-s) camel-n)), .]
20:41:08.809 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Deduced 2 relations from 3 parts [(PP i), (VP love-v (NP (SP orange-s) camel-n)), .] >> [_subj(wn31:201779085-v, I), _obj(wn31:201779085-v, wn31:102439767-n)]
20:41:08.809 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Deduced 2 relations for sentence 'null': [_subj(wn31:201779085-v, I), _obj(wn31:201779085-v, wn31:102439767-n)]
20:41:08.809 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence structure: (S (PP i) (VP love-v (NP (SP orange-s) camel-n)) . )
20:41:08.821 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in English: I love orange camel.

20:41:08.859 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in Indonesian: Aku sayang unta jingga.
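
The last step in the log above, where the relation rule `pronoun verb => _subj(2, 1) || _obj(2, 2/1)` yields two relations, can be sketched roughly as follows. This is only a minimal illustration: the `RelationSketch` and `Part` names are hypothetical and not the relex-id API, and I am assuming that `2/1` means "the first child of part 2". The noun phrase is also simplified to a single noun part carrying the sense ID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not the relex-id API. */
final class RelationSketch {

    /** A parsed part: a category, a head (word or sense ID), and optional child parts. */
    static final class Part {
        final String category;
        final String head;
        final List<Part> children;

        Part(String category, String head, Part... children) {
            this.category = category;
            this.head = head;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }
    }

    /**
     * Slides a two-part window over the sentence. For every (pronoun, verb) pair it
     * emits _subj(verb, pronoun) and, if the verb has a child, _obj(verb, first child).
     * Reading "2/1" as "first child of part 2" is an assumption.
     */
    static List<String> deduce(List<Part> parts) {
        List<String> relations = new ArrayList<>();
        for (int i = 0; i + 1 < parts.size(); i++) {
            Part first = parts.get(i);
            Part second = parts.get(i + 1);
            if (!first.category.equals("pronoun") || !second.category.equals("verb")) {
                continue; // this window does not satisfy the rule's matchers
            }
            relations.add("_subj(" + second.head + ", " + first.head + ")");
            if (!second.children.isEmpty()) {
                relations.add("_obj(" + second.head + ", " + second.children.get(0).head + ")");
            }
        }
        return relations;
    }

    /** The three parts from the log, with sense IDs copied from it. */
    static List<Part> example() {
        Part camel = new Part("noun", "wn31:102439767-n");
        Part love = new Part("verb", "wn31:201779085-v", camel);
        return Arrays.asList(new Part("pronoun", "I"), love, new Part("punctuation", "."));
    }

    public static void main(String[] args) {
        System.out.println(deduce(example()));
    }
}
```

Only the first window (pronoun, verb) satisfies the rule; the second window (verb, punctuation) is skipped, which mirrors the "Matcheds 0" line for that pair in the log.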

20:43:30.797 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Initializing WordNet 3.1 TDB database at /home/ceefour/wn31_tdb
20:43:31.082 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading LexRules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.LexRules.xmi
20:43:31.319 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.LexRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
20:43:31.390 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.LexRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi
20:43:31.391 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading RelationRules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.RelationRules.xmi
20:43:31.392 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.RelationRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
20:43:31.396 [main] INFO o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.RelationRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.RelationRules.xmi
20:43:31.397 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Tokens: [Aku, , menginginkan, , tas, , biru, .]
20:43:31.398 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 5 walls for ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.'] » [0, 2, 4, 6, 7]
20:43:31.523 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:43:31.523 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.537 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menginginkan'
20:43:31.537 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.557 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'tas' chose the first sense wn31:102776042-n but matched 6 senses: [wn31:102776042-n, wn31:104129919-n, wn31:102776843-n, wn31:104144300-n, wn31:102801040-n, wn31:113786779-n]
20:43:31.561 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}102776042-n] for #4: 'tas'
20:43:31.571 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'biru' chose the first sense wn31:104976072-n but matched 4 senses: [wn31:104976072-n, wn31:108497858-n, wn31:109247473-n, wn31:115011152-n]
20:43:31.571 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}104976072-n] for #6: 'biru'
20:43:31.575 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «adjective» NOT match for #7: '.'
20:43:31.576 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.580 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'biru' chose the first sense wn31:104976072-n but matched 4 senses: [wn31:104976072-n, wn31:108497858-n, wn31:109247473-n, wn31:115011152-n]
20:43:31.580 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}104976072-n] for #6: 'biru'
20:43:31.583 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #7: '.'
20:43:31.583 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.587 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #7: '.'
20:43:31.587 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.590 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:43:31.591 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.596 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menginginkan'
20:43:31.596 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', 'tas', ' ', 'biru', '.']
20:43:31.600 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for noun 'tas' chose the first sense wn31:102776042-n but matched 6 senses: [wn31:102776042-n, wn31:104129919-n, wn31:102776843-n, wn31:104144300-n, wn31:102801040-n, wn31:113786779-n]
20:43:31.600 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}102776042-n] for #4: 'tas'
20:43:31.604 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «adjective_satellite» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}300371931-s] for #6: 'biru'
20:43:31.612 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» MATCH for [4‥6]: ['tas', ' ', 'biru']
20:43:31.620 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #4: [(NP (SP blue-s) bag-n)]
20:43:31.621 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 3 walls for ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.'] » [0, 2, 5]
20:43:31.621 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Resetting rule iterator due to matching rule
20:43:31.624 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:43:31.625 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.632 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menginginkan'
20:43:31.632 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.635 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #5: '.'
20:43:31.635 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.638 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #0: 'Aku'
20:43:31.638 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.642 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: 'menginginkan'
20:43:31.642 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.646 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #5: '.'
20:43:31.646 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.646 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» MATCH 1 [null] for #0: 'Aku'
20:43:31.652 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for verb 'menginginkan' chose the first sense wn31:200711034-v but matched 11 senses: [wn31:200711034-v, wn31:201191258-v, wn31:201828281-v, wn31:201830665-v, wn31:201828474-v, wn31:201828678-v, wn31:201831006-v, wn31:201830126-v, wn31:201832198-v, wn31:201829904-v, wn31:201831174-v]
20:43:31.652 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «verb» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}200711034-v] for #2: 'menginginkan'
20:43:31.652 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'kamu'» NOT match for #4: (NP (SP blue-s) bag-n)
20:43:31.652 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.652 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: 'menginginkan'
20:43:31.652 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.653 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #5: '.'
20:43:31.653 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n), '.']
20:43:31.653 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» MATCH 1 [null] for #0: 'Aku'
20:43:31.657 [main] WARN i.a.i.e.l.r.c.i.PartOfSpeechMatcherImpl - PartOfSpeech matcher for verb 'menginginkan' chose the first sense wn31:200711034-v but matched 11 senses: [wn31:200711034-v, wn31:201191258-v, wn31:201828281-v, wn31:201830665-v, wn31:201828474-v, wn31:201828678-v, wn31:201831006-v, wn31:201830126-v, wn31:201832198-v, wn31:201829904-v, wn31:201831174-v]
20:43:31.658 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «verb» MATCH 1 [{http://wordnet-rdf.princeton.edu/wn31/}200711034-v] for #2: 'menginginkan'
20:43:31.658 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «+noun» MATCH 1 [(NP (SP blue-s) bag-n)] for #4: (NP (SP blue-s) bag-n)
20:43:31.658 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb +noun» MATCH for [0‥4]: ['Aku', ' ', 'menginginkan', ' ', (NP (SP blue-s) bag-n)]
20:43:31.665 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 2 parts at index #0: [(PP i), (VP want-v (NP (SP blue-s) bag-n))]
20:43:31.665 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 1 walls for [(PP i), (VP want-v (NP (SP blue-s) bag-n)), '.'] » [2]
20:43:31.665 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Resetting rule iterator due to matching rule
20:43:31.668 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: '.'
20:43:31.668 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun noun adjective» NOT match for any sublist of [(PP i), (VP want-v (NP (SP blue-s) bag-n)), '.']
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «noun» NOT match for #2: '.'
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «noun adjective_satellite» NOT match for any sublist of [(PP i), (VP want-v (NP (SP blue-s) bag-n)), '.']
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb 'kamu'» NOT match for any sublist of [(PP i), (VP want-v (NP (SP blue-s) bag-n)), '.']
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb +noun» NOT match for any sublist of [(PP i), (VP want-v (NP (SP blue-s) bag-n)), '.']
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'aku'» NOT match for #2: '.'
20:43:31.671 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'aku' verb noun» NOT match for any sublist of [(PP i), (VP want-v (NP (SP blue-s) bag-n)), '.']
20:43:31.672 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Lex «'.'» MATCH 1 [null] for #2: '.'
20:43:31.672 [main] INFO id.ac.itb.ee.lskk.relexid.core.RelEx - Rule «'.'» MATCH for [2‥2]: ['.']
20:43:31.673 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #2: [.]
20:43:31.673 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - 0 walls for [(PP i), (VP want-v (NP (SP blue-s) bag-n)), .] » []
20:43:31.673 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher pronoun matches (PP i)
20:43:31.673 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun] against [(NP (SP blue-s) bag-n)]
20:43:31.673 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 1 for matchers [pronoun, verb] against [(PP i), (VP want-v (NP (SP blue-s) bag-n))]
20:43:31.673 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(VP want-v (NP (SP blue-s) bag-n)), .]
20:43:31.674 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher pronoun matches (PP i)
20:43:31.674 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher noun matches (NP (SP blue-s) bag-n)
20:43:31.674 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 1 for matchers [noun] against [(NP (SP blue-s) bag-n)]
20:43:31.674 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcher verb matches (VP want-v (NP (SP blue-s) bag-n))
20:43:31.674 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 2 for matchers [pronoun, verb] against [(PP i), (VP want-v (NP (SP blue-s) bag-n))]
20:43:31.674 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Relation rule [pronoun verb => _subj(2, 1) || _obj(2, 2/1)] matches 0..1 [(PP i), (VP want-v (NP (SP blue-s) bag-n))]
20:43:31.675 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Matcheds 0 for matchers [pronoun, verb] against [(VP want-v (NP (SP blue-s) bag-n)), .]
20:43:31.675 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Deduced 2 relations from 3 parts [(PP i), (VP want-v (NP (SP blue-s) bag-n)), .] >> [_subj(wn31:200711034-v, I), _obj(wn31:200711034-v, wn31:102776042-n)]
20:43:31.675 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Deduced 2 relations for sentence 'null': [_subj(wn31:200711034-v, I), _obj(wn31:200711034-v, wn31:102776042-n)]
20:43:31.675 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence structure: (S (PP i) (VP want-v (NP (SP blue-s) bag-n)) . )
20:43:31.688 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in English: I want blue bag.

20:43:31.724 [main] INFO i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in Indonesian: Aku hendak tas biru.
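
The "Rule ... MATCH", "Replacing with ... parts" and "Resetting rule iterator due to matching rule" lines in the log suggest a rewrite loop that restarts from the first rule after every successful replacement. Below is a minimal sketch of that idea under several assumptions: the class, field and method names are hypothetical, whitespace tokens and "walls" are left out, each rule collapses its match into exactly one part (the real log shows a rule producing two parts), and the words are not reordered into English order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not the relex-id API. */
final class RewriteLoopSketch {

    /** A token or an already-built phrase, tagged with its category. */
    static final class Part {
        final String category;
        final String text;

        Part(String category, String text) {
            this.category = category;
            this.text = text;
        }
    }

    /** A rule rewrites a run of categories into a single part of category `result`. */
    static final class Rule {
        final String result;
        final List<String> pattern;

        Rule(String result, String... pattern) {
            this.result = result;
            this.pattern = Arrays.asList(pattern);
        }
    }

    /** Returns the index of the first sublist matching the rule's pattern, or -1. */
    static int findMatch(List<Part> parts, Rule rule) {
        int width = rule.pattern.size();
        for (int at = 0; at + width <= parts.size(); at++) {
            boolean matches = true;
            for (int k = 0; k < width && matches; k++) {
                matches = parts.get(at + k).category.equals(rule.pattern.get(k));
            }
            if (matches) {
                return at;
            }
        }
        return -1;
    }

    /**
     * Tries the rules in order; after any replacement it goes back to the first rule.
     * Terminates here because every rule is at least two categories wide, so each
     * replacement shortens the list.
     */
    static List<Part> rewrite(List<Part> input, List<Rule> rules) {
        List<Part> parts = new ArrayList<>(input);
        int r = 0;
        while (r < rules.size()) {
            Rule rule = rules.get(r);
            int at = findMatch(parts, rule);
            if (at < 0) {
                r++; // no sublist matched, move on to the next rule
                continue;
            }
            List<Part> window = parts.subList(at, at + rule.pattern.size());
            StringBuilder text = new StringBuilder("(").append(rule.result);
            for (Part p : window) {
                text.append(' ').append(p.text);
            }
            text.append(')');
            window.clear(); // removes the matched parts from the backing list
            parts.add(at, new Part(rule.result, text.toString()));
            r = 0; // reset the rule iterator, as the log does after a match
        }
        return parts;
    }

    /** "Aku menginginkan tas biru", already tagged and glossed as in the log. */
    static List<Part> example() {
        List<Part> tokens = Arrays.asList(
                new Part("pronoun", "i"),
                new Part("verb", "want-v"),
                new Part("noun", "bag-n"),
                new Part("adjective_satellite", "blue-s"));
        List<Rule> rules = Arrays.asList(
                new Rule("S", "pronoun", "verb", "NP"),
                new Rule("NP", "noun", "adjective_satellite"));
        return rewrite(tokens, rules);
    }

    public static void main(String[] args) {
        for (Part p : example()) {
            System.out.println(p.text);
        }
    }
}
```

The sentence rule is deliberately listed first: it cannot match until the noun phrase exists, so it only succeeds on the second pass, after the noun phrase rule has fired and the iterator has been reset. Without the reset, a rule that depends on the output of a later rule would never get another chance.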
