Friday, 20 June 2014

relex-id: a semantic relationship extractor for Indonesian

The relex-id project can currently perform very simple semantic extraction in Subject-Predicate-Object form.


It can also perform natural language generation in Indonesian (the primary target) and English (a proof of concept only, which will not be developed further), using a concept/term dictionary.

Example input:

Aku cinta kamu.

Output:

Sentence structure:
(S (PP i) (VP dbpedia:Love (PP you_o)) . )

Sentence in English: I love you.
Sentence in Indonesian: Aku cinta kamu.

Example input:

Aku suka gajah.

Output:

Sentence structure:
(S (PP i) (VP dbpedia:Like (NP dbpedia:Elephant)) . )

Sentence in English: I like elephant.
Sentence in Indonesian: Aku suka gajah.
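
To make the two examples concrete, here is a minimal sketch of Subject-Predicate-Object extraction followed by dictionary-based generation. It is not the relex-id implementation: the class name `SpoSketch`, its methods, the hard-coded dictionaries, and the rule for choosing `NP` versus `PP` are all assumptions made for illustration. Only the concept IDs (`i`, `you_o`, `dbpedia:Love`, `dbpedia:Like`, `dbpedia:Elephant`) come from the output above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and structure are NOT taken from relex-id.
final class SpoSketch {
    // Indonesian surface word -> concept ID (IDs as they appear in the sample output).
    static final Map<String, String> SUBJECTS = Map.of("aku", "i");
    static final Map<String, String> PREDICATES =
            Map.of("cinta", "dbpedia:Love", "suka", "dbpedia:Like");
    static final Map<String, String> OBJECTS =
            Map.of("kamu", "you_o", "gajah", "dbpedia:Elephant");

    // Concept ID -> term, one dictionary per output language.
    static final Map<String, String> ENGLISH = Map.of(
            "i", "I", "you_o", "you", "dbpedia:Love", "love",
            "dbpedia:Like", "like", "dbpedia:Elephant", "elephant");
    static final Map<String, String> INDONESIAN = Map.of(
            "i", "aku", "you_o", "kamu", "dbpedia:Love", "cinta",
            "dbpedia:Like", "suka", "dbpedia:Elephant", "gajah");

    /** Returns {subject, predicate, object} concept IDs, or null if the sentence does not match. */
    static String[] extract(String sentence) {
        String[] words = sentence.trim().replaceAll("\\.$", "")
                .toLowerCase(Locale.ROOT).split("\\s+");
        if (words.length != 3) return null;
        String s = SUBJECTS.get(words[0]);
        String p = PREDICATES.get(words[1]);
        String o = OBJECTS.get(words[2]);
        return (s == null || p == null || o == null) ? null : new String[] {s, p, o};
    }

    /** Renders the bracketed structure; assumes DBpedia objects are NP and pronouns are PP. */
    static String structure(String[] spo) {
        String objectTag = spo[2].startsWith("dbpedia:") ? "NP" : "PP";
        return "(S (PP " + spo[0] + ") (VP " + spo[1] + " (" + objectTag + " " + spo[2] + ")) . )";
    }

    /** Generates a sentence by looking up each concept in the given term dictionary. */
    static String generate(String[] spo, Map<String, String> dictionary) {
        String text = dictionary.get(spo[0]) + " " + dictionary.get(spo[1]) + " "
                + dictionary.get(spo[2]) + ".";
        return Character.toUpperCase(text.charAt(0)) + text.substring(1);
    }

    public static void main(String[] args) {
        String[] spo = extract("Aku suka gajah.");
        System.out.println(structure(spo));
        System.out.println(generate(spo, ENGLISH));
        System.out.println(generate(spo, INDONESIAN));
    }
}
```

Because generation works from concept IDs rather than from the input words, supporting another output language in this sketch only requires another dictionary. It also explains why the English output reads "I like elephant.": no article or plural handling is involved.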

Recognized words are recorded as semantic resources based on data from DBpedia. The hope is that this technique can later be extended to deduce additional knowledge from DBpedia's semantic descriptions (in RDF/N3 format).
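
As a small illustration of how such a resource could be tied back to DBpedia, the sketch below expands a prefixed name like `dbpedia:Elephant` into a full IRI. It assumes that the `dbpedia:` prefix in the output stands for the usual DBpedia resource namespace `http://dbpedia.org/resource/`; the post does not state this, and the class `PrefixSketch` and its method are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch; the prefix mapping is an assumption, not taken from relex-id.
final class PrefixSketch {
    static final Map<String, String> PREFIXES =
            Map.of("dbpedia", "http://dbpedia.org/resource/");

    /** Expands "prefix:Local" to a full IRI; returns the input unchanged if the prefix is unknown. */
    static String expand(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) return name; // not a prefixed name, e.g. "i" or "you_o"
        String namespace = PREFIXES.get(name.substring(0, colon));
        return namespace == null ? name : namespace + name.substring(colon + 1);
    }

    public static void main(String[] args) {
        System.out.println(expand("dbpedia:Elephant"));
        System.out.println(expand("you_o"));
    }
}
```

With a full IRI in hand, the RDF description of the resource could then be loaded and queried, which is the kind of deduction the paragraph above hopes for.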

Log:

01:40:14.164 [main] DEBUG o.a.j.riot.stream.JenaIOEnvironment - Failed to find configuration: location-mapping.ttl;location-mapping.rdf;location-mapping.n3;etc/location-mapping.rdf;etc/location-mapping.n3;etc/location-mapping.ttl
01:40:14.248 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading rules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.LexRules.xmi
01:40:14.488 [main] INFO  o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.LexRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
01:40:14.493 [main] DEBUG o.soluvas.commons.OnDemandXmiLoader - Loading XMI from URI: file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi using classpath
01:40:14.576 [main] INFO  o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.LexRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi
01:40:14.578 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Tokens: [Aku,  , cinta,  , kamu, .]
01:40:14.588 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@1198f101 (literals: [aku], caseSensitive: false) match for #0: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@4ff681ad (literal: Aku, resource: null)
01:40:14.591 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.ResourceElementImpl@285c5e36 (resource: dbpedia:Love) match for #2: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@482d3fed (literal: cinta, resource: null)
01:40:14.591 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@12c005a0 (literals: [kamu], caseSensitive: false) match for #4: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@7cb154fd (literal: kamu, resource: null)
01:40:14.602 [main] INFO  id.ac.itb.ee.lskk.relexid.core.RelEx - Rule id.ac.itb.ee.lskk.relexid.core.impl.LexRuleImpl@5ca61f7f match for [0‥4]: [id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@4ff681ad (literal: Aku, resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@1e5dfa5e (literal:  , resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@482d3fed (literal: cinta, resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@2be040e5 (literal:  , resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@7cb154fd (literal: kamu, resource: null)]
01:40:14.604 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 2 parts at index #0: [(PP i), (VP dbpedia:Love (PP you_o))]
01:40:14.607 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@15197854 (literals: [aku], caseSensitive: false) not match for #2: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@245fea74 (literal: ., resource: null)
01:40:14.607 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule id.ac.itb.ee.lskk.relexid.core.impl.LexRuleImpl@3b2add9e not match for [Aku,  , cinta,  , kamu, .]
01:40:14.607 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@7b3bce6a (literals: [.], caseSensitive: false) match for #2: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@245fea74 (literal: ., resource: null)
01:40:14.607 [main] INFO  id.ac.itb.ee.lskk.relexid.core.RelEx - Rule id.ac.itb.ee.lskk.relexid.core.impl.LexRuleImpl@c22ad50 match for [2‥2]: [id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@245fea74 (literal: ., resource: null)]
01:40:14.610 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #2: [.]
01:40:14.610 [main] INFO  i.a.i.ee.lskk.relexid.core.RelExTest - Sentence structure: (S (PP i) (VP dbpedia:Love (PP you_o)) . )
01:40:14.611 [main] INFO  i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in English: I love you.
01:40:14.611 [main] INFO  i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in Indonesian: Aku cinta kamu.
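
The `Tokens:` line in the log shows that whitespace and punctuation are kept as tokens of their own, which is why the words match at indices #0, #2 and #4. A tokenizer with that behavior could look like the sketch below. The regular expression and the class `TokenizerSketch` are assumptions; the log does not show how relex-id actually splits its input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; relex-id's real tokenizer is not shown in the post.
final class TokenizerSketch {
    // A run of letters/digits, OR a run of whitespace, OR a single other character.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|\\s+|[^\\p{L}\\p{N}\\s]");

    /** Splits a sentence into word, whitespace and punctuation tokens, dropping nothing. */
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(sentence);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [Aku,  , cinta,  , kamu, .]
        System.out.println(tokenize("Aku cinta kamu."));
    }
}
```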



01:40:29.996 [main] DEBUG o.a.j.riot.stream.JenaIOEnvironment - Failed to find configuration: location-mapping.ttl;location-mapping.rdf;location-mapping.n3;etc/location-mapping.rdf;etc/location-mapping.n3;etc/location-mapping.ttl
01:40:30.082 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Loading rules from class id.ac.itb.ee.lskk.relexid.core.RelExTest > lumen.LexRules.xmi
01:40:30.283 [main] INFO  o.soluvas.commons.OnDemandXmiLoader - Loading XMI: lumen.LexRules.xmi from id.ac.itb.ee.lskk.relexid.core.RelExTest
01:40:30.287 [main] DEBUG o.soluvas.commons.OnDemandXmiLoader - Loading XMI from URI: file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi using classpath
01:40:30.356 [main] INFO  o.soluvas.commons.OnDemandXmiLoader - Loaded id.ac.itb.ee.lskk.relexid.core.impl.LexRulesImpl object from file:/home/ceefour/git/relex-id/core/target/classes/id/ac/itb/ee/lskk/relexid/core/lumen.LexRules.xmi
01:40:30.357 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Tokens: [Aku,  , suka,  , gajah, .]
01:40:30.370 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@40bbadde (literals: [aku], caseSensitive: false) match for #0: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@6193eb05 (literal: Aku, resource: null)
01:40:30.374 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.ResourceElementImpl@12c005a0 (resource: dbpedia:Love) not match for #2: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@7cb154fd (literal: suka, resource: null)
01:40:30.374 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Rule id.ac.itb.ee.lskk.relexid.core.impl.LexRuleImpl@5e42edff not match for [Aku,  , suka,  , gajah, .]
01:40:30.374 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@7c1730b1 (literals: [aku], caseSensitive: false) match for #0: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@6193eb05 (literal: Aku, resource: null)
01:40:30.374 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.ResourceElementImpl@567df41c (resource: dbpedia:Like) match for #2: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@7cb154fd (literal: suka, resource: null)
01:40:30.374 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.ResourceElementImpl@672586a0 (resource: dbpedia:Elephant) match for #4: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@50a9a747 (literal: gajah, resource: null)
01:40:30.386 [main] INFO  id.ac.itb.ee.lskk.relexid.core.RelEx - Rule id.ac.itb.ee.lskk.relexid.core.impl.LexRuleImpl@418b04a5 match for [0‥4]: [id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@6193eb05 (literal: Aku, resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@dde0e41 (literal:  , resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@7cb154fd (literal: suka, resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@6d79d483 (literal:  , resource: null), id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@50a9a747 (literal: gajah, resource: null)]
01:40:30.388 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 2 parts at index #0: [(PP i), (VP dbpedia:Like (NP dbpedia:Elephant))]
01:40:30.391 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Element id.ac.itb.ee.lskk.relexid.core.impl.LiteralElementImpl@40de3b0e (literals: [.], caseSensitive: false) match for #2: id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@32f43d34 (literal: ., resource: null)
01:40:30.391 [main] INFO  id.ac.itb.ee.lskk.relexid.core.RelEx - Rule id.ac.itb.ee.lskk.relexid.core.impl.LexRuleImpl@6647555e match for [2‥2]: [id.ac.itb.ee.lskk.relexid.core.impl.UnrecognizedPartImpl@32f43d34 (literal: ., resource: null)]
01:40:30.393 [main] DEBUG id.ac.itb.ee.lskk.relexid.core.RelEx - Replacing with 1 parts at index #2: [.]
01:40:30.393 [main] INFO  i.a.i.ee.lskk.relexid.core.RelExTest - Sentence structure: (S (PP i) (VP dbpedia:Like (NP dbpedia:Elephant)) . )
01:40:30.393 [main] INFO  i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in English: I like elephant.
01:40:30.394 [main] INFO  i.a.i.ee.lskk.relexid.core.RelExTest - Sentence in Indonesian: Aku suka gajah.
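
Both logs follow the same pattern: a rule matches the parts at [0‥4], those five parts are replaced by two, and the period is then found at index #2 instead of #5. The replacement step can be sketched as follows. The helper `replaceRange` and the use of plain strings for parts are assumptions for illustration; relex-id uses its own part classes, as the log shows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "Replacing with N parts at index #i" step.
final class RewriteSketch {
    /** Returns a new list in which parts[from..to] (inclusive) are replaced by the given parts. */
    static List<String> replaceRange(List<String> parts, int from, int to,
                                     List<String> replacement) {
        List<String> result = new ArrayList<>(parts.subList(0, from));
        result.addAll(replacement);
        result.addAll(parts.subList(to + 1, parts.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("Aku", " ", "suka", " ", "gajah", ".");
        List<String> rewritten = replaceRange(parts, 0, 4,
                List.of("(PP i)", "(VP dbpedia:Like (NP dbpedia:Elephant))"));
        System.out.println(rewritten);
        // The period has shifted from index 5 to index 2.
        System.out.println(rewritten.indexOf("."));
    }
}
```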