Progress towards Unification

Thursday, December 31st, 2009

I’ve now created data structures to represent features, conjuncts and disjuncts, and can perform unification in a few simple cases — but not yet generally in the presence of disjuncts. The latter is hard to do efficiently, so in the first instance I plan to use an inefficient but simple algorithm (probably expanding both inputs to disjunctive normal form prior to unification).
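To make the brute-force approach concrete, here is a minimal Python sketch of unification by expansion to disjunctive normal form. It is not the actual implementation: the representation (nested dicts for conjuncts, strings for atomic values) and the names `Disj`, `expand`, `unify_simple` and `unify` are all assumptions made for illustration.

```python
from itertools import product

class Disj(tuple):
    """A disjunction: any one of the contained alternatives may hold."""

FAIL = object()  # sentinel returned when two structures cannot be unified

def expand(fs):
    """Return the list of disjunct-free alternatives equivalent to fs (its DNF)."""
    if isinstance(fs, Disj):
        return [alt for option in fs for alt in expand(option)]
    if isinstance(fs, dict):
        keys = list(fs)
        choices = [expand(fs[k]) for k in keys]
        # One alternative per combination of choices for each feature.
        return [dict(zip(keys, combo)) for combo in product(*choices)]
    return [fs]  # atomic value

def unify_simple(a, b):
    """Unify two structures that contain no disjuncts."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify_simple(result[key], value)
                if merged is FAIL:
                    return FAIL
                result[key] = merged
            else:
                result[key] = value
        return result
    return a if a == b else FAIL

def unify(a, b):
    """Unify a and b by trying every pair of DNF alternatives.

    Returns a list of the distinct successful results; an empty list means failure.
    """
    results = []
    for x in expand(a):
        for y in expand(b):
            u = unify_simple(x, y)
            if u is not FAIL and u not in results:
                results.append(u)
    return results

noun = {"cat": "noun", "num": Disj(["sg", "pl"])}
context = {"num": "pl", "case": Disj(["nom", "acc"])}
alternatives = unify(noun, context)
```

The cost is visible in `unify`: the number of pairs tried is the product of the number of alternatives on each side, which itself grows multiplicatively with every disjunct. That is why this can only be a first-instance solution.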

Also not implemented yet are paths, constituent sets and patterns. An open question is whether to implement paths explicitly, or instead to use named variables to tie branches of the tree together. So far as I can see, these options provide essentially the same functionality; however, I suspect that variables may be more convenient to implement, given that the data structures I am using only allow you to move down the tree, not upwards.
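A rough Python sketch of the named-variable option follows, again using nested dicts as a stand-in for the real data structures. The names `Var`, `resolve`, `unify_vars`, `substitute` and `unify_with_vars` are hypothetical. The sketch ignores disjuncts and assumes that variables end up bound to atomic values, which is enough to show how two branches can be tied together without ever moving up the tree.

```python
FAIL = object()  # sentinel returned when unification fails

class Var:
    """A named variable; every occurrence of the same name must take the same value."""
    def __init__(self, name):
        self.name = name

def resolve(value, bindings):
    """Follow variable bindings until reaching a value or an unbound variable."""
    while isinstance(value, Var) and value.name in bindings:
        value = bindings[value.name]
    return value

def unify_vars(a, b, bindings):
    """Unify a and b top-down, recording variable bindings as they are discovered."""
    a, b = resolve(a, bindings), resolve(b, bindings)
    if isinstance(a, Var):
        if not (isinstance(b, Var) and b.name == a.name):
            bindings[a.name] = b
        return b
    if isinstance(b, Var):
        bindings[b.name] = a
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify_vars(result[key], value, bindings)
                if merged is FAIL:
                    return FAIL
                result[key] = merged
            else:
                result[key] = value
        return result
    return a if a == b else FAIL

def substitute(fs, bindings):
    """Replace bound variables in fs with their values."""
    fs = resolve(fs, bindings)
    if isinstance(fs, dict):
        return {k: substitute(v, bindings) for k, v in fs.items()}
    return fs

def unify_with_vars(a, b):
    bindings = {}
    result = unify_vars(a, b, bindings)
    return FAIL if result is FAIL else substitute(result, bindings)

# Subject and verb must agree in number: both branches share the variable N.
agreement = {"subj": {"num": Var("N")}, "verb": {"num": Var("N")}}
agreed = unify_with_vars(agreement, {"subj": {"num": "pl"}})
clash = unify_with_vars(agreement, {"subj": {"num": "pl"}, "verb": {"num": "sg"}})
```

Because the bindings live in a separate table rather than in the tree, the traversal only ever descends; an explicit path such as "the `num` of my sibling `subj`" would need some way to reach the parent node.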

I’m now confident that unification can provide all of the current functionality of the translation system except (perhaps) for morphological operations and left-right agreement. Furthermore, it would do so in a significantly more flexible and elegant manner than the ad-hoc mechanisms that exist at present. My main concern remains that of efficiency. I’m also uncertain as to how best to support dialects.

One mechanism which could be eliminated is the use of namespaces. For example, the predicate zoo:species:tyrannosaurus:rex could be replaced by:

[type=animal,rank=species,genus=tyrannosaurus,species=rex]

This is more verbose, but self-describing (a bit like the difference between X.500 and the DNS) and able to be processed in ways that an opaque predicate name cannot.
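As a small illustration of the kind of processing this allows, the Python sketch below parses the bracketed notation into a flat dictionary and tests it against a partial description. The helper names `parse_features` and `subsumes` are hypothetical, and the parser assumes flat, atomic features with no nesting or disjuncts.

```python
def parse_features(text):
    """Parse '[a=b,c=d]' into a flat dict of atomic features."""
    body = text.strip()[1:-1]
    return dict(pair.split("=", 1) for pair in body.split(","))

def subsumes(pattern, fs):
    """True if every feature in pattern is present in fs with the same value."""
    return all(fs.get(key) == value for key, value in pattern.items())

rex = parse_features("[type=animal,rank=species,genus=tyrannosaurus,species=rex]")
is_tyrannosaur = subsumes({"genus": "tyrannosaurus"}, rex)
is_cat = subsumes({"genus": "felis"}, rex)
```

A query such as "anything in the genus tyrannosaurus" is a one-line feature test here, whereas with zoo:species:tyrannosaurus:rex it would depend on knowing what each position in the name means.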