Archive for the ‘Predicates’ Category

Computer Terminology part 2: Topics

Tuesday, May 11th, 2010

In choosing what the topics should be, my main priority has been to avoid creating large grey areas. This is partly to simplify the initial classification process, but mostly to make the resulting predicates easier to memorise: if the classifications are obvious given the topics, then all you have to remember are the topics.

Computer technology makes a good choice for a top-level namespace because it is a large subject that has much specialised vocabulary and reasonably well-defined boundaries. I’ve called this namespace ‘comp’. The namespaces immediately below it will include:

comp:alg (algorithms)
comp:arch (machine architecture)
comp:code (source and object code)
comp:data (data types and structures)
comp:dev (software development)
comp:exec (execution)
comp:io (input/output)
comp:net (networking)
comp:os (operating systems)
comp:ui (the user interface)

Some of the second-level namespaces have third-level namespaces within them. Examples include comp:alg:sort and comp:io:printer. The main reason for introducing them was that there were groups of predicate names that would otherwise share a common suffix, such as inkjet, laser and thermal printers, or bubble, insertion and selection sorts. The word ‘printer’ or ‘sort’ was going to have to appear in the predicate name anyway, so why not make it a namespace and obtain the benefit of grouping like terms together?
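The grouping effect can be illustrated with a minimal Python sketch. The full predicate names below (such as comp:io:printer:laser) are hypothetical, assembled from the examples in the text, and the helper functions are not part of the translation system.

```python
def split_predicate(name):
    """Split 'comp:io:printer:laser' into (namespace parts, local name)."""
    parts = name.split(":")
    return parts[:-1], parts[-1]


def group_by_namespace(names):
    """Group local predicate names under their full namespace path."""
    groups = {}
    for name in names:
        namespace, local = split_predicate(name)
        groups.setdefault(":".join(namespace), []).append(local)
    return groups


# Hypothetical predicate names, for illustration only.
names = [
    "comp:io:printer:inkjet",
    "comp:io:printer:laser",
    "comp:io:printer:thermal",
    "comp:alg:sort:bubble",
    "comp:alg:sort:insertion",
    "comp:alg:sort:selection",
]

for namespace, locals_ in group_by_namespace(names).items():
    print(namespace, locals_)
```

With the shared word moved into the namespace, the local names no longer need to repeat it.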

Much computer-related terminology has been adapted from other fields. It will therefore be no surprise to find the same predicate names appearing in other namespaces with similar definitions, both inside and outside the comp hierarchy. An example would be comp:dev:fork and comp:os:fork. One could argue that these are essentially the same process applied respectively to source code and to a running process, but there are two good reasons for keeping them separate:

  • Although they have much in common, there is enough contextual baggage associated with each that it would be difficult for a generic description to convey their full meanings.
  • The concept of forking is one that applies only to a very small number of isolated concepts. It is therefore quite practicable to enumerate the alternatives, and there is little to be gained by attempting to generalise.

A subtopic that I am unsure about adding is one for types of software. Originally I had put together quite a number of predicates that could be grouped together in this way, but it became increasingly apparent that most of them could be expressed using the pattern ‘program to do X’. Thus, a compiler is a program for compilation, a text editor is a program for editing text, and so on. There may be a case for providing these anyway, as aliases, but I’m not yet convinced that is justified.

There may also be a need to add subtopics relating to specific technologies. For example, ‘method’ is a generic term for a subroutine that is a member of a class, but for methods written in C++ it is more common and more appropriate to use the term ‘member function’. This difference needs to be captured somehow, and one way to do that would be to treat a C++ member function as a type of method that is sufficiently distinctive to be given its own predicate. However I’m not necessarily convinced that this is the right answer, not least because it would violate orthogonality.

One final point: I think my work on this namespace is sufficiently mature to start committing parts of it to the repository, but this does not mean that the predicate names should be considered stable. On the contrary, I expect change as it beds in, and I won’t be making any particular effort to ensure backward compatibility. The only predicates that have been declared stable so far are those relating to numbers. At some point in the future it will be necessary to identify stable and non-stable predicates in the BabelScript documentation, but for now it is enough to say that everything apart from cardinal and ordinal numbers is non-stable.

Tags in Predicate Definitions

Sunday, November 1st, 2009

There are several stages of the translation process during which tags can be applied, but currently almost all of the raw information on which they are based is provided by the lexicon. Some of that information is language-independent, and I want to move it into the predicate dictionary so that it can be shared between languages. Examples of the type of information I have in mind are the fact that a predicate represents:

  • a colour, or
  • a substance, or
  • an animal.

As I’ve previously indicated, language-independence need not be absolute: if a particular language needs to handle a particular predicate differently then it can be overridden. I wouldn’t want to over-use this facility, because attributes which are substantially language-dependent belong in the lexicon, but it will simplify the handling of edge cases and idiosyncrasies.

The syntax for attaching tags to a predicate is as follows:

predicate foo
{
  tag bar,baz,quux;
};

Multiple tag statements are permitted as an alternative to listing them on one line. Tags are applied as the very first stage of the translation process. I’ve also made some changes to later stages so that tags survive up to and beyond word selection.
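To illustrate that one combined tag statement and several separate ones are meant to be equivalent, here is a minimal Python sketch of a reader for the syntax shown above. It is a simplified, hypothetical stand-in, not the real parser, and it understands nothing except tag statements.

```python
import re

# Hypothetical, simplified patterns for the declaration syntax shown above.
DECL = re.compile(r"predicate\s+([\w:-]+)\s*\{(.*?)\}\s*;", re.S)
TAG = re.compile(r"tag\s+([^;]+);")


def read_tags(source):
    """Return {predicate name: list of tags in order of appearance}."""
    result = {}
    for name, body in DECL.findall(source):
        tags = result.setdefault(name, [])
        for tag_list in TAG.findall(body):
            tags.extend(t.strip() for t in tag_list.split(","))
    return result


one_line = "predicate foo\n{\n  tag bar,baz,quux;\n};"
multi_line = "predicate foo\n{\n  tag bar;\n  tag baz;\n  tag quux;\n};"

print(read_tags(one_line))
print(read_tags(multi_line))
```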

What can’t easily be achieved at this point is tagging of composite predicates prior to word selection, so while the system can be told that zoo:genus:vulpes has the characteristic of being animate, it is not able to deduce that (colour:red zoo:genus:vulpes) is equally animate. This is an issue I intend to address soon.

Merging the predicate dictionary into the language description

Wednesday, September 30th, 2009

I’m going to eliminate any formal distinction between the predicate dictionary and the language description. Instead there will be only one type of file, which can hold any of the currently supported types of declaration (predicate, morpheme, reading and so on).

Of course, I certainly don’t want to copy and paste the same set of predicate declarations into many separate files, but that won’t be necessary. There will need to be a way for one language to inherit declarations from another language in order to efficiently support dialects. The same mechanism, once it has been implemented, can be used across all languages to share a common set of predicate declarations.

One important difference between this and the current arrangement is that it will provide a basis for predicate declarations to be overridden. This will allow information to be associated with a predicate even if it is not strictly language-independent.

For example, Polish treats nouns differently according to whether they are animate or inanimate. For the most part animacy is defined as you would expect it to be, but there are marginal cases which are decided by convention (plants are generally inanimate, but viruses, bacteria and fungi are animate), and more than a few outright exceptions (units of currency, such as the złoty, are animate). To the extent that the classification is based on objective criteria it can and should be shared between languages, but exceptions rightly belong within the relevant language description.
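The inheritance-with-override idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the Language class, the tag names and the predicate money:unit:zloty are hypothetical, and the internal C++ API may look quite different.

```python
class Language:
    """A language description that may inherit declarations from a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.tags = {}  # predicate name -> set of tags declared at this level

    def declare(self, predicate, *tags):
        """Declare (or override) the tags for a predicate in this language."""
        self.tags[predicate] = set(tags)

    def lookup(self, predicate):
        """Return the nearest declaration, searching this language first."""
        language = self
        while language is not None:
            if predicate in language.tags:
                return language.tags[predicate]
            language = language.parent
        return set()


# Shared declarations, based on objective criteria.
common = Language("common")
common.declare("zoo:genus:vulpes", "animate")
common.declare("money:unit:zloty", "inanimate")  # hypothetical predicate name

# Polish inherits everything, but overrides the conventional exception.
polish = Language("pl", parent=common)
polish.declare("money:unit:zloty", "animate")

print(polish.lookup("zoo:genus:vulpes"))
print(polish.lookup("money:unit:zloty"))
```

The override lives in the Polish description only, so other languages that inherit from the common set are unaffected.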

Implementing this capability will not be a great burden. Arguably it simplifies the translation system slightly, and it avoids the annoyance (within the internal C++ API) of having to explicitly instantiate the predicate dictionary and provide a reference to it when constructing a language object.

Names of Colours part 4: Implementation

Monday, August 31st, 2009

I’ve now had some experience working with the colour predicates described previously, and so far they have proved to be satisfactory. Certainly I have not yet had cause to wish that they were defined differently. However there are a number of constructions which cannot currently be translated, due in large part to how the word selection algorithm works. Here is an outline of what works, what doesn’t, and how the situation could be improved.

Unqualified hues present no difficulty provided that the readings given cover all of the allowed predicates. This can and often does result in several readings for the same colour name. For example, both colour:azure and colour:blue have been translated as ‘blue’ in English.

Hues qualified with colour:dark, colour:light or colour:bright also work as intended provided that they are bound together as a compound predicate, for example:

(colour:dark colour:orange) ⇒ ‘brown’

With a few additions to the language description this can be expanded to:

((colour:dark colour:orange) bio:genus:vulpes) ⇒ ‘brown fox’.

However, other permutations of these predicates have a less desirable surface form:

((colour:orange colour:dark) bio:genus:vulpes) ⇒ ‘orange dark fox’
(colour:dark (colour:orange bio:genus:vulpes)) ⇒ ‘dark orange fox’
(colour:orange (colour:dark bio:genus:vulpes)) ⇒ ‘orange dark fox’

The question is, should the translation system do better with these inputs, or should these inputs be avoided?

A partial answer to this question is that it shouldn’t matter whether colour:dark or colour:orange is specified first, because the effect of these predicates on the membership function is linear. By this I mean that if f(x) represents darkness and g(x) represents orangeness then:

f(g(x)) ∝ f(x) × g(x)

Since multiplication is commutative it follows that:

f(g(x)) ∝ g(f(x))

This does not mean that (colour:dark colour:orange) and (colour:orange colour:dark) should necessarily produce the same output, but the surface forms should at least be of similar quality, which is clearly not the case at present.

One solution would be to provide two readings for the word ‘brown’, but this would be inelegant, and scales poorly if more than two predicates were involved. The alternatives are to improve the word selection algorithm so as to recognise when permutations are equivalent to each other, or to force the predicates into a particular canonical order.
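The canonical-order option might look something like the following Python sketch. Alphabetical order is used purely as an arbitrary stand-in for whatever canonical order is eventually chosen, and the readings table and function names are hypothetical.

```python
def canonical(predicates):
    """Put a compound's predicates into one fixed (here alphabetical) order."""
    return tuple(sorted(predicates))


# One reading per compound, stored under its canonical key.
readings = {
    canonical(["colour:dark", "colour:orange"]): "brown",
}


def select_word(predicates):
    """Look up a compound regardless of the order it was written in."""
    return readings.get(canonical(predicates))


print(select_word(["colour:dark", "colour:orange"]))
print(select_word(["colour:orange", "colour:dark"]))
```

A sorted tuple is used rather than a set so that repeated predicates would still be counted.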

Many languages have a preferred order for adjectives, so some reordering will be needed whether or not it is a requirement for word selection. For those languages which don’t have a preferred order, there is no reason why one can’t be imposed anyway. Even for those languages which use adjective order to indicate emphasis, there is no need to preserve the original order of the predicates, because that would not be a correct way to deduce what should be emphasised.

However I can see one situation where reordering won’t help. Where a noun encompasses the meaning of one or more adjectives (such as ‘lamb’ or ‘ewe’ in place of ‘sheep’) there is no guarantee that the predicates replaced will be canonically adjacent to each other (for example ‘young black sheep’). For this reason I think improvements to the word selection algorithm will be needed, even if canonicalisation is introduced too.

Regarding the question of associativity, (colour:dark (colour:orange bio:genus:vulpes)) is certainly acceptable: it merely applies the three predicates in sequence. As they are all descriptive and do not contradict each other there is no reason why this shouldn’t happen, but it is not something that the translation system can handle currently.

The colour in isolation, however, must be expressed as (colour:dark colour:orange) (or vice versa), because there is no other way in which two predicates can be combined. It follows that all of these forms need to be matched. The options are similar to before, except that there will almost certainly need to be changes to word selection (because the current system cannot replace a set of predicates which do not form a subtree).

It has occurred to me that I may be making life unnecessarily difficult for myself by using an explicit binary tree structure as opposed to something more akin to the list structures used in Lisp. In the latter case there is a terminator at the end of the list, so there is no structural difference when colour:dark and colour:orange are applied to each other or applied to something else. This would be a radical change that would affect the whole translation system, but I think it is worth considering.
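As a rough Python illustration of why a flat list removes the structural difference, nested tuples can stand in for the binary tree and an ordinary list for the Lisp-style structure. This is a sketch of the idea only, and it deliberately discards the grouping information that a real change would have to think about more carefully.

```python
def flatten(expr):
    """Flatten a nested pair structure into a flat list of predicate names."""
    if isinstance(expr, str):
        return [expr]
    result = []
    for child in expr:
        result.extend(flatten(child))
    return result


# The same three predicates, grouped in two different ways.
left_grouped = (("colour:dark", "colour:orange"), "bio:genus:vulpes")
right_grouped = ("colour:dark", ("colour:orange", "bio:genus:vulpes"))

print(flatten(left_grouped))
print(flatten(right_grouped))
```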

Hedges

Sunday, August 9th, 2009

In fuzzy logic, a ‘hedge’ is a type of unary operator which acts on the membership function of a fuzzy set. Hedges typically correspond in meaning to adverbs such as ‘very’ and ‘slightly’. Effects may include displacing the peak of the membership function, or causing it to become more concentrated or spread out.

Many predicates need to have fuzzy semantics if they are to accurately represent their linguistic counterparts. For example, colour:bright is fully true for colours with a saturation and value of 100%, but would still be mostly true if the value were lowered to 90%. The way I envisage the membership function to be defined, it would become completely false only on reaching the grey line. Similar considerations apply to colour:pale and colour:dark.
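One membership function with the properties just described is the product of saturation and value. The Python sketch below is an assumption made for illustration, not a definition that the predicates are committed to.

```python
def bright(saturation, value):
    """Assumed membership function for colour:bright (inputs in 0..1)."""
    return saturation * value


print(bright(1.0, 1.0))  # fully saturated, full value
print(bright(1.0, 0.9))  # value lowered to 90%
print(bright(0.0, 0.7))  # on the grey line
```

The result is 1 only when both saturation and value are 100%, falls off gradually as either is reduced, and is 0 whenever the saturation is 0 or the colour is black.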

Because there are degrees of truth to these predicates it is possible to talk of colours which are (for example) ‘very bright’ or ‘slightly dark’. All that is needed are predicates to represent the required hedges. Here are the ones I intend to make available in the first instance:

  • not[1] inverts the membership function, so that members become non-members and vice versa. It is normally implemented by the function f(x) = 1−x.
  • very concentrates the membership function, peaking at the highest possible values. It is often implemented by the function f(x) = x² but there are other ways to define it.
  • slightly concentrates the membership function, peaking at what would otherwise be the lowest possible values. A possible definition is (very not).
  • moderately concentrates the membership function, peaking at what would otherwise be middling values. A possible definition is the intersection of (not slightly) and (not very).
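The four hedges can be sketched in Python as functions on a membership value between 0 and 1. The definitions of not and very are the common ones quoted above, and slightly is taken as (very not). For moderately the sketch takes the pointwise minimum of (not slightly) and (not very), since that is the combination which peaks at middling values; this choice is an assumption of the sketch rather than a committed definition.

```python
def not_(x):
    """Invert the membership value."""
    return 1.0 - x


def very(x):
    """Concentrate towards the highest values (one common definition)."""
    return x ** 2


def slightly(x):
    """(very not): peaks where the original value is lowest."""
    return very(not_(x))


def moderately(x):
    """Assumed here: the minimum of (not slightly) and (not very)."""
    return min(not_(slightly(x)), not_(very(x)))


for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, not_(x), very(x), slightly(x), moderately(x))
```

Under these definitions moderately is 0 at both extremes and reaches its maximum of 0.75 at the midpoint.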

Hedges are sufficiently fundamental that I see no objection to their being placed in the global namespace.

Given that slightly and moderately could be defined in terms of other predicates it may be asked whether they are necessary. I’ve included them for three reasons: in the interests of brevity, to avoid committing to any particular definition of the membership function, and because I’ve not seen any evidence that decomposing these hedges would be of value linguistically.

On the other hand, I do intend to make use of constructs such as (not very) and (very slightly) because equivalent forms are used in many languages.

I’ve not (yet) defined a somewhat hedge because I’m not convinced that the way this term is used in fuzzy logic (typically implemented by means of the square root function) corresponds to ‘somewhat’ in English, or indeed to any word that I have been able to identify. To my mind, if there were to be a somewhat hedge then it should peak between slightly and moderately.

A potentially useful benefit of defining very to mean f(x) = x² is that it allows ((very dark) orange) to be re-expressed as (dark dark orange), which could then be translated as ‘dark brown’. I’m not going to commit to saying that very is defined this way, because I want to avoid that level of detail. I will say that it is an acceptable approximation, which is all the translation system needs to know in order to make use of it (since translation is not an exact process).
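The rewrite can be sketched in Python as follows, using nested tuples for the source expression and a flat list for the result. The function name is hypothetical and the namespace prefixes are omitted to match the notation used in the paragraph above.

```python
def expand_very(expr):
    """Rewrite (very p) as (p p), returning a flat list of predicates."""
    if isinstance(expr, str):
        return [expr]
    if len(expr) == 2 and expr[0] == "very":
        inner = expand_very(expr[1])
        return inner + inner  # squaring corresponds to applying p twice
    result = []
    for child in expr:
        result.extend(expand_very(child))
    return result


print(expand_very((("very", "dark"), "orange")))
```

The doubled predicate relies on very being approximated by squaring, so it is only as accurate as that approximation.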

There will probably need to be restrictions on when and where these predicates can be used, and in what combinations. For example, whereas (very dark) makes sense and ought to be permitted, (very (cardinal 5)) does not (because being fivefold is not an attribute of degree). I’ll return to this subject when I have more experience of implementation.

[1] Some sources class not as a hedge, others as part of a larger class of functions called modifiers.

Names of Colours part 3: Darkness, Lightness and Brightness

Friday, July 17th, 2009

The corners of the triangle that I described previously were black, white, and the saturated hue. The areas around those corners can be described by the adjectives ‘dark’, ‘pale’ (or ‘light’) and ‘bright’. My initial attempt at a system of predicates to describe regions of this triangle will be based on those adjectives:

  • colour:dark
  • colour:pale
  • colour:bright

These can be combined with each other, and also with hedge operators such as ‘very’ and ‘slightly’. I’ve chosen the word ‘pale’ rather than ‘light’ because being both ‘dark’ and ‘light’ sounds like a contradiction in terms, whereas ‘dark’ and ‘pale’ suggests the desired meaning of ‘greyish’. As always, the choice of predicate name does not necessarily indicate how the concept should be translated. I’ve not yet decided what hedges to provide or how to define them, but will do soon.

There is a complication, in that the triangle is not symmetrical. Consider shades of blue. Whereas the bright corner represents the best example of ‘bright blue’, the dark corner does not qualify at all as ‘dark blue’ because it is actually pure black. Similarly, the pale corner is pure white. Put another way, the unqualified colour ‘blue’ doesn’t refer equally to all parts of the triangle. Instead it is most true for the region around the bright corner, and not true at all on the grey edge that connects the other two corners.

For this reason I think it is necessary to define the hue predicates as if they had a built-in qualification of colour:bright. That gives approximately the right semantics for representing unqualified basic colours such as ‘red’, ‘yellow’, ‘green’ and ‘blue’. Further desirable effects are that:

  • Explicit qualification using colour:bright would reinforce the implicit qualification, placing the colour very close to the bright corner, whereas qualification using colour:dark or colour:pale would pull it only half-way down the triangle.
  • Qualification using one of colour:dark or colour:pale will cause the colour to peak at the edge of the triangle, as opposed to a band crossing the interior.
  • However dark or pale a colour is made, it cannot reach the grey edge of the triangle.

This raises the question of how you do represent colours along the grey edge of the triangle. As hue is mathematically indeterminate along that line I have no problem with treating these colours as special cases:

  • colour:black
  • colour:white
  • colour:grey

I’m leaving it open whether colour:black and colour:white are shorthand for extreme shades of grey, or whether they represent distinct concepts. Either way, they are important enough that they should not need to be explicitly synthesised.

That, I think, is as much as I want to specify at present. I don’t intend to tie the predicates to an absolute colour space (such as sRGB), partly because I haven’t been able to formulate any defensible basis for selecting one, but mainly because I’m not convinced that the colour terms typically used in natural languages imply any particular colour space. Similarly, I’ve avoided making any distinction between transmitted and reflected light (although there will be a need to address concepts such as transparency and reflectivity at some point in the future).

Names of Colours part 2: Hues

Wednesday, July 15th, 2009

Having established that one of the components of a colour should be its hue, it is necessary to decide what hues to provide, at what level or levels of precision, and how to name them.

In the first instance I intend to provide predicates for hues at 30° intervals and with the following names:

  • colour:red (0°)
  • colour:orange (30°)
  • colour:yellow (60°)
  • colour:chartreuse (90°)
  • colour:green (120°)
  • colour:spring-green (150°)
  • colour:cyan (180°)
  • colour:azure (210°)
  • colour:blue (240°)
  • colour:violet (270°)
  • colour:magenta (300°)
  • colour:rose (330°)
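As an illustration of the 30° spacing, the following Python sketch maps an arbitrary hue angle to the nearest of the twelve predicates. It treats the predicates as crisp bands, which is a simplification, and sending exact midpoints (such as 15°) to the higher hue is an arbitrary assumption.

```python
HUES = [
    "red", "orange", "yellow", "chartreuse", "green", "spring-green",
    "cyan", "azure", "blue", "violet", "magenta", "rose",
]


def nearest_hue(angle):
    """Return the predicate whose nominal hue is closest to angle (degrees)."""
    index = int(((angle % 360) + 15) // 30) % 12
    return "colour:" + HUES[index]


print(nearest_hue(200))
print(nearest_hue(350))
```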

This ought to provide sufficient resolution for the colours that are commonly used in most natural languages. It won’t be sufficient for very specific colour names like ‘burnt umber’ or ‘cerulean blue’, but I have my doubts as to whether it is feasible to translate those at all. I’m open to the possibility of narrowing down to 15° or less if that would allow ordinary colours to be translated more accurately, but would want to see evidence.

I want to stress that just because a particular colour name has been used for a predicate does not mean that the same name will be used when it is translated into English, nor that languages need to provide distinct readings for all twelve hues. For example, my expectation is that the default English translations for colour:azure and colour:blue will both be ‘blue’.

It will be desirable to have some method available for expressing precision because there are some situations where you do need to distinguish between ‘azure’ and ‘blue’, however:

  1. in the case of colours this can and probably should be done by qualifying the predicates listed above in some way (not by creating a separate set of low-precision predicates); and
  2. rather than being specific to colours, it would be preferable for the method to be a generic one.

In view of the latter point I’m going to defer the issue of precision and treat it as a separate topic. In the meantime, any readings that are created will assume a ‘basic’ level of precision (meaning the level that would typically be used in the absence of any reason to be more specific).

Names of Colours part 1: Colour Models

Tuesday, July 14th, 2009

Colours were one of the first types of concept that I tried to define predicates for, but it quickly became apparent that there were a number of quite complex issues to resolve before any firm decisions could be made. These include:

  • which of the many possible colour models to use as the basis for the system of predicates;
  • whether to provide one level of precision or several, and what those levels should be;
  • when to synthesise and when to enumerate; and
  • how to handle characteristics which cannot be made fully independent of each other.

From a programmer’s perspective the simplest and most familiar colour model is RGB, but that would not be a good choice for this application because its purpose is to model the physical composition of light, not how it is perceived and verbalised. For example, while it is quite common in English for cyan to be called ‘blue-green’, I’m not aware of any natural language which expresses yellow as ‘red-green’ or white as ‘red-green-blue’.

More promising candidates are HSL (Hue, Saturation, Lightness) and HSV (Hue, Saturation, Value). They more closely correspond to the surface forms that are commonly found in natural languages because of their separation of hue from the other two attributes. Examples include ‘knallrot’ (German, bright red), ‘vert foncé’ (French, dark green) and ‘glas golau’ (Welsh, light blue).

The main advantage of HSL is that it treats lightness and darkness on an equal footing, whereas HSV does not. Although both systems have a component called ‘saturation’ they define it in different ways. In HSL especially the behaviour of this component can be counterintuitive because colours can be 100% saturated even if they are very pale. Both systems allow colours to be 100% saturated but very dark.

These idiosyncrasies are the result of mapping a coordinate space which is naturally triangular onto a square. The three corners of the triangle are black, white, and the saturated hue. All three edges of the triangle are smooth gradients, whereas this is only true for two (HSL) or three (HSV) edges of the square. Because of this I don’t intend to relate the system of predicates directly to HSL or HSV, but instead to the underlying triangular coordinate space.
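One way to picture the triangular space is as a mixture of its three corners. For a given hue, an HSV colour with saturation s and value v is the sum of s×v parts of the saturated hue, (1−s)×v parts of white and 1−v parts of black. The Python sketch below computes those weights; treating them as the coordinates of the triangle is an assumption made for illustration, and the function name is hypothetical.

```python
def hsv_to_triangle(saturation, value):
    """Return (black, white, hue) weights summing to 1, for s and v in 0..1."""
    hue_weight = saturation * value
    white_weight = (1.0 - saturation) * value
    black_weight = 1.0 - value
    return black_weight, white_weight, hue_weight


print(hsv_to_triangle(1.0, 1.0))  # the saturated hue corner
print(hsv_to_triangle(0.0, 1.0))  # the white corner
print(hsv_to_triangle(0.5, 0.0))  # the black corner, whatever the saturation
```

Every colour with a value of zero lands on the same black corner regardless of its saturation, which is one place where the square collapses onto the triangle.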

Update 2009-07-15: Two colour systems which I should have mentioned are the ISCC-NBS and CNS systems. In some ways they are both very similar to the system I have chosen, in that they attempt to represent colours in a manner that is both systematic and linguistically meaningful. Unfortunately both of them are quite biased towards English (ISCC-NBS more so than CNS), and consequently are rather less systematic than they could have been. I did consider using a notation very similar to CNS for representing lightness and saturation, and may yet do so if current plans fail to deliver, but I wanted a system which allowed more extensive use of hedges and that was not directly based on HSL.

Predicate Orthogonality

Sunday, June 14th, 2009

Words in natural languages often represent a combination of distinct concepts which could otherwise be expressed independently of each other. For the most part I intend to avoid this practice when defining predicates, keeping independent concepts separate unless there is good reason to do otherwise.

For example, a ‘colt’ is an animal which is (a) male, (b) under four years of age, and (c) a member of the subspecies Equus ferus caballus. Each of these concepts is capable of being changed independently of the others, so it makes sense for each to be represented by a separate predicate. This increases the length of the source text, but has two notable benefits:

  • It reduces the number of predicates that need to be provided by the source language.
  • By making the source text semantics more explicit, it reduces the likelihood that they will be unnecessarily overspecified.

The second point is an important one because different languages do not necessarily provide words that express the same combinations of concepts. For example, some languages might have a word for Equus ferus caballus, but not for male, female, young or old members of that subspecies. In that case the word selection algorithm needs to know whether it is sufficient to identify the animal as a ‘horse’ or necessary to describe it as a ‘young male horse’. For this to be possible the source text must distinguish between essential and optional attributes, and that is most easily achieved when the attributes in question are expressed separately.

(To be clear, the word selection algorithm isn’t able to support this behaviour yet, but I’m designing the source language on the assumption that it will need to in the future.)
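As a minimal Python sketch of the kind of behaviour this implies, the function below covers the essential predicates with as few words as possible and treats optional predicates as usable but not required. It is not the existing word selection algorithm, the greedy strategy is just one possibility, and the predicate names (sex:male, age:young, zoo:horse) and readings tables are hypothetical.

```python
# Hypothetical readings: word -> set of predicates that it expresses.
ENGLISH = {
    "colt": {"sex:male", "age:young", "zoo:horse"},
    "horse": {"zoo:horse"},
    "young": {"age:young"},
    "male": {"sex:male"},
}

# A language with no single word for a young male horse.
OTHER = {word: preds for word, preds in ENGLISH.items() if word != "colt"}


def select_words(essential, optional, readings):
    """Greedily cover the essential predicates; optional ones may be dropped."""
    remaining = set(essential)
    available = set(essential) | set(optional)
    words = []
    while remaining:
        candidates = [
            (len(covered), word)
            for word, covered in readings.items()
            if covered <= available and covered & remaining
        ]
        if not candidates:
            raise ValueError(f"no reading covers {sorted(remaining)}")
        _, word = max(candidates)  # prefer the reading that covers the most
        words.append(word)
        remaining -= readings[word]
        available -= readings[word]
    return words  # the order of the words is not meaningful here


everything = {"zoo:horse", "sex:male", "age:young"}
print(select_words(everything, set(), ENGLISH))
print(select_words({"zoo:horse"}, {"sex:male", "age:young"}, OTHER))
print(select_words(everything, set(), OTHER))
```

Because the matching is done on sets, it does not depend on the replaced predicates being adjacent to each other in the source text.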

Maximally independent sets of concepts can be said to be ‘orthogonal’, and typically that is what I’ll be aiming for. However it is possible to have too much of a good thing, and there are some concepts for which a strictly orthogonal approach may be more trouble than it is worth. Examples include:

  • the points of the compass, which could be separated into north-south and east-west components. This would reduce the number of predicates needed, but only by a small number, and at the expense of significant verbosity. On balance I think it will be simpler and easier just to list the available directions.
  • colours, for which I think some decomposition will be useful, but only up to a point. For example, I think it will be useful to separate hue from saturation and value (or lightness), but I would not want to define yellow as being half-way between green and red.

Finally, it is worth noting that expressions with substantial idiomatic content cannot and should not be decomposed into orthogonal components. For example, ‘North Dakota’ means more than simply the intersection of ‘North’ and ‘Dakota’: it refers to a very specific geopolitical entity. It would be difficult to define ‘North Dakota’ in terms of other predicates, and impossible to do so concisely, so the only viable option is to provide a separate predicate dedicated to that meaning.

Geography: Names of Continents

Thursday, April 23rd, 2009

Having decided to provide fallbacks for plant and animal species based on attributes such as geographical range, I now need predicates to represent those attributes - preferably before creating large numbers of readings rather than afterwards. I have begun with ones to represent the continents of the world.

The format I have chosen for the predicate names is:

geog:continent:<name>

Unfortunately there is no consensus as to what criteria a body of land must satisfy in order to qualify as a separate continent. For example, some people consider Europe to be a continent in its own right, but to others it is merely part of a larger continent called Eurasia. A third view is that Europe, Asia and Africa are all part of a single continent called Afro-Eurasia.

I don’t feel that it is necessary or appropriate for the translation system to take sides on this issue. Europe, Eurasia and Afro-Eurasia are all useful concepts which people might want to talk about, whether or not they qualify as continents according to any particular definition. For this reason I have decided on an inclusive approach which is able to accommodate any of the plausible systems that are in common use. The full list is:

geog:continent:africa
geog:continent:afro-eurasia
geog:continent:america
geog:continent:antarctica
geog:continent:asia
geog:continent:australia
geog:continent:eurasia
geog:continent:europe
geog:continent:north-america
geog:continent:south-america

A few points to note:

  • I’ve chosen ‘America’ (as opposed to ‘The Americas’) as the basis for the corresponding predicate name because the namespace makes clear that it refers to the continent, not the country. This decision need not have any bearing on how the name is translated (and indeed, my expectation is that it would normally be rendered in English as ‘The Americas’).
  • Whilst ‘Oceania’ is undoubtedly a useful concept which ought to be represented somehow, it isn’t a continent by any reasonable definition.
  • Similarly ‘Australasia’ is not a continent because it encompasses both Australia and New Zealand.
  • I’m deferring consideration of microcontinents, historical continents and mythical continents for the time being on the grounds that there are more important topics to address first.
  • The precise borders of each continent are unspecified, on the grounds that they are often hard to define and have no bearing on what the continent is called. However I am treating ‘continent’ as a term of physical geography, not a social or geopolitical one. On that basis Switzerland is unquestionably part of Europe (but outside the EU), whereas French Guiana is not (despite being part of the EU).

The semantics that I’m provisionally attaching to these predicates is that they are true when applied to the corresponding continent itself. This is a departure from previous policy, in that it associates the predicate with the noun rather than the adjective (so, for example, geog:continent:europe translates to ‘Europe’ rather than ‘European’). The reason for this approach is the difficulty of defining the adjectival form, about which I will say more soon.