GenI-0.23.20130212: A natural language generator (specifically, an FB-LTAG surface realiser)

Safe Haskell: Safe-Inferred

NLP.GenI.Builder

Description

The heavy lifting of GenI, the whole chart/agenda mechanism, can be implemented in many ways. To make it easier to write different algorithms for GenI and compare them, we provide a single interface for what we call Builders.

This interface is then used by the Geni module and by the graphical interface. Note that each builder has its own graphical interface, and that the graphical interface code provides a similar common interface so that these GUIs can be used.

Documentation

data Builder st it

Constructors

Builder 

Fields

init :: Input -> [Flag] -> (st, Statistics)

initialise the machine from the semantics and lexical selection

step :: BuilderState st ()

run a realisation step

stepAll :: BuilderState st ()

run all realisation steps until completion

finished :: st -> GenStatus

determine if realisation is finished

unpack :: st -> [Output]

unpack chart results into a list of sentences

partial :: st -> [Output]
 

data GenStatus

Constructors

Finished 
Active 
Error Text 

lexicalSelection :: TagDerivation -> [Text]

The names of lexically selected chart items used in a derivation

data FilterStatus a

Constructors

Filtered 
NotFiltered a 

(>-->) :: Monad s => DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a

Sequence two dispatch filters.

defineSemanticBits :: Sem -> SemBitMap

Assign a bit vector value to each literal in the semantics. The resulting map can then be used to construct a bit vector representation of the semantics.
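The idea can be illustrated with a minimal sketch. It is not GenI's implementation: `Literal`, `BitVector` and `SemBitMap` are simplified stand-ins (strings, `Integer` and an association list), and `defineSemanticBitsSketch` and `semToBitVectorSketch` are hypothetical names chosen for this example.

```haskell
import Data.Bits (bit, (.|.))

-- Hypothetical stand-ins: literals are plain strings, bit vectors are Integers,
-- and the map is a simple association list.
type Literal   = String
type BitVector = Integer
type SemBitMap = [(Literal, BitVector)]

-- Give the n-th literal (counting from 0) the value with only bit n set.
defineSemanticBitsSketch :: [Literal] -> SemBitMap
defineSemanticBitsSketch sem = zip sem (map bit [0 ..])

-- OR together the bits of the given literals; unknown literals contribute 0.
semToBitVectorSketch :: SemBitMap -> [Literal] -> BitVector
semToBitVectorSketch bmap =
  foldr (\l acc -> maybe 0 id (lookup l bmap) .|. acc) 0

-- A three-literal semantics used for illustration only.
exampleSem :: [Literal]
exampleSem = ["john(j)", "sleep(e,j)", "loudly(e)"]

exampleMap :: SemBitMap
exampleMap = defineSemanticBitsSketch exampleSem
```

With these definitions, the three literals receive the values 1, 2 and 4, so `semToBitVectorSketch exampleMap ["john(j)", "loudly(e)"]` evaluates to 5.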

type DispatchFilter s a = a -> s (FilterStatus a)

Dispatching consists of assigning a chart item to the right part of the chart (agenda, trash, results list, etc). This is implemented as a series of filters which can either fail or succeed. If a filter fails, it may modify the item before passing it on to future filters.

condFilter :: Monad s => (a -> Bool) -> DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a

If the item meets some condition, use the first filter, otherwise use the second one.
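The following sketch shows how dispatch filters might compose. The type definitions mirror the signatures in this module, but the bodies of `>-->` and `condFilter` are assumptions based on the descriptions here, not GenI's source. The filters `dropNegative`, `double` and `keep`, and the use of `Int` as the item type, are purely hypothetical.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Simplified stand-ins for the types documented in this module.
data FilterStatus a = Filtered | NotFiltered a
  deriving (Eq, Show)

type DispatchFilter s a = a -> s (FilterStatus a)

-- Assumed semantics: run the first filter; if it claims the item, stop.
-- Otherwise hand the (possibly modified) item to the second filter.
(>-->) :: Monad s => DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a
f1 >--> f2 = \x -> f1 x >>= next
  where
    next Filtered        = return Filtered
    next (NotFiltered y) = f2 y

-- Assumed semantics: choose between two filters based on a predicate.
condFilter :: Monad s
           => (a -> Bool) -> DispatchFilter s a -> DispatchFilter s a -> DispatchFilter s a
condFilter cond f1 f2 = \x -> if cond x then f1 x else f2 x

-- Hypothetical filters over Int "items".
dropNegative :: DispatchFilter Identity Int
dropNegative x
  | x < 0     = return Filtered
  | otherwise = return (NotFiltered x)

double :: DispatchFilter Identity Int
double x = return (NotFiltered (x * 2))

keep :: DispatchFilter Identity Int
keep = return . NotFiltered

-- Drop negative items, then double the even ones and pass odd ones through.
pipeline :: DispatchFilter Identity Int
pipeline = dropNegative >--> condFilter even double keep

runPipeline :: Int -> FilterStatus Int
runPipeline = runIdentity . pipeline
```

Under these assumptions, `runPipeline (-1)` gives `Filtered`, `runPipeline 4` gives `NotFiltered 8`, and `runPipeline 3` gives `NotFiltered 3`. In a real builder the monad `s` would carry the chart state rather than being `Identity`.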

defaultStepAll :: Builder st it -> BuilderState st ()

Default implementation for the stepAll function in Builder
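A stepping loop of this kind can be sketched as follows. This is a deliberately simplified, pure model: it assumes that the default behaviour is to keep stepping while the builder reports `Active`, and it replaces the `BuilderState` monad with a plain `st -> st` function. `MiniBuilder`, `stepAllSketch` and `agendaBuilder` are hypothetical names, and `Error` carries a `String` here rather than `Text`.

```haskell
-- Simplified stand-in for GenStatus (String instead of Text in Error).
data GenStatus = Finished | Active | Error String
  deriving (Eq, Show)

-- Hypothetical cut-down builder: a pure step function and a status check.
data MiniBuilder st = MiniBuilder
  { miniStep     :: st -> st
  , miniFinished :: st -> GenStatus
  }

-- Keep stepping while the builder reports Active; stop on Finished or Error.
stepAllSketch :: MiniBuilder st -> st -> st
stepAllSketch b s =
  case miniFinished b s of
    Active -> stepAllSketch b (miniStep b s)
    _      -> s

-- Hypothetical builder whose state is (agenda, results): each step moves
-- one item from the front of the agenda onto the results list.
agendaBuilder :: MiniBuilder ([Int], [Int])
agendaBuilder = MiniBuilder
  { miniStep = \(agenda, results) -> case agenda of
      []       -> (agenda, results)
      (x : xs) -> (xs, x : results)
  , miniFinished = \(agenda, _) -> if null agenda then Finished else Active
  }
```

For example, `stepAllSketch agendaBuilder ([1, 2, 3], [])` evaluates to `([], [3, 2, 1])`: three steps are taken, after which the agenda is empty and the builder reports `Finished`.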

data Input

To simplify interaction with the backend, we provide a single data structure which represents all the inputs a backend could take.

Constructors

Input 

Fields

inSemInput :: SemInput
 
inLex :: [LexEntry]

for the debugger

inCands :: [(TagElem, BitVector)]

tag tree

unlessEmptySem :: Input -> [Flag] -> a -> a

Equivalent to id unless the input contains an empty or uninstantiated semantics

type SentenceAut = NFA Int LemmaPlus

A SentenceAut represents a set of sentences in the form of an automaton. The labels of the automaton are the words of the sentence. Note, however, that a "word" in the sentence is in fact a tuple (lemma, inflectional feature structures). The states are normally integers, the only requirement being that each one is unique.
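To make the data structure concrete, here is a small sketch of an automaton with integer states and (lemma, features) labels, together with a function that enumerates the sentences it encodes. `MiniAut`, `Features` and `autSentences` are hypothetical stand-ins written for this example. They are not the `NFA` or `LemmaPlus` types used by GenI, and the enumeration assumes an acyclic automaton.

```haskell
-- Hypothetical stand-ins for LemmaPlus and for the NFA type.
type Features  = [(String, String)]
type LemmaPlus = (String, Features)

data MiniAut = MiniAut
  { startState  :: Int
  , finalStates :: [Int]
  , transitions :: [(Int, LemmaPlus, Int)]
  }

-- Enumerate every label sequence from the start state to a final state.
-- Assumes the automaton is acyclic; a cycle would make this loop forever.
autSentences :: MiniAut -> [[LemmaPlus]]
autSentences aut = go (startState aut)
  where
    go s =
      [ [] | s `elem` finalStates aut ] ++
      [ w : rest | (from, w, to) <- transitions aut, from == s, rest <- go to ]

-- One automaton encoding two sentences that share their first word.
exampleAut :: MiniAut
exampleAut = MiniAut
  { startState  = 0
  , finalStates = [2]
  , transitions =
      [ (0, ("john",  [("num", "sg")]),     1)
      , (1, ("sleep", [("tense", "pres")]), 2)
      , (1, ("snore", [("tense", "pres")]), 2)
      ]
  }

-- Project each sentence down to its lemmas.
exampleLemmas :: [[String]]
exampleLemmas = map (map fst) (autSentences exampleAut)
```

Here `exampleLemmas` evaluates to `[["john", "sleep"], ["john", "snore"]]`. The shared prefix is stored only once in the automaton, which is the point of representing a set of sentences this way.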

run :: Builder st it -> Input -> [Flag] -> (st, Statistics)

Performs surface realisation from an input semantics and a lexical selection.

Statistics tracked

  • pol_used_bundles - number of bundled paths through the polarity automaton. See automatonPathSets
  • pol_used_paths - number of paths through the final automaton
  • pol_seed_paths - number of paths through the seed automaton (i.e. with no polarities). This is normally just 1, unless you have multi-literal semantics
  • pol_total_states - combined number of states in all the polarity automata
  • pol_total_tras - combined number of transitions in all polarity automata
  • pol_max_states - number of states in the polarity automaton with the most states
  • pol_max_tras - number of transitions in the polarity automaton with the most transitions
  • sem_literals - number of literals in the input semantics
  • lex_trees - total number of lexically selected trees