GenI-0.23.20130212: A natural language generator (specifically, an FB-LTAG surface realiser)

Safe HaskellSafe-Infered

NLP.GenI

Contents

Description

This is the interface between the front and backends of the generator. The GUI and the console interface both talk to this module, and in turn, this module talks to the input file parsers and the surface realisation engine.

Synopsis

Main interface

Program state and configuration

data ProgState

The program state consists of its configuration options and abstract, cleaned up representations of all the data it's had to load into memory (tree schemata files, lexicon files, etc). The intention is for the state to stay static until the next time something triggers some file loading.

Constructors

ProgState 

Fields

pa :: Params

the current configuration

gr :: Macros

tree schemata

le :: Lexicon

lexical entries

morphinf :: MorphInputFn

function to extract morphological information from the semantics (you may instead be looking for customMorph)

traces :: [Text]

simplified traces (optional)

customMorph :: Maybe MorphRealiser
 

Instances

HasFlags ProgState

Note that this affects the geniFlags; we assume the morph flags are not our business

emptyProgState :: Params -> ProgState

The program state when you start GenI for the very first time

type LexicalSelector sem = Macros -> Lexicon -> sem -> IO LexicalSelection

See Configuration if you want to use GenI with a custom lexical selection function.

Running GenI

runGeni :: ProgState -> CustomSem sem -> Builder st it -> TestCase sem -> ErrorIO (GeniResults, st)

Entry point! (the most useful function to know here)

  • Initialises the realiser (lexical selection, among other things),
  • Runs the builder (the surface realisation engine proper)
  • Unpacks the builder results
  • Finalises the results (morphological generation)

In addition to the results, this returns a generator state. The latter is is mostly useful for debugging via the graphical interface. Note that we assumes that you have already loaded in your grammar and parsed your input semantics.

simplifyResults :: Either Text (GeniResults, st) -> GeniResults

simplifyResults $ runGenI...' for an easier time if you don't need the surface realiser state

defaultCustomSem :: ProgState -> IO (CustomSem SemInput)

Standard GenI semantics and lexical selection algorithm (with optional preanchored mode)

data GeniResults

GeniResults is the outcome of running GenI on a single input semantics. Each distinct result is returned as a single GeniResult (NB: a single result may expand into multiple strings through morphological post-processing),

Constructors

GeniResults 

Fields

grResults :: [GeniResult]

one per chart item

grGlobalWarnings :: [Text]

usually from lexical selection

grStatistics :: Statistics

things like number of chart items to help study efficiency

Instances

data GeniSuccess

Constructors

GeniSuccess 

Fields

grLemmaSentence :: LemmaPlusSentence

“original” uninflected result

grRealisations :: [Text]

results after morphology

grResultType :: ResultType
 
grWarnings :: [Text]

warnings “local” to this particular item, cf. grGlobalWarnings

grDerivation :: TagDerivation

derivation tree behind the result

grOrigin :: Integer

normally a chart item id

grLexSelection :: [GeniLexSel]

the lexical selection behind this result (info only)

grRanking :: Int

see OptimalityTheory

grViolations :: [OtViolation]

which OT constraints were violated

Helpers

initGeni :: ProgState -> CustomSem sem -> sem -> ErrorIO (Input, GeniWarnings)

initGeni performs lexical selection and strips the input semantics of any morpohological literals

See defaultCustomSem

extractResults :: ProgState -> Builder st it -> st -> IO [GeniResult]

This is a helper to runGenI. It's mainly useful if you are building interactive GenI debugging tools.

Given a builder state,

  • Unpacks the builder results
  • Finalises the results (morphological generation)

lemmaSentenceString :: GeniSuccess -> Text

No morphology! Pretend the lemma string is a sentence

showRealisations :: [String] -> String

Show the sentences produced by the generator, in a relatively compact form

histogram :: Ord a => [a] -> Map a Int

getTraces :: ProgState -> Text -> [Text]

getTraces is most likely useful for grammars produced by a metagrammar system. Given a tree name, we retrieve the `trace' information from the grammar for all trees that have this name. We assume the tree name was constructed by GenI; see the source code for details.

Loading things

loadEverything :: ProgStateRef -> CustomSem sem -> IO ()

We have one master function that loads all the files GenI is expected to use. This just calls the sub-loaders below, some of which are exported for use by the graphical interface. The master function also makes sure to complain intelligently if some of the required files are missing.

class Loadable x where

The file loading functions all work the same way: we load the file, and try to parse it. If this doesn't work, we just fail in IO, and GenI dies. If we succeed, we update the program state passed in as an IORef.

Methods

lParse

Arguments

:: FilePath

source (optional)

-> Text 
-> Either Text x 

lSet :: x -> ProgState -> ProgState

lSummarise :: x -> String

Instances

Loadable Lexicon 
Loadable Macros 
Loadable OtRanking 
Loadable PreAnchoredL 
Loadable TestSuiteL 
Loadable TracesL 
Loadable MorphFnL 

loadGeniMacros :: ProgStateRef -> IO Macros

The macros are stored as a hashing function in the monad.

loadFromString

Arguments

:: Loadable a 
=> ProgStateRef 
-> String

description

-> Text

string to load

-> IO a 

Load something from a string rather than a file