Using ACE from PyDelphin

ACE is one of the most efficient processors for DELPH-IN grammars, and has an impressively fast start-up time. PyDelphin tries to make it easier to use ACE from Python with the delphin.ace module, which provides functions and classes for compiling grammars, parsing, transfer, and generation.

In this guide, delphin.ace is assumed to be imported as ace, as in the following:

>>> from delphin import ace

Compiling a Grammar

The compile() function can be used to compile a grammar from its source. It takes two arguments, the location of the ACE configuration file and the path of the compiled grammar to be written. For instance (assume the current working directory is the grammar directory):

>>> ace.compile('ace/config.tdl', 'zhs.dat')

This is equivalent to running the following from the commandline (again, from the grammar directory):

[~/zhong/cmn/zhs/]$ ace -g ace/config.tdl -G zhs.dat

All of the following topics assume that a compiled grammar exists.

Parsing

The ACE interface handles the interaction between Python and ACE, giving ACE the arguments to parse and then interpreting the output back into Python data structures.

The easiest way to parse a single sentence is with the parse() function. Its first argument is the path to the compiled grammar, and the second is the string to parse:

>>> response = ace.parse('zhs.dat', '狗 叫 了')
>>> len(response['results'])
8
>>> response['results'][0]['mrs']
'[ LTOP: h0 INDEX: e2 [ e SF: prop-or-ques E.ASPECT: perfective ] RELS: < [ "_狗_n_1_rel"<0:1> LBL: h4 ARG0: x3 [ x SPECI: + SF: prop COG-ST: uniq-or-more PNG.PERNUM: pernum PNG.GENDER: gender PNG.ANIMACY: animacy ] ]  [ generic_q_rel<-1:-1> LBL: h5 ARG0: x3 RSTR: h6 BODY: h7 ]  [ "_叫_v_3_rel"<2:3> LBL: h1 ARG0: e2 ARG1: x3 ARG2: x8 [ x SPECI: bool SF: prop COG-ST: cog-st PNG.PERNUM: pernum PNG.GENDER: gender PNG.ANIMACY: animacy ] ] > HCONS: < h0 qeq h1 h6 qeq h4 > ICONS: < e2 non-focus x8 > ]'

Notice that the response is a Python dictionary. They are in fact a subclass of dictionaries with some added convenience methods. Using dictionary access methods returns the raw data, but the function access can simplify interpretation of the results. For example:

>>> len(response.results())
8
>>> response.result(0).mrs()
<Mrs object (狗 generic 叫) at 2567183400998>

These response objects are described in the documentation for the interface module.

In addition to single sentences, a sequence of sentences can be parsed, yielding a sequence of results, using parse_from_iterable():

>>> for response in ace.parse_from_iterable('zhs.dat', ['狗 叫 了', '狗 叫']):
...     print(len(response.results()))
...
8
5

Both parse() and parse_from_iterable() use the ACEParser class for interacting with ACE. This class can also be instantiated directly and interacted with as long as the process is open, but don’t forget to close the process when done.

>>> parser = ace.ACEParser('zhs.dat')
>>> len(parser.interact('狗 叫 了').results())
8
>>> parser.close()
0

The class can also be used as a context manager, which removes the need to explicitly close the ACE process.

>>> with ace.ACEParser('zhs.dat') as parser:
...     print(len(parser.interact('狗 叫 了').results()))
...
8

The ACEParser class and parse() and parse_from_iterable() functions all take additional arguments for affecting how ACE is accessed, e.g., for selecting the location of the ACE binary, setting command-line options, and changing the environment variables of the subprocess:

>>> with ace.ACEParser('zhs-0.9.26.dat',
...                    executable='/opt/ace-0.9.26/ace',
...                    cmdargs=['-n', '3', '--timeout', '5']) as parser:
...     print(len(parser.interact('狗 叫 了').results()))
...
5

See the delphin.ace module documentation for more information about options for ACEParser.

Generation

Generating sentences from semantics is similar to parsing, but the simplemrs serialization of the semantics is given as input instead of sentences. You can generate from a single semantic representation with generate():

>>> m = '''
... [ LTOP: h0
...   RELS: < [ "_rain_v_1_rel" LBL: h1 ARG0: e2 [ e TENSE: pres ] ] >
...   HCONS: < h0 qeq h1 > ]'''
>>> response = ace.generate('erg.dat', m)
>>> response.result(0)['surface']
'It rains.'

The response object is the same as with parsing. You can also generate from a list of MRSs with generate_from_iterable():

>>> responses = list(ace.generate_from_iterable('erg.dat', [m, m]))
>>> len(responses)
2

Or instantiate a generation process with ACEGenerator:

>>> with ace.ACEGenerator('erg.dat') as generator:
...     print(generator.iteract(m).result(0)['surface'])
...
It rains.

Transfer

ACE also implements most of the LOGON transfer formalism, and this functionality is available in PyDelphin via the ACETransferer class and related functions. In the current version of ACE, transfer does not return as much information as with parsing and generation, but the response object in PyDelphin is the same as with the other tasks.

>>> j_response = ace.parse('jacy.dat', '雨 が 降る')
>>> je_response = ace.transfer('jaen.dat', j_response.result(0)['mrs'])
>>> e_response = ace.generate('erg.dat', je_response.result(0)['mrs'])
>>> e_response.result(0)['surface']
'It rains.'

Tips and Tricks

Sometimes the input data needs to be modified before it can be parsed, such as the morphological segmentation of Japanese text. Users may also wish to modify the results of processing, such as to streamline an MRS–DMRS conversion pipeline. The former is an example of a preprocessor and the latter a postprocessor. There can also be “coprocessors” that execute alongside the original, such as for returning the result of a statistical parser when the original fails to reach a parse. It is straightforward to accomplish all of these configurations with Python and PyDelphin, but the resulting pipeline may not be compatible with other interfaces, such as TestSuite.process(). By using the delphin.interface.Process class to wrap an ACEProcess instance, these pre-, co-, and post-processors can be implemented in a more useful way. See Wrapping a Processor for Preprocessing for an example of using Process as a preprocessor.

Troubleshooting

Some environments have an encoding that isn’t compatible with what ACE expects. One way to mitigate this issue is to pass in the appropriate environment variables via the env parameter. For example:

>>> import os
>>> env = os.environ
>>> env['LANG'] = 'en_US.UTF8'
>>> ace.parse('zhs.dat', '狗 叫 了', env=env)