delphin.ace

See also

See Using ACE from PyDelphin for a more user-friendly introduction.

An interface for the ACE processor.

This module provides classes and functions for managing interactive communication with an open ACE process. The ACE software is required for the functionality in this module, but it is not included with PyDelphin. Pre-compiled binaries for Linux and macOS are available at http://sweaglesw.org/linguistics/ace/; for installation instructions, see https://github.com/delph-in/docs/wiki/AceInstall.

The ACEParser, ACETransferer, and ACEGenerator classes are used for parsing, transferring, and generating with ACE. All are subclasses of ACEProcess, which connects to ACE in the background, sends it data via its stdin, and receives responses via its stdout. Responses from ACE are interpreted so the data is more accessible in Python.

Warning

Instantiating ACEParser, ACETransferer, or ACEGenerator opens ACE in a subprocess, so take care to close the process (ACEProcess.close()) when finished or, preferably, instantiate the class in a context manager so it is closed automatically when the relevant code has finished.
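The two ways of releasing the ACE subprocess mentioned in this warning can be sketched as follows. This is a minimal illustration, not a prescribed pattern: the helper names interact_and_close and parse_in_context are made up for this example, and the grammar path is whatever compiled image you have on disk.

```python
def interact_and_close(process, datum):
    """Send one datum to an open ACE process, then close the process.

    `process` is any ACEProcess subclass instance (ACEParser,
    ACETransferer, or ACEGenerator). Hypothetical helper.
    """
    try:
        return process.interact(datum)
    finally:
        # Runs even if interact() raises, so ACE is not left running.
        process.close()


def parse_in_context(grammar_path, sentence):
    """Preferred form: the with-block closes ACE when it exits."""
    # Imported here so that defining the function does not require
    # PyDelphin; calling it needs PyDelphin and the ACE binary.
    from delphin import ace

    with ace.ACEParser(grammar_path) as parser:
        return parser.interact(sentence)
```

The first helper suits code that receives an already-open process from elsewhere; the second keeps the whole lifetime of the subprocess inside one block.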

Interpreted responses are stored in a dictionary-like Response object. When queried as dictionaries, these objects return the raw response strings. When queried via their methods, they return the PyDelphin models of the data. A Response object may contain a number of Result objects, which similarly provide raw-string access via dictionary keys and PyDelphin-model access via methods. Here is an example of parsing a sentence with ACEParser:

>>> from delphin import ace
>>> with ace.ACEParser('erg-2018-x86-64-0.9.30.dat') as parser:
...     response = parser.interact('A cat sleeps.')
...     print(response.result(0)['mrs'])
...     print(response.result(0).mrs())
...
[ LTOP: h0 INDEX: e2 [ e SF: prop TENSE: pres MOOD: indicative PROG: - PERF: - ] RELS: < [ _a_q<0:1> LBL: h4 ARG0: x3 [ x PERS: 3 NUM: sg IND: + ] RSTR: h5 BODY: h6 ]  [ _cat_n_1<2:5> LBL: h7 ARG0: x3 ]  [ _sleep_v_1<6:13> LBL: h1 ARG0: e2 ARG1: x3 ] > HCONS: < h0 qeq h1 h5 qeq h7 > ICONS: < > ]
<MRS object (_a_q _cat_n_1 _sleep_v_1) at 140612036960072>

Functions exist for non-interactive communication with ACE: parse() and parse_from_iterable() open and close an ACEParser instance; transfer() and transfer_from_iterable() open and close an ACETransferer instance; and generate() and generate_from_iterable() open and close an ACEGenerator instance. Note that these functions open a new ACE subprocess every time they are called, so if you have many items to process, it is more efficient to use parse_from_iterable(), transfer_from_iterable(), or generate_from_iterable() than the single-item versions, or to interact with the ACEProcess subclass instances directly.

Basic Usage

The following module functions are the simplest way to interact with ACE, although for larger or more interactive jobs an ACEProcess subclass instance is recommended.

delphin.ace.compile(cfg_path, out_path, executable=None, env=None, stdout=None, stderr=None)[source]

Use ACE to compile a grammar.

Parameters:
  • cfg_path (str) – the path to the ACE config file

  • out_path (str) – the path where the compiled grammar will be written

  • executable (str, optional) – the path to the ACE binary; if None, the ace command will be used

  • env (dict, optional) – environment variables to pass to the ACE subprocess

  • stdout (file, optional) – stream used for ACE’s stdout

  • stderr (file, optional) – stream used for ACE’s stderr
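As a rough sketch of how the stdout and stderr parameters might be used, the helper below sends both of ACE's streams to one log file while compiling. The name compile_with_log and all paths are hypothetical; only the compile() call and its keyword arguments come from the signature above.

```python
def compile_with_log(cfg_path, out_path, log_path):
    """Compile a grammar, sending ACE's stdout and stderr to a log file.

    Hypothetical helper; returns the path of the compiled image.
    """
    # Imported lazily: calling this needs PyDelphin and the ACE binary.
    from delphin import ace

    with open(log_path, 'w') as log:
        # The same open file serves as both output streams.
        ace.compile(cfg_path, out_path, stdout=log, stderr=log)
    return out_path
```

The returned path can then be passed as grm to parse() and the other functions in this section.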

delphin.ace.parse(grm, datum, **kwargs)[source]

Parse sentence datum with ACE using grammar grm.

Parameters:
  • grm (str) – path to a compiled grammar image

  • datum (str) – the sentence to parse

  • **kwargs – additional keyword arguments to pass to the ACEParser

Returns:

Response

Example

>>> response = ace.parse('erg.dat', 'Dogs bark.')
NOTE: parsed 1 / 1 sentences, avg 797k, time 0.00707s

delphin.ace.parse_from_iterable(grm, data, **kwargs)[source]

Parse each sentence in data with ACE using grammar grm.

Parameters:
  • grm (str) – path to a compiled grammar image

  • data (iterable) – the sentences to parse

  • **kwargs – additional keyword arguments to pass to the ACEParser

Yields:

Response

Example

>>> sentences = ['Dogs bark.', 'It rained']
>>> responses = list(ace.parse_from_iterable('erg.dat', sentences))
NOTE: parsed 2 / 2 sentences, avg 723k, time 0.01026s

delphin.ace.transfer(grm, datum, **kwargs)[source]

Transfer from the MRS datum with ACE using grammar grm.

Parameters:
  • grm (str) – path to a compiled grammar image

  • datum (str) – source MRS as a SimpleMRS string

  • **kwargs – additional keyword arguments to pass to the ACETransferer

Returns:

Response

delphin.ace.transfer_from_iterable(grm, data, **kwargs)[source]

Transfer from each MRS in data with ACE using grammar grm.

Parameters:
  • grm (str) – path to a compiled grammar image

  • data (iterable) – source MRSs as SimpleMRS strings

  • **kwargs – additional keyword arguments to pass to the ACETransferer

Yields:

Response

delphin.ace.generate(grm, datum, **kwargs)[source]

Generate from the MRS datum with ACE using grammar grm.

Parameters:
  • grm (str) – path to a compiled grammar image

  • datum (str) – the SimpleMRS string to generate from

  • **kwargs – additional keyword arguments to pass to the ACEGenerator

Returns:

Response

delphin.ace.generate_from_iterable(grm, data, **kwargs)[source]

Generate from each MRS in data with ACE using grammar grm.

Parameters:
  • grm (str) – path to a compiled grammar image

  • data (iterable) – MRSs as SimpleMRS strings

  • **kwargs – additional keyword arguments to pass to the ACEGenerator

Yields:

Response
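The parsing functions above come with examples, but the generation functions do not, so here is a small sketch that chains parse() and generate(). It assumes the same compiled grammar supports both parsing and generation and that ACE finds at least one parse; the function name paraphrase is invented for this illustration.

```python
def paraphrase(grammar_path, sentence):
    """Parse a sentence, then generate from its first MRS.

    Hypothetical helper; returns the Response from generate().
    """
    # Imported lazily: calling this needs PyDelphin and the ACE binary.
    from delphin import ace

    parsed = ace.parse(grammar_path, sentence)
    # Dictionary access on a result gives the raw string, which for
    # the 'mrs' key is SimpleMRS. Assumes at least one parse exists.
    mrs_string = parsed.result(0)['mrs']
    return ace.generate(grammar_path, mrs_string)
```

Each call opens two short-lived ACE subprocesses (one to parse, one to generate), so for many sentences an ACEParser and an ACEGenerator kept open side by side would be the more efficient design.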

Classes for Managing ACE Processes

The functions described in Basic Usage are useful for small jobs because they handle the input and then close the ACE process. For more complicated or interactive jobs, interacting directly with an instance of an ACEProcess subclass is recommended, and in some cases required (e.g., for [incr tsdb()] testsuite processing). Most methods are defined on the ACEProcess class, but in practice the ACEParser, ACETransferer, and ACEGenerator subclasses are the ones used directly.

class delphin.ace.ACEProcess(grm, cmdargs=None, executable=None, env=None, tsdbinfo=True, full_forest=False, stderr=None)[source]

Bases: Processor

The base class for interfacing with ACE.

This manages most subprocess communication with ACE, but does not interpret the response returned via ACE’s stdout. Subclasses override the receive() method to interpret the task-specific response formats.

Note that not all arguments to this class are used by every subclass; the documentation for each subclass specifies which are available.

Parameters:
  • grm (str) – path to a compiled grammar image

  • cmdargs (list, optional) – a list of command-line arguments for ACE; note that arguments and their values should be separate entries, e.g. ['-n', '5']

  • executable (str, optional) – the path to the ACE binary; if None, ACE is assumed to be callable via ace

  • env (dict) – environment variables to pass to the ACE subprocess

  • tsdbinfo (bool) – if True and ACE’s version is compatible, all information ACE reports for [incr tsdb()] processing is gathered and returned in the response

  • full_forest (bool) – if True and tsdbinfo is True, output the full chart for each parse result

  • stderr (file) – stream used for ACE’s stderr
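The requirement that cmdargs keep options and their values as separate entries is easy to get wrong when the options start out as a single string. One possible way to build the list, sketched here with the standard library's shlex module (the helper name split_cmdargs is hypothetical):

```python
import shlex


def split_cmdargs(option_string):
    """Split a shell-style option string into separate list entries."""
    return shlex.split(option_string)


cmdargs = split_cmdargs('-n 5')
print(cmdargs)  # ['-n', '5']

# The list can then be handed to a process class, for example:
#     ace.ACEParser('erg.dat', cmdargs=cmdargs)
```

Passing the unsplit string '-n 5' as a single list entry would not match the form described for cmdargs.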

property ace_version

The version of the specified ACE binary.

close()[source]

Close the ACE process and return the process’s exit code.

interact(datum)[source]

Send datum to ACE and return the response.

This is the recommended method for sending and receiving data to/from an ACE process as it reduces the chances of over-filling or reading past the end of the buffer. It also performs a simple validation of the input to help ensure that one complete item is processed at a time.

If input item identifiers need to be tracked throughout processing, see process_item().

Parameters:

datum (str) – the input sentence or MRS

Returns:

Response

process_item(datum, keys=None)[source]

Send datum to ACE and return the response with context.

The keys parameter can be used to track item identifiers through an ACE interaction. If the task member is set on the ACEProcess instance (or one of its subclasses), it is kept in the response as well.

Parameters:
  • datum (str) – the input sentence or MRS

  • keys (dict) – a mapping of item identifier names and values

Returns:

Response
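A minimal sketch of tracking identifiers with process_item() might look like the following. The helper name process_with_ids and the key name 'i-id' are illustrative choices, not something this module requires.

```python
def process_with_ids(process, items):
    """Run (identifier, datum) pairs through an open ACE process.

    `process` is an ACEProcess subclass instance. The key name 'i-id'
    is only an illustrative choice for the identifier mapping.
    """
    responses = []
    for item_id, datum in items:
        # keys maps identifier names to values for this one item.
        response = process.process_item(datum, keys={'i-id': item_id})
        responses.append(response)
    return responses
```

With an open parser, a call such as process_with_ids(parser, [(10, 'Dogs bark.'), (20, 'It rained.')]) would return one response per item, in input order.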

receive()[source]

Return the stdout response from ACE.

Warning

Reading beyond the last line of stdout from ACE can cause the process to hang while it waits for the next line. Use the interact() method for most data-processing tasks with ACE.

property run_info

Contextual information about the running process.

send(datum)[source]

Send datum (e.g. a sentence or MRS) to ACE.

Warning

Sending data without reading (e.g., via receive()) can fill the buffer and cause data to be lost. Use the interact() method for most data-processing tasks with ACE.

class delphin.ace.ACEParser(grm, cmdargs=None, executable=None, env=None, tsdbinfo=True, full_forest=False, stderr=None)[source]

Bases: ACEProcess

A class for managing parse requests with ACE.

See ACEProcess for initialization parameters.

class delphin.ace.ACETransferer(grm, cmdargs=None, executable=None, env=None, stderr=None)[source]

Bases: ACEProcess

A class for managing transfer requests with ACE.

See ACEProcess for initialization parameters.

class delphin.ace.ACEGenerator(grm, cmdargs=None, executable=None, env=None, tsdbinfo=True, stderr=None)[source]

Bases: ACEProcess

A class for managing realization requests with ACE.

See ACEProcess for initialization parameters.

Exceptions

exception delphin.ace.ACEProcessError(*args, **kwargs)[source]

Bases: PyDelphinException

Raised when the ACE process has crashed and cannot be recovered.
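Code that must keep running after ACE fails can catch this exception around the interaction. The following is one possible sketch; the function name parse_or_none and the decision to return None on failure are assumptions made for the example.

```python
def parse_or_none(grammar_path, sentence):
    """Parse one sentence, returning None if ACE cannot be recovered.

    Hypothetical helper built on ACEParser and ACEProcessError.
    """
    # Imported lazily: calling this needs PyDelphin and the ACE binary.
    from delphin import ace

    try:
        with ace.ACEParser(grammar_path) as parser:
            return parser.interact(sentence)
    except ace.ACEProcessError:
        # ACE crashed and could not be recovered; report no response.
        return None
```

Whether to swallow the error, log it, or re-raise depends on the job; a batch run might record the failing item and continue with a fresh process.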

ACE stdout Protocols

PyDelphin communicates with ACE via its “stdout protocols”, which are the ways ACE’s outputs are encoded across its stdout stream. There are several protocols that ACE uses and that this module supports:

  • regular parsing

  • parsing with ACE’s --tsdb-stdout option

  • parsing with --tsdb-stdout and --itsdb-forest

  • transfer

  • regular generation

  • generation with ACE’s --tsdb-stdout option

When a user interacts with ACE via the classes and functions in this module, responses are interpreted and wrapped in Response objects, shielding the user from the details of ACE’s stdout protocols. Sometimes, however, the user will store or pipe ACE’s output directly, such as when using the delphin convert command with ace at the command line. Even though ACE outputs MRSs in the common SimpleMRS format, the additional content in ACE’s stdout protocols can complicate tasks such as format or representation conversion. The user can give ACE options such as -T (see https://github.com/delph-in/docs/wiki/AceOptions) to filter out the non-MRS content, but for convenience PyDelphin also provides the ace codec, available at delphin.codecs.ace. The codec ignores the non-MRS content in ACE’s stdout so the user can treat ACE output as a stream or as a corpus of MRS representations. For example:

[~]$ ace -g erg.dat < sentences.txt | delphin convert --from ace

The codec does not support every stdout protocol that this module does. Those it does support are:

  • regular parsing

  • parsing with ACE’s --tsdb-stdout option

  • generation with ACE’s --tsdb-stdout option