delphin.interface

Interfaces for external data providers.

This module manages the communication between data providers (processors such as ACE, or remote services such as the DELPH-IN Web API) and user code or storage backends such as [incr tsdb()] test suites. An interface sends requests to a provider, then receives and interprets the response.

The interface may also detect and deserialize supported DELPH-IN formats if the appropriate modules are available.

class delphin.interface.Processor

Base class for processors.

This class defines the basic interface for all PyDelphin processors, such as ACEProcess and Client. It can also be used to define preprocessor wrappers around other processors so that the wrapper has the same interface as the processor it wraps, allowing it to be used, e.g., with TestSuite.process().

task

name of the task the processor performs (e.g., “parse”, “transfer”, or “generate”)

process_item(datum, keys=None)

Send datum to the processor and return the result.

This method is a generic wrapper around a processor-specific processing method that keeps track of additional item and processor information. Specifically, if keys is provided, it is copied into the keys key of the response object, and if the processor object’s task member is non-None, it is copied into the task key of the response. These entries help keep track of items when many are processed at once, and help downstream functions identify what the processor did.

Parameters
  • datum – the item content to process

  • keys – a mapping of item identifiers which will be copied into the response
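The bookkeeping described above can be illustrated with a minimal, library-free sketch. The class SketchProcessor and its _process hook are hypothetical stand-ins, not PyDelphin's implementation; only the copying of keys and task into the response follows the description.

```python
class SketchProcessor:
    """Hypothetical stand-in for a processor; not PyDelphin's actual code."""

    task = 'parse'  # copied into the response when it is not None

    def _process(self, datum):
        # Placeholder for the processor-specific call (e.g., a parser).
        return {'input': datum, 'results': []}

    def process_item(self, datum, keys=None):
        response = self._process(datum)
        if keys is not None:
            # Item identifiers travel with the response.
            response['keys'] = keys
        if self.task is not None:
            # Record which task produced this response.
            response['task'] = self.task
        return response


response = SketchProcessor().process_item(
    'Abrams hired Browne.', keys={'i-id': 10})
print(response['keys'], response['task'])
```

The identifier name i-id is only an example of a key a test suite might supply.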

class delphin.interface.Response

A wrapper around the response dictionary for more convenient access to results.

result(i)

Return a Result object for the result i.

results()

Return Result objects for each result.
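As a rough illustration of how result() and results() relate, the following library-free sketch wraps a plain dictionary. The classes SketchResponse and SketchResult are hypothetical, and the sketch assumes the individual results are stored as a list under a results key; the result-id entries are illustrative data only.

```python
class SketchResult(dict):
    """Hypothetical stand-in for Result: still a plain dictionary."""


class SketchResponse(dict):
    """Hypothetical stand-in for Response.

    Assumes the individual results live in a list under 'results'.
    """

    def result(self, i):
        # Wrap the i-th raw result dictionary.
        return SketchResult(self.get('results', [])[i])

    def results(self):
        # Wrap every raw result dictionary, preserving order.
        return [self.result(i)
                for i in range(len(self.get('results', [])))]


response = SketchResponse({
    'input': 'Abrams hired Browne.',
    'results': [{'result-id': 0}, {'result-id': 1}],
})
first = response.result(0)
everything = response.results()
print(first['result-id'], len(everything))
```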

tokens(tokenset='internal')

Interpret and return a YYTokenLattice object.

If tokenset is a key under the tokens key of the response, interpret its value as a YYTokenLattice from a valid YY serialization or from a dictionary. If tokenset is not available, return None.

Parameters

tokenset (str) – return ‘initial’ or ‘internal’ tokens (default: ‘internal’)

Returns

YYTokenLattice

Raises

InterfaceError – when the value is an unsupported type or delphin.tokens is unavailable

class delphin.interface.Result

A wrapper around a result dictionary to automate deserialization for supported formats. A Result is still a dictionary, so the raw data can be obtained using dict access.

derivation()

Interpret and return a Derivation object.

If delphin.derivation is available and the value of the derivation key in the result dictionary is a valid UDF string or a dictionary, return the interpreted Derivation object. If there is no derivation key in the result, return None.

Raises

InterfaceError – when the value is an unsupported type or delphin.derivation is unavailable

dmrs()

Interpret and return a Dmrs object.

If delphin.codecs.dmrsjson is available and the value of the dmrs key in the result is a dictionary, return the interpreted DMRS object. If there is no dmrs key in the result, return None.

Raises

InterfaceError – when the value is not a dictionary or delphin.codecs.dmrsjson is unavailable

eds()

Interpret and return an Eds object.

If delphin.codecs.eds is available and the value of the eds key in the result is a valid “native” EDS serialization, or if delphin.codecs.edsjson is available and the value is a dictionary, return the interpreted EDS object. If there is no eds key in the result, return None.

Raises

InterfaceError – when the value is an unsupported type or the corresponding module is unavailable

mrs()

Interpret and return an MRS object.

If delphin.codecs.simplemrs is available and the value of the mrs key in the result is a valid SimpleMRS string, or if delphin.codecs.mrsjson is available and the value is a dictionary, return the interpreted MRS object. If there is no mrs key in the result, return None.

Raises

InterfaceError – when the value is an unsupported type or the corresponding module is unavailable
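The accessors above share one pattern: return None when the key is absent, choose a decoder based on the type of the stored value, and raise an error for anything else. The sketch below shows that pattern without importing PyDelphin. DispatchingResult, SketchInterfaceError, decode_string, and decode_dict are all hypothetical placeholders; the real codecs are the modules named in the descriptions above.

```python
class SketchInterfaceError(Exception):
    """Hypothetical stand-in for InterfaceError."""


def decode_string(s):
    # Placeholder for a string codec (e.g., SimpleMRS decoding).
    return ('from-string', s)


def decode_dict(d):
    # Placeholder for a dictionary codec (e.g., MRS-JSON decoding).
    return ('from-dict', d)


class DispatchingResult(dict):
    """Hypothetical sketch of type-based deserialization."""

    def mrs(self):
        if 'mrs' not in self:
            return None  # nothing to interpret
        value = self['mrs']
        if isinstance(value, str):
            return decode_string(value)
        if isinstance(value, dict):
            return decode_dict(value)
        raise SketchInterfaceError(
            'unsupported type: {}'.format(type(value).__name__))


print(DispatchingResult().mrs())
print(DispatchingResult(mrs={'top': 'h0'}).mrs())
```

Because the sketch subclasses dict, the raw value remains available through ordinary key access, as the description of Result notes.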

tree()

Interpret and return a labeled syntax tree.

The tree data may be a standalone datum, or embedded in a derivation.

Wrapping a Processor for Preprocessing

The Processor class can be used to implement a preprocessor that maintains the same interface as the underlying processor. The following example wraps an ACEParser instance of the English Resource Grammar with a REPP instance.

>>> from delphin import interface
>>> from delphin import ace
>>> from delphin import repp
>>>
>>> class REPPWrapper(interface.Processor):
...     def __init__(self, cpu, rpp):
...         self.cpu = cpu
...         self.task = cpu.task
...         self.rpp = rpp
...     def process_item(self, datum, keys=None):
...         preprocessed_datum = str(self.rpp.tokenize(datum))
...         return self.cpu.process_item(preprocessed_datum, keys=keys)
...
>>> # The preprocessor can be used like a normal Processor:
>>> rpp = repp.REPP.from_config('../../grammars/erg/pet/repp.set')
>>> grm = '../../grammars/erg-2018-x86-64-0.9.30.dat'
>>> with ace.ACEParser(grm, cmdargs=['-y']) as _cpu:
...     cpu = REPPWrapper(_cpu, rpp)
...     response = cpu.process_item('Abrams hired Browne.')
...     for result in response.results():
...         print(result.mrs())
...
<Mrs object (proper named hire proper named) at 140488735960480>
<Mrs object (unknown compound udef named hire parg addressee proper named) at 140488736005424>
<Mrs object (unknown proper compound udef named hire parg named) at 140488736004864>
NOTE: parsed 1 / 1 sentences, avg 1173k, time 0.00986s

A similar technique could be used to manage external processes, such as MeCab for morphological segmentation of Japanese for Jacy. It could also be used to make a postprocessor, a backoff mechanism in case an input fails to parse, etc.
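The backoff idea can be sketched in the same style. This example is library-free so that it runs without a grammar: BackoffWrapper and FakeProcessor are hypothetical, and the sketch assumes that a failed parse yields a response whose results list is empty. In real code the wrapper would subclass interface.Processor and wrap actual processors, as REPPWrapper does above.

```python
class BackoffWrapper:
    """Hypothetical sketch: try `primary`, then fall back to `fallback`."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback
        self.task = primary.task  # expose the same task as the primary

    def process_item(self, datum, keys=None):
        response = self.primary.process_item(datum, keys=keys)
        # Assumption: no results means the primary processor failed.
        if not response.get('results'):
            response = self.fallback.process_item(datum, keys=keys)
        return response


class FakeProcessor:
    """Stand-in processor that 'parses' only the inputs it knows."""

    task = 'parse'

    def __init__(self, known):
        self.known = set(known)

    def process_item(self, datum, keys=None):
        results = [{'result-id': 0}] if datum in self.known else []
        return {'input': datum, 'results': results, 'keys': keys}


strict = FakeProcessor({'Abrams hired Browne.'})
robust = FakeProcessor({'Abrams hired Browne.', 'Abrams hire Browne.'})
cpu = BackoffWrapper(strict, robust)
response = cpu.process_item('Abrams hire Browne.')
print(len(response['results']))
```

Since the wrapper exposes task and process_item() like any other processor, callers do not need to know that a fallback is involved.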