delphin.commands

PyDelphin API counterparts to the delphin commands.

The public functions in this module largely mirror the front-end subcommands provided by the delphin command, with some small changes to argument names or values that make them better suited to being called from within Python.

convert

delphin.commands.convert(path, source_fmt, target_fmt, select='result.mrs', properties=True, lnk=True, color=False, indent=None, show_status=False, predicate_modifiers=False, semi=None)[source]

Convert between various DELPH-IN Semantics representations.

The source_fmt and target_fmt arguments are downcased and hyphens are removed to normalize the codec name.
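The normalization described above can be sketched in a few lines. This is only an illustration of the stated rule, not PyDelphin's own code, and the helper name normalize_codec_name is hypothetical.

```python
def normalize_codec_name(fmt):
    """Downcase a format name and drop its hyphens (hypothetical helper)."""
    return fmt.lower().replace('-', '')


print(normalize_codec_name('Simple-MRS'))  # simplemrs
```

Under this rule, "Simple-MRS", "simple-mrs", and "simplemrs" would all name the same codec.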

Note

For syntax highlighting, delphin.highlight must be installed, and it is only available for select target formats.

Parameters
  • path (str, file) – filename, testsuite directory, open file, or stream of input representations

  • source_fmt (str) – convert from this format

  • target_fmt (str) – convert to this format

  • select (str) – TSQL query for selecting data (ignored if path is not a testsuite directory; default: “result.mrs”)

  • properties (bool) – include morphosemantic properties if True (default: True)

  • lnk (bool) – include lnk surface alignments and surface strings if True (default: True)

  • color (bool) – apply syntax highlighting if True and target_fmt is “simplemrs” (default: False)

  • indent (int, optional) – specifies an explicit number of spaces for indentation

  • show_status (bool) – show disconnected EDS nodes (ignored if target_fmt is not “eds”; default: False)

  • predicate_modifiers (bool) – apply EDS predicate modification for certain kinds of patterns (ignored if target_fmt is not an EDS format; default: False)

  • semi – a delphin.semi.SemI object or path to a SEM-I (ignored if target_fmt is not indexedmrs)

Returns

str – the converted representation
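A call might look like the sketch below, which uses only the parameters documented above. The wrapper name profile_to_eds and its profile_dir argument are hypothetical, and the sketch assumes the profile stores its MRSs in the “simplemrs” format.

```python
def profile_to_eds(profile_dir):
    """Return the MRSs in a testsuite directory serialized as EDS."""
    # Imported here so the sketch can be loaded without PyDelphin installed.
    from delphin.commands import convert

    return convert(
        profile_dir,          # hypothetical path to a testsuite directory
        'simplemrs',          # source_fmt (assumed format of the stored MRSs)
        'eds',                # target_fmt
        select='result.mrs',  # the default, written out for clarity
        indent=2,
        show_status=True,     # only relevant because target_fmt is "eds"
    )
```

Because the function returns a string, the caller decides whether to print it or write it to a file.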

select

delphin.commands.select(query, path, record_class=None)[source]

Select data from [incr tsdb()] test suites.

Parameters
  • query (str) – TSQL select query (e.g., ‘i-id i-input mrs’ or ‘* from item where readings > 0’)

  • path – path to a TSDB test suite

  • record_class – alternative class for records in the selection

Yields

selected data from the test suite
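Since select() yields its results, a caller that needs them all at once can collect them into a list. The sketch below assumes a hypothetical wrapper name and path argument; the default query string is one of the examples given above.

```python
def selected_rows(profile_dir, query='i-id i-input mrs'):
    """Collect everything the TSQL query selects from a test suite."""
    # Imported here so the sketch can be loaded without PyDelphin installed.
    from delphin.commands import select

    # Note the argument order: the query comes first, then the path.
    return list(select(query, profile_dir))
```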

mkprof

delphin.commands.mkprof(destination, source=None, schema=None, where=None, delimiter=None, refresh=False, skeleton=False, full=False, gzip=False, quiet=False)[source]

Create [incr tsdb()] profiles or skeletons.

Data for the testsuite may come from an existing testsuite or from a list of sentences. There are four main usage patterns:

  • source="testsuite/" – read data from testsuite/

  • source=None, refresh=True – read data from destination

  • source=None, refresh=False – read sentences from stdin

  • source="sents.txt" – read sentences from sents.txt

The latter two require the schema parameter.
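The four usage patterns can be summarized as a small decision function. This is a sketch of the rules as listed, not mkprof's implementation: the function name is hypothetical, and treating any directory as a testsuite (via os.path.isdir) is an assumption made for illustration.

```python
import os


def mkprof_data_source(source=None, refresh=False, is_testsuite=os.path.isdir):
    """Name the data source implied by the arguments (hypothetical helper)."""
    if source is not None:
        # refresh is ignored whenever a source is given
        if is_testsuite(source):
            return 'testsuite'      # read data from the source testsuite
        return 'sentence file'      # read sentences from the file; needs schema
    if refresh:
        return 'destination'        # reread the data already at destination
    return 'stdin'                  # read sentences from stdin; needs schema


print(mkprof_data_source(None, refresh=True))  # destination
print(mkprof_data_source(None))                # stdin
```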

Parameters
  • destination (str) – path of the new testsuite

  • source (str) – path to a source testsuite or a file containing sentences; if not given and refresh is False, sentences are read from stdin

  • schema (str) – path to a relations file to use for the created testsuite; if None and source is a test suite, the schema of source is used

  • where (str) – TSQL condition to filter records by; ignored if source is not a testsuite

  • delimiter (str) – if given, split lines from source or stdin on the character delimiter; if delimiter is “@”, split using delphin.tsdb.split(); a header line with field names is required; ignored when the data source is not text lines

  • refresh (bool) – if True, rewrite the data at destination; implies full is True; ignored if source is not None; best combined with schema or gzip (default: False)

  • skeleton (bool) – if True, only write tsdb-core files (default: False)

  • full (bool) – if True, copy all data from the source testsuite; ignored if the data source is not a testsuite or if skeleton is True (default: False)

  • gzip (bool) – if True, non-empty tables will be compressed with gzip

  • quiet (bool) – if True, don’t print summary information
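As an example of the sentence-file pattern, the sketch below builds a skeleton from a text file. The wrapper name and its three path arguments are hypothetical; the keyword arguments are the ones documented above.

```python
def skeleton_from_sentences(destination, sentence_file, relations_file):
    """Create a skeleton at destination from one sentence per line."""
    # Imported here so the sketch can be loaded without PyDelphin installed.
    from delphin.commands import mkprof

    mkprof(
        destination,
        source=sentence_file,   # a plain-text file, so schema is required
        schema=relations_file,  # path to a relations file
        skeleton=True,          # only write tsdb-core files
        quiet=True,             # don't print summary information
    )
```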

process

delphin.commands.process(grammar, testsuite, source=None, select=None, generate=False, transfer=False, full_forest=False, options=None, all_items=False, result_id=None, gzip=False, stderr=None)[source]

Process (e.g., parse) a [incr tsdb()] profile.

Results are written directly to testsuite.

If select is None, the defaults depend on the task:

Task          Default value of select
Parsing       item.i-input
Transfer      result.mrs
Generation    result.mrs
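The task-dependent defaults can be written as a small lookup. This only restates the values listed above; the function name is hypothetical, and how process() behaves when generate and transfer are both True is not covered here.

```python
def default_select(generate=False, transfer=False):
    """Return the default select query for a task (hypothetical helper)."""
    if generate or transfer:
        return 'result.mrs'   # transfer and generation take MRS inputs
    return 'item.i-input'     # parsing takes the input sentences


print(default_select())               # item.i-input
print(default_select(generate=True))  # result.mrs
```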

Parameters
  • grammar (str) – path to a compiled grammar image

  • testsuite (str) – path to a [incr tsdb()] testsuite where data will be read from (see source) and written to

  • source (str) – path to a [incr tsdb()] testsuite; if None, testsuite is used as the source of data

  • select (str) – TSQL query for selecting processor inputs (default depends on the processor type)

  • generate (bool) – if True, generate instead of parse (default: False)

  • transfer (bool) – if True, transfer instead of parse (default: False)

  • options (list) – list of ACE command-line options to use when invoking the ACE subprocess; unsupported options will give an error message

  • all_items (bool) – if True, don’t exclude ignored items (those with i-wf==2) when parsing

  • result_id (int) – if given, only keep items with the specified result-id

  • gzip (bool) – if True, non-empty tables will be compressed with gzip

  • stderr (file) – stream for ACE’s stderr
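A parsing run might be wrapped as in the sketch below. The wrapper name and its arguments are hypothetical; any ACE options are passed through unchanged rather than chosen here, since unsupported options give an error message.

```python
def parse_into(grammar, testsuite, source=None, ace_options=None):
    """Parse the items of a profile and write the results to testsuite."""
    # Imported here so the sketch can be loaded without PyDelphin installed.
    from delphin.commands import process

    process(
        grammar,              # path to a compiled grammar image
        testsuite,            # results are written here
        source=source,        # None means testsuite is also the data source
        options=ace_options,  # list of ACE command-line options, or None
        gzip=True,            # compress non-empty tables
    )
```

With generate and transfer both left at False, the call parses and select falls back to its parsing default.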

compare

delphin.commands.compare(testsuite, gold, select='i-id i-input mrs')[source]

Compare two [incr tsdb()] profiles.

Parameters
  • testsuite (str, TestSuite) – path to the test [incr tsdb()] testsuite or a TestSuite object

  • gold (str, TestSuite) – path to the gold [incr tsdb()] testsuite or a TestSuite object

  • select – TSQL query to select (id, input, mrs) triples (default: i-id i-input mrs)

Yields

dict

Comparison results as:

{"id": "item identifier",
 "input": "input sentence",
 "test": number_of_unique_results_in_test,
 "shared": number_of_shared_results,
 "gold": number_of_unique_results_in_gold}

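Dictionaries of this shape are easy to post-process. The sketch below lists the items on which the two profiles disagree, meaning items where either side has a result the other lacks. The function name is hypothetical and the sample records are invented for illustration; they were not produced by compare().

```python
def differing_items(results):
    """Return the ids of items with results unique to test or to gold."""
    return [r['id'] for r in results if r['test'] or r['gold']]


# Invented records in the documented shape.
sample = [
    {'id': 10, 'input': 'Dogs bark.', 'test': 0, 'shared': 2, 'gold': 0},
    {'id': 20, 'input': 'Cats sleep.', 'test': 1, 'shared': 1, 'gold': 0},
]
print(differing_items(sample))  # [20]
```

In real use, the argument would be the generator returned by compare(testsuite, gold).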
repp

delphin.commands.repp(source, config=None, module=None, active=None, format=None, trace_level=0)[source]

Tokenize with a Regular Expression PreProcessor (REPP).

Results are printed directly to stdout. If more programmatic access is desired, the delphin.repp module provides a similar interface.

Parameters
  • source (str, file) – filename, open file, or stream of sentence inputs

  • config (str) – path to a PET REPP configuration (.set) file

  • module (str) – path to a top-level REPP module; other modules are found by external group calls

  • active (list) – select which modules are active; if None, all are used; incompatible with config (default: None)

  • format (str) – the output format (“yy”, “string”, “line”, or “triple”; default: “yy”)

  • trace_level (int) – if 0, no trace info is printed; if 1, applied rules are printed; if greater than 1, both applied and unapplied rules (in order) are printed (default: 0)
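A call using a PET configuration file might look like the sketch below. The wrapper name and both path arguments are hypothetical. The active parameter is left unset because it is incompatible with config.

```python
def tokenize_file(sentence_file, repp_config):
    """Tokenize each sentence in a file and print the result to stdout."""
    # Imported here so the sketch can be loaded without PyDelphin installed.
    from delphin.commands import repp

    repp(
        sentence_file,
        config=repp_config,  # path to a PET REPP configuration (.set) file
        format='line',       # one of "yy", "string", "line", or "triple"
        trace_level=1,       # also print the applied rules
    )
```

The function returns nothing useful because the output goes straight to stdout.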

Exceptions

exception delphin.commands.CommandError(*args, **kwargs)[source]

Raised on an invalid command call.
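Callers can catch this exception around any of the functions above. The sketch below assumes that convert() reports an invalid call by raising CommandError; the wrapper name is hypothetical.

```python
def convert_or_none(path, source_fmt, target_fmt):
    """Convert as usual, but return None if the call is rejected."""
    # Imported here so the sketch can be loaded without PyDelphin installed.
    from delphin.commands import CommandError, convert

    try:
        return convert(path, source_fmt, target_fmt)
    except CommandError as exc:
        print(f'invalid command call: {exc}')
        return None
```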