delphin.commands

PyDelphin API counterparts to the delphin commands.

The public functions in this module largely mirror the front-end subcommands provided by the delphin command, with small changes to argument names or values that make them better suited to being called from within Python.

convert

delphin.commands.convert(path, source_fmt, target_fmt, select='result.mrs', properties=True, lnk=True, color=False, indent=None, show_status=False, predicate_modifiers=False, semi=None)

Convert between various DELPH-IN Semantics representations.

If source_fmt ends with “-lines”, then path must be an input file containing one representation per line to be read with the decode() function of the source codec. If target_fmt ends with “-lines”, then any HEADER, JOINER, or FOOTER defined by the target codec are ignored. Both source_fmt and target_fmt are then lowercased and stripped of hyphens to normalize the codec name.
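
The normalization described above can be sketched as follows. This is a minimal illustration, not PyDelphin's own code: the helper name normalize_format is hypothetical, and dropping the “-lines” suffix before normalizing is an assumption.

```python
def normalize_format(fmt):
    """Return (codec_name, is_lines) for a format name (hypothetical helper)."""
    # A trailing "-lines" marks one-representation-per-line mode.
    is_lines = fmt.endswith("-lines")
    if is_lines:
        # Assumption: the suffix is not part of the codec name.
        fmt = fmt[: -len("-lines")]
    # Lowercase and drop hyphens, e.g. "DMRS-JSON" becomes "dmrsjson".
    return fmt.lower().replace("-", ""), is_lines


print(normalize_format("simplemrs-lines"))  # ('simplemrs', True)
print(normalize_format("DMRS-JSON"))        # ('dmrsjson', False)
```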

Note

For syntax highlighting, delphin.highlight must be installed, and it is only available for select target formats.

Parameters
  • path (str, Path, open file) – filename, testsuite directory, open file, or stream of input representations

  • source_fmt (str) – convert from this format

  • target_fmt (str) – convert to this format

  • select (str) – TSQL query for selecting data (ignored if path is not a testsuite directory; default: “result.mrs”)

  • properties (bool) – include morphosemantic properties if True (default: True)

  • lnk (bool) – include lnk surface alignments and surface strings if True (default: True)

  • color (bool) – apply syntax highlighting if True and target_fmt is “simplemrs” (default: False)

  • indent (int) – specifies an explicit number of spaces for indentation

  • show_status (bool) – show disconnected EDS nodes (ignored if target_fmt is not “eds”; default: False)

  • predicate_modifiers (bool) – apply EDS predicate modification for certain kinds of patterns (ignored if target_fmt is not an EDS format; default: False)

  • semi – a delphin.semi.SemI object or path to a SEM-I (ignored if target_fmt is not indexedmrs)

Returns

str – the converted representation
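
A call might look like the sketch below, which converts a file holding one SimpleMRS per line into indented DMRS-JSON. The wrapper name and the format names are illustrative assumptions; which codecs are available depends on the installed PyDelphin.

```python
def simplemrs_lines_to_dmrs_json(path, indent=2):
    """Convert a file with one SimpleMRS per line into DMRS-JSON text."""
    # Imported here so the sketch loads even where PyDelphin is not installed.
    from delphin.commands import convert

    # "simplemrs-lines": read one representation per line from path.
    # "dmrs-json": assumed to normalize to a codec named "dmrsjson".
    return convert(path, "simplemrs-lines", "dmrs-json", indent=indent)
```

Because convert() returns a string, the result can be printed or written to a file by the caller.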

select

delphin.commands.select(query, path, record_class=None)

Select data from [incr tsdb()] test suites.

Parameters
  • query (str) – TSQL select query (e.g., ‘i-id i-input mrs’ or ‘* from item where readings > 0’)

  • path (str, Path) – path to a TSDB test suite

  • record_class – alternative class for records in the selection

Yields

selected data from the test suite
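
Since select() yields its results, a caller that wants all records at once can collect them into a list. The following is a minimal sketch; the wrapper name is hypothetical and the profile path is whatever test suite the caller has at hand.

```python
def item_inputs(profile_path):
    """Collect the i-id and i-input columns of every selected record."""
    # Imported here so the sketch loads even where PyDelphin is not installed.
    from delphin.commands import select

    # select() is a generator; list() consumes it fully.
    return list(select("i-id i-input", profile_path))
```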

mkprof

delphin.commands.mkprof(destination, source=None, schema=None, where=None, delimiter=None, refresh=False, skeleton=False, full=False, gzip=False, quiet=False)

Create [incr tsdb()] profiles or skeletons.

Data for the testsuite may come from an existing testsuite or from a list of sentences. There are four main usage patterns:

  • source="testsuite/" – read data from testsuite/

  • source=None, refresh=True – read data from destination

  • source=None, refresh=False – read sentences from stdin

  • source="sents.txt" – read sentences from sents.txt

The latter two require the schema parameter.
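
The four patterns above amount to a small decision table, sketched here with the standard library only. The function and its source_is_testsuite flag are hypothetical; how mkprof itself decides whether source is a testsuite is not modeled.

```python
def mkprof_data_source(source=None, refresh=False, source_is_testsuite=False):
    """Return (data_source, schema_required) for a combination of arguments."""
    if source is None:
        if refresh:
            # Existing data at the destination is rewritten in place.
            return "destination", False
        # Sentences come from stdin, so a schema must be supplied.
        return "stdin", True
    if source_is_testsuite:
        # The source testsuite carries its own schema.
        return "source testsuite", False
    # A plain sentence file has no schema of its own.
    return "sentence file", True


print(mkprof_data_source(source="sents.txt"))  # ('sentence file', True)
```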

Parameters
  • destination (str, Path) – path of the new testsuite

  • source (str, Path) – path to a source testsuite or a file containing sentences; if not given and refresh is False, sentences are read from stdin

  • schema (str, Path) – path to a relations file to use for the created testsuite; if None and source is a test suite, the schema of source is used

  • where (str) – TSQL condition to filter records by; ignored if source is not a testsuite

  • delimiter (str) – if given, split lines from source or stdin on the character delimiter; if delimiter is “@”, split using delphin.tsdb.split(); a header line with field names is required; ignored when the data source is not text lines

  • refresh (bool) – if True, rewrite the data at destination; implies full is True; ignored if source is not None; best combined with schema or gzip (default: False)

  • skeleton (bool) – if True, only write tsdb-core files (default: False)

  • full (bool) – if True, copy all data from the source testsuite; ignored if the data source is not a testsuite or if skeleton is True (default: False)

  • gzip (bool) – if True, non-empty tables will be compressed with gzip

  • quiet (bool) – if True, don’t print summary information
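
As an illustration of the sentence-file pattern, the sketch below builds a skeleton from a text file. The wrapper name and all three paths are placeholders supplied by the caller, not values defined by PyDelphin.

```python
def make_skeleton(destination, sentence_file, relations_file):
    """Create a skeleton at destination from a file of sentences."""
    # Imported here so the sketch loads even where PyDelphin is not installed.
    from delphin.commands import mkprof

    mkprof(
        destination,
        source=sentence_file,   # file containing sentences
        schema=relations_file,  # required when the source is not a testsuite
        skeleton=True,          # only write tsdb-core files
        quiet=True,             # don't print summary information
    )
```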

process

delphin.commands.process(grammar, testsuite, source=None, select=None, generate=False, transfer=False, full_forest=False, options=None, all_items=False, result_id=None, gzip=False, stderr=None, report_progress=True)

Process the [incr tsdb()] profile testsuite with grammar.

Inputs are read from source and results are written to testsuite. If source is None, it is set to testsuite. It is common for source to be None in parsing tasks, but it is not recommended for transfer or generation because the MRS field is both read from and written to for these tasks. If source points to a valid [incr tsdb()] profile and testsuite is a path to a non-existing location, the profile directory is created at that path.

The default task is parsing, but generation is done if generate is True and transfer is done if transfer is True; at most one of the two may be True. Input data is extracted from source using the TSQL query select. If select is None, the default depends on the task:

Task          Default value of select
----------    -----------------------
Parsing       item.i-input
Transfer      result.mrs
Generation    result.mrs
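
The task-dependent default can be expressed as a small function. This is a sketch of the rule stated above, not the library's code; the function name and the choice of ValueError are assumptions.

```python
def default_select(generate=False, transfer=False):
    """Return the default TSQL query for the chosen task."""
    if generate and transfer:
        # At most one of the two flags may be True.
        raise ValueError("at most one of generate and transfer may be True")
    if generate or transfer:
        # Transfer and generation both take MRSs as input.
        return "result.mrs"
    # Parsing takes the input sentences.
    return "item.i-input"


print(default_select())               # item.i-input
print(default_select(generate=True))  # result.mrs
```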

Parameters
  • grammar (str, Path) – path to a compiled grammar image

  • testsuite (str, Path) – path to the destination [incr tsdb()] testsuite

  • source (str, Path) – path to the source [incr tsdb()] testsuite; if None, testsuite is used as the source of data

  • select (str) – TSQL query for selecting processor inputs (default depends on the processor type)

  • generate (bool) – if True, generate instead of parse (default: False)

  • transfer (bool) – if True, transfer instead of parse (default: False)

  • options (list) – list of ACE command-line options to use when invoking the ACE subprocess; unsupported options will give an error message

  • all_items (bool) – if True, don’t exclude ignored items (those with i-wf==2) when parsing

  • result_id (int) – if given, only select inputs with the specified result-id (transfer and generation)

  • gzip (bool) – if True, non-empty tables will be compressed with gzip

  • stderr (file) – stream for ACE’s stderr

  • report_progress (bool) – print a progress bar to stderr if True and logging verbosity is at WARNING or lower (default: True)
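
A parsing run that reads from a skeleton and writes to a new profile could be wrapped as in the sketch below. The wrapper name and the paths are placeholders; a compiled grammar image and an ACE installation are assumed to be available.

```python
def parse_profile(grammar, skeleton, destination):
    """Parse the items of skeleton and write the results to destination."""
    # Imported here so the sketch loads even where PyDelphin is not installed.
    from delphin.commands import process

    process(
        grammar,                # path to a compiled grammar image
        destination,            # testsuite that receives the results
        source=skeleton,        # testsuite the inputs are read from
        report_progress=False,  # suppress the progress bar
    )
```

Keeping source and testsuite separate leaves the original skeleton untouched, which matters most for transfer and generation.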

compare

delphin.commands.compare(testsuite, gold, select='i-id i-input mrs')

Compare two [incr tsdb()] profiles.

Parameters
  • testsuite (str, Path, TestSuite) – path to the test [incr tsdb()] testsuite or a TestSuite object

  • gold (str, Path, TestSuite) – path to the gold [incr tsdb()] testsuite or a TestSuite object

  • select (str) – TSQL query to select (id, input, mrs) triples (default: ‘i-id i-input mrs’)

Yields

dict

Comparison results as:

{"id": "item identifier",
 "input": "input sentence",
 "test": number_of_unique_results_in_test,
 "shared": number_of_shared_results,
 "gold": number_of_unique_results_in_gold}

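Dictionaries of the shape shown above are easy to aggregate. The sketch below counts the items for which neither side has a result the other lacks; the function name and the sample records are hypothetical and only stand in for what compare() would yield.

```python
def summarize(results):
    """Count compared items and those with no unique results on either side."""
    results = list(results)
    matched = sum(1 for r in results if r["test"] == 0 and r["gold"] == 0)
    return {"items": len(results), "matched": matched}


# Hypothetical records in the documented format.
sample = [
    {"id": 1, "input": "Dogs bark.", "test": 0, "shared": 2, "gold": 0},
    {"id": 2, "input": "Cats sleep.", "test": 1, "shared": 1, "gold": 0},
]
print(summarize(sample))  # {'items': 2, 'matched': 1}
```
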
repp

delphin.commands.repp(source, config=None, module=None, active=None, format=None, color=False, trace_level=0)

Tokenize with a Regular Expression PreProcessor (REPP).

Results are printed directly to stdout. If more programmatic access is desired, the delphin.repp module provides a similar interface.

Parameters
  • source (str, Path, open file) – filename, open file, or stream of sentence inputs

  • config (str, Path) – path to a PET REPP configuration (.set) file

  • module (str, Path) – path to a top-level REPP module; other modules are found by external group calls

  • active (list) – select which modules are active; if None, all are used; incompatible with config (default: None)

  • format (str) – the output format (“yy”, “string”, “line”, or “triple”; default: “yy”)

  • color (bool) – apply syntax highlighting if True (default: False)

  • trace_level (int) – if 0, no trace info is printed; if 1, applied rules are printed; if greater than 1, both applied and unapplied rules (in order) are printed (default: 0)
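
A call that tokenizes a file with a top-level REPP module and prints triples might be wrapped as below. The wrapper name and both paths are placeholders supplied by the caller.

```python
def tokenize_file(sentence_file, repp_module):
    """Tokenize each sentence in sentence_file and print triples to stdout."""
    # Imported here so the sketch loads even where PyDelphin is not installed.
    from delphin.commands import repp

    # Output goes to stdout; nothing is returned to the caller.
    repp(sentence_file, module=repp_module, format="triple")
```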

Exceptions

exception delphin.commands.CommandError(*args, **kwargs)

Raised on an invalid command call.
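
Callers can catch this exception around any of the functions above. The sketch below wraps convert(); the wrapper name is hypothetical, and exactly which invalid calls raise CommandError is an assumption, since it is not spelled out here.

```python
def try_convert(path, source_fmt, target_fmt):
    """Return the converted text, or None if the call is rejected."""
    # Imported here so the sketch loads even where PyDelphin is not installed.
    from delphin.commands import CommandError, convert

    try:
        return convert(path, source_fmt, target_fmt)
    except CommandError as exc:
        # Report the problem and let the caller decide what to do next.
        print(f"conversion failed: {exc}")
        return None
```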