
Executing matching specification

  • Right-click the matching specification in the navigation tree. Matching specifications are located in the Matching subsection of the Specifications section of the navigation tree.

    • The following options are available:

      • Execute all (command: execAll) executes all steps of the specification: adding required columns to the input table, matching, applying manual merging/unmerging, computing consolidated records, and applying manual edits to consolidated records.

      • Execute all auto rules (command: execAllAutoRules) is the same as Execute all but only auto matching rules are executed.

      • Execute all manual rules (command: execAllManualRules) is the same as Execute all but only manual matching rules are executed.

      • Execute - match rules (command: execMatchRuns) executes all matching rules without executing other steps such as consolidation.

      • Execute - match rule… (command: execMatchRun) executes a particular matching rule without executing other steps such as consolidation.

      • Execute - apply manual (command: execApplyManual) applies manual merges and unmerges.

      • Execute consolidation (command: execConsolidation) executes the consolidation step.

      • Erase previous runs (command: erasePreviousRuns) unmerges all records in the input table that may have been merged by a previous execution of matching rules. The consolidation table is not affected.

      • Add required columns (command: addRequiredColsToMatchTable) adds all required columns listed above (such as set_id) to the input table.

      • Create manual tables (command: createManualTables) creates manual tables listed in the Settings tab of the specification.

      • Remove all manual (command: removeAllManual) deletes records from the manual tables. The results of all manual manipulations performed in the Steward UI are stored in the manual tables defined in the Manual Matching tab of the matching specification.

      • [Not supported from the UI] (command: getPotentialMatches) takes the parameters tableName1, tableName2, and targetTableName, and returns all records from tableName1 that are potential matches (using rules from the matching specification) for the records in tableName2. The result is stored in targetTableName. The potential matches are obtained by joining the two tables using only exact matching columns; fuzzy matching columns are not taken into account. getPotentialMatches is usually used to optimize matching during incremental data loading.
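As a rough illustration of the kind of join getPotentialMatches is described as performing, the Python sketch below keeps the rows of one table whose exact matching columns all equal those of at least one row in a second table. The function name potential_matches, the dict-based rows, the sample columns, and the rule that empty (None) values never match are assumptions made for this example; it is not the product's implementation or API.

```python
def potential_matches(table1, table2, exact_cols):
    """Return rows of table1 that agree with at least one row of table2
    on every column in exact_cols. Fuzzy columns are ignored entirely."""

    def key(row):
        values = tuple(row.get(col) for col in exact_cols)
        # Assumption: a missing value never matches, as in a SQL equi-join.
        return None if any(v is None for v in values) else values

    # Collect the distinct exact-column keys present in table2.
    keys_in_table2 = {key(row) for row in table2}
    keys_in_table2.discard(None)

    # Keep table1 rows whose key occurs in table2.
    return [row for row in table1 if key(row) in keys_in_table2]


# Hypothetical sample data: "zip" is an exact matching column, "name" is fuzzy.
existing = [
    {"id": 1, "zip": "10001", "name": "Jon Smith"},
    {"id": 2, "zip": "94105", "name": "Ann Lee"},
    {"id": 3, "zip": None, "name": "Bob Ray"},
]
incoming = [
    {"id": 10, "zip": "10001", "name": "John Smith"},
    {"id": 11, "zip": None, "name": "Robert Ray"},
]

candidates = potential_matches(existing, incoming, ["zip"])
candidate_ids = [row["id"] for row in candidates]
```

In an incremental load, a filter of this kind narrows the existing records down to those that could match the newly loaded batch at all, so the more expensive fuzzy comparison only has to run on that smaller set.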

Some execution steps that run matching rules (such as Execute - match rule or Execute all) have a Mode setting, which can be normal or debug (normal by default). In debug mode, the step produces an additional target table named <match table name>_<match run name>_pairs. This table contains the compared pairs together with the combined score of the comparison functions, which is helpful when you need to debug fuzzy matching functions.
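To make the naming pattern concrete, the small Python sketch below builds the name of the debug pairs table from a match table name and a match run name. The helper pairs_table_name and the sample names are hypothetical and only restate the pattern given above.

```python
def pairs_table_name(match_table, match_run):
    """Build the debug table name: <match table name>_<match run name>_pairs."""
    return f"{match_table}_{match_run}_pairs"


# Hypothetical names, used only to show the resulting pattern.
debug_table = pairs_table_name("customers", "name_zip_rule")
```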
