Update of Data Search for Data Mining extension

Advanced Usage of Unconstrained Search

Posted by Edwin Yaqub (RapidMiner) on April 26, 2019

A new version (2.1.0) of the RapidMiner extension "Data Search for Data Mining" has been released on the RapidMiner Marketplace. This version brings code-level and documentation enhancements to the "Enrich Table by Data Fusion" operator, which augments a data table with multiple new attributes by performing Unconstrained Search on the client side.

This release provides a new application template and a tutorial process. Both implement a new use case that shows how to design a fully automated data augmentation process for filling in missing attribute values. We achieve this by recasting the augmentation problem as a multi-objective optimization problem. This is highly beneficial because:

  • It relieves the end user of the burden of finding good parameter values for the operator.
  • It makes it possible to evaluate the quality of fuzzy matches: the end user defines her own 'precise match' criterion, which enters the optimization problem as a variable representing the count of True Positives.
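
To make the idea of a multi-objective formulation concrete, here is a minimal Python sketch of how candidate parameter settings could be compared. It is not the process shipped with the extension: the parameter name `threshold`, the two objectives (number of missing values filled and number of True Positives) and all numbers are assumptions chosen purely for illustration. The sketch keeps only the Pareto-optimal settings, that is, those that no other setting beats on both objectives at once.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    threshold: float     # hypothetical operator parameter (assumption)
    filled: int          # objective 1: missing values filled (maximize)
    true_positives: int  # objective 2: matches meeting the user's
                         # 'precise match' criterion (maximize)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives
    and strictly better on at least one."""
    no_worse = a.filled >= b.filled and a.true_positives >= b.true_positives
    better = a.filled > b.filled or a.true_positives > b.true_positives
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Toy numbers for illustration only, not measurements from the extension.
candidates = [
    Candidate(0.5, 90, 40),
    Candidate(0.6, 80, 55),
    Candidate(0.7, 70, 60),
    Candidate(0.8, 60, 58),
    Candidate(0.9, 40, 40),
]

front = pareto_front(candidates)
for c in front:
    print(c)
```

With these toy values, the settings 0.8 and 0.9 drop out because another setting is at least as good on both objectives. The remaining settings represent genuine trade-offs between filling more values and matching more precisely, and an optimizer searching the parameter space would return such a set instead of a single hand-tuned value.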