The DS4DM project has envisioned and implemented innovative concepts for data search and extraction. We developed search heuristics such as Correspondence Search, Unconstrained Search and Correlation Search, which harness tabular collections efficiently while allowing end users to integrate data (semi-)automatically. The project also developed tools for extracting data from non-trivial but important document formats, and for accessing data from increasingly popular cloud-based online spreadsheets as well as widely used data stores such as SharePoint. The central focus of our work has been on extending the capabilities of the RapidMiner platform by implementing RapidMiner extensions that bring data into RapidMiner and thus enhance RapidMiner processes.
In the third project year, we leveraged the project vision to bring RapidMiner capabilities to Informatica, arguably the most popular data integration platform. This integration is the result of a strategic partnership signed between RapidMiner and Informatica. The former is a leading ETL and machine learning platform, while the latter is the world leader in data integration tools, as recognized by Gartner. The integration is achieved through an enterprise connector that allows RapidMiner processes (exposed as REST web services) to be invoked from within Informatica Cloud workflows (called Mappings). It not only opens new doors for the exposure and sustainability of our DS4DM work, which can now be executed as part of Informatica Mappings (e.g., data search or extraction processes), but also allows any other RapidMiner process to be used from within Informatica.
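To illustrate what invoking a RapidMiner process as a REST web service can look like from a client's point of view, the following is a minimal Python sketch. It is not the connector's implementation: the endpoint path, the service name, the JSON payload shape and the bearer-token authentication are all assumptions chosen for illustration, and a real deployment may differ on each of them.

```python
import json
import urllib.request


def build_process_request(base_url, service_name, rows, token=None):
    """Build (but do not send) a POST request to a process web service.

    The URL layout and the {"data": [...]} payload are hypothetical;
    adapt them to the actual server configuration.
    """
    url = f"{base_url.rstrip('/')}/api/rest/process/{service_name}"
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer authentication is an assumption, not a documented requirement.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def invoke(request, timeout=30):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would build the request with the rows to be processed and pass it to `invoke`; building and sending are kept separate so that the request can be inspected or logged before any network traffic occurs.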
The initial version of this connector was published on May 13, 2018 on two production environments of Informatica Cloud. It is aimed at joint customers of Informatica and RapidMiner, who may now inject RapidMiner’s predictive modelling into Informatica Cloud workflows. RapidMiner showed a demo using the connector at the Informatica World event in Las Vegas, USA, during the week of May 22-24, 2018. A press release about this collaboration was issued on May 23.