Remove a pattern in string with RegEx

In the “Replace in string” step you can also use RegEx to remove variable content.

An online RegEx tester can be used to try out your RegEx before using it in the “Replace in string” step.
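Outside of Pentaho, the same idea can be sketched in a few lines of Python: every match of a pattern is replaced with an empty string. The pattern below (a `;jsessionid=...` fragment in a URL) and the function name `remove_pattern` are assumptions chosen only for illustration. Note that Pentaho uses Java regular expressions, whose syntax differs from Python's in some details.

```python
import re

# Hypothetical example pattern: a variable session id inside a URL
SESSION_ID = re.compile(r";jsessionid=[0-9A-Za-z]+")


def remove_pattern(text: str, pattern: "re.Pattern[str]" = SESSION_ID) -> str:
    """Return text with every match of pattern replaced by an empty string."""
    return pattern.sub("", text)


if __name__ == "__main__":
    url = "https://example.org/page;jsessionid=1A2B3C?lang=en"
    print(remove_pattern(url))  # prints https://example.org/page?lang=en
```

In the “Replace in string” step this roughly corresponds to enabling the RegEx option, entering the pattern as the search value, and leaving the replacement empty.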


Extract email from website

How can the email of the corresponding author of a publication be extracted with Pentaho Data Integration?
  1. Get the HTML of the publication via the REST step and store it in one field.
  2. Extract email via “Regex evaluation” step using the Regex
    with the step options:
    • Enable dotall mode
    • Enable multiline mode

The first email address appearing in the HTML will be put into the field email.
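The matching logic can be tried out in Python before configuring the step. This is only a sketch: the email pattern below is an assumption (a deliberately simple one, not the RegEx used in the original transformation), and `first_email` is a hypothetical helper name. The whole field is matched against the pattern and the email is taken from capture group 1, with the dotall and multiline flags set to mirror the step options above.

```python
import re
from typing import Optional

# Assumed pattern: lazy prefix, one simple email address in group 1, then the rest.
# DOTALL lets "." match line breaks, so the pattern can span the whole HTML document.
# MULTILINE only changes "^" and "$"; it is set here just to mirror the step options.
EMAIL = re.compile(
    r".*?([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}).*",
    re.DOTALL | re.MULTILINE,
)


def first_email(html: str) -> Optional[str]:
    """Return the first email address found in html, or None if there is none."""
    match = EMAIL.fullmatch(html)  # the whole field must match the pattern
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = """<html><body>
<p>Corresponding author: <a href="mailto:jane.doe@example.org">Jane Doe</a></p>
<p>Contact: <a href="mailto:office@example.org">Office</a></p>
</body></html>"""
    print(first_email(sample))  # prints jane.doe@example.org
```

Because the prefix `.*?` is lazy, the capture group lands on the first address in the document. Without dotall mode the match would fail on multi-line HTML, since `.` would stop at the first line break.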

Alternatively, the Online Service also provides a convenient way to extract emails from several websites: