Split field to rows with lookahead regex

Given a HTML Text with alternating paragragraphs, where the second line actually belongs to a the first one:

<p>Simon Burger</p>
<p><a href="https://www.simonburger.com">Website SB</a></p>
<p>Petra Ohnsorg</p>
<p><a href="https://www.petraohnsorg.com">Website PO</a></p>

I want to split the text, so there’s a line for each person. This can be achieve by using a regex with a negative lookahead:

<p>(?!<)

This only splits the first paragraph text when there is no following “<“

The fields of the so created rows can than be retrieved by using another regex:

Leave a comment