I have a Node.js/Express application that takes user input as HTML. I need to both make it well-formed and remove or replace all but a small subset of allowed tags. What existing options are there for doing so?
For example, after cleaning, I might want to regard <div><br></div> as empty and remove it, and replace <div>Text</div> with <p>Text</p>.
EDIT
@kaareal suggests htmltidy, which deals nicely with the cleaning-up part. How can I take this clean output and remove or replace elements?
I only know of one library for this: htmltidy.
There is already a wrapper for it on npm: https://npmjs.org/package/htmltidy
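For the follow-up step of removing or replacing elements in the tidied output, here is a minimal sketch in plain JavaScript with no dependencies. It is not part of htmltidy. The names `ALLOWED`, `RENAME`, and `sanitize` are hypothetical, and the rule sets are only guesses at what the question wants. It assumes the input is already well-formed (for example, tidy's output) and uses simple patterns rather than a real HTML parser, so treat it as an illustration rather than a hardened sanitizer.

```javascript
// Hypothetical rule sets: tags to keep, and tags to rename before the check.
const ALLOWED = new Set(['p', 'br', 'b', 'i', 'em', 'strong']);
const RENAME = new Map([['div', 'p']]);

// Matches a <div> that contains only whitespace and <br> tags.
const EMPTY_DIV = /<div\b[^>]*>(?:\s|<br\s*\/?>)*<\/div>/gi;
// Matches any opening or closing tag; attributes are matched but not captured.
const ANY_TAG = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;

// Assumption: `html` is already well-formed, so tags can be matched textually.
function sanitize(html) {
  let out = html;
  let prev;
  // 1. Remove "empty" divs, repeating so that nested empty divs collapse too.
  do {
    prev = out;
    out = out.replace(EMPTY_DIV, '');
  } while (out !== prev);

  // 2. Rename tags per RENAME, then strip any tag that is not allowed.
  //    Stripping removes only the tag itself; its text content is kept.
  //    All attributes are dropped for simplicity.
  return out.replace(ANY_TAG, (match, slash, name) => {
    const tag = name.toLowerCase();
    const target = RENAME.get(tag) || tag;
    return ALLOWED.has(target) ? '<' + slash + target + '>' : '';
  });
}

console.log(sanitize('<div><br></div><div>Text</div>'));
// → <p>Text</p>
```

This would run on the HTML string that tidy hands back. If attributes such as `href` need to survive, or the input cannot be trusted to be well-formed, a proper parser-based approach is the safer choice.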