Multiline attributes with jsdom

Is it possible to get the value of multiline attributes with jsdom? (I use it with Node.js and jQuery.)

The site to scrape includes this HTML:

<li><a data-title="<strong>hello world
this is a test</strong>" href="example.org</strong>">A link</a></li>

Unfortunately, this gets parsed to

<li><a data-title="data-title"><strong>hello world
this is a test</strong>' href="example.org">A link</a></li>

so I cannot extract the title and href attributes, e.g. via jQuery: $("a").attr("data-title").

Any ideas?

Yes, this is a bug in jsdom's parser: it does not use a fully HTML5-compliant parser. Similar bugs are still unresolved:

  1. https://github.com/tmpvar/jsdom/issues/494
  2. https://github.com/tmpvar/jsdom/issues/482

As an alternative, you can try cheerio for scraping.