Parse and Convert Raw HTML String into React Elements or Components
So you have an HTML string like this:
const htmlString = `
  <div data-library="react">
    <h1 data-type="heading">Hello</h1>
    <p>React!</p>
  </div>
`;
And you want to convert it into React elements. Of course, in JSX, if we just replace the backticks (`...`) with parentheses ((...)), we will automatically get the React element object tree assigned to htmlString. But we’re talking about cases where you got a plain/raw HTML string from somewhere and want to convert it into one or more React elements. Let’s talk a bit about rendering raw HTML first, and then we’ll come back to the conversion.
If we only wanted to render the HTML string, we could make use of the dangerouslySetInnerHTML feature.
function Component() {
  const htmlString = '...';
  // dangerouslySetInnerHTML expects an object with an __html key, hence the double braces
  return <div dangerouslySetInnerHTML={{ __html: htmlString }} />;
}
The component above would return and render a React element representing a div with the htmlString inserted into it as its inner HTML. The property used to inject the raw HTML string is named dangerouslySetInnerHTML because injecting HTML directly into your webpage can expose it to XSS (cross-site scripting) attacks. Basically, if the HTML comes from an untrusted source, it could contain malicious JavaScript code that would run in the user’s browser with the ability to wreak havoc.
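To make the risk a bit more concrete, here is a small sketch of the kind of payload that causes trouble. The untrusted string and the escapeHtml helper are illustrative assumptions, not part of React. The helper only approximates what happens when a string is rendered as ordinary text (for example as a child in JSX), where the markup is displayed literally instead of being interpreted. With dangerouslySetInnerHTML no such escaping takes place, so the onerror handler would run.

```javascript
// Hypothetical untrusted input: the image fails to load, so `onerror` fires.
const untrusted = '<img src="x" onerror="alert(document.cookie)">';

// Illustrative helper (not a React API): escape the characters that are
// significant in HTML so the string is displayed as text, not parsed as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml(untrusted));
// prints: &lt;img src=&quot;x&quot; onerror=&quot;alert(document.cookie)&quot;&gt;
```

Escaping like this is only appropriate when you want to show the string as text. When the markup itself has to be rendered, the input needs to be sanitized instead.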
Can we solve the XSS problem? There’s a library called DOMPurify that we could use to sanitize the HTML string before rendering it.
import DOMPurify from 'dompurify';

function Component() {
  const htmlString = '...';
  // Sanitize the untrusted HTML before handing it to React
  return <div dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(htmlString) }} />;
}
There’s one small thing you’ll notice here. When using dangerouslySetInnerHTML, the HTML string inserted into the DOM is always wrapped by the element on which the property is set. In this case that wrapper element is the div. You may have a hard requirement to avoid that wrapper.
If you do have this requirement, or if you want to deal with React element objects instead of strings for whatever reason (getting direct access to the HTML attributes in your JS code is a good one), then we’ll have to use a “parser”. That brings us back to where we started: converting an HTML string into React elements.
We will need a parser that takes the HTML string as input and gives out React elements as output. There are a few third-party packages that we can use. Let’s pick html-react-parser (as it has the highest weekly downloads right now) and look at its usage:
import parse from 'html-react-parser';

function Component() {
  const htmlString = '...';
  const reactElement = parse(htmlString);
  return reactElement;
}
Really easy to use! Will these libraries sanitize the HTML? Check their documentation, but html-react-parser, for instance, does not. Hence, if you need sanitization, feel free to use DOMPurify.sanitize(htmlString) here as well.
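One way to make sure sanitization is never forgotten is to compose the two steps into a single function. The sketch below is only an illustration: makeSafeParser, fakeSanitize and fakeParse are hypothetical names, and the two stand-ins exist purely so the snippet runs without installing anything. In a real project you would pass in DOMPurify's sanitize and the parse function from html-react-parser, as hinted in the comment.

```javascript
// Build a "sanitize first, then parse" function from any sanitizer and parser.
function makeSafeParser(sanitize, parse) {
  return (htmlString) => parse(sanitize(htmlString));
}

// With the real libraries this could be wired up roughly as:
//   const safeParse = makeSafeParser((html) => DOMPurify.sanitize(html), parse);

// Stand-ins so the sketch runs on its own.
// WARNING: this regex is deliberately naive and is NOT real sanitization.
const fakeSanitize = (html) => html.replace(/<script[\s\S]*?<\/script>/gi, '');
// Pretend parser: records the string it was given instead of building elements.
const fakeParse = (html) => ({ parsedFrom: html });

const safeParse = makeSafeParser(fakeSanitize, fakeParse);
console.log(safeParse('<p>Hi</p><script>alert(1)</script>'));
// prints: { parsedFrom: '<p>Hi</p>' }
```

The point of the design is the order of the calls: the parser only ever sees the sanitizer's output, never the raw string.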
The only important piece of code in the example above is the parse function, which does the parsing and conversion and returns a React element (tree). Basically:
parse('<p>Hello, World!</p>');
// returns
React.createElement('p', {}, 'Hello, World!')
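To get a feel for how a parser might arrive at such a call, here is a minimal sketch of the conversion step alone. It assumes the HTML has already been parsed into a simple node tree. The node shape used here is an assumption loosely modelled on what HTML parsers commonly produce, and fakeCreateElement is a stand-in with the same argument order as React.createElement so the snippet runs without React. Real libraries do considerably more, such as mapping attribute names like class to className.

```javascript
// Assumed node shapes:
//   text node:    { type: 'text', data: 'Hello' }
//   element node: { type: 'tag', name: 'p', attribs: {...}, children: [...] }
function toElement(node, createElement) {
  if (node.type === 'text') {
    return node.data; // plain strings are valid children
  }
  // Convert the children first, then create the element that contains them.
  const children = node.children.map((child) => toElement(child, createElement));
  return createElement(node.name, { ...node.attribs }, ...children);
}

// Stand-in for React.createElement(type, props, ...children).
const fakeCreateElement = (type, props, ...children) => ({ type, props, children });

// Hand-written tree for '<p data-type="text">Hello, World!</p>'.
const tree = {
  type: 'tag',
  name: 'p',
  attribs: { 'data-type': 'text' },
  children: [{ type: 'text', data: 'Hello, World!' }],
};

console.log(JSON.stringify(toElement(tree, fakeCreateElement)));
// prints: {"type":"p","props":{"data-type":"text"},"children":["Hello, World!"]}
```

Swapping fakeCreateElement for React.createElement would yield actual React elements from the same tree.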
One of the biggest advantages of using elements via these parsers over raw HTML via dangerouslySetInnerHTML is that the former become a part of the React tree, leading to efficient reconciliation or DOM diffing. Whereas with the latter, React can never update just what is required; it always has to replace the entire HTML to flush changes into the DOM.
We’ve learnt the following today:
- How to inject raw HTML into the DOM by using the dangerouslySetInnerHTML special property.
- How to sanitize raw HTML with DOMPurify.
- How to convert raw HTML into React elements with third-party libraries like html-react-parser.