htmlToSlate

All output.json file content on this page is generated with the htmlToSlate serializer.

Default

By default, htmlToSlate incorporates transformation rules based on the example in HTML | Serializing | Slate.


import { htmlToSlate } from '@slate-serializers/html'

const html = `<h1>Heading 1</h1><p>Paragraph 1</p>`

const serializedToSlate = htmlToSlate(html)

output.json
[
  {
    "type": "h1",
    "children": [
      {
        "text": "Heading 1"
      }
    ]
  },
  {
    "type": "p",
    "children": [
      {
        "text": "Paragraph 1"
      }
    ]
  }
]

Configuration

Slate JS has a schema-less core. It makes few assumptions about the schema of the data you will be transforming. See Principles | Introduction | Slate.

As a result, it is likely that you will need to create your own configuration file that implements your schema.

Starting point

Payload CMS

If you are using Slate Rich Text in Payload CMS, a dedicated configuration file is available. See htmlToSlate: Payload CMS configuration.

Options

textTags

Define transform functions for HTML formatting elements.

In the following example, the strong and i HTML tags are already mapped by the default configuration. The custom configuration adds mappings for sub and time.


import {
  htmlToSlate,
  htmlToSlateConfig,
  HtmlToSlateConfig,
} from '@slate-serializers/html';
import { getAttributeValue } from 'domutils';

const html = `<p><strong>I am bold text</strong> whereas <sub><strong><i>I am subscript italic bold text</i></strong></sub>.</p><p>Published: <time datetime="2016-01-20">20 January 2016</time></p>`

const config: HtmlToSlateConfig = {
  // Start from the default configuration
  ...htmlToSlateConfig,
  textTags: {
    // Keep the default text tag mappings (strong, i, and so on)
    ...htmlToSlateConfig.textTags,
    sub: () => ({ subscript: true }),
    time: (el) => ({
      ...(el && {
        datetime: getAttributeValue(el, 'datetime'),
      }),
      time: true,
    }),
  },
};

const serializedToSlate = htmlToSlate(html, config)

output.json
[
  {
    "type": "p",
    "children": [
      {
        "text": "I am bold text",
        "bold": true
      },
      {
        "text": " whereas "
      },
      {
        "text": "I am subscript italic bold text",
        "subscript": true,
        "bold": true,
        "italic": true
      },
      {
        "text": "."
      }
    ]
  },
  {
    "type": "p",
    "children": [
      {
        "text": "Published: "
      },
      {
        "text": "20 January 2016",
        "datetime": "2016-01-20",
        "time": true
      }
    ]
  }
]

elementTags

Map HTML element tags to Slate JSON nodes.


import {
  htmlToSlate,
  HtmlToSlateConfig,
  htmlToSlateConfig,
} from '@slate-serializers/html';
import { getAttributeValue } from 'domutils';

const html = `<article id="main"><h1>Heading 1</h1><p>Paragraph 1</p><h2>Lists</h2><ul><li>Unordered list item 1</li><li>Unordered list item 2</li></ul><ol><li>Ordered list item 1</li><li>Ordered list item 2</li></ol><h2>Quotes</h2><blockquote>Quote</blockquote></article>`

const config: HtmlToSlateConfig = {
  ...htmlToSlateConfig,
  elementTags: {
    ...htmlToSlateConfig.elementTags,
    article: (el) => ({
      ...(el && {
        id: getAttributeValue(el, 'id'),
      }),
      type: 'article',
    }),
  },
};

const serializedToSlate = htmlToSlate(html, config)

output.json
[
  {
    "id": "main",
    "type": "article",
    "children": [
      {
        "type": "h1",
        "children": [
          {
            "text": "Heading 1"
          }
        ]
      },
      {
        "type": "p",
        "children": [
          {
            "text": "Paragraph 1"
          }
        ]
      },
      {
        "type": "h2",
        "children": [
          {
            "text": "Lists"
          }
        ]
      },
      {
        "type": "ul",
        "children": [
          {
            "type": "li",
            "children": [
              {
                "text": "Unordered list item 1"
              }
            ]
          },
          {
            "type": "li",
            "children": [
              {
                "text": "Unordered list item 2"
              }
            ]
          }
        ]
      },
      {
        "type": "ol",
        "children": [
          {
            "type": "li",
            "children": [
              {
                "text": "Ordered list item 1"
              }
            ]
          },
          {
            "type": "li",
            "children": [
              {
                "text": "Ordered list item 2"
              }
            ]
          }
        ]
      },
      {
        "type": "h2",
        "children": [
          {
            "text": "Quotes"
          }
        ]
      },
      {
        "type": "blockquote",
        "children": [
          {
            "text": "Quote"
          }
        ]
      }
    ]
  }
]

textTags vs elementTags

Use elementTags transform functions for HTML element tags that structure content, for example h1, h2, and div.

Use textTags transform functions for HTML element tags that define the inline meaning, structure, or style of content, for example strong, abbr, and sub.

Note how textTags are combined to represent inline meaning/structure/style whereas elementTags always create new Slate nodes.

import {
  htmlToSlate,
  HtmlToSlateConfig,
  htmlToSlateConfig,
} from '@slate-serializers/html';
import { getAttributeValue } from 'domutils';

const html = `<article id="main"><p>Published: <time datetime="2016-01-20"><strong>20 January 2016</strong></time></p></article>`

const config: HtmlToSlateConfig = {
  ...htmlToSlateConfig,
  textTags: {
    ...htmlToSlateConfig.textTags,
    time: (el) => ({
      ...(el && {
        datetime: getAttributeValue(el, 'datetime'),
      }),
      time: true,
    }),
  },
  elementTags: {
    ...htmlToSlateConfig.elementTags,
    article: (el) => ({
      ...(el && {
        id: getAttributeValue(el, 'id'),
      }),
      type: 'article',
    }),
  },
};

const serializedToSlate = htmlToSlate(html, config)

output.json
[
  {
    "id": "main",
    "type": "article",
    "children": [
      {
        "type": "p",
        "children": [
          {
            "text": "Published: "
          },
          {
            "text": "20 January 2016",
            "datetime": "2016-01-20",
            "time": true,
            "bold": true
          }
        ]
      }
    ]
  }
]
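
The difference between the two option groups can be sketched with a toy model. The code below is not the library's implementation: MiniNode, textTagsSketch, elementTagsSketch and convertMini are hypothetical names, and attribute handling is left out. It only illustrates the idea that text tags contribute marks that accumulate on the text leaf, while element tags open a new nested node.

```typescript
// Toy model only: none of these names come from @slate-serializers/html.
type MiniNode =
  | { kind: 'text'; value: string }
  | { kind: 'tag'; name: string; children: MiniNode[] };

type Marks = { [mark: string]: boolean | string };
type SlateLeaf = { text: string; [mark: string]: boolean | string };
type SlateElement = { type: string; children: SlateOut[] };
type SlateOut = SlateLeaf | SlateElement;

// Text tags return marks that are merged onto the leaf.
const textTagsSketch: { [tag: string]: (() => Marks) | undefined } = {
  strong: () => ({ bold: true }),
  time: () => ({ time: true }),
};

// Element tags return the skeleton of a new Slate node.
const elementTagsSketch: { [tag: string]: (() => { type: string }) | undefined } = {
  article: () => ({ type: 'article' }),
  p: () => ({ type: 'p' }),
};

function convertMini(node: MiniNode, marks: Marks = {}): SlateOut[] {
  if (node.kind === 'text') {
    // A leaf carries every mark collected from its text-tag ancestors.
    return [{ text: node.value, ...marks }];
  }
  const toElement = elementTagsSketch[node.name];
  const toMarks = textTagsSketch[node.name];
  const childMarks = toMarks ? { ...marks, ...toMarks() } : marks;
  let children: SlateOut[] = [];
  for (const child of node.children) {
    children = children.concat(convertMini(child, childMarks));
  }
  // Element tags wrap their children in a new node; text tags do not.
  return toElement ? [{ ...toElement(), children }] : children;
}

const miniTree: MiniNode = {
  kind: 'tag',
  name: 'article',
  children: [
    {
      kind: 'tag',
      name: 'p',
      children: [
        { kind: 'text', value: 'Published: ' },
        {
          kind: 'tag',
          name: 'time',
          children: [
            {
              kind: 'tag',
              name: 'strong',
              children: [{ kind: 'text', value: '20 January 2016' }],
            },
          ],
        },
      ],
    },
  ],
};

const miniResult = convertMini(miniTree);
console.log(JSON.stringify(miniResult, null, 2));
```

In this model the time and strong tags produce a single leaf that carries both marks, whereas article and p each produce their own node, which mirrors the shape of the output above.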

elementAttributeTransform

Apply attribute transformations to every node.


import {
  htmlToSlate,
  HtmlToSlateConfig,
  htmlToSlateConfig,
} from '@slate-serializers/html';
import { getAttributeValue } from 'domutils';

const html = `<h1 id="h1-1">Heading 1</h1><p id="p-1">Paragraph 1</p><h2 id="h2-1">Lists</h2><ul id="ul-1"><li id="li-1">Unordered list item 1</li><li id="li-2">Unordered list item 2</li></ul><ol id="ol-1"><li id="li-3">Ordered list item 1</li><li id="li-4">Ordered list item 2</li></ol><h2 id="h2-2">Quotes</h2><blockquote id="blockquote-1">Quote</blockquote>`

const config: HtmlToSlateConfig = {
  ...htmlToSlateConfig,
  elementAttributeTransform: ({ el }) => {
    // Copy the id attribute, when present, onto the Slate node
    const attribs: { [key: string]: string } = {};
    const id = getAttributeValue(el, 'id');
    if (id) {
      attribs['id'] = id;
    }
    return attribs;
  },
};

const serializedToSlate = htmlToSlate(html, config)

output.json
[
  {
    "type": "h1",
    "id": "h1-1",
    "children": [
      {
        "text": "Heading 1"
      }
    ]
  },
  {
    "type": "p",
    "id": "p-1",
    "children": [
      {
        "text": "Paragraph 1"
      }
    ]
  },
  {
    "type": "h2",
    "id": "h2-1",
    "children": [
      {
        "text": "Lists"
      }
    ]
  },
  {
    "type": "ul",
    "id": "ul-1",
    "children": [
      {
        "type": "li",
        "id": "li-1",
        "children": [
          {
            "text": "Unordered list item 1"
          }
        ]
      },
      {
        "type": "li",
        "id": "li-2",
        "children": [
          {
            "text": "Unordered list item 2"
          }
        ]
      }
    ]
  },
  {
    "type": "ol",
    "id": "ol-1",
    "children": [
      {
        "type": "li",
        "id": "li-3",
        "children": [
          {
            "text": "Ordered list item 1"
          }
        ]
      },
      {
        "type": "li",
        "id": "li-4",
        "children": [
          {
            "text": "Ordered list item 2"
          }
        ]
      }
    ]
  },
  {
    "type": "h2",
    "id": "h2-2",
    "children": [
      {
        "text": "Quotes"
      }
    ]
  },
  {
    "type": "blockquote",
    "id": "blockquote-1",
    "children": [
      {
        "text": "Quote"
      }
    ]
  }
]

htmlUpdaterMap

Manipulate or transform your HTML before serialization.

This powerful feature lets you hook into the DOM object created by htmlparser2 and manipulate it with utilities such as domutils.

htmlPreProcessString

Perform any operations on the HTML string before it is parsed into the DOM. This is the first operation to run.

String operations are not ideal, but may be necessary in some cases.
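
As a minimal sketch, assuming this option accepts a function that takes the raw HTML string and returns the modified string (check the package's type definitions to confirm), a pre-processing step might look like the following. The function name preProcessHtml and the specific replacements are illustrative only.

```typescript
// Hypothetical pre-processing function; the name and the replacements
// are examples and are not taken from the library.
const preProcessHtml = (html: string): string =>
  html
    // Drop HTML comments so they never reach the parser.
    .replace(/<!--[\s\S]*?-->/g, '')
    // Replace non-breaking space entities with ordinary spaces.
    .replace(/&nbsp;/g, ' ');

const cleanedHtml = preProcessHtml('<p>Hello&nbsp;world<!-- draft note --></p>');
console.log(cleanedHtml);
```

Under that assumption, the function would be supplied as the htmlPreProcessString property of a configuration spread from htmlToSlateConfig, in the same way as the options shown earlier.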

filterWhitespaceNodes

Remove any Slate JSON nodes that have no type or content. For example:

{
  "children": []
}

These nodes may appear after processing whitespace.
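
The effect described above can be illustrated with a small standalone helper. This is a sketch of the idea rather than the library's implementation: SketchNode, isEmptyNode and dropEmptyNodes are hypothetical names, and "empty" is assumed here to mean a node with no type, no text and no children.

```typescript
// Illustrative types and helpers; not part of @slate-serializers/html.
type SketchNode = { type?: string; text?: string; children?: SketchNode[] };

// Assumption: a node is "empty" when it has no type, no text and no children.
const isEmptyNode = (node: SketchNode): boolean =>
  node.type === undefined &&
  node.text === undefined &&
  (node.children === undefined || node.children.length === 0);

// Remove empty nodes at every level of the tree.
function dropEmptyNodes(nodes: SketchNode[]): SketchNode[] {
  return nodes
    .filter((node) => !isEmptyNode(node))
    .map((node) =>
      node.children ? { ...node, children: dropEmptyNodes(node.children) } : node
    );
}

const filteredNodes = dropEmptyNodes([
  { children: [] },
  { type: 'p', children: [{ text: 'Paragraph 1' }] },
]);
console.log(JSON.stringify(filteredNodes));
```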

convertBrToLineBreak

Convert <br> HTML element tags to Slate nodes with empty content or \n as appropriate.

Default: true.

trimWhiteSpace

Extra whitespace is valid in HTML and will often be reduced to a single space or removed when the HTML is rendered. By default, htmlToSlate will apply such whitespace reduction rules to Slate node values.

Default: true.
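
To give a feel for the kind of reduction involved, the sketch below collapses runs of whitespace into a single space, roughly as a browser does when rendering inline text. It is an approximation for illustration only: collapseWhitespace is a hypothetical helper, the library's actual rules may differ, and removal of leading or trailing whitespace inside block elements is not modelled.

```typescript
// Illustrative approximation of HTML whitespace collapsing;
// not the library's implementation.
const collapseWhitespace = (text: string): string =>
  text.replace(/[ \t\r\n\f]+/g, ' ');

const collapsedText = collapseWhitespace('Paragraph   1\n    continues\there');
console.log(collapsedText);
```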