ToolAct

HTML Formatting Tool

What is HTML Formatting?

HTML formatting reorganizes messy or compressed markup into readable code with consistent indentation, line breaks, and visible tag hierarchy. It makes review, debugging, teaching, handoff, and maintenance easier, especially when looking for unclosed tags, incorrect nesting, long attribute lists, or generated HTML that is hard to scan. For production assets, the opposite operation can also be useful: minification removes unnecessary whitespace, comments, and breaks to reduce transfer size. This tool helps move between human-readable and compact HTML, but formatting does not automatically fix semantic structure, accessibility issues, broken links, unsafe embedded content, or framework-specific template mistakes. Those still need targeted validation and review.

How to Use

  1. Paste or enter HTML code in the left input box
  2. Select indent size and wrap line length
  3. Click "Format" to beautify code, or click "Minify" to reduce size
  4. View formatted code on the right (with syntax highlighting)
  5. Click "Copy" to copy to clipboard

Options Description

Indent Size: Choose 2 spaces, 4 spaces, or Tab indentation
Wrap Line Length: Auto-wrap lines that exceed the specified number of characters; choose "No Limit" to keep the original line lengths

Keyboard Shortcuts

  • Ctrl + Enter: Format
  • Ctrl + Shift + C: Copy result

Formatting Tips

  • Formatting can reveal broken nesting or suspicious inline scripts, but it does not sanitize HTML or make unsafe markup safe.
  • Minify only copies intended for delivery. Keep readable source markup for review, diffs, and accessibility checks.

Use Cases

Format pasted HTML with embedded style and script blocks

Paste markup and choose 2 spaces, 4 spaces, or tabs. The formatter removes comments, preserves pre content, and formats extracted style and script blocks before putting them back into the HTML structure.

Minify a small HTML snippet for embedding

Switch to minify mode to collapse whitespace between tags and remove comments before copying or downloading a compact HTML fragment. Parse errors include a line and column hint when possible, so a missing closing tag from a CMS export can be fixed without re-opening the original editor. Run the result through a project linter or html-validate pipeline before embedding it in production, since formatting does not sanitize attributes, inline scripts, or unsafe href values.

Use it for reviewable snippets, not full build validation

The implementation is intentionally lightweight and token-based, so it is best suited to copied snippets, email fragments, and CMS exports. Complex templates, especially those using Vue, Angular, or server-rendered partials with custom elements, should still go through the project's own formatter or parser. Accessibility checks should run separately on the rendered DOM with axe-core or Lighthouse rather than on the formatted source.

Format generated HTML from a CMS export or email builder

Paste exported markup from a CMS, newsletter tool, or email builder and re-indent it to surface unclosed divs, mis-nested tables, or stray script tags before sending or shipping the campaign. Pair the formatted output with a separate accessibility check on the rendered DOM: the formatter only re-indents and does not fix missing alt text, contrast, or ARIA roles.

Toggle between formatted and minified for diff comparison

Switch the same fragment between format and minify modes to compare the human-readable version side by side with the production-minified version when investigating a layout or attribute regression. The two outputs come from the same parser, so any difference can be attributed to whitespace, comments, or attribute ordering, not to two separate formatters disagreeing about the DOM.

Technical Principle

A spec-compliant HTML formatter follows the WHATWG HTML Standard parsing model. Bytes are decoded (typically as UTF-8, as determined by a BOM or the meta charset), the tokenizer steps through the data, RCDATA, RAWTEXT, script-data, and attribute states defined in §13.2.5, and the tree-construction stage builds a DOM using the insertion-mode state machine (initial, before html, in head, in body, in table, in select, and so on). The parser is forgiving by design: an unclosed `<p>` is auto-closed when a block element follows, a stray `</tr>` is ignored, and content misplaced inside a `<table>` is foster-parented. Browsers expose this parser through `new DOMParser().parseFromString(src, 'text/html')`; Node-side tooling uses parse5 (a WHATWG-compliant parser), htmlparser2, or cheerio.

Reprinting walks the resulting tree and serializes each node according to its element category. Void elements - `area`, `base`, `br`, `col`, `embed`, `hr`, `img`, `input`, `link`, `meta`, `source`, `track`, `wbr` - never receive a closing tag in HTML (the XHTML-style self-closing `/>` is permitted but optional). Block-level containers (`div`, `section`, `article`, `header`, `footer`, `nav`, `main`, `ul`, `ol`, `li`, `table`) get their own line, indented by depth × indent width. Phrasing (inline) content (`span`, `a`, `strong`, `em`, `code`) is kept inline because whitespace between inline nodes is rendered: under CSS `white-space: normal` a run of whitespace collapses to a single space, so adding or removing a line break there can change the visible output.

Some content must be preserved verbatim. `<pre>` and `<textarea>` render their whitespace as written, and the parser drops a single newline immediately after the start tag. `<style>` bodies are tokenized as RAWTEXT and `<script>` bodies in the script-data state, so their contents are not HTML, and re-indenting them as markup could change runtime behavior; they are usually delegated to a JS or CSS sub-formatter and re-embedded.

Attribute serialization normalizes quoting (single vs. double) and re-encodes embedded delimiters with the named character references `&amp;`, `&lt;`, `&gt;`, `&quot;` and the numeric reference `&#39;`. Attribute order is not semantically significant, so some formatters apply a stable order (`id`, `class`, then alphabetical). Wrapping kicks in past a configured print width: attributes are split one per line, with the closing `>` either on its own line (Prettier's `bracketSameLine: false`) or attached to the last attribute.

Minification reverses the process: whitespace between block tags is removed, comments are stripped except for IE conditional comments (`<!--[if`), and quotes are dropped from attributes whose value matches `[A-Za-z0-9._-]+`. Both operations are O(n) in the input length.
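
The attribute rules described above (quoting, re-encoding of delimiters, optional stable ordering, and unquoting during minification) can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names `serialize_attr` and `order_attrs` are hypothetical, and the sketch assumes double quotes are the chosen quote style.

```python
import re

# Conservative pattern for values that are safe to leave unquoted when minifying.
SAFE_UNQUOTED = re.compile(r"[A-Za-z0-9._-]+")


def serialize_attr(name, value=None, minify=False):
    """Serialize one attribute (hypothetical helper, double-quote style assumed)."""
    if value is None:
        return name  # boolean attribute such as `required`
    if minify and SAFE_UNQUOTED.fullmatch(value):
        return f"{name}={value}"  # quotes dropped only for safe values
    # Escape `&` first so the `&` introduced by `&quot;` is not escaped twice.
    escaped = value.replace("&", "&amp;").replace('"', "&quot;")
    return f'{name}="{escaped}"'


def order_attrs(attrs):
    """Stable order: id, then class, then everything else alphabetically."""
    priority = {"id": 0, "class": 1}
    return sorted(attrs, key=lambda pair: (priority.get(pair[0], 2), pair[0]))


if __name__ == "__main__":
    attrs = [("type", "email"), ("class", "field"), ("id", "user-email")]
    print(" ".join(serialize_attr(n, v) for n, v in order_attrs(attrs)))
```

Reordering is optional in practice; a formatter that wants minimal diffs may prefer to keep the author's attribute order.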

  • Parser: WHATWG HTML tree construction (§13) via `new DOMParser().parseFromString(src, 'text/html')` in the browser, parse5/htmlparser2/cheerio on Node; the insertion-mode state machine handles missing or misplaced tags and foster parenting.
  • Void elements (no end tag): area, base, br, col, embed, hr, img, input, link, meta, source, track, wbr - 13 elements per the HTML Living Standard. XHTML self-closing `/>` is tolerated but redundant.
  • Whitespace-preserving elements: `<pre>` and `<textarea>` render whitespace as written (`<pre>` defaults to CSS `white-space: pre`), and the parser drops one leading newline; their interior text is never re-indented, because that would change the visible output.
  • Raw-text contexts: `<style>` (RAWTEXT) and `<script>` (script-data) bodies are not HTML - they are delegated to JS/CSS sub-formatters (or left verbatim). Re-indenting text inside a multi-line string or template literal would change runtime behavior.
  • Attribute normalization: quote style picked (single/double), embedded delimiters re-encoded with the four named references (`&amp;`, `&lt;`, `&gt;`, `&quot;`) plus the numeric `&#39;`, and a stable order optionally applied (id, class, then alphabetical) since attribute order is not semantically significant.
  • Wrapping past printWidth (typically 80/100/120): attributes split one-per-line with the closing `>` either on its own line or hugging the last attribute, mirroring Prettier's `bracketSameLine` option.
  • Minification: collapses inter-tag whitespace, strips comments (keeping `<!--[if IE]>` conditionals), removes optional end tags (`</li>`, `</p>` when followed by a sibling), and unquotes attributes matching `[A-Za-z0-9._-]+`; round-trip is O(n).
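
The minification rules listed above can be approximated with a single regular-expression pass. The sketch below is an assumption-laden illustration rather than the tool's implementation: the `minify` function name is hypothetical, it works on text instead of a parsed tree, it does not remove optional end tags or unquote attributes, and it removes all whitespace between tags, which can change rendering when the neighbouring tags are inline elements.

```python
import re

# One pass over the input; alternatives are tried in order at each position.
TOKEN = re.compile(
    r"(?P<keep><(?P<tag>pre|textarea|script|style)\b.*?</(?P=tag)\s*>)"  # verbatim blocks
    r"|(?P<cond><!--\[if.*?-->)"      # IE conditional comments are kept
    r"|(?P<comment><!--.*?-->)"       # ordinary comments are dropped
    r"|(?P<gap>(?<=>)\s+(?=<))",      # whitespace sitting between two tags
    re.DOTALL | re.IGNORECASE,
)


def minify(html):
    """Naive text-level minifier (hypothetical sketch, not a full HTML parser)."""

    def replace(match):
        if match.group("keep") or match.group("cond"):
            return match.group(0)  # preserve the matched text byte for byte
        return ""  # comment or inter-tag whitespace

    return TOKEN.sub(replace, html).strip()


if __name__ == "__main__":
    sample = "<div>\n  <!-- note -->\n  <pre>  keep\n  this</pre>\n</div>"
    print(minify(sample))
```

Because `pre`, `textarea`, `script`, and `style` blocks are matched first and returned unchanged, their interior whitespace survives the pass.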

Examples

Minified HTML → formatted (2-space indent)

Input:
<div class="card"><h2>Title</h2><p>Body text</p></div>

Formatted:
<div class="card">
  <h2>Title</h2>
  <p>Body text</p>
</div>
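
A re-indentation like the one above can be reproduced with Python's standard `html.parser` module. This is a rough sketch under stated assumptions, not the tool's code: the names `TreeBuilder`, `render`, and `format_html` are hypothetical, comments and doctypes are dropped, every element that contains child elements is treated as a block, and the input is assumed to contain no `pre`, `textarea`, `script`, or `style` content.

```python
import html
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Builds a small tree of dicts (elements) and strings (text)."""

    def __init__(self):
        super().__init__()  # character references arrive decoded in handle_data
        self.root = {"tag": None, "open": "", "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "open": self.get_starttag_text(), "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID:  # void elements never become the current parent
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append(data.strip())


def render(node, depth, indent, out):
    pad = indent * depth
    if isinstance(node, str):
        out.append(pad + html.escape(node, quote=False))
    elif node["tag"] in VOID:
        out.append(pad + node["open"])
    elif all(isinstance(child, str) for child in node["children"]):
        text = "".join(html.escape(c, quote=False) for c in node["children"])
        out.append(f"{pad}{node['open']}{text}</{node['tag']}>")  # text-only: one line
    else:
        out.append(pad + node["open"])
        for child in node["children"]:
            render(child, depth + 1, indent, out)
        out.append(f"{pad}</{node['tag']}>")


def format_html(src, indent="  "):
    builder = TreeBuilder()
    builder.feed(src)
    builder.close()
    out = []
    for child in builder.root["children"]:
        render(child, 0, indent, out)
    return "\n".join(out)


if __name__ == "__main__":
    print(format_html('<div class="card"><h2>Title</h2><p>Body text</p></div>'))
```

Passing `indent="    "` or `indent="\t"` corresponds to the 4-space and Tab options.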

Format → minify (production deploy)

Source (101 bytes with LF line endings):
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/about">About</a></li>
  </ul>
</nav>

Minified (84 bytes, -17%):
<nav><ul><li><a href="/">Home</a></li><li><a href="/about">About</a></li></ul></nav>
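
Byte counts and the percentage saved are easy to check for any snippet. The sketch below measures the two versions of the `nav` example, assuming UTF-8 encoding, LF line endings, and no trailing newline; `byte_size` and `saving_percent` are hypothetical helper names.

```python
FORMATTED = (
    "<nav>\n"
    "  <ul>\n"
    '    <li><a href="/">Home</a></li>\n'
    '    <li><a href="/about">About</a></li>\n'
    "  </ul>\n"
    "</nav>"
)
MINIFIED = (
    '<nav><ul><li><a href="/">Home</a></li>'
    '<li><a href="/about">About</a></li></ul></nav>'
)


def byte_size(text):
    """Size of the text once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def saving_percent(before, after):
    """Reduction relative to the original size, rounded to a whole percent."""
    return round(100 * (before - after) / before)


if __name__ == "__main__":
    before, after = byte_size(FORMATTED), byte_size(MINIFIED)
    print(before, after, f"-{saving_percent(before, after)}%")
```

On a fragment this small the saving comes only from indentation and line breaks; once the response is compressed with gzip or Brotli, the difference is usually smaller still.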

Long attribute list wrapped onto multiple lines

Input:
<input type="email" id="user-email" name="email" placeholder="you@example.com" required autocomplete="email">

Formatted (wrap > 80 chars):
<input
  type="email"
  id="user-email"
  name="email"
  placeholder="you@example.com"
  required
  autocomplete="email"
>
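
The wrapping decision shown above comes down to a length check on the one-line form of the start tag. A minimal sketch, assuming attribute values are already escaped and using the hypothetical name `wrap_start_tag`; the `bracket_same_line` flag is modelled on the option of the same name in Prettier:

```python
def wrap_start_tag(tag, attrs, print_width=80, indent="  ", bracket_same_line=False):
    """Return the start tag on one line if it fits, else one attribute per line.

    `attrs` is a list of (name, value) pairs; a value of None marks a boolean
    attribute. Values are assumed to be escaped already.
    """
    parts = [name if value is None else f'{name}="{value}"' for name, value in attrs]
    one_line = "<" + " ".join([tag] + parts) + ">"
    if len(one_line) <= print_width:
        return one_line
    lines = [f"<{tag}"] + [indent + part for part in parts]
    if bracket_same_line:
        lines[-1] += ">"   # closing bracket hugs the last attribute
    else:
        lines.append(">")  # closing bracket on its own line
    return "\n".join(lines)


if __name__ == "__main__":
    attrs = [
        ("type", "email"), ("id", "user-email"), ("name", "email"),
        ("placeholder", "you@example.com"), ("required", None),
        ("autocomplete", "email"),
    ]
    print(wrap_start_tag("input", attrs))
```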

Self-closing tags and inline content preserved

Input:
<p>Click <a href="/docs">here</a> for help.<br><img src="icon.png" alt=""></p>

Formatted (inline siblings stay on one line):
<p>
  Click <a href="/docs">here</a> for help.
  <br>
  <img src="icon.png" alt="">
</p>

FAQ

What does it format?

It formats HTML markup: indentation, attribute placement, empty-line handling, and (optionally) line wrapping at a chosen column width. Formatting is meant to leave the rendered output unchanged: the page looks the same, and only the source becomes more readable.

Will it preserve my <pre>, <textarea>, and <script> blocks?

Yes. Whitespace inside pre and textarea blocks is meaningful (it shows up rendered), and script content is code rather than markup, so the formatter does not re-indent these blocks as HTML. Other elements get normalized whitespace.

Why are my self-closing tags rewritten?

<br/>, <img/>, <hr/> in HTML5 are equivalent to <br>, <img>, <hr> (HTML5 doesn't require the slash). The formatter may pick one style consistently. For XHTML/JSX/Vue templates, configure the formatter to keep the trailing slash.

Does it validate HTML?

No. It formats whatever you paste. To validate, use the W3C Validator or a browser's developer tools. Common issues like unclosed tags or mismatched quotes pass through formatting.
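
For a quick pre-check before running a real validator, a stack-based scan can flag unclosed or stray tags with a line and column hint. The sketch below uses Python's standard `html.parser`; the names `NestingChecker` and `check_nesting` are hypothetical. It is deliberately stricter than the HTML Standard, which allows end tags such as `</li>` and `</p>` to be omitted, so treat its output as hints rather than validation results.

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Tracks open elements and records nesting problems."""

    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, line, column) for each open element
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, *self.getpos()))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        line, col = self.getpos()
        if tag not in [open_tag for open_tag, _, _ in self.stack]:
            self.problems.append(f"{line}:{col} stray </{tag}>")
            return
        # Pop until the matching element; anything popped on the way was unclosed.
        while self.stack:
            open_tag, open_line, open_col = self.stack.pop()
            if open_tag == tag:
                break
            self.problems.append(
                f"{open_line}:{open_col} <{open_tag}> not closed before </{tag}>"
            )


def check_nesting(src):
    """Return a list of 'line:column message' strings (column is 0-based)."""
    checker = NestingChecker()
    checker.feed(src)
    checker.close()
    for tag, line, col in checker.stack:
        checker.problems.append(f"{line}:{col} <{tag}> never closed")
    return checker.problems


if __name__ == "__main__":
    for problem in check_nesting("<div>\n  <p>text\n</div>"):
        print(problem)
```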

Is the HTML uploaded?

No. Formatting runs in your browser. Pasted HTML never crosses the network.

Can it minify HTML?

Yes. Click "Minify" to collapse whitespace between tags and remove comments. More aggressive minifiers can also collapse runs of spaces inside text and strip optional closing tags. The result is harder to read but smaller - useful for production builds.

What about embedded CSS and JS?

<style> and <script> contents are usually formatted with their own (CSS, JS) formatters when the page supports it. Otherwise their inner content is left as-is to avoid breaking syntax.