Image/link attributes containing `]`, `)`, or spaces produce broken Markdown output

`convert_img`, `convert_a`, and `convert_video` can emit Markdown that downstream parsers do not parse back into the original image or link. The cause is that markdownify drops raw values into Markdown link/image syntax without escaping them for that context. For `img` and `video` that means attribute-backed values like `alt`, `src`, and `poster`; for `a` and `video` it also includes generated label text inside `[...]`.

Confirmed on `1.2.2` and current `develop` (at the time of filing, `markdownify/__init__.py` was byte-identical on both).

## Reproducer

```python
from markdownify import markdownify as md

md('<img src="/a" alt="]">')
# Output:   '![]](/a)'
# Expected: '![\\]](/a)' (image preserved)
# Re-parse: renders as literal text, image destroyed

md('<img src="/a b" alt="x">')
# Output:   '![x](/a b)'
# Expected: '![x](</a b>)' or URL-encoded
# Re-parse: literal text in 3 of 4 parsers

md('<img src="/safe" alt="](http://attacker)">')
# Output:   '![](http://attacker)](/safe)'
# Re-parse: <img src="http://attacker" alt=""/>](/safe)
#            attacker-controlled URL substituted, original destination left as trailing literal text

md('<a href="/a)b">click</a>')
# Output:   '[click](/a)b)'
# Re-parse: <a href="/a">click</a>b)
```

I re-ran those outputs through Python-Markdown, Mistune, commonmark.py, and markdown-it-py. The delimiter-truncation and URL-substitution cases break in all four. The space-in-destination cases are accepted by Python-Markdown but rendered as literal text by the three CommonMark parsers.

That matches [CommonMark §6.3 (links)](https://spec.commonmark.org/0.31.2/#links) and [§6.4 (images)](https://spec.commonmark.org/0.31.2/#images): brackets in labels need to be escaped or balanced, raw destinations cannot contain spaces unless they are written as `<...>`, and an unescaped `)` closes an unbalanced destination early.
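As a rough illustration of those rules, the two checks below flag labels and raw destinations that a CommonMark parser would cut short. `label_brackets_balanced` and `raw_destination_ok` are hypothetical helper names, not part of markdownify, and the checks are deliberately simplified: they ignore code spans, autolinks, and entity references, which the spec treats specially.

```python
import string


def _is_escape(s: str, i: int) -> bool:
    # A backslash only escapes when followed by ASCII punctuation.
    return s[i] == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation


def label_brackets_balanced(label: str) -> bool:
    """True if every unescaped '[' and ']' in the label is part of a pair."""
    depth = 0
    i = 0
    while i < len(label):
        if _is_escape(label, i):
            i += 2  # skip the backslash and the escaped character
            continue
        if label[i] == "[":
            depth += 1
        elif label[i] == "]":
            depth -= 1
            if depth < 0:
                return False  # this ']' would close the label early
        i += 1
    return depth == 0


def raw_destination_ok(dest: str) -> bool:
    """True if dest can be emitted inside (...) without the <...> form."""
    if dest.startswith("<"):
        return False
    depth = 0
    i = 0
    while i < len(dest):
        if _is_escape(dest, i):
            i += 2
            continue
        ch = dest[i]
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False  # spaces and control characters are not allowed
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # this ')' would close the destination early
        i += 1
    return depth == 0
```

Under these checks, the `alt` and `src`/`href` values from the reproducer (`]`, `](http://attacker)`, `/a b`, `/a)b`) are all rejected, while `/a` and a balanced `/a(b)` pass.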

`escape_misc=True` is not a full workaround. It does not help for attribute-backed fields (`alt`, `src`, and `title` on `img`; `href` on `a`; `src` and `poster` on `video`), because those values bypass `escape()`. It does help when the broken piece is generated label text. For example, `<a href="link">text]</a>` becomes `[text\]](link)` with `escape_misc=True`.

`convert_img` is the clearest example: it pulls attributes directly from `el.attrs` and returns `![%s](%s%s)` without routing `alt` or `src` through `escape()`.

## Failing input patterns

The confirmed input shapes so far are unbalanced `[` or `]` in `alt`, `)` or a space in `src` or `href`, and `](...)` appearing in `alt` or link text. The last case is the URL-substitution variant: the embedded URL becomes the parsed destination and the original `src`/`href` is left behind as trailing literal text.

## Security note

The `](http://...)` case is the one I would call out separately because it can substitute an attacker-controlled URL into the parsed Markdown output. That seems relevant for any pipeline that treats markdownify output as a trusted source of destinations, including HTML-to-Markdown storage flows or LLM ingest pipelines. I am not filing this as a CVE; I just want the behavior on record.

I have not tested sanitizer behavior here, so I am not making a stronger mitigation claim in this issue body.

## Affected functions

Affected code paths include `convert_img` for raw `src`, `alt`, and `title`; `convert_a` for raw `href` plus the surrounding `[...]` around generated link text; and `convert_video` for raw `src`, `poster`, fallback `<source src>`, and generated label text. The existing `title.replace('"', r'\"')` in `convert_img` is a partial version of the kind of context-aware escaping that is needed here.

## Fix shape

If you want a PR, my preference would be a shared escape layer for Markdown labels, destinations, and titles, applied anywhere markdownify emits link/image syntax. A narrower delimiter-by-delimiter patch would fix the immediate repros, but it would keep the escaping rules fragmented across emitters and make this class of bug easy to reintroduce.
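To make the shared-layer idea concrete, here is a minimal sketch of what such helpers could look like for raw attribute values. The names `escape_label`, `escape_destination`, `escape_title`, and `render_image` are hypothetical and do not exist in markdownify today. The sketch assumes the inputs are plain attribute strings; generated label text from `convert_a` is already Markdown and would need a separate pass that does not double-escape existing backslashes.

```python
import re


def escape_label(text: str) -> str:
    """Escape backslashes and square brackets in a raw value used inside [...]."""
    return re.sub(r"([\\\[\]])", r"\\\1", text)


def escape_destination(url: str) -> str:
    """Return a raw URL in a form that is safe inside (...)."""
    # Conservative: anything with whitespace, parens, angle brackets,
    # backslashes, or control characters goes into the <...> form.
    if not re.search(r"[\s<>()\\\x00-\x1f\x7f]", url):
        return url
    body = re.sub(r"([\\<>])", r"\\\1", url)
    # The <...> form cannot contain line endings, so percent-encode them.
    body = body.replace("\n", "%0A").replace("\r", "%0D")
    return "<" + body + ">"


def escape_title(title: str) -> str:
    """Wrap a raw title in double quotes, escaping backslashes and quotes."""
    return '"' + re.sub(r'([\\"])', r"\\\1", title) + '"'


def render_image(alt: str, src: str, title: str = "") -> str:
    """Assemble image syntax from raw attribute values (illustrative only)."""
    title_part = " " + escape_title(title) if title else ""
    return "![%s](%s%s)" % (escape_label(alt), escape_destination(src), title_part)


print(render_image("]", "/a"))
print(render_image("x", "/a b"))
print(render_image("](http://attacker)", "/safe"))
```

With these helpers the first two reproducer inputs come out as `![\]](/a)` and `![x](</a b>)`, and the `](http://attacker)` payload stays inside the label as `![\](http://attacker)](/safe)`. Wrapping every destination that contains parentheses is stricter than CommonMark requires for balanced pairs, but it keeps the rule simple.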

