fix: ignore legacy manual notes during sync#60

Merged
TheZupZup merged 1 commit into main from claude/ignore-legacy-markdown-notes-XBj7W
May 8, 2026
@TheZupZup
Owner

Summary

Stops the WebDAV sync engine from re-importing legacy / hand-edited
Markdown files (no NexaNote frontmatter id) on every pull, which was
producing duplicate notes and __Md… title artifacts derived from
synthesised slugs.

  • New nexanote/sync/sync_state.py — per-data_dir JSON registry
    tracking adopted remote paths (with the local note id they map to)
    and ignored remote paths (with a reason and timestamps). Loaded at
    engine init, saved at the end of every sync() even on errors.
  • _pull_note now resolves a remote note in three stages:
    1. by note id (frontmatter id matches a local note),
    2. by remote_path (we previously adopted this path under another id),
    3. otherwise — if the id looks legacy (md.… synthesised prefix or
      missing) — record an ignore marker and skip silently on every
      later sync. A previously-adopted path whose local note has been
      purged is also ignored, never re-imported.
  • SyncReport.notes_ignored_legacy surfaces the count in the
    diagnostic summary.
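
The registry described above could look roughly like the following. This is a minimal sketch, not the contents of nexanote/sync/sync_state.py: the JSON layout, the `mark_adopted` and `mark_ignored` helpers, and the timestamp field names are assumptions made for illustration. Only `is_ignored`, `touch_ignored`, and the `.nexanote_sync_state.json` file name appear elsewhere in this PR.

```python
import json
import time
from pathlib import Path


class SyncState:
    """Hypothetical registry of adopted and ignored remote paths for one data_dir."""

    FILENAME = ".nexanote_sync_state.json"

    def __init__(self, data_dir):
        self.path = Path(data_dir) / self.FILENAME
        self.adopted = {}  # remote_path -> local note id
        self.ignored = {}  # remote_path -> {"reason", "first_seen", "last_seen"}

    @classmethod
    def load(cls, data_dir):
        state = cls(data_dir)
        try:
            raw = json.loads(state.path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Missing or corrupt file: start from an empty registry.
            return state
        if isinstance(raw, dict):
            if isinstance(raw.get("adopted"), dict):
                state.adopted = raw["adopted"]
            if isinstance(raw.get("ignored"), dict):
                state.ignored = raw["ignored"]
        return state

    def save(self):
        payload = {"adopted": self.adopted, "ignored": self.ignored}
        # Write to a temp file first so a crash cannot leave a half-written registry.
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps(payload, indent=2), encoding="utf-8")
        tmp.replace(self.path)

    def mark_adopted(self, remote_path, local_id):
        self.ignored.pop(remote_path, None)
        self.adopted[remote_path] = local_id

    def mark_ignored(self, remote_path, reason):
        now = time.time()
        self.adopted.pop(remote_path, None)
        self.ignored[remote_path] = {
            "reason": reason, "first_seen": now, "last_seen": now,
        }

    def is_ignored(self, remote_path):
        return remote_path in self.ignored

    def touch_ignored(self, remote_path):
        if remote_path in self.ignored:
            self.ignored[remote_path]["last_seen"] = time.time()
```

To get the "saved even on errors" behaviour, the engine would presumably call `save()` from a `finally` block around the sync body.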

Test plan

  • tests/test_sync_legacy_ignore.py — 18 new tests covering:
    • manual .md (no NexaNote id) is ignored, never imported,
    • 3 syncs in a row don't grow the local note count,
    • no __Md… title artifacts leak in,
    • notes with valid frontmatter still adopt normally,
    • mixed remote (real + legacy) imports the real one, ignores the legacy one,
    • registry persists across sessions (load → save → reload),
    • remote_path fallback avoids duplicates when ids change,
    • report summary mentions/omits the legacy count appropriately,
    • SyncState handles missing/corrupt JSON gracefully.
  • Full test suite passes (pytest tests/ -q → 279 passed).

Generated by Claude Code

Legacy or hand-edited Markdown files that lack a NexaNote frontmatter
id were being re-imported on every sync, producing duplicate notes and
weird "__Md…" titles derived from synthesised slugs.

Add a per-data-dir SyncState registry that tracks adopted remote paths
(remote_path → local_id) and ignored remote paths (with reasons). The
pull flow now resolves a remote note in this order:
  1. by note id (frontmatter id matches a local note),
  2. by remote_path (we adopted this path under another id before),
  3. otherwise — if the id looks legacy/synthetic — record an ignore
     marker and skip silently on every later sync.

The SyncReport gains a `notes_ignored_legacy` counter so the diagnostic
summary surfaces how many manual files were skipped.
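
The resolution order can be sketched as a pure function. This is an illustration under stated assumptions, not the code in `_pull_note`: the function name, its arguments, and the `("update" | "import" | "ignore", local_id)` return shape are hypothetical, and plain dicts and sets stand in for the note store and the registry. The `md.` prefix is the value quoted in the review of this PR.

```python
PLAIN_MD_ID_PREFIX = "md."  # synthesised-id prefix quoted in the review


def looks_legacy(note_id):
    """True when the id is missing or carries the synthesised prefix."""
    return not note_id or note_id.startswith(PLAIN_MD_ID_PREFIX)


def resolve_remote_note(note_id, remote_path, local_ids, adopted, ignored):
    """Decide what to do with one remote note (hypothetical helper).

    local_ids: set of ids present in the local store
    adopted:   dict remote_path -> local id (mutated on import)
    ignored:   dict remote_path -> reason   (mutated on ignore)
    """
    # Stage 1: the frontmatter id matches a local note.
    if note_id and note_id in local_ids:
        return ("update", note_id)

    # Stage 2: this path was adopted earlier, possibly under another id.
    if remote_path in adopted:
        local_id = adopted[remote_path]
        if local_id in local_ids:
            return ("update", local_id)
        # The adopted note was purged locally: ignore, never re-import.
        ignored[remote_path] = "previously adopted note was purged locally"
        return ("ignore", None)

    # Stage 3: unknown path with a legacy-looking id is recorded and skipped.
    if looks_legacy(note_id):
        ignored[remote_path] = (
            "no NexaNote frontmatter id; legacy/manual Markdown file"
        )
        return ("ignore", None)

    # Anything else is a genuinely new note with a valid id.
    adopted[remote_path] = note_id
    return ("import", note_id)
```

Keeping the decision separate from the I/O in this way makes each stage testable without a WebDAV server.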
@TheZupZup TheZupZup marked this pull request as ready for review May 8, 2026 19:28
@TheZupZup TheZupZup merged commit fc9afc3 into main May 8, 2026
1 check passed
@TheZupZup TheZupZup deleted the claude/ignore-legacy-markdown-notes-XBj7W branch May 8, 2026 19:29

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f18bc6b274

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread nexanote/sync/client.py
Comment on lines +773 to +775
    if local_note is None and legacy_id:
        reason = (
            "no NexaNote frontmatter id; legacy/manual Markdown file"
P1: Do not drop first-time pull for all md. note ids

The new legacy guard treats every id starting with md. as non-importable, so a client can never pull those notes on first sync even when they were produced by NexaNote itself (plain-markdown notes are synthesized with PLAIN_MD_ID_PREFIX = "md." in FileNoteStore and are pushed with that id). In a multi-device setup, device A can upload such a note, but device B will always skip it at this branch and never materialize the note locally, which is a functional sync regression rather than just duplicate prevention.
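
The scenario can be reproduced with a small sketch. Both guard functions and the metadata dicts below are hypothetical. `prefix_only_guard` models the behaviour the comment describes. `explicit_id_guard` is one possible alternative, assuming the remote metadata of a NexaNote-pushed note carries its `md.` id explicitly while a hand-written file carries none. Whether that alternative still prevents the duplicates this PR targets is not established here.

```python
PLAIN_MD_ID_PREFIX = "md."  # prefix named in the comment above


def prefix_only_guard(remote_id):
    """Models the reviewed guard: a missing or md.-prefixed id is legacy."""
    return not remote_id or remote_id.startswith(PLAIN_MD_ID_PREFIX)


def explicit_id_guard(remote_meta):
    """Hypothetical alternative: legacy only when the remote has no id at all."""
    return not remote_meta.get("id")


# Device A pushes a plain-markdown note together with its synthesised id.
pushed_by_device_a = {"id": "md.shopping-list", "title": "Shopping list"}
# A hand-written file that never went through NexaNote has no id.
manual_file = {"title": "scratch"}

# On device B's first sync the prefix-only guard skips the note from device A.
skipped_on_device_b = prefix_only_guard(pushed_by_device_a["id"])
```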

Comment thread nexanote/sync/client.py
Comment on lines +709 to +712
    if self.sync_state.is_ignored(remote_path):
        self.sync_state.touch_ignored(remote_path)
        report.notes_ignored_legacy += 1
        return
P2: Re-evaluate ignored paths before permanently skipping pull

This early return makes ignore decisions sticky forever for a given remote_path: once a path is marked ignored, pull never fetches note.json again, so it cannot recover if that remote folder is later fixed to include a valid NexaNote id. In practice, a user who migrates or repairs a previously legacy/manual note at the same path will still be skipped indefinitely unless they manually delete .nexanote_sync_state.json.
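
One way to make the ignore marker non-sticky is to store a change fingerprint with it and skip only while the fingerprint is unchanged. The sketch below is an assumption, not something this PR implements: the `etag` field on the ignore entry and the `should_skip_ignored` helper are hypothetical, and the fingerprint is assumed to come from the WebDAV `getetag` property returned by the listing request.

```python
def should_skip_ignored(ignored, remote_path, remote_etag):
    """Return True only if the path is ignored AND unchanged since then.

    ignored: dict remote_path -> {"reason": str, "etag": str or None}
    When the fingerprint differs (or is unknown), the marker is dropped so
    the normal pull flow fetches and re-evaluates the note.
    """
    entry = ignored.get(remote_path)
    if entry is None:
        return False
    if remote_etag is not None and entry.get("etag") == remote_etag:
        return True
    # Remote content changed, or nothing to compare against: re-evaluate.
    del ignored[remote_path]
    return False
```

With this shape, a repaired note at the same path gets a fresh evaluation on the next sync, while untouched legacy files keep the cheap early return.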
