Commit 4c474e0
fix(connectors): contentDeferred pattern + validation fixes across all connectors (#3793)
* fix(knowledge): enqueue connector docs per-batch to survive sync timeouts
* fix(connectors): convert all connectors to contentDeferred pattern and fix validation issues
All 10 connectors now use contentDeferred: true in listDocuments, returning
lightweight metadata stubs instead of downloading content during listing.
Content is fetched lazily via getDocument only for new/changed documents,
preventing Trigger.dev task timeouts on large syncs.
Connector-specific fixes from validation audit:
- Google Drive: metadata-based contentHash, orderBy for deterministic pagination,
precise maxFiles, byte-length size check with truncation warning
- OneDrive: metadata-based contentHash, orderBy for deterministic pagination
- SharePoint: metadata-based contentHash, byte-length size check
- Dropbox: metadata-based contentHash using content_hash field
- Notion: code/equation block extraction, empty page fallback to title,
reduced CHILD_PAGE_CONCURRENCY to 5, syncContext parameter
- Confluence: syncContext caching for cloudId, reduced label concurrency to 5
- Gmail: use joinTagArray for label tags
- Obsidian: syncRunId-based stub hash for forced re-fetch, mtime-based hash
in getDocument, .trim() on vaultUrl, lightweight validateConfig
- Evernote: retryOptions threaded through apiFindNotesMetadata and apiGetNote
- GitHub: added contentDeferred: false to getDocument, syncContext parameter
Infrastructure:
- sync-engine: added syncRunId to syncContext for Obsidian change detection
- confluence/utils: replaced raw fetch with fetchWithRetry, added retryOptions
- oauth: added supportsRefreshTokenRotation: false for Dropbox
- Updated add-connector and validate-connector skills with contentDeferred docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(connectors): address PR review comments - metadata merge, retryOptions, UTF-8 safety
- Sync engine: merge metadata from getDocument during deferred hydration,
so Gmail/Obsidian/Confluence tags and metadata survive the stub→full transition
- Evernote: pass retryOptions {retries:3, backoff:500} from listDocuments and
getDocument callers into apiFindNotesMetadata and apiGetNote
- Google Drive + SharePoint: safe UTF-8 truncation that walks back to the last
complete character boundary instead of splitting multi-byte chars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(evernote): use correct RetryOptions property names
maxRetries/initialDelayMs instead of retries/backoff to match the
RetryOptions interface from lib/knowledge/documents/utils.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(sync-engine): merge title from getDocument and skip unchanged docs after hydration
- Merge title from getDocument during deferred hydration so Gmail
documents get the email Subject header instead of the snippet text
- After hydration, compare the hydrated contentHash against the stored
DB hash — if they match, skip the update. This prevents Obsidian
(and any connector with a force-refresh stub hash) from re-uploading
and re-processing unchanged documents every sync
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(sync-engine): dedup externalIds, enable deletion reconciliation, merge sourceUrl
Three sync engine gaps identified during audit:
1. Duplicate externalId guard: if a connector returns the same externalId
across pages (pagination overlap), skip the second occurrence to prevent
unique constraint violations on add and double-uploads on update.
2. Deletion reconciliation: previously required explicit fullSync or
syncMode='full', meaning docs deleted from the source accumulated in
the KB forever. Now runs on all non-incremental syncs (which return
ALL docs). Includes a safety threshold: if >50% of existing docs
(and >5 docs) would be deleted, skip and warn — protects against
partial listing failures. Explicit fullSync bypasses the threshold.
3. sourceUrl merge: hydration now picks up sourceUrl from getDocument,
falling back to the stub's sourceUrl if getDocument doesn't set one.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* lint
* fix(connectors): confluence version metadata fallback and google drive maxFiles guard
- Confluence: use `version?.number` directly (undefined) in metadata instead
of `?? ''` (empty string) to prevent Number('') = 0 passing NaN check in
mapTags. Hash still uses `?? ''` for string interpolation.
- Google Drive: add early return when previouslyFetched >= maxFiles to prevent
effectivePageSize <= 0 which violates the API's pageSize requirement (1-1000).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(connectors): blogpost labels and capped listing deletion reconciliation
- Confluence: fetchLabelsForPages now tries both /pages/{id}/labels and
/blogposts/{id}/labels, preventing label loss when getDocument hydrates
blogpost content (previously returned empty labels on 404).
- Sync engine: skip deletion reconciliation when listing was capped
(maxFiles/maxThreads). Connectors signal this via syncContext.listingCapped.
Prevents incorrect deletion of docs beyond the cap that still exist in source.
fullSync override still forces deletion for explicit cleanup.
- Google Drive & Gmail: set syncContext.listingCapped = true when cap is hit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(connectors): set syncContext.listingCapped in all connectors with caps
OneDrive, Dropbox, SharePoint, Confluence (v2 + CQL), and Notion (3 listing
functions) now set syncContext.listingCapped = true when their respective
maxFiles/maxPages limit is hit. Without this, the sync engine's deletion
reconciliation would run against an incomplete listing and incorrectly
hard-delete documents that exist in the source but fell outside the cap window.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(evernote): thread retryOptions through apiListTags and apiListNotebooks
All calls to apiListTags and apiListNotebooks in both listDocuments and
getDocument now pass retryOptions for consistent retry protection across
all Thrift RPC calls.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>1 parent b0980b1 commit 4c474e0
File tree
15 files changed
+571
-509
lines changed- .agents/skills
- add-connector
- validate-connector
- apps/sim
- connectors
- confluence
- dropbox
- evernote
- github
- gmail
- google-drive
- notion
- obsidian
- onedrive
- lib
- knowledge/connectors
- oauth
- tools/confluence
15 files changed
+571
-509
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
71 | 71 | | |
72 | 72 | | |
73 | 73 | | |
74 | | - | |
| 74 | + | |
| 75 | + | |
75 | 76 | | |
76 | 77 | | |
77 | 78 | | |
78 | 79 | | |
79 | | - | |
| 80 | + | |
| 81 | + | |
80 | 82 | | |
81 | 83 | | |
82 | 84 | | |
| |||
281 | 283 | | |
282 | 284 | | |
283 | 285 | | |
284 | | - | |
| 286 | + | |
| 287 | + | |
285 | 288 | | |
286 | | - | |
| 289 | + | |
287 | 290 | | |
288 | 291 | | |
289 | 292 | | |
290 | 293 | | |
291 | 294 | | |
292 | | - | |
| 295 | + | |
293 | 296 | | |
294 | | - | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
295 | 323 | | |
296 | 324 | | |
297 | | - | |
298 | | - | |
299 | | - | |
300 | | - | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | + | |
| 352 | + | |
| 353 | + | |
| 354 | + | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
| 358 | + | |
| 359 | + | |
| 360 | + | |
| 361 | + | |
| 362 | + | |
| 363 | + | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
| 367 | + | |
| 368 | + | |
| 369 | + | |
| 370 | + | |
| 371 | + | |
| 372 | + | |
| 373 | + | |
| 374 | + | |
| 375 | + | |
| 376 | + | |
| 377 | + | |
| 378 | + | |
| 379 | + | |
301 | 380 | | |
302 | 381 | | |
303 | 382 | | |
| 383 | + | |
| 384 | + | |
| 385 | + | |
| 386 | + | |
| 387 | + | |
| 388 | + | |
| 389 | + | |
304 | 390 | | |
305 | 391 | | |
306 | 392 | | |
| |||
409 | 495 | | |
410 | 496 | | |
411 | 497 | | |
412 | | - | |
| 498 | + | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
413 | 502 | | |
414 | 503 | | |
415 | 504 | | |
| |||
425 | 514 | | |
426 | 515 | | |
427 | 516 | | |
428 | | - | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
429 | 520 | | |
430 | 521 | | |
431 | 522 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
141 | 141 | | |
142 | 142 | | |
143 | 143 | | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
144 | 156 | | |
145 | 157 | | |
146 | 158 | | |
147 | 159 | | |
148 | 160 | | |
149 | | - | |
| 161 | + | |
150 | 162 | | |
151 | 163 | | |
152 | 164 | | |
| |||
200 | 212 | | |
201 | 213 | | |
202 | 214 | | |
| 215 | + | |
| 216 | + | |
203 | 217 | | |
204 | 218 | | |
205 | 219 | | |
| |||
253 | 267 | | |
254 | 268 | | |
255 | 269 | | |
| 270 | + | |
| 271 | + | |
256 | 272 | | |
257 | 273 | | |
258 | 274 | | |
| |||
300 | 316 | | |
301 | 317 | | |
302 | 318 | | |
| 319 | + | |
303 | 320 | | |
304 | 321 | | |
305 | 322 | | |
| |||
0 commit comments