Problem Statement
For non–URL-mapped HTML pages, site search currently overwrites the title parsed from rendered HTML with page.getTitle() (page properties). Legacy site search (StaticHTMLPageBundler / older ESSiteSearchPublisher) indexed Tika metadata from the rendered file, so the indexed title effectively followed the HTML <title> (as produced by template/theme/SEO, e.g. dotSeo). URL-mapped pages follow a different path where metadata title is not replaced the same way.
Customer impact: Partners/clients built thousands of pages assuming site search matched rendered <title>, which often differs from the CMS Title field (theme/SEO). They cannot migrate all page titles. This blocks expected search result titles.
Customer request: Raise with product; add configuration (default false) to optionally populate site search title from parsed <title> / HTML metadata for backward compatibility.
Steps to Reproduce
- Create or use an HTML page that is not URL-mapped.
- Set Page Title (page properties) to a distinct value, e.g. CMS_TITLE_FOR_SEARCH_TEST.
- Ensure the published page’s HTML has a different <title> in its <head>, e.g. HTML_TITLE_IN_HEAD_ONLY (set via theme, html_head.vtl, dotSeo, or hardcoded in the template).
- Publish the page and confirm in View Source on the live URL that the markup contains <title>HTML_TITLE_IN_HEAD_ONLY</title> while the CMS Title remains CMS_TITLE_FOR_SEARCH_TEST.
- Run Site Search indexing (full or incremental as appropriate) for that site/index.
- Search in Site Search for content unique to that page.
- Actual result: The hit title in site search is CMS_TITLE_FOR_SEARCH_TEST (page properties), not HTML_TITLE_IN_HEAD_ONLY.
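The View Source check in the steps above can also be done programmatically. The following is a minimal sketch, not dotCMS code: the class and method names are hypothetical, and it uses a plain regular expression rather than the Tika-based parsing that site search itself relies on, so it is only meant for confirming what the rendered <title> is.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the reproduction steps; not part of dotCMS.
public final class RenderedTitleCheck {

    // Matches the first <title>...</title> element, ignoring case and spanning newlines.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed <title> text, or empty when it is absent or blank.
    public static Optional<String> extractTitle(String html) {
        if (html == null) {
            return Optional.empty();
        }
        Matcher m = TITLE.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        String title = m.group(1).trim();
        return title.isEmpty() ? Optional.empty() : Optional.of(title);
    }

    public static void main(String[] args) {
        String rendered = "<html><head><title>HTML_TITLE_IN_HEAD_ONLY</title></head><body></body></html>";
        // Compare this value with the CMS Title and with the title shown on the search hit.
        System.out.println(extractTitle(rendered).orElse("(no title)"));
    }
}
```

If the extracted value differs from the title on the search hit, the page reproduces the reported behaviour.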
Acceptance Criteria
- Today (documented as current product behaviour): Site search uses page properties (page.getTitle()) for the indexed title on normal HTML pages, so the steps above currently yield the CMS Title in results.
- Desired / backward-compatible behaviour (customer & partner ask):
  - Default: Keep current behaviour so existing installs do not change unexpectedly.
  - With opt-in config (e.g. false by default): For normal HTML pages, the site search title should match the rendered HTML <title> when present (same source as legacy: metadata/Tika on the bundled HTML file), falling back to page.getTitle() only when that parsed title is missing or empty.
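The requested selection rule could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual publisher code: the class name, method name, and the `useHtmlTitle` flag are hypothetical, and how the flag would be read from configuration is left open because the ticket does not name a property.

```java
// Hypothetical sketch of the desired title selection; not the dotCMS implementation.
public final class SiteSearchTitleSketch {

    /**
     * @param useHtmlTitle    opt-in flag (assumed; would default to false)
     * @param parsedHtmlTitle title parsed from the rendered HTML metadata, may be null
     * @param pageTitle       value of page.getTitle() from page properties
     */
    public static String resolveTitle(boolean useHtmlTitle, String parsedHtmlTitle, String pageTitle) {
        // Default: keep current behaviour and index the page-properties title.
        if (!useHtmlTitle) {
            return pageTitle;
        }
        // Opt-in: fall back to the page title only when the parsed title is missing or empty.
        if (parsedHtmlTitle == null || parsedHtmlTitle.trim().isEmpty()) {
            return pageTitle;
        }
        return parsedHtmlTitle.trim();
    }

    public static void main(String[] args) {
        String cms = "CMS_TITLE_FOR_SEARCH_TEST";
        String html = "HTML_TITLE_IN_HEAD_ONLY";
        System.out.println(resolveTitle(false, html, cms)); // flag off: page-properties title
        System.out.println(resolveTitle(true, html, cms));  // flag on: rendered <title>
        System.out.println(resolveTitle(true, "  ", cms));  // flag on, blank <title>: fallback
    }
}
```

Keeping the flag check first means the default path never reads the parsed title, so existing installs would index exactly what they index today.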
dotCMS Version
latest
Severity
High - Major functionality broken
Links
https://helpdesk.dotcms.com/a/tickets/36008