This is a command line tool to convert the contents of a Confluence space into a MediaWiki import data format. See also the official BlueSpice Helpdesk entry.
The migrate-confluence tool is available as a Docker image.
- Create an export of your Confluence space
- Save it to a location that is accessible by this tool (e.g. `/tmp/confluence/input/Confluence-export.zip`)
- Extract the ZIP file (e.g. `/tmp/confluence/input/Confluence-export`)
  - The folder should contain the files `entities.xml` and `exportDescriptor.properties`, as well as the folder `attachments`
- Create the "workspace" directory (e.g. `/tmp/confluence/workspace/`)
- From the parent directory (e.g. `/tmp/`), run the migration commands
  - Run `docker run -v $(pwd)/confluence:/data bluespice/migrate-confluence:latest analyze --src=/data/input --dest=/data/workspace` to create "working files". After the script has run you can check those files and apply changes if required (e.g. when applying structural changes).
  - Run `docker run -v $(pwd)/confluence:/data bluespice/migrate-confluence:latest extract --src=/data/input --dest=/data/workspace` to extract all contents, like wikipage contents, attachments and images, into the workspace
  - Run `docker run -v $(pwd)/confluence:/data bluespice/migrate-confluence:latest convert --src=/data/workspace --dest=/data/workspace` (yes, `--src /data/workspace/`) to convert the wikipage contents from Confluence Storage XML to MediaWiki WikiText. For large spaces, see Parallel convert below.
  - Run `docker run -v $(pwd)/confluence:/data bluespice/migrate-confluence:latest compose --src=/data/workspace --dest=/data/workspace` (yes, `--src /data/workspace/`) to create importable data
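The steps above can be chained in a small shell script. The following is a minimal sketch, not part of the tool: the names `IMAGE`, `BASE_DIR`, `DRY_RUN`, `check_input`, `run_step` and `run_pipeline` are illustrative assumptions. By default it only prints the `docker run` commands so they can be reviewed first.

```shell
# Sketch only: all variable and function names here are illustrative.
IMAGE="bluespice/migrate-confluence:latest"
BASE_DIR="${BASE_DIR:-$(pwd)/confluence}"   # host directory mounted at /data
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print the commands

# Verify that the extracted export contains the expected files and folder.
check_input() {
  [ -f "$1/entities.xml" ] &&
  [ -f "$1/exportDescriptor.properties" ] &&
  [ -d "$1/attachments" ]
}

# Run (or print) one migration command. $1 = command, $2 = value for --src.
run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo docker run -v "$BASE_DIR:/data" "$IMAGE" "$1" "--src=$2" --dest=/data/workspace
  else
    docker run -v "$BASE_DIR:/data" "$IMAGE" "$1" "--src=$2" --dest=/data/workspace
  fi
}

# The four commands in order; "&&" stops the chain at the first failure.
run_pipeline() {
  run_step analyze /data/input &&
  run_step extract /data/input &&
  run_step convert /data/workspace &&
  run_step compose /data/workspace
}
```

With the example paths from above, a real run from `/tmp/` could look like `check_input confluence/input/Confluence-export && DRY_RUN=0 run_pipeline`.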
If you re-run the scripts you will need to clean up the "workspace" directory!
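One way to clean up before a re-run is to delete and recreate the workspace directory. This is a minimal sketch; the function name `reset_workspace` is an assumption, and the guard against an empty path or `/` is a precaution added here, not something the tool requires.

```shell
# Sketch only: remove everything in the workspace and recreate it empty.
reset_workspace() {
  ws="$1"
  # Refuse obviously dangerous targets before calling rm -rf.
  case "$ws" in
    ""|"/") echo "refusing to clear '$ws'" >&2; return 1 ;;
  esac
  rm -rf "$ws"
  mkdir -p "$ws"
}
```

For the example layout this would be called as `reset_workspace /tmp/confluence/workspace`.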
- Copy the "workspace/result" directory (e.g. `/tmp/confluence/workspace/result/`) to your target wiki server (e.g. `/tmp/result`)
- Go to your MediaWiki installation directory
- Make sure you have the target namespaces set up properly. See `workspace/space-id-to-prefix-map.php` for reference.
- Make sure `$wgFileExtensions` is set up properly. See `workspace/attachment-file-extensions.php` for reference.
- Use `php maintenance/importImages.php /tmp/result/images/` to first import all attachment files and images
- Use `php maintenance/importDump.php /tmp/result/pages.xml` to import the actual pages
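The two import commands can be wrapped so that the page import only starts if the file import succeeded. This is a sketch under assumptions: `RESULT_DIR`, `DRY_RUN`, `run_cmd` and `import_result` are hypothetical names, and the script is meant to be run from the MediaWiki installation directory. By default it only prints the commands.

```shell
# Sketch only: names are illustrative; run from the MediaWiki directory.
RESULT_DIR="${RESULT_DIR:-/tmp/result}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

# Run a command, or print it when DRY_RUN is 1.
run_cmd() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

# Files and images first, then the pages, as in the list above.
import_result() {
  run_cmd php maintenance/importImages.php "$RESULT_DIR/images/" &&
  run_cmd php maintenance/importDump.php "$RESULT_DIR/pages.xml"
}
```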
You may need to update your MediaWiki search index afterwards.
It is possible to use a YAML file to configure the commands `analyze`, `extract` and `convert`. For an example, see `/doc/config.sample.yaml`.
The configuration file can be applied by adding the option `--config /data/config.yaml`.
The config file does not have to contain every parameter from `config.sample.yaml`. Any parameter that is omitted falls back to its default.
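Because the container only sees the mounted directory, the config file has to live inside it to be reachable as `/data/config.yaml`. The sketch below assumes the file is stored as `config.yaml` directly in the mounted host directory; the helper name `config_args` is hypothetical.

```shell
# Sketch only: print the --config option when a config file exists in the
# host directory that is mounted at /data ($1), otherwise print nothing.
config_args() {
  if [ -f "$1/config.yaml" ]; then
    echo "--config /data/config.yaml"
  fi
}
```

It could be used as `docker run -v $(pwd)/confluence:/data bluespice/migrate-confluence:latest analyze --src=/data/input --dest=/data/workspace $(config_args "$(pwd)/confluence")`, leaving the substitution unquoted so the option and its value are passed as two arguments.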
For large Confluence spaces the convert step can be slow. You can speed it up by running multiple worker processes in parallel using the `--workers` option.

    docker run -v $(pwd)/confluence:/data bluespice/migrate-confluence:latest convert \
      --src=/data/workspace --dest=/data/workspace \
      --workers=4

The command spawns the requested number of child processes automatically. Each worker handles a disjoint slice of the file list, so every file is converted exactly once. Progress lines are prefixed with `[Worker N]` so you can follow each process individually. If any worker fails, the command exits with a non-zero status and reports which workers were affected.
Choose --workers based on the number of available CPU cores. A value between 2 and 8 is typical; there is no benefit in exceeding the number of cores on your machine.
Note: `--workers=1` (the default) behaves identically to running without the option; no child processes are spawned.
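A worker count can be derived from the number of CPU cores. The following sketch uses the core count but never more than 8; the function name `pick_workers` and the hard cap of 8 are assumptions based on the "typical" range mentioned above, not limits enforced by the tool.

```shell
# Sketch only: choose a value for --workers from the core count.
# $1 = number of cores (optional; detected with getconf when omitted).
pick_workers() {
  cores="${1:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}"
  max=8   # assumed upper bound, taken from the typical range
  if [ "$cores" -gt "$max" ]; then
    echo "$max"
  elif [ "$cores" -lt 1 ]; then
    echo 1
  else
    echo "$cores"
  fi
}
```

The result can then be passed to the convert command as `--workers=$(pick_workers)`.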
The tool is compatible with the MediaWiki extension NSFileRepo (https://www.mediawiki.org/wiki/Extension:NSFileRepo), which restricts access to files and images to a given set of user groups associated with protected namespaces.
If NSFileRepo is used, the images cannot be uploaded with the script `maintenance/importImages.php`. Use `extensions/NSFileRepo/maintenance/importFiles.php` instead.
Example: `php extensions/NSFileRepo/maintenance/importFiles.php /tmp/result/images/`
In Confluence, user spaces are protected. In MediaWiki this is not possible for the namespace `User`. Therefore user spaces are migrated to a namespace `User<username>`, which can be protected in BlueSpice for MediaWiki.
- `AttachmentsSectionEnd`
- `AttachmentsSectionStart`
- `Details`
- `DetailsSummary`
- `Excerpt`
- `ExcerptInclude`
- `Info`
- `InlineComment`
- `Layout`
- `Layouts.css`
- `Note`
- `Panel`
- `RecentlyUpdated`
- `SubpageList`
- `SubpageListRow`
- `Tip`
- `Warning`
- `PageTree`
- `SpaceDetails`
- `ViewFile`
Be aware that those pages may be overwritten by the import if they already exist in the target wiki.
- `Icon-info.svg`
- `Icon-note.svg`
- `Icon-tip.svg`
- `Icon-warning.svg`
Be aware that those files may be overwritten by the import if they already exist in the target wiki.
In case your pages contain a lot of external images (<img /> elements), be aware that MediaWiki does not show them by default. You'd need to configure $wgAllowExternalImages.
Read https://www.mediawiki.org/wiki/Manual:$wgAllowExternalImages for more information.
Confluence pages that contain Jira macros are converted to use MediaWiki interwiki links. Two separate prefixes are used because Jira issue keys and JQL queries have different URL patterns:
| Interwiki prefix | Purpose | Example URL pattern |
|---|---|---|
| `jira` | Link to a specific Jira issue by key | `https://jira.example.com/browse/$1` |
| `jira-jql` | Link to a Jira issue list filtered by JQL | `https://jira.example.com/issues/?jql=$1` |
Add both entries to the interwiki table of your MediaWiki database, or configure them via $wgExtraInterlanguageLinkPrefixes and the interwiki cache. Replace https://jira.example.com with the base URL of your Jira instance.
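To make the `$1` placeholder in the URL patterns concrete, the following sketch expands a pattern for a given link target. It only illustrates the pattern; MediaWiki performs this substitution itself once the interwiki entries exist. The function name `expand_interwiki` and the issue key `ABC-123` are made up for the example.

```shell
# Sketch only: replace the literal placeholder $1 in a URL pattern.
# $1 = URL pattern (must contain the placeholder), $2 = link target.
expand_interwiki() {
  pattern="$1"
  target="$2"
  prefix="${pattern%%\$1*}"   # text before the placeholder
  suffix="${pattern#*\$1}"    # text after the placeholder
  printf '%s%s%s\n' "$prefix" "$target" "$suffix"
}
```

For example, `expand_interwiki 'https://jira.example.com/browse/$1' ABC-123` yields the URL that a wikitext link of the form `[[jira:ABC-123]]` would point to.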
The output generated by the tool contains certain elements that need additional extensions to be enabled.
- TemplateStyles
- ParserFunctions
- DateTimeTools
- Checklists
- SimpleTasks
- EnhancedUploads
- Semantic MediaWiki
- HeaderTabs
- SubPageList
- TableTools
The following extensions are not strictly required but are recommended for full compatibility with the migrated content.
- WikiMarkdown: renders `<markdown>` tags produced from Confluence markdown macros
If the tool cannot migrate some content or functionality, it adds a category to the affected page so you can fix the issues manually after the import:
- `Broken_link`
- `Broken_user_link`
- `Broken_page_link`
- `Broken_image`
- `Broken_layout`
- `Broken_macro/<macro-name>`
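Before importing, it can help to see how many pages ended up in each of these categories. The sketch below counts category references in the generated dump. It assumes that the categories appear as plain `Category:Broken_...` text inside `pages.xml`, which has not been verified against the tool's output; the function name `count_broken` is hypothetical.

```shell
# Sketch only: count occurrences of each Broken_* category in a dump file.
# $1 = path to the dump, e.g. /tmp/result/pages.xml
count_broken() {
  grep -o 'Category:Broken_[^]|]*' "$1" | sort | uniq -c | sort -rn
}
```

The output lists one category per line, most frequent first, with the number of occurrences in front.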
- User identities
- Comments
- Various macros
- Various layouts
- Blog posts
- Files of a space which can not be assigned to a page
- Clone this repo
- Run `composer update --no-dev`
- Run `box compile` to actually create the PHAR file in `dist/`. See also https://github.com/humbug/box
- Reduce multiple linebreaks (`<br />`) to one
- Remove line breaks and arbitrary formatting (e.g. `<b>`) from headings
- Mask external images (`<img />`)
- Preserve filename of "Broken_attachment"
- Merge multiple `<code>` lines into `<pre>`
- Remove bold/italic formatting from wikitext headings (e.g. `=== '''Some heading''' ===`)
- Fix unconverted HTML lists in wikitext (e.g. `<ul><li>==== Lorem ipsum ====</li><li>'''<span class="confluence-link"> </span>[[Media:Some_file.pdf]]'''</li></ul><ul>`)
- Remove empty Confluence storage format fragments (e.g. `<span class="confluence-link"> </span>`, `<span class="no-children icon">`)


