5 changes: 5 additions & 0 deletions package.json
@@ -66,9 +66,14 @@
"test-coverage-ci": "cross-env NODE_ENV=test vitest --run --dir ./test --coverage.enabled=true --coverage.reporter=lcovonly --coverage.reporter=text",
"test:integration": "cross-env NODE_ENV=test vitest --run --config vitest.config.integration.ts",
"test:watch": "cross-env NODE_ENV=test vitest --dir ./test --watch",
"test:migrate": "cross-env NODE_ENV=test vitest --run --dir ./scripts/migrate/test",
"prepare": "node ./scripts/prepare.js",
"lint": "eslint",
"lint:fix": "eslint --fix",
"migrate:urls": "node scripts/migrate/migrate-urls.js",
"migrate:users": "node scripts/migrate/migrate-users.js",
"backup:urls": "node scripts/migrate/backup-urls.js",
"backup:users": "node scripts/migrate/backup-users.js",
"format": "prettier --write \"**/*.{js,jsx,ts,tsx,json,md,yml,yaml,css,scss}\" --ignore-path .gitignore --config ./.prettierrc",
"format:check": "prettier --check \"**/*.{js,jsx,ts,tsx,json,md,yml,yaml,css,scss}\" --ignore-path .gitignore --config ./.prettierrc",
"gen-schema-doc": "node ./scripts/doc-schema.js",
96 changes: 96 additions & 0 deletions scripts/migrate/Migration-guide-v1-to-v2.md
@@ -0,0 +1,96 @@
# Git-Proxy v1.19.2 → v2.0.0 MongoDB migration

Operator prep for upgrade, aligned with [finos/git-proxy#1535](https://github.com/finos/git-proxy/issues/1535#issuecomment-4478956510) (these scripts do **not** replace your own DB backup/snapshot).
**Behavior:** dry-run by default for both phases; normalization is idempotent; email apply skips unchanged rows and checks uniqueness before writes; backups are explicit helper scripts plus your own infra.

| Phase | Scripts | Goal |
| ----- | ------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **1** | `migrate-urls.js`, `backup-urls.js` | **Repo URL normalization** — append `.git` to `repos.url` where missing (idempotent) |
| **2** | `migrate-users.js`, `backup-users.js` | **Email audit** (blocking issues) + optional **CSV apply**; **ACL audit** — list `canPush` / `canAuthorise` entries that do not resolve to any `User.username` (no silent rewrite) |

Env: `MONGO_URI`, `DB_NAME` (see `scripts/migrate/lib/config.js`). Reports: `reports/{date}-migration/`.
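
As a rough illustration of how the scripts could pick up these settings, here is a minimal sketch. `loadConfig` is a hypothetical stand-in, not the contents of `lib/config.js`; the property names `mongoUri`, `dbName`, `reportsDir` and `ensureReportsDir()` match what the scripts in this PR use, while the `YYYY-MM-DD` date format and the error on missing variables are assumptions.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical stand-in for lib/config.js; the real module may differ.
function loadConfig(env = process.env, now = new Date()) {
  const mongoUri = env.MONGO_URI;
  const dbName = env.DB_NAME;
  if (!mongoUri || !dbName) {
    // Assumption: fail fast instead of falling back to defaults.
    throw new Error('MONGO_URI and DB_NAME must both be set');
  }
  // Assumption: {date} in reports/{date}-migration/ is the UTC date as YYYY-MM-DD.
  const date = now.toISOString().slice(0, 10);
  const reportsDir = path.join('reports', `${date}-migration`);
  return {
    mongoUri,
    dbName,
    reportsDir,
    // Create the reports directory (and parents) if it does not exist yet.
    ensureReportsDir() {
      fs.mkdirSync(reportsDir, { recursive: true });
    },
  };
}
```

Passing `env` and `now` as parameters keeps the sketch testable without touching the real environment.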

## npm scripts

```bash
npm run migrate:urls # repo URL normalization — dry-run
npm run backup:urls
npm run migrate:urls -- --apply # apply normalization

npm run migrate:users # email + ACL audit (dry-run)
npm run backup:users
npm run migrate:users -- --apply --csv ./mappings.csv
```

---

## Phase 1 — Repo URL normalization (append `.git` where missing)

**Goal:** every `repos.url` that v2 will match must end in `.git`; the script appends the suffix where it is missing.

**Why:** v2 resolves repos by **exact** `url` via `getRepoByUrl`; v1 often relied on `name`. Incoming git HTTP traffic is normalized to a URL that includes `.git` (see `parseAction`), while legacy `repos` rows may have been stored without it. Those rows no longer match, so processors such as `checkRepoInAuthorisedList` treat the repo as unauthorized. (The admin UI already requires `.git` when creating new repositories.)

| | v1.19.2 | v2.0.0 |
| ------------ | ------------ | ------------------------------------------ |
| Lookup | `name` | `url` (exact `$eq`) |
| `.git` in DB | not required | required for parity with incoming requests |

**Scripts:** `migrate-urls.js`, `backup-urls.js`; helpers under `lib/` (`analyze-urls.js`, `reporting.js`, `common.js`, `config.js`).

```bash
npm run migrate:urls
npm run backup:urls
npm run migrate:urls -- --apply
```

Notes: trailing `/` is normalized (`.../repo/` → `.../repo.git`). Blank/non-http(s) URLs are reported as issues and require manual fixing.
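
The normalization rules above can be sketched as a small pure function. This is an illustration only: `normalizeRepoUrl`, its return shape and the reason strings are hypothetical, and the real logic in `lib/analyze-urls.js` may differ (for example in how it treats surrounding whitespace or query strings, which this string-based sketch does not handle specially).

```javascript
// Hypothetical helper; not the implementation in lib/analyze-urls.js.
function normalizeRepoUrl(rawUrl) {
  const trimmed = (rawUrl ?? '').toString().trim();
  if (!trimmed) return { status: 'issue', reason: 'blank-url' };

  let parsed;
  try {
    parsed = new URL(trimmed);
  } catch {
    // Not parseable as an absolute URL (e.g. scp-style git@host:org/repo).
    return { status: 'issue', reason: 'invalid-url' };
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { status: 'issue', reason: 'non-http-scheme', scheme: parsed.protocol };
  }

  // Drop trailing slashes so ".../repo/" and ".../repo" normalize identically.
  const withoutSlash = trimmed.replace(/\/+$/, '');
  const newUrl = withoutSlash.endsWith('.git') ? withoutSlash : `${withoutSlash}.git`;

  // Idempotent: an already-normalized URL produces no change.
  return newUrl === rawUrl
    ? { status: 'unchanged', url: rawUrl }
    : { status: 'change', oldUrl: rawUrl, newUrl };
}
```

Running the function on its own output returns `unchanged`, which is the property that makes a second `migrate:urls` run report nothing left to normalize.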

Reports: `report-{ts}.yaml`, `report-{ts}.csv` (pending changes), `url-issues-{ts}.csv` (manual fixes), `backup-urls-{ts}.json`.

---

## Phase 2 — User emails & ACL audit

**Goal:** unblock v2 pushes: valid **unique** `users.email` (audit + CSV apply fallback); surface **ACL orphan** `username` strings for manual UI fix.

**migrate-urls vs migrate-users**

| | `migrate-urls.js` | `migrate-users.js` |
| ----------- | ---------------------- | ------------------------------- |
| Apply flags | `--apply` | `--apply` **and** `--csv` |
| Writes | `repos.url` only | `users.email` from CSV only |
| Always | normalization analysis | email audit + ACL orphan report |

`backup-users.js` is separate (not invoked by `migrate-users`) and writes a full JSON snapshot plus a `users-email-*.csv` template.

```bash
npm run migrate:users
npm run backup:users
npm run migrate:users -- --apply --csv ./mappings.csv
```

For **apply** (`migrate-users --apply --csv ...`): CSV header must be `username,email` (`lib/csv.js`). The command exits `1` on blocking email issues, ACL orphans, CSV/apply failures, or when the pre-write duplicate-email simulation finds a conflict.
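
One way to picture the uniqueness check that runs before any write is to apply the CSV mappings in memory and then look for emails owned by more than one user. The sketch below is hypothetical: `simulateEmailApply` and its return shape are illustrative names, and comparing usernames and emails case-insensitively is an assumption, not something this guide confirms.

```javascript
// Hypothetical helper; not the implementation behind migrate-users.js.
function simulateEmailApply(users, mappings) {
  // Assumption: usernames and emails are compared trimmed and lowercased.
  const norm = (v) => (v ?? '').toString().trim().toLowerCase();

  // Start from the emails currently stored, keyed by normalized username.
  const emailByUser = new Map(users.map((u) => [norm(u.username), norm(u.email)]));
  const changes = [];
  const unknownUsers = [];

  for (const { username, email } of mappings) {
    const key = norm(username);
    if (!emailByUser.has(key)) {
      unknownUsers.push(username);
      continue;
    }
    const next = norm(email);
    if (emailByUser.get(key) === next) continue; // unchanged rows are skipped
    changes.push({ username: key, oldEmail: emailByUser.get(key), newEmail: next });
    emailByUser.set(key, next);
  }

  // After applying every change in memory, no email may belong to two users.
  const owners = new Map();
  for (const [username, email] of emailByUser) {
    if (!email) continue;
    owners.set(email, [...(owners.get(email) ?? []), username]);
  }
  const duplicates = [...owners]
    .filter(([, names]) => names.length > 1)
    .map(([email, usernames]) => ({ email, usernames }));

  return { changes, unknownUsers, duplicates, ok: duplicates.length === 0 && unknownUsers.length === 0 };
}
```

Because nothing is written until `ok` is true, a conflicting mapping can be corrected in the CSV and the command simply re-run.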

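CSV input: UTF‑8, one row per line, only those two columns. The parser is minimal: quoted commas are OK, but **`""`** escapes inside fields are not supported. Prefer export without a BOM. The template written by `backup-users.js` escapes embedded quotes as `""`, so check any value containing `"` by hand before applying.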
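
To make the parser limits concrete, here is a minimal sketch of a parser with the same documented behaviour: commas inside quotes are kept, and `""` is not treated as an escape. `parseCsvLine` and `parseUserEmailCsv` are hypothetical names, and the actual `lib/csv.js` may be stricter or more lenient.

```javascript
// Hypothetical parser mirroring the documented limits; lib/csv.js may differ.
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (const ch of line) {
    if (ch === '"') {
      // Quotes only toggle state; "" is NOT interpreted as an escaped quote.
      inQuotes = !inQuotes;
    } else if (ch === ',' && !inQuotes) {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields.map((f) => f.trim());
}

function parseUserEmailCsv(text) {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== '');
  const [header, ...rows] = lines.map(parseCsvLine);
  // In this sketch a leading BOM would end up in the first header cell and fail this check.
  if (!header || header.join(',') !== 'username,email') {
    throw new Error('CSV header must be username,email');
  }
  return rows.map((cols, i) => {
    if (cols.length !== 2) {
      throw new Error(`Data row ${i + 1}: expected 2 columns, got ${cols.length}`);
    }
    return { username: cols[0], email: cols[1] };
  });
}
```

A fully quoted row such as `"alice","alice@example.com"`, which is the shape `backup-users.js` writes, parses to the unquoted values.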

Extra CSVs when applicable: `users-audit-*.csv`, `acl-orphans-*.csv`, `email-changes-*.csv`.
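
The ACL audit can be illustrated with in-memory data. The sketch below is a simplified re-implementation of the matching rule in `lib/analyze-acl.js` from this PR (usernames are trimmed and lowercased before comparison); `findAclOrphans` and the sample repo and users are made up for the example.

```javascript
// Same normalization as lib/analyze-acl.js in this PR: trim and lowercase.
const normalizeUsername = (v) => (v || '').toString().trim().toLowerCase();

// Simplified illustration; the sample data below is invented.
function findAclOrphans(repos, usernames) {
  const known = new Set(usernames.map(normalizeUsername).filter(Boolean));
  const orphans = [];
  for (const repo of repos) {
    for (const field of ['canPush', 'canAuthorise']) {
      const list = Array.isArray(repo.users?.[field]) ? repo.users[field] : [];
      list.forEach((raw, index) => {
        if (typeof raw !== 'string') return;
        const entry = normalizeUsername(raw);
        // An entry is an orphan when no user has that normalized username.
        if (entry && !known.has(entry)) {
          orphans.push({ repoName: repo.name ?? '', field, orphanUsername: raw, index });
        }
      });
    }
  }
  return orphans;
}

const orphans = findAclOrphans(
  [{ name: 'demo', users: { canPush: ['Alice ', 'ghost'], canAuthorise: ['bob'] } }],
  ['alice', 'bob'],
);
console.log(orphans);
```

Here `'Alice '` still resolves to `alice` after normalization, so only `ghost` is reported; the audit lists it and leaves the fix to the operator.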

---

## Pre-upgrade checklist

```bash
export MONGO_URI="mongodb://host:27017"
export DB_NAME="git_proxy"

# Phase 1 — repo URL normalization
npm run migrate:urls
npm run backup:urls
npm run migrate:urls -- --apply
npm run migrate:urls # expect nothing left to normalize

# Phase 2 — email + ACL (timing vs app upgrade — your runbook)
npm run migrate:users
npm run backup:users
npm run migrate:users -- --apply --csv ./mappings.csv
```
92 changes: 92 additions & 0 deletions scripts/migrate/backup-urls.js
@@ -0,0 +1,92 @@
#!/usr/bin/env node

/**
* Copyright 2026 GitProxy Contributors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/**
* Backup: Create backup of repos that are missing the .git suffix or have URL issues, before migration
*
* BACKUP ONLY
*
* Usage:
* npm run backup:urls
* # or: node scripts/migrate/backup-urls.js
*/

const config = require('./lib/config');
const { analyzeRepos } = require('./lib/analyze-urls');
const { generateReports } = require('./lib/reporting');
const { createBackup } = require('./lib/common');

config.ensureReportsDir();

async function main() {
try {
const { allRepos, report } = await analyzeRepos(config.mongoUri, config.dbName);
const issues = Array.isArray(report.issues) ? report.issues : [];

if (report.reposNeedingUpdate === 0 && issues.length === 0) {
console.log('\n=== BACKUP PHASE ===');
console.log('No repos need migration - backup not necessary');
process.exit(0);
}

console.log('\n=== BACKUP PHASE ===');
const repoById = new Map(allRepos.map((r) => [r._id?.toString?.() ?? String(r._id ?? ''), r]));
const backupData = [];

for (const change of report.changes) {
const repo = repoById.get(change.repoId);
if (!repo) continue;
backupData.push({
...repo,
backupReason: 'missing-dot-git',
normalizedUrl: change.oldUrl,
newUrl: change.newUrl,
});
}

for (const issue of issues) {
const repo = repoById.get(issue.repoId);
if (!repo) continue;
backupData.push({
...repo,
backupReason: 'url-issue',
rawUrl: issue.rawUrl,
normalizedUrl: issue.normalizedUrl,
issueReason: issue.reason,
issueScheme: issue.scheme,
});
}

const backupPath = createBackup(config.reportsDir, 'backup-urls', backupData);
console.log(`SUCCESS Backup created: ${backupPath}`);
console.log(` (${report.reposNeedingUpdate} repos missing .git, ${issues.length} URL issues)`);
console.log('\nBackup completed. Ready to apply migration:');
console.log(' node scripts/migrate/migrate-urls.js --apply');

const timestamp = Date.now();
report.mode = 'backup-only';
generateReports(config.reportsDir, report, timestamp);

process.exit(0);
} catch (error) {
console.error('FATAL ERROR:', error.message);
process.exit(1);
}
}

main();
86 changes: 86 additions & 0 deletions scripts/migrate/backup-users.js
@@ -0,0 +1,86 @@
#!/usr/bin/env node

/**
* Copyright 2026 GitProxy Contributors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/**
* Backup: Create backup of users before email migration
*
* BACKUP ONLY
*
* Outputs:
* - backup-users-*.json (full users snapshot; passwords excluded)
* - users-email-*.csv (username,email template for admin edits + later --apply --csv)
*
* Usage:
* npm run backup:users
* # or: node scripts/migrate/backup-users.js
*/

const { MongoClient } = require('mongodb');
const fs = require('fs');
const path = require('path');

const config = require('./lib/config');
const { generateReports } = require('./lib/reporting');
const { createBackup } = require('./lib/common');

config.ensureReportsDir();

function toCsvValue(v) {
if (v === null || v === undefined) return '""';
const s = String(v).replace(/"/g, '""');
return `"${s}"`;
}

async function main() {
const client = new MongoClient(config.mongoUri);

try {
await client.connect();
const db = client.db(config.dbName);
const usersCollection = db.collection('users');

console.log('\n=== BACKUP USERS PHASE ===');
const users = await usersCollection.find({}).project({ password: 0 }).toArray();
console.log(`Total users in database: ${users.length}`);

const backupPath = createBackup(config.reportsDir, 'backup-users', users);
console.log(`SUCCESS Backup created: ${backupPath}`);

const timestamp = Date.now();

const emailCsvPath = path.join(config.reportsDir, `users-email-${timestamp}.csv`);
const header = ['username', 'email'].join(',') + '\n';
const rows = users
.map((u) => [toCsvValue(u.username ?? ''), toCsvValue(u.email ?? '')].join(','))
.join('\n');
fs.writeFileSync(emailCsvPath, header + rows);
console.log(`SUCCESS CSV template: ${emailCsvPath}`);

const report = { mode: 'backup-users', totalUsers: users.length };
generateReports(config.reportsDir, report, timestamp);

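// Set the exit code rather than calling process.exit(), which would
// terminate the process before the finally block closes the Mongo client.
process.exitCode = 0;
} catch (error) {
console.error('FATAL ERROR:', error.message);
process.exitCode = 1;
} finally {
await client.close().catch(() => {});
}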
}

main();
99 changes: 99 additions & 0 deletions scripts/migrate/lib/analyze-acl.js
@@ -0,0 +1,99 @@
/**
* Copyright 2026 GitProxy Contributors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

const { MongoClient } = require('mongodb');

function normalizeUsername(v) {
return (v || '').toString().trim().toLowerCase();
}

function collectAclOrphans(repos, usernameSet) {
const orphans = [];

for (const repo of repos) {
const repoId = repo._id?.toString?.() ?? String(repo._id ?? '');
const repoName = repo.name ?? '';
const repoUrl = repo.url ?? '';
const usersObj = repo.users ?? {};

for (const field of ['canPush', 'canAuthorise']) {
const list = Array.isArray(usersObj[field]) ? usersObj[field] : [];
for (let i = 0; i < list.length; i++) {
const raw = list[i];
if (typeof raw !== 'string') continue;
const entry = normalizeUsername(raw);
if (!entry) continue;

if (!usernameSet.has(entry)) {
orphans.push({
repoId,
repoName,
repoUrl,
field,
orphanUsername: raw,
normalizedOrphan: entry,
index: i,
});
}
}
}
}

return orphans;
}

async function analyzeAcl(mongoUri, dbName) {
const client = new MongoClient(mongoUri);

try {
await client.connect();
const db = client.db(dbName);
const usersCollection = db.collection('users');
const reposCollection = db.collection('repos');

return await analyzeAclWithCollections(usersCollection, reposCollection);
} finally {
await client.close();
}
}

async function analyzeAclWithCollections(usersCollection, reposCollection) {
console.log('\n=== ACL AUDIT PHASE ===');

const users = await usersCollection.find({}).project({ password: 0 }).toArray();
const usernameSet = new Set(users.map((u) => normalizeUsername(u.username)).filter(Boolean));

const repos = await reposCollection.find({}).toArray();
const orphans = collectAclOrphans(repos, usernameSet);

const report = {
totalRepos: repos.length,
totalUsers: users.length,
orphanCount: orphans.length,
orphans,
};

console.log(`ACL orphan entries: ${report.orphanCount}`);

return { report };
}

module.exports = {
analyzeAcl,
analyzeAclWithCollections,
collectAclOrphans,
normalizeUsername,
};