EPIC-025 — Multi-Source Profile Enrichment
Goal
Extend the existing EPIC-024 enrichment system to support profile enrichment from 5 additional sources: Website URL, Facebook Page, Instagram Profile, YouTube Channel, and Linktree — using the pluggable EnrichmentProviderInterface pattern already in place.
Background
EPIC-024 delivered the core enrichment framework with Apollo (company/person), Hunter/Clearbit (logo), and Demo providers. This epic adds source-based providers that extract profile data from public URLs rather than API lookups by email/domain.
New Source Providers
| # | Source | Provider Class | API / Method | Fields Extracted |
|---|---|---|---|---|
| 1 | Website URL | WebsiteProvider | Crawler (meta tags, OG, JSON-LD, schema.org) | name, description, emails, phones, address, social links, logo, favicon |
| 2 | Facebook Page | FacebookProvider | Facebook Graph API (preferred) / OG fallback | page name, bio, website, phone, address, profile pic, cover photo, category |
| 3 | Instagram Profile | InstagramProvider | Instagram Basic Display API / OG fallback | username, full name, bio, profile pic, website link, follower count |
| 4 | YouTube Channel | YouTubeProvider | YouTube Data API v3 | channel name, description, custom URL, subscriber count, avatar, banner |
| 5 | Linktree | LinktreeProvider | Crawler (structured HTML/JSON) | display name, bio, avatar, all link entries (social, website, payment) |
Architecture
Extends the existing EnrichmentProviderInterface with two new methods:
public function enrichFromUrl(string $url): EnrichmentResult;
public function canHandleUrl(string $url): bool;
The EnrichmentService gains an enrichByUrl(string $url) method that auto-detects the URL type and dispatches to the correct provider.
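The dispatch described above can be sketched roughly as follows. This is a minimal illustration assuming PHP 8.1 or later, not the EPIC-024 code: the `EnrichmentResult` class here is a hypothetical stand-in with invented properties, the host and scheme checks inside `canHandleUrl()` are assumptions, and the provider bodies return placeholder data instead of calling any API or crawler.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in; the real EnrichmentResult comes from EPIC-024.
final class EnrichmentResult
{
    public function __construct(
        public readonly string $provider,
        public readonly array $fields = [],
    ) {}
}

interface EnrichmentProviderInterface
{
    // Existing EPIC-024 methods are omitted from this sketch.
    public function enrichFromUrl(string $url): EnrichmentResult;
    public function canHandleUrl(string $url): bool;
}

// Illustrative provider that claims youtube.com and its subdomains.
final class YouTubeProvider implements EnrichmentProviderInterface
{
    public function canHandleUrl(string $url): bool
    {
        $host = strtolower((string) parse_url($url, PHP_URL_HOST));
        return $host === 'youtube.com' || str_ends_with($host, '.youtube.com');
    }

    public function enrichFromUrl(string $url): EnrichmentResult
    {
        // Placeholder: a real provider would query the YouTube Data API v3 here.
        return new EnrichmentResult('youtube', ['source_url' => $url]);
    }
}

// Illustrative catch-all that accepts any http(s) URL.
final class WebsiteProvider implements EnrichmentProviderInterface
{
    public function canHandleUrl(string $url): bool
    {
        $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
        return in_array($scheme, ['http', 'https'], true);
    }

    public function enrichFromUrl(string $url): EnrichmentResult
    {
        // Placeholder: a real provider would crawl meta tags, OG and JSON-LD here.
        return new EnrichmentResult('website', ['source_url' => $url]);
    }
}

final class EnrichmentService
{
    /** @param EnrichmentProviderInterface[] $providers Ordered most specific first. */
    public function __construct(private array $providers) {}

    public function enrichByUrl(string $url): EnrichmentResult
    {
        // The first provider whose canHandleUrl() returns true wins.
        foreach ($this->providers as $provider) {
            if ($provider->canHandleUrl($url)) {
                return $provider->enrichFromUrl($url);
            }
        }
        throw new InvalidArgumentException("No provider can handle URL: {$url}");
    }
}

$service = new EnrichmentService([new YouTubeProvider(), new WebsiteProvider()]);
echo $service->enrichByUrl('https://www.youtube.com/@example')->provider, PHP_EOL; // youtube
echo $service->enrichByUrl('https://example.com/about')->provider, PHP_EOL;        // website
```

Registration order matters in this sketch: the generic website provider accepts any http(s) URL, so it has to come after the platform-specific providers.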
Stories
| # | Story | GitHub Issue | MAGIK | Priority |
|---|---|---|---|---|
| 1 | Provider Interface Extension: enrichFromUrl() + URL detection | #TBD | MAGIK-934 | P0 |
| 2 | Website URL Provider: Meta/OG/JSON-LD Extraction | #TBD | MAGIK-935 | P0 |
| 3 | Facebook Page Provider: Graph API + OG Fallback | #TBD | MAGIK-936 | P1 |
| 4 | Instagram Profile Provider: Basic Display API + OG Fallback | #TBD | MAGIK-937 | P1 |
| 5 | YouTube Channel Provider: Data API v3 | #TBD | MAGIK-938 | P1 |
| 6 | Linktree Provider: Structured HTML/JSON Crawler | #TBD | MAGIK-939 | P1 |
| 7 | Field Normalization & Deduplication Service | #TBD | MAGIK-940 | P0 |
| 8 | DB Schema: Enrichment Sources + Evidence Table | #TBD | MAGIK-941 | P0 |
| 9 | Review UI: Current vs Suggested, Accept/Reject per Field | #TBD | MAGIK-942 | P0 |
| 10 | Media Integration: S3-Ready Logo/Cover Image Storage | #TBD | MAGIK-943 | P1 |
| 11 | Rate Limiting & Consent Controls | #TBD | MAGIK-944 | P1 |
| 12 | Multi-Tenant Scoping: Parent/Child Feature Toggles | #TBD | MAGIK-945 | P1 |
| 13 | Demo Provider Extension: Fake URL-Based Data | #TBD | MAGIK-946 | P2 |
| 14 | Integration Testing: Multi-Source End-to-End | #TBD | MAGIK-947 | P2 |
Technical Approach
- Extend EnrichmentProviderInterface: add enrichFromUrl() + canHandleUrl() methods; update existing providers with no-op stubs
- Strategy pattern: URL → provider routing via canHandleUrl() chain in EnrichmentService
- Confidence + Evidence: each extracted field carries a confidence score (0.0–1.0) and source evidence (snippet + URL)
- Normalization: E.164 phones via libphonenumber, address component parsing, unique social link dedup
- Draft-only storage: results stored in enrichment_drafts (extended schema); never auto-applied
- Review UI: side-by-side current vs suggested, per-field accept/reject, batch apply with audit log
- Rate limiting: per-provider, per-tenant rate limits via enrichment_rate_limits table
- Consent: checkbox required before enrichment; consent stored in audit log
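To make the confidence, evidence, and social link dedup items above more concrete, here is one possible shape, assuming PHP 8.1 or later. Everything in it is hypothetical: the `ExtractedField` class, the `socialLinkKey()` and `dedupeSocialLinks()` helpers, and the canonicalization rule (ignore scheme, query, and a leading `www.`, compare host and path case-insensitively) are illustrative assumptions, not the planned normalization service.

```php
<?php
declare(strict_types=1);

// Hypothetical value object: one extracted field plus its confidence and evidence.
final class ExtractedField
{
    public function __construct(
        public readonly string $name,
        public readonly string $value,
        public readonly float $confidence,      // expected range: 0.0 to 1.0
        public readonly string $evidenceSnippet,
        public readonly string $evidenceUrl,
    ) {
        if ($confidence < 0.0 || $confidence > 1.0) {
            throw new InvalidArgumentException('confidence must be within 0.0 and 1.0');
        }
    }
}

// Assumed canonical form: lowercase host without "www.", lowercase path
// without trailing slash. Scheme, query string and fragment are ignored.
function socialLinkKey(string $url): string
{
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));
    if (str_starts_with($host, 'www.')) {
        $host = substr($host, 4);
    }
    $path = rtrim((string) parse_url($url, PHP_URL_PATH), '/');
    return $host . strtolower($path);
}

/**
 * Keeps one field per canonical link, preferring the higher confidence.
 *
 * @param ExtractedField[] $fields
 * @return ExtractedField[]
 */
function dedupeSocialLinks(array $fields): array
{
    $best = [];
    foreach ($fields as $field) {
        $key = socialLinkKey($field->value);
        if (!isset($best[$key]) || $field->confidence > $best[$key]->confidence) {
            $best[$key] = $field;
        }
    }
    return array_values($best);
}

$links = [
    new ExtractedField('social', 'https://www.instagram.com/acme/', 0.6, 'footer link', 'https://acme.example/'),
    new ExtractedField('social', 'http://instagram.com/Acme', 0.9, 'JSON-LD sameAs', 'https://acme.example/'),
    new ExtractedField('social', 'https://facebook.com/acme', 0.8, 'og:see_also', 'https://acme.example/'),
];
$unique = dedupeSocialLinks($links);
echo count($unique), PHP_EOL; // 2
```

Lowercasing the path is a simplification that suits profile handles; a platform with case-sensitive paths would need its own rule.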
Scope
- CI4 Portal (app.portalv2)
- All user roles (Admin, Reseller, Org, Employee, Individual) with appropriate access gates
- Extends existing EPIC-024 infrastructure (same tables, same controller, same service)
Acceptance Criteria (Epic-Level)
Security & Compliance
- No automatic profile overwrite — all results are DRAFT
- Consent required before enrichment (GDPR/CCPA alignment)
- Rate limiting prevents abuse and cost overrun
- Crawler providers document robots.txt compliance
- API keys stored in .env, never in code
- PII fields encrypted at rest in draft table
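As an illustration of the rate limiting criterion above, the sketch below shows a fixed-window limiter keyed by tenant and provider, assuming PHP 8.0 or later. It is an assumption-laden example: the `FixedWindowRateLimiter` class is hypothetical, the counters live in memory where the epic calls for the enrichment_rate_limits table, and the limit of 2 requests per 60 seconds is an arbitrary value chosen for the demo.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory limiter; a real one would persist its counters.
final class FixedWindowRateLimiter
{
    /** @var array<string, array{windowStart: int, count: int}> */
    private array $counters = [];

    public function __construct(
        private int $maxRequests,
        private int $windowSeconds,
    ) {}

    // Returns true and records the request if the tenant/provider pair
    // is still under its limit for the current window.
    public function allow(int $tenantId, string $provider, ?int $now = null): bool
    {
        $now ??= time();
        $key = $tenantId . ':' . $provider;
        // Align the window to a multiple of windowSeconds.
        $windowStart = $now - ($now % $this->windowSeconds);

        if (!isset($this->counters[$key]) || $this->counters[$key]['windowStart'] !== $windowStart) {
            $this->counters[$key] = ['windowStart' => $windowStart, 'count' => 0];
        }
        if ($this->counters[$key]['count'] >= $this->maxRequests) {
            return false;
        }
        $this->counters[$key]['count']++;
        return true;
    }
}

$limiter = new FixedWindowRateLimiter(maxRequests: 2, windowSeconds: 60);
var_dump($limiter->allow(1, 'youtube', 1000)); // bool(true)
var_dump($limiter->allow(1, 'youtube', 1000)); // bool(true)
var_dump($limiter->allow(1, 'youtube', 1000)); // bool(false)
```

Passing the timestamp explicitly keeps the sketch easy to test; a fixed window is the simplest choice and allows short bursts at window boundaries, which a sliding window or token bucket would smooth out.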