26 commits
0fd6c84
Add HuggingFace dataset discovery and import options
kalebbroo Nov 30, 2025
518533a
Add support for importing images from ZIP archives
kalebbroo Dec 5, 2025
10e696f
docs: Add comprehensive refactor plan for Dataset Studio migration
kalebbroo Dec 8, 2025
2ce1697
docs: Add Phase 1 execution guide and complete file migration map
kalebbroo Dec 8, 2025
02524dc
docs: Add Phase 1 execution checklist
kalebbroo Dec 8, 2025
7ea279c
docs: Add comprehensive refactor guide and summary
kalebbroo Dec 8, 2025
5015edb
Add APIBackend project with dataset management endpoints
kalebbroo Dec 10, 2025
b70487f
Update README.md
kalebbroo Dec 10, 2025
ba5100e
refactor: Complete Phase 1 - Transform to Dataset Studio architecture
kalebbroo Dec 11, 2025
e0d9dcf
docs: Add comprehensive refactor completion summary
kalebbroo Dec 11, 2025
0fbb856
docs: Add Quick Start guide
kalebbroo Dec 11, 2025
4b208bc
cleanup: Remove old HartsysDatasetEditor projects
kalebbroo Dec 11, 2025
c900518
feat: Complete Phase 2 - PostgreSQL + Parquet Storage Infrastructure
kalebbroo Dec 11, 2025
0d9c281
docs: Add Phase 2 completion summary
kalebbroo Dec 11, 2025
9c34507
feat: Complete Phase 3 Scaffold - Extension System Architecture
kalebbroo Dec 12, 2025
f79d5d2
docs: Add Phase 3 completion summary
kalebbroo Dec 12, 2025
c2a21d7
docs: Update QUICK_START with Phase 3 progress
kalebbroo Dec 12, 2025
a402e54
feat: Phase 3.1 - Extension Loading Infrastructure
kalebbroo Dec 12, 2025
2a254f7
delete unwanted files
kalebbroo Dec 15, 2025
490cc9e
Update .gitignore
kalebbroo Dec 15, 2025
580e947
cleanup
kalebbroo Dec 15, 2025
cbdd902
Phase 1 and 2 rewrite
kalebbroo Dec 15, 2025
41db8df
add more tests
kalebbroo Dec 16, 2025
cf0637b
Implement dataset ingestion and item repository for PostgreSQL
kalebbroo Dec 30, 2025
4c63271
Implement HuggingFace dataset import in ingestion service
kalebbroo Dec 30, 2025
b5d8673
Add extension system discovery and loading (API/Client)
kalebbroo Dec 31, 2025
10 changes: 0 additions & 10 deletions .claude/settings.local.json

This file was deleted.

13 changes: 13 additions & 0 deletions .gitignore
@@ -64,3 +64,16 @@ dkms.conf
/src/HartsysDatasetEditor.Core/obj
/.vs
/src/HartsysDatasetEditor.Api/data
/src/DTO/obj
/src/DTO/bin
/src/Core/obj
/src/ClientApp/obj
/.claude
/src/APIBackend/obj
/src/APIBackend/bin
/src/ClientApp/bin
/src/Core/bin
/tests/ClientApp.Tests/obj
/tests/ClientApp.Tests/bin
/tests/APIBackend.Tests/obj
/tests/APIBackend.Tests/bin
138 changes: 138 additions & 0 deletions ApprovedExtensions.json
@@ -0,0 +1,138 @@
{
  "schemaVersion": 1,
  "lastUpdated": "2025-01-15T00:00:00Z",
  "description": "Curated list of verified Dataset Studio extensions",
  "extensions": [
    {
      "id": "CoreViewer",
      "name": "Core Viewer",
      "author": "Hartsy AI",
      "description": "Essential dataset viewing with grid, list, and masonry layouts. Provides foundational viewing capabilities for all dataset types.",
      "repositoryUrl": "https://github.com/hartsy-ai/ds-ext-coreviewer",
      "category": "BuiltIn",
      "verified": true,
      "isOfficial": true,
      "minCoreVersion": "1.0.0",
      "latestVersion": "1.0.0",
      "downloadCount": 0,
      "rating": 5.0,
      "tags": ["viewer", "grid", "list", "masonry", "official", "essential"],
      "permissions": [
        "datasets.read"
      ],
      "screenshots": [],
      "documentation": "https://github.com/hartsy-ai/ds-ext-coreviewer/blob/main/README.md"
    },
    {
      "id": "Creator",
      "name": "Dataset Creator",
      "author": "Hartsy AI",
      "description": "Create datasets from multiple sources: CSV, TSV, JSON, JSONL, ZIP archives, folders, and HuggingFace. Supports both streaming and download modes for HuggingFace datasets.",
      "repositoryUrl": "https://github.com/hartsy-ai/ds-ext-creator",
      "category": "BuiltIn",
      "verified": true,
      "isOfficial": true,
      "minCoreVersion": "1.0.0",
      "latestVersion": "1.0.0",
      "downloadCount": 0,
      "rating": 5.0,
      "tags": ["creator", "upload", "import", "huggingface", "official", "essential"],
      "permissions": [
        "datasets.write",
        "filesystem.read",
        "network.external"
      ],
      "screenshots": [],
      "documentation": "https://github.com/hartsy-ai/ds-ext-creator/blob/main/README.md"
    },
    {
      "id": "Editor",
      "name": "Advanced Editor",
      "author": "Hartsy AI",
      "description": "Advanced dataset editing with bulk operations, batch tagging, metadata editor, and powerful search/filter capabilities. Perfect for dataset curation and refinement.",
      "repositoryUrl": "https://github.com/hartsy-ai/ds-ext-editor",
      "category": "BuiltIn",
      "verified": true,
      "isOfficial": true,
      "minCoreVersion": "1.0.0",
      "latestVersion": "1.0.0",
      "downloadCount": 0,
      "rating": 5.0,
      "tags": ["editor", "bulk-edit", "curation", "official"],
      "permissions": [
        "datasets.read",
        "datasets.write",
        "items.edit",
        "items.bulk_edit",
        "items.delete"
      ],
      "screenshots": [],
      "documentation": "https://github.com/hartsy-ai/ds-ext-editor/blob/main/README.md"
    },
    {
      "id": "AITools",
      "name": "AI Tools",
      "author": "Hartsy AI",
      "description": "AI-powered caption generation, image tagging, and quality scoring using BLIP, CLIP, and other vision models. Supports OpenAI and Anthropic API integration.",
      "repositoryUrl": "https://github.com/hartsy-ai/ds-ext-aitools",
      "category": "BuiltIn",
      "verified": true,
      "isOfficial": true,
      "minCoreVersion": "1.0.0",
      "latestVersion": "1.0.0",
      "downloadCount": 0,
      "rating": 5.0,
      "tags": ["ai", "caption", "tagging", "machine-learning", "official"],
      "permissions": [
        "datasets.read",
        "datasets.write",
        "items.edit",
        "network.external",
        "ai.inference"
      ],
      "screenshots": [],
      "documentation": "https://github.com/hartsy-ai/ds-ext-aitools/blob/main/README.md"
    }
  ],
  "categories": [
    {
      "id": "BuiltIn",
      "name": "Built-In",
      "description": "Official extensions maintained by the Dataset Studio team"
    },
    {
      "id": "Community",
      "name": "Community",
      "description": "Third-party extensions developed by the community"
    },
    {
      "id": "Tools",
      "name": "Tools",
      "description": "Utility extensions for dataset manipulation and analysis"
    },
    {
      "id": "Integrations",
      "name": "Integrations",
      "description": "Extensions that integrate with external services"
    },
    {
      "id": "Visualization",
      "name": "Visualization",
      "description": "Extensions for advanced dataset visualization"
    }
  ],
  "permissionDescriptions": {
    "datasets.read": "View datasets and items",
    "datasets.write": "Create and update datasets",
    "datasets.delete": "Delete datasets",
    "items.edit": "Edit individual items",
    "items.bulk_edit": "Bulk edit multiple items",
    "items.delete": "Delete items",
    "filesystem.read": "Read files from local filesystem",
    "filesystem.write": "Write files to local filesystem",
    "network.external": "Make requests to external APIs",
    "ai.inference": "Run AI model inference",
    "extensions.manage": "Install and uninstall extensions",
    "users.manage": "Manage users and permissions"
  }
}
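
Reviewer note: the manifest above cross-references itself in two places. Each extension's `category` should match an `id` under `categories`, and each entry in `permissions` should have a key in `permissionDescriptions`. The sketch below shows one way such a consistency check could look. It is not part of this PR: the helper name `find_manifest_problems` is hypothetical, the embedded manifest is a trimmed excerpt of the file above, and the rules it enforces are an assumption about the intended schema rather than something the diff states.

```python
import json

# Trimmed excerpt of ApprovedExtensions.json (two entries, fields reduced).
MANIFEST = json.loads("""
{
  "schemaVersion": 1,
  "extensions": [
    {"id": "CoreViewer", "category": "BuiltIn",
     "permissions": ["datasets.read"]},
    {"id": "Creator", "category": "BuiltIn",
     "permissions": ["datasets.write", "filesystem.read", "network.external"]}
  ],
  "categories": [{"id": "BuiltIn"}, {"id": "Community"}],
  "permissionDescriptions": {
    "datasets.read": "View datasets and items",
    "datasets.write": "Create and update datasets",
    "filesystem.read": "Read files from local filesystem",
    "network.external": "Make requests to external APIs"
  }
}
""")


def find_manifest_problems(manifest):
    """Return a list of consistency problems (hypothetical helper, assumed rules)."""
    known_categories = {c["id"] for c in manifest.get("categories", [])}
    known_permissions = set(manifest.get("permissionDescriptions", {}))
    problems = []
    seen_ids = set()
    for ext in manifest.get("extensions", []):
        ext_id = ext.get("id", "<missing id>")
        # Assumed rule: extension ids are unique within the manifest.
        if ext_id in seen_ids:
            problems.append(f"{ext_id}: duplicate extension id")
        seen_ids.add(ext_id)
        # Assumed rule: category must refer to a declared category id.
        if ext.get("category") not in known_categories:
            problems.append(f"{ext_id}: unknown category {ext.get('category')!r}")
        # Assumed rule: every requested permission has a description.
        for perm in ext.get("permissions", []):
            if perm not in known_permissions:
                problems.append(f"{ext_id}: undescribed permission {perm!r}")
    return problems


if __name__ == "__main__":
    for problem in find_manifest_problems(MANIFEST):
        print(problem)
```

Against the full file, the same function could be fed `json.load(open("ApprovedExtensions.json"))`. A check along these lines might run in CI so that a new extension entry cannot request a permission the UI has no description for.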
84 changes: 84 additions & 0 deletions DatasetStudio.sln
@@ -0,0 +1,84 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 17
VisualStudioVersion = 17.0.31903.59
MinimumVisualStudioVersion = 10.0.40219.1
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "src", "src", "{827E0CD3-B72D-47B6-A68D-7590B98EB39B}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Core", "src\Core\Core.csproj", "{77007545-7C22-45D8-B0C6-7D754D40EBF2}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "DTO", "src\DTO\DTO.csproj", "{4330827C-C747-4754-AEF5-69E9AB4FDD22}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "APIBackend", "src\APIBackend\APIBackend.csproj", "{D909E26C-4A44-4485-BE66-44DC98BC2145}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ClientApp", "src\ClientApp\ClientApp.csproj", "{0D968462-1C85-4C18-BB73-8ADB02DD4301}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Debug|x64 = Debug|x64
Debug|x86 = Debug|x86
Release|Any CPU = Release|Any CPU
Release|x64 = Release|x64
Release|x86 = Release|x86
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Debug|Any CPU.Build.0 = Debug|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Debug|x64.ActiveCfg = Debug|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Debug|x64.Build.0 = Debug|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Debug|x86.ActiveCfg = Debug|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Debug|x86.Build.0 = Debug|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Release|Any CPU.ActiveCfg = Release|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Release|Any CPU.Build.0 = Release|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Release|x64.ActiveCfg = Release|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Release|x64.Build.0 = Release|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Release|x86.ActiveCfg = Release|Any CPU
{77007545-7C22-45D8-B0C6-7D754D40EBF2}.Release|x86.Build.0 = Release|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Debug|Any CPU.Build.0 = Debug|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Debug|x64.ActiveCfg = Debug|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Debug|x64.Build.0 = Debug|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Debug|x86.ActiveCfg = Debug|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Debug|x86.Build.0 = Debug|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Release|Any CPU.ActiveCfg = Release|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Release|Any CPU.Build.0 = Release|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Release|x64.ActiveCfg = Release|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Release|x64.Build.0 = Release|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Release|x86.ActiveCfg = Release|Any CPU
{4330827C-C747-4754-AEF5-69E9AB4FDD22}.Release|x86.Build.0 = Release|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Debug|Any CPU.Build.0 = Debug|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Debug|x64.ActiveCfg = Debug|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Debug|x64.Build.0 = Debug|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Debug|x86.ActiveCfg = Debug|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Debug|x86.Build.0 = Debug|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Release|Any CPU.ActiveCfg = Release|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Release|Any CPU.Build.0 = Release|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Release|x64.ActiveCfg = Release|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Release|x64.Build.0 = Release|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Release|x86.ActiveCfg = Release|Any CPU
{D909E26C-4A44-4485-BE66-44DC98BC2145}.Release|x86.Build.0 = Release|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Debug|Any CPU.Build.0 = Debug|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Debug|x64.ActiveCfg = Debug|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Debug|x64.Build.0 = Debug|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Debug|x86.ActiveCfg = Debug|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Debug|x86.Build.0 = Debug|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Release|Any CPU.ActiveCfg = Release|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Release|Any CPU.Build.0 = Release|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Release|x64.ActiveCfg = Release|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Release|x64.Build.0 = Release|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Release|x86.ActiveCfg = Release|Any CPU
{0D968462-1C85-4C18-BB73-8ADB02DD4301}.Release|x86.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
EndGlobalSection
GlobalSection(NestedProjects) = preSolution
{77007545-7C22-45D8-B0C6-7D754D40EBF2} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{4330827C-C747-4754-AEF5-69E9AB4FDD22} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{D909E26C-4A44-4485-BE66-44DC98BC2145} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{0D968462-1C85-4C18-BB73-8ADB02DD4301} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
EndGlobalSection
EndGlobal