Skip to content

Conversation

@beck-8
Copy link
Contributor

@beck-8 beck-8 commented Jan 16, 2026

#30

@FilOzzy FilOzzy added this to FOC Jan 16, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FOC Jan 16, 2026
@beck-8 beck-8 marked this pull request as draft January 16, 2026 13:54
@rjan90 rjan90 linked an issue Jan 16, 2026 that may be closed by this pull request
@rjan90 rjan90 moved this from 📌 Triage to ⌨️ In Progress in FOC Jan 16, 2026
… tools in image

Mounting empty host directories to these paths hides the installed tools, causing 'cargo/forge not found' errors.
pdptool is compiled for the container environment and cannot execute
on the host due to platform/library incompatibilities. Run it via
docker exec and use mounted fast-storage for test files.
pdptool runs inside the container and must connect to localhost:4702
(container internal port), not the dynamically allocated host port.
@beck-8 beck-8 changed the title fix: Docker build GID conflict on macOS fix: Docker build on macOS Jan 16, 2026
@beck-8
Copy link
Contributor Author

beck-8 commented Jan 16, 2026

 INFO src/docker/logs.rs:109: ✓ Removed 24 dead foc* containers
 INFO src/commands/start/mod.rs:487: [3/3] Writing post-start status snapshot...
 INFO src/docker/logs.rs:119: Writing post-start status to: /Users/beck/.foc-devnet/run/26jan17-0123_WoblyFig/post_start_status.log
 INFO src/docker/logs.rs:132: ✓ Post-start status logged
 INFO src/commands/start/mod.rs:490: ═══════════════════════════════════════════════════════════
 INFO src/commands/start/mod.rs:491: Post-start teardown completed successfully
 INFO src/commands/start/mod.rs:492: ═══════════════════════════════════════════════════════════
 INFO src/commands/start/mod.rs:464: Cluster started successfully!

# beck @ beckMacBook-Pro in ~/workspace/foc-localnet on git:fix/docker-gid-conflict x [1:34:12]

@beck-8
Copy link
Contributor Author

beck-8 commented Jan 16, 2026

pdptool was also placed in the container because cross-platform issues aren't limited to macOS. The same problem might occur when using it on Ubuntu versions below 24.04.

@beck-8 beck-8 marked this pull request as ready for review January 16, 2026 17:46
@wjmelements
Copy link

I have tried cargo run -- init on this branch and it worked.

@wjmelements
Copy link

I have tried cargo run -- init on this branch and it worked.

The other steps worked too

@rjan90 rjan90 requested a review from redpanda-f January 17, 2026 08:00
@rjan90 rjan90 moved this from ⌨️ In Progress to 🔎 Awaiting review in FOC Jan 17, 2026
WORKDIR /workspace

# Define volumes for external access
VOLUME ["/home/foc-user/.cargo", "/home/foc-user/.rustup", "/home/foc-user/go", "/var/tmp/filecoin-proof-parameters"]
Copy link
Collaborator

@redpanda-f redpanda-f Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed? This is good for caching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works when there is data present. The first time you run the mount, it will clear these directories, causing the toolchain to be lost. I added the critical mappings in a subsequent commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, but it cleans up only the first time. The caches are for when the curio or lotus codebase is actively being changed and rebuilt frequently. In those instances, re-init is not needed, and these caches would be useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

The cargo bin directory was overwritten, so this binary will never exist unless it is manually downloaded on the host machine.

There are two commits here: one temporarily disables everything, and the second maps according to what's actually usable. The -v parameter plays a crucial role. Therefore, I've removed it here because I think it's meaningless.

However, I must admit that some Rust binaries are not cached. The rest, such as go pkg and rust registry, are cached.

"run".to_string(),
"-u".to_string(),
"foc-user".to_string(),
"--rm".to_string(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if the intent really is:

  • "do not use foc-user for building"
  • "remove" the container, lose logs

I don't disagree with them, but would like to understand safety argument for removing -u foc-user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --rm option is used because if cargo run --build lotus fails, subsequent runs will save the changes, requiring manual cleanup. Error logs will be output upon exit, so there's no need to worry.

The -u foc-user was removed because you added -u gid:uid repeatedly later, so this redundant part was removed.

.parse()?;
// When running pdptool inside the container, use the container's internal port (4702)
// not the host-mapped port
let service_url = "http://localhost:4702";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not hardcode stuff. What's wrong with host-mapped port here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because pdptool was moved inside the container, so a fixed internal port was used.

The reason for moving pdptool internally is that the original logic allowed programs compiled inside the container to be used externally, which was not as expected and could cause problems in most environments.

Comment on lines 121 to 127
let run_id = context.run_id();
let container_name = format!("foc-{}-curio-{}", run_id, sp_index);

let args = [
"exec",
&container_name,
"/usr/local/bin/lotus-bins/pdptool",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure testing from within curio container constitutes "healthy" properly, especially given public. Would rather have foc-builder run pdptool for stricter testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I will put pdptool in foc-builder and keep testing service_url externally.

/// Returns the appropriate YugabyteDB tarball URL for the current platform:
/// - el8-aarch64 for ARM64 systems (Apple Silicon, AWS Graviton, etc.)
/// - linux-x86_64 for x86_64 systems
fn get_default_yugabyte_url() -> String {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This addition is a "feature update". The S3 fallback has to be changed accordingly in the code as well. Please check that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by S3 fallback here? But I do see a place I forgot to update.

# Cache Yugabyte tarball

Copy link
Collaborator

@redpanda-f redpanda-f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main changes I’d request before merge:

  • Fix Yugabyte arch detection (runtime vs compile-time). Also, fix the S3 fallback for downloads
  • Re-evaluate removal of -u foc-user.
  • Avoid hardcoded ports and paths. Hoist them into constants if really needed.

@github-project-automation github-project-automation bot moved this from 🔎 Awaiting review to ⌨️ In Progress in FOC Jan 17, 2026
- Change from compile-time cfg!() to runtime std::env::consts::ARCH
- Update cache-artifacts.sh to detect ARM64/x86_64 at runtime
@redpanda-f
Copy link
Collaborator

redpanda-f commented Jan 19, 2026

I feel this PR needs to break down into two separate ones, one (1) supporting groups issues, and the other, (2) more overarching attempting to provide MacOS support. In fact, (1) seems to be a sub-task of (2) here.

Can we just limit this to group ID support @beck-8 ? see: #35 for larger issue

@beck-8
Copy link
Contributor Author

beck-8 commented Jan 19, 2026

Currently I'm testing fine in M1 M2 environment. I don't want to do a split. We can use other PRs to continue to fully support features that may not be implemented currently.

@redpanda-f
Copy link
Collaborator

Thank you so much @beck-8 , this seems to be working on mac-os aarch64.

@github-project-automation github-project-automation bot moved this from ⌨️ In Progress to ✔️ Approved by reviewer in FOC Jan 20, 2026
@redpanda-f redpanda-f merged commit 4b61cbe into FilOzone:main Jan 20, 2026
1 of 2 checks passed
@github-project-automation github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FOC Jan 20, 2026
@beck-8 beck-8 deleted the fix/docker-gid-conflict branch January 20, 2026 11:29
BigLep added a commit that referenced this pull request Jan 21, 2026
- Change from 'x86 Architecture' requirement to 'Architecture' supporting both x86 and ARM64
- Update text to reflect that system automatically selects appropriate binaries
- Reflects changes from PR #32 which added ARM64/Apple Silicon support
redpanda-f pushed a commit that referenced this pull request Jan 22, 2026
* docs: rename ADVANCED_README.md to README_ADVANCED.md for lexicographical sorting

- Rename ADVANCED_README.md to README_ADVANCED.md using git mv for clear diff tracking
- Update reference in README.md to point to new filename
- Addresses BigLep's comment about naming consistency

* docs: add links and system requirements callout

- Add link from SP nodes control to Configuration System section in README_ADVANCED.md
- Add x86 architecture requirement note to System Requirements table
- Addresses BigLep's comments about missing links and architecture limitation

* docs: add command prerequisites to build and start commands

- Add note that build must be run after init
- Add note that start should be run after build of lotus and curio
- Addresses BigLep's comment about missing command order information

* docs: add 'How Defaults Work' section explaining config generation

- Explain that defaults are defined in code (src/config.rs Config::default())
- Document how init writes defaults to config.toml
- Explain how to update defaults with new versions using init --force
- Addresses BigLep's question about where defaults are defined and how they get updated

* docs: remove duplicate Manual Cleanup section

- Remove second duplicate Manual Cleanup section
- Keep the first one which has more comprehensive content
- Addresses BigLep's comment about duplicate sections

* docs: update Run ID format to ISO8601 (YYYY-MM-DDTHH:MM)

- Change Run ID format from YYmmmDD-HHMM to YYYY-MM-DDTHH:MM (ISO8601)
- Update code in src/run_id/mod.rs to use new format
- Update test regex to match new format
- Update all documentation examples to use new format (2026-01-02T14:30_ZanyPip)
- Addresses BigLep's comment about date format standardization for natural sorting and clarity

* docs: define what a 'step' is when first introduced

- Add definition of 'step' as discrete unit of work in cluster startup
- Link to Detailed Start Sequence section for complete list
- Reorganize Step Context section for better clarity
- Addresses BigLep's comment about defining 'step' when first mentioned

* docs: add links to context keys code location and example

- Link to src/commands/start/ directory for context key implementations
- Add example link to usdfc_deploy/prerequisites.rs showing context.get() usage
- Clarify that context keys are string literals, not constants
- Addresses BigLep's comment about linking to definitive list in code

* docs: add markdown link formatting for Portainer

- Change plain text 'Portainer' to [Portainer](https://docs.docksal.io/use-cases/portainer/)
- Addresses BigLep's comment about proper link formatting

* docs: clarify container architecture in table

- Clarify that Yugabyte is shared by all Curio SPs
- Update port descriptions from 'Dynamic' to 'Dynamic from range' for all Curio containers
- Addresses BigLep's comments about container descriptions

* docs: fix network diagram and correct Yugabyte architecture

- Simplify diagram to show single representative instance with curio-n/yugabyte-n notation
- Correct Yugabyte architecture: each SP has its own Yugabyte instance, not shared
- Update container architecture table to show yugabyte-1 and yugabyte-N (one per SP)
- Update network explanations to reflect one Yugabyte per SP network
- Addresses BigLep's comment about network diagram readability and accuracy

* fix: change Run ID format to condensed ISO8601 (YYYYMMDDTHHMM) for Docker compatibility

- Change from YYYY-MM-DDTHH:MM to YYYYMMDDTHHMM (no dashes/colons)
- Colons and dashes cause issues with Docker network names
- Update code in src/run_id/mod.rs to use %Y%m%dT%H%M format
- Update test regex to match new format
- Update all documentation examples throughout README_ADVANCED.md
- Still ISO8601-based for natural sorting, but Docker-compatible

* docs: improve repository management section

- Convert Required Repositories table to bulleted list with repo links (avoids outdated version info)
- Add Version Strategy section explaining GitTag/GitCommit/GitBranch approaches
- Link to src/config.rs where default versions are defined
- Link 'Updating defaults' to How Defaults Work section
- Addresses BigLep's questions about version strategy and where defaults are defined

* docs: simplify configuration sharing section

- Reduce from verbose 3-step process to simple 2-step (copy file + run init)
- Remove unnecessary export and documentation steps
- Addresses BigLep's comment about verbose configuration sharing

* docs: move Reproducible builds to its own heading

- Move Reproducible builds from under Sharing Configuration to separate section
- Makes it clearer that it's a distinct topic
- Addresses BigLep's comment about moving it to its own heading

* docs: merge Command Flags section with Commands Reference

- Move all flag information into Commands Reference section
- Remove duplicate Command Flags section
- Add inline 'why' explanations for --volumes-dir, --run-dir, and --notest
- Preserve parallelization guidance (why to use/not use --parallel)
- Add parallelization table to Step execution section showing which epochs are parallelized
- Link from start command to Detailed Start Sequence for parallelization details
- Addresses BigLep's comment about eliminating duplication

* docs: restructure lifecycle section

- Remove ASCII diagram
- Reorganize into Before Steps/Steps/Post Start Steps structure
- Remove hardcoded numbers (1-6) and letters (a-k) from step descriptions
- Add links to Step Implementation Pattern and Configuration System
- Add sections for running, stop, and (re)start states
- Addresses BigLep's comment about improving lifecycle documentation structure

* docs: add inline code comments to Rust examples

- Enhance SetupContext struct comments with detailed explanations
- Add inline comments throughout example flow showing context sharing
- Add method-level comments to Step trait explaining each phase
- Improve code readability by explaining what each part does and why
- Addresses BigLep's comment about adding explanatory content to code examples

* docs: rename Advanced Topics and remove Last Updated section

- Rename 'Advanced Topics' to 'Additional User Actions' for clarity
- Add explanation of genesis template location (~/.foc-devnet/docker/volumes/run-specific/<run-id>/genesis/)
- Note that genesis templates are generated during genesis prerequisites phase
- Remove 'Last Updated' section (redundant - version control tracks changes)
- Addresses BigLep's comments about improving section naming and removing low-value content

* docs: remove redundant sections and improve consistency

- Enhance post_execute comment to include 'confirm deployment'
- Remove duplicate 'Phases' section (already explained in inline code comments)
- Remove redundant 'Testing scenarios' section (just comments)
- Remove 'Reference Links' section (redundant with other documentation)

* docs: fix relative links to use correct paths

- Change ../src/config.rs to src/config.rs (root-level paths)
- Change ../src/commands/start/ to src/commands/start/
- Change ../src/commands/start/usdfc_deploy/prerequisites.rs to src/commands/start/usdfc_deploy/prerequisites.rs
- Ensure all local file links use relative paths from root directory
- All relative links now correctly reference files without parent directory traversal

* docs: fix grammar, links, inconsistencies, and heading levels

- Fix grammar error: 'If you are have troubles' -> 'If you have troubles'
- Fix broken GitHub Issues link in System Requirements table
- Update outdated 'Advanced topics' reference to 'Additional user actions'
- Fix timing inconsistency: align '--parallel' timing with detailed sequence (~5 min to ~3 min)
- Fix Yugabyte note: change 'shared by all SPs' to 'one per SP' and update to yugabyte-1
- Fix heading levels: change running/stop/(re)start from ### to #### under Lifecycle Overview
- Fix typos: 'Regenisis' -> 'Regenesis', improve capitalization
- Fix formatting: remove extra blank line after --force option
- Keep latest symlink references (feature documented in code)

* docs: update architecture requirement to reflect macOS/ARM support

- Change from 'x86 Architecture' requirement to 'Architecture' supporting both x86 and ARM64
- Update text to reflect that system automatically selects appropriate binaries
- Reflects changes from PR #32 which added ARM64/Apple Silicon support

* docs: refine dependency repository management section

- Rename 'Repository Management' to 'Dependency Repository Management'
- Rename 'Required Repositories' to 'Required Dependent Repositories'
- Rename 'Version Strategy' to 'Dependent Version Strategy'
- Simplify version strategy explanation
- Rename 'Using Local Repositories' to 'Using Local Dependency Repositories'
- Add note about using same foc-devnet commit when sharing config
- Clarify reproducible builds use 'tags or commits' not just 'commits'

* refactor: remove unused CLI flags for directory paths

Remove --output-dir, --volumes-dir, and --run-dir flags as they are
either unused or should not be user-configurable. The system now always
uses fixed default paths:

- ~/.foc-devnet/bin/ for build outputs
- ~/.foc-devnet/docker-volumes/<run-id>/ for Docker volumes
- ~/.foc-devnet/run/<run-id>/ for run-specific data

This simplifies the CLI interface and ensures consistent behavior.

* style: fix rustfmt formatting issues

- Format match arms on single lines where appropriate
- Format function call on single line
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: 🎉 Done

Development

Successfully merging this pull request may close these issues.

cargo run -- init fails in groupadd with GID already exists

3 participants