Skip to content

feat: halt system on significant child process failure#485

Open
Draggu wants to merge 3 commits intoelixir-lang:mainfrom
Draggu:restart
Open

feat: halt system on significant child process failure#485
Draggu wants to merge 3 commits intoelixir-lang:mainfrom
Draggu:restart

Conversation

@Draggu
Copy link
Contributor

@Draggu Draggu commented Mar 4, 2026

gen_lsp part elixir-tools/gen_lsp#80

Close #428, I tried to fix the source problem here (writing to stdout), but I'm not sure if case I found is the only one. I don't think there exists a valid restart option since clients can't recover from gibberish written to output.

{:forge, path: "../forge"},
{:gen_lsp, "~> 0.11.3"},
{:gen_lsp,
git: "https://github.com/elixir-tools/gen_lsp",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip: you can use the github option and branch option together, a little more ergonomic and easier to update when you push changes.

receive do
{:DOWN, ^ref, :process, _, _} ->
Logger.error("Expert application has terminated unexpectedly")
# Signifficant process we cant recover (like the GenLSP) has crashed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos

Suggested change
# Signifficant process we cant recover (like the GenLSP) has crashed
# A significant process we can't recover from (like the GenLSP) has crashed.

@mhanberg
Copy link
Member

testing this today

Copy link
Member

@mhanberg mhanberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In testing this, I may have found a bug in the neovim lsp client. when you close it, it seems to send the shutdown request multiple times (which we handle gracefully, but was pretty confusing when encountering).

Comment on lines +123 to +133
{:ok, pid} ->
spawn(fn ->
ref = Process.monitor(pid)

receive do
{:DOWN, ^ref, :process, _, _} ->
Logger.error("Expert application has terminated unexpectedly")
# Signifficant process we cant recover (like the GenLSP) has crashed
:timer.apply_after(50, System, :halt, [1])
end
end)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right approach. When i was playing aroudn with this locally, i was never seeing the log hit the system, probably because it was not getting handled before we halted the system.

I think the better option is to implement the c:stop callback on the application and halt the system from there when the application shuts down (which it will when we shutdown the supervisor, which shuts down from the significant process dying).

there is one issue tho, from the perspective of that callback, we can't really know if the application was terminated normally or due to a crash. This same callback is executed when the client sends the exit notification, since we call System.stop there, which gracefully shuts down the system. I think perhaps what we want to do is set a piece of global state (persistent_term maybe) if we exit on purpose, and check that in the callback. if its present, then we can just let it continue to shut down, else we can halt with exit code 1.

This method uses existing otp conventions more or less, whereas the approach in this pr i have never seen before (which is not to say its incorrect, it just is possibly non-idiomatic).

It would ideally be nice if we could list an application as significant, and if it shutdown, the whole beam should terminate. I had sort of assumed that was possible, but doesn't seem like it.

{:engine, path: "../engine", only: [:test]},
{:forge, path: "../forge"},
{:gen_lsp, "~> 0.11.3"},
{:gen_lsp, github: "Draggu/gen_lsp", branch: "restart"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of making changes to gen_lsp, lets just override the child spec in the application child list. it might be better in the long run, but we can make that change if this seems to work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

Correctly shutdown in case we shut down the buffer

3 participants