Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Adding Kill(bool entireProcessTree) -- best-effort attempt to terminate a process tree #31827

Closed
wants to merge 38 commits into from

Conversation

bgribaudo
Copy link
Contributor

@bgribaudo bgribaudo commented Aug 17, 2018

[Implements API approved by #26234]

Summary

When Kill(bool entireProcessTree)'s argument is false, behaves identical to Kill(); when true, makes a best-effort attempt to terminate the current process and all descendant processes (children, grandchildren, etc.).

Platform Implementation Status

Implemented for Windows and Linux; not implemented for UAP, UnknownUnix or FreeBSD/OSX.

Looks easy to implement for OSX (comment added in Process.OSX.cs with implementation idea), just don't have an OSX system to test against.

Notes

Potential low-hanging fruit: Behind-the-scenes, the logic in this PR enumerates child processes. This functionality could be leveraged to easily implement #25855.

Windows-specific logic inspired by msbuild's NativeMethodsShared KillTree.

@danmoseley
Copy link
Member

If you want to try OSX without a local machine, you can use the CI system on this PR to try it...

Copy link
Member

@JeremyKuhne JeremyKuhne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments on interop inline

@bgribaudo
Copy link
Contributor Author

bgribaudo commented Aug 20, 2018

If you want to try OSX without a local machine, you can use the CI system on this PR to try it...

@danmosemsft, good idea--I'll give that a try. Thanks!

@danmoseley
Copy link
Member

Test build against .NET Framework is failing as you can see in the NETFX details link above. (Most of our tests we build and run against NETFX to double check we get the same results)

ProcessTests.cs(337,73): error CS1739: The best overload for 'Kill' does not have a parameter named 'entireProcessTree' [D:\j\workspace\windows-TGrou---2a8f9c29\src\System.Diagnostics.Process\tests\System.Diagnostics.Process.Tests.csproj]
ProcessTests.cs(349,36): error CS1739: The best overload for 'Kill' does not have a parameter named 'entireProcessTree' [D:\j\workspace\windows-TGrou---2a8f9c29\src\System.Diagnostics.Process\tests\System.Diagnostics.Process.Tests.csproj]
ProcessTests.cs(378,36): error CS1739: The best overload for 'Kill' does not have a parameter named 'entireProcessTree' [D:\j\workspace\windows-TGrou---2a8f9c29\src\System.Diagnostics.Process\tests\System.Diagnostics.Process.Tests.csproj]

Of course NETFX does not have this new method. To fix this, please move your new tests into a new named ProcessTests.netcoreapp.cs. In the csproj, condition it like

  <ItemGroup Condition="'$(TargetGroup)' == 'netcoreapp'">
    <Compile Include="ProcessTests.netcoreapp.cs" />
..

Now it will only build and run against .NET Core and not against .NET Framework.

@danmoseley
Copy link
Member

On OSX, Kill_EntireProcessTree_True_EntireTreeTerminated test is failing with this (again you can see by following Details link above and some more clicks)

System.PlatformNotSupportedException : Operation is not supported on this platform.
Stack Trace :
   at System.Diagnostics.Process.get_ParentProcessId() in /Users/dotnet-bot/j/workspace/dotnet_corefx/master/osx-TGroup_netcoreapp+CGroup_Debug+AGroup_x64+TestOuter_false_prtest/src/System.Diagnostics.Process/src/System/Diagnostics/Process.OSX.cs:line 100
   at System.Diagnostics.Process.IsParentOf(Process possibleChildProcess) in /Users/dotnet-bot/j/workspace/dotnet_corefx/master/osx-TGroup_netcoreapp+CGroup_Debug+AGroup_x64+TestOuter_false_prtest/src/System.Diagnostics.Process/src/System/Diagnostics/Process.Unix.cs:line 319

Sounds like yo'ure going to have a shot at fixing this - if not, we need to disable the test for OSX.

@danmoseley
Copy link
Member

danmoseley commented Aug 23, 2018

For Kill_EntireProcessTree_False_OnlyRootProcessTerminated there are two distinct failures

In OSX

Expected: Boolean[] [True, False, False]
Actual:   SelectListIterator<Process, Boolean> [True, False, False]

Looks like you got the right result but the assert is the wrong overload. You may want to use Assert. Equal<T>(IEnumerable<T> expected, IEnumerable<T> actual) perhaps by casting.

All the Linuxes have the same problem, but apparently the process isn't getting terminated either:

Expected: Boolean[] [True, False, False]
Actual:   SelectListIterator<Process, Boolean> [False, False, False]

Hopefully you can debug that locally.

@bgribaudo hope this gives you enough to make progress! Again, all this should be publicly accessible through the CI links above

@danmoseley
Copy link
Member

danmoseley commented Aug 23, 2018

UWP build has

System\Diagnostics\Process.Uap.cs(26,22): error CS0111: Type 'Process' already defines a member called 'KillTree' with the same parameter types [D:\j\workspace\windows-TGrou---33cbf18b\src\System.Diagnostics.Process\src\System.Diagnostics.Process.csproj]

This is because in the UWP build Process class is Process.cs+Process.Uap.cs+Process.Windows.cs. If you want something to be Windows but not UWP it needs to be in Process.Win32.cs. You can figure this out by looking in the csproj.

You can build the tree for UWP by using build -targetgroup uap at the root.

@danmoseley
Copy link
Member

ERROR_CANT_TERMINATE_SELF

Maybe another reason to not support self terminating.

I can't remember, but I think that was some special case that you would not normally get when terminating yourself.

@MattGal
Copy link
Member

MattGal commented Nov 14, 2018

The work item attempted to run 3x, all 3x it hit a "STATUS_STACK_OVERFLOW". While I agree it's way too much clicking, you can get from Jenkins to here: https://mc.dot.net/#/user/bgribaudo/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/122e1b7ace4a9497ebab9d3ee2a31b499abbc47c/workItem/System.Diagnostics.Process.Tests/wilogs

and see this in the logs:

2018-11-14 16:02:20,726: INFO: proc(55): run_and_log_output: Output: Discovering: System.Diagnostics.Process.Tests
2018-11-14 16:02:21,055: INFO: proc(55): run_and_log_output: Output: Discovered:  System.Diagnostics.Process.Tests
2018-11-14 16:02:21,085: INFO: proc(55): run_and_log_output: Output: Starting:    System.Diagnostics.Process.Tests
2018-11-14 16:02:21,335: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,523: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,523: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,585: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,585: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,585: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,930: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:21,930: INFO: proc(55): run_and_log_output: Output: Process is terminating due to StackOverflowException.
2018-11-14 16:02:22,351: INFO: proc(55): run_and_log_output: Output: 
2018-11-14 16:02:22,351: INFO: proc(55): run_and_log_output: Output: C:\dotnetbuild\work\ea7229d1-7624-4a47-a955-d6c8ce0b89e0\Work\ffe7a80f-27c8-429b-b0bd-f7739975b896\Unzip>C:\Python\python.exe DumplingHelper.py collect_dump -1073741571 C:\Users\runner\AppData\Local\Temp\CoreRunCrashDumps 1542211335.9 System.Diagnostics.Process.Tests D:\j\workspace\windows-TGrou---33cbf18b\artifacts\bin\runtime/uap-Windows_NT-Debug-x64/,D:\j\workspace\windows-TGrou---33cbf18b\artifacts\bin\tests/System.Diagnostics.Process.Tests/uap-Windows_NT-Debug-x64/ 

It actually even looks like Dumpling might have a .dmp waiting for you too!

@ViktorHofer
Copy link
Member

lol, Matt was faster than me 😁

@danmoseley
Copy link
Member

Thanks, I have no idea how I did not get that when I clicked on it earlier. @bgribaudo maybe you can figure it out from code inspection..

@danmoseley
Copy link
Member

It looks like it's due to https://github.com/dotnet/corefx/issues/31908 -- you ?accidentally removed the [ActiveIssue(31908, TargetFrameworkMonikers.Uap)]

@bgribaudo
Copy link
Contributor Author

@danmosemsft

I would like to make a best effort to throw if appropriate. You can do simple experiments and/or look for the access denied error code (s). We should be able to handle the 99% case without much difficulty I think.

I'll work on it. :-) Probably won't get much done on it, though, until after the holiday next week. Hope you all have a great Thanksgiving!

@danmoseley
Copy link
Member

Sounds good. STill plenty of time for 3.0 🦃 🍽

@karelz
Copy link
Member

karelz commented Nov 28, 2018

What's status of this PR? I don't see update in last 2 weeks ...

@bgribaudo
Copy link
Contributor Author

I'm going to try to pick it back up sometime between now and Christmas when I can find a lull in regular work.

@bgribaudo
Copy link
Contributor Author

@dotnet-bot test this please

1 similar comment
@bgribaudo
Copy link
Contributor Author

@dotnet-bot test this please

@bgribaudo
Copy link
Contributor Author

@dotnet-bot test this please

@bgribaudo
Copy link
Contributor Author

Is there a CI/build problem with the Linux arm Release, Linux x64 Release and Tizen armel Debug builds?

  • Linux arm Release dies with:
    05:09:45 tar: artifacts/bin/runtime/netcoreapp-Linux-Release-arm: Cannot opentar (child): artifacts/bin/build.tar.gz: Cannot open: No such file or directory
    05:09:45 tar: Error is not recoverable: exiting now
    05:09:45 : No such file or directory
    05:09:45 tar (child): Error is not recoverable: exiting now
    05:09:45 
    Build step 'Execute shell' marked build as failure
    
  • Linux x64 Release shows catastrophic failures for the Process project but trying to view logs shows 'Warning! No log available.'
  • Tizen armel Debug dies with a restore packages error which (possibly) is related to:
    error NU1102: Unable to find package runtime.tizen.5.0.0-armel.Microsoft.NETCore.Runtime.CoreCLR with version (>= 3.0.0-preview1-27008-04)
    

The errors for the two build groups where details are shown don't seem to have anything to do with code I've written or build instructions I've configured.

@danmoseley
Copy link
Member

@ViktorHofer?

@ViktorHofer
Copy link
Member

For diagnosing what's going on with the Linux arm Release leg the log doesn't really help here. If the error is not transient we would need a repro environment. Maybe @safern knows more?

@dotnet-bot test Linux arm Release Build
@dotnet-bot test Linux x64 Release Build
@dotnet-bot test Tizen armel Debug Build

@ViktorHofer
Copy link
Member

cc @dotnet/dnceng please take a look at the failing legs. I'm not sure why they are failing and not even producing a log.

@MattGal
Copy link
Member

MattGal commented Dec 11, 2018

@ViktorHofer taking a look...

@MattGal
Copy link
Member

MattGal commented Dec 11, 2018

@ViktorHofer This one failed trying to tar up the build: https://ci.dot.net/job/dotnet_corefx/job/master/job/linux_arm_cross_release_prtest/18512/consoleFull

This run: https://ci3.dot.net/job/dotnet_corefx/job/master/job/linux-TGroup_netcoreapp+CGroup_Release+AGroup_x64+TestOuter_false_prtest/19695/
failed because the following queues were deprecated. Simply get latest from master or modify like so:

Centos.74.Amd64.Open -> Centos.7.Amd64.Open
RedHat.74.Amd64.Open -> RedHat.7.Amd64.Open
Debian.87.Amd64.Open -> Debian.8.Amd64.Open (forwarded you a mail)

https://ci.dot.net/job/dotnet_corefx/job/master/job/tizen_armel_cross_debug_prtest/18383/ failed build looking for Tizen stuff.

@bgribaudo
Copy link
Contributor Author

Thanks, @MattGal & @ViktorHofer. As far as I know, this PR doesn't change anything directly that should cause those failures. I'm trying to update to latest to see if that brings in build definition changes that fix these....

@danmoseley
Copy link
Member

I think the queue updates were already done in 96572c1. This PR is presumably stuck with the ones from when it was created. @bgribaudo you could close this and make a new one if you want to get those tests to work. Seems that it's close to mergeable anyway.

@bgribaudo
Copy link
Contributor Author

Thanks, @danmosemsft! I'll try that.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.