Skip to content

Conversation

@djerius
Copy link

@djerius djerius commented Feb 18, 2025

IO objects are special, and sometimes are not destroyed when references to them disappear.

Here's some example code (for real code to run and exhibit the unwanted behavior, see below)

{
  my $cmd = System::Command->new( 'cat' );
  $cmd->stdin->say( $_ ) for 0 ... 100;
};

When the block ends, $cmd will be destroyed. Sometimes the order of destruction is

  • System::Command
  • IO::Handle
  • System::Command::Reaper

and sometimes it is

  • System::Command
  • System::Command::Reaper

Experimentally, when the Reaper object is destroyed (regardless of whether the IO::Handles are destroyed first) it has lost access to its cached IO::Handle objects. They are cached as weak references, and when the Command object is destroyed, it releases its references to them, and the Reaper object loses its references.

If the second order of destruction happens, the handles still exist (Perl seems to have an internal cache for them), but Reaper doesn't see them, and thus can't close them. It will try to reap the child process, but if the child is awaiting input, it will never exit, cannot be reaped, and Reaper::_reap will hang forever.

(Note that explicitly closing the handles via $cmd->close is not safe, as, if the program is interrupted before the close, and the signal handler throws, e.g. $SIG{INT} = sub { die }, the $cmd->close does not get fired.

A bit further below is some real test code. Run it as, e.g.,

% for i in $(seq 100) ; do sc.pl ; done
System::Command::DESTROY: STDIN FH IO::File=IO(0x1a472b0)
IO::Handle::DESTROY DESTROYING IO::File=IO(0x1a472b0)
System::Command::Reaper::DESTROY: STDIN FH DOESN'T EXIST
--------------
System::Command::DESTROY: STDIN FH IO::File=IO(0x67e2f0)
IO::Handle::DESTROY DESTROYING IO::File=IO(0x67e2f0)
System::Command::Reaper::DESTROY: STDIN FH DOESN'T EXIST
--------------
System::Command::DESTROY: STDIN FH IO::File=IO(0x1af93c0)
IO::Handle::DESTROY DESTROYING IO::File=IO(0x1af93c0)
System::Command::Reaper::DESTROY: STDIN FH DOESN'T EXIST
--------------
System::Command::DESTROY: STDIN FH IO::File=IO(0xa20400)
System::Command::Reaper::DESTROY: STDIN FH DOESN'T EXIST

In the above example, the last instance hangs because the stdin IO::Handle hasn't been destroyed.

It may need a few iterations to trigger the bug.

#! /bin/env perl

use v5.10;
use System::Command;
use IO::Handle;

our $FH;

package IO::Handle {
    sub DESTROY {
        say __PACKAGE__ . "::DESTROY DESTROYING $_[0]" if "$_[0]" eq $FH;
    }
}

package System::Command {
    sub DESTROY {
        print STDERR __PACKAGE__ . "::DESTROY: STDIN FH ", *{$_[0]->{stdin}}{IO}, "\n";
    }
}

package System::Command::Reaper {

    BEGIN {
        *orig_DESTROY  = \&DESTROY;
    }

    sub DESTROY {
        my ($self) = @_;
        print STDERR __PACKAGE__ . "::DESTROY: STDIN FH ", $self->{stdin} ? "EXISTS" : "DOESN'T EXIST", "\n";
        $self->orig_DESTROY;
    }
}

{
    my $cmd = System::Command->new( 'cat' );
    $FH = '' . *{$cmd->stdin}{IO};
    $cmd->stdin->say( $_ ) for 0 ... 10000;
};

say STDERR "--------------";

…t destruction.

IO objects are special, and sometimes are not destroyed when
references to them disappear.

Here's some example code:

{
  my $cmd = System::Command->new( 'cat' );
  $cmd->stdin->say( $_ ) for 0 ... 100;
};

When the block ends, $cmd will be destroyed.  Sometimes the
order of destruction is

- System::Command
- IO::Handle
- System::Command::Reaper

and sometimes it is

- System::Command
- System::Command::Reaper

Experimentally, when the Reaper object is destroyed (regardless of
whether the IO::Handles are destroyed first) it has lost access to its
cached IO::Handle objects.  They are cached as weak references, and
when the Command object is destroyed, it releases its references to
them, and the Reaper object loses its refereces.

If the second order of destruction happens, the handles still exist
(Perl seems to have an internal cache for them), but Reaper doesn't
see them, and thus can't close them.  It will try to reap the child
process, but if the child is awaiting input, it will never exit,
cannot be reaped, and Reaper::_reap will hang forever.

(Note that explicitly closing the handles via $cmd->close
is not safe, as, if the program is interrupted before the close,
and the signal handler throws, e.g. $SIG{INT} = sub { die },
the $cmd->close does not get fired.

So, just make sure things are closed.
@djerius
Copy link
Author

djerius commented Feb 18, 2025

TBH the client could do this ( I think this handles all of the race conditions)

my $cmd;
defer { $cmd and $cmd->stdin->close; }
$cmd = System::Command->new( 'cat' );
$cmd->stdin->say( $_ ) for 0 ... 100;

but I think it's cleaner to do it in the System::Command destructor.

@book
Copy link
Owner

book commented Feb 18, 2025

Given the way you produced the check for the issue, I assume there's no easy way to test for this, except running a number of iterations and checking they all pass.

@book
Copy link
Owner

book commented Feb 18, 2025

Oh, and because the issue is a hanging process, it's probably not a nice test to write...

@djerius
Copy link
Author

djerius commented Feb 18, 2025

Oh, and because the issue is a hanging process, it's probably not a nice test to write...

I have written tests that timeout (I had some issues with IPC on CPAN testers running on Windows) but I can't remember much more than that. Will have to go prod my brain cells.

@djerius
Copy link
Author

djerius commented Feb 18, 2025

Oh, and because the issue is a hanging process, it's probably not a nice test to write...

I have written tests that timeout (I had some issues with IPC on CPAN testers running on Windows) but I can't remember much more than that. Will have to go prod my brain cells.

Found it; no use here, it just timed out when trying to launch the child process.

Could set an alarm, but apparently this wouldn't work for this case on Windows; https://metacpan.org/dist/Time-Out/view/lib/Time/Out.pod#CAVEATS

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants