Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pipe() doesn't create a blocking pipe #13214

Open
emaxx-google opened this issue Jan 8, 2021 · 14 comments
Open

pipe() doesn't create a blocking pipe #13214

emaxx-google opened this issue Jan 8, 2021 · 14 comments

Comments

@emaxx-google
Copy link

emaxx-google commented Jan 8, 2021

On Emscripten, when reading from a pipe created via pipe(), the read() call returns immediately with EAGAIN when there's no data in the pipe. This contradicts the standard behavior specified by POSIX:

  EAGAIN The file descriptor fd refers to a file other than a
         socket and has been marked nonblocking (O_NONBLOCK), and
         the read would block.  See open(2) for further details on
         the O_NONBLOCK flag.
  EAGAIN or EWOULDBLOCK
         The file descriptor fd refers to a socket and has been
         marked nonblocking (O_NONBLOCK), and the read would block.

https://man7.org/linux/man-pages/man2/read.2.html

  If a process attempts to read from an empty pipe, then read(2)
  will block until data is available. <...>
  Nonblocking I/O is possible by using the fcntl(2)
  F_SETFL operation to enable the O_NONBLOCK open file status flag.

https://man7.org/linux/man-pages/man7/pipe.7.html

Given that pipe() doesn't set the O_NONBLOCK flag, I believe the Emscripten's behavior with returning EAGAIN is wrong.

Sample program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char** argv) {
  int pipefd[2];
  if (pipe(pipefd)) {
    perror("pipe() failed");
    exit(-1);
  }
  if (argc > 1) {
    char c = 0;
    if (write(pipefd[1], &c, 1) != 1) {
      perror("write() failed");
      exit(-1);
    }
  }
  char c;
  if (read(pipefd[0], &c, 1) != 1) {
    perror("read() failed");
    exit(-1);
  }
  exit(0);
}

This program, when run under Emscripten, fails with "Resource temporarily unavailable". When the same program is run on Linux using standard g++ and libc, it correctly blocks in read().
P.S. The program also demonstrates that, in case write() was previously called, the data is successfully read from it under Emscripten.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 8, 2021

Blocking is very tricky on the web platform. Unless you are running on the main thread you can't really block. You have to condtant yield to the event loop. In emscripten we can do this using ASYNCIFY but it has a cost and is not on by default. If you can think of a reasonable way simulate/emulate a blocking pipe we would welcome it. Or a better way to report that we don't support such things?

@sbc100
Copy link
Collaborator

sbc100 commented Jan 8, 2021

Sorry I mean to write "Unless you are running off the main thread".

@emaxx-google
Copy link
Author

I also tested the same snippet in a background thread, and the result was the same - it returns EAGAIN instead of blocking.

Also if a primitive like that isn't supported on a main thread, then probably the following docs need to be updated (they're written to suggest that all such blocking operation work on the main thread, even at a cost of runtime penalty): https://emscripten.org/docs/porting/pthreads.html#blocking-on-the-main-browser-thread

@sbc100
Copy link
Collaborator

sbc100 commented Jan 8, 2021

If you can think of a way to discuss that issue in that document that could be good yes. That document mostly talks about threading APIs though. Perhaps https://emscripten.org/docs/porting/asyncify.html could be another possible place to mention this since its one way it could be possible to create a blocking pipe.

Since emscripten is single process and doesn't support any kind of fork() i'm not sure how useful pipe really is for our users. Maybe for multi-threaded application one might want to pipe data between threads? I'm curious what you use case for them is?

@emaxx-google
Copy link
Author

emaxx-google commented Jan 8, 2021

It's true that pipe() is intended for multi-process communication, which doesn't make sense in Emscripten. But there's some amount of existing code that uses it as a threading primitive. In the codebase I'm migrating (which is a cross-platform *nix code that also used to work under NaCl correctly) the pipe() call is there: https://github.com/LudovicRousseau/PCSC/blob/113f74a1d7189ced85038b890253f2f3b477c3f0/src/hotplug_libusb.c#L561

And, yeah, it's confusing to see that the pipe() function returns success in Emscripten, but subsequent operations don't really work as expected. If pipes aren't supported at all, it'll be much less surprises if pipe() would return an error straight away. Application authors like me would have to find a workaround, but the problematic piece would be much easier to discover than in the current situation. Of course, it'll be even cooler if Emscripten would be fixed to support pipes fully, so that no workarounds are needed :)

@sbc100
Copy link
Collaborator

sbc100 commented Jan 8, 2021

I agree that maybe failing to create the pipe at all might be more clear. I'd be happy with that approach if the user is building in a configuration where blocking pipes are clearly not possible (e.g. without ASYNCIFY (which could one day allow for actual blocking pipes although I don't think it does today)).

Alternatively we could change the error return into a fatal abort and report "attempt to read from empty pipe without NONBLOCK.. this is not supported on emscripten because we can't block". Then certain programs that don't need blocking would continue to work and developers who rely on blocking would see a fatal error. Either way I think is fine given that there are probably very few uses for pipe in emscripten today.

@emaxx-google
Copy link
Author

Thanks, these options will be a good way to tell developer about the problem. Also would be great to mention it in docs - it helps when you know which unsupported functions you have to look for in the codebase.

@femski
Copy link

femski commented Mar 11, 2021

I had exact same question here:
#10556

and answer then was that its not possible.

But now that we have support for atomic.waitAsync in browser:
https://www.chromestatus.com/feature/6243382101803008

couldn't this be done using native browser feature (instead of asyncify which could also use same browser feature)?
Instead of waiting for a pipe, threads would wait for a piece of memory protected by atomics.

Is support for atomics coming in emscripten/webassembly also?
This will enable a broader range of event driven applications ported to WASM then.

attn: @kripken @sbc100

@sbc100
Copy link
Collaborator

sbc100 commented Mar 11, 2021

Sure if you want to do blocking pipes with multiple threads it could be possible to do with atomics. We do a lot of inter-thread communication within emscripten using atomics already.

What won't work (at least not without ASYNICIFY) is a single threaded app where the pipe is fed, for example, from some kind of JS event (like a fetch request).

@stale
Copy link

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Apr 16, 2022
tdf-gerrit pushed a commit to LibreOffice/core that referenced this issue Jan 2, 2023
The immediate reason for this is that pipes are broken in the
Emscripten runtime, see
emscripten-core/emscripten#13214. But if we
can drop the use of a pipe for other platforms, too, why not.

Without this, when attemting to run Collabora Online as WASM, I get:
Aborted(Assertion failed: nRet == 1, at: .../vcl/headless/svpinst.cxx,538,DoYield)

It is quite possible that the code could be simplified drastically. I
only replaced the use of a pipe with hopefully equivalent use of a
queue, a condition variable, and a mutex.

Change-Id: I9259ba36afeabce6474a1aec827d01bcbbd4412b
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/144944
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Jenkins
@emaxx-google
Copy link
Author

Stale bot intends to close this issue, but I believe it'll still be a valuable addition: POSIX doesn't forbid in-process usages of pipe(), and it's not unusual for Linux code to use it as a simple equivalent to mutexes/conditionvariables/queues. Silent deviation from the POSIX behavior makes this harder to discover.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 11, 2023

I guess we could error on pipe creation if O_NONBLOCK is not set? (either return an error or just abort). If thats sounds a like a reasonable solution, perhaps you would like to make a PR? Otherwise, we can leave this open an mark as help wanted and good first bug and somebody should pick it up.

@stale stale bot removed the wontfix label Jan 11, 2023
@emaxx-google
Copy link
Author

emaxx-google commented Jan 11, 2023

I'll probably not have cycles to work on this anymore soon. Erroring and/or logging would definitely be much more developer-friendly! Implementing POSIX behavior would be even better, as this would mean developers don't have to rewrite pipe() calls when porting to Emscripten, but I can't judge how difficult would implementing it be.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 11, 2023

Any kind of blocking behaviour on the web is on the more complex/hard problems we face. So failing on the creation of blocking pipes seems like that most reasonable solution for now.

chase pushed a commit to coparse-inc/libreofficekit that referenced this issue May 30, 2023
The immediate reason for this is that pipes are broken in the
Emscripten runtime, see
emscripten-core/emscripten#13214. But if we
can drop the use of a pipe for other platforms, too, why not.

Without this, when attemting to run Collabora Online as WASM, I get:
Aborted(Assertion failed: nRet == 1, at: .../vcl/headless/svpinst.cxx,538,DoYield)

It is quite possible that the code could be simplified drastically. I
only replaced the use of a pipe with hopefully equivalent use of a
queue, a condition variable, and a mutex.

Change-Id: I9259ba36afeabce6474a1aec827d01bcbbd4412b
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/144944
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Jenkins
adamziel added a commit to WordPress/wordpress-playground that referenced this issue Jan 17, 2024
Shims read(2) functionallity by providing an alternative read()
function called wasm_read() that gracefully handles blocking
pipes. This is neecessary because Emscripten does not support
blocking pipes and instead returns EWOULDBLOCK.

See:

* #951
* emscripten-core/emscripten#13214

 ## Testing instructions

Confirm the `proc_open()` tests pass on CI. This PR adjusts a few
of them to make sure the output is read without the tricky sleep()
call.
adamziel added a commit to WordPress/wordpress-playground that referenced this issue Jan 18, 2024
Shims read(2) functionallity by providing an alternative read()
function called wasm_read() that gracefully handles blocking
pipes. This is neecessary because Emscripten returns EWOULDBLOCK
when reading from a blocking pipe, whereas a clang-compiled 
program would wait until data becomes available.

See:

* #951
* emscripten-core/emscripten#13214

## Other changes

This PR also ships:

* A regexp fix for `@php-wasm/cli` to preserve a trailing whitespace
when rewrite PHP-related spawn shell commands.
* A fix to correctly propagate the child process exit code back to PHP.
The WordPress bootstrap script for PHPUnit needs that to work correctly.

## Motivation

### wp-cli

Without this PR, running `wp-cli.phar` doesn't output anything. The
output is piped through a pager like `less` using `proc_open` and then
displayed by reading from a blocking pipe.

With this PR, running `wp-cli.phar` returns its regular help message as
the pager pipe can be read:

```
NAME
  wp
DESCRIPTION
  Manage WordPress through the command-line.
SYNOPSIS
  wp <command>
SUBCOMMANDS
  cache                 Adds, removes, fetches, and flushes the WP Object Cache
                        object.
  cap                   Adds, removes, and lists capabilities of a user role.
  cli                   Reviews current WP-CLI info, checks for updates, or
                        views defined aliases.
...
```

### PHPUnit

With this PR, Playground can run PHPunit on `wordpress-develop`!

```
TMPDIR=/tmp PHP=8.2 node --loader ../plugins/playground/packages/nx-extensions/src/executors/built-script/loader.mjs ../plugins/playground/dist/packages/php-wasm/cli/main.js ./vendor/bin/phpunit --no-coverage -v
(node:87723) ExperimentalWarning: Custom ESM Loaders is an experimental feature. This feature could change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Installing...
aaabb
int(0)
Running as single site... To run multisite, use -c tests/phpunit/multisite.xml
Not running ajax tests. To execute these, use --group ajax.
Not running ms-files tests. To execute these, use --group ms-files.
Not running external-http tests. To execute these, use --group external-http.
PHPUnit 9.6.15 by Sebastian Bergmann and contributors.

Runtime:       PHP 8.2.10-dev
Configuration: /Users/cloudnik/www/Automattic/core/wordpress-develop/phpunit.xml.dist
Warning:       Your XML configuration validates against a deprecated schema.
Suggestion:    Migrate your XML configuration using "--migrate-configuration"!

...........................................................    59 / 17622 (  0%)
...........................................................   118 / 17622 (  0%)
............................FFFFFF.........................   177 / 17622 (  1%)
...........................................................   236 / 17622 (  1%)
```

## Testing instructions

Confirm the `proc_open()` tests pass on CI. This PR adjusts a few
of them to make sure the output is read without the tricky sleep()
call.

Related:

* #827
* #930

CC @mho22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants