Deno support #3420

Open
6 tasks
rth opened this issue Jan 4, 2023 · 16 comments

@rth
Member

rth commented Jan 4, 2023

Deno support was discussed in #1477 (comment) and I think it would be good to add it and have some minimal tests in CI. As discussed in the linked PR only a few minor fixes are needed. I'll open a PR shortly.

Personally, the use case I'm interested in that could benefit from running in Deno is server side sandboxing #869 (comment)

What's left is:

  • Support running in Deno (Draft: Improve Deno compatibility #3421)
  • Adapt src/js/streams.ts for Deno
  • Add support for deno runtime to pytest-pyodide
  • Add some tests with Deno in CI
  • Update documentation
  • (optional) Make deno work with local file (without network access) if possible for sandboxing
@rajsite
Contributor

rajsite commented Apr 18, 2023

@rth do you know if this issue is still moving forward? I'm really interested in the Deno server-side sandboxing use case as well.

@rth
Member Author

rth commented Apr 19, 2023

I haven't found much time to work on it lately, but it's still something I want to do.

So if you are interested in hacking on it and continuing #3421 or #3428, that would certainly be appreciated :)

@rajsite
Contributor

rajsite commented Apr 27, 2023

I experimented with a different direction and it looks really promising! Deno has been adding npm package support so I tried loading pyodide in Deno as a CommonJS module.

I had to make a patch to pyodide.js and pyodide.asm.js as the IN_NODE heuristic was failing in Deno's npm mode:

image

Looks like the process.release.name property is not defined on the process object in Deno's npm compatibility mode and I filed an issue to track it: denoland/deno#18870

But with that patch I can run pyodide in Deno 🎉

One can reproduce it with the current pyodide release from npm by doing the following:

  1. Create a main.ts with the contents:

    // Note directly importing to the path of the CommonJS module
    // as importing the bare `npm:pyodide` resolves to the esm module
    // which seems to have a lot more changes needed for Deno support
    import {loadPyodide} from "npm:pyodide/pyodide.js";
    
    const pyodide = await loadPyodide();
    await pyodide.loadPackage('micropip');
    const micropip = pyodide.pyimport("micropip");
    await micropip.install('snowballstemmer');
    const result = await pyodide.runPythonAsync(`
    import snowballstemmer
    stemmer = snowballstemmer.stemmer('english')
    stemmer.stemWords('go goes going gone'.split())
    `);
    
    console.log(result.toString());
  2. Create a local node_modules folder by running deno cache --node-modules-dir main.ts and apply the patch described above to node_modules/pyodide/pyodide.js and node_modules/pyodide/pyodide.asm.js

  3. Run the script with deno run --node-modules-dir -A main.ts

During debugging in pyodide.asm.js I found an additional node heuristic already in-use that seems compatible with Deno today and is coming from emscripten:

var ENVIRONMENT_IS_NODE = typeof process == 'object' && typeof process.versions == 'object' && typeof process.versions.node == 'string';

💭 @rth Would y'all be open to changing the IN_NODE check to use the same pattern as emscripten's ENVIRONMENT_IS_NODE check? If so I can make a PR for that.
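For illustration, the emscripten-style check quoted above can be written as a small function that takes the process object as a parameter. This is only a sketch: the name `looksLikeNode` and the sample objects (including their version strings) are hypothetical, and it is not pyodide's actual `IN_NODE` code. Unlike the quoted one-liner, it also guards against `null`.

```typescript
// Hypothetical helper mirroring the emscripten-style check quoted above.
// `proc` stands in for the global `process` object so the logic can be
// exercised without depending on the runtime it executes in.
function looksLikeNode(proc: unknown): boolean {
  if (typeof proc !== "object" || proc === null) return false;
  const versions = (proc as { versions?: unknown }).versions;
  if (typeof versions !== "object" || versions === null) return false;
  return typeof (versions as { node?: unknown }).node === "string";
}

// Illustrative shapes only: a Node-like process object, and one shaped like
// the Deno npm-mode case described above (no `release.name`).
const nodeProcess = { versions: { node: "18.16.0" }, release: { name: "node" } };
const denoNpmProcess = { versions: { node: "18.12.1" } };

console.log(looksLikeNode(nodeProcess));    // true
console.log(looksLikeNode(denoNpmProcess)); // true
console.log(looksLikeNode(undefined));      // false
```

The point of the sketch is that the check never reads `process.release.name`, so a missing `release` property does not affect the result.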

My ultimate goal however was to have a full local copy of pyodide I can run sandboxed and I found a partial pattern for that. Fundamentally I wanted to use Deno's npm support to load a local pyodide CommonJS module and several approaches I tried didn't work out:

The workaround I did find does the following:

  1. Load pyodide.js and pyodide.asm.js from npm
  2. Use indexURL to reference the local full copy of pyodide

The main.ts script for that workflow looks like:

import { fromFileUrl } from 'https://deno.land/std/path/mod.ts';

// Manually load pyodide.asm.js first so it registers itself on the global
// The pyodide.js file tries to load it automatically if it's not on the global
// but the way it does it seems to cause it to run outside of Deno's npm mode
await import("npm:pyodide/pyodide.asm.js");
const {loadPyodide} = await import("npm:pyodide/pyodide.js");

const pyodide = await loadPyodide({
  // The full release download of pyodide aligned to the npm imported version
  // saved to a directory named pyodide-local
  indexURL: fromFileUrl(new URL('./pyodide-local/', import.meta.url).href)
});
await pyodide.loadPackage('micropip');
const micropip = pyodide.pyimport("micropip");
await micropip.install('numpy');

const result = await pyodide.runPythonAsync(`
import numpy as np
a = np.arange(15).reshape(3, 5)
a
`);

console.log(result.toString());
// ⚠️Make sure to use the patches described in the other workflow if they are still needed
// 🎉 Run python in the deno sandbox! 🐍🦖
// deno run --node-modules-dir --allow-read=. main.ts

I don't have a super clear direction for removing the workarounds in the local full pyodide workflow. I think getting the CommonJS pyodide.js to import pyodide.asm.js via node require could help. But overall exciting to actually get it sandboxed and running in Deno!

@hoodmane
Member

Sure we can change the node detection. We should also really allow the runtime to be controlled explicitly via an argument to loadPyodide...

@rth
Member Author

rth commented Apr 27, 2023

That's great @rajsite! Yes, please make a PR to change it in js/compat.ts. I think that should change it in both pyodide.js and pyodide.asm.js (or at least for the latter I don't see where else it could be coming from; that variable is not defined by emscripten). Ideally we would also add a CI job similar to this that would maybe just run deno on your example to start. We can integrate it in pytest-pyodide at a later stage.

If you manage to make it run with IN_NODE case, that would certainly be better than having to define IN_DENO separately as I tried to do.

We should also really allow the runtime to be controlled explicitly via an argument to loadPyodide...

Are you sure? The runtime is a property of where one runs. IMO it doesn't make sense to allow running in the browser while using the code written for Node.

@rth
Member Author

rth commented Apr 27, 2023

My ultimate goal however was to have a full local copy of pyodide I can run sandboxed and I found a partial pattern for that.

Yes, I agree that this would be the end goal for most people interested in server sandboxing. But so if you are loading pyodide.js and pyodide.asm.js from npm, doesn't it mean that you need to allow network access to npm and disable the network sandboxing? One can't whitelist URLs as far as I understand?

--allow-net= Allow network access. You can specify an optional, comma-separated list of IP addresses or hostnames (optionally with ports) to provide an allow-list of allowed network addresses.

I think once deno support is merged, we may want to add a separate documentation section on sandboxing. Even if it has such workarounds initially it's probably not a major issue. The other thing people would likely care about is startup time, particularly when compared to other server-side wasm solutions, for instance in the case where one would start a new deno process with Pyodide for each untrusted code chunk to be run in a sandbox.

@rth
Member Author

rth commented Apr 27, 2023

Though this still doesn't tell us how to load a locally built copy of pyodide that's not uploaded to npm. If it works when loaded from npm, I imagine it should also work if served from a local server, or is the npm loader doing something special? But to load it all from local files, I guess more work would be necessary?

@rajsite
Contributor

rajsite commented Apr 28, 2023

Tried applying that patch to main and I think the esbuild change is causing Deno to treat the export from pyodide.js as a default export instead of a named export.

import pyodideModule from "npm:pyodide/pyodide.js"; /* local build of main with patch*/
const {loadPyodide} = pyodideModule;

I'm not exactly sure if it's a Deno-specific issue or if that would be a break in behavior elsewhere. It doesn't seem to have changed node behavior via require, so maybe it's fine.
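A caller that has to cope with both module shapes (named export before the esbuild change, default export after it) could normalize them with a small helper. This is a sketch under assumptions: `resolveLoadPyodide`, `LoadPyodide`, and `ModuleShape` are hypothetical names, not part of pyodide, and the function type is deliberately loose.

```typescript
// Hypothetical interop helper: accept either module shape and return loadPyodide.
type LoadPyodide = (options?: Record<string, unknown>) => Promise<unknown>;

interface ModuleShape {
  loadPyodide?: LoadPyodide;               // named export
  default?: { loadPyodide?: LoadPyodide }; // CommonJS module surfaced as default
}

function resolveLoadPyodide(mod: ModuleShape): LoadPyodide {
  // Prefer the named export, then fall back to the default export's property.
  const fn = mod.loadPyodide ?? mod.default?.loadPyodide;
  if (typeof fn !== "function") {
    throw new Error("loadPyodide not found as a named or default export");
  }
  return fn;
}

// Intended use in Deno (not executed here):
//   const loadPyodide = resolveLoadPyodide(await import("npm:pyodide/pyodide.js"));
```

Whether such a shim is worth carrying depends on whether the export shape change is considered a regression or the intended behavior.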

@rajsite
Contributor

rajsite commented Apr 30, 2023

Sure we can change the node detection [...] Yes, please make a PR to change it in js/compat.ts

Done! Created a PR.

Ideally would be also to add a CI job [...] We can integrate it in pytest-pyodide at a later stage.

Sure! I made it a separate Draft PR since there is probably some bike shedding needed and I'll add the docs discussed: #3810

But so if you are loading pyodide.js and pyodide.asm.js from npm, doesn't it mean that you need to allow network access to npm and disable the network sandboxing?

Part of Deno's shtick is that statically analyzable imports can be done safely by the deno process outside of script execution. So all the static imports and the statically analyzable dynamic imports are fetched by Deno outside of the context of the permission sandbox. The permission sandbox is for the actual code that runs doing fetch of network resources and dynamic imports of non-statically analyzable import names. See Integrity Checking and Vendoring Dependencies.

I think the integrity checking on npm packages is good today; npm packages show up in the deno.lock file. But vendoring for npm packages is still on the roadmap. My guess is npm vendoring will just end up looking like the --node-modules-dir flag workflow discussed below.

Also in the testing draft PR above notice how the run:smoke-test task is able to run with only the allow-read permission after the local cache has been downloaded and patched with the local build output.

Though this still doesn't tell us how to load a locally built copy of pyodide that's not uploaded to npm.

The --node-modules-dir flag section of the npm: specifiers topic covers a workflow where you sync an existing package but patch the contents. With that you need to have an existing published package (for name and dependency resolution) but then you can replace the contents of the package in the local cache with whatever you need.

That's the workflow used in the smoke test draft PR. It's a little janky: if, for example, the dependencies of the package you are validating against needed to change, I don't think you'll get those dependencies updated. But it's probably workable enough for the pyodide CI use case, at least until improved local package support comes around in one of the issues linked in the previous comment.

I imagine it should also work if served from a local server, or is the npm loader doing something special?

Deno added support for a custom npm registry server configured via NPM_CONFIG_REGISTRY. I didn't explore that direction further but it should probably be documented as an option.

@rajsite
Contributor

rajsite commented May 10, 2023

Still need some docs but for some example snippets of Deno with pyodide:

These examples use the --node-modules-dir flag so Deno creates a node_modules folder in the current directory that we can sandbox to (otherwise we would need to give permission to read from the Deno global node_modules cache folder).

Simple example

// example.ts
import pyodideModule from "npm:pyodide/pyodide.js";
const { loadPyodide } = pyodideModule;
const pyodide = await loadPyodide();
const result = await pyodide.runPythonAsync(`
3+4
`);
console.log("result:", result.toString());

Which you can run with deno run --node-modules-dir --allow-read=. example.ts and has output:

result: 7

Example with additional packages

This example shows the workflow with packages fetched over the network:

// numpy.ts
import pyodideModule from "npm:pyodide/pyodide.js";
const { loadPyodide } = pyodideModule;
const pyodide = await loadPyodide();
await pyodide.loadPackage("numpy");
const result = await pyodide.runPythonAsync(`
import numpy as np
a = np.arange(15).reshape(3, 5)
a
`);
console.log("result:", result.toString());

On the first run we have to give pyodide permission to fetch the packages over the network and write them to its cache, with deno run --node-modules-dir --allow-read=. --allow-write=node_modules --allow-net numpy.ts, which outputs:

Loading numpy
Didn't find package numpy-1.24.2-cp311-cp311-emscripten_3_1_32_wasm32.whl locally, attempting to load from https://cdn.jsdelivr.net/pyodide/v0.23.2/full/
Package numpy-1.24.2-cp311-cp311-emscripten_3_1_32_wasm32.whl loaded from https://cdn.jsdelivr.net/pyodide/v0.23.2/full/, caching the wheel in node_modules for future use.
Loaded numpy
result: array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

Once everything is cached locally we can run with a stricter sandbox like above, with deno run --node-modules-dir --allow-read=. numpy.ts, which outputs:

Loading numpy
Loaded numpy
result: array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
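The two permission sets used above (a permissive first run that warms the cache, then a read-only run) can be captured in a small helper that builds the `deno run` argument list. This is a sketch: `denoRunArgs` and the `Phase` type are hypothetical names, and the optional host allow-list relies on the `--allow-net=<hosts>` form. Whether allowing only `cdn.jsdelivr.net` is sufficient is an assumption based on the log output above.

```typescript
// Hypothetical helper that assembles the `deno run` arguments used above.
type Phase = "warm-cache" | "sandboxed";

function denoRunArgs(script: string, phase: Phase, netHosts: string[] = []): string[] {
  const args = ["run", "--node-modules-dir", "--allow-read=."];
  if (phase === "warm-cache") {
    // First run only: pyodide downloads wheels and caches them in node_modules.
    args.push("--allow-write=node_modules");
    // Narrow network access to an allow-list when one is given (assumption:
    // the listed hosts are the only ones pyodide needs to reach).
    args.push(netHosts.length > 0 ? `--allow-net=${netHosts.join(",")}` : "--allow-net");
  }
  args.push(script);
  return args;
}

console.log(["deno", ...denoRunArgs("numpy.ts", "warm-cache")].join(" "));
// deno run --node-modules-dir --allow-read=. --allow-write=node_modules --allow-net numpy.ts
console.log(["deno", ...denoRunArgs("numpy.ts", "sandboxed")].join(" "));
// deno run --node-modules-dir --allow-read=. numpy.ts
```

Keeping the phases explicit makes it harder to leave the write and network permissions enabled by accident once the cache is populated.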

@simonw

simonw commented May 10, 2023

@rajsite thanks for that example, I wrote up some notes on an experiment that it inspired: https://til.simonwillison.net/deno/pyodide-sandbox

@vfssantos

vfssantos commented Sep 14, 2023

Hey!

I'm trying to use pyodide in an Edge environment (Deno Deploy), and unfortunately, it doesn't support using npm: specifiers for importing modules.
One other approach that worked, though, is to trick this package into thinking it's running in a browser environment, and import the mjs module dynamically:

window.document = {};
const {loadPyodide} = await import("https://www.unpkg.com/pyodide/pyodide.mjs");
const pyodide = await loadPyodide();
const result = await pyodide.runPythonAsync("3+4");
console.log("result:", result.toString()); // expected -> `result: 7`

However, this will load the module dynamically every time instead of doing a pre-cache of the module, considerably increasing the execution time.

But then I thought, if the above code works, why not just consider Deno runtime and Browser as the same in compat?

@rajsite
Contributor

rajsite commented Sep 14, 2023

@vfssantos I was able to get pyodide to work in deno deploy here: https://x.com/rajsite/status/1701328734567956824?s=20

Deno deploy recently added support for npm specifiers: https://deno.com/blog/npm-on-deno-deploy

Deno deploy examples with pyodide:

Playground: https://dash.deno.com/playground/pyodide-example
Running example: https://pyodide-example.deno.dev/?multiply=9

Which has snippet:

import pyodideModule from "npm:pyodide/pyodide.js";
const { loadPyodide } = pyodideModule;
const pyodide = await loadPyodide();

Deno.serve(async (req: Request) => {
    const url = new URL(req.url);
    // parseInt expects a string, so the fallback must be '7' rather than 7
    const multiply = parseInt(url.searchParams.get('multiply') ?? '7', 10);
    pyodide.globals.set('x', multiply)
    const result = await pyodide.runPythonAsync(`x * x`);
    return new Response("Hello World: " + result.toString());
});
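The handler above falls back to 7 when the `multiply` parameter is missing, but a non-numeric value such as `?multiply=abc` would parse to NaN and be passed on to Python. A slightly more defensive parser might look like the following sketch. `parseMultiply` is a hypothetical helper, not part of the deployed example.

```typescript
// Hypothetical helper: read the `multiply` query parameter, falling back to a
// default when it is missing or not an integer.
function parseMultiply(requestUrl: string, fallback = 7): number {
  const raw = new URL(requestUrl).searchParams.get("multiply");
  const value = raw === null ? NaN : Number.parseInt(raw, 10);
  return Number.isNaN(value) ? fallback : value;
}

console.log(parseMultiply("https://pyodide-example.deno.dev/?multiply=9"));   // 9
console.log(parseMultiply("https://pyodide-example.deno.dev/"));              // 7
console.log(parseMultiply("https://pyodide-example.deno.dev/?multiply=abc")); // 7
```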

The issue I ran into is that the Deno Deploy filesystem is read-only, so trying to import additional packages fails as it tries to write to the filesystem. I asked about workarounds but haven't heard much discussion yet: #3950 (comment)

Can re-ask here though, any thoughts on the right workaround to get packages to load in memory instead of from disk?
Another thought I just had was deploying the pyodide packages as an npm package. Then it'll all be local on deno-deploy, no fetching at start-up needed (though that would be a chunky npm package) 🤔

@rajsite
Contributor

rajsite commented Sep 14, 2023

But then I thought, if the above code works, why not just consider Deno runtime and Browser as the same in compat?

@vfssantos From what I remember the mjs imports didn't work at the time, but I found that with minor tweaks the npm: specifier did. If mjs imports are reliable now that's definitely the winning way to go with Deno. Using the npm: / cjs approach has a lot of drawbacks that made it neat but not practical in my environments.

@rth
Member Author

rth commented Sep 14, 2023

Another thought I just had was deploying the pyodide packages as an npm package. Then it'll all be local on deno-deploy, no fetching at start-up needed (though that would be a chunky npm package)

This was indeed suggested several times, but the problem is that the packages Pyodide distributes are only a tiny fraction of all packages people use in Pyodide from PyPI (and we aim to upload binary wheels to PyPI in the long term as well). It wouldn't make sense to put all of PyPI (or even a reasonable subset) to npm, so some other solution needs to be found.

Though once you have an application with a fixed set of dependencies, maybe deploying it as an npm package to some private repository could indeed make this use case easier.

@vfssantos

vfssantos commented Sep 20, 2023

Hey @rajsite !
Regarding Deno Deploy accepting npm: specifiers, wow! That's awesome, a great win for the Deno Deploy ecosystem!

Regarding loading Python packages in Deno Deploy, I figured out a hack that kind of works, but without using the npm: specifier. Also, I'm sometimes hitting the "cpu usage limit" from Deno Deploy:

// One key catch is to statically import the pyodide.asm.js file, because otherwise 
// pyodide would try to import it dynamically from a template string, 
// which Deno Deploy does not support at the moment.
import "https://cdn.jsdelivr.net/pyodide/v0.24.0/full/pyodide.asm.js"; // Statically import the pyodide.asm.js script

// hacks for tricking pyodide into thinking it's running on browsers
globalThis.document={}; 
globalThis.location= new URL(import.meta.url);

// Load pyodide ESM
const { loadPyodide } = await import("https://cdn.jsdelivr.net/pyodide/v0.24.0/full/pyodide.mjs");
const pyodide = await loadPyodide();

// Load Numpy module
await pyodide.loadPackage("numpy");

// Server
Deno.serve(async req=>{
    const res = await pyodide.runPython("import numpy as np\nnp.sum([10,1])");
    // Response bodies must be strings (or streams/buffers), so stringify the result
    return new Response(String(res));
})

The basic idea is that, if it works in browsers, it should work in Deno.
And somehow it is possible to dynamically load modules in the browser without using the filesystem, so the same should be possible in Deno.

The problem here is the following lines of code, which are needed and which prevent pyodide from being statically imported, therefore increasing CPU time and overall latency in Deno Deploy.

globalThis.document={}; 
globalThis.location= new URL(import.meta.url);

However, I imagine that there might be a fairly simple fix in /src/js/compat to treat the Deno runtime the same as browsers, which would avoid having to add these lines as a workaround and therefore allow statically importing the pyodide module as well.
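As a rough illustration of what such a compat check might look like, the sketch below classifies a runtime from a globals-like object. It is not pyodide's actual compat code: `classifyRuntime`, `Runtime`, and `GlobalLike` are hypothetical names, and routing the "deno" case through the browser-style loading path is the assumption being explored here.

```typescript
// Hypothetical runtime classification; not pyodide's actual compat logic.
type Runtime = "deno" | "node" | "browser" | "unknown";

interface GlobalLike {
  Deno?: unknown;
  process?: { versions?: { node?: unknown } };
  document?: unknown;
}

function classifyRuntime(g: GlobalLike): Runtime {
  // Check for Deno first: in npm compatibility mode it can also expose a
  // Node-like `process`, so the order of these tests matters.
  if (typeof g.Deno !== "undefined") return "deno";
  if (typeof g.process?.versions?.node === "string") return "node";
  if (typeof g.document !== "undefined") return "browser";
  return "unknown";
}

console.log(classifyRuntime({ Deno: {}, process: { versions: { node: "18.0.0" } } })); // deno
console.log(classifyRuntime({ process: { versions: { node: "18.0.0" } } }));           // node
console.log(classifyRuntime({ document: {} }));                                        // browser
```

With an explicit "deno" result, a loader could choose browser-style fetching without the caller having to fake `globalThis.document`.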

I hope to be able to look into it soon :)

Cheers
