[Discussion] FFI - require('dlopen') - libuv dynamic library loader binding #1762

Closed
wants to merge 17 commits into from
1 change: 1 addition & 0 deletions doc/api/_toc.markdown
@@ -10,6 +10,7 @@
* [Console](console.html)
* [Crypto](crypto.html)
* [Debugger](debugger.html)
* [Dlopen](dlopen.html)
* [DNS](dns.html)
* [Domain](domain.html)
* [Errors](errors.html)
92 changes: 92 additions & 0 deletions doc/api/dlopen.markdown
@@ -0,0 +1,92 @@
# dlopen

Stability: 1 - Experimental

Dynamic library loader module.

Use `require('dlopen')` to access this module.

## dlopen.extension

* String

File extension used for dynamic libraries.

For example, on Mac OS X:

    console.log(dlopen.extension);
    // '.dylib'
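
A minimal sketch of how the platform lookup behind `dlopen.extension` could be used to build a library filename. The table is copied from `lib/dlopen.js` in this PR; the helper `libraryFileName` and the `'lib'` prefix are assumptions made for illustration (the prefix mirrors the `product_prefix` of the test library in `node.gyp`) and are not part of the proposed API.

```javascript
'use strict';

// Platform-to-extension table, copied from lib/dlopen.js in this PR.
const EXTENSIONS = {
  linux: '.so',
  sunos: '.so',
  solaris: '.so',
  freebsd: '.so',
  openbsd: '.so',
  darwin: '.dylib',
  win32: '.dll'
};

// Hypothetical helper (not part of the PR): build a library filename for a
// given platform. The 'lib' prefix is an assumption for illustration.
function libraryFileName(baseName, platform = process.platform) {
  const ext = EXTENSIONS[platform];
  if (ext === undefined) {
    throw new Error(`no known dynamic library extension for ${platform}`);
  }
  return 'lib' + baseName + ext;
}

console.log(libraryFileName('test', 'darwin')); // 'libtest.dylib'
console.log(libraryFileName('test', 'linux'));  // 'libtest.so'
```

Note that the table in the PR has no entry for other platforms, so `dlopen.extension` would be `undefined` there; the sketch throws instead to make that case visible.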

## dlopen.dlopen(name)

* `name` String or `null` - library filename
* Return: Buffer - backing store for the `uv_lib_t` instance

Load and link a dynamic library with filename `name`.
If `null` is given as the name, then the current node process is
dynamically loaded instead (i.e. you can load symbols already
loaded into memory).

Example:

    var libc = dlopen.dlopen('libc.so');
    // <Buffer 98 b8 a9 6c ff 7f 00 00 00 00 00 00 00 00 00 00>

    // null for the current process' memory
    var currentProcess = dlopen.dlopen(null);
    // <Buffer fe ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00>

    // error is thrown if something goes wrong
    dlopen.dlopen('libdoesnotexist.so');
    // Error: dlopen(libdoesnotexist.so, 1): image not found
    //     at Object.dlopen (dlopen.js:9:11)
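
The throw-on-failure behaviour described here follows a common pattern: the native binding returns a status code, and the JS wrapper converts a non-zero status into an `Error` carrying the `dlerror()` message. The sketch below reproduces that shape with a stand-in object, `fakeBinding`, which is entirely hypothetical and only simulates `process.binding('dlopen')`. The size of 16 bytes is an assumption taken from the buffer length shown in the example output.

```javascript
'use strict';

// Stand-in for process.binding('dlopen'); purely hypothetical. It "loads"
// only the names listed in `known` and records an error message otherwise.
const fakeBinding = {
  sizeof_uv_lib_t: 16, // assumed size, see lead-in
  known: new Set(['libc.so']),
  lastError: 'no error',
  dlopen(name, lib) {
    if (name !== null && !this.known.has(name)) {
      this.lastError = `dlopen(${name}): image not found`;
      return -1;
    }
    lib.fill(0);
    this.lastError = 'no error';
    return 0;
  },
  dlerror(lib) {
    return this.lastError;
  }
};

// Same shape as exports.dlopen in the PR: allocate the backing store,
// call the binding, and turn a non-zero status into a thrown Error.
function dlopen(name) {
  const lib = Buffer.alloc(fakeBinding.sizeof_uv_lib_t);
  if (fakeBinding.dlopen(name, lib) !== 0) {
    throw new Error(fakeBinding.dlerror(lib));
  }
  return lib;
}

console.log(dlopen('libc.so').length); // 16
try {
  dlopen('libdoesnotexist.so');
} catch (err) {
  console.log(err.message); // dlopen(libdoesnotexist.so): image not found
}
```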

## dlopen.dlclose(lib)

* `lib` Buffer - the buffer previously returned from `dlopen()`

Closes dynamic library `lib`.

Example:

    dlopen.dlclose(libc);

## dlopen.dlsym(lib, name)

* `lib` Buffer - the buffer previously returned from `dlopen()`
* `name` String - name of the symbol to retrieve from `lib`
* Return: Buffer - a pointer-sized buffer containing the address of `name`

Get the memory address of symbol `name` from dynamic library `lib`.
A new Buffer instance is returned containing the memory address of
the loaded symbol.

Almost always, you will call one of the Buffer `readPointer*()`
functions on the returned buffer in order to interact with the symbol
further.

Example:

    var absSymPtr = dlopen.dlsym(libc, 'abs');
    // <Buffer 73 75 7d 98 ff 7f 00 00>

    // error is thrown if symbol does not exist
    dlopen.dlsym(libc, 'doesnotexist');
    // Error: dlsym(0x7fff6ad9f898, doesnotexist): symbol not found
    //     at Object.dlsym (dlopen.js:24:11)
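
To make "a pointer-sized buffer containing the address" concrete, the following sketch decodes the example bytes above into a numeric address using the standard Buffer method `readBigUInt64LE()`. It is not the `readPointer*()` API this PR depends on; the helper name `addressOf` is hypothetical, and 8-byte pointers are assumed, matching the hard-coded `new Buffer(8)` in `lib/dlopen.js`.

```javascript
'use strict';
const os = require('os');

// Hypothetical helper: interpret an 8-byte Buffer as a 64-bit address.
// Assumes 8-byte pointers, like the hard-coded `new Buffer(8)` in the PR.
function addressOf(symBuf, endianness = os.endianness()) {
  if (symBuf.length !== 8) {
    throw new RangeError('expected an 8-byte buffer');
  }
  return endianness === 'LE' ?
      symBuf.readBigUInt64LE(0) :
      symBuf.readBigUInt64BE(0);
}

// Bytes from the dlsym() example, as they would appear on a
// little-endian machine.
const absSymPtr = Buffer.from([0x73, 0x75, 0x7d, 0x98, 0xff, 0x7f, 0x00, 0x00]);
console.log('0x' + addressOf(absSymPtr, 'LE').toString(16)); // 0x7fff987d7573
```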

## dlopen.dlerror(lib)

* `lib` Buffer - the buffer previously returned from `dlopen()`
* Return: String - most recent error that has occurred on `lib`

Get previous error message from dynamic library `lib`.

You most likely won't need to use this function, since `dlopen()`
and `dlsym()` call it internally when something goes wrong.

Example:

    dlopen.dlerror(libc);
    // 'no error'
42 changes: 42 additions & 0 deletions lib/dlopen.js
@@ -0,0 +1,42 @@
'use strict';

const binding = process.binding('dlopen');

exports.dlopen = function dlopen(name) {
  var lib = new Buffer(binding.sizeof_uv_lib_t);
Contributor

This needs to be reconsidered because of the upcoming V8 changes. Here you'd essentially be using a Uint8Array(). The problem is that with the new implementation we are going to allow V8's GC to handle this automatically, and there's no way to use conditional logic to make the typed array external and persistent on instantiation. The best you might be able to do is pass the new instance to another function that does this for you.

But for performance and to simplify the code, I would instead suggest returning a Persistent<External>.

Contributor Author

Sorry, I'm obviously not getting something here. What exactly is the implied change here from the switch to Uint8Array? As long as the C++ layer can still unwrap to the bare char * then I think it shouldn't really matter. The GC handled Buffers before as well so I'm not sure what the change here will be.

Also this has basically been the recommended way of allocating memory for structs/etc. in node-ffi for years now, so I would hate for that to be an issue 😀

Contributor

> Also this has basically been the recommended way of allocating memory for structs/etc. in node-ffi for years now, so I would hate for that to be an issue

This hinges on whether you're alright with V8 cleaning up the pointer, or if you'd like manual control?

Contributor Author

Calling new Buffer() has always implied V8/node cleaning up the memory.

Contributor

I remember you requested we allow a native module to supply the weak callback. Is that correct?

Contributor Author

Sorry man, still not sure what you're driving at.

Are you worried about automating dlclose() when lib gets GC'd? I'm not so worried about that; I consider that the user's responsibility.

Contributor

I'm concerned with making sure the new Buffer implementation meets all your needs. When Buffer moves to Uint8Array it will not have a weak callback available. Anyone who wishes to use one will have to externalize the object and set up the weak callback on their own.

I'm trying to recall a conversation we had 3+ years ago when I was doing my first rewrite of Buffer. I vaguely recall that for certain cases you wanted access to the weak callback because the lifetime of the pointer might have been different than that of the Object.
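
The thread above contrasts two lifetimes: explicit release by the user (calling `dlclose()`) and release when the wrapper object is garbage collected (a weak callback). As a rough JS-level analogy only, and not the C++ `Persistent`/weak-callback mechanism being discussed, the sketch below combines an explicit `close()` with `FinalizationRegistry` (available in Node.js 14.6 and later) as a backstop. The class `NativeHandle` and its `release` argument are hypothetical.

```javascript
'use strict';

// Hypothetical wrapper, not part of the PR. `release` stands in for
// whatever frees the native resource (for example a dlclose() call).
class NativeHandle {
  constructor(resource, release) {
    // Kept in a separate object so the registry never holds the handle
    // itself strongly.
    this.state = { resource, release, released: false };
    NativeHandle.registry.register(this, this.state, this);
  }

  // Explicit, user-driven release. Safe to call more than once.
  close() {
    NativeHandle.registry.unregister(this);
    NativeHandle.finalize(this.state);
  }

  // Shared by close() and the GC-driven callback.
  static finalize(state) {
    if (state.released) return;
    state.released = true;
    state.release(state.resource);
  }
}

// GC-driven backstop: runs at some point after an unclosed handle is
// collected. Timing is not guaranteed.
NativeHandle.registry = new FinalizationRegistry(NativeHandle.finalize);

let closed = 0;
const handle = new NativeHandle('libtest', () => { closed++; });
handle.close();
handle.close();
console.log(closed); // 1
```

The explicit path is deterministic; the registry callback is only a best-effort fallback, which is one reason to treat `dlclose()` as the caller's responsibility.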


  if (binding.dlopen(name, lib) !== 0) {
    throw new Error(exports.dlerror(lib));
  }

  return lib;
};

exports.dlclose = function dlclose(lib) {
  return binding.dlclose(lib);
};

exports.dlsym = function dlsym(lib, name) {
  // TODO: use `sizeof.pointer` for buffer size when nodejs/io.js#1759 is merged
  var sym = new Buffer(8);
Contributor

This feels like a security risk. Won't it allow me to arbitrarily change the pointer that will be used?

Contributor Author

It just ends up containing the memory address of the symbol. You basically always do a readPointer(0, byteLength) call on this resulting buffer. You wouldn't do any damage altering the contents of this buffer, just change the memory of the buffer contents itself which is pretty useless.

Contributor

To make sure I understand, it's the void* that's assigned to the new Buffer(), not the 8 bytes I see allocated here, that is the needed memory address?

Contributor Author

The memory address of the dlsym'd void* gets written to the contents of the new Buffer(). A number gets written to it basically.

Contributor

So that's where I'm confused. Being able to see the bytes of the address would mean that I can rewrite them, correct? Sorry. I'm just really missing something here.
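
A small sketch of the point under discussion, using only standard Buffer methods. The address value is made up, and little-endian byte order is assumed. It shows that the Buffer holds a copy of the address as a number, so overwriting the Buffer changes that copy and nothing else.

```javascript
'use strict';

// Stand-in for the void* that uv_dlsym() produced; the value is made up.
const symbolAddress = 0x7fff987d7573n;

// What the binding's memcpy() amounts to: the address is copied, as a
// number, into the 8 bytes of the Buffer (little-endian assumed here).
const sym = Buffer.alloc(8);
sym.writeBigUInt64LE(symbolAddress, 0);

// Overwriting the Buffer changes only the copy held in the Buffer.
sym.writeBigUInt64LE(0xdeadbeefn, 0);

console.log(symbolAddress.toString(16));          // 7fff987d7573
console.log(sym.readBigUInt64LE(0).toString(16)); // deadbeef
```

The symbol inside the library is untouched by the write. What does change is the value any later code would obtain if it read the address back out of `sym` and dereferenced it, which appears to be the case the reviewer is asking about.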


  if (binding.dlsym(lib, name, sym) !== 0) {
    throw new Error(exports.dlerror(lib));
  }

  return sym;
};

exports.dlerror = function dlerror(lib) {
  return binding.dlerror(lib);
};

exports.extension = {
  linux: '.so',
  sunos: '.so',
  solaris: '.so',
  freebsd: '.so',
  openbsd: '.so',
  darwin: '.dylib',
  win32: '.dll'
}[process.platform];
8 changes: 4 additions & 4 deletions lib/repl.js
@@ -59,10 +59,10 @@ function hasOwnProperty(obj, prop) {
exports.writer = util.inspect;

 exports._builtinLibs = ['assert', 'buffer', 'child_process', 'cluster',
-    'crypto', 'dgram', 'dns', 'domain', 'events', 'fs', 'http', 'https', 'net',
-    'os', 'path', 'punycode', 'querystring', 'readline', 'stream',
-    'string_decoder', 'tls', 'tty', 'url', 'util', 'v8', 'vm', 'zlib',
-    'smalloc'];
+    'crypto', 'dgram', 'dlopen', 'dns', 'domain', 'events', 'fs', 'http',
+    'https', 'net', 'os', 'path', 'punycode', 'querystring', 'readline',
+    'stream', 'string_decoder', 'tls', 'tty', 'url', 'util', 'v8', 'vm',
+    'zlib', 'smalloc'];


const BLOCK_SCOPED_ERROR = 'Block-scoped declarations (let, ' +
14 changes: 13 additions & 1 deletion node.gyp
@@ -26,6 +26,7 @@
'lib/crypto.js',
'lib/cluster.js',
'lib/dgram.js',
'lib/dlopen.js',
'lib/dns.js',
'lib/domain.js',
'lib/events.js',
@@ -108,6 +109,7 @@
'src/node_buffer.cc',
'src/node_constants.cc',
'src/node_contextify.cc',
'src/node_dlopen.cc',
'src/node_file.cc',
'src/node_http_parser.cc',
'src/node_javascript.cc',
@@ -649,6 +651,16 @@
'sources': [
'test/cctest/util.cc',
],
}
},

# "libtest" dynamic library for "dlopen" tests
{
'target_name': 'test',
'type': 'shared_library',
'product_prefix': 'lib',
'sources': [
'test/libtest/libtest.c'
],
},
] # end targets
}
117 changes: 117 additions & 0 deletions src/node_dlopen.cc
@@ -0,0 +1,117 @@
#include "node.h"
#include "node_buffer.h"
#include "v8.h"
#include "env.h"
#include "env-inl.h"

namespace node {
namespace dlopen {

using v8::Boolean;
using v8::Context;
using v8::FunctionCallbackInfo;
using v8::Handle;
using v8::Integer;
using v8::Local;
using v8::Number;
using v8::Object;
using v8::String;
using v8::Value;
using v8::Uint32;


static void Dlopen(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);

  if (!args[0]->IsNull() && !args[0]->IsString()) {
    return env->ThrowTypeError(
        "expected a string filename or null as first argument");
  }

  if (!Buffer::HasInstance(args[1]))
    return env->ThrowTypeError("expected a Buffer instance as second argument");

  // `name` must outlive the uv_dlopen() call below: `filename` points into
  // its storage, so it cannot be declared inside a narrower block.
  node::Utf8Value name(env->isolate(), args[0]);
  const char* filename = args[0]->IsNull() ? nullptr : *name;

  Local<Object> buf = args[1].As<Object>();
  uv_lib_t* lib = reinterpret_cast<uv_lib_t*>(Buffer::Data(buf));
Contributor

Here is where, instead of using Buffer, I'd create a Local<External> to which you pass the memory pointer. Then it would need to be made a Persistent<External>, and then made weak so you can dispose of the memory when it's cleaned up by GC. That would be a reliably future-proof way to make this call.

Because once Buffer switches to using Uint8Array you'll have to use Uint8Array::New(). If you'd like V8 to manage the lifetime of the pointer you can then pass in kInternalized. In which case the data will need to be accessed via ArrayBuffer::Externalize() (an API that doesn't exist until v4.3). Or you can decide to manage the lifetime of the object yourself. In that case it will be created externalized. Then you'll still need to create a Persistent<Uint8Array> and make it weak so the pointer can be cleaned up.

The problem is that the former ArrayBuffer approach isn't viable until v4.3 is released. I'm not sure of any caveats with the latter. But either way I'd prefer to not have to rewrite this when that time comes.

Contributor

Doing some analysis to get a better idea if this could be properly abstracted away.


  int r = uv_dlopen(filename, lib);

  args.GetReturnValue().Set(r);
}


static void Dlclose(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);

  if (!Buffer::HasInstance(args[0]))
    return env->ThrowTypeError("expected a Buffer instance as first argument");

  Local<Object> buf = args[0].As<Object>();
  uv_lib_t* lib = reinterpret_cast<uv_lib_t*>(Buffer::Data(buf));

  uv_dlclose(lib);
}


static void Dlsym(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);

  if (!Buffer::HasInstance(args[0]))
    return env->ThrowTypeError("expected a Buffer instance as first argument");
  if (!args[1]->IsString())
    return env->ThrowTypeError("expected a string as second argument");
  if (!Buffer::HasInstance(args[2]))
    return env->ThrowTypeError("expected a Buffer instance as third argument");

  Local<Object> buf = args[0].As<Object>();
  uv_lib_t* lib = reinterpret_cast<uv_lib_t*>(Buffer::Data(buf));

  void* sym;
  node::Utf8Value name(env->isolate(), args[1]);
  int r = uv_dlsym(lib, *name, &sym);

  Local<Object> sym_buf = args[2].As<Object>();

  // Copy the address itself (not the pointed-to data) into the Buffer.
  memcpy(Buffer::Data(sym_buf), &sym, sizeof(sym));

  args.GetReturnValue().Set(r);
}


static void Dlerror(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);

  if (!Buffer::HasInstance(args[0]))
    return env->ThrowTypeError("expected a Buffer instance as first argument");

  Local<Object> buf = args[0].As<Object>();
  uv_lib_t* lib = reinterpret_cast<uv_lib_t*>(Buffer::Data(buf));

  args.GetReturnValue().Set(OneByteString(env->isolate(), uv_dlerror(lib)));
}


void Initialize(Handle<Object> target,
                Handle<Value> unused,
                Handle<Context> context) {
  Environment* env = Environment::GetCurrent(context);
  env->SetMethod(target, "dlopen", Dlopen);
  env->SetMethod(target, "dlclose", Dlclose);
  env->SetMethod(target, "dlsym", Dlsym);
  env->SetMethod(target, "dlerror", Dlerror);

  target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "sizeof_uv_lib_t"),
              Uint32::NewFromUnsigned(env->isolate(),
                  static_cast<uint32_t>(sizeof(uv_lib_t))));
}

} // namespace dlopen
} // namespace node

NODE_MODULE_CONTEXT_AWARE_BUILTIN(dlopen, node::dlopen::Initialize)
27 changes: 27 additions & 0 deletions test/libtest/libtest.c
@@ -0,0 +1,27 @@
#include <stdio.h>
#include <stdint.h>

#if defined(WIN32) || defined(_WIN32)
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif

EXPORT int six = 6;

EXPORT void* n = NULL;

EXPORT char str[] = "hello world";

EXPORT uint64_t factorial(int max) {
  int i = max;
  uint64_t result = 1;

  while (i >= 2) {
    result *= i--;
  }

  return result;
}

EXPORT uintptr_t factorial_addr = (uintptr_t)factorial;
49 changes: 49 additions & 0 deletions test/parallel/test-dlopen.js
@@ -0,0 +1,49 @@
'use strict';
var common = require('../common');
var assert = require('assert');
var path = require('path');
var endianness = require('os').endianness();

var dl = require('dlopen');

var root = path.join(__dirname, '..', '..');
var libPath = path.join(root, 'out', 'Release', 'libtest' + dl.extension);
console.log(libPath);

var libtest = dl.dlopen(libPath);
console.log(libtest);

// EXPORT int six = 6
var sixSymPtr = dl.dlsym(libtest, 'six');
// TODO: use `sizeof.int`
var sixSym = sixSymPtr['readPointer' + endianness](0, 4);
assert.equal(6, sixSym['readInt' + (4 * 8) + endianness](0));
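
The test builds Buffer method names from the host byte order at runtime. A minimal sketch of that lookup, using only standard Buffer methods (`writeInt32LE`/`writeInt32BE` and `readInt32LE`/`readInt32BE`) instead of the proposed `readPointer*()` API. A 4-byte `int` is assumed, the same assumption the TODO above flags.

```javascript
'use strict';
const os = require('os');

const endianness = os.endianness(); // 'LE' or 'BE'

// Lay out the C `int six = 6` as it would sit in memory on this machine,
// assuming a 4-byte int.
const sixBytes = Buffer.alloc(4);
sixBytes['writeInt32' + endianness](6, 0);

// Same dynamic lookup as the test: 'readInt' + 32 + 'LE' -> 'readInt32LE'.
const six = sixBytes['readInt' + (4 * 8) + endianness](0);
console.log(six); // 6
```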

// EXPORT void* n = NULL;
var nSymPtr = dl.dlsym(libtest, 'n');
// TODO: use `sizeof.pointer`
var nSym = nSymPtr['readPointer' + endianness](0, 8);
assert.strictEqual(null, nSym['readPointer' + endianness](0));

// EXPORT char str[] = "hello world";
var strSymPtr = dl.dlsym(libtest, 'str');
// XXX: We need a way to read a null-terminated array :( :( :(
var strSym = strSymPtr['readPointer' + endianness](0, 12);
assert.equal('hello world', strSym.toString('ascii', 0, 11));
assert.equal(0, strSym[11]);
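
One possible shape for the missing "read a null-terminated array" helper, sketched with standard Buffer methods only. The name `readCString` is hypothetical. The sketch assumes the Buffer already covers the terminator, so it does not solve the harder part of the problem noted in the XXX comment, which is not knowing how many bytes of raw memory to map in the first place.

```javascript
'use strict';

// Hypothetical helper: decode bytes from `offset` up to (not including)
// the first NUL byte.
function readCString(buf, offset = 0, encoding = 'ascii') {
  const end = buf.indexOf(0, offset);
  if (end === -1) {
    throw new RangeError('no NUL terminator found in buffer');
  }
  return buf.toString(encoding, offset, end);
}

// Simulated memory: the C string followed by unrelated bytes.
const strBytes = Buffer.from('hello world\0trailing bytes', 'ascii');
console.log(readCString(strBytes)); // 'hello world'
```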

// EXPORT uint64_t factorial(int max)
var factorialSymPtr = dl.dlsym(libtest, 'factorial');
// TODO: use `sizeof.pointer`
var factorialSym = factorialSymPtr['readPointer' + endianness](0, 0);

// EXPORT uintptr_t factorial_addr = (uintptr_t)factorial;
var factorialAddrSymPtr = dl.dlsym(libtest, 'factorial_addr');
var factorialAddrSym = factorialAddrSymPtr['readPointer' + endianness](0, 8);
var factorialSym2 = factorialAddrSym['readPointer' + endianness](0, 0);

assert.equal(factorialSym.address(), factorialSym2.address());


// we're done ☺
dl.dlclose(libtest);