Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for inlinling via the id-hash #5281

Merged
merged 3 commits into from
Aug 24, 2018
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 15 additions & 1 deletion core/commands/add.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ import (
cmds "gx/ipfs/QmPTfgFTo9PFr1PvPKyKoeMgBvYPh6cX3aDP7DHKVbnCbi/go-ipfs-cmds"
mh "gx/ipfs/QmPnFwZ2JXKnXgMw8CdBPxn7FWh6LLdjUjxV1fKHuJnkr8/go-multihash"
pb "gx/ipfs/QmPtj12fdwuAqj9sBSTNUxBNu8kCGNp8b3o8yUzMm5GHpq/pb"
cidutil "gx/ipfs/QmPyxJ2QS7L5FhGkNYkNcXHGjDhvGHueJ4auqAstFHYxy5/go-cidutil"
cmdkit "gx/ipfs/QmSP88ryZkHSRn1fnngAaV2Vcn63WUJzAavnRM9CVdU1Ky/go-ipfs-cmdkit"
files "gx/ipfs/QmSP88ryZkHSRn1fnngAaV2Vcn63WUJzAavnRM9CVdU1Ky/go-ipfs-cmdkit/files"
offline "gx/ipfs/QmZxjqR9Qgompju73kakSoUj3rbVndAzky3oCDiBNCxPs1/go-ipfs-exchange-offline"
Expand All @@ -44,6 +45,8 @@ const (
fstoreCacheOptionName = "fscache"
cidVersionOptionName = "cid-version"
hashOptionName = "hash"
inlineOptionName = "inline"
inlineLimitOptionName = "inline-limit"
)

const adderOutChanSize = 8
Expand Down Expand Up @@ -120,6 +123,8 @@ You can now check what blocks have been created by:
cmdkit.BoolOption(fstoreCacheOptionName, "Check the filestore for pre-existing blocks. (experimental)"),
cmdkit.IntOption(cidVersionOptionName, "CID version. Defaults to 0 unless an option that depends on CIDv1 is passed. (experimental)"),
cmdkit.StringOption(hashOptionName, "Hash function to use. Implies CIDv1 if not sha2-256. (experimental)").WithDefault("sha2-256"),
cmdkit.BoolOption(inlineOptionName, "Inline small blocks into CIDs. (experimental)"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops sorry. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Np. I didn't want to bother you with it (and then made more work for myself by forgetting about GitCop 🙄).

cmdkit.IntOption(inlineLimitOptionName, "Maximum block size to inline. (experimental)").WithDefault(64),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whyrusleeping says: don't add this right now, just make it 64. If users complain, we can revisit it later; this is the most conservative approach and should be good enough for pretty much all cases. That fine with you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to know the reasons for this. Is it the change in the API string because of the use of WithDefault() or something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasoning is simply "fewer knobs" (or "be conservative"). That is, we don't really need this knob so we might as well add it later when needed (when a user presents a use-case for changing the default).

Copy link
Contributor Author

@kevina kevina Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't necessarily agree with that philosophy but won't object strongly.

I can see a case of wanting either 32 or 64 and don't think there is a clear answer. I am okay with limiting it to either 32 or 64 if you think that will be better.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User here looking forward to this feature, I'm mirroring a large-ish dataset using filestore to avoid disk duplication, in this case there is a few files being updated often which is causing issues with missing data, these are 41-42 bytes in size. example: http://distfiles.gentoo.org/experimental/timestamp.x
So minimal size here is 64 bytes - but I don't see any good reason to not add the option for the user to change this. Not adding this option could also result in assumptions being made about the actual max of ident hashes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that we may actually need to define a limit on the maximum CID size and 64 bytes is really long (see my comment below). However, having to define a new block for a ~40 byte file really does suck.

Let's wait for the browser working group to comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes 64 bytes is long, but if it is part of a directory entry, the user is ever unlikely to need to actual CID and it won't show up as part of the URL. All the more reason to keep this option.

I could change the default to 32 though so it won't create a problem. When someone knows what they are doing they can increase it to suit there needs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless a user just adds a file directly. E.g., /ipfs/z3Fxd5vkFxAu4YbpXLBj8EvUCEeXLhNRGaYpEL43EHt99kaB82LfGbEVBqUJPaSZsHtJANxV8EGDZ

Copy link
Contributor Author

@kevina kevina Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless a user just adds a file directly.

Correct. So make the default 32 to avoid increasing the size of hashes by default. When a user has a directory with lots of small files they want inline they can increase the limit .

Copy link
Member

@Stebalien Stebalien Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to talk to the browser people. I know there were talks about supporting https://CID.dweb.link and that won't work for long CIDs. The current plan is ipfs://CID and there may be length restrictions.

(nvm, didn't see the response from lidel)

},
PreRun: func(req *cmds.Request, env cmds.Environment) error {
quiet, _ := req.Options[quietOptionName].(bool)
Expand Down Expand Up @@ -173,6 +178,8 @@ You can now check what blocks have been created by:
fscache, _ := req.Options[fstoreCacheOptionName].(bool)
cidVer, cidVerSet := req.Options[cidVersionOptionName].(int)
hashFunStr, _ := req.Options[hashOptionName].(string)
inline, _ := req.Options[inlineOptionName].(bool)
inlineLimit, _ := req.Options[inlineLimitOptionName].(int)

// The arguments are subject to the following constraints.
//
Expand Down Expand Up @@ -279,7 +286,14 @@ You can now check what blocks have been created by:
fileAdder.Silent = silent
fileAdder.RawLeaves = rawblks
fileAdder.NoCopy = nocopy
fileAdder.CidBuilder = &prefix
fileAdder.CidBuilder = prefix

if inline {
fileAdder.CidBuilder = cidutil.InlineBuilder{
Builder: fileAdder.CidBuilder,
Limit: inlineLimit,
}
}

if hash {
md := dagtest.Mock()
Expand Down
2 changes: 1 addition & 1 deletion test/sharness/t0040-add-and-cat.sh
Original file line number Diff line number Diff line change
Expand Up @@ -590,7 +590,7 @@ test_add_cat_expensive "--cid-version=1" "zdj7WcatQrtuE4WMkS4XsfsMixuQN2po4irkYh
# encoded with the blake2b-256 hash funtion
test_add_cat_expensive '--hash=blake2b-256' "zDMZof1kwndounDzQCANUHjiE3zt1mPEgx7RE3JTHoZrRRa79xcv"

test_add_named_pipe " Post http://$API_ADDR/api/v0/add?chunker=size-262144&encoding=json&hash=sha2-256&pin=true&progress=true&recursive=true&stream-channels=true:"
test_add_named_pipe " Post http://$API_ADDR/api/v0/add?chunker=size-262144&encoding=json&hash=sha2-256&inline-limit=64&pin=true&progress=true&recursive=true&stream-channels=true:"

test_add_pwd_is_symlink

Expand Down
35 changes: 35 additions & 0 deletions test/sharness/t0046-id-hash.sh
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,41 @@ test_expect_success "can still fetch it" '
test_cmp junk.txt actual
'

test_expect_success "ipfs add --inline works as expected" '
echo $ID_HASH0_CONTENTS > afile &&
HASH=$(ipfs add -q --inline afile)
'

test_expect_success "ipfs add --inline uses id multihash" '
MHTYPE=`cid-fmt %h $HASH`
echo "mhtype is $MHTYPE"
test "$MHTYPE" = id
'

test_expect_success "ipfs add --inline --raw-leaves works as expected" '
echo $ID_HASH0_CONTENTS > afile &&
HASH=$(ipfs add -q --inline --raw-leaves afile)
'

test_expect_success "ipfs add --inline --raw-leaves outputs the correct hash" '
echo "$ID_HASH0" = "$HASH" &&
test "$ID_HASH0" = "$HASH"
'

test_expect_success "create 1000 bytes file and get its hash" '
random 1000 2 > 1000bytes &&
HASH0=$(ipfs add -q --raw-leaves --only-hash 1000bytes)
'

test_expect_success "ipfs add --inline --raw-leaves works as expected on large file" '
HASH=$(ipfs add -q --inline --raw-leaves 1000bytes)
'

test_expect_success "ipfs add --inline --raw-leaves outputs the correct hash on large file" '
echo "$HASH0" = "$HASH" &&
test "$HASH0" = "$HASH"
'

test_expect_success "enable filestore" '
ipfs config --json Experimental.FilestoreEnabled true
'
Expand Down