Generating checksums with duplicates plugin fails #2873
Comments
Interesting! I don't have an obvious explanation for what could be going wrong. Maybe someone (perhaps you) can do some digging to find out what ffmpeg's output is and why it exits with an error when we invoke it.
As mentioned, running the ffmpeg command directly produces a CRC checksum and exits with status 0. The problem also seems to go beyond ffmpeg, since the same error occurs with the suggested md5sum command (which also works fine when run directly on the command line).
Right, so the question is what it's doing (and what its output is) when it's invoked by beets instead of manually.
So, maybe this is a Python thing, but I find this suspicious:
Same as in OP -- what's with the spurious "b" in front of the file name? No matter which command I pass in (sha512sum, md5sum, ffmpeg), they all exit status 1 -- and if duplicates really is putting a "b" in there, no wonder, because that's not the
Edit: copy/paste included some newlines; removed those for accuracy.
Indeed, that seems to be the problem. In Python 3, because the file path is a
Changing the relevant line from:
    args = [p.format(file=item.path) for p in shlex.split(prog)]
to
    args = [p.format(file=item.path.decode('utf-8')) for p in shlex.split(prog)]
seems to fix things. A few problems remain, however. Both
    $ md5sum ~/my/file.mp3
    ea187811890ede95aa618ecba4f27f57 ./my/file.mp3
Because beets uses this output to determine duplicates, it's never going to mark anything as a duplicate. Additionally, because beets caches the checksums (using the first argument of the command), if you somehow mistype your checksum command, once you've cached bad fingerprints from
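For anyone who wants to reproduce the spurious "b" outside of beets, here is a minimal sketch. The command template and path are made-up values for illustration, and UTF-8 is assumed only to keep the demo short.

```python
import shlex

# Hypothetical values for illustration; in beets, item.path is a bytes object.
prog = "md5sum {file}"
path = b"/music/song.mp3"

# Formatting bytes into a str template embeds the bytes repr, b prefix included.
broken = [p.format(file=path) for p in shlex.split(prog)]

# Decoding first yields the plain path (UTF-8 is assumed here only for the demo).
fixed = [p.format(file=path.decode("utf-8")) for p in shlex.split(prog)]

print(broken)  # ['md5sum', "b'/music/song.mp3'"]
print(fixed)   # ['md5sum', '/music/song.mp3']
```

The first list is what the external command ends up receiving as a filename, which would explain a non-zero exit status regardless of which checksum tool is used.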
Thanks for investigating! It seems like you're on the right track. However, not all filenames are encoded with UTF-8, so just using a hard-coded
That problem with including filenames in the output does seem bad! Maybe we should change the advice to instead recommend that people somehow pipe data into
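To illustrate the concern about non-UTF-8 filenames, here is a small sketch using Python's surrogateescape error handler. The filename is a made-up Latin-1 example, and this is only one possible way to decode paths without losing bytes, not necessarily what beets should adopt.

```python
# A hypothetical Latin-1 encoded filename: byte 0xE9 is not valid UTF-8 here.
raw = b"/music/caf\xe9.mp3"

# A strict, hard-coded UTF-8 decode fails on such a name.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate code point,
# so the original bytes can be recovered exactly when encoding back.
text = raw.decode("utf-8", "surrogateescape")
roundtrip = text.encode("utf-8", "surrogateescape")

print(strict_ok)         # False
print(roundtrip == raw)  # True
```

On POSIX systems, os.fsdecode applies the filesystem encoding with this same error handler, so it may be a reasonable alternative to a hard-coded encoding.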
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is still a problem for me.
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm not sure what needs to be done here, just keeping it open.
I think this issue is solved in the convert plugin: Lines 207 to 233 in b659ad6
So the solution is probably to move this to a separate function in beets.util and apply that in the duplicates plugin.
I think that's not easily doable without writing a separate script and passing it as the argument to
    beet dup -C 'sh -c "md5sum < \"$1\"" {file}' ...
    beet dup -C 'sh -c "md5sum \"$1\" | awk '"'"'{print $1}'"'"'" {file}' ...
Note that if the plugin would run the
So maybe the advice should be to create a script
    #! /usr/bin/env sh
    md5sum < "$1"
or
    #! /usr/bin/env sh
    md5sum "$1" | awk '{printf $1}'
and use it with
    beet dup -C 'myscript {file}' ...
I'll remove the
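For comparison, the same idea (feed the file on stdin and keep only the first output field, so the filename never ends up in the checksum) can be sketched in Python. The function name checksum_via_stdin and the HASHER stand-in are hypothetical and not part of beets; the stand-in replaces md5sum only so that the sketch runs without coreutils.

```python
import hashlib
import os
import subprocess
import sys
import tempfile


def checksum_via_stdin(args, path):
    """Run args with the file's contents on stdin; return the first output field."""
    with open(path, "rb") as f:
        result = subprocess.run(args, stdin=f, capture_output=True, check=True)
    # Tools like md5sum print "<digest>  -" for stdin; keep only the digest.
    return result.stdout.split()[0].decode("ascii")


# Stand-in for md5sum (an assumption for portability); with coreutils
# installed, args could simply be ["md5sum"].
HASHER = [
    sys.executable, "-c",
    "import sys, hashlib; "
    "print(hashlib.md5(sys.stdin.buffer.read()).hexdigest(), ' -')",
]

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"same audio bytes")
try:
    digest = checksum_via_stdin(HASHER, tmp.name)
finally:
    os.remove(tmp.name)

expected = hashlib.md5(b"same audio bytes").hexdigest()
print(digest == expected)
```

Because the path is never passed to the hashing command, two identical files in different locations produce the same string.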
Thanks, @wisp3rwind! I think you have the right fix there.
This is an issue for me as well.
I have replaced line 200 with the following block and it is now computing checksums. I am currently testing on Ubuntu 20.04 with Python 3.8:
    if not six.PY2:
        if platform.system() == 'Windows':
            args = [p.format(file=item.path.decode(util._fsencoding()))
                    for p in shlex.split(prog)]
        else:
            args = [p.format(file=item.path.decode(util.arg_encoding(),
                    'surrogateescape')) for p in shlex.split(prog)]
I tried to add
    prog = prog.decode(util.arg_encoding(), 'surrogateescape'))
but I got an error:
I am not sure whether prog needs the decoding. Thoughts?
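On the question of decoding prog: in Python 3 a str has no .decode method, so decoding would only apply if prog arrives as bytes. Here is a rough sketch of a helper that handles both cases. The name build_args and the encoding parameter are hypothetical, and whether prog is a str or bytes inside the plugin is an assumption I have not verified.

```python
import shlex


def build_args(prog, path, encoding="utf-8"):
    """Hypothetical helper: split prog and substitute a decoded path for {file}."""
    # Only bytes objects have .decode in Python 3; a str is used as-is.
    if isinstance(prog, bytes):
        prog = prog.decode(encoding, "surrogateescape")
    if isinstance(path, bytes):
        path = path.decode(encoding, "surrogateescape")
    return [p.format(file=path) for p in shlex.split(prog)]


print(build_args("ffmpeg -i {file} -f crc -", b"/m/a b.flac"))
```

Substituting after shlex.split keeps a path with spaces as a single argument, which matches how the original line is structured.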
Problem
I can't get the duplicates plugin to generate checksums (neither CRC nor md5sum) when following the examples suggested in the documentation: https://beets.readthedocs.io/en/stable/plugins/duplicates.html
Running this command in verbose (-vv) mode:
Led to this problem:
However running directly from ffmpeg produces a checksum and exits with 0:
$ ffmpeg -i '/media/mediacentre/Mediacentre/Music/Library/Richard Hell & The Voidoids/(1977) Blank Generation/07 Blank Generation.flac' -f crc -
Setup
My configuration (output of beet config) is: