nob_read_entire_file #33

Open
satchelfrost opened this issue Feb 15, 2025 · 3 comments

@satchelfrost (Contributor)

The current implementation of nob_read_entire_file breaks for files around 2 GB and larger when compiled with x86_64-w64-mingw32-gcc. Normally I wouldn't report such a thing, because this is a highly specific use case; however, an older version of nob (which I presumably yoinked from musializer) actually worked in this obscure case.

Old, working nob_read_entire_file implementation:

bool nob_read_entire_file(const char *path, Nob_String_Builder *sb)
{
    bool result = true;
    size_t buf_size = 32*1024;
    char *buf = NOB_REALLOC(NULL, buf_size);
    NOB_ASSERT(buf != NULL && "Buy more RAM lool!!");
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        nob_log(NOB_ERROR, "Could not open %s for reading: %s", path, strerror(errno));
        nob_return_defer(false);
    }

    size_t n = fread(buf, 1, buf_size, f);
    while (n > 0) {
        nob_sb_append_buf(sb, buf, n);
        n = fread(buf, 1, buf_size, f);
    }
    if (ferror(f)) {
        nob_log(NOB_ERROR, "Could not read %s: %s\n", path, strerror(errno));
        nob_return_defer(false);
    }

defer:
    NOB_FREE(buf);
    if (f) fclose(f);
    return result;
}

New, broken nob_read_entire_file implementation:

bool nob_read_entire_file(const char *path, Nob_String_Builder *sb)
{
    bool result = true;

    FILE *f = fopen(path, "rb");
    if (f == NULL)                 nob_return_defer(false);
    if (fseek(f, 0, SEEK_END) < 0) nob_return_defer(false);
    long m = ftell(f);
    if (m < 0)                     nob_return_defer(false);
    if (fseek(f, 0, SEEK_SET) < 0) nob_return_defer(false);

    size_t new_count = sb->count + m;
    if (new_count > sb->capacity) {
        sb->items = realloc(sb->items, new_count);
        NOB_ASSERT(sb->items != NULL && "Buy more RAM lool!!");
        sb->capacity = new_count;
    }

    fread(sb->items + sb->count, m, 1, f);
    if (ferror(f)) {
        // TODO: Afaik, ferror does not set errno. So the error reporting in defer is not correct in this case.
        nob_return_defer(false);
    }
    sb->count = new_count;

defer:
    if (!result) nob_log(NOB_ERROR, "Could not read file %s: %s", path, strerror(errno));
    if (f) fclose(f);
    return result;
}

To be clear, I'm not exactly sure why the old implementation seems to be more resilient. If you'd like me to investigate further, or have any questions about replicating the bug, let me know.

Otherwise, just ignore the issue if this is too specific of a use-case :)

@rexim (Member) commented Feb 15, 2025

What exactly do you mean by "breaks"?

@satchelfrost (Contributor, Author)

When calling nob_read_entire_file on my 2 GB file, it fails with the following error:

"[ERROR] Could not read file my_custom_file: Success"

It's strange to me that the message says "Success", which seems like unexpected behavior. Out of paranoia, I also tested deliberately misnaming the file, and that correctly gives the expected "No such file or directory" error message.

Again, this is only an issue for the executable built with x86_64-w64-mingw32-gcc. Running under Wine the message is "Success", whereas on actual Windows it says "No error message". I also don't think it's anything weird with my file, because I load other similar (but smaller) files and they work just fine.
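
For what it's worth, "Success" is just what glibc's strerror produces for errno == 0, i.e. the failing path apparently bails out without any libc call having set errno. A minimal standalone sketch (not nob code) showing the behavior:

#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    errno = 0;                       // nothing has failed yet
    // glibc prints "Success" for errno == 0; Windows CRTs print a
    // similar "no error" string, matching what the defer path logs
    // when it returns false without errno being set.
    printf("%s\n", strerror(errno));
    return 0;
}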

@satchelfrost (Contributor, Author) commented Feb 15, 2025

Upon closer inspection, the 2 GB limit I'm running into seems to come from ftell returning a long. On Windows, long is only 32 bits even in 64-bit builds (LLP64), so ftell cannot represent offsets of 2 GiB or more.

    long m = ftell(f);
    if (m < 0)  nob_return_defer(false); // returns early for file size > ~2GB

The old implementation didn't have that issue (though I understand it was changed for performance reasons): it streams the file in fixed 32 KiB chunks and never asks for the total size, so nothing overflows. There do exist fseeko and ftello for larger files, but it might not be worth the trouble. Another option might be to have both implementations (i.e. keeping/renaming the old one), since the older one is slower but more reliable for larger files. A rough sketch of the 64-bit-offset approach follows.
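
To make that concrete, here is a minimal sketch of a 64-bit-safe size query, assuming _fseeki64/_ftelli64 on Windows builds (MinGW and MSVC provide them) and fseeko/ftello elsewhere; the helper name nob__file_size64 is hypothetical and not part of nob:

#include <stdio.h>
#include <stdint.h>
#if !defined(_WIN32)
#include <sys/types.h>   // off_t for fseeko/ftello
#endif

// Hypothetical helper: report the size of an open stream using 64-bit
// offsets so files over 2 GiB stay representable. Returns -1 on
// failure and leaves the stream rewound to the start on success.
static int64_t nob__file_size64(FILE *f)
{
#if defined(_WIN32)
    if (_fseeki64(f, 0, SEEK_END) != 0) return -1;
    int64_t m = _ftelli64(f);
    if (m < 0) return -1;
    if (_fseeki64(f, 0, SEEK_SET) != 0) return -1;
    return m;
#else
    // off_t is 64 bits on modern POSIX systems (or with
    // -D_FILE_OFFSET_BITS=64 on 32-bit Linux).
    if (fseeko(f, 0, SEEK_END) != 0) return -1;
    off_t m = ftello(f);
    if (m < 0) return -1;
    if (fseeko(f, 0, SEEK_SET) != 0) return -1;
    return (int64_t)m;
#endif
}

The rest of the new implementation could then stay as-is on 64-bit targets, since size_t there is 64 bits even though long is not.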
