Dependency: BDB 6.2.23 hangs on aarch64-apple-darwin; an upgrade to BDB 6.2.32 is likely needed. #6977

Open
DeckerSU opened this issue Nov 9, 2024 · 2 comments

Comments

@DeckerSU
Contributor

DeckerSU commented Nov 9, 2024

The issue described here is not in zcashd itself but in one of its dependencies, Berkeley DB 6.2.23. I encountered a similar issue in another project that uses this version of Berkeley DB, so ZCash developers should be aware of it. The problem occurs on the aarch64-apple-darwin triplet when compiled with the built-in clang. When the daemon attempts to open the database environment with dbenv->open, the process hangs, leaving it stuck at Verifying wallet... indefinitely. Below, I have isolated the issue into a standalone reproduction.

Environment:

sw_vers && pkgutil --pkg-info=com.apple.pkg.CLTools_Executables && clang --version
...
ProductName:	macOS
ProductVersion:	12.7.6
BuildVersion:	21H1320
package-id: com.apple.pkg.CLTools_Executables
version: 14.2.0.0.1.1668646533
volume: /
location: /
install-time: 1683848664
groups: com.apple.FindSystemFiles.pkg-group
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: arm64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Steps to reproduce:

  1. Download and extract bdb 6.2.23:
curl https://download.oracle.com/berkeley-db/db-6.2.23.tar.gz -o db-6.2.23.tar.gz
tar xzvf db-6.2.23.tar.gz
  2. Patch atomic_init:
cd db-6.2.23
sed -i.old 's/atomic_init/atomic_init_db/' src/dbinc/atomic.h src/mp/mp_region.c src/mp/mp_mvcc.c src/mp/mp_fget.c src/mutex/mut_method.c src/mutex/mut_tas.c
  3. Build library (with debug info and unoptimized for future debugging):
cd build_unix
CFLAGS="-g -O0" CXXFLAGS="-g -O0 -std=c++11" ../dist/configure --disable-shared --enable-cxx --disable-replication --disable-atomicsupport
make libdb_cxx-6.2.a libdb-6.2.a -j4
  4. Create a test program test-bdb.cpp:
#include "db_cxx.h"       // BerkeleyDB C++ API
#include <iostream>
#include <string>
#include <cstdio>         // For fopen
#include <sys/stat.h>     // S_IRUSR and S_IWUSR

int main() {

    const std::string db_path = ".";
    const std::string log_path = "./database";
    const std::string err_path = "./db_errors.log";

    // Initialize the DB_ENV environment object
    DbEnv dbenv(DB_CXX_NO_EXCEPTIONS);

    try {
        // Set environment parameters
        dbenv.set_lg_dir(log_path.c_str());
        dbenv.set_cachesize(0, 0x100000, 1); // 1 MiB cache
        dbenv.set_lg_bsize(0x10000);         // Log buffer size: 64 KiB
        dbenv.set_lg_max(1048576);           // Maximum log file size: 1 MiB
        dbenv.set_lk_max_locks(40000);       // Maximum number of locks
        dbenv.set_lk_max_objects(40000);     // Maximum number of lock objects

        // Open the error file for logging
        FILE* err_file = fopen(err_path.c_str(), "a");
        if (err_file == nullptr) {
            std::cerr << "Failed to open error log file: " << err_path << std::endl;
            return 1;
        }
        dbenv.set_errfile(err_file); // Set the error file

        // Set environment flags
        dbenv.set_flags(DB_AUTO_COMMIT, 1);
        dbenv.set_flags(DB_TXN_WRITE_NOSYNC, 1);
        dbenv.log_set_config(DB_LOG_AUTO_REMOVE, 1);

        // Define environment flags for opening
        u_int32_t env_flags = 0; // Additional flags can be set here if needed

        std::cout << "Open ... (will we hang?)" << std::endl;

        // Open the environment
        dbenv.open(
            db_path.c_str(),
            DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
            DB_INIT_MPOOL | DB_INIT_TXN | DB_THREAD |
            DB_RECOVER | env_flags,
            S_IRUSR | S_IWUSR
        );

        std::cout << "BerkeleyDB environment opened successfully." << std::endl;

        // Your database operations would go here

        // Close the environment
        dbenv.close(0);
        fclose(err_file); // Close the error file
    }
    catch (DbException &e) {
        std::cerr << "BerkeleyDB Exception: " << e.what() << std::endl;
        return 1;
    }
    catch (std::exception &e) {
        std::cerr << "Standard Exception: " << e.what() << std::endl;
        return 1;
    }

    return 0;
}
  5. Build test-bdb.cpp and link it with bdb:
clang++ -v -g -O0 -std=c++11 test-bdb.cpp -o test-bdb -L. -ldb_cxx-6.2 -ldb
  6. Launch ./test-bdb. It hangs after the call to __db_tas_mutex_lock_int from src/mutex/mut_alloc.c.
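
For readers unfamiliar with the function named in the last step: the "tas" in __db_tas_mutex_lock_int stands for test-and-set. The sketch below is not Berkeley DB code and makes no claim about the root cause of this hang. It is a generic illustration, with hypothetical names (TasLock, try_lock_bounded, demo_stuck_lock), of why a test-and-set lock whose flag stays in the "held" state shows up as a hang rather than as an error. The spin is bounded here only so that the example terminates.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; this is NOT Berkeley DB's implementation.
class TasLock {
public:
    // Try to take the lock, giving up after max_spins attempts.
    // Returns true if the lock was acquired. A lock without such a bound
    // would keep spinning for as long as the flag reads as held.
    bool try_lock_bounded(std::size_t max_spins) {
        for (std::size_t i = 0; i < max_spins; ++i) {
            // test_and_set returns the PREVIOUS value:
            // false means the flag was clear and we now own the lock.
            if (!flag_.test_and_set(std::memory_order_acquire)) {
                return true;
            }
        }
        return false; // flag was still set after max_spins attempts
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT; // starts in the clear state
};

// Walks through the three interesting states of the lock.
void demo_stuck_lock() {
    TasLock lock;
    assert(lock.try_lock_bounded(1));      // free: acquired on first try
    assert(!lock.try_lock_bounded(1000));  // held: every attempt fails
    lock.unlock();
    assert(lock.try_lock_bounded(1));      // released: acquired again
}
```

If the flag were set before the first acquisition (for example through wrong initialization), the very first lock attempt would behave like the second call above, only without a bound.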

This behavior is specific to arm64 Darwin; on x86_64 everything works. If we repeat all the steps after running arch -x86_64 zsh (switching to i386), i.e.:

make clean && CPPFLAGS="" CFLAGS="-g -O0" CXXFLAGS="-g -O0 -std=c++11" ../dist/configure --disable-shared --enable-cxx --disable-replication --disable-atomicsupport --enable-option-checking && make libdb_cxx-6.2.a libdb-6.2.a -j4
clang++ -v -g -std=c++11 test-bdb.cpp -o test-bdb -L. -ldb_cxx-6.2 -ldb-6.2
./test-bdb
./test-bdb

The result will be:

Open ... (will we hang?)
BerkeleyDB environment opened successfully.
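
When the reproduction is run from a script rather than by hand, a hang is awkward to distinguish from a slow start. The following is a minimal sketch, assuming a POSIX system, of a watchdog that runs a command in a child process and reports whether it finished within a time limit. The names RunResult and run_with_timeout are hypothetical and not part of the original report.

```cpp
#include <cassert>
#include <chrono>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class RunResult { Exited, TimedOut, Failed };

// Run argv[0] with the NULL-terminated argument vector argv in a child
// process and wait at most `timeout` for it to terminate.
// Exited means the child terminated on its own (with any status, including
// 127 if exec failed); TimedOut means it was still running at the deadline
// and was killed; Failed means fork or waitpid itself failed.
RunResult run_with_timeout(char* const argv[], std::chrono::milliseconds timeout) {
    assert(argv != nullptr && argv[0] != nullptr);

    pid_t pid = fork();
    if (pid < 0) return RunResult::Failed;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); // only reached if exec failed
    }

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;
    while (std::chrono::steady_clock::now() < deadline) {
        pid_t r = waitpid(pid, &status, WNOHANG); // poll without blocking
        if (r == pid) return RunResult::Exited;
        if (r < 0) return RunResult::Failed;
        usleep(50 * 1000); // check again in 50 ms
    }

    kill(pid, SIGKILL);        // still running: treat as a hang
    waitpid(pid, &status, 0);  // reap the killed child
    return RunResult::TimedOut;
}
```

Called with an argument vector of {"./test-bdb", nullptr} and a limit of a few seconds, this would be expected to return TimedOut for the hanging arm64 build and Exited for the x86_64 build, assuming the behavior described above.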

Solution:

Berkeley DB 6.2.32, the next version, fixes the issue: there are no hangs on Darwin arm64 or x86_64. So perhaps we should update to 6.2.32? Some projects, such as ycash, have already made this update.
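
A project that cannot upgrade immediately might at least want to detect the affected version. The helper below is a small sketch with a hypothetical name (bdb_is_at_least_6_2_32); it only compares version numbers. Berkeley DB's db.h defines DB_VERSION_MAJOR, DB_VERSION_MINOR and DB_VERSION_PATCH, which could be passed in, but the example takes plain integers so that it compiles without the library. Note that 6.2.32 is the only version this report confirms as working; treating every later version as fine is an assumption.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper: true if (major, minor, patch) is 6.2.32 or newer.
// Tuples compare lexicographically, which matches version ordering.
bool bdb_is_at_least_6_2_32(int major, int minor, int patch) {
    return std::make_tuple(major, minor, patch) >= std::make_tuple(6, 2, 32);
}

// Self-check using the two versions named in the report.
void check_reported_versions() {
    assert(!bdb_is_at_least_6_2_32(6, 2, 23)); // version reported to hang
    assert(bdb_is_at_least_6_2_32(6, 2, 32));  // version reported to work
}
```

In a build that includes db.h, the call would look like bdb_is_at_least_6_2_32(DB_VERSION_MAJOR, DB_VERSION_MINOR, DB_VERSION_PATCH).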

@daira
Contributor

daira commented Nov 19, 2024

This is unlikely to be fixed within the remaining life of zcashd, especially as this only applies to a platform that is less commonly used to run servers and is not in any zcashd support tier.

@DeckerSU
Contributor Author

> This is unlikely to be fixed within the remaining life of zcashd, especially as this only applies to a platform that is less commonly used to run servers and is not in any zcashd support tier.

Thank you for your response. I just opened the issue to inform the developers and the community that this problem exists. Hopefully, it will be easy to find if someone encounters the same wallet initialization hang on Apple Silicon, whether in ZCash, its forks, or elsewhere.
