Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performances differences between QF_BV and QF_ABV for bitvectors #5655

Closed
SolalPirelli opened this issue Nov 9, 2021 · 9 comments
Closed

Comments

@SolalPirelli
Copy link

First off, thanks for Z3, it's awesome 😄

I'm using Z3 via a symbolic execution engine (angr), so I have a bunch of incremental queries. All my queries only use quantifier-free bitvectors. When creating a solver with QF_BV as the logic (using SolverFor in Python), perf is the same as without specifying a logic. But when using QF_ABV, Z3 is ~2x faster, even though the queries don't use arrays.

Is there a list somewhere of what Z3 does differently when arrays are in the logic, and can I replicate these differences with set_param calls to try and isolate the cause of the perf differences?

@NikolajBjorner
Copy link
Contributor

Could be a performance bug with some pre-processing.
You can run with verbose=10 mode to get a high level overview of what pre-processing does in the two settings.
The pre-processing directives for QF_BV and QF_ABV are somewhat set in stone. (Power) users can author their
own tactics to gain tight control of what happens.

The other situation could be that z3 chooses the "wrong" backend for QF_BV. The backend is the much more efficient SAT solver. The backend for QF_ABV is a general purpose SMT core that doesn't scale as well, generally, but could have an edge based on how it uses in-processing and branch heuristics.

@NikolajBjorner
Copy link
Contributor

was any of this information of use? Otherwise, if you can share a repro I might be able to figure out.

@SolalPirelli
Copy link
Author

Sorry for the delay. From some of the verbose output I tried forcing the tactic to SMT and it turns out that set_param('tactic.default_tactic', 'smt') makes both QF_BV and QF_ABV faster on my inputs (i.e., the fastest config I have is now QF_ABV + SMT as default tactic, and the slowest is QF_BV without changing the default tactic).

I tried generating a repro with solver.smtlib2_log but the resulting file is is broken as if it was written by multiple threads concurrently, I thought I wasn't using any parallelism in angr but I have to look into that first it seems...

Is there a way to have verbose=10 and also print the queries themselves? It's hard for me to see what info corresponds to what query since symbolic execution leads to lots of incremental queries.

@NikolajBjorner
Copy link
Contributor

If you dump the output of verbose=10 for the two versions here, it might give an idea. It reports number of ast nodes during pre-processing. It probably goes up quite a bit with QF_BV somewhere. It could be something basic, such as equality propagation.

@SolalPirelli
Copy link
Author

Here's the verbose=10 output for QF_BV, QF_ABV, and QF_ABV+tactic.default_tactic=smt: https://gist.github.com/SolalPirelli/056811e998813d90c78b2ecbe31a4ba0 (probably easier to view when downloaded as a zip, the QF_BV output is particularly large)

@NikolajBjorner
Copy link
Contributor

There is a difference in backends used. The sat backend gets used in QF_BV but not always for QF_ABV where the log indicates it is using "smt" (the general purpose solver). It is very possible the smt solver integrates better rewriting that ends up making a difference.

NikolajBjorner added a commit that referenced this issue Nov 19, 2021
@NikolajBjorner
Copy link
Contributor

I have now tried to ensure that smt2log works with multi-threaded solvers by appending thread ID to each log file in case it is multi-threaded. The precise root cause of performance difference could tell us something about how to make the SAT backened better (caveat is that it is often difficult to improve so having the stable SMT backend is a point solution even as it requires some tweaking from the user point of view).

@SolalPirelli
Copy link
Author

Sorry for the delay. The log still has the same issue with z3 4.8.14, I'll open a separate issue for that.

Anyway, even deleting some damaged parts of the log still reproduces the problem: https://gist.github.com/SolalPirelli/f795ca3b8112765e3eaf636dc45607bc

Not only is this 2-3x slower with QF_BV than QF_ABV (by prepending a set-logic instruction), but if you copy-paste the last check-sat a bunch of times (i.e., have the same assertion 10s of times in a row), the solving time goes up with QF_BV while it remains constant with QF_ABV, as if it is cached when using QF_ABV but not QF_BV.

I've tried with 4.8.12, 4.8.13, and 4.8.14, same issue.

@NikolajBjorner
Copy link
Contributor

It uses the sat core under QF_BV, it uses the SMT/CDCL(T) solver otherwise. CDCL(T) allows some equality propagation not available with the SAT core. The option sat.smt=true uses a new CDCL(T) core with the more efficient SAT core and is faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants