Automating Bitvector and Finite Field Equivalence Proofs in Lean

Source: arXiv:2605.15163 · Published 2026-05-14 · By Elizaveta Pertseva, Valentin Robert, Clark Barrett, James Parker

TL;DR

This paper addresses a critical verification challenge arising in Zero-Knowledge Proof (ZKP) systems: proving equivalences between quantifier-free statements mixing bitvector and finite field operations. Existing approaches are either manual, costly, and brittle, or rely on SMT solvers which scale poorly due to difficulties with theory conversions and reasoning about inequalities. To overcome these bottlenecks, the authors present BitModEq, a novel tactic implemented in the Lean interactive theorem prover. BitModEq leverages range lemmas and case analysis to soundly translate finite field expressions into bitvectors verified inside Lean. This tactic integrates a new Lean-based range analysis to efficiently rule out modular wrap-around and then completes proofs by bit-blasting. On a suite of ZKP arithmetizations, BitModEq outperforms state-of-the-art SMT solvers, solving 19% more benchmarks while providing the correctness guarantees of ITPs. The core technical novelty is a verified operator-distribution calculus that pushes conversion operators (toNat and toBV) inward in the formulas while relying on range analysis to simplify inequalities, enabling substantial performance gains.

Key findings

BitModEq solves 19% more zero-knowledge proof (ZKP) arithmetization benchmarks than leading SMT solvers such as cvc5.
On a motivating example encoding bitwise OR for 1-bit inputs, BitModEq verified inputs up to 32 bits within 5 minutes and 8GB RAM, whereas cvc5 verified only up to 9 bits using a weak encoding and 3 bits with a full encoding under the same limits.
The tactic leverages range lemmas and case reasoning to eliminate modular wraparound, allowing simplification of finite field subtractions to natural number subtractions without introducing mod operations.
Lean's built-in tactics like ring_nf and omega struggled with large coefficients and variable dependencies; the bespoke range analysis algorithm significantly improved inequality discharge.
Finite field to natural number conversion applies sound operator distribution rules such as distNatSub when inequalities prove no wraparound, reducing bit width needed in bitvectors.
In the conversion from natural numbers to bitvectors, BitModEq calculates maximal bit widths globally to maintain consistency between hypotheses and goals, avoiding unsound counterexamples.
The integration of translation passes with range analysis and bit-blasting in Lean yielded a sound and terminating decision procedure for a large class of ZKP verification goals.
Proof obligations involving bitvector OR encodings previously handled inefficiently by SMT due to encoding size and range checks become feasible inside Lean using BitModEq.

Threat model

The adversary is not explicitly modeled in a security sense; the work assumes a benign user or verifier seeking to prove equivalence between finite field and bitvector encodings of ZKP circuits with correctness guarantees. The approach does not protect against adversarial proofs but rather strengthens automation and trustworthiness of verification inside a small trusted kernel.

Methodology — deep read

The methodology centers around the design and implementation of BitModEq, a Lean tactic that automates equivalence proofs involving finite field and bitvector expressions common in zero-knowledge proof verification. The threat model assumes the need for verifying quantifier-free expressions mixing the two theories with no adversarial attempts to break soundness, focusing instead on improving automation and scalability of verification.

Data: The evaluation involves benchmarks from real-world ZKP arithmetizations including the Jolt zkVM encoding of bitwise OR, where bitvector operations are encoded as finite field polynomials. These formulae are quantifier-free first-order logic statements over combined arithmetic theories. Detailed task sizes include input bit widths ranging up to 32 bits.

Architecture / Algorithm: BitModEq proceeds through a four-phase workflow. First, all finite field terms (FP) are converted to natural number terms (N) using an operator distribution calculus applying rewrite rules (e.g., pushing toNat operators inward, replacing subtractions by natural number subtractions where range proofs exclude wraparound). Second, range analysis attempts to discharge all inequalities over natural numbers by introducing placeholder variables, decomposing inequalities, and enumerating cases with bit-specific polymorphisms. Third, any remaining natural number terms are converted to bitvectors (BV) with carefully selected bit widths. Fourth, final goals over bitvectors are discharged via the bv_decide tactic, a verified bit-blasting decision procedure within Lean.

The translation rules are expressed as guarded assignments on proof contexts (goals and hypotheses), enabling derivations that simplify terms systematically. Range lemmas allow dropping modular arithmetic where safe. For subtractions, the tactic checks if toNat(t1) ≥ toNat(t2) to drop mod; otherwise, it encodes wraparound correctly by adding the field order P.

Training Regime: Not applicable as this is a formal methods procedure rather than machine learning.

Evaluation Protocol: The authors benchmark BitModEq on ZKP arithmetic encodings for bitvector OR across variable bit widths, comparing to state-of-the-art SMT solvers such as cvc5 using both strong and weak encodings. Metrics include maximum verifiable input size under fixed memory/time constraints and solved benchmark counts. Ablations include testing performance with and without range analysis and comparisons to Lean's built-in tactics.

Reproducibility: BitModEq is implemented in Lean's tactic language, building on the existing bv_decide procedure. The paper states detailed formal rules and includes pseudocode for the strategies used. While the paper does not explicitly mention a public code release, the described methods operate fully inside Lean's ITP environment and hence are reproducible in principle given Lean scripts and definitions.

Technical innovations

BitModEq: a verified Lean tactic that translates finite field expressions into bitvectors via natural numbers using operator distribution and range lemmas.
A new range analysis tactic that introduces placeholder variables and performs case splitting to automatically discharge inequalities involving bitvector simulations over finite fields.
A two-stage translation calculus with sound, terminating rewrite rules pushing toNat and toBV operators inward while minimizing introduction of modular arithmetic.
Integration of range analysis and verified bit-blasting in Lean enables outperforming state-of-the-art SMT solvers in ZKP arithmetization verification.

Datasets

ZKP arithmetization benchmarks including Jolt zkVM bitwise OR polynomials — size/concrete numbers not fully specified — source: real-world ZKP systems

Baselines vs proposed

cvc5 SMT solver weak encoding: max input bit width verified = 9 vs BitModEq: 32
cvc5 SMT solver full encoding: max input bit width verified = 3 vs BitModEq: 32
BitModEq: 19% more ZKP arithmetization benchmarks solved than state-of-the-art SMT solvers

Limitations

Range analysis is incomplete and may return unknown if inequalities cannot be discharged automatically.
The approach currently targets quantifier-free fragments and does not handle arbitrary quantifiers or general finite field operations like inverses or divisions.
Performance is reliant on manually engineered rewrite rules and range lemmas that may need tuning for different ZKP encodings.
The evaluation benchmarks are limited to particular ZKP arithmetizations (e.g., bitwise OR) and do not cover all arithmetic or cryptographic circuits.
No adversarial robustness tests were performed; the tactic operates under correctness assumptions typical for theorem proving rather than security threat modeling.
Scalability to very large circuits beyond 32-bit widths remains to be explored.

Open questions / follow-ons

How to extend BitModEq to handle quantified formulas and support reasoning about finite field inverses or divisions?
Can the range analysis tactic be further automated or combined with SMT-solving techniques to improve completeness?
What are the limits of scalability for this approach on larger ZKP circuits beyond 32 bits or more complex arithmetic?
Could similar operator distribution and range analysis techniques benefit other SMT theories with mixed domains?

Why it matters for bot defense

For bot-defense and CAPTCHA practitioners working with formal verification of challenging arithmetic circuits, BitModEq’s approach offers a promising pathway to automate verification of arithmetic equivalences inside an interactive theorem prover with rigorous soundness guarantees. This is especially relevant where trusted proofs are preferred over black-box SMT queries prone to scalability and soundness issues.

Practitioners implementing or verifying cryptographic proof systems underlying CAPTCHA or bot-detection protocols could adopt strategies from BitModEq to soundly relate bit-level encodings to finite field arithmetic, potentially improving correctness assurance. The key takeaway is that utilizing range analysis to prune modular arithmetic complexities combined with operator pushing can break down difficult mixed-theory goals into simpler domains amenable to reliable decision procedures like bit-blasting.

Cite

bibtex

@article{arxiv2605_15163,
  title={ Automating Bitvector and Finite Field Equivalence Proofs in Lean },
  author={ Elizaveta Pertseva and Valentin Robert and Clark Barrett and James Parker },
  journal={arXiv preprint arXiv:2605.15163},
  year={ 2026 },
  url={https://arxiv.org/abs/2605.15163}
}

Automating Bitvector and Finite Field Equivalence Proofs in Lean ​

TL;DR ​

Key findings ​

Threat model ​

Methodology — deep read ​

Technical innovations ​

Datasets ​

Baselines vs proposed ​

Limitations ​

Open questions / follow-ons ​

Why it matters for bot defense ​

Cite ​

Read the full paper ​