# ProofTrace: Modern Proof Steps Recording for HOL Light

COPYRIGHT: (c) Copyright Stanislas Polu 2019
LICENSE: The usual HOL Light license applies.

ProofTrace is a lightweight proof-step recording tool for HOL Light. It
modifies the HOL Light kernel by introducing a new `proof` type and attaching
a proof to each `thm` generated by the kernel.

It is written entirely in OCaml, which makes it easy to use once HOL Light
is properly installed on a system.

The `proof` of a `thm` represents the tree of basic rewrites that were applied
to axioms and definitions to demonstrate the theorem.

# Generated Dataset

ProofTrace can generate three files from a loaded HOL Light context:

## `prooftrace.theorems`

This file dumps the sequence of `thm` values demonstrated in the current HOL
Light context using a compact, explicit `term` tree syntax. Each line is a
serialized JSON object that represents a demonstrated theorem together with its
integer index.

```
{"id": 0, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(c(T)(c[bool][])))(C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))"}}
{"id": 1, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][])))"}}
{"id": 2, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(c(T)(c[bool][])))(c(T)(c[bool][]))"}}
{"id": 3, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]][c[fun][[c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]][c[bool][]]]]]))(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]])))(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))"}}
{"id": 4, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(c(T)(c[bool][]))))(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][])))))"}}
{"id": 5, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(c(T)(c[bool][])))(c(T)(c[bool][]))))(C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][])))))(c(T)(c[bool][])))"}}
{"id": 6, "th": {"hy": [], "cc": "C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][])))))(c(T)(c[bool][]))"}}
{"id": 7, "th": {"hy": [], "cc": "c(T)(c[bool][])"}}
```
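The term syntax is not specified here, but the sample lines suggest a small grammar: `C(f)(x)` for an application, `A(v)(body)` for an abstraction, `c(name)(type)` and `v(name)(type)` for constants and variables, and `c[name][[arg]...]` for type constructors. The following Python sketch reads one line under those assumptions. The grammar itself, the `v[name]` form for type variables, the reading of `hy` and `cc` as hypotheses and conclusion, and the helper names `parse_type` and `parse_term` are all inferred or hypothetical rather than taken from ProofTrace.

```python
import json


def parse_type(s, i):
    """Parse a type starting at s[i]; return (node, index after the type)."""
    kind = s[i]
    assert s[i + 1] == "["
    j = s.index("]", i + 2)          # assumes names never contain ']'
    name = s[i + 2:j]
    i = j + 1
    if kind == "v":                  # assumed form for type variables: v[name]
        return ("tyvar", name), i
    assert kind == "c" and s[i] == "["
    i += 1
    args = []
    while s[i] == "[":               # each argument is wrapped in its own [...]
        arg, i = parse_type(s, i + 1)
        assert s[i] == "]"
        args.append(arg)
        i += 1
    assert s[i] == "]"
    return ("tyapp", name, args), i + 1


def parse_term(s, i=0):
    """Parse a term starting at s[i]; return (node, index after the term)."""
    kind = s[i]
    assert s[i + 1] == "("
    if kind in ("c", "v"):           # constant or variable: name, then type
        j = s.index(")", i + 2)      # assumes names never contain ')'
        name = s[i + 2:j]
        assert s[j + 1] == "("
        ty, k = parse_type(s, j + 2)
        assert s[k] == ")"
        return ("const" if kind == "c" else "var", name, ty), k + 1
    assert kind in ("C", "A")        # application or abstraction: two sub-terms
    left, j = parse_term(s, i + 2)
    assert s[j] == ")" and s[j + 1] == "("
    right, k = parse_term(s, j + 2)
    assert s[k] == ")"
    return ("comb" if kind == "C" else "abs", left, right), k + 1


# Last sample line of prooftrace.theorems shown above.
rec = json.loads('{"id": 7, "th": {"hy": [], "cc": "c(T)(c[bool][])"}}')
print(parse_term(rec["th"]["cc"])[0])
# prints ('const', 'T', ('tyapp', 'bool', []))
```

The parser is recursive, so very deep terms may exceed Python's default recursion limit; an explicit stack would be the safer choice for a full dump.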

## `prooftrace.proofs`

This file dumps the calculus rewrites that led to each `thm` demonstrated in
the current HOL Light context. The arguments of a rewrite are either previously
demonstrated `thm` values (represented by their integer index) or exogenous
`term` values (represented in the explicit tree syntax).

```
{"id": 0, "pr": ["DEFINITION", "C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(c(T)(c[bool][])))(C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))", "T"]}
{"id": 1, "pr": ["REFL", "A(v(p)(c[bool][]))(v(p)(c[bool][]))"]}
{"id": 2, "pr": ["REFL", "c(T)(c[bool][])"]}
{"id": 3, "pr": ["REFL", "c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]])"]}
{"id": 4, "pr": ["MK_COMB", 3, 0]}
{"id": 5, "pr": ["MK_COMB", 4, 2]}
{"id": 6, "pr": ["EQ_MP", 5, 2]}
{"id": 7, "pr": ["EQ_MP", 6, 1]}
```
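Because premises are referenced by integer index, the proofs file can be read as a dependency graph over theorems. The Python sketch below collects every step that a given theorem transitively depends on, using the sample lines above as input. It assumes that every integer argument is a theorem index and ignores any other argument shape; the helper names `load_proofs`, `premises` and `ancestors` are hypothetical.

```python
import json

# The eight sample lines of prooftrace.proofs shown above, verbatim.
SAMPLE = """\
{"id": 0, "pr": ["DEFINITION", "C(C(c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]]))(c(T)(c[bool][])))(C(C(c(=)(c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[fun][[c[fun][[c[bool][]][c[bool][]]]][c[bool][]]]]]))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))(A(v(p)(c[bool][]))(v(p)(c[bool][]))))", "T"]}
{"id": 1, "pr": ["REFL", "A(v(p)(c[bool][]))(v(p)(c[bool][]))"]}
{"id": 2, "pr": ["REFL", "c(T)(c[bool][])"]}
{"id": 3, "pr": ["REFL", "c(=)(c[fun][[c[bool][]][c[fun][[c[bool][]][c[bool][]]]]])"]}
{"id": 4, "pr": ["MK_COMB", 3, 0]}
{"id": 5, "pr": ["MK_COMB", 4, 2]}
{"id": 6, "pr": ["EQ_MP", 5, 2]}
{"id": 7, "pr": ["EQ_MP", 6, 1]}
"""


def load_proofs(lines):
    """Map each theorem index to its proof step (rule name plus arguments)."""
    proofs = {}
    for line in lines:
        line = line.strip()
        if line:
            rec = json.loads(line)
            proofs[rec["id"]] = rec["pr"]
    return proofs


def premises(step):
    """Integer arguments of a step, assumed to be theorem indices."""
    return [a for a in step[1:]
            if isinstance(a, int) and not isinstance(a, bool)]


def ancestors(proofs, root):
    """All theorem indices reachable from root through premises (root included)."""
    seen, stack = set(), [root]
    while stack:
        idx = stack.pop()
        if idx not in seen:
            seen.add(idx)
            stack.extend(premises(proofs[idx]))
    return seen


proofs = load_proofs(SAMPLE.splitlines())
print(sorted(ancestors(proofs, 7)))
# prints [0, 1, 2, 3, 4, 5, 6, 7]
```

For a real dump, passing an open file object to `load_proofs` instead of `SAMPLE.splitlines()` reads the file line by line rather than loading the raw text at once.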

## `prooftrace.names`

This file associates each named `thm` in the current context with its proof
index. The names are found by parsing the entire HOL Light code subtree.

```
{"id": 1837, "nm": "ABS_SIMP"}
{"id": 6188, "nm": "AND_CLAUSES"}
{"id": 18, "nm": "AND_DEF"}
{"id": 15449, "nm": "AND_FORALL_THM"}
{"id": 1752, "nm": "BETA_THM"}
{"id": 29585, "nm": "BOOL_CASES_AX"}
{"id": 115216, "nm": "CHOICE_PAIRED_THM"}
```
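A name lookup table is enough to jump from a theorem name to its index in the other two files. The Python sketch below builds one from the sample lines above. It assumes that names are unique (a later entry silently overwrites an earlier one), and `load_names` is a hypothetical helper.

```python
import json

# Sample lines of prooftrace.names shown above, verbatim.
SAMPLE_NAMES = """\
{"id": 1837, "nm": "ABS_SIMP"}
{"id": 6188, "nm": "AND_CLAUSES"}
{"id": 18, "nm": "AND_DEF"}
{"id": 15449, "nm": "AND_FORALL_THM"}
{"id": 1752, "nm": "BETA_THM"}
{"id": 29585, "nm": "BOOL_CASES_AX"}
{"id": 115216, "nm": "CHOICE_PAIRED_THM"}
"""


def load_names(lines):
    """Map each theorem name to its integer proof index."""
    by_name = {}
    for line in lines:
        line = line.strip()
        if line:
            rec = json.loads(line)
            by_name[rec["nm"]] = rec["id"]
    return by_name


names = load_names(SAMPLE_NAMES.splitlines())
print(names["AND_DEF"])
# prints 18
```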

# Installation

After checking out the HOL Light repository with `git`, patch the `fusion.ml`
file with:

```
$ git apply ProofTrace/fusion.ml.diff
```

You can then restart HOL Light as usual. Do not use any checkpointing
technique: all theorems need to be reconstructed after the patch so that their
associated proofs are built.

# Usage

```
# #use "hol.ml";;

(* Possibly load other files... *)

# #use "ProofTrace/proofs.ml";;

# dump_theorems "ProofTrace/prooftrace.theorems";;
# dump_proofs "ProofTrace/prooftrace.proofs";;
# dump_names "ProofTrace/prooftrace.names";;

(* Each of these steps may take a while. The generated set of files for a    *)
(* fresh hol.ml context is typically about 8GB in total.                     *)
```
