Thoughts on RefinedC project file management
RefinedC project file management
The verification of a C file dir1/dir2/name.c
leads to the generation of the following files (all paths relative to the root of the RefiendC project root):
-
dir1/dir2/proofs/name/generated_code.v
(contains all the function implementations), -
dir1/dir2/proofs/name/generated_spec.v
(contains all the function specifications), -
dir1/dir2/proofs/name/generated_proof_NAME.v
(forNAME
a function name, contains the function's typing proof), -
dir1/dir2/proofs/name/proof_files
(a list of proof files currently tracked, useful to detect function deletion), -
dir1/dir2/proofs/name/proof_dune
(build system configuration).
Additionally, file _CoqProject
is edited to let editors know where to find compiled (.vo
) files.
It basically maps directory dir1/dir2/proofs/name
(in fact, its counterpart under the dune _build
directory) to the Coq path ROOT_PATH.dir1.dir2.name
.
Problem 1: strict directory and file names
By default, the Coq module paths used for the generated files is based on the name of the file (and of the folders along a relative path from the project root).
This works well as long as the path to the file does not correspond to an invalid Coq module path (e.g., uses character -
or non-ascii characters).
The obvious solution is to let the user map source directory folders and files to a Coq module path of their choice (provided it is under the project's prefix).
One potential issue is: what happens if the user decides to change the module path for some file path? Obviously, the generated file corresponding to the C source files must be marked as stale, or generated again.
Problem 2: modular handling of header files
The generated_spec.v
file currently contains all type definitions and function specifications originating from the (pre-processed) input C file, including annotated function prototypes defined in transitively included header files (possibly even standard library files).
This is obviously not very modular, and leads to specification being repeated in different files accross the source tree (namely, all files that include the header containing the prototype annotated with the RefinedC specification).
A possible solution would be to rely on the source file and line number information inserted by the pre-processor to determine where a given function prototype or type declaration comes from. However, this approach is far from perfect since it does not play well with conditionals: depending on the inclusion context for a header file, different branches (and hence definitions) may be used. Hence, we cannot reliably rely on just the name of a symbol for desambiguation: we would need to also take into account the value of all pre-processor flags at the point of the inclusion (or similar). A possible work around for the last part is to only support a subset of the pre-processor directives, and to somehow detect when the restriction is violated (which shounds both hard and quite restrictive given the kind of hacks one can find in, e.g., the Linux kernel source tree).
Problem 3: broken Coq project file
What if the user messes up the Coq project file?
One way to make sure this does not happen, and to let RefinedC recover if it does, is to re-generate the file upon each invocation of RefinedC (if it is not already up to date).
To be able to to this, we need to know which C files are currently tracked by RefinedC, since processing a large source tree (e.g., that of Linux) may be very inneficient (even using grep
in the whole Linux source tree is super slow).
One possible solution is to rely on a .rc-ignore
file to exclude certain files and directories during "project discovery", and to consider by default that all C source files are part of the project.
Another solution is to rely on special commands like refinedc add dir/file.c
, refinedc mv dir/file.c new_dir/file.c
and refinedc delete dir/file.c
, as the git
command does.
This later solution seems cleaner, and we could have a refinedc status
command giving information on the current status of the project.
However, concern has been raised about having to keep the state of the project in sync with the files.