Browsing Program Structure with Worgle and SQLite
2019-09-02
Literate Programs as Structured Data
A compelling aspect of writing literate programs is that they can be represented as structured data. In Worgle, there are two main tree representations of the overall program structure. The woven tree represents the document structure as a collection of headers found in the Org markup language. The tangled tree represents the generated code structure as a series of named codeblocks created using the noweb syntax.
I always thought it would be a very powerful thing to be able to explore literate programs as trees. As of now, this is beginning to be possible in Worgle. Worgle has the ability to write data to a SQLite database, which is then queried using a program called worgmap.
Extracting Data from a Worgle Literate Program
Before a literate program made using Worgle can be queried, data must first be written to an intermediate format. I have chosen to use SQLite, as it is a robust and mature data format that is trivial for other programs to parse.
A database is generated using the "-d" flag in Worgle. The code below generates a database from the main Worgle org file.
worgle -d a.db worgle.org
"a.db" is the default database name that Worgmap opens when it queries information.
Database write times are reasonable. Monolith, my largest program written in Worgle to date, writes a database in under half a second on my 2015 MacBook Pro. My GPD laptop running Alpine Linux takes a few seconds. That gap is larger than I expected, even accounting for the hardware difference. Even so, it still feels manageable.
Some Querying Via Worgmap
Once a database is generated, it can be queried using "get" utilities found in a program called Worgmap. The database is a pure SQLite database, so it is possible to just do raw SQL queries using the sqlite3 CLI. The worgmap get interface saves a few keystrokes.
When worgmap is run, it assumes the database is in the current working directory and is named "a.db". In the future, this will be more customizable.
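Because the database is plain SQLite, it can also be inspected without worgmap. Here is a minimal sketch in Python, assuming only that a database file such as "a.db" exists. It makes no assumptions about the table names Worgle writes; it reads them from SQLite's built-in sqlite_master catalog. The function name list_tables is made up for this example.

```python
import sqlite3


def list_tables(db_path="a.db"):
    """Return the names of all tables in a SQLite database, sorted by name."""
    con = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's own catalog of schema objects.
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

Calling list_tables() in the directory where Worgle wrote "a.db" is a reasonable first step before writing raw SQL queries by hand.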
To get a list of files from the database, run:
worgmap get filelist
This will return the list of files tangled by Worgle:
worgle.c
worgle.h
worgle_private.h
The program ffile can be used to get metadata on the file "worgle.c":
worgmap get ffile worgle.c
This returns the following:
id = 2
filename = worgle.c
top = 1
next_file = 29
The id is the UUID associated with this resource. filename is the stored filename (duh). top refers to the top-level code block represented. next_file is the UUID of the next file in the list.
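Since each file record points at the next one, the files can be read as a linked list. Here is a rough sketch of walking such a list in Python. It uses a hypothetical in-memory dict in place of the database: the ids, the filenames, and the use of None as an end marker are all assumptions for illustration, not Worgle's actual data or schema.

```python
def walk_files(files, first_id):
    """Follow next_file links starting at first_id; return filenames in order."""
    names = []
    seen = set()  # guards against a malformed list that loops back on itself
    current = first_id
    while current is not None and current not in seen:
        seen.add(current)
        record = files[current]
        names.append(record["filename"])
        current = record["next_file"]
    return names


# Hypothetical records keyed by id; not real Worgle output.
files = {
    1: {"filename": "first.c", "next_file": 2},
    2: {"filename": "second.h", "next_file": 3},
    3: {"filename": "third.h", "next_file": None},
}
print(walk_files(files, 1))
```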
To get more information on the top level block:
worgmap get blk 1
Which returns:
1 3 worgle-top
This displays, in order: the UUID (1), the UUID of the top-level segment (3), and the name of the block (worgle-top). "worgle-top" is the block that contains the entire structure of the tangled C file "worgle.c". A tree view of this block can be printed using:
worgmap get tree worgle-top
The results:
global_variables
enums
parse_modes
static_function_declarations
functions
loadfile_localvars
loadfile
parser_local_variables
parser_initialization
getline
parse_mode_org
parse_mode_code
parse_mode_begincode
begin_the_code
worgle_block_set_id
worgle_file_set_id
worgle_segment_string_set_id
worgle_segment_reference_set_id
worgle_init
worgle_free
worgle_string_init
worgle_segment_init
worgle_block_init
hashmap_hasher
worgle_file_init_id
local_variables
initialization
parse_cli_args
append_filename
turn_on_debug_macros
turn_on_warnings
map_source_code
generate_database
check_filename
loading
parsing
generation
mapping
database
cleanup
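A tree view like this can be built by following block references recursively. Here is a rough sketch in Python, assuming a hypothetical mapping from each block name to the names of the blocks it references. The block names and the two-space indent are made up for the example, and this is not a description of how worgmap is implemented.

```python
def tree_lines(blocks, name, depth=0, _path=()):
    """Return indented lines for the blocks referenced from `name`, depth-first."""
    lines = []
    for child in blocks.get(name, []):
        lines.append("  " * depth + child)
        if child not in _path:  # do not descend again into a reference cycle
            lines.extend(tree_lines(blocks, child, depth + 1, _path + (child,)))
    return lines


# Hypothetical block references; not taken from worgle.org.
blocks = {
    "top": ["variables", "functions"],
    "functions": ["init", "cleanup"],
}
print("\n".join(tree_lines(blocks, "top")))
```

As in the listing above, the root block itself is not printed, only the blocks beneath it.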
Future Plans
Lots of things to be done here, really. Using something like SQLite allows me to dump way more metadata than I know what to do with right now.
For starters, I'd like to parse and save org structure in addition to tangled code structure. I'm hoping to build more utilities that generate interesting representations of the document. I'm also hoping to build a better static HTML generator than the simple one I have currently written. I also want to build a simple HTTP server that dynamically generates HTML content. Maybe throw in a few dot graph generators for good measure?
Being able to write multiple Worgle programs into one database is important to me as well, as this would allow more incremental (hopefully faster) development to happen.