About this page

This page describes the design and development of BILE, a program that I wrote to give HTML pages a common look and feel. BILE was written in C and has been compiled for both Windows and Linux.

Background

I wrote BILE originally to help produce a satirical Web site (hence the name) that never got off the ground. This was some years ago before blogging software and Web-based Content Management Systems were commonplace and before Web hosting services routinely offered CGI or other server-side interactivity. For this reason, BILE was created as an “offline” tool; the intention is that you run BILE over your Web pages and upload the results to your Web site. Therefore, BILE is suitable in the following situations:

BILE can be used to style dynamic content such as PHP or ASP pages but, as with static content, BILE must be run on these files before they are uploaded to the server.

At the time BILE was conceived, the most commonly available technology for giving Web pages on a site a consistent look and feel was Server Side Includes (SSI)[1]. However, SSI requires the Web page author to embed special comments into each and every page on which they want common page elements (such as sidebars and footers) to appear. Instead of doing this, BILE uses a template file into which the body of each page is “poured”; see the section on BILE’s application model for a more detailed description.

The first version of BILE was written in QBasic, the free version of QuickBasic that Microsoft distributed with DOS 6 and Windows 95. This version of BILE offered the user very little control over the output unless they were willing to change the BASIC code. It was also limited in the kinds of metadata it could extract from the input files because of limitations in QBasic itself.

Because of these limitations, I decided to rewrite BILE in a more powerful language. Specifically, I wanted to add the following improvements:

After some experiments, I decided on C as the implementation language.

Build environment

The BILE source currently consists of about 6,000 lines of C. The source is fairly portable with little conditional compilation. I build the code using the GNU C Compiler on Linux and the MinGW32 port of same on Windows. The build is controlled using GNU make but I don't use the GNU autoconf tools. In addition, I use the following software for version control and change tracking:

Application model

Basic concepts

BILE was originally intended to be used to produce a news-type Website and its terminology reflects this. A BILE “project” is referred to as a publication. A publication is broken into sections which contain stories. Each section can have one or more indexes which are used to generate lists of links, tables of contents, etc. Stories can also have tags like those found on a number of Web sites. In the present implementation of BILE, publications and sections map to directories and stories map to files. The only difference between the publication and its sections is that indexes defined in the publication configuration file (see below) apply to all files in all sections and subsections, whereas indexes defined in section configuration files only index those files in the section directory itself.

Each BILE entity (publication, section, index and story) has a number of variables associated with it. Some of these variables are created by BILE itself on startup, some are read from file metadata and configuration files and some are modified by BILE during processing.

The entities form a set of nested scopes. For example, if BILE is processing a story file’s template and it encounters a reference to a variable $var, it will first check if there is a local variable called $var. If there is not, it will check the section’s variables, the parent section’s variables and so on up to the publication’s variables. Finally, it will check the computer’s environment variables. If no variable can be found, it will create a new local variable in the story called $var and assign it a blank value.

If BILE code in an “inner” scope attempts to change a variable in an “outer” scope, BILE will create a local variable with the same name and containing the modified value. This is to prevent side effects as BILE doesn’t guarantee the order in which it processes files. However, this behaviour can be circumvented if necessary by using the SET command described below.
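The lookup rules above can be illustrated with a short C sketch. This is not taken from the BILE source: the Var and Scope types and the scope_* function names are hypothetical, variable names are stored without the leading “$”, and strings are assumed to be owned by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; the names are illustrative, not BILE's own. */
typedef struct Var {
    const char *name;
    const char *value;
    struct Var *next;
} Var;

typedef struct Scope {
    Var *vars;             /* variables defined directly in this scope */
    struct Scope *parent;  /* enclosing scope; NULL for the publication */
} Scope;

/* Look for a variable in one scope only. */
Var *scope_find_local(const Scope *s, const char *name)
{
    for (Var *v = s->vars; v != NULL; v = v->next)
        if (strcmp(v->name, name) == 0)
            return v;
    return NULL;
}

/* Add a variable to one scope; returns NULL if allocation fails. */
Var *scope_add(Scope *s, const char *name, const char *value)
{
    Var *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->name = name;
    v->value = value;
    v->next = s->vars;
    s->vars = v;
    return v;
}

/* Read a variable: walk outwards through the scopes, then try the
 * environment, and finally create a blank local variable. */
const char *scope_get(Scope *s, const char *name)
{
    assert(s != NULL && name != NULL);
    for (Scope *cur = s; cur != NULL; cur = cur->parent) {
        Var *v = scope_find_local(cur, name);
        if (v != NULL)
            return v->value;
    }
    const char *env = getenv(name);
    if (env != NULL)
        return env;
    Var *created = scope_add(s, name, "");
    return created != NULL ? created->value : "";
}

/* Write a variable: the change always lands in the innermost scope,
 * shadowing any outer variable of the same name. */
void scope_let(Scope *s, const char *name, const char *value)
{
    Var *v = scope_find_local(s, name);
    if (v != NULL)
        v->value = value;
    else
        scope_add(s, name, value);
}
```

In this sketch a write through scope_let() never touches an outer scope, which mirrors the shadowing behaviour described above; a SET-style global assignment would instead walk to the outermost scope before writing.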

Command-line invocation

BILE is invoked as follows:

bile [-f] [-v] -i input directory -o output directory -t template directory
Command-line switch Description
-f Optional. Force regeneration of all output pages even if the input hasn’t changed.
-v Optional. Verbose mode. Generate progress information while running.
-i Mandatory. Specifies the input directory.
-o Mandatory. Specifies the output directory.
-t Mandatory. Specifies the template directory.
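As an illustration of the switches listed above, here is a minimal sketch of how such a command line could be parsed in portable C. The Options structure and the parse_args() function are hypothetical names chosen for this example; the actual BILE source may well do this differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the parsed switches. */
typedef struct {
    int force;                 /* -f: regenerate everything */
    int verbose;               /* -v: print progress information */
    const char *input_dir;     /* -i: mandatory */
    const char *output_dir;    /* -o: mandatory */
    const char *template_dir;  /* -t: mandatory */
} Options;

/* Returns 0 on success, -1 on an unknown switch, a switch with no
 * value, or a missing mandatory directory. */
int parse_args(int argc, char **argv, Options *opt)
{
    assert(opt != NULL);
    opt->force = 0;
    opt->verbose = 0;
    opt->input_dir = NULL;
    opt->output_dir = NULL;
    opt->template_dir = NULL;

    for (int i = 1; i < argc; i++) {
        const char *arg = argv[i];
        if (strcmp(arg, "-f") == 0) {
            opt->force = 1;
        } else if (strcmp(arg, "-v") == 0) {
            opt->verbose = 1;
        } else if (i + 1 < argc && strcmp(arg, "-i") == 0) {
            opt->input_dir = argv[++i];
        } else if (i + 1 < argc && strcmp(arg, "-o") == 0) {
            opt->output_dir = argv[++i];
        } else if (i + 1 < argc && strcmp(arg, "-t") == 0) {
            opt->template_dir = argv[++i];
        } else {
            return -1;
        }
    }
    return (opt->input_dir && opt->output_dir && opt->template_dir) ? 0 : -1;
}
```

A hand-written loop is used here rather than getopt() only to keep the sketch free of platform-specific headers.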

Index pages

One of the benefits of using BILE is its ability to automatically generate index pages which act as tables of contents for a site. An index can be added to any template using the INDEX block command but a separate file or set of files can be generated for an index by setting the $index_file and $index_template variables in the index’s definition in the publication or section configuration file.

Multi-page indexes

Normally, an index page will consist of only a single output file. This can be overridden by changing the value of the $index_file variable (using the SET command as you are changing the global state) and then using the BREAK command. BILE checks the value of the $index_file variable after it has output the index page and if it has changed, re-runs the template with the new file name.

Note: While inside an INDEX block, BILE changes the way it searches its scope for variables. When processing a file normally, the search looks like this:

story → section → … publication → environment

Inside an INDEX block, the search looks like this:

story → index → section → … publication → environment

Configuration files

BILE configuration files are used to store information about the publication and each section in the publication. They also store the index definitions for the publication and its sections. They have a .bile file extension. The publication configuration file is called publication.bile and is located in the publication’s top-level directory. Each subdirectory in this directory can have a section-specific configuration file called section.bile.

Format

The configuration files have the following format:

# A comment; ignored
$var_name1 = `A literal`
$var_name2 = `A valid ` . ucase(`bile`) . ` expression`

# Index definition
index index_name
$sort_by = `-file_date`
endindex

Note the following:

Mode of operation

For BILE to run, it needs three arguments: the input directory from which files are to be read, the output directory where processed files are to be created, and the template directory where template files are stored. When BILE is run, it performs two main phases of processing: reading the data it needs to generate the output, and the generation phase itself. The first phase proceeds as follows:

At the end of this operation, BILE has everything it needs to do its job. It then proceeds to the generation phase:

Controlling the output

BILE allows you to specify variables in the configuration file or metadata to give finer control over what gets output.

Variable name Description
$use_template This is the variable that tells BILE to use a template on the input file.
$template_file The location of the template file to use, if $use_template is true. The location is relative to the template directory specified on the command line.
$use_template_ext If $use_template_ext is set to "true", the extension of the template is used rather than the extension of the input file. For example, if the input file is called input.html but the template file is called template.shtml, then the output file will be called input.shtml if $use_template_ext is set to "true".
$output_mode If $output_mode is set to "both", both the original input file and the file generated by passing the input file through the template are copied to the output directory. This is useful for non-HTML input. For example, this can be used with image files to create a gallery.
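The effect of $use_template_ext on the output file name can be sketched in C as follows. The find_ext() and output_name() functions are hypothetical and only “/” is treated as a directory separator; this is an illustration of the rule in the table above, not BILE’s actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return a pointer to the extension (including the dot) of the last
 * path component, or NULL if it has none. */
const char *find_ext(const char *path)
{
    const char *dot = strrchr(path, '.');
    const char *slash = strrchr(path, '/');
    if (dot == NULL || (slash != NULL && dot < slash))
        return NULL;
    return dot;
}

/* Build the output file name. If use_template_ext is non-zero, the
 * input file's extension is replaced by the template's extension.
 * Returns 0 on success, -1 if the result does not fit in out. */
int output_name(const char *input, const char *tmpl, int use_template_ext,
                char *out, size_t out_size)
{
    assert(input != NULL && tmpl != NULL && out != NULL);
    const char *in_ext = find_ext(input);
    const char *tmpl_ext = find_ext(tmpl);
    int n;

    if (!use_template_ext || tmpl_ext == NULL) {
        n = snprintf(out, out_size, "%s", input);
    } else {
        size_t stem = in_ext ? (size_t)(in_ext - input) : strlen(input);
        n = snprintf(out, out_size, "%.*s%s", (int)stem, input, tmpl_ext);
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the names used in the table, output_name("input.html", "template.shtml", 1, ...) yields input.shtml, and passing 0 for use_template_ext leaves the name as input.html.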

Template files

BILE template files are simply text files with commands enclosed in double square brackets, [[like this]]. Template commands can be simple commands or block commands that enclose other commands. Blocks are closed by preceding the command name with a slash, for example, [[if]] ... [[/if]]. Some commands are immediate; that is, they are evaluated when the template is loaded, not when it is executed. Immediate commands are prefixed with a “!” character.
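To make the template syntax concrete, the following C sketch splits a template into literal text and [[...]] commands. The scan_template() function, the chunk_fn callback type and the ChunkCount helper are hypothetical; the sketch also assumes that “]]” never appears inside a command, which a real parser would have to handle (for example inside string literals).

```c
#include <assert.h>
#include <string.h>

/* Called once per chunk; is_command is 1 for the text between "[["
 * and "]]" and 0 for literal text. The text is not NUL-terminated. */
typedef void (*chunk_fn)(int is_command, const char *text, size_t len,
                         void *ctx);

/* Split a template into chunks. Returns 0 on success or -1 if a
 * "[[" is never closed. */
int scan_template(const char *tmpl, chunk_fn emit, void *ctx)
{
    assert(tmpl != NULL && emit != NULL);
    const char *p = tmpl;
    for (;;) {
        const char *open = strstr(p, "[[");
        if (open == NULL) {
            if (*p != '\0')
                emit(0, p, strlen(p), ctx);   /* trailing literal text */
            return 0;
        }
        if (open > p)
            emit(0, p, (size_t)(open - p), ctx);
        const char *close = strstr(open + 2, "]]");
        if (close == NULL)
            return -1;                        /* unterminated command */
        emit(1, open + 2, (size_t)(close - (open + 2)), ctx);
        p = close + 2;
    }
}

/* Example callback: count literal and command chunks. */
typedef struct {
    int literals;
    int commands;
} ChunkCount;

void count_chunk(int is_command, const char *text, size_t len, void *ctx)
{
    ChunkCount *count = ctx;
    (void)text;
    (void)len;
    if (is_command)
        count->commands++;
    else
        count->literals++;
}
```

A loader built this way could inspect the first character of each command chunk to decide whether it is an immediate command (prefixed with “!”) that should be handled at load time.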

The following commands are defined in BILE:

Command Description
# A comment. Everything after the “#” is ignored. No output is generated
= expression Writes the result of the expression to the output, escaping any HTML special characters.
> expression Writes the result of the expression to the output without escaping any characters.
BODY Writes the body of the input file to the output. What constitutes the body of a file varies depending on the input file type. For example, for an HTML file, only the parts of the file between the <BODY> tags will be output; for a text file, the entire file will be output.
LOCATION expression Prints a “breadcrumb trail” for the input file using the result of the expression as a separator. The “breadcrumb trail” will consist of a link to the index page of the first index defined in the input file’s section, parent section, etc., all the way to the publication level.
BREAK Leaves the current block.

Note: Unlike C and its descendants, a BILE IF command is a block like any other, so the following will not exit the block as intended:

[[block]]
   [[if $some_condition]]
      [[break]]
   [[/if]]
[[/block]]

In this case, the BREAKIF command should be used instead:

[[block]]
   [[# Do something... ]]
   [[breakif $some_condition]]
[[/block]]
BREAKIF expression Leaves the current block if the expression is true.
IF Conditional block command. Syntax:
[[block]]
   [[if expression]]
   [[/if]]
[[/block]]

Note that there is no ELSE clause in BILE.

!INCLUDE expression Includes the file named by the result of the expression in the template, including any BILE commands it contains. This is an immediate command, so it is executed once when the template is loaded.
INDEX [expression] Block command. Evaluates the expression and looks for an index of that name. If an index is found, the block is evaluated for each file in the index. Used to generate tables of contents. If included in an index template, the expression may be omitted.
LET $variable = expression Assigns the value of the expression to the local variable $variable.
PREAMBLE For HTML files, outputs any text that occurs before the opening <HTML> tag. Useful for PHP files which may have setup code before the <HTML> tag.
SECTIONS Generates a list of sections defined in the publication.
SET $global = expression Assigns the value of the expression to the global variable $global.
TAGS Block command. For each tag defined in the input file, evaluate the block.
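The difference between the “=” and “>” commands in the table above is the escaping step. A minimal sketch of that step in C follows; the html_escape() name is hypothetical, and the set of characters escaped (&, <, > and the double quote) is an assumption, as the exact set BILE escapes is not listed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy in to out, replacing HTML special characters with entity
 * references. Returns 0 on success, -1 if out is too small. */
int html_escape(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    size_t used = 0;

    if (out_size == 0)
        return -1;
    for (; *in != '\0'; in++) {
        char single[2] = { *in, '\0' };
        const char *rep = single;       /* default: copy unchanged */
        switch (*in) {
        case '&': rep = "&amp;";  break;
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '"': rep = "&quot;"; break;
        }
        size_t n = strlen(rep);
        if (used + n + 1 > out_size)    /* keep room for the NUL */
            return -1;
        memcpy(out + used, rep, n);
        used += n;
    }
    out[used] = '\0';
    return 0;
}
```

An unescaped write, as performed by “>”, would simply copy the string through without the switch statement.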

Expressions

BILE has a simple expression evaluator based on Jack Crenshaw’s series of articles entitled “Let’s Build a Compiler”[6]. The syntax is similar to PHP’s, with some peculiarities described below.
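Crenshaw’s series builds its parser by recursive descent, with one function per grammar level. The following C sketch shows that structure for integer arithmetic only. It is not BILE’s evaluator: the Parser type, the function names and the choice of operators (+, -, *, / and parentheses) are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser state: current position and an error flag. */
typedef struct {
    const char *p;
    int error;
} Parser;

long parse_expr(Parser *ps);

void skip_ws(Parser *ps)
{
    while (isspace((unsigned char)*ps->p))
        ps->p++;
}

/* factor := number | '(' expr ')' */
long parse_factor(Parser *ps)
{
    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        long inner = parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')')
            ps->p++;
        else
            ps->error = 1;              /* missing closing bracket */
        return inner;
    }
    if (!isdigit((unsigned char)*ps->p)) {
        ps->error = 1;                  /* expected a number */
        return 0;
    }
    long value = 0;
    while (isdigit((unsigned char)*ps->p))
        value = value * 10 + (*ps->p++ - '0');
    return value;
}

/* term := factor { ('*' | '/') factor } */
long parse_term(Parser *ps)
{
    long value = parse_factor(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '*' && op != '/')
            return value;
        ps->p++;
        long rhs = parse_factor(ps);
        if (op == '*')
            value *= rhs;
        else if (rhs != 0)
            value /= rhs;
        else
            ps->error = 1;              /* division by zero */
    }
}

/* expr := term { ('+' | '-') term } */
long parse_expr(Parser *ps)
{
    long value = parse_term(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '+' && op != '-')
            return value;
        ps->p++;
        long rhs = parse_term(ps);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Evaluate a whole string. Returns 0 on success, -1 on a syntax error. */
int evaluate(const char *text, long *result)
{
    assert(text != NULL && result != NULL);
    Parser ps = { text, 0 };
    *result = parse_expr(&ps);
    skip_ws(&ps);
    if (*ps.p != '\0')
        ps.error = 1;                   /* trailing characters */
    return ps.error ? -1 : 0;
}
```

Operator precedence falls out of the call structure: parse_expr() calls parse_term(), which calls parse_factor(), so the operators handled deepest bind most tightly.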

Variables

Like PHP, BILE variables are prefixed with a “$” character. Prefixing a variable name with two “$” characters works as it does in PHP, allowing a simple form of indirection, for example:

[[let $a = `Test`]]
[[let $b = `a`]]
[[= $$b]]

This will write “Test” to the output. This also works for functions, for example:

[[let $func = iif($do_uppercase, `ucase`, `lcase`)]]
[[= $func(`Test`)]]

will write “TEST” to the output if the variable $do_uppercase is true.

Literals

String literals may be delimited by single quote ('), double quote (") or backquote (`) characters. BILE does not interpolate variable names in double-quoted strings like PHP does.

BILE recognises the Boolean literals true and false. The following additional values are regarded as Boolean False:

All other values are regarded as True in a Boolean context.

Operators

Like PHP, “.” is the preferred string concatenation operator. Using “+” may not work.

BILE’s arithmetic operators are (in decreasing order of precedence):

The logical operators have higher precedence than the arithmetic operators and are (in decreasing order of precedence):

BILE’s comparison operators are somewhat unorthodox. This is because BILE is often embedded in HTML code in which the standard comparison symbols like “<” and “>” have special meanings and must be escaped. Although the BILE parser could be modified to recognise the escaped form of the operators, I felt this would reduce the readability of BILE code. Therefore, I decided to use two-letter names for the comparison operators which FORTRAN and DCL programmers might recognise, but will probably be unfamiliar to everyone else!

Conventional operator   BILE operator   Description
=, ==                   eq              equal to
<>, !=                  ne              not equal to
<                       lt              less than
<=                      le              less than or equal to
>                       gt              greater than
>=                      ge              greater than or equal to

Functions

BILE has a number of built-in functions. These functions have been added on a more-or-less ad-hoc basis as I needed them so they are something of a “mixed bag”. The functions in BILE are stored in a table of function pointers so it is fairly straightforward to add new ones.
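A table of function pointers of the kind mentioned above might look like the following C sketch. The Builtin structure, the single-string-argument signature and the fn_* names are assumptions made for illustration; BILE’s real table is not shown on this page.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature: one string in, a newly allocated string out. */
typedef char *(*builtin_fn)(const char *arg);

typedef struct {
    const char *name;
    builtin_fn fn;
} Builtin;

/* Helper: apply conv (toupper or tolower) to every character.
 * The caller frees the result; NULL is returned if allocation fails. */
char *map_chars(const char *s, int (*conv)(int))
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        out[i] = (char)conv((unsigned char)s[i]);
    out[n] = '\0';
    return out;
}

char *fn_ucase(const char *arg) { return map_chars(arg, toupper); }
char *fn_lcase(const char *arg) { return map_chars(arg, tolower); }

/* Adding a new built-in means adding one line to this table. */
const Builtin builtins[] = {
    { "lcase", fn_lcase },
    { "ucase", fn_ucase },
};

/* Look a function up by name; returns NULL if it is not defined. */
builtin_fn find_builtin(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof builtins / sizeof builtins[0]; i++)
        if (strcmp(builtins[i].name, name) == 0)
            return builtins[i].fn;
    return NULL;
}
```

Looking functions up by name at run time is also what would make a call through a variable, such as $func(`Test`), straightforward to support.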

Function Description
basename(file_path) Removes the directory part of file_path and returns the filename.
decode(expression, [val1, ret1, ... valn, retn], default) Compares expression to val1. If they are equal, ret1 is returned. If not, the next val is compared. If none of the supplied vals match, the default value is returned.

Note: This function is equivalent to the Oracle function of the same name.

defined(variable_name) Returns True if a variable called variable_name exists in the expression’s scope, False otherwise.
dirname(file_path) Returns the directory part of file_path.
ent(entity_name) Returns an SGML entity reference. For example, ent(`quot`) returns &quot;. This is a convenience function intended to reduce the amount of escaping necessary in BILE code.
exec(program_name) Runs the external program program_name and captures any output it generates. The program’s exit code is stored in the global variable $error.
file(file_name) Reads the file file_name and returns its contents.
file_exists(file_path) Returns True if the file file_path exists, False otherwise.
iif(expression, true_val, false_val) If expression is True, returns true_val; otherwise, false_val is returned.

Note: This function is equivalent to the VB/VBA function of the same name.

index_first(index_name[, variable_name]) Checks the publication for an index called index_name. For index_first(), the value of variable_name in the scope of the first file in the index is returned. For index_last(), the value of variable_name in the scope of the last file in the index is returned. For index_prev() and index_next(), the current file’s position in the index is determined and then the value of variable_name in the scope of the preceding file or the following file in the index is returned. If variable_name is not specified, the value of the variable $file_name is returned.

These functions are used to generate Previous/Next links on pages.

index_last(index_name[, variable_name])
index_next(index_name[, variable_name])
index_prev(index_name[, variable_name])
lcase(string) Returns string in lower case.
length(string) Returns the length of string.
now() Returns the current time as the number of seconds since midnight of Jan 1 1970.
relative_path(path1, path2) Given two absolute paths path1 and path2, returns a path to path2 relative to path1.
strftime Formats a time value. Accepts the same format as the strftime function in the C library.
substr(string, start[, length]) Returns the substring of string of length characters, starting at offset start. If length is omitted, the substring to the end of the string is returned.

Note: BILE counts the characters in strings starting from zero.

tag(tag_name[, attr_name1, attr_val1, ...]) Returns an SGML element (tag) with the specified attributes. For example tag(`h1`, `class`, `title`) returns <h1 class="title">. This is a convenience function used to reduce the amount of escaping in BILE code.

Note: attr_vals are not escaped before being added to the tag.

ucase(string) Returns string in upper case.

Note: In order to simplify the parser, there can be no space between a function name and the opening bracket.
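The behaviour of decode() described in the table above can be sketched in C as follows. The array-based signature is a simplification chosen for this example (the BILE function takes a variable number of arguments), and comparing the values as strings is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* pairs holds npairs (val, ret) pairs laid out as
 * val1, ret1, val2, ret2, ... Returns the ret whose val equals expr,
 * or default_val if none matches. */
const char *decode(const char *expr, const char *const pairs[],
                   size_t npairs, const char *default_val)
{
    assert(expr != NULL);
    for (size_t i = 0; i < npairs; i++) {
        if (strcmp(expr, pairs[2 * i]) == 0)
            return pairs[2 * i + 1];
    }
    return default_val;
}
```

For example, with the pairs { "gif", "image/gif", "png", "image/png" } and the default "unknown", decoding "png" gives "image/png" and decoding "bmp" gives "unknown".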

File metadata

BILE works by being able to extract metadata from its input files. There is a default file handler that works on all files and extracts information common to all files such as their name, size and modification date. There are two additional handlers for processing HTML and image files.

General metadata

The general file handler will create the following variables in the file’s scope:

HTML metadata

The HTML file handler parses the <HEAD> element of the HTML file and will create a variable called $title equal to the contents of the <TITLE> element. In addition, it will create variables for every <META> element it finds in the <HEAD>, replacing any characters that are illegal in BILE variable names with underscores. For example, if an HTML file contains the following <HEAD> element:

<HEAD>
  <TITLE>Home Page</TITLE>
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html">
  <META NAME="Keywords" CONTENT="home page">
</HEAD>

BILE will create the following variables:
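The replacement of illegal characters with underscores, as described for <META> names above, could be sketched in C like this. The meta_to_var_name() function is hypothetical, and the sketch assumes that letters, digits and underscores are the only characters legal in a BILE variable name; whether BILE also changes the case of the name is not covered here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Turn a <META> name into a "$"-prefixed variable name, replacing
 * every character that is not a letter, digit or underscore with an
 * underscore. Returns 0 on success, -1 if out is too small. */
int meta_to_var_name(const char *meta_name, char *out, size_t out_size)
{
    assert(meta_name != NULL && out != NULL);
    size_t n = strlen(meta_name);

    if (n + 2 > out_size)               /* '$' + name + NUL */
        return -1;
    out[0] = '$';
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)meta_name[i];
        out[i + 1] = (isalnum(c) || c == '_') ? (char)c : '_';
    }
    out[n + 1] = '\0';
    return 0;
}
```

Under these assumptions, the name Content-Type from the example above becomes $Content_Type.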

Image metadata

The image file handler can parse a GIF, JPEG or PNG image and extract the image’s type and dimensions. These will be stored in the variables $content_type (as a MIME type), $image_width and $image_height. For GIF and JPEG images, it will check for embedded comments in the image and store them in the variable $comments. For PNG images, the handler will check for tEXt chunks. These chunks contain image metadata in key/value pairs and the handler will create a BILE variable for each key/value pair it finds.
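As an example of how image dimensions can be extracted, the following C sketch reads the width and height from a PNG file that has already been loaded into memory. It relies on the PNG layout of an 8-byte signature followed by the IHDR chunk, whose first two fields are the width and height as 4-byte big-endian integers. The function names are hypothetical and this is not the code BILE uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 4-byte big-endian unsigned integer. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Extract the dimensions from the start of a PNG file.
 * Returns 0 on success, -1 if the data is not a PNG header. */
int png_dimensions(const unsigned char *data, size_t len,
                   uint32_t *width, uint32_t *height)
{
    static const unsigned char signature[8] = {
        0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };
    assert(data != NULL && width != NULL && height != NULL);

    /* 8 signature + 4 length + 4 type + 4 width + 4 height bytes */
    if (len < 24)
        return -1;
    if (memcmp(data, signature, sizeof signature) != 0)
        return -1;
    if (memcmp(data + 12, "IHDR", 4) != 0)
        return -1;
    *width = read_be32(data + 16);
    *height = read_be32(data + 20);
    return 0;
}
```

The tEXt chunks mentioned above use the same chunk framing (length, type, data, CRC), so a handler could find them by stepping from chunk to chunk after IHDR.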

Known issues

There are a number of bugs and other problems with BILE:

Further directions

As it stands, BILE serves my needs pretty well. I use it to maintain this Web site. However, there are a number of features that could be added to make it more useful.

References

[1] Server Side Includes, Apache implementation, http://httpd.apache.org/docs/2.2/mod/mod_include.html
[2] Git, http://git-scm.com/
[3] CVSTrac, integrated issue tracker and wiki, http://www.cvstrac.org/
[4] Doxygen, documentation generation tool, http://www.stack.nl/~dimitri/doxygen/
[5] Git Extensions, https://github.com/gitextensions/gitextensions
[6] Jack Crenshaw, Let’s Build a Compiler, http://compilers.iecc.com/crenshaw/

Document history

Version Author Date Comment
1.1 Ken Keenan 06 January 2016 Updated details
1.0 Ken Keenan 13 August 2007 Initial version