1. HOWTO (Partially) Automate a Web Site

1.1. Update: April 2011

The process described below for generating pages on this site has changed somewhat. As I noted in my survey of lightweight markup languages, I actually like Markdown because its markup seems more intuitive to me. But txt2tags has more features than Markdown, which is focused on producing snippets of HTML rather than complete web pages. Those features of txt2tags allowed me to include HTML code for headers, footers, divs, etc., without cluttering up the source text file with a lot of HTML code. Then I found WriteMonkey, a minimalist text editor that accepts Markdown markup and exports a complete web page. While that does not allow the easy inclusion of headers, footers, and divs (for formatting) that txt2tags does, it makes producing pages more streamlined. So I now use WriteMonkey for pages that present a complete article and txt2tags for pages that require the additional formatting. For the first of several pages discussing WriteMonkey and its use (and to see the difference in formatting compared to this page), go here.

1.2. Goals

As I noted in the article About This Page, I've been doing a web site for a long time. I've used most of the tools to "automate" the site generation: content management systems, blogs, wikis. I've had a few spam attacks on my site over the years. These attacks were more successful when I used a system based on server-side executable code, such as the blogging and wiki software. So, I've come full circle on how I want my site generated. I'm going back to static, hand-coded HTML pages. Static pages are, I believe, less prone to attacks by spammers since they have no comment feature and no editing option, as wikis and blogs do.

Even with hand-coded pages, I've learned enough about how I like to work and how to build sites that I have specific goals in developing this new system. New, that is, to me as of December 2010. These are the goals:

  1. Write page content in plain text as much as possible.
  2. Use a converter to turn the plain text into HTML.
  3. Use scripts to organize and upload the pages to my ISP's server.

To meet goal 1, I decided I wanted to use a lightweight markup language (LML) that allows content to be written in plain text, with a minimum of markup to structure the page and format the text. One benefit of using an LML is that the text is readable as a text file. This makes editing, revising, and maintenance of the content source much easier. So the first task was to select the LML to use.

1.3. txt2tags

After doing a little research on different LMLs, the one I decided to start with is txt2tags. Txt2tags actually meets goals 1 and 2. It is a specification for the markup as well as a Python script that produces output in a number of formats, among them HTML, PmWiki, and DocBook. Since I use all of those formats in some fashion, it seems like txt2tags will fill the bill. Since it is a Python script, Python must be installed on your computer.

The markup in txt2tags is similar to the markup in other LMLs such as Markdown, AFT, and reStructuredText. For example, in txt2tags, to insert a main heading in a page, you type =Main Heading Text= into the source text. When you select HTML as the output format, txt2tags converts =Main Heading Text= to <h1>Main Heading Text</h1>. The txt2tags script has provisions to include files such as standard headers and footers, so producing coded pages ready to upload to your web server is easy. You can review the list of features of txt2tags here.
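To make the heading example concrete, here is a toy sketch in Python of that kind of conversion. It is not the txt2tags implementation; the function name convert_heading and the regular expression are my own assumptions, and real txt2tags handles far more markup than this.

```python
import re

# Toy illustration only: NOT the txt2tags implementation.
# Matches a line such as "=Main Heading Text=" or "==Sub Heading==".
HEADING = re.compile(r"^(=+)\s*(.+?)\s*(=+)$")


def convert_heading(line):
    """Return an HTML heading for a heading line, else the line unchanged."""
    match = HEADING.match(line.strip())
    # The opening and closing runs of "=" must be the same length.
    if match and len(match.group(1)) == len(match.group(3)):
        level = len(match.group(1))
        if level <= 6:  # HTML defines h1 through h6
            return "<h{0}>{1}</h{0}>".format(level, match.group(2))
    return line


print(convert_heading("=Main Heading Text="))
```

The point is only that the source line stays readable as plain text while the converter supplies the tags.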

1.4. List of Tools

The tools I use to write and manage the web site:

And that's it. Most of the work is done in the editor. I run txt2tags from the command line of a Windows CMD window (although txt2tags does have a GUI if you want one). And please note that all this software is free.

1.5. The Process

1.5.1. Page Template

I use a template as the starting point for a new page. The template, like the source document for all pages, is a plain text file marked up using txt2tags syntax. The template includes the file headers required by txt2tags, configuration directives to tell txt2tags how to convert the source text file to HTML, and directives to add the footer to the HTML output. Below is a screenshot of the current template file. You can click here to see a more readable version of the file named template.txt.

You can see the HTML that txt2tags produces in this file. (Note: I have modified my copy of the txt2tags script to specify HTML5 and to insert the <div> tags for my CSS. The default version of txt2tags will produce different HTML code.) The screenshot below shows what the page template looks like in a browser after it is run through txt2tags and is rendered with the CSS stylesheet.
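As a rough illustration of the template idea, the sketch below builds the start of a txt2tags source file in Python. It is not my actual template.txt: the helper name new_page_source is hypothetical, the utf-8 encoding is an assumption, and the footer directives are left out. It relies only on two txt2tags conventions: the first three lines of a source file are the headers (title, author, date), and settings lines begin with %!.

```python
from datetime import date


def new_page_source(title, author, body=""):
    """Build txt2tags source text for a new page (hypothetical helper)."""
    lines = [
        title,                     # header line 1: document title
        author,                    # header line 2: author
        date.today().isoformat(),  # header line 3: date
        "",
        "%!target: html",          # convert to HTML by default
        "%!encoding: utf-8",       # assumed encoding, adjust as needed
        "",
        body,
    ]
    return "\n".join(lines) + "\n"


print(new_page_source("My New Page", "Site Owner", "=Main Heading Text="))
```

Saving that text to a .txt file gives a starting point that can then be edited by hand like any other page.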

1.5.2. Writing the Content

As noted above, the content is written into a plain text file with txt2tags markup. As an example, you can view the source text of this page.

1.5.3. Converting the Text to HTML

I'm running on a Windows machine so I use a CMD window to run txt2tags. When I've got the draft source text the way I want it, I open a CMD window and type:

  txt2tags page_filename.txt

Assuming you have Python installed and all the file associations and paths correctly specified, this command launches txt2tags, which converts the file page_filename.txt to page_filename.html.
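If several pages need converting at once, the same command can be run in a loop. The following is a minimal sketch, not part of my actual workflow: the function names are hypothetical, and it assumes txt2tags can be started as a program named txt2tags on the PATH. A Windows set-up that relies on file associations, as mine does, may need the call adjusted.

```python
import subprocess
from pathlib import Path


def build_command(source_file):
    """Return the command shown above for one source file."""
    return ["txt2tags", str(source_file)]


def convert_all(source_dir):
    """Run txt2tags on every .txt file in source_dir (not recursive)."""
    for source_file in sorted(Path(source_dir).glob("*.txt")):
        # check=True stops the loop if txt2tags reports an error.
        subprocess.run(build_command(source_file), check=True)
```

Calling convert_all(".") from the folder that holds the source files would convert each one in turn.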

1.5.4. Updating the Web Site

Once the HTML is generated, I upload the file to my server. I add the new content page to the Table of Contents on my home page and I'm finished.
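The upload step could also be scripted, in line with goal 3. The sketch below assumes the server accepts plain FTP, which is an assumption on my part and may not match your ISP. The host, user name, password, and remote folder are placeholders you would supply, and the function names are hypothetical.

```python
from ftplib import FTP
from pathlib import Path


def html_files(site_dir):
    """Return the .html files in site_dir, sorted by name."""
    return sorted(Path(site_dir).glob("*.html"))


def upload(site_dir, host, user, password, remote_dir):
    """Upload every generated .html file to remote_dir on the server."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        for path in html_files(site_dir):
            with open(path, "rb") as handle:
                # STOR writes the file under the same name on the server.
                ftp.storbinary("STOR " + path.name, handle)
```

Adding the new page to the Table of Contents remains a manual step in this sketch.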

BTW, I hand-code the TOC on the home page. I created that page before this effort to switch to plain text and "automated" tools for adding to the site. A future TODO is to convert the home page (and some of my other, older pages) to the txt2tags format. Update 2011-01-07: most of the pages are now maintained using txt2tags. I still do the TOC by hand since it has a lot of spans and divs for the CSS formatting. Spans and divs are not a strength of txt2tags.

2. Other Reading

If you want to read some background on why lightweight markup languages and plain text are a good way to capture your ideas for publishing on the web, browse some of the links below: