createPhotoWebPages.pl - User Guide

What is createPhotoWebPages.pl?

createPhotoWebPages.pl is a Perl script that helps create HTML thumbnail and image preview pages from existing image files (JPEG is the preferred format, but PNG and GIF are also supported).

The script creates an HTML thumbnail page with small, clickable thumbnail images, together with one HTML page per image that shows the image at a larger size better suited to a computer monitor. All HTML pages are linked to each other, which greatly simplifies navigation and the search for a specific image or group of images. Large image collections may be split into separate thumbnail pages, which are then summarized in a top thumbnail page.

The script is driven by a simple text configuration file that contains all relevant information, so the HTML pages and associated images can be regenerated from the original image files alone. A wide range of configuration options lets you annotate and complete the individual HTML pages with descriptive text, additional information and formatting options.


What are the strengths of createPhotoWebPages.pl?

To be honest, you won't find much here that other programs can't also do for you ;-)

Still, the script combines a number of steps and features that are probably quite unique in that combination. It supports the user during all steps required to get from the plain images to nice HTML pages. The main intention is to avoid any manual modification of the HTML code at a later stage. The script tries to support every step, from collecting images from various file locations, renaming them and, where needed, rotating them (without loss of quality!), up to annotating them with text and other information. All important parameters are controlled via a single configuration file. Changes to that file can be converted into new or updated HTML files and images at any time.


What is required to be able to use createPhotoWebPages.pl?

Of course, not everything has been created from scratch. The script, which is currently only available for Linux-based computers, requires a number of graphics utilities. The most important ones are normally part of the (newer) Linux distributions (e.g. SuSE 7.3 and above, or Debian). The missing ones can easily be downloaded from the Internet. Installation is usually no problem, as binary versions are available.

Here is a list of the utilities that are required:

The JPEG utilities and the NetPBM package are probably available in most Linux distributions (although some older distributions seem to ship older versions of the NetPBM package that lack some of the required options).
It may be worthwhile in some cases to use the improved version of jpegtran, but this is not a requirement of createPhotoWebPages.

The other utilities and fonts can be downloaded from the Web links mentioned above. In addition some of them are also included in the package.

All executable programs and utilities must be available in the search path. By default, the font is expected in the directory "~/.createPhotoWebPages/" (this can be changed, of course). See the Installation section for more details.
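
The search path requirement can be checked up front. The following is a minimal sketch, not part of createPhotoWebPages.pl; it only tests the utility names that this guide mentions by name (jpegtran, jhead, ppmcaption), so the actual list of required tools may be longer.

```python
import shutil

# Utility names mentioned in this guide; the full list of required
# tools may differ (assumption).
REQUIRED_TOOLS = ["jpegtran", "jhead", "ppmcaption"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found in the executable search path."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    for name in missing_tools(REQUIRED_TOOLS):
        print(f"not found in PATH: {name}")
```

Running the file directly prints one line per missing utility and nothing if all of them are found.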


What are the detailed features of createPhotoWebPages.pl?

This section simply lists the most important features of the Perl script. A detailed description follows later on and in the reference pages.


Quick Start

This is a brief introduction to the main features of the script. Unfortunately, the number of options is so large that we can only touch on the most important ones. Still, the quick start guide should at least enable you to use the script correctly. Explore the remaining features as you need them.

The Basics

The script supports two basic modes: web-ready mode and file system mode.

The first mode, the web-ready mode, is used whenever there is a chance that you need to move the generated files. For example, if you intend to move your HTML pages to a web server or if you want to store your HTML files on CD-ROM (without copying the original image files!) then this mode should be used. This mode solely relies on relative links within the HTML code and creates all files within a configurable target directory. If you move that directory, then the HTML pages stay operational wherever you put them.

The second mode, the file system mode, supports archiving of the original image files. The created HTML pages always stay associated with the original image files. Searching your image archive for a specific image also becomes easier, as you can now search for comments and text inside your HTML pages. In this mode all newly created images are placed next to the original image files. This makes it possible to create several image selection series that all access the same images but show them in a different context or theme. This mode also allows absolute filenames, although this is not really recommended. The recommended usage is to call the script in a place from which all target images can easily be reached with a relative path name (preferably without even using the "../" construct).

The mode is determined by a switch in the configuration file. The file system mode is chosen by default.

Step 1 : search, collect and sort images

In step one we search for image files, order them if desired, and on request add a prefix for simplified processing in all following steps. All the collected information is automatically stored in a configuration file (if it does not already exist).

The script is executed with the "-scan" option:

      createPhotoWebPages.pl -scan [-order name|date] \
                             [-rename] [-ext] \
                             [-prefix] \
                             Configurationfile \
                             Searchpaths ...

Note: square brackets denote optional parts. You may remove them or use them if required (without square brackets, of course ;-).

The configuration file does not have to exist beforehand, but it is rewritten in any case. The file extension ".txt" is recommended, as the file contains readable text. Note that file extensions that could be confused with image files are rejected, to avoid mix-ups especially when using the scan option.

The search paths may be anything that can be interpreted as a filename or directory name in Linux. For example:

      images/*.jpg
      *.JPG
      ./Meine-Bilder/Wilhelma-am-12.April.2001/
      Mein-Bild.gif

The given directory names are searched recursively for image files. Every file found is checked for a valid image filename extension. If a filename passes the ".jpg", ".jpeg", ".gif" or ".png" test, the file is added to the list of images (the extensions are case insensitive).
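
The collection step described above can be sketched as follows. This is an illustration in Python, not the script's own Perl code; the function names are hypothetical, and wildcards such as "images/*.jpg" are assumed to have been expanded by the shell already.

```python
import os

# Extensions named in this guide; the comparison ignores case.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def is_image(filename):
    """True if the filename ends in one of the accepted image extensions."""
    ext = os.path.splitext(filename)[1]
    return ext.lower() in IMAGE_EXTENSIONS


def collect_images(search_paths):
    """Collect image files from the given files and directories.

    Directories are searched recursively; plain files are accepted
    only if they pass the extension test.
    """
    found = []
    for path in search_paths:
        if os.path.isdir(path):
            for root, _dirs, files in os.walk(path):
                for name in files:
                    if is_image(name):
                        found.append(os.path.join(root, name))
        elif os.path.isfile(path) and is_image(path):
            found.append(path)
    return found
```

The result is in file system order, which matches the remark below that the order may be quite arbitrary when no "-order" option is given.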

The additional sorting and rename options are recommended if you process images that have been freshly loaded from your digital camera.

The sort option "-order" determines in which order the image files appear within the configuration file. "name" leads to an alphabetic sort according to file name. "date" leads to a date/time sort according to recording date (if available in the EXIF info, otherwise the file date is used). The "date" option normally only makes sense in connection with JPEG images for which the EXIF info entries are available (actually the "Date/Time" entry).
Note that when downloading files from a digital camera the sort according to name is normally already identical to a sort according to date.
If the "-order" option is omitted then the files are listed in the order in which the files have been collected in the file system (which may be quite arbitrary ;-).
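
A minimal sketch of the three ordering cases, assuming the recording dates have already been read from the EXIF info and are passed in as a dictionary (the parameter exif_dates is hypothetical). Sorting by the bare file name rather than the full path is also an assumption.

```python
import os


def sort_images(paths, order=None, exif_dates=None):
    """Order image paths the way the "-order" option is described.

    order      -- None (keep collection order), "name" or "date"
    exif_dates -- optional dict mapping path -> recording timestamp;
                  paths without an entry fall back to the file date
    """
    exif_dates = exif_dates or {}
    if order == "name":
        return sorted(paths, key=os.path.basename)
    if order == "date":
        def recorded(path):
            if path in exif_dates:
                return exif_dates[path]
            return os.path.getmtime(path)  # fallback: file date
        return sorted(paths, key=recorded)
    return list(paths)
```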

It is recommended to start a first rename operation right after sorting the images. This is achieved by using the "-rename" option. This option first disassembles the file name of the image and then re-assembles it following a few simple rules. In this step the filenames won't change dramatically. We just use two additional options that may be of help in later file processing.

Two of the re-naming rules can be activated as script options.
The option "-ext" converts all upper case letters in the file extension to a more commonly used lower case format.
The option "-prefix" adds a two-letter prefix to each file name (separated by an "_" character). These two-letter prefixes are sorted in ascending order (i.e. "aa_", "ab_", "ac_", ..., "az_", "ba_", ...).
The advantage of this method is that all files within a series of images are sorted by name in their original alphabetic order - even if you change the filenames later on (of course, only if you keep the prefix in all later steps - which is the default behavior of the script).
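
The prefix sequence can be sketched like this. The helper names are hypothetical, and what the script does once more than 26 x 26 = 676 images are listed is not stated in this guide, so the sketch simply raises an error in that case.

```python
import string


def prefix_for(index):
    """Return the two-letter prefix for the image at position index (0-based)."""
    letters = string.ascii_lowercase
    if not 0 <= index < len(letters) ** 2:
        raise ValueError("only 676 two-letter prefixes are available")
    return letters[index // 26] + letters[index % 26] + "_"


def add_prefixes(filenames):
    """Prepend ascending prefixes so that name order equals list order."""
    return [prefix_for(i) + name for i, name in enumerate(filenames)]
```

For example, add_prefixes(["zoo.jpg", "ape.jpg"]) yields ["aa_zoo.jpg", "ab_ape.jpg"], so an alphabetic sort keeps the original sequence.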

Newer versions of createPhotoWebPages (starting with v2.3.09) can also handle images that have the same filename (but are placed in different directories). Whenever appropriate, the script adds a numeric suffix to the filenames of newly generated files. This keeps those files distinguishable even if they are created in the same target directory.
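
The idea of a distinguishing numeric suffix can be illustrated as below. The exact suffix format used by the script is not documented here; the "_1", "_2" pattern and the function name are assumptions made for this sketch only.

```python
import os


def unique_names(paths):
    """Derive target file names that stay unique within one directory.

    The first occurrence keeps its name; later files with the same
    name get a numeric suffix before the extension (assumed format).
    """
    used = set()
    result = []
    for path in paths:
        stem, ext = os.path.splitext(os.path.basename(path))
        candidate = stem + ext
        counter = 1
        while candidate in used:
            candidate = f"{stem}_{counter}{ext}"
            counter += 1
        used.add(candidate)
        result.append(candidate)
    return result
```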

Done. Step one is now completed. We have created an example configuration file in which all of our images are listed. Besides, the configuration file also contains default entries for nearly every possible entry. Step two is now going to adapt and change a few of those entries in order to meet our expectations.

Note that, up to now, no image files or HTML files have been created. This is done by intention - and is left for step three.

TIP: as an experienced user you may start with a default configuration file that already contains your preferred configuration entries (and normally no image entries, of course). These entries are read in first, so they are already correct in the configuration file that is written after the scan operation. If the configuration file does not exist, the script uses default entries.

Step 2 : first settings

Before generating images and HTML pages for the first time, take a look at the configuration file. At least the basic mode (i.e. web-ready or file system mode) should be set to the correct value; otherwise files might be created in a place where you don't want to have them.

The general composition of the configuration file is as follows. The file consists of comments and single-line configuration entries (of course, empty lines are also allowed ;-).
All comments start with a "#" character (leading blanks and tabulators are allowed).
All valid entries follow the pattern:

KEYWORD = Value

The "KEYWORD", composed of upper case letters, digits and underscores, describes what the entry is for, and the "Value" describes what to do. The value always ends at the end of the line. Blank and tab characters may appear at the start or end of the line and on either side of the "=" sign; they are always ignored.

Here are a few examples:

AUTHOR=Stefan Spaeth
COPYRIGHT = Copyright © 2002 Stefan Spaeth
WEB_READY= 1
TARGET_DIR=./test/
HTML_BG_COLOR =#ffffdd
HTML_TEXT_ALIGN=center
THUMBNAIL_SIZE=128
THUMBNAIL_DIR=thumbnails

The length of the keywords varies and the value may take very different formats. Simple switches rely solely on the values 0 and 1, some values are numbers or text words, and others may be composed of full sentences.
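
The line format described above is simple enough to parse with one regular expression. The following is a minimal sketch of such a parser, not the script's own code; in particular, raising an error for a line that matches neither pattern is an assumption, since this guide does not say how the script treats invalid lines.

```python
import re

# KEYWORD = Value, with optional blanks/tabs around both parts.
ENTRY = re.compile(r"^[ \t]*([A-Z0-9_]+)[ \t]*=[ \t]*(.*?)[ \t]*$")


def parse_line(line):
    """Return (keyword, value) for an entry, or None for comments and empty lines."""
    line = line.rstrip("\r\n")
    stripped = line.strip(" \t")
    if not stripped or stripped.startswith("#"):
        return None
    match = ENTRY.match(line)
    if match is None:
        raise ValueError(f"not a valid entry: {line!r}")
    return match.group(1), match.group(2)


def parse_config(lines):
    """Parse all lines and return the entries in file order."""
    entries = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            entries.append(entry)
    return entries
```

Note that a "#" inside a value, as in the HTML_BG_COLOR example, is kept as part of the value because only lines that start with "#" count as comments.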

Watch out for these parameters:

The configuration file entries fall into two groups: configuration parameters and image parameters. The latter are normally placed at the end of the file and relate to individual images. Some of the configuration parameters should be adapted in any case.

The entries for "TITLE", "EMAIL", "AUTHOR" and "COPYRIGHT" should be modified - otherwise the script author gets the copyright on your images ;-).
"WEB_READY" and "TARGET_DIR" are important as well. WEB_READY=1 selects web-ready mode, and WEB_READY=0 selects file system mode. "TARGET_DIR" is only relevant in web-ready mode, where it denotes the target directory for images and HTML files.

The image parameters are grouped into several entries per image, and each group belongs to a single image. An "IMAGE_PATH" entry serves as the separator between groups: all "IMAGE_..." entries in front of it belong to the image described by that "IMAGE_PATH" entry.
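
The grouping rule can be sketched as follows, taking a list of (keyword, value) pairs in file order. The function name is hypothetical, and the sketch deliberately ignores "SUBTITLE" handling and simply drops trailing "IMAGE_..." entries that are not closed by an "IMAGE_PATH" entry; how the script treats those cases is not described here.

```python
def group_images(entries):
    """Split (keyword, value) pairs into general settings and per-image groups.

    Every IMAGE_PATH entry closes a group: the IMAGE_... entries
    collected since the previous IMAGE_PATH belong to that image.
    """
    config = {}
    images = []
    pending = {}
    for key, value in entries:
        if key == "IMAGE_PATH":
            pending[key] = value
            images.append(pending)
            pending = {}
        elif key.startswith("IMAGE_"):
            pending[key] = value
        else:
            config[key] = value
    return config, images
```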

At this point, if desired, we could already start to group or reorder the images. To collect several images on a separate thumbnail page, place their entries together and put a subtitle ("SUBTITLE" entry) in front of the whole group. All these images are then put together on a separate thumbnail HTML page.

In addition, all the different groups may be combined to form a top thumbnail page. This is done by setting the HTML_CREATE_TOP switch to the value "1".

This is all you need to know in the beginning. You may want to limit yourself to adapting the "WEB_READY" entry for now. All entries are described in detail in the reference section.

Step 3 : create a draft version of your HTML pages

Now, for the first time, we are going to create images and HTML files. Many comments and text entries in the configuration file may still not be final, but it is much easier to adapt these after looking at the first draft of the HTML pages.

If the only thing you want to do is to create the HTML pages and images then the script can be executed without any further options. However, note that the initial creation of the thumbnails and other images may take a while (an old Pentium I is likely overloaded, a Pentium II at 350MHz and above is more likely to meet your expectations). If you think the calculation times are still way too long, don't worry. The script tries to avoid regeneration of images and builds on existing images whenever possible.

Enter the following command to start the script:

      createPhotoWebPages.pl Configurationfile

All that needs to be done is to enter the name of the configuration file. Informational messages and error messages are written to the screen and most of them are also written into a log file (by default: file name = "Configurationfile.log")

If you feel like you want to get the full set of messages onto your screen then use the "-verbose" option. In this case all messages and information about each step of the script execution are written to the terminal window.

      createPhotoWebPages.pl -verbose Configurationfile

Note that, by default, the script never overwrites any file that is associated with a specific image. On the contrary, the main goal is to avoid doing the same step twice at all cost.
File generation and rewriting can be controlled in more detail with the options "-html" and "-images". They limit file generation to either HTML page files or image files (or both, if both options are given). The "-force" option, which may be used alone or together with "-html" and/or "-images", enforces the regeneration of files even if the script concludes that a file is still up to date.
The thumbnail pages are always regenerated (unless the generation of HTML pages is globally disabled using the "-images" option).
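
One way to read the rules above is as a single decision per file. The sketch below is an interpretation of this paragraph, not the script's actual logic; the function and parameter names are hypothetical, and how the script determines that a file is "up to date" is not covered.

```python
def should_write(kind, exists, up_to_date,
                 html=False, images=False, force=False,
                 thumbnail_page=False):
    """Decide whether a file is (re)generated in this run.

    kind           -- "html" or "image"
    exists         -- the target file is already present
    up_to_date     -- the script considers the existing file current
    html, images   -- the "-html" / "-images" options
    force          -- the "-force" option
    thumbnail_page -- the file is a thumbnail HTML page
    """
    # With neither option given, both kinds of files are handled.
    selected = (
        (not html and not images)
        or (kind == "html" and html)
        or (kind == "image" and images)
    )
    if not selected:
        return False
    if thumbnail_page or force:
        return True
    # Default: create missing or outdated files, never redo finished work.
    return not (exists and up_to_date)
```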

The HTML pages are ready to be viewed more or less immediately (even long before the image generation is finished) as they are created first.
The starting HTML page file (or homepage file) is by default named either index.html or index_0.html, depending on whether HTML_CREATE_TOP=1 is set or not. The homepage file is either located in the directory where the script is executed (WEB_READY=0), or it can be found in the directory specified by the "TARGET_DIR" entry (WEB_READY=1).
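
Where to look for the start page can be sketched as below. This assumes that "index.html" goes with HTML_CREATE_TOP=1 and "index_0.html" with the switch unset, which is the natural reading of the sentence above but is not stated explicitly; the function name is hypothetical and the default file names are used.

```python
import posixpath


def homepage_path(config, cwd="."):
    """Return the expected location of the start page.

    config -- dict of configuration entries (string values)
    cwd    -- directory in which the script was executed
    """
    # Assumed mapping: the top thumbnail page is index.html,
    # otherwise the first thumbnail page index_0.html is the start page.
    if config.get("HTML_CREATE_TOP") == "1":
        name = "index.html"
    else:
        name = "index_0.html"
    if config.get("WEB_READY") == "1":
        directory = config.get("TARGET_DIR", cwd)
    else:
        directory = cwd
    return posixpath.join(directory, name)
```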

The filename as well as the filename extension of the HTML files are configurable via entries in the configuration file.

Step 4 : rename, comment and rotate your images

Once the HTML files and images are generated, the result can be viewed in any Web browser. At the same time you can already start to make the first modifications to the configuration file.

This step is used to take a look at the HTML pages and then to make the following modifications (if needed).

All that needs to be done is to modify the configuration file and to rerun the script using the "-rename" option.

For example, if you decide that a certain image should be rotated by 90 degrees (counter-clockwise) then simply set the associated "IMAGE_ROT" entry to a value of 90.
IMAGE_ROT = 90
After a rename run the value is automatically reset to 0, since the original image is now correctly oriented. Note that JPEG images are rotated without loss of quality, and that the complete EXIF information (exposure time, recording date, etc., if present at all) is left untouched.
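
For illustration, a lossless rotation of this kind can be expressed as a jpegtran call. The sketch below only builds the command line and does not run it. That the script invokes jpegtran in exactly this way is an assumption; the function name is hypothetical, and accepting 180 and 270 as IMAGE_ROT values is assumed as well. Since jpegtran's "-rotate" switch turns clockwise, a counter-clockwise angle has to be converted first.

```python
def jpegtran_rotate_command(source, target, image_rot):
    """Build a jpegtran command for a counter-clockwise rotation.

    image_rot -- counter-clockwise angle as in IMAGE_ROT (90, 180 or 270)
    """
    if image_rot not in (90, 180, 270):
        raise ValueError("rotation must be 90, 180 or 270 degrees")
    clockwise = (360 - image_rot) % 360  # jpegtran rotates clockwise
    return [
        "jpegtran",
        "-rotate", str(clockwise),
        "-copy", "all",        # keep extra markers such as EXIF data
        "-outfile", target,
        source,
    ]
```

The returned list can be passed to subprocess.run if you want to try it on a copy of an image.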

Image files (and all associated image and HTML files) are renamed by modifying the "IMAGE_NAME" entry. Its default value is extracted from the existing file name. Whenever this entry is changed and a rename run is executed (i.e. the "-rename" option is used), a new filename is generated from this entry according to a fixed set of rules.

Due to the fact that the "IMAGE_NAME" entry may later on be used as a basis for a new image filename it should be kept short and precise. Each rename run will look for modifications in the "IMAGE_NAME" entries and will perform a file renaming if needed.

It is, of course, advisable to also modify the caption of the image and to add descriptive text to each image. The "IMAGE_CAPT" and "IMAGE_TEXT" entries need to be modified for that purpose.
Note that in all of these cases it is allowed to make use of all HTML formatting elements (e.g. &uml;, <i>, ..). The "IMAGE_NAME" entry, however, should be limited to simple HTML-entities.

Finally, to put all of these changes into effect, we simply have to rerun the script (using the "-rename" option).

      createPhotoWebPages.pl -verbose -rename Configurationfile

The script makes sure that all existing generated images and HTML files are renamed whenever the original image file is renamed. Note that existing files are not re-created unless you specify the "-force" option.

This procedure may be repeated as often as necessary. It is therefore possible to work in an incremental way.

The "IMAGE_DEL" entry automatically removes an image entry and all associated files and links. The original image file is, of course, not deleted (even though it may have a different name after a rename run).
To delete an image (for which HTML or image files have already been created), simply add the entry "IMAGE_DEL = 1". The Perl script makes sure that all associated files, configuration file entries and links are safely removed. The image entries may also be deleted directly in the configuration file. In that case, any associated HTML or image files that already exist may remain in place.

Step 5 : applying changes

Further changes can easily be applied at any later time. Simply rerun the script and check whether your changes have been applied as desired. In most cases the "-scan" and "-rename" options won't be required again (unless you want to add new files or rotate images).
It is important to remember that basically only the thumbnail HTML pages are regenerated. Use the "-html" and/or "-images" options together with the "-force" option to re-create existing files. Alternatively, it is often easier to manually delete all or some of the existing (automatically created) files. When the script is rerun, a file is automatically re-created if it cannot be found.


Installation

The installation is quickly done. The script itself consists of only a single file. Beyond that, you only need the utility programs already mentioned. Both must, of course, be found in the (executable) search path and must have execute permission.

Inside the script there is a global variable $installDir, which defaults to "~/.createPhotoWebPages". This directory needs to be created, and the example font gamow-10-m-r-sc.bdf must be copied into it (note: this font is only required if you intend to use the "IMG_IMPRINT_DEFAULT" or "IMAGE_IMPR" entries).

You may, of course, change this perl variable if you want to use a different directory.

Future versions are going to place initial or global configuration parameters inside this directory. Localization information is also going to be added there (currently, the script uses German texts, and the user has to set the global Perl variable $UseEnglishText to 1 to get English texts).

An example installation may look like this:

The directory "~/perl" contains executable perl scripts. Furthermore the directory "~/bin" contains all other executable programs and utilities.
"createPhotoWebPages.pl" is copied to the "~/perl" directory and "jhead", as well as "ppmcaption" are copied to "~/bin". The JPEG and PBM utilities are already preinstalled in the Linux "SuSE 7.3" distribution and are already available.

In addition, the file "~/.bashrc" contains the path variable entry
export PATH=/home/myuser/bin:/home/myuser/perl:$PATH
This is all that needs to be done to be able to use the script.

Hints for Linux-Newbies:



Last change: ,
Stefan Spaeth, S.Spaeth@z.zgs.de
Copyright © 2002 Stefan Spaeth