The publishing engine is an important component of content management systems (CMS). It generates all the content seen by site visitors, and the capabilities of the engine strongly influence the design and features of the published site.
There are two main publishing models used by content management systems: dynamic and batch publishing, and each has its strengths and weaknesses.
It is important to understand the issues around these publishing models when evaluating a CMS, to ensure that the product selected meets the requirements of your organisation and suits its technical environment.
This article explores each publishing model in depth, and outlines a set of business requirements which can be used to assess the publishing capabilities of content management systems.
Dynamic publishing
In this model, pages are published directly out of the CMS repository ‘on demand’. When a page is requested by a user, the content is merged with the page layout templates, and any dynamic additions are made.
This publishing model is most notable for the tight integration of the content management system and the published website. It is this integration, and the dynamic creation of pages it supports, which provides many powerful features and advantages. It is, however, not without its problems and issues.
(Note: this publishing model is also referred to as ‘real-time’ publishing.)
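The on-demand assembly described above can be sketched in a few lines. This is only an illustration, not how any particular CMS works: the in-memory `REPOSITORY`, the `LAYOUT` template and the `render_page` function are all hypothetical names, and a real system would replace the dictionary lookup with database calls.

```python
from html import escape
from string import Template

# Hypothetical stand-ins for the CMS repository and a page layout template.
REPOSITORY = {
    "about": {"title": "About us", "body": "We publish things."},
}
LAYOUT = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p>$extras</body></html>"
)


def render_page(page_id, user=None):
    """Assemble a page at request time: fetch content, merge it with the layout."""
    content = REPOSITORY[page_id]  # a real CMS would query its database here
    # A small 'dynamic addition': greet the visitor if they are logged in.
    extras = f"<p>Logged in as {escape(user)}</p>" if user else ""
    return LAYOUT.substitute(
        title=escape(content["title"]),
        body=escape(content["body"]),
        extras=extras,
    )
```

Because the page is built on every request, a change saved to `REPOSITORY` appears the next time `render_page` is called, with no separate publishing step.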
Strengths of dynamic publishing
It is the well-recognised strengths of dynamic publishing which make it an attractive option:
This is typically what gets people excited when they see dynamic publishing: you hit ‘save’ on a topic, refresh the site, and there it is!
This publishing model ensures that what is seen is always the latest version, and that changes to the content are instantly reflected in the published site.
This benefit should not be under-estimated, particularly when implementing more complex sites with workflow.
The tight integration between the published site and the content management system allows for a range of powerful features.
Content on the site can be dynamically filtered based on the user, section of the site, or other rules.
Navigation reflects the structure of the site from moment to moment, and ‘related topics’ are always up to date.
Dynamic publishing is also very well suited to personalisation and role-based customisation.
Dynamic publishing makes it easy for a CMS to provide ‘in-context’ editing, where a user simply browses to the page on the published website, and presses the ‘edit’ button at the bottom of the page.
While this is not the only option possible, in-context authoring is increasingly being seen as a good way of supporting the needs of infrequent and non-technical authors.
Versioning and ‘back in time’
Extensive use of database capabilities is required to support dynamic publishing approaches, and this provides a number of other benefits.
By tracking everything in a database, it becomes straightforward to provide powerful versioning and archiving.
Dynamically-published content management systems also commonly provide the ability to view the site ‘back in time’, as it appeared on a specified date. This is very useful in addressing legal and record keeping compliance issues.
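As a rough sketch of how a ‘back in time’ lookup might work, assume each page keeps a date-ordered list of its versions. The `HISTORY` data and the `content_as_of` function below are hypothetical, and a real CMS would hold this history in its database rather than in a list.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical version history for one page: (effective date, content),
# kept sorted by date.
HISTORY = [
    (date(2023, 1, 10), "First draft of the policy."),
    (date(2023, 6, 1), "Revised policy."),
    (date(2024, 2, 15), "Current policy."),
]


def content_as_of(history, when):
    """Return the version in force on `when`, or None if no version existed yet."""
    dates = [effective for effective, _ in history]
    index = bisect_right(dates, when)  # number of versions dated on or before `when`
    return history[index - 1][1] if index else None
```

Calling `content_as_of(HISTORY, date(2023, 12, 31))` would return the revised policy, since that was the latest version published on or before that date.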
One of the most interesting uses of dynamic publishing is to blur the lines between intranets, websites and extranets.
Some vendors are now promoting a model of having a single ‘blended’ site, whether for internal or external users. The CMS simply keeps track of who is logged in, and filters the published pages accordingly.
In this way, internal users might see additional menu items, while external visitors are only shown a cut-down home-page.
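The filtering behind such a ‘blended’ site might look something like the sketch below. The `MENU` structure, the `audience` labels and the `visible_menu` function are assumptions made for illustration only; real products will model users and permissions in their own ways.

```python
# Hypothetical menu definition: each item is tagged with its intended audience.
MENU = [
    {"label": "Home", "audience": "public"},
    {"label": "Products", "audience": "public"},
    {"label": "Staff directory", "audience": "internal"},
    {"label": "HR policies", "audience": "internal"},
]


def visible_menu(menu, is_internal_user):
    """Return the menu labels that the current visitor is allowed to see."""
    allowed = {"public", "internal"} if is_internal_user else {"public"}
    return [item["label"] for item in menu if item["audience"] in allowed]
```

An external visitor would be shown only the two public items, while a logged-in staff member would see all four.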
Weaknesses of dynamic publishing
There are a number of weaknesses inherent in dynamic publishing, including:
Performance and resource requirements
The resource-intensive nature of this publishing model is by far the greatest issue that needs to be recognised. Each time a visitor accesses a page, dozens (or even hundreds) of database calls are made to assemble the page.
This generally leads to the requirement for fast webservers, with large amounts of memory and expensive hard drives.
Moreover, as the usage of the site grows, so do the resource demands of the CMS. Fairly rapidly, dynamically-published sites start to require multiple servers in load-balanced configuration, with the databases themselves stored on still more servers.
This should not be underestimated. For some systems, even an intranet needs to be served by a pair of dual-processor machines configured to their maximum capacity.
In practice, once the systems start to scale to this level, the CMS provides resource-saving mechanisms such as caching, which reduce the load on the servers. While these help, they do not entirely eliminate the resource demands, and introduce their own complexities.
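A minimal page cache illustrates both the saving and the added complexity. The `PageCache` class below is a hypothetical sketch rather than any vendor's mechanism: it stores each assembled page after the first request, and relies on the CMS calling `invalidate` whenever content is saved.

```python
class PageCache:
    """Keep assembled pages in memory so repeat requests skip the expensive render."""

    def __init__(self, render):
        self._render = render  # the expensive page-assembly function
        self._pages = {}
        self.renders = 0       # counts how often the expensive path was taken

    def get(self, page_id):
        if page_id not in self._pages:
            self._pages[page_id] = self._render(page_id)
            self.renders += 1
        return self._pages[page_id]

    def invalidate(self, page_id):
        """Must be called when a page is edited, or visitors will see stale content."""
        self._pages.pop(page_id, None)
```

The complexity lies in the invalidation: if an edit is saved without the matching `invalidate` call, the cache keeps serving the old page, and the ‘instant update’ benefit of dynamic publishing is lost.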
For all its strengths, dynamic publishing is certainly the most technically complex approach.
This may impact upon the stability of the software, or the organisation’s ability to maintain and customise the CMS without having to rely on the vendor.
Hosted environments and security
Dynamic publishing requires some (or all) of the content management system to be installed on the webserver. If the webserver is being run within the firewall (such as for an intranet), this may not be an issue.
Where it does become more problematic is where the corporate website is hosted externally, using a third-party provider.
In this situation, the external provider may have difficulty installing the CMS, or may be reluctant to do so. It is also more complex in terms of communicating between systems, and generally managing the IT infrastructure.
There may also be security issues in having the content management system sitting outside the firewall, particularly in larger corporate sites.
In general, having the CMS tightly integrated with the published website increases the potential for security vulnerabilities and issues.
Cost and licensing
The resource-hungry nature of dynamic publishing increases costs in terms of both hardware and software.
Beyond this, there may be implications due to the licensing model itself. If multiple copies of the CMS are required (either for load balancing, or for intranet and website), the license may require multiple purchases of the core CMS software.
This can quickly multiply the costs, and convert a relatively inexpensive CMS into a much more costly option.
Search engine and usage statistics
When pages are dynamically-created ‘on the fly’, issues arise with software originally designed to handle websites consisting of static pages.
Most importantly, public search engines such as Google and Yahoo have well-documented issues with searching fully-dynamic sites, and at worst, it may mean that published sites don’t appear in these search engines at all.
Within an organisation, there may be a requirement to obtain a search engine that has specific support for the content management system, or obtain a CMS that has a search engine built in.
Off-the-shelf packages for tracking usage statistics may also have problems with reporting on dynamic sites, and it may be necessary to develop additional integration code, or use the statistics features provided within the CMS itself.
The dynamic publishing model often leads to sites having URLs like the following:
There are a number of usability issues with such URLs, and they also cause problems with the major search engines such as Google and Yahoo (as indicated earlier).
When evaluating dynamically-published content management systems, explore whether they provide an option for ‘human-friendly’ URLs in addition to the solely-numbered pages.
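One common way to produce ‘human-friendly’ URLs is to derive them from section and page titles. The sketch below is an assumption about how this could be done, not a description of any specific CMS; `slugify` and `friendly_url` are hypothetical helper names.

```python
import re
import unicodedata


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    # Strip accents by decomposing characters and dropping the non-ASCII marks.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return "-".join(words)


def friendly_url(section, title):
    """Build a readable path from a section name and a page title."""
    return f"/{slugify(section)}/{slugify(title)}"
```

A CMS taking this approach would still need to keep a mapping from each readable path back to the page's internal identifier, and to handle two pages whose titles produce the same slug.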
Batch (static) publishing
In this publishing model, pages are published as part of a ‘batch’ process that scans through the content repository and generates updated pages. These are delivered to the webserver in the form of ‘static’ or ‘flat’ HTML pages.
While this is often viewed as the ‘low-tech’ approach, the comparative simplicity of this model makes it suitable in many situations.
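The batch process can be pictured as a simple loop over the repository that writes one HTML file per page. The sketch below assumes a hypothetical repository held as a dictionary of pages, each with a `title` and `body` that are already safe to place in HTML; `LAYOUT` and `publish_all` are illustrative names only.

```python
from pathlib import Path
from string import Template

# Hypothetical page layout shared by every published page.
LAYOUT = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def publish_all(repository, output_dir):
    """Scan the repository and write each page out as a static HTML file."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for page_id, content in repository.items():
        target = output_dir / f"{page_id}.html"
        target.write_text(LAYOUT.substitute(content), encoding="utf-8")
        written.append(target)
    return written
```

The resulting directory contains nothing but flat files, which is what allows it to be served by any webserver without the CMS being present.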
Strengths of batch publishing
Batch (static) publishing provides a number of important benefits:
Modern webservers are able to deliver static web pages to hundreds of simultaneous visitors on relatively modest hardware.
By publishing static pages, performance is thereby greatly increased, making this publishing model the most suitable for high-load sites.
It is also easier to estimate required hardware and software, and to manage spikes in usage.
Static content is also easy to cache, using existing (and well-tested) software solutions.
Batch publishing systems produce standard web pages. This has the big advantage that all tools designed to operate on web content are equally valid for pages published from the CMS.
This means that standard search engines can ‘spider’ the content, and off-the-shelf web statistics packages operate as normal. This makes it easier to integrate the CMS with existing processes and applications.
Hosted or production environments
Batch publishing is well suited to situations where the website is hosted externally with a third-party provider. Since the published files are static HTML, they can be uploaded onto the remote server via standard methods such as FTP or SSH.
Using a batch publishing model can therefore minimise, or eliminate, any impact upon the production server environment.
Security can be greatly improved by keeping the content management system entirely behind the firewall, and simply publishing static pages through to the production site.
In this way, all the code sits internally (along with all draft content), with the production website containing nothing but the approved content.
Content management systems are complex packages, with many possible sources of bugs or crashes.
In the dynamic model, failure of the CMS leads to failure of the website. In contrast, batch publishing ensures that the failure of the CMS doesn’t affect the site (only the ability to manage it internally).
Supports remote users
In larger organisations, there are often users who do not have easy or fast connections to the corporate network. These may be staff working in the field, or even located on sites such as ocean-going ships.
In these situations, it is useful to be able to provide a ‘cut’ of the site, that can be distributed on CD. The batch publishing model is well-suited to creating such outputs.
May still provide ‘dynamic’ features
A batch publishing model isn’t restricted to producing just ‘plain’ HTML pages. Instead, pages can be published that incorporate scripting languages such as PHP, ASP or JSP.
In this way, while the content itself is static, this can be enhanced through additional server-side scripting to add required dynamic features, such as personalisation, online polls, fine-grained security, etc.
This is a potentially very powerful approach that can combine the best of both worlds: the performance and simplicity of batch publishing, with elements of dynamic presentation.
Simplifies management of publishing
In practice, there may be good reasons to restrict publishing to a specified set of times and days. This simplifies the management of releases, and makes it easier to conduct quality assurance on the released material.
By providing a more strict publishing model, internal business processes may actually be enhanced.
Weaknesses of batch publishing
Batch publishing suffers from a number of weaknesses, including:
Fundamentally, batch publishing of content is a simpler mechanism that often leads to fewer features being offered by the CMS.
Batch publishing may limit the amount of content reuse, personalisation, ‘back in time’ features, content assembly, and the like.
This makes batch publishing more suitable for situations where ‘plain’ web content is being managed and published in a relatively straightforward way.
Content management systems using a batch publishing model must allocate time to create the published versions. This may be done:
- each night
- at scheduled times during the day
- on demand
Depending on the publishing needs within the organisation, a scheduled publishing timetable may limit flexibility.
Some content management systems may take up to several hours to re-publish an entire site. This may cause problems in recovering from webserver crashes, or when content needs to be rapidly updated.
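One way a batch system might avoid a full rebuild is to republish only the pages whose content has changed since the last run. The sketch below is an assumption for illustration, not a feature of any named product: it compares a hash of each page's current content against the hash recorded at the previous publish, using the hypothetical functions `fingerprint` and `pages_to_republish`.

```python
import hashlib


def fingerprint(content):
    """Return a stable hash of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def pages_to_republish(repository, published_fingerprints):
    """List the pages that are new or changed since the last publishing run."""
    changed = []
    for page_id, content in repository.items():
        if published_fingerprints.get(page_id) != fingerprint(content):
            changed.append(page_id)
    return changed
```

This simple comparison only catches changes to a page's own content. A change to a shared template or to site navigation would still require every affected page to be regenerated.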
In practice, content management systems may offer a mix of the previously-outlined approaches.
Some systems use dynamic publishing internally to support authoring and workflow processes, and then publish a static ‘snapshot’ to the production webservers.
Others install a small component of the CMS on the webserver (often in the form of PHP or Java code) that accepts updates from the repository in the form of XML. A modest amount of ‘dynamic’ publishing is then used to apply stylesheets and page layouts in real-time.
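The webserver-side component in this second approach could be as small as the sketch below, which parses an XML update and applies a page layout at request time. The XML structure (`page`, `title`, `body`) and the `render_update` function are assumptions made for illustration, and the example is written in Python rather than the PHP or Java mentioned above.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_update(xml_text):
    """Apply a fixed page layout to content received from the repository as XML."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    body = root.findtext("body", default="")
    # Escape the text so that content cannot break the surrounding markup.
    return (
        f"<html><body><h1>{escape(title)}</h1>"
        f"<p>{escape(body)}</p></body></html>"
    )
```

Only this thin rendering layer runs on the production server; the repository, authoring tools and workflow remain inside the organisation.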
Like all other aspects of content management systems, this is an area still being innovated by CMS vendors, and features are constantly changing.
While this article has looked extensively at the two publishing models, these should not be the key selection criteria when evaluating systems.
Instead of focusing on the technical aspects of how the systems work, it is more useful to look at how the content management system will meet your business needs.
The following section may provide a useful starting point for the business requirements surrounding publishing models.
Details for tender
The vendor should outline the publishing model used by the CMS, whether ‘batch’ or ‘dynamic’ publishing, or a hybrid of the two.
Vendors should also specify:
- how the publishing process operates
- the specific strengths and features offered by the publishing system
- what user actions are required to initiate publishing of updated content
- what software must be installed onto the production webserver
- how the publishing system would operate in the IT environment as it currently exists
- how rapidly changes can be published to the site
- whether individual pages can be published to the site without requiring a full site rebuild
- whether content can be published ‘on demand’, or whether it must be scheduled
- the resources (hardware and software) required to support the publishing system
The vendor should also provide sufficient technical details for an evaluation of suitability to be made by the group responsible for maintaining the IT infrastructure.
There are two main publishing models: dynamic (real-time) and batch (static).
Dynamic publishing offers the greatest features, and is the more powerful of the two options. The down-side is that it is resource-hungry, potentially complex, and may be difficult to integrate into existing environments.
Batch publishing is the simpler option that produces ‘plain’ HTML that is hosted on a webserver in the traditional way. Although this is often seen as the poorer cousin of dynamic publishing, it does scale much better as usage increases, and may be the best fit for straightforward website needs.
Regardless of the publishing model being considered, organisations should focus on their business requirements, and assess how well each CMS’s specific publishing mechanisms meet these needs.