WARP: Web Accessibility Reporting Project Ireland 2002 Baseline Study

Dr. Barry McMullin

Dublin City University

4 November 2002


Executive Summary

If anybody asks me what the Internet means to me, I will tell him without hesitation: To me (a quadriplegic) the Internet occupies the most important part in my life. It is my feet that can take me to any part of the world; it is my hands which help me to accomplish my work; it is my best friend--it gives my life meaning.

- Dr. ZhangXu (Zhangxu & Aldis, 2001)

The technology of the Internet holds tremendous promise to significantly improve access to information, goods, and services for many people with disabilities. Properly engineered web sites can interoperate with dedicated assistive technologies to flexibly address a wide range of disabilities. Almost overnight, it has become possible for a blind person to read papers, magazines and books, without assistance, and on the same day they are published; for a person with impaired mobility to shop for groceries, and pay her bills without even leaving home; for a deaf student to attend a "virtual" lecture, with sub-titling and text transcripts.

The key is in the design of web sites so that they facilitate--rather than obstruct--access by users with disabilities. This is not rocket science: the basic requirements have been codified since 1999 in the Web Content Accessibility Guidelines (WCAG) 1.0 published by the World Wide Web Consortium (W3C, 1999), and more recently endorsed by the European Commission (2001) and the Irish National Disability Authority (2002). These define a series of checkpoints which, if satisfied by a web site, will ensure that it has a high likelihood of being accessible to the widest possible variety of users. This is good for the disability community; but it is also good for the general community of web users: it is well established that universal design frequently results in products and services that are more usable for all. And in the world of the web, where another site is only ever a key-click or a button-press away, improved usability must be a key priority for all web site operators. It is the proverbial win-win situation.

Nonetheless, though this is the promise: it is already under threat.

WCAG defines three conformance levels: WCAG-A is a minimum standard which a site must meet to be considered accessible for any significant disability groups; WCAG-AA is a "professional practice" standard, which all sites should meet to be accessible to a broad range of disability groups; finally WCAG-AAA is a "gold standard" of maximum accessibility which some sites may choose to aim for--for example, sites with a particular remit to serve disability communities.

This report documents a study of 159 separate web sites, operated by Irish organisations, and spanning a wide range of activities, information, and services. These were assessed for a set of characteristics correlated with the WCAG guidelines. This set is not exhaustive: it cannot determine that any site positively meets the guidelines; but failure on any of these tests definitively demonstrates failure against the guidelines.

The key results are that, of this sample, at least 94% failed to meet even the minimum WCAG-A standard; and 100% failed to meet the professional practice WCAG-AA standard. Furthermore, at least 90% of sites failed to meet minimal conformance with generic technical standards for web interoperability.

This should be a "wake-up call"--for government, for public agencies, for private companies, organisations and individuals. It signals that, despite Ireland's justifiable pride in its economic and technological development, despite very laudable goals in documents such as the E-Europe Action Plan (European Commission, 2000,2001), the current commitment to accessibility of the Irish web for users with disabilities is, at best, aspirational--and, at worst, cynically inadequate.

On the other hand, this study also indicates that significant progress could be made quickly, and with comparatively little effort--if decision makers will recognise the significance of the issue and take action accordingly. Detailed analysis of the survey data identifies a number of specific, pervasive, web design flaws which can significantly obstruct accessibility by users with disabilities; but which could be drastically reduced or eliminated if web site designers and content authors were simply provided with appropriate tools and training.

The report discusses both the strengths and limitations of its particular methodology; and lays out a programme of further work to extend, clarify, and refine our ability to monitor the evolving state of Irish web accessibility--as an essential tool for informed policy formation. Now, of course, there is a need for leadership; for public policy initiatives, particularly in education and training; and for concrete incentives and supports to organisations wishing to improve services for users with disabilities.

Finally, there must surely also be a role for legislation and regulation--to ultimately guarantee and vindicate the rights of all citizens to equal treatment in a digital democracy.

1 What is Web Accessibility?1

The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect...

- Tim Berners-Lee, W3C Director and inventor of the World Wide Web. (W3C, 1997)

The explosive arrival of the Internet/World Wide Web2 as a mass communication medium makes it a little difficult to see it yet in full perspective, or to properly gauge its impact. For most of us it is at least a somewhat useful tool, giving more immediate access to many information sources and services. It is also often awkward and frustrating--with slow responses, complex and confusing interactions, incompatible software requirements, and all the other characteristics of any immature technology.

But for many people with disabilities the web is more than just another technological toy--it can offer the potential for significant improvement in their access to products and services that the more abled community take for granted. Here are a few brief examples:

Other more elaborated and detailed scenarios are available in an illuminating guide to How People with Disabilities Use the Web (W3C, 2001). However, as that document also makes clear, while the web opens up very promising new possibilities for supporting and including users with disabilities in the benefits of the emerging information society, these outcomes are by no means automatic: they rely on the services and facilities being designed in an inclusive way that supports and facilitates all categories of users.

To give a more familiar example: wheelchairs are an assistive technology with the potential to dramatically improve the mobility of people with specific physical disabilities. However, if buildings are designed without ramps or accessible lifts and doors, then the effectiveness of a wheelchair is, of course, dramatically reduced.

In the same way, assistive technologies such as speech synthesizers, alternative keyboard and pointing devices, voice recognisers etc., can all help people with various disabilities to access computer devices in general, and to access the Internet in particular. But if web services are not designed in such a way as to facilitate and inter-operate with these assistive devices, then their benefit may be sharply reduced.

Here is a simple--and pervasive--example. Many web sites are enriched with small graphical images or "icons" for various functions, such as searching and navigation. This immediately presents a potentially serious obstacle for a blind user, who, of course, cannot see these icons. While a speech synthesizer can automatically read out any text on the page, it cannot in general recognise or describe arbitrary images in any meaningful way. Fortunately there is a relatively trivial way of resolving this. The underlying HTML format of web pages makes provision for attaching "alternative" or ALT text to images. This is generally hidden if the image is displayed or visible to the user; but it can be picked up and spoken by the speech synthesizer for a blind user. The extra design effort involved is minimal--but dramatically improves the usability of the site for the affected users.3
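The ALT text requirement described above lends itself to a simple mechanical check. The following is a minimal sketch, using only the Python standard library, of how a page might be scanned for images that lack an ALT attribute. The class and function names are illustrative assumptions and are not part of any tool used in this study.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # An empty alt="" can be legitimate for purely decorative images,
        # so only a completely absent attribute is reported here.
        if "alt" not in attr_map:
            self.missing.append(attr_map.get("src", ""))


def images_missing_alt(html):
    """Return the src values of all images in `html` with no alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Illustrative page fragment: one icon with ALT text, one without.
sample = (
    '<a href="/search"><img src="search.gif" alt="Search"></a>'
    '<a href="/home"><img src="home.gif"></a>'
)
print(images_missing_alt(sample))
```

A check of this kind can only establish that ALT text is absent; whether the ALT text that is present is actually meaningful still requires human judgment.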

There are many other ways in which web sites and services can be designed in an inclusive way that facilitates access by all users regardless of disability or the need for particular assistive technologies. These have been codified in various guidelines formulated by the World Wide Web Consortium (W3C)--in particular, the Web Content Accessibility Guidelines version 1.0, WCAG 1.0 (W3C, 1999).

While compliance with such guidelines is essentially voluntary at the moment, this is likely to evolve on an ongoing basis. For example, the Irish Information Society Commission has stated that "... [w]here organisations are designing web-sites care must be taken to ensure that they are accessible to as broad a section of the population as possible" (Irish Information Society Commission, 2000). More recently, the e-Europe Action Plan (European Commission, 2000,2001) makes a commitment to the adoption and (ultimately) application of the WCAG guidelines to all public sector web sites in Europe.

To inform policy and promote action on this agenda it is important to have information about the state of accessibility of web sites, and to actively monitor trends in accessibility over time. This is a large scale task which should ultimately involve a range of different instruments, methods and approaches, which will themselves evolve over time.

This report documents an initial, baseline survey of the overall accessibility of Irish web sites. The particular approach taken has both strengths and weaknesses: however it is hoped that it can initiate a process of engagement, debate, and action by all Irish providers of web services, not merely to "accommodate" users with disabilities, but to reach out and positively include them--as of right--in the emerging Information Society.

2 Methodology

2.1 The W3C WCAG Accessibility Guidelines

The starting point for this study is the W3C set of Web Content Accessibility Guidelines version 1.0, WCAG 1.0 (W3C, 1999). The question being researched is then the extent to which Irish based web sites conform to these guidelines.

It is important to note that this question is already at one remove from actual user experience. That is, instead of direct testing of web sites by users with various disabilities, we are using the WCAG guidelines as a proxy indicator of accessibility. In particular, we are here relying on the prior work in devising these guidelines to assure their quality and relevance to real accessibility: but we note that the guidelines themselves are under ongoing review, and that research which actively investigates the correlation between WCAG conformance and accessibility is an essential complement to the work presented here.

WCAG consists of 14 separate guidelines, each of which has an associated set of one or more individual checkpoints. There are a total of 65 checkpoints which are classified into three priority levels (W3C, 1999, Section 4):

Priority 1:
A Web content developer must satisfy this checkpoint. Otherwise, one or more groups will find it impossible to access information in the document. Satisfying this checkpoint is a basic requirement for some groups to be able to use Web documents.
Priority 2:
A Web content developer should satisfy this checkpoint. Otherwise, one or more groups will find it difficult to access information in the document. Satisfying this checkpoint will remove significant barriers to accessing Web documents.
Priority 3:
A Web content developer may address this checkpoint. Otherwise, one or more groups will find it somewhat difficult to access information in the document. Satisfying this checkpoint will improve access to Web documents.

WCAG conformance levels are then defined on the basis of these priorities (W3C, 1999, Section 5):

Conformance Level A:
all Priority 1 checkpoints are satisfied.
Conformance Level AA:
all Priority 1 and 2 checkpoints are satisfied.
Conformance Level AAA:
all Priority 1, 2, and 3 checkpoints are satisfied.

Thus, to fully assess conformance with WCAG potentially requires an assessment of conformance with each and every checkpoint by each and every web "page" of the site(s) under consideration. In general, this requires detailed testing and evaluation, against each checkpoint, by an expert, human, tester. This clearly makes large scale, exhaustive, conformance testing a slow and potentially costly activity. For the purposes of the current study, however, it is essential to be able to scale up to examinations of large collections of web sites (each of which can individually consist of many separate pages).

However: there are a number of checkpoints (or aspects of checkpoints) where it is possible to test conformance "mechanically"--in effect, by submitting the page(s) to an automated, software, testing agent. By using such automated assessment, it is possible to devise a survey methodology that can be effectively and efficiently scaled up to large collections of web sites.

Of course, the price of limiting consideration to a subset of the checkpoints (or even, certain aspects of certain checkpoints) is that the results are intrinsically incomplete. In particular, it is impossible, using such a methodology, to declare that any site positively conforms to WCAG at any particular level; rather, the strongest potential results will be of the negative form, namely that specific sites definitely do not conform.
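The negative character of these results can be made concrete. The sketch below takes the priorities of the checkpoints a site is known to have failed, and returns the conformance levels that are thereby ruled out, following the level definitions quoted above. The function name is an illustrative assumption. An empty result means only that no failure was detected, not that the site conforms.

```python
def failed_levels(failed_priorities):
    """Return the WCAG 1.0 conformance levels ruled out by the given failures.

    failed_priorities: iterable of ints (1, 2 or 3), one per failed checkpoint.
    """
    failed = set(failed_priorities)
    levels = []
    if 1 in failed:
        levels.append("A")    # Level A needs every Priority 1 checkpoint
    if failed & {1, 2}:
        levels.append("AA")   # Level AA needs every Priority 1 and 2 checkpoint
    if failed & {1, 2, 3}:
        levels.append("AAA")  # Level AAA needs every checkpoint
    return levels


print(failed_levels([2, 3]))
```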

Notwithstanding this limitation, for the purposes of policy formulation and implementation, it is much preferable to have available some concrete, comprehensive, data relating to web accessibility on a large scale, even if this is incomplete, and requires careful interpretation. Moreover, the use of such automated assessment instruments carries with it the positive merit of being absolutely objective. Thus, it provides a firm foundation for valid comparisons--between sites, between regions, and between different points in time.

2.2 Delimiting the Irish Web Space

The objective of this study is primarily to inform and promote web accessibility policy in Ireland. However: by its nature, the World Wide Web is a globally distributed network. Thus, a web "site" (or "server") can be physically situated in a location which is geographically arbitrary, relative to the locations both of the operating organisations and of the users of those sites. Thus, it is not immediately obvious what criteria to apply in classifying sites as "Irish", nor how to employ such criteria in locating or identifying such sites.

The general criterion that has been applied here is to classify sites as Irish if (and only if) the organisation responsible for the site is incorporated in this jurisdiction (i.e., would be legally bound by the laws and courts of Ireland). This is applied regardless of the physical locations of the servers or the users.

This criterion is based on the view that only such sites are likely to be directly affected by Irish policy making and initiatives: however, it must be recognised that this does give a perspective which is then somewhat skewed relative to that of a "typical" Irish user. Such users will very often access sites which are not Irish in this sense, but which do offer information, services, or tangible goods to Irish users, and whose accessibility thus does directly impact on their web experiences.

In any case, this is only a criterion: we must now attempt (somehow) to generate listings of candidate sites matching this criterion.

One natural approach to this problem is to start with information from the Internet's own Domain Name System (DNS). This is the distributed database which contains administrative and management information about every named host computer which is connected to the Internet.

The DNS is organised as a hierarchy; the highest level contains the so-called Top Level Domains, or TLDs. These are subdivided into "geographical" and "non-geographical". There are defined geographical codes for each internationally recognised, independent legal state; that for Ireland is the familiar ".ie". The non-geographical TLDs include ".com", ".org", ".net" etc. While these are most appropriate for use by organisations operating across national boundaries, there is no technical constraint enforcing this. In practice, individual organisations may opt to use a non-geographical TLD for a variety of more or less pragmatic reasons (including simple cost--registration of ".ie" names continues to be comparatively more expensive than names with non-geographical TLDs).

Thus, while it is possible to automatically extract, from the DNS, a listing of all names within the ".ie" TLD, and it is reasonable to assume that more or less all of these are "Irish" in our sense, this will certainly not represent a complete or exhaustive catalogue of such names. However, since the DNS also contains additional administrative information, including, in particular, the address of the registered owner for each name, it is possible to attempt to probe the non-geographical TLDs for names associated with Irish organisations.

A further significant difficulty in attempting to generate a listing of web sites simply through querying of the DNS is that, although the DNS contains host names for all publicly accessible web sites, it also contains many other host names which do not correspond to web sites at all. This would include, for example, names which have been registered simply to protect trademarks, or in anticipation of future use. At any given time, the DNS also contains many names which are, in effect, defunct--i.e., which are no longer in use (if they ever were in use) but whose registrations have not yet expired. However, it is possible to automatically address some of these difficulties by the simple expedient of tentatively assuming that a given host name does correspond to a public web site, and then attempting to connect to it on that basis. If this succeeds, it is generally possible to also automatically extract some further technical and administrative information directly from it, which can help to decide whether it represents a functioning web site; conversely, if this connection fails, one can (still tentatively) conclude that the host name does not correspond to a web site.
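The tentative connection test just described might be sketched as follows. This is an illustration only, and not the procedure actually used by whoisireland: the function name, the choice of port 80, and the timeout value are all assumptions. The `connect` parameter exists so that the logic can be exercised without network access.

```python
import socket


def looks_like_web_site(host, port=80, timeout=5.0,
                        connect=socket.create_connection):
    """Tentatively classify a host name by trying to open a TCP connection.

    Returns True if a connection to the (assumed) web port succeeds, and
    False if name resolution or the connection itself fails.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers DNS lookup failures, refused connections and timeouts.
        return False
    conn.close()
    return True
```

A successful connection shows only that something is listening on the web port; deciding whether it is a functioning web site would need the further information mentioned above.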

These general techniques have been used by whoisireland to develop at least rough estimates of the extent of the Irish web "space". The following summary data was reported as of 20 March 2002:

In total then, based purely on DNS data, the Irish web space, across the main relevant TLDs, would appear to consist of between 15,000 and 23,000 individual web sites. However, allowing for the exhaustive nature of the DNS, this is likely to be, if anything, an over-estimate.

A quite different approach to assessing the scope of the Irish web space is through analysis of manually constructed directories of web sites. In contrast to the automated, mechanical character of the DNS approach, entries in such directories must generally be originated, and perhaps explicitly reviewed or evaluated, by human "editors". Their coverage is therefore likely to be significantly less exhaustive (i.e., providing, if anything, an under-estimate of overall scale); but entries are likely to correspond much more consistently to genuine web sites providing some substantive information or service.

One of the largest such directories is that maintained by the Open Directory Project (ODP). Indeed, the ODP provides the core data which is used as the basis for a number of other well known, commercially oriented, Internet directories, such as Google. The ODP is an open, collaboratively maintained, global directory of web sites, organised by a variety of categories, including geographical location.

In total, the ODP geographical category for "Ireland" contains 7,1476 entries (as of 21st April 2002). Of course, not all of these would necessarily be "Irish" in our sense, but the great majority can reasonably be assumed to be. While this number is rather smaller than the 15,000-23,000 estimate based on raw DNS data, it is of a similar order of magnitude. Thus, finally combining the two approaches, suggests an overall estimate of perhaps 10,000 distinct sites as the current size of the Irish web.

2.3 Sampling the Irish Web Space

The data of the previous section provides a useful gross indication of the overall scale of the challenge of surveying the Irish web space (for accessibility, or any other purpose).

In the context of the current study, it was clear that attempting an exhaustive study would be technically very demanding; and that even direct (unbiased) sampling based on this data would probably not be appropriate. Web sites differ from each other in many ways: size, function, utility, popularity etc. It is inevitable that a relatively much smaller subset of the total sites identifiable via the methods of the previous section are actually significant to the needs of typical users.

Ideally, one might therefore attempt to identify the most relevant Irish sites by consideration of relative "traffic" or "activity". However, such an analysis is very difficult in the general case. There is considerable debate as to what the most appropriate measures of activity levels should be. Objectively certified usage data is not widely available. And even where usage data is available it is typically intended to support commercial objectives (specifically, Internet based marketing), whereas many important web services are provided by public sector or otherwise non-commercial organisations (which may therefore be seriously under-represented in such data).

Accordingly, the pragmatic approach taken here was to create a sample of the Irish Web based on largely subjective judgments of experienced web users (drawn from within the project team). The ODP category for Ireland was used as a starting point, but sites were also identified from a variety of other sources, including other directories, public advertising etc.

A target sample size of approximately 200 distinct sites was set for this current (initial) study. This was judged to be large enough to give a reasonably wide distribution across sectors and service types and serve as a proof of principle of the methodology and technology being used. Scaling up to significantly larger surveys can be considered subsequently, in the light of the experience with this initial baseline.

The following informal categories were taken into account in selecting specific sites:

The final list of target sites contains 214 entries; it is presented in full in Appendix A. It should be emphasised that this listing is not presented as a "statistically representative" sample; extrapolations to the overall Irish web space must therefore be considered very carefully. Nonetheless, and subject to further critical testing in future studies, it is conjectured that overall patterns from this sample can usefully guide national policy development and implementation.

2.4 Delimiting Individual Web Sites

Just as the web itself is globally distributed and hyperlinked, and lacking clear boundaries, the scope of a "single" web site is typically hard to define--both in principle and in practice.

It is generally reasonable to suppose that all components of a single web site should be managed by a single organisation, so that hypertext links pointing outside of that organisational responsibility do not belong to the same site--although even this principle can be violated where an organisation outsources certain aspects of its services. A more serious problem is that the delimiting of a "single" organisation can itself be problematic. Thus a large organisation will often have subdivisions, subsidiaries, partnerships, projects etc., each of which may have their own associated web information or services. In principle, for our purposes, these should probably be considered as separate web sites if they are subject to separate technical design and maintenance (and thus may be significantly different in terms of accessibility). However, in practice, there is a virtual continuum of degrees of independence; and no reliable technical means to measure this variation externally.

As with the delimiting of the overall Irish web space, concrete, pragmatic, choices had to be made here.

Accordingly, the scope of a single web "site" has been arbitrarily defined relative to its identified top-level host name (this normally corresponds to what is called the site "home page"). Thus starting with a host name of the form, say:

www.someorg.ie

an initial, or "root", Uniform Resource Locator (URL) is generated as:

http://www.someorg.ie/

The page at this URL will, in general, contain hypertext links to other pages. These hypertext links are classified as falling within the "same" web site if, and only if, their URLs contain exactly the same host name. Those secondary pages may, in turn, contain further links; and the same criterion is then applied to classify these as falling inside or outside the site. In principle, by applying this procedure recursively, the complete scope of the web site (at least, as defined in this particular way) can be mechanically identified.
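The procedure above can be sketched in a few lines. The example below is an illustration under stated assumptions, not the capture robot used in this study: the function name is hypothetical, and `links_on_page` stands in for whatever mechanism fetches a page and extracts its links. An explicit queue is used in place of literal recursion, which visits the same set of pages.

```python
from collections import deque
from urllib.parse import urljoin, urlsplit


def delimit_site(host, links_on_page):
    """Return the set of URLs reachable from the root URL that share `host`.

    links_on_page: callable mapping a URL to the list of link targets found
    on that page (hypothetical; a real robot would fetch and parse the page).
    """
    root = f"http://{host}/"
    seen = {root}
    queue = deque([root])
    while queue:
        url = queue.popleft()
        for href in links_on_page(url):
            target = urljoin(url, href)  # resolve relative links
            if urlsplit(target).hostname != host:
                continue  # different host name: outside the site
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# A small in-memory "web" standing in for real pages.
pages = {
    "http://www.someorg.ie/": [
        "/about.html",
        "http://search.someorg.ie/",
        "http://www.example.com/",
    ],
    "http://www.someorg.ie/about.html": ["/", "contact.html"],
    "http://www.someorg.ie/contact.html": [],
}
site = delimit_site("www.someorg.ie", lambda url: pages.get(url, []))
print(sorted(site))
```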

While this is presented as a reasonable, pragmatic, approach to delimiting single web sites, it does, of course, have some significant limitations. In particular, URLs with a slightly different host name may well properly refer to the same web site. Continuing the example above, one might encounter URLs of the form, say:

http://search.someorg.ie/...

which might provide a search facility for the site, and should surely be classified as belonging to the site. However, they would be excluded by our criterion. On the other hand, of course, if one instead required only the "organisation specific" fragment of the host name (the "domain name") to be preserved, one might well include URLs which should really be excluded; for example:

http://www.department.someorg.ie/...

If this corresponded to information or services under completely separate management, it should be classified as outside the original site.

There is no generally reliable method to automatically distinguish these cases; but the choice we have made, being the more restrictive, is likely to ensure that all URLs classified as belonging to a site, do indeed do so--even if it is not exhaustive in probing the limits of the site.
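The difference between the two criteria can be seen by applying both to the example URLs. In this sketch the function names are illustrative assumptions; `same_host` corresponds to the restrictive criterion adopted here, and `same_domain` to the looser alternative that was rejected.

```python
from urllib.parse import urlsplit


def same_host(url, host):
    """Restrictive criterion: the host name must match exactly."""
    return urlsplit(url).hostname == host


def same_domain(url, domain):
    """Looser criterion: any host name within the organisation's domain."""
    name = urlsplit(url).hostname or ""
    return name == domain or name.endswith("." + domain)


examples = (
    "http://www.someorg.ie/",
    "http://search.someorg.ie/",
    "http://www.department.someorg.ie/",
)
for url in examples:
    print(url, same_host(url, "www.someorg.ie"), same_domain(url, "someorg.ie"))
```

Only the first URL passes the exact host name test, while all three pass the domain test, including the one that may be under separate management.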

2.5 Sampling Individual Web Sites

While the method described above provides a criterion for delimiting individual web sites, it does not necessarily follow that one should include the complete scope of every site, identified in this way, in the current study. In particular:

With these points in mind, the particular approach taken here was as follows:

As with the overall sampling of the Irish web "space", the sampling procedures described here are pragmatic, and are not presented as "statistically representative". Thus, results from each such sample cannot be reliably extrapolated to complete sites. However, there is no basis for supposing that accessibility characteristics would be randomly and independently distributed across a site anyway--on the contrary, if anything, there are likely to be very strong correlations. Thus, "statistically representative" sampling is probably not even meaningful in principle.

2.6 System Architecture

The overall system for conducting the accessibility survey consists of the following major components:

The bobby and pavuk components are described in more detail in the following sections.

2.6.1 bobby: An Automated WCAG Assessment Tool

There are a number of software products now available to carry out automated assessments against (subsets of) the WCAG guidelines. These have a variety of strengths and weaknesses, but are functionally very similar (by definition, as they are largely driven by the WCAG guidelines themselves). For the purposes of this study we have chosen to adopt one of the most widely deployed of these products, bobby, originally developed by the Center for Applied Special Technology (CAST), and now distributed and maintained by Watchfire Corporation.

However, it should be noted that bobby, in common with most products in this area, is primarily focussed on assessment as a prelude to "repairing" or correcting accessibility problems. Thus it has not been designed with the special purposes of large scale surveying (without a direct linkage to repair) in mind, and is not necessarily particularly well suited to this purpose. This issue will be returned to in the conclusion of the report.

The particular version of bobby used here is Bobby Worldwide (Core v4.0). This is available both in an ASP form5 (hosted by Watchfire and controlled via a web interface), and in a standalone form (hosted on a local server). The latter provides greater functionality (specifically, the ability to analyse batches of locally stored pages, and to generate machine readable, XML reports) and is therefore preferred for the current study. All further detailed discussion refers to this version.

bobby implements 91 distinct tests or diagnostics, each of which maps onto a specific WCAG checkpoint.6 A number of bobby diagnostics map onto (different aspects of) the same checkpoint. The bobby diagnostics are classified into a number of different "support" categories, as follows:

Full:
bobby automatically checks this guideline.
Partial:
bobby performs some checking and presents line numbers of the HTML code to verify for errors.
Partial Once:
Similar to partial but bobby does not present line numbers.
Ask Once:
bobby does not do any checking, the guideline is presented as a reminder to you.
Summary Ask Once:
An Ask Once only presented in the Summary report since it only applies to a group of pages (such as Use Consistent Navigation).

For all categories other than Full, further evaluation would be required by a human assessor to determine WCAG conformance. Accordingly, in the work presented here, bobby is restricted to implementing just those diagnostics with Full support. There are 25 such diagnostics, which map onto (aspects of) 20 distinct WCAG checkpoints, including some at all three priority levels. The complete table of these bobby diagnostics, with WCAG mappings and priorities, is presented in appendix B.

The command line form of bobby, called bobbycl, was used. Custom perl scripts were used to invoke bobbycl once for each web site in the survey. In each case, bobby was provided with a list of local files to be analysed, corresponding to a local (sampled) image of the target site (the capture of this local image is discussed in section 2.6.2 below). The results were emitted by bobby in the form of an XML formatted file. This is designed for ease of subsequent machine processing. In our case, the controlling perl script then parsed this output file (using the XML::DOM perl module) and recorded the required data in the database.
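The parsing step can be illustrated with a short sketch. The study itself used perl and XML::DOM; the example below uses Python's comparable DOM interface instead. The report layout, element names, attribute names and diagnostic identifiers shown are all hypothetical, since the actual bobby XML schema is not reproduced in this report.

```python
from xml.dom.minidom import parseString

# Hypothetical report layout: NOT the real bobby XML schema.
REPORT = """<report>
  <page url="http://www.someorg.ie/">
    <diagnostic id="g9" priority="1" count="3"/>
    <diagnostic id="g41" priority="2" count="1"/>
  </page>
  <page url="http://www.someorg.ie/about.html">
    <diagnostic id="g9" priority="1" count="1"/>
  </page>
</report>"""


def failures_by_diagnostic(xml_text):
    """Sum failure counts per diagnostic id across all pages of one site."""
    totals = {}
    doc = parseString(xml_text)
    for node in doc.getElementsByTagName("diagnostic"):
        key = node.getAttribute("id")
        totals[key] = totals.get(key, 0) + int(node.getAttribute("count"))
    return totals


print(failures_by_diagnostic(REPORT))
```

Per-site totals of this kind are the sort of data that would then be recorded in the database for later aggregation.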

2.6.2 pavuk: Web Capture Robot

As already discussed, it is necessary in this study to delimit individual web sites, according to some more or less pragmatic technical criteria. The core element of this is a procedure whereby links are extracted from a given page, and each of these is classified for membership of the intended site; the process is then recursively applied to those pages etc.

The bobby package has integrated support for this form of recursive site mapping; however, it provides only limited control and customisation. In particular, it does not provide for site sampling limited by data quantity; nor for configuration of sampling by media type. Furthermore, when using this integrated mechanism, bobby downloads web pages for immediate analysis, directly over the Internet, after which the downloaded pages are simply discarded. This is unsatisfactory for our purposes. In general, for test and debug reasons alone, it is necessary to be able to review specific pages against the corresponding bobby generated reports; furthermore, in the general case (not addressed in the current study) there may be other additional analyses which it would be desirable to carry out on the pages. Therefore, it is essential to be able to retain (at least for some time) local copies of the downloaded pages.

Thus, the preferred approach is to have two separate phases: capture, when copies of the target web pages/sites are stored locally, followed by analysis when bobby (and/or other tools) are used to process the pages.

The tool selected for the capture phase is called pavuk; it is an open source package, released at no charge under the GNU General Public License. It provides for extensive control and customisation of the capture process.

The capture phase for the complete survey is implemented by a custom perl script. This invokes pavuk once for each site. pavuk is controlled by a customisation file. This is dynamically tailored, as necessary for each site, but most options are copied from a single template. This template is presented for reference in Appendix C. For our purposes, the key configuration settings in this template are:

The configuration settings tailored for each site are:

2.7 Accessibility Measures and Benchmarks

For the purposes of summarising the accessibility diagnostics generated by bobby, and especially for comparing these (between individual pages, sites, whole surveys etc.), it is desirable to formulate specific accessibility measures.

At the most summary level, the approach adopted here is based on the notion of WCAG conformance levels. A given site is classified as non-conformant, at a given level, if it fails one or more checkpoints at that level. The benchmark measures for the complete survey are then defined as the proportions of sites which can be definitively classed as failing to achieve WCAG conformance levels A, AA and AAA.
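
As an illustration of this classification, the following sketch computes the failure rate at each conformance level from a table recording, for each site, the priority levels at which at least one defect was detected. The site names and data are hypothetical, and the function names are introduced here only for the purposes of the example.

```python
# Priority levels that must be defect-free for each WCAG 1.0 conformance level.
LEVEL_PRIORITIES = {"A": {1}, "AA": {1, 2}, "AAA": {1, 2, 3}}


def fails_level(defect_priorities, level):
    """True if the site has a detected defect at any priority the level covers."""
    return bool(defect_priorities & LEVEL_PRIORITIES[level])


def failure_rate(sites, level):
    """Proportion of sites definitively failing the given conformance level."""
    failing = sum(1 for p in sites.values() if fails_level(p, level))
    return failing / len(sites)


# Hypothetical data: site name -> priorities at which a defect was detected.
sites = {
    "site-a": {1, 2},
    "site-b": {2, 3},
    "site-c": {2},
    "site-d": {1, 2, 3},
}
for level in ("A", "AA", "AAA"):
    print(level, f"{failure_rate(sites, level):.1%}")
```

Note that a site with no detected defects at a level is not thereby shown to conform: the rates are lower bounds on non-conformance, since only automatically testable defects are counted.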

In considering the significance or severity of individual bobby diagnostics, clearly all Priority 1 defects take precedence over all Priority 2 defects; and Priority 2 defects in turn take precedence over all Priority 3 defects. However, within a priority level, one can reasonably ask whether it is possible to rank the significance of the different defect types. We elect to do this by again measuring the proportions of sites classed as failing (at least once) the individual bobby tests. These provide at least a rough ranking--in the sense that comprehensive repair of the highest ranking defects would affect the greatest proportion of sites.
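
The ranking described here amounts to counting, for each defect type, the number of sites on which it occurs at least once. A minimal sketch follows; the input data and the name rank_defects are hypothetical.

```python
from collections import Counter


def rank_defects(site_defects):
    """Return (defect_id, percent_of_sites) pairs, most widespread first."""
    counts = Counter()
    for defects in site_defects.values():
        # A site counts once per defect type, however many instances it has.
        counts.update(set(defects))
    total_sites = len(site_defects)
    return [(d, 100.0 * n / total_sites) for d, n in counts.most_common()]


# Hypothetical input: site name -> defect instances detected on that site.
site_defects = {
    "site-a": ["g9", "g9", "g39"],
    "site-b": ["g9"],
    "site-c": ["g9", "g240"],
    "site-d": [],
}
ranking = rank_defects(site_defects)
for defect, percent in ranking:
    print(defect, f"{percent:.2f}")
```

Counting sites rather than defect instances means that a defect repeated many times on one site ranks no higher than one occurring once; this is deliberate, given that the ranking is intended to reflect how widespread each defect type is.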

However: it must be borne in mind that such a ranking is naturally limited to the specific defect types included in the study--i.e., those which can be fully and automatically tested by bobby. Thus, there may be other defect types, not considered here, which would be of more significance to the practical accessibility of specific sites.

Note also that these measures do not attempt to probe the relative significance of different defect types within individual sites. The latter sort of analysis will normally need to consider the relative densities of the different defect types in the site--as, presumably, repair effort may depend on how pervasive the defect is. However, for the purposes of the current study, which is focussed on policy, particularly in regulation, education and training, such more detailed profiles of individual sites are not at issue.

2.8 Experimental Runs

The project naturally involves a large number of prototype or developmental runs to test, refine, and debug the overall system functionality. However, for consistency, all detailed data reported here is drawn from just one "definitive" run, and all specific discussion is based on this. The site capture phase took place on 9th April 2002, starting at 16:11:49 IST (UT+01) and ending at 17:59:19 IST (UT+01). Thus, all results relate only to the state of the target sites at that point in time.

3 Processing Platform Requirements

This section reviews the computing and communications resources required for a single survey run. This is primarily to inform subsequent discussion of the prospects and issues in extending the work described here: particularly in terms of repeating the survey (to implement comparative longitudinal studies) and increasing the scale (in terms of number of sites, sampling coverage per site etc.).

3.1 Capture Phase

During the capture phase, pavuk records various information regarding each URL encountered, and each attempted page capture, in a number of plain text log files. While these files are all preserved for reference, selected information from them is also inserted into the database.

The data capture phase had a total duration of 01:47:26, i.e., rather less than two hours. The total data captured was 32,402,205 bytes (31 MBytes). The total number of individual pages captured was 3270; thus the average page size was 9.7 KBytes.

The average data capture rate was 4.9 KByte/s.[7] At first sight this may appear low, given that the work was done using a broadband Internet connection via the Irish National Education and Research Network HEAnet; however the effective bandwidth utilisation was deliberately throttled down by configuring pavuk, as already mentioned, with a 1s delay between successive page downloads. This was done to limit impact both on the host sites and on other users of shared bandwidth. Clearly, if it were decided to increase the scope of future studies, this delay could be reduced or eliminated. Thus, the total capture time need not increase in direct proportion to either the number of sites or the data transfer quota per site.
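
The capture statistics quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that a KByte denotes 1024 bytes (which is consistent with the reported values) and that the 1s delay is incurred roughly once per captured page; the latter is only an approximation.

```python
KBYTE = 1024  # assumption: the report's KByte is 1024 bytes

total_bytes = 32_402_205
pages_captured = 3270
duration_s = 1 * 3600 + 47 * 60 + 26  # reported duration 01:47:26

avg_page_kbytes = total_bytes / pages_captured / KBYTE
capture_rate_kbytes_s = total_bytes / duration_s / KBYTE

# Rough share of elapsed time spent in the deliberate inter-page delay,
# assuming about one 1 s delay per captured page.
delay_share = (pages_captured * 1.0) / duration_s

print(f"{avg_page_kbytes:.1f} KBytes per page")
print(f"{capture_rate_kbytes_s:.1f} KByte/s")
print(f"{delay_share:.0%} of elapsed time in delays")
```

Under these assumptions the delay alone accounts for roughly half of the elapsed capture time, which supports the expectation that reducing it would shorten the capture phase appreciably.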

There were 10 sites for which no data was captured; these are listed in Appendix D. For a further 18 sites, only the initial, "home" page was captured, indicating that pavuk failed to successfully follow any links from that page; these are listed in Appendix E. The remaining 186 sites, for which pavuk achieved some more or less significant site traversal, are listed in Appendix F.

Some (limited) analysis has been done of the reasons for these problems with site capture; factors which have been identified include:

In any case, these capture difficulties are only incidentally related to the direct objective of this study, namely assessing accessibility. Accordingly, the 28 sites, for which capture failed either completely or substantially, are simply eliminated from further consideration: the following analysis is then restricted to the remaining 186 sites, for which at least some significant sample of the site was successfully captured.

On this basis, the average number of pages captured per site was 17, and the average data captured per site was 169 KBytes. However, note that these are now primarily determined by the site sampling strategy: specifically the maximum hyperlink depth (3) and the maximum transfer quota (200 KBytes per site); see section 2.5. They should not be interpreted as indications of average site sizes in any absolute or objective sense.

3.2 Analysis Phase

The analysis phase involves executing bobby over the set of data files captured for each site. For each site, bobby generates an XML formatted report file which identifies for each file which (if any) of the bobby Full diagnostics are triggered on that page, and (if appropriate) at exactly what point(s) in the page. This XML file is automatically parsed, and the core data is inserted into the database. This is expressed in terms of bobby detectable accessibility defects. In particular, for each page processed, the database will contain a count of how many instances of each such defect were detected for that page. Subsequent database processing then allows summarising per defect type, per site and/or across the whole set of sites.

bobby did not function properly for a number of sites. Discounting the 28 sites for which the capture phase had failed, an independent bobby failure was encountered on a further 27 sites. A complete listing of these is given in Appendix G. These failures were manifested in two different ways:

To pragmatically deal with the cases where bobbycl appears to lock up, the controlling script is programmed to forcibly abort the bobbycl process if it does not complete within a specified time. The appropriate timeout value generally depends on the typical capture size of each site, the processing speed of the platform, and the background platform load. Thus it can only be estimated pragmatically and experimentally. For the particular conditions of the current run, in those cases where bobbycl terminated normally, the maximum processing time per site was 37s (the average was just 7s); a reasonable value for the timeout was therefore judged to be 120s (2 minutes).
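
One possible way of implementing such a watchdog is sketched below in Python (the study itself used a perl controlling script, which is not reproduced here). The function name run_with_timeout is hypothetical, and a short-lived Python child process stands in for bobbycl, whose command line options are not shown.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_s):
    """Run command; return its exit code, or None if it had to be aborted."""
    try:
        completed = subprocess.run(command, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return None
    return completed.returncode


# Stand-ins for a normal run and for a run that locks up.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(30)"]

quick_result = run_with_timeout(quick, timeout_s=120)
stuck_result = run_with_timeout(stuck, timeout_s=1)  # short timeout for the demo
print(quick_result, stuck_result)
```

A site for which None is returned would be recorded as a processing failure. The 120s value chosen in the study leaves a margin of roughly three times the longest normal run (37s) observed.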

In any case, for the purposes of further analysis here, it is necessary to eliminate all sites where bobby processing failed for whatever reason; that now leaves a total of 159 admissible sites. This is still 74% of the original target set, which is judged to be sufficient to yield informative data on accessibility measures. These are listed in Appendix H.

Across these sites, the total data processed by bobbycl was 27,892,509 bytes (27 MByte), comprising 2620 individual pages, in a time of 18:59 (1137s); thus the average processing rate was approximately 24 KBytes/s. This might be used as a very rough basis for estimating bobby processing times on scaled up studies (on a similar processing platform, or adjusted to reflect relative platform processing benchmarks). However, there was very significant variability in processing rate across different sites; and, of course, such estimates will be significantly confounded anyway by the unpredictable incidence of cases where bobbycl locks up. Thus, it is likely that processing demands for larger scale studies can only be effectively determined through empirical testing.

3.3 Total Storage Requirements

Storage (disk space) requirements are largely determined by the decisions on the number of target sites and the data capture quota per site. These yielded the basic storage usage of just over 30 MBytes. However, this has to be augmented by space for:

In our case, the primary data is archived (uncompressed) on CD-ROM; in that form it requires c. 60 MBytes. On the live disk (with a larger allocation block size), during capture and processing, the requirement was for c. 90 MBytes. The live database storage requirement is c. 5 MBytes; a compressed database dump requires less than 200 KBytes.

This suggests that the overall scale of the survey could quite easily be increased by a factor of, say, 10, without posing very significant raw storage demands (e.g., the data could still be easily archived within the capacity of a single CD-ROM). Of course, scaling also depends on processing and communications requirements which are more difficult to estimate, as already discussed.

4 Irish Web Accessibility Benchmarks

This section presents the core benchmark results from the study: a set of automated indicators of accessibility across the Irish web space.

The raw bobby accessibility defect data, generated as described in section 3.2, was analysed as follows:

The key benchmark conformance measures are:

WCAG-A Conformance Failure Rate:     149/159   (93.7%)
WCAG-AA Conformance Failure Rate:    159/159   (100.0%)
WCAG-AAA Conformance Failure Rate:   159/159   (100.0%)

A number of observations may be made on these results:

5 Defect Type Analysis

In this section we review and discuss the specific accessibility defects which the study has identified. We focus on those with the highest WCAG priority and affecting the highest proportions of sites across the survey. The objective is to try to identify repair or remediation activities which, if undertaken by site operators, would effectively reduce accessibility defect levels on the Irish web.

However, again, the methodological limitations of purely automated accessibility testing, especially when aggregated across a wide variety of sites, must be remembered here. While the results discussed in this section should provide helpful generic context for web operators, effective remediation would require much more careful and detailed evaluation on an individual site basis. In particular, for any given site, there may well be significant accessibility problems which are not detectable at all with the techniques employed here, and therefore not mentioned (at any ranking) in this section; but which have a much more immediate and practical effect for many users with disabilities, and should therefore be a high priority for repair.

5.1 WCAG Priority 1 Defects

The following table summarises all WCAG Priority 1 defects, across all sites, identified in this survey, ranked by the proportion of sites affected (i.e., the proportion of sites for which the defect was detected at least once):

Diagnostic ID   WCAG Checkpoint   Incidence (sites %)
g9              1.1               90.56
g39             12.1              33.96
g38             6.2               33.33
g240            1.1               26.41
g10             1.1               18.23
g21             1.1               10.69
g20             1.1               0.62

It is notable that 5 of the 7 defect types here refer to a single WCAG Checkpoint, 1.1: "Provide a text equivalent for every non-text element" (W3C, 1999). Given the dominant impact of this category of defect on the overall results of the survey, it is worth discussing it in a little more depth. The motivation for this checkpoint is elaborated as follows (ibid.):

The power of text equivalents lies in their capacity to be rendered in ways that are accessible to people from various disability groups using a variety of technologies. Text can be readily output to speech synthesizers and braille displays, and can be presented visually (in a variety of sizes) on computer displays and paper. Synthesized speech is critical for individuals who are blind and for many people with the reading difficulties that often accompany cognitive disabilities, learning disabilities, and deafness. Braille is essential for individuals who are both deaf and blind, as well as many individuals whose only sensory disability is blindness. Text displayed visually benefits users who are deaf as well as the majority of Web users.

Text equivalents thus provide a generic mechanism for addressing a wide variety of both web functions and user disabilities. The general notion of a text equivalent was already mentioned in section 1, in the context of providing alternatives to embedded images; but the scope of this checkpoint is significantly broader than that. Text equivalents can be an effective accessibility technique for all of the following web design situations:

Each of the bobby priority 1 defect types identified in table 5.1 is considered in more detail in the following sections.

5.1.1 bobby Defect Type g9 (90.6% of sites)

g9: Provide alternative text for all images.

Given that this one defect is encountered one or more times on over 90% of sites, an improvement in this could have a very direct effect on the key WCAG-A Conformance Failure benchmark.

A basic requirement for addressing this issue is, of course, the use of web publishing tools that technically support the insertion of text alternatives. Thus, an obvious recommendation is that site operators should ensure that all their approved tools have this capability; and--equally, or even more importantly--that all relevant developers or content authors have the technical knowledge to access it.

However, it is essential that this be coupled with training in the appropriate formulation of text alternatives. While this is not generally complex, neither is it trivial or intuitive: it requires essential knowledge and understanding of the diverse purposes of equivalent texts, and the diverse needs and capabilities of users with disabilities.

Having said that, in some cases, the formulation of alternative text for images is straightforward. In particular, many images embedded in web pages are purely to provide visual decoration. In such cases the appropriate text alternative is an (explicitly) empty string, coded as:

<img alt="" src=... >

Although it may seem counter-intuitive, this is quite distinct from simply omitting alternative text completely. It makes explicit to the browser that there is no significance to the image; therefore a user who can't (or chooses not to) perceive the image need not even be alerted to its presence. By contrast, if the alternative text is omitted, as opposed to being blank, a browser must generally try to signal to the user that some sort of image is present--just in case it may be significant. In the case of a blind user using an auditory browser for example, this would mean that the browser would insert some sort of auditory indication of the presence of each such image, which may interfere quite significantly, and completely unnecessarily, with the user's effectiveness (not to say enjoyment!).
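
The distinction between missing and explicitly empty alternative text can be checked mechanically. The following is a minimal sketch using the HTML parser in the Python standard library; it is not bobby's implementation of g9, and the class name and sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class ImgAltAudit(HTMLParser):
    """Count img elements by the state of their alt attribute."""

    def __init__(self):
        super().__init__()
        self.counts = {"missing": 0, "empty": 0, "text": 0}

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            self.counts["missing"] += 1   # no alt at all: an accessibility defect
        elif not (attributes["alt"] or "").strip():
            self.counts["empty"] += 1     # alt="": image declared decorative
        else:
            self.counts["text"] += 1      # a text alternative is supplied


SAMPLE = (
    '<img src="logo.png" alt="Site logo">'
    '<img src="spacer.gif" alt="">'
    '<img src="photo.jpg">'
)

audit = ImgAltAudit()
audit.feed(SAMPLE)
audit.close()
print(audit.counts)
```

Only the "missing" category corresponds to the situation described above, in which the browser cannot tell whether the image is significant. Whether a given non-empty text is actually an adequate equivalent cannot be determined in this way and requires human judgement.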

Of course, apart from this special case, text alternatives for images will usually have to be generated on a case-by-case basis[9]; but provided that appropriate tools are in place, and that developers and authors are properly qualified, the universal insertion of effective alternative text imposes essentially negligible overhead or additional effort on site operators. Given the very significant accessibility benefits which it can offer, operators should be strongly encouraged to take these steps.

5.1.2 bobby Defect Type g39 (34.0% of sites)

g39: Give each frame a title.

Frames are a legacy HTML technology allowing for more or less arbitrary sets of distinct web resources to be dynamically combined for display purposes at the client side--i.e., by the user's browser. Historically, the frame concept was an ad hoc invention which was very poorly matched to the underlying technical principles of the web (Engelfriet, 1997b). Frames continue to give rise to a wide variety of problems both for general usability and for accessibility by users with disabilities in particular. The functionality offered by frames can now be achieved by alternative technologies that have been properly engineered into HTML, and are well supported in browsers. Accordingly, frames should be deprecated in the design of new sites; and should preferably be phased out of existing designs. Similarly, web publishing tools should be configured to avoid the use of frames; or if this is not supported, alternative tools should be identified. These recommendations can be made on general engineering and usability grounds, quite aside from accessibility considerations.

However, as long as frames are still in use, then it is essential that the special accessibility issues which they raise are adequately addressed (Engelfriet, 1997a). In this regard, it is disappointing not so much that at least 34% of the surveyed sites are (still) using frames, but that they are doing so in such a way as to erect additional barriers for certain disability groups.

HTML frames technology was conceived as a primarily visual mechanism for merging distinct web resources in a single browser window. As such, the use of frames can cause particular problems for users with a variety of visual disabilities. In general, such users may require explicit help with orientation to the frame-based organisation: i.e., they need to be able to access descriptive information about what frames are present, what their intended purposes are, and to access their individual content.

To facilitate such users, each HTML frame element should have an associated, textual, title attribute; this serves to provide suitable orientating information about the frame content. As will be discussed further in section 5.1.3, care is required in formulating this text to ensure that it is not appropriate only to the initial content of a frame but rather describes the general role or function of the frame. In this way it should remain appropriate even as the specific frame content is dynamically altered.

In any case, if a frame element has no title attribute at all--which is what is flagged by bobby diagnostic g39--then a user is necessarily denied what may be critical orientation information. As a result, this represents a potentially major and pervasive obstacle to access. However: the titling of frames is generally controlled in a small number of locations on the server (perhaps just a single page), and thus the repair effort required to correct it may be very small.
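
A check of this kind is simple to sketch. The example below lists the frames in a frameset page that lack a title; the class name and sample markup are hypothetical, and the sketch also treats a blank title as missing, which is a stricter assumption than the diagnostic as described above.

```python
from html.parser import HTMLParser


class FrameTitleAudit(HTMLParser):
    """Collect the src of every frame element that lacks a non-blank title."""

    def __init__(self):
        super().__init__()
        self.untitled = []

    def handle_starttag(self, tag, attrs):
        if tag != "frame":
            return
        attributes = dict(attrs)
        if not (attributes.get("title") or "").strip():
            self.untitled.append(attributes.get("src"))


FRAMESET = """<frameset cols="20%,80%">
  <frame src="menu.html" title="Site navigation menu">
  <frame src="welcome.html" name="main">
</frameset>"""

audit = FrameTitleAudit()
audit.feed(FRAMESET)
audit.close()
print(audit.untitled)
```

In this sample the repair consists of adding a single title attribute to the second frame element, describing the role of the frame (for example, main content) rather than its initial content.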

Of course, the phasing out of the use of frames technology will also ultimately be an effective remedy.

5.1.3 bobby Defect Type g38 (33.3% of sites)

g38: Each FRAME must reference an HTML file.

bobby defect type g38 again refers to accessibility of HTML frames technology, but to a very precise technical aspect of frame usage.

Each HTML frame element specifies, inter alia, a corresponding web resource to be initially rendered by the browser. However: frames are a technology to support so called "dynamic content" in the browser. Specifically, it is possible for user interaction via one frame to dynamically alter the content of another frame. This sort of dynamic content poses special accessibility issues, codified in WCAG Checkpoint 6.2 (W3C, 1999):

Ensure that equivalents for dynamic content are updated when the dynamic content changes.

In the current case this means that if the content of a frame requires accessibility support--such as a text equivalent--then that equivalent must, of course, change appropriately (and automatically) whenever the frame content changes. Otherwise a user relying on such an equivalent is likely to become hopelessly disoriented whenever frame content changes.

The problem is that insofar as the frame element itself (which is effectively just a screen oriented "place-holder") specifies an equivalent, this can only directly relate to the initial content. Thus, in general, frame content must (dynamically) define its own equivalents, rather than rely on the frame element to do so. But this is only possible if the content is of a media type that can contain embedded or linked accessibility equivalents. Image formats, for example, cannot generally do this. So if an image resource is directly specified as frame content, then it will generally be impossible to provide a correct equivalent if the contained content is dynamically altered.

A general (though not comprehensive) way to address this is to require that the content specified in a frame element must have the HTML media type. Where bobby reports a g38 defect this is therefore supposed to indicate a frame element which is specifying a non-HTML resource as (initial) content.

However: in the detailed analysis during the current study, some concerns have been identified about the reliability of this particular bobby test. Firstly, it is not clear in principle how bobby might reliably implement it. The problem is that, in general, media type cannot be established simply by examining a URL; rather it requires an HTTP protocol transaction with a web server.[10] But when bobby is only examining locally stored files (as in the current study) there are no HTTP protocol transactions to exploit for this purpose. bobby might attempt to examine the content of the target file to determine whether it is HTML format; but in the general case, the target file may not even have been captured locally, and so would not be available for checking. Secondly, manual spot checking of a number of cases where bobby had flagged g38 defects could not verify them: i.e., inspection of the frame elements at the code locations identified by bobby found only HTML format content specifications.

Given this uncertainty, it might legitimately be argued that this defect type should be removed from the data reported here completely--particularly in the calculation of the key accessibility benchmarks in section 4. Against that, of course, the status of this diagnostic is currently only uncertain: while some apparent "false positives" have been identified, this does not mean that all reported instances of this defect are mistaken.

Nonetheless, for completeness, the key benchmarks have been recalculated on the basis of excluding defect type g38; the only resultant change is that the number of sites classified as definitely failing WCAG-A conformance would fall from 149 to 147; i.e., the WCAG-A minimum failure rate would change from 93.7% to 92.5%. The WCAG-AA and WCAG-AAA failure rates remain at 100%. Thus, such a change would not significantly alter any of the overall conclusions of the study.

Further detailed investigation will be necessary to definitively test the reliability of bobby diagnostic g38. The fact that bobby is not open source software (in contrast to the other packages used here--pavuk and postgresql) complicates this, as the implementation cannot be examined directly in the bobby source code. In any case, there is little purpose in reviewing the (potential) impact of this defect type further at this point.

5.1.4 bobby Defect Type g240 (26.4% of sites)

g240: Provide alternative text for all image map hot-spots (AREAs).

From the visual web user's point of view, image maps are images in which different regions are, in effect, hyperlinked. These image regions can then be "clicked on" in the normal way to follow the relevant link. When well designed, such image maps can often provide an intuitive, user-friendly, and visually appealing way of presenting certain collections of links. A very typical use is for graphical site navigation bars and menus.

However, because of the nature of the user interactions involved with image maps they can cause substantial barriers for users with a wide variety of disabilities. Obviously, a person who is blind cannot see the image at all; but even a person with moderate visual handicap may be unable to adequately distinguish the distinct regions of the image. A person with a motor disability may be unable to manipulate a mouse at all; or someone with poor sensory-motor co-ordination may find it difficult to position a visual pointer with adequate accuracy.

Because image maps may play an essential role in site navigation, obstacles to their use may drastically limit accessibility for users with disabilities to a significantly greater extent than "bare" images. Thus, although the raw site frequency of this defect is substantially smaller than that of g9, this may not reflect their relative significance. As usual, only detailed evaluation (and, preferably, user testing) could clarify this for individual sites.

The techniques for rendering image maps accessible are again not particularly complex. In essence they require that the links associated with an image map also have individual text equivalents associated with them: then the user's browser can be configured to convey these, and allow selection between them, in whatever way is appropriate to that individual's capabilities and preferred modes of interaction. And again, provided that suitable web development tools are used, and developers are appropriately qualified, the effort or overhead involved in doing this will be negligible.

Indeed, in the specific case of site navigation bars, these are typically only coded in a small number of page templates on the server side and then automatically incorporated into all relevant pages. In such cases, a single repair, taking perhaps an hour or less for a qualified developer, could bring about a dramatic and pervasive improvement in accessibility across an entire site.

5.1.5 bobby Defect Type g10 (18.2% of sites)

g10: Provide alternative text for all image-type buttons in forms.

HTML forms provide for web users to enter information into a web page which can then be transmitted to the web server. Such forms are widely used in web sites, for functions such as searching the site, providing feedback, setting preferences, voting in "polls" etc. For most e-commerce sites, the use of forms is essential--to support user registration, product ordering, entering payment instructions etc.

HTML forms can contain a variety of components, such as menus, checkboxes, text entry fields etc. Virtually all forms will have one or more "buttons" which can be activated by the user to carry out some action on the form (most commonly, to "submit" the form--i.e., transmit the form data to the server for processing). A web site designer may design the HTML code for the form so as to specify that such buttons be visually displayed as images. This allows a wide variety of graphical effects, which in turn can significantly enhance the usability of the form--for those users who can perceive such effects. However, for users who cannot, or choose not to, perceive images, these effects will be lost. In such cases, in the absence of text equivalents for the button images, the users may be left with no indication whatever as to the intended function of the buttons. This will typically render the form difficult, if not impossible, to use.

As with image maps, a lack of equivalent text for image-type buttons in forms may thus erect severe accessibility barriers. Again, the remediation requirement is typically modest. In e-commerce sites in particular, forms are generally dynamically generated by the web application software. Relatively simple and localised modifications to this software may thus provide textual equivalents for image-type buttons across the entire site.
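
Both image map hot-spots and image-type form buttons can be checked with a single rule-driven pass over a page. The sketch below is an illustration only and does not reproduce bobby's g240 or g10 tests; the rule table, class name and sample markup are assumptions. Because these elements are functional rather than decorative, the sketch treats an empty alt as a problem as well as a missing one.

```python
from html.parser import HTMLParser

# (element, required attribute, condition on the element's other attributes)
RULES = [
    ("area", "alt", lambda a: True),
    ("input", "alt", lambda a: (a.get("type") or "").lower() == "image"),
]


class TextEquivalentAudit(HTMLParser):
    """Record (tag, line) for elements lacking a required text equivalent."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        for element, required, applies in RULES:
            if tag == element and applies(attributes) and not attributes.get(required):
                line, _column = self.getpos()  # position of the tag's start
                self.problems.append((tag, line))


SAMPLE = """<map name="nav">
<area href="/home" alt="Home" shape="rect" coords="0,0,50,20">
<area href="/news" shape="rect" coords="50,0,100,20">
</map>
<form action="/search">
<input type="text" name="q">
<input type="image" src="go.gif">
<input type="image" src="ok.gif" alt="Submit order">
</form>"""

audit = TextEquivalentAudit()
audit.feed(SAMPLE)
audit.close()
print(audit.problems)
```

Reporting the line number of each problem reflects the point made above: where such markup is generated from a small number of templates, a handful of reported locations may account for defects across an entire site.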

5.1.6 bobby Defect Type g21 (10.7% of sites)

g21: Provide alternative text for each APPLET.

"Applets" are programmes to be run in the context of a web browser. That is, a web page can specify, as part of its content, a programme that should be executed on the client or user's Internet access device (i.e., their computer, or TV Internet box etc.). Applets thus allow a wide variety of effects to be achieved, that are not possible just by using "plain" HTML. These might include, for example, displaying movies or animations, or providing user controlled simulations (video games, educational tools etc.). The applet mechanism is very flexible, open-ended, and powerful.

However, precisely because the mechanism is so powerful, it can also raise a wide variety of accessibility issues. It is difficult to be precise here: applets would typically need to be evaluated individually for their accessibility implications. However, the WCAG guidelines do stipulate a generic requirement that whatever functionality or information is made available via applets should also be available without using applets--which is to say, via "conventional" (non-programmable) facilities of a web browser. There are a variety of techniques to achieve this, which may typically be used in combination; however, a minimal requirement is that every applet should have a textual equivalent. It is the absence of such equivalents that is being flagged by defect type g21.

While the notion of text equivalents for applets is conceptually similar to that for images, it is also a good deal more complex--simply because applets are, in principle, so general in their potential design and function. Thus, effective remediation of this defect will generally require careful analysis of the function of each applet.

5.1.7 bobby Defect Type g20 (0.6% of sites)

g20: Provide alternative content for each OBJECT.

Conceptually, this defect type is very similar to g21, already discussed. The object element is essentially a more recent addition to the specification of HTML, which is even more general purpose than applet--and, indeed, is now recommended by W3C as a replacement for applet.

The low reported incidence of g20 (in fact, just a single site in the current survey) is probably thus primarily a reflection of limited use of the object element in general, rather than an indication that object is being used more accessibly than is applet.

In any case, where the object element is used, there is, again, a minimum requirement to provide an equivalent which will be accessible to users who, for whatever reason, cannot access the specified object type. As with applet, remediation (i.e., design of appropriate and effective equivalents) will normally require case-by-case analysis.

5.2 WCAG Priority 2 Defects

The following table summarises the frequencies of all WCAG Priority 2 defects, across all sites, identified in this survey, ranked by the proportion of sites affected (i.e., the proportion of sites for which the defect was detected at least once):

Diagnostic ID   WCAG Checkpoint   Incidence (sites %)
g104            3.4               98.74
g271            3.2               89.93
g265            13.1              76.72
g41             12.4              69.81
g269            9.3               69.18
g34             13.1              37.73
g273            13.2              12.57
g2              3.5               6.28
g33             7.4               3.77
g37             6.5               3.77
g254            7.5               2.51
g5              7.3               1.88
g4              7.2               1.25

Note again that, in the WCAG scheme, while priority 2 defects are not as significant as priority 1, nonetheless they still potentially represent significant barriers to web access for a variety of users with disabilities. Site operators engaged in remediation should certainly concentrate on priority 1 defects in the first instance; but once these are corrected (i.e., WCAG-A conformance is achieved) they should then go on to seriously address all priority 2 defects. Design of new sites, and maintenance of existing sites, should aim at WCAG-AA conformance (at least)--i.e., zero priority 1 or 2 defects.

Thus, all the defect types in the table above should be considered as significant. Nonetheless, it is clear that these currently occur at significantly different levels. Accordingly, the more detailed discussion below will focus on just the top five in this ranking--even the lowest of which was still detected on almost 70% of sites surveyed.
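The incidence measure used in the ranking above (the proportion of sites on which a defect was detected at least once) can be sketched as follows. This is only an illustration of the arithmetic, not the survey's actual tooling: the function name, the data layout and the sample data are all hypothetical.

```python
def site_incidence(defects_by_site):
    """Return {diagnostic_id: percentage of sites with at least one hit}.

    defects_by_site maps a site name to a list of pages, where each page
    is a list of the diagnostic IDs raised on that page.
    """
    n_sites = len(defects_by_site)
    counts = {}
    for pages in defects_by_site.values():
        # A site is counted once per diagnostic, however many pages trigger it.
        for diagnostic in set().union(*pages):
            counts[diagnostic] = counts.get(diagnostic, 0) + 1
    # Rank by the number of sites affected, highest first.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {diag: round(100.0 * hits / n_sites, 2) for diag, hits in ranked}


# Hypothetical data: four sites, each with one or two sampled pages.
SAMPLE = {
    "site-a": [["g104", "g271"], ["g104"]],
    "site-b": [["g104"], []],
    "site-c": [["g41"], ["g41", "g104"]],
    "site-d": [[]],
}

print(site_incidence(SAMPLE))
```

Note that a site with a single affected page counts exactly the same as a site where every page is affected, so this measure indicates how widespread a defect type is across sites rather than how severe it is within any one site.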

5.2.1 bobby Defect Type g104 (98.7% of sites)

g104: Use relative sizing and positioning (% values) rather than absolute (pixels).

The visual position and size of various elements can be specified in HTML pages--for example, font size for text, widths of tables, or individual table cells, etc. In general, HTML allows such positions and sizes to be specified in either "relative" or "absolute" units. Relative units mean that the numbers specified in the HTML should be scaled according to some norm which the user's browser already has; absolute units mean that the numbers are not scalable. The effect of using relative units is that the browser can very flexibly adjust the visual presentation according to the available visual space on the user's device, and the user's preferences and capabilities. For example, it can be ensured that even if the device, or viewing window, is relatively small, or if the user needs to use comparatively large text (for example, due to visual disability), then the text can still "flow" (so that horizontal scrolling is not required).
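The distinction between absolute and relative units can be illustrated with a small check on table markup. The sketch below is not bobby's implementation; it assumes, for simplicity, that only width and height attributes on table-related elements are examined, and all names and sample markup are hypothetical. In HTML 4 a bare number in such an attribute means pixels, whereas a value such as "90%" scales with the available space.

```python
import re
from html.parser import HTMLParser

# A bare number (or a number followed by "px") is an absolute pixel length.
PIXEL_LENGTH = re.compile(r"^\d+(px)?$")

# Assumption: only these elements and attributes are examined.
SIZED_TAGS = {"table", "td", "th", "col", "colgroup"}
SIZE_ATTRS = {"width", "height"}


class AbsoluteSizeChecker(HTMLParser):
    """Collect (tag, attribute, value) for each absolute size found."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag not in SIZED_TAGS:
            return
        for name, value in attrs:
            if name in SIZE_ATTRS and value and PIXEL_LENGTH.match(value.strip()):
                self.findings.append((tag, name, value))


def absolute_sizes(html):
    checker = AbsoluteSizeChecker()
    checker.feed(html)
    checker.close()
    return checker.findings


# Hypothetical two-column layout, first in pixels and then in percentages.
FIXED = ('<table width="600"><tr><td width="150">Menu</td>'
         '<td>Body</td></tr></table>')
FLUID = ('<table width="90%"><tr><td width="25%">Menu</td>'
         '<td>Body</td></tr></table>')

print(absolute_sizes(FIXED))
print(absolute_sizes(FLUID))
```

The percentage version lets the browser reflow the same two columns into whatever window width and text size the user has chosen.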

This defect type is significant for a number of reasons:

Because of the potentially pervasive effect of this defect type, remediation may appear difficult. However, in many cases, the problem arises simply because the particular web authoring tool in use is either not configured to use relative sizes and units, or (worse) is not capable of doing so. In the former case, reconfiguration and republishing of content can quickly be accomplished; in the latter case, of course, an alternative publishing tool should be sourced, but subsequently the repair effort can be small. Once appropriate tools, properly configured, are standardised upon, then this issue should not recur.

The more difficult case is where absolute positions and sizes have been "hand-coded" widely across a site to achieve particular visual effects. Typically this arises where web design has been contracted out. Remediation may then require some significant redesign effort; but, in such cases, organisations may well also want to reconsider their choice of contractor.

5.2.2 bobby Defect Type g271 (89.9% of sites)

g271: Use a public text identifier in a DOCTYPE statement.

Properly formatted HTML pages must conform to a set of strict technical specifications. Conformance to such specifications is quite generally important to ensuring compatibility between web sites and web browsers (Zeldman, 2001); but it is absolutely crucial to ensuring compatibility with the wide variety of special purpose web browsers and assistive technologies that are necessary to address the diverse needs of users with disabilities.

Unfortunately, many web sites continue to deliver HTML pages which do not conform to these technical specifications. This situation has been able to develop historically because a very small number of browsers have accounted for the vast majority of users. These browsers have been able to process HTML which is "broken" in a variety of ways (relative to technical standards) and still render such pages in some more or less usable way. For many site operators, as long as "most" users appeared to be able to access the site, they did not concern themselves with whether it was operating in conformance with technical standards (or, in many cases, may not even have been properly aware of the technical issues).

But this is a very unsatisfactory and, indeed, unsustainable situation. The diversity of browser technologies in active use is now growing significantly. Firstly, there is the continuing evolution of browser software: successive versions of even the "same" browser can differ significantly in their handling of various kinds of HTML "bugs"; but users are becoming progressively more reluctant to continuously upgrade their browser software. So among the user population, there is a growing diversity of legacy browser versions in active use. Secondly, there is the diversification of access devices--Internet TVs, games consoles, PDAs etc. This is an unstable situation because this growing diversity makes it progressively more difficult for faulty HTML to function consistently across user populations; the only effective long term solution is to enforce strict technical standards on the server side.

Furthermore, the need to try to deal with non-conforming HTML has resulted in very significant software overhead in recent browser designs: on the one hand this has meant that "mainstream" browser software has tended to become progressively more bloated, and thus require significantly more powerful computers to run on; on the other hand, it has meant that browsers targeted at more lightweight devices (PDAs, etc.) simply cannot be designed to accommodate such technical deficiencies on web servers.

So this is another case where universal design will benefit all users, regardless of ability: but the benefit to users with disabilities will be disproportionate because of their very reliance on diverse specialist technologies.

bobby is not technically an HTML validator--that is, a general purpose tool to validate HTML code conformance to technical standards. Future studies may complement bobby testing with the use of such a validator. However, for the purposes of the current study, bobby does give some basic indications of HTML standards conformance. In particular, a properly conforming HTML page minimally contains a so-called DOCTYPE statement which identifies which particular HTML standard the page satisfies. If this is missing, then it is not even possible in principle to validate the page for standards conformance; and bobby raises defect code g271.
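A minimal sketch of this kind of preliminary test is given below. It is an illustration only and not bobby's own code: the class and function names are hypothetical, and the check simply looks for a DOCTYPE declaration that carries a quoted public identifier, using the HTML parser in the Python standard library.

```python
import re
from html.parser import HTMLParser

# Matches e.g.: DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "..."
PUBLIC_DOCTYPE = re.compile(r"""DOCTYPE\s+\S+\s+PUBLIC\s+(["'])(.*?)\1""",
                            re.IGNORECASE)


class DoctypeChecker(HTMLParser):
    """Record the public identifier of the DOCTYPE declaration, if any."""

    def __init__(self):
        super().__init__()
        self.public_id = None

    def handle_decl(self, decl):
        # decl is the declaration text between "<!" and ">".
        match = PUBLIC_DOCTYPE.match(decl.strip())
        if match and self.public_id is None:
            self.public_id = match.group(2)


def public_identifier(html):
    """Return the public identifier, or None if the page does not declare one."""
    checker = DoctypeChecker()
    checker.feed(html)
    checker.close()
    return checker.public_id


# Hypothetical pages: one declaring HTML 4.01 Strict, one with no DOCTYPE.
DECLARED = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
            '"http://www.w3.org/TR/html4/strict.dtd">'
            '<html><head><title>Home</title></head><body></body></html>')
UNDECLARED = '<html><head><title>Home</title></head><body></body></html>'

print(public_identifier(DECLARED))
print(public_identifier(UNDECLARED))
```

A page that passes this test has only declared which standard it claims to follow; establishing that it actually conforms would still require a full validator.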

On this basis, the current study shows that 89.9% of sites examined have at least some non-conformant HTML. Furthermore, given that this bobby diagnostic is only a preliminary test of HTML standards conformance, the statistic above must be interpreted as a minimum level: actual levels would require more research to establish, but may well be significantly higher.

5.2.3 bobby Defect Type g265 (76.7% of sites)

g265: Do not use the same link phrase more than once when the links point to different URLs.

A "link phrase" is the (usually short) fragment of text in a web page that is hypertext linked to another web resource. For users of visual browsers link phrases are normally visually highlighted in some way (perhaps by colour, underlining etc.); and "clicking" within the link phrase causes the browser to load the linked resource. Such users can generally scan web pages visually very quickly to pick out such link phrases; and can easily read the surrounding text if they need more context to understand a particular link phrase.

For users of non-visual browsers (say using computer synthesised speech, or a braille output device) "scanning" a web page is generally a slower and more cumbersome process. One common technique to aid scanning in such cases is to simply skip from link to link; in this circumstance, only the link phrases are directly rendered to the user, and access to surrounding text (for additional context) will be relatively slow (i.e., it will undermine the very utility of this form of scanning).

This being the case, access for such users can be significantly improved if a little care is taken in the selection of link phrases; and, conversely, poor selection of link phrases can create a significant, and generally quite unnecessary, obstacle to users. More specifically, if the same link phrase is used multiple times, in the same page, but linking to different resources, this will be completely hidden from a user who is scanning only such link phrases. The most hackneyed form of this is a repetition of stock phrases like "click here", or "more", which are meaningless in isolation.
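The situation described above can be made concrete with a short sketch that gathers the link phrases on a page and reports any phrase that leads to more than one URL. This is an illustrative approximation rather than bobby's implementation; the names, the case-insensitive comparison and the sample page are assumptions.

```python
from html.parser import HTMLParser


class LinkPhraseCollector(HTMLParser):
    """Map each link phrase to the set of URLs it points to."""

    def __init__(self):
        super().__init__()
        self.current_href = None  # href of the link being read, if any
        self.current_text = []    # text fragments inside that link
        self.targets = {}         # normalised phrase -> set of hrefs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.current_href = href
                self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            # Collapse whitespace and ignore case when comparing phrases.
            phrase = " ".join("".join(self.current_text).split()).lower()
            self.targets.setdefault(phrase, set()).add(self.current_href)
            self.current_href = None


def ambiguous_link_phrases(html):
    collector = LinkPhraseCollector()
    collector.feed(html)
    collector.close()
    return sorted(phrase for phrase, hrefs in collector.targets.items()
                  if phrase and len(hrefs) > 1)


# Hypothetical page fragment with two identical phrases and one descriptive one.
PAGE = ('<p>Annual report: <a href="/report.html">click here</a>.</p>'
        '<p>Press releases: <a href="/press.html">Click here</a>.</p>'
        '<p><a href="/contact.html">Contact the press office</a></p>')

print(ambiguous_link_phrases(PAGE))
```

A user skipping from link to link on this sample page would hear "click here" twice with no indication that the destinations differ, whereas "Contact the press office" is intelligible on its own.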

In some cases, repetitive link phrases are generated by a particular authoring tool, or perhaps by customised, site specific, software. Then, the usual advice of reconfiguring or replacing the tool, or redesigning the relevant software would apply. However, for link texts which are manually generated by individual content authors, the most important step is education and training. In most situations, once the author understands the potential variety of usage situations for link phrases, more accessible authoring then comes quite naturally and does not require any significant extra effort or resource; although, of course, some dedicated effort may be required to repair legacy content.

5.2.4 bobby Defect Type g41 (69.8% of sites)

g41: Explicitly associate form controls and their labels with the LABEL element.

In the visual presentation of a web page there can often be important relationships between different components of the page which are expressed only implicitly by their juxtaposition in the display. A common example arises in the case of HTML based forms. As already mentioned in section 5.1.5, HTML forms are web pages where the user can fill in or select responses, and then select a "submit button" to send this information to the web server. A form thus generally consists of information explaining to the user what has to be filled in, interspersed with "form controls"--text entry boxes, radio buttons, drop down lists etc.--which the user can interact with. Typically, the relative positions in the visual display make it reasonably easy for a visual user to identify which text is associated with which control.

However, for users who are unable to use a visual display effectively (blind, partially sighted etc.) it is not possible to directly perceive these implicit, but critical, relationships. To address this, HTML provides facilities whereby a particular form control can be explicitly marked as associated with a particular text (the corresponding "label"). This coding can then be used by a suitably configured browser to help a user with a disability to recognise the correct relationships. Furthermore, coding these explicit relationships can improve general form usability; for example, the browser can treat a click on a label as activating the associated form control, thus providing a larger target for selection with a pointer. This may be particularly helpful to users with motor impairment which limits fine pointer manipulation; but will generally be of benefit to all users. Thus, this again illustrates the applicability of universal design.
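The explicit association is made by giving the control an id attribute and naming that id in the for attribute of a label element. The sketch below lists form controls that lack such an association. It is a simplified illustration and not bobby's algorithm: the names and sample forms are hypothetical, and the sketch deliberately ignores the alternative in which a label element simply encloses its control.

```python
from html.parser import HTMLParser

# Assumption: these input types need no separate text label.
UNLABELLED_TYPES = {"hidden", "submit", "reset", "button", "image"}


class LabelChecker(HTMLParser):
    """Record form controls and the ids referenced by LABEL elements."""

    def __init__(self):
        super().__init__()
        self.label_targets = set()  # values of <label for="...">
        self.controls = []          # (tag, id or None) for each control

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag in ("select", "textarea"):
            self.controls.append((tag, attrs.get("id")))
        elif tag == "input":
            input_type = (attrs.get("type") or "text").lower()
            if input_type not in UNLABELLED_TYPES:
                self.controls.append((tag, attrs.get("id")))


def unlabelled_controls(html):
    checker = LabelChecker()
    checker.feed(html)
    checker.close()
    return [(tag, control_id) for tag, control_id in checker.controls
            if control_id is None or control_id not in checker.label_targets]


# Hypothetical forms: the first relies on visual juxtaposition only,
# the second ties the text to the control explicitly.
IMPLICIT = ('<form action="/search">Name: <input type="text" name="name">'
            '<input type="submit" value="Go"></form>')
EXPLICIT = ('<form action="/search"><label for="name">Name:</label> '
            '<input type="text" name="name" id="name">'
            '<input type="submit" value="Go"></form>')

print(unlabelled_controls(IMPLICIT))
print(unlabelled_controls(EXPLICIT))
```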

As with a number of the defects already discussed, remediation will depend on the authoring or development tools in use; but if a given tool does not support making forms accessible in this way, then an alternative tool should be sought. And again, effective accessible design will generally require that developers and authors have received appropriate training.

5.2.5 bobby Defect Type g269 (69.2% of sites)

g269: Make sure event handlers do not require use of a mouse.

Various HTML coding techniques (normally involving the use of client side scripting) rely on certain kinds of interaction with the user. However, depending on their individual capabilities and preferences, users may adopt a wide variety of interaction devices. In particular, the use of a conventional mouse, or even of some adapted form of screen pointing device, may be difficult or impossible for some users. Thus, if a page is coded in such a way that certain functionality or features can be accessed only by using a particular form of interface device--such as a mouse in the current case--then that functionality will be unavailable to many users with disabilities; worse still, such users may not even be aware that such functionalities exist.

There are a variety of well established techniques for "universal design" of client side interactions (including accessible client side scripting techniques, complemented with appropriate alternatives for users unable or unwilling to process client side scripting). The scale of the remediation effort will depend on the particular site, and the particular authoring and development tools in use. If the problematic interactions are being automatically embedded by certain tools then, as usual, the tool should be configured to avoid this; or if that is not possible, an alternative tool should be selected. Where the problem is a result of "hand-coding" then remediation may require some significant redesign effort; but the starting point then must be with ensuring that development staff (or external contractors, as appropriate) are properly qualified: in the general case, web site development is an engineering activity that must be subject to normal standards of professional practice.
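One widely used technique is to pair each mouse-specific event handler with a device-independent counterpart on the same element. The sketch below flags elements where that pairing is missing. It is an illustration only, not bobby's implementation: the pairing table is an assumption loosely modelled on commonly published guidance, and the names and sample markup are hypothetical. Treating onclick as mouse-specific is deliberately conservative, since many browsers also fire it from the keyboard on links and buttons.

```python
from html.parser import HTMLParser

# Assumed pairing: mouse handler -> device-independent counterpart.
KEYBOARD_EQUIVALENT = {
    "onmousedown": "onkeydown",
    "onmouseup": "onkeyup",
    "onclick": "onkeypress",
    "onmouseover": "onfocus",
    "onmouseout": "onblur",
}


class MouseOnlyHandlerChecker(HTMLParser):
    """Collect (tag, handler) for mouse handlers lacking a counterpart."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases attribute names, so onMouseOver matches too.
        names = {name for name, _ in attrs}
        for mouse, keyboard in KEYBOARD_EQUIVALENT.items():
            if mouse in names and keyboard not in names:
                self.findings.append((tag, mouse))


def mouse_only_handlers(html):
    checker = MouseOnlyHandlerChecker()
    checker.feed(html)
    checker.close()
    return checker.findings


# Hypothetical links that reveal a menu; showMenu() is an invented function.
MOUSE_ONLY = '<a href="/news.html" onMouseOver="showMenu()">News</a>'
PAIRED = ('<a href="/news.html" onmouseover="showMenu()" '
          'onfocus="showMenu()">News</a>')

print(mouse_only_handlers(MOUSE_ONLY))
print(mouse_only_handlers(PAIRED))
```

Pairing handlers in this way addresses only the scripted interaction itself; the alternatives for users who cannot or do not run client side scripts, mentioned above, are still needed.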

5.3 WCAG Priority 3 Defects

The following table summarises the frequencies of all WCAG Priority 3 defects, across all sites, identified in this survey:

Diagnostic ID   WCAG Checkpoint   Incidence (sites %)
g31             5.5               97.48
g125            4.3               96.22
g35             10.5              89.93
g109            10.4              61.63
g14             1.5               1.88

Priority 3 checkpoints are described as issues which web site developers "may" wish to address, to ensure maximum accessibility for all users (W3C, 1999). Thus, one would expect that particular organisations--depending on their particular objectives and remit--may or may not choose to conform with these checkpoints.

In the context of the results presented here, it is clear that there are very significant issues to be dealt with at priority levels 1 and 2 in the first instance. Accordingly, the specific priority 3 checkpoints are not discussed further at this point.

6 Future Work

This section provides a critical review of the results of the study, and of the tools and methodology adopted. It also considers the implications for future work.

First and foremost, the project has demonstrated that this sort of largely automated survey of selected accessibility indicators is technically feasible; and that once the appropriate tools have been developed and integrated, the technical resources to carry out such a survey are comparatively modest.

However, the project has also identified a number of limitations and/or open issues, many of which will require further research.

7 Conclusion

The core objectives of this study were:

  1. To gather objective data, across a broad variety of Irish web sites, on the state of web accessibility to users with disabilities.
  2. To formulate summary benchmark indicators, based on this data.
  3. To develop tools which would support the effective ongoing--preferably automatic--monitoring of these indicators.
  4. To identify specific remediation activities that could significantly improve web accessibility in Ireland.

These objectives have been achieved, as documented in the body of the report. But of course, these objectives arise in a particular context, and with particular purposes in mind. The results presented here are intended specifically to inform policy and initiatives in the development of the Irish web.

In this sense, the primary outcome is, in effect, a "wake-up call"--for government, for public agencies, for private companies, organisations and individuals. It signals that, despite Ireland's justifiable pride in its economic and technological development, despite very laudable goals in documents such as the E-Europe Action Plan (European Commission, 2000, 2001), the current commitment to accessibility of the Irish web for users with disabilities is, at best, aspirational--and, at worst, cynically inadequate.

This is doubly unfortunate. It is not just that web technology is not being applied--as it could be--to positively improve opportunities and capabilities for users with disabilities; on the contrary, as web services become more pervasive and essential, to the extent that they remain inaccessible they will actually impose progressively more disadvantage and exclusion on groups with disabilities in our society.

On the other hand, this study also indicates that significant progress could be made quickly, and with comparatively little effort--if decision makers recognise the significance of the issue and take action accordingly. It is certainly not an all or nothing situation: if web site developers and operators focussed on just a few of the most pressing accessibility deficiencies in each particular site, significant improvement could be readily achieved.

There is of course a need for leadership; for public policy initiatives, particularly in education and training; for concrete incentives and supports to organisations wishing to improve services for users with disabilities. Finally, there must surely also be a role for legislation and regulation to guarantee and vindicate the rights of all citizens to equal treatment in a digital democracy.

Bibliography

Engelfriet, A. (1997a),
`Using Frames and Accessible Web Sites', Web Design Group. (Accessed: 2 August 2002)
Engelfriet, A. (1997b),
`What's Wrong With Frames?', Web Design Group. (Accessed: 2 August 2002)
European Commission (2000),
`eEurope 2002: Action Plan'. (Accessed: 17 October 2002)
European Commission (2001),
`eEurope 2002: Accessibility of Public Web Sites and their Content: Communication from the Commission to the Council, the European Parliament, the Economic and Social Committee, and the Committee of Regions', Commission of the European Communities. Microsoft Word Format. (Accessed: 17 October 2002)
Flavell, A. J. (2002),
`Use of ALT Texts in IMGs'. (Accessed: 20 April 2002)
Irish Information Society Commission (2000),
`IT Access for All'. (Accessed: 23 October 2002)
Irish National Disability Authority (2002),
`IT Accessibility Guidelines'. (Accessed: 17 October 2002)
McMullin, B. (2002),
`Stretching the Web', DCU University View. DCU Alumni Magazine. Also printed in AIB Futures, Vol 12, December 2001 (AIB Internal Magazine). (Accessed: 17 October 2002)
W3C (1997),
`World Wide Web Consortium Launches International Program Office for Web Accessibility Initiative', World Wide Web Consortium (W3C). Press Release, October 22, 1997. (Accessed: 22 June 2002)
W3C (1999),
`Web Content Accessibility Guidelines (WCAG)', World Wide Web Consortium (W3C). (Accessed: 20 April 2002)
W3C (2001),
`How People with Disabilities Use the Web', World Wide Web Consortium (W3C). (Working Draft). (Accessed: 20 April 2002)
Zeldman, J. (2001),
`To Hell With Bad Browsers!', A List Apart (99). (Accessed: 17 October 2002)
Zhangxu & Aldis, J. (2001),
`No Disability in Digitalized Community', International Center for Disability Resources on the Internet (ICDRI). (Accessed: 26 September 2002)

Acknowledgments

The work described here could not have come about without generous financial support provided by AIB PLC. I am especially indebted to John Kelly, Head of Business Banking at AIB. He demonstrated an enduring faith and commitment to the project, and the ultimate value that it could have in setting the agenda for Ireland's emerging information society.

Detailed research and development for the project was carried out by my two research students, Esmond Walshe and Carmen Marincu.

The work was carried out in the Research Institute for Networks and Communications Engineering (RINCE), established at DCU under the HEA Programme for Research in Third Level Institutions (PRTLI).

Appendices

A. Complete listing of initial target sites

Site ID Title Root URL
1 Irish Farmers' Association http://www.farm.ie/
2 Society for the Prevention of Cruelty to Animals http://www.ispca.ie/
3 Zoological Society of Ireland http://www.dublinzoo.ie/
4 Arts Council of Ireland http://www.artscouncil.ie/
5 Irish Museum of Modern Art http://www.modernart.ie/
6 National Concert Hall http://www.nch.ie/
7 Royal Irish Academy of Music http://www.riam.ie/
8 Astronomy Ireland http://www.astronomy.ie/
9 Dunsink Observatory http://www.dunsink.dias.ie/
10 Advertising Standards Authority for Ireland http://www.asai.ie/
11 Chambers of Commerce of Ireland http://www.chambersireland.ie/
12 Consumer's Association of Ireland http://www.consumerassociation.ie/
13 Director of Consumer Affairs http://www.odca.ie/
14 Enterprise Ireland http://www.enterprise-ireland.com/
15 IDA Ireland http://www.idaireland.com/
16 Irish Business and Employers Confederation http://www.ibec.ie/
17 Irish Small and Medium Enterprises http://www.isme.ie/
18 Small Firms Association http://www.sfa.ie/
19 Barnardos http://www.barnardos.ie/
20 Comhairle http://www.cidb.ie/
21 Fighting Blindness http://www.fightingblindness.ie/
22 National Council for the Blind http://www.ncbi.ie/
23 Irish Guide Dogs for the Blind http://www.guidedogs.ie/
24 Association for Higher Education Access and Disability http://www.ahead.ie/
25 Irish Defence Forces http://www.military.ie/
26 Central Applications Office http://www.cao.ie/
27 Dublin City University http://www.dcu.ie/
28 Dun Laoghaire Institute of Art Design and Technology http://www.iadt-dl.ie/
29 Dublin Institute of Technology http://www.dit.ie/
30 Cork Institute of Technology http://www.cit.ie/
31 Higher Education Authority http://www.hea.ie/
32 Institute of Public Administration http://www.ipa.ie/
33 National Adult Literacy Agency http://www.nala.ie/
34 University College Dublin http://www.ucd.ie/
35 University of Dublin Trinity College http://www.tcd.ie/
36 National University of Ireland Galway http://www.ucg.ie/
37 National University of Ireland Cork http://www.ucc.ie/
38 University of Limerick http://www.ul.ie/
39 Letterkenny Institute of Technology http://www.lyit.ie/
40 Union of Students in Ireland http://www.usi.ie/
41 Association of Secondary Teachers Ireland http://www.asti.ie/
42 Royal Irish Academy http://www.ria.ie/
43 Equality Authority http://www.equality.ie/
44 FAS http://www.fas.ie/
45 Irish Jobs http://www.irishjobs.ie/
46 Irish National Teachers Organisation http://www.into.ie/
47 Irish Times (portal) http://www.ireland.com/
48 SIPTU http://www.siptu.ie/
49 Electricity Supply Board http://www.esb.ie/
50 Bord Pleanala http://www.pleanala.ie/
51 ENFO http://www.enfo.ie/
52 Met Eireann http://www.met.ie/
53 Allied Irish Banks PLC http://www.aib.ie/
54 Bank of Ireland http://www.bankofireland.ie/
55 Credit Unions http://www.creditunion.ie/
56 EBS Building Society http://www.ebs.ie/
57 National Irish Bank http://www.nib.ie/
58 Bord Iascaigh Mhara http://www.bim.ie/
59 National Library of Ireland http://www.nli.ie/
60 Irish State http://www.gov.ie/
61 Department of Agriculture Food and Rural Development http://www.gov.ie/daff/
62 Department of Arts Heritage Gaeltacht and The Islands http://www.ealga.ie/
63 Office of the Attorney General http://www.gov.ie/ag/
64 Department of Defence http://www.gov.ie/defence/
65 Department of Education and Science http://www.gov.ie/educ/
66 Department of Enterprise Trade and Employment http://www.entemp.ie/
67 Department of Environment and Local Government http://www.environ.ie/
68 Department of Finance http://www.gov.ie/finance/
69 Department of Foreign Affairs http://www.gov.ie/iveagh/
70 Department of Health and Children http://www.doh.ie/
71 Department of Justice Equality and Law Reform http://www.justice.ie/
72 Department of Marine and Natural Resources http://www.marine.gov.ie/
73 Department of Public Enterprise http://www.gov.ie/tec/
74 Office of the Revenue Commissioners http://www.revenue.ie/
75 Department of Social Community and Family Affairs http://www.dscfa.ie/
76 Department of the Taoiseach http://www.gov.ie/taoiseach/
77 Department of Tourism Sport and Recreation http://www.gov.ie/tourism-sport/
78 Irish Statute Book http://193.120.124.98/
79 Automobile Association (AA) http://www.aaireland.ie/
80 Alzheimer Society of Ireland http://www.alzheimer.ie/
81 Arthritis Foundation of Ireland http://www.arthritis-foundation.com/
82 Beaumont Hospital http://www.beaumont.ie/
83 Central Remedial Clinic http://www.crc.ie/
84 Eastern Regional Health Authority http://www.erha.ie/
85 Health Research Board http://www.hrb.ie/
86 North Western Health Board http://www.nwhb.ie/
87 Rehab Group http://www.rehab.ie/
88 Voluntary Health Insurance (VHI) http://www.vhi.ie/
89 Bupa Ireland http://www.bupaireland.ie/
90 Simon Community of Ireland http://www.simoncommunity.com/
91 Amnesty International Irish Section http://www.amnesty.ie/
92 Irish Council for Civil Liberties http://www.iccl.ie/
93 Central Statistics Office http://www.cso.ie/
94 NTL http://www.ntl.ie/
95 Garda Siochana http://www.garda.ie/
96 Irish Courts Service http://www.courts.ie/
97 Law Society of Ireland http://www.lawsociety.ie/
98 The Institute of Chartered Accountants in Ireland http://www.icai.ie/
99 Cork County Council http://www.corkcoco.com/
100 Dublin City Council http://www.dublincity.ie/
101 Dun Laoghaire/Rathdown County Council http://www.dlrcoco.ie/
102 Fingal County Council http://www.fingalcoco.ie/
103 Kerry County Council http://www.kerrycoco.ie/
104 National Lottery http://www.lotto.ie/
105 Entertainment Ireland http://www.entertainment.ie/
106 Hot Press http://www.hotpress.com/
107 Independent News and Media PLC http://www.independent.ie/
108 Irish Examiner http://www.examiner.ie/
109 Ticketmaster Ireland http://www.ticketmaster.ie/
110 RTE http://www.rte.ie/
111 Today FM http://www.todayfm.com/
112 TV3 http://www.tv3.ie/
113 UTV http://www.utvlive.com/
114 Concern Worldwide http://www.concern.ie/
115 Trocaire http://www.trocaire.ie/
116 Fianna Fail http://www.fiannafail.ie/
117 Fine Gael http://www.finegael.ie/
118 The Labour Party http://www.labour.ie/
119 Progressive Democrats http://www.progressivedemocrats.ie/
120 Green Party of Ireland http://www.greenparty.ie/
121 Sinn Fein http://www.sinnfein.ie/
122 An Post http://www.postoffice.ie/
123 Football Association of Ireland http://www.fai.ie/
124 GAA http://www.gaa.ie/
125 Golfing Union of Ireland http://www.gui.ie/
126 Irish Rugby http://www.irishrugby.ie/
127 Irish Sports Council http://www.irishsportscouncil.ie/
128 Paddy Power Bookmakers http://www.paddypower.com/
129 Eircom http://www.eircom.ie/
130 Vodafone http://www.vodafone.ie/
131 ESAT Business http://www.esat.ie/
132 Digifone http://www.digifone.ie/
133 Aer Lingus http://www.aerlingus.com/
134 Aer Rianta http://www.aer-rianta.ie/
135 Ryanair http://www.ryanair.com/
136 Bus Eireann http://www.buseireann.ie/
137 Dublin Bus http://www.dublinbus.ie/
138 Irish Rail http://www.irishrail.ie/
139 Women's Aid http://www.womensaid.ie/
140 Federation of Irish Scout Associations http://www.scout.ie/
141 Irish Youth Hostel Association http://www.irelandyha.org/
142 National Youth Council http://www.youth.ie/
143 National Youth Orchestra http://www.nyoi.ie/
144 National Centre for Technology in Education http://www.ncte.ie/
145 2003 Special Olympics - World Summer Games http://www.2003specialolympics.com/
146 Research Institute in Networks and Communications Engineering (RINCE) http://www.rince.ie/
147 Science Foundation Ireland http://www.sfi.ie/
148 Tesco Ireland http://www.tesco.ie/
149 Superquinn http://www.superquinn.ie/
150 AXA Insurance http://www.axa.ie/
151 Cara Group http://www.cara.ie/
152 Buy4Now http://www.buy4now.ie/
153 Elan http://www.elan.com/
154 Mentec International http://www.mentec.ie/
155 CRH PLC http://www.crh.ie/
156 Fyffes http://www.fyffes.com/
157 Abbey PLC http://www.abbeyplc.ie/
158 First Active http://www.firstactive.ie/
159 Greencore Group http://www.greencore.ie/
160 Glanbia PLC http://www.glanbia.ie/
161 Horizon Technology Group PLC http://www.horizon.ie/
162 IAWS Group PLC http://www.iaws.ie/
163 Riverdeep http://www.riverdeep.com/
164 Jurys Doyle Hotels http://www.jurysdoyle.com/
165 Irish Life and Permanent PLC http://www.irishlifepermanent.ie/
166 Sherry Fitzgerald Group http://www.sherryfitz.ie/
167 Hamilton Osborne King http://www.cbhok.com/
168 Alphyra Group http://www.alphyra.ie/
169 IONA http://www.iona.com/
170 DELL Ireland http://www.euro.dell.com/countries/ie/enu/gen/default.htm
171 Microsoft Ireland http://www.microsoft.com/ireland/
172 Sun Microsystems Ireland http://www.sun.ie/
173 Compaq Ireland http://www.compaq.ie/
174 I.T. Upgrade http://www.itupgrade.ie/
175 Electronic Business Solutions Ltd. http://www.e-bss.com/
176 Prestige Systems Ltd. http://www.prestige.ie/
177 Webrooms Ltd. http://www.webrooms.ie/
178 Parallel Inc. http://www.parallelit.com/
179 Software Ireland http://www.softwareireland.com/
180 ContentPAL/ERS Solutions Ltd. http://www.contentpal.com/
181 Aro http://www.aro.ie/
182 Golden Pages http://www.goldenpages.ie/
183 RealTaxis.com http://www.realtaxis.com/
184 IrelandTaxi.com http://www.irelandtaxi.com/
185 TaxiCabIreland.com http://www.taxicabireland.com/
186 Club Travel Ltd. http://www.clubtravel.ie/
187 Business Software Alliance Ireland http://www.bsa.org/ireland/
188 Buy and Sell http://www.buyandsell.net/
189 Ireland Offline http://www.irelandoffline.ie/
190 Frontend http://www.frontend.ie/
191 Media One http://www.mediaone.ie/
192 Irish Internet Association http://www.iia.ie/
193 Register.ie http://www.register.ie/
194 IE Domain Registry Limited http://www.domainregistry.ie/
195 Amarach Consulting http://www.amarach.com/
196 Labyrinth http://www.labyrinth.ie/
197 webBusters http://www.webbusters.com/
198 WebFactory http://www.webfactory.ie/
199 AdNet Ltd http://business.adnet.ie/
200 Aran Consulting http://www.aranconsulting.com/
201 Cedar Tree Ireland http://www.cedartreeireland.net/
202 Calyco http://www.calyco.net/
203 Advent Web-Design http://web-design.alturl.com/
204 CodeIsland http://www.codeisland.com/
205 Cobweb http://www.cobweb.ie/
206 The Communications Interactive Agency Ltd. http://www.thecia.ie/
207 DesktopIreland http://www.desktopireland.com/
208 TDH Interactive http://www.tdhinteractive.ie/
209 Futura Software http://www.futurasoft.com/
210 NewMedia http://www.newmedia.ie/
211 Simplyit http://www.simplyit.ie/
212 Blacknight Solutions http://www.blacknightsolutions.com/
213 Mediahouse Internet Services http://www.mediahouse.ie/
214 Cork Web Studio http://www.corkwebstudio.com/

B. bobby Full Support Diagnostics vs. WCAG Checkpoints

The table below shows the complete list of 25 bobby diagnostics, with WCAG mappings and priorities, included in the current survey.

ID Description WCAG Priority WCAG Checkpoint
g9 Provide alternative text for all images. 1 1.1
g21 Provide alternative text for each APPLET. 1 1.1
g20 Provide alternative content for each OBJECT. 1 1.1
g10 Provide alternative text for all image-type buttons in forms. 1 1.1
g240 Provide alternative text for all image map hot-spots (AREAs). 1 1.1
g14 Client-side image map contains a link not presented elsewhere on the page. 3 1.5
g271 Use a public text identifier in a DOCTYPE statement. 2 3.2
g104 Use relative sizing and positioning (% values) rather than absolute (pixels). 2 3.4
g2 Nest headings properly. 2 3.5
g125 Identify the language of the text. 3 4.3
g31 Provide a summary for tables. 3 5.5
g38 Each FRAME must reference an HTML file. 1 6.2
g37 Provide a NOFRAMES section when using FRAMEs. 2 6.5
g4 Avoid blinking text created with the BLINK element. 2 7.2
g5 Avoid scrolling text created with the MARQUEE element. 2 7.3
g33 Do not cause a page to refresh automatically. 2 7.4
g254 Do not cause a page to redirect to a new URL. 2 7.5
g269 Make sure event handlers do not require use of a mouse. 2 9.3
g109 Include default, place-holding characters in edit boxes and text areas. 3 10.4
g35 Separate adjacent links with more than whitespace. 3 10.5
g39 Give each frame a title. 1 12.1
g41 Explicitly associate form controls and their labels with the LABEL element. 2 12.4
g34 Create link phrases that make sense when read out of context. 2 13.1
g265 Do not use the same link phrase more than once when the links point to different URLs. 2 13.1
g273 Include a document TITLE. 2 13.2

C. Template pavuk Configuration File

pavuk-template-scn.txt

D. Failed pavuk Jobs

No data was successfully captured for the sites identified below (sample date: 9th April 2002). Thus, these sites were discarded from further consideration.

Site ID Root URL
8 http://www.astronomy.ie/
49 http://www.esb.ie/
88 http://www.vhi.ie/
95 http://www.garda.ie/
110 http://www.rte.ie/
113 http://www.utvlive.com/
186 http://www.clubtravel.ie/
189 http://www.irelandoffline.ie/
204 http://www.codeisland.com/
212 http://www.blacknightsolutions.com/

E. Unsatisfactory pavuk Jobs

Only one page (corresponding to the "root" or "home" URL) was successfully captured for the sites identified below (sample date: 9th April 2002). Thus, these sites were discarded from further consideration.

Site ID Root URL
5 http://www.modernart.ie/
7 http://www.riam.ie/
20 http://www.cidb.ie/
56 http://www.ebs.ie/
75 http://www.dscfa.ie/
105 http://www.entertainment.ie/
107 http://www.independent.ie/
108 http://www.examiner.ie/
115 http://www.trocaire.ie/
116 http://www.fiannafail.ie/
119 http://www.progressivedemocrats.ie/
149 http://www.superquinn.ie/
156 http://www.fyffes.com/
157 http://www.abbeyplc.ie/
170 http://www.euro.dell.com/countries/ie/enu/gen/default.htm
180 http://www.contentpal.com/
198 http://www.webfactory.ie/
203 http://web-design.alturl.com/
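
The discard rules applied in appendices D and E amount to a three-way classification on the number of pages captured per site. A minimal sketch in Python, assuming the page count for each site is already known; the function name `classify_capture` and the category labels are illustrative only and are not part of the study's tooling or of pavuk itself:

```python
def classify_capture(page_count: int) -> str:
    """Classify a site capture by the number of pages retrieved.

    Hypothetical helper: the labels mirror the appendix titles,
    not any actual pavuk output.
    """
    if page_count <= 0:
        return "failed"          # no data captured (appendix D)
    if page_count == 1:
        return "unsatisfactory"  # only the root page captured (appendix E)
    return "successful"          # retained for further analysis (appendix F)


# Page counts of 3 (site 42) and 79 (site 100) are taken from appendix F.
for pages in (0, 1, 3, 79):
    print(pages, classify_capture(pages))
```

Under this rule a site needs at least two captured pages to stay in the sample.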

F. Successful pavuk Jobs

This shows a summary of the capture data for each site. The site capture phase took place on 9th April 2002, starting at 16:11:49 IST (UT+01) and ending at 17:59:19 IST (UT+01). In general, the content of web sites is dynamic. Thus, the data captured and analysed in this study represents only a "snapshot" of each site taken at that particular point in time.

Site ID Root URL Page Count Size (bytes)
1 http://www.farm.ie/ 10 75831
2 http://www.ispca.ie/ 9 203708
3 http://www.dublinzoo.ie/ 10 199440
4 http://www.artscouncil.ie/ 19 157988
6 http://www.nch.ie/ 10 158025
9 http://www.dunsink.dias.ie/ 14 29458
10 http://www.asai.ie/ 39 203025
11 http://www.chambersireland.ie/ 16 197458
12 http://www.consumerassociation.ie/ 30 200108
13 http://www.odca.ie/ 9 127960
14 http://www.enterprise-ireland.com/ 11 169898
15 http://www.idaireland.com/ 12 195319
16 http://www.ibec.ie/ 28 119981
17 http://www.isme.ie/ 17 165919
18 http://www.sfa.ie/ 8 189953
19 http://www.barnardos.ie/ 18 196225
21 http://www.fightingblindness.ie/ 21 202161
22 http://www.ncbi.ie/ 25 198272
23 http://www.guidedogs.ie/ 16 200252
24 http://www.ahead.ie/ 38 203079
25 http://www.military.ie/ 15 189490
26 http://www.cao.ie/ 10 31510
27 http://www.dcu.ie/ 12 167144
28 http://www.iadt-dl.ie/ 18 29658
29 http://www.dit.ie/ 29 198739
30 http://www.cit.ie/ 29 171011
31 http://www.hea.ie/ 31 202751
32 http://www.ipa.ie/ 14 200658
33 http://www.nala.ie/ 11 195712
34 http://www.ucd.ie/ 15 191952
35 http://www.tcd.ie/ 16 194867
36 http://www.ucg.ie/ 18 202257
37 http://www.ucc.ie/ 41 193219
38 http://www.ul.ie/ 27 169460
39 http://www.lyit.ie/ 44 185649
40 http://www.usi.ie/ 15 198679
41 http://www.asti.ie/ 17 171468
42 http://www.ria.ie/ 3 56112
43 http://www.equality.ie/ 19 203958
44 http://www.fas.ie/ 15 197418
45 http://www.irishjobs.ie/ 15 196197
46 http://www.into.ie/ 13 193871
47 http://www.ireland.com/ 5 163142
48 http://www.siptu.ie/ 15 197846
50 http://www.pleanala.ie/ 34 200293
51 http://www.enfo.ie/ 53 181751
52 http://www.met.ie/ 8 181667
53 http://www.aib.ie/ 6 200450
54 http://www.bankofireland.ie/ 11 202953
55 http://www.creditunion.ie/ 18 175399
57 http://www.nib.ie/ 14 197459
58 http://www.bim.ie/ 11 193989
59 http://www.nli.ie/ 58 203410
60 http://www.gov.ie/ 14 195231
61 http://www.gov.ie/daff/ 21 190129
62 http://www.ealga.ie/ 7 187257
63 http://www.gov.ie/ag/ 14 198928
64 http://www.gov.ie/defence/ 6 168580
65 http://www.gov.ie/educ/ 16 202089
66 http://www.entemp.ie/ 14 197220
67 http://www.environ.ie/ 60 168519
68 http://www.gov.ie/finance/ 11 174612
69 http://www.gov.ie/iveagh/ 34 199084
70 http://www.doh.ie/ 19 186957
71 http://www.justice.ie/ 14 200916
72 http://www.marine.gov.ie/ 5 56746
73 http://www.gov.ie/tec/ 20 193725
74 http://www.revenue.ie/ 34 197121
76 http://www.gov.ie/taoiseach/ 16 163152
77 http://www.gov.ie/tourism-sport/ 11 198469
78 http://193.120.124.98/ 8 15208
79 http://www.aaireland.ie/ 6 176295
80 http://www.alzheimer.ie/ 4 181144
81 http://www.arthritis-foundation.com/ 32 196241
82 http://www.beaumont.ie/ 61 190799
83 http://www.crc.ie/ 24 187199
84 http://www.erha.ie/ 25 201865
85 http://www.hrb.ie/ 23 195638
86 http://www.nwhb.ie/ 14 202604
87 http://www.rehab.ie/ 18 200858
89 http://www.bupaireland.ie/ 9 191989
90 http://www.simoncommunity.com/ 15 198846
91 http://www.amnesty.ie/ 10 116469
92 http://www.iccl.ie/ 18 183689
93 http://www.cso.ie/ 7 201546
94 http://www.ntl.ie/ 23 199781
96 http://www.courts.ie/ 31 187650
97 http://www.lawsociety.ie/ 15 182679
98 http://www.icai.ie/ 27 198312
99 http://www.corkcoco.com/ 15 202565
100 http://www.dublincity.ie/ 79 198461
101 http://www.dlrcoco.ie/ 30 191605
102 http://www.fingalcoco.ie/ 31 202271
103 http://www.kerrycoco.ie/ 5 191993
104 http://www.lotto.ie/ 8 38877
106 http://www.hotpress.com/ 10 189112
109 http://www.ticketmaster.ie/ 6 169254
111 http://www.todayfm.com/ 25 191965
112 http://www.tv3.ie/ 27 200538
114 http://www.concern.ie/ 18 202167
117 http://www.finegael.ie/ 18 190633
118 http://www.labour.ie/ 9 164199
120 http://www.greenparty.ie/ 52 203866
121 http://www.sinnfein.ie/ 12 175791
122 http://www.postoffice.ie/ 16 195519
123 http://www.fai.ie/ 4 201247
124 http://www.gaa.ie/ 39 202806
125 http://www.gui.ie/ 13 186265
126 http://www.irishrugby.ie/ 10 183894
127 http://www.irishsportscouncil.ie/ 16 134632
128 http://www.paddypower.com/ 5 192440
129 http://www.eircom.ie/ 8 194282
130 http://www.vodafone.ie/ 7 182810
131 http://www.esat.ie/ 14 204191
132 http://www.digifone.ie/ 10 203797
133 http://www.aerlingus.com/ 9 201094
134 http://www.aer-rianta.ie/ 32 187908
135 http://www.ryanair.com/ 21 198108
136 http://www.buseireann.ie/ 12 122312
137 http://www.dublinbus.ie/ 14 144877
138 http://www.irishrail.ie/ 12 194265
139 http://www.womensaid.ie/ 56 201457
140 http://www.scout.ie/ 6 26948
141 http://www.irelandyha.org/ 20 197984
142 http://www.youth.ie/ 17 202744
143 http://www.nyoi.ie/ 18 187232
144 http://www.ncte.ie/ 40 185650
145 http://www.2003specialolympics.com/ 4 179703
146 http://www.rince.ie/ 15 86433
147 http://www.sfi.ie/ 43 95288
148 http://www.tesco.ie/ 31 198030
150 http://www.axa.ie/ 6 194207
151 http://www.cara.ie/ 8 182620
152 http://www.buy4now.ie/ 5 174183
153 http://www.elan.com/ 11 186695
154 http://www.mentec.ie/ 7 202059
155 http://www.crh.ie/ 8 178590
158 http://www.firstactive.ie/ 7 190702
159 http://www.greencore.ie/ 21 51128
160 http://www.glanbia.ie/ 9 186409
161 http://www.horizon.ie/ 9 203489
162 http://www.iaws.ie/ 7 196746
163 http://www.riverdeep.com/ 7 184619
164 http://www.jurysdoyle.com/ 27 144563
165 http://www.irishlifepermanent.ie/ 6 190597
166 http://www.sherryfitz.ie/ 11 43378
167 http://www.cbhok.com/ 12 184295
168 http://www.alphyra.ie/ 11 186809
169 http://www.iona.com/ 13 184605
171 http://www.microsoft.com/ireland/ 7 161446
172 http://www.sun.ie/ 13 200867
173 http://www.compaq.ie/ 16 197323
174 http://www.itupgrade.ie/ 9 198521
175 http://www.e-bss.com/ 11 191081
176 http://www.prestige.ie/ 23 202253
177 http://www.webrooms.ie/ 9 106188
178 http://www.parallelit.com/ 9 179096
179 http://www.softwareireland.com/ 10 185383
181 http://www.aro.ie/ 21 196555
182 http://www.goldenpages.ie/ 6 179873
183 http://www.realtaxis.com/ 13 113200
184 http://www.irelandtaxi.com/ 40 198050
185 http://www.taxicabireland.com/ 8 134484
187 http://www.bsa.org/ireland/ 16 199092
188 http://www.buyandsell.net/ 36 194429
190 http://www.frontend.ie/ 11 200780
191 http://www.mediaone.ie/ 11 191977
192 http://www.iia.ie/ 12 194220
193 http://www.register.ie/ 14 203885
194 http://www.domainregistry.ie/ 18 194116
195 http://www.amarach.com/ 19 203880
196 http://www.labyrinth.ie/ 15 193685
197 http://www.webbusters.com/ 26 202207
199 http://business.adnet.ie/ 10 198434
200 http://www.aranconsulting.com/ 30 153500
201 http://www.cedartreeireland.net/ 11 58273
202 http://www.calyco.net/ 7 41875
205 http://www.cobweb.ie/ 10 189102
206 http://www.thecia.ie/ 15 176255
207 http://www.desktopireland.com/ 13 156647
208 http://www.tdhinteractive.ie/ 6 186183
209 http://www.futurasoft.com/ 14 204348
210 http://www.newmedia.ie/ 7 29909
211 http://www.simplyit.ie/ 6 28585
213 http://www.mediahouse.ie/ 11 203548
214 http://www.corkwebstudio.com/ 11 31287

G. Failed bobby Jobs

bobby failed to analyse the following sites. A status value of 1 indicates that bobby terminated abnormally (a Java exception was reported); a status value of 3 indicates that bobby was forcibly terminated because the allowed time was exceeded.

Site ID Root URL Status
13 http://www.odca.ie/ 3
16 http://www.ibec.ie/ 3
26 http://www.cao.ie/ 3
28 http://www.iadt-dl.ie/ 3
38 http://www.ul.ie/ 3
39 http://www.lyit.ie/ 3
46 http://www.into.ie/ 3
54 http://www.bankofireland.ie/ 1
66 http://www.entemp.ie/ 1
67 http://www.environ.ie/ 3
72 http://www.marine.gov.ie/ 3
81 http://www.arthritis-foundation.com/ 3
82 http://www.beaumont.ie/ 1
83 http://www.crc.ie/ 1
84 http://www.erha.ie/ 3
90 http://www.simoncommunity.com/ 1
96 http://www.courts.ie/ 3
98 http://www.icai.ie/ 1
120 http://www.greenparty.ie/ 1
127 http://www.irishsportscouncil.ie/ 1
131 http://www.esat.ie/ 1
145 http://www.2003specialolympics.com/ 1
148 http://www.tesco.ie/ 1
158 http://www.firstactive.ie/ 1
188 http://www.buyandsell.net/ 3
190 http://www.frontend.ie/ 3
210 http://www.newmedia.ie/ 3
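
To see how the failures divide between the two status values, the table above can be tallied directly. The sketch below is illustrative only: the status values are transcribed from the table, while the variable names are assumptions and do not come from the study's scripts.

```python
from collections import Counter

# bobby exit status per site ID, transcribed from the table above.
FAILED_BOBBY_STATUS = {
    13: 3, 16: 3, 26: 3, 28: 3, 38: 3, 39: 3, 46: 3, 54: 1, 66: 1,
    67: 3, 72: 3, 81: 3, 82: 1, 83: 1, 84: 3, 90: 1, 96: 3, 98: 1,
    120: 1, 127: 1, 131: 1, 145: 1, 148: 1, 158: 1, 188: 3, 190: 3,
    210: 3,
}

# Meaning of each status value, as described in the text.
STATUS_MEANING = {
    1: "abnormal termination (Java exception)",
    3: "forcibly terminated (time limit exceeded)",
}

# Count how many sites ended with each status value.
tally = Counter(FAILED_BOBBY_STATUS.values())
for status in sorted(tally):
    print(f"status {status}: {tally[status]} sites, {STATUS_MEANING[status]}")
```

Of the 27 failed jobs, 12 ended with status 1 and 15 with status 3.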

H. Successful bobby Jobs

The table below lists the 159 sites which were successfully captured by pavuk, and for which bobby successfully generated accessibility defect reports. All the results reported are based on this final sample.

Site ID Root URL
1 http://www.farm.ie/
2 http://www.ispca.ie/
3 http://www.dublinzoo.ie/
4 http://www.artscouncil.ie/
6 http://www.nch.ie/
9 http://www.dunsink.dias.ie/
10 http://www.asai.ie/
11 http://www.chambersireland.ie/
12 http://www.consumerassociation.ie/
14 http://www.enterprise-ireland.com/
15 http://www.idaireland.com/
17 http://www.isme.ie/
18 http://www.sfa.ie/
19 http://www.barnardos.ie/
21 http://www.fightingblindness.ie/
22 http://www.ncbi.ie/
23 http://www.guidedogs.ie/
24 http://www.ahead.ie/
25 http://www.military.ie/
27 http://www.dcu.ie/
29 http://www.dit.ie/
30 http://www.cit.ie/
31 http://www.hea.ie/
32 http://www.ipa.ie/
33 http://www.nala.ie/
34 http://www.ucd.ie/
35 http://www.tcd.ie/
36 http://www.ucg.ie/
37 http://www.ucc.ie/
40 http://www.usi.ie/
41 http://www.asti.ie/
42 http://www.ria.ie/
43 http://www.equality.ie/
44 http://www.fas.ie/
45 http://www.irishjobs.ie/
47 http://www.ireland.com/
48 http://www.siptu.ie/
50 http://www.pleanala.ie/
51 http://www.enfo.ie/
52 http://www.met.ie/
53 http://www.aib.ie/
55 http://www.creditunion.ie/
57 http://www.nib.ie/
58 http://www.bim.ie/
59 http://www.nli.ie/
60 http://www.gov.ie/
61 http://www.gov.ie/daff/
62 http://www.ealga.ie/
63 http://www.gov.ie/ag/
64 http://www.gov.ie/defence/
65 http://www.gov.ie/educ/
68 http://www.gov.ie/finance/
69 http://www.gov.ie/iveagh/
70 http://www.doh.ie/
71 http://www.justice.ie/
73 http://www.gov.ie/tec/
74 http://www.revenue.ie/
76 http://www.gov.ie/taoiseach/
77 http://www.gov.ie/tourism-sport/
78 http://193.120.124.98/
79 http://www.aaireland.ie/
80 http://www.alzheimer.ie/
85 http://www.hrb.ie/
86 http://www.nwhb.ie/
87 http://www.rehab.ie/
89 http://www.bupaireland.ie/
91 http://www.amnesty.ie/
92 http://www.iccl.ie/
93 http://www.cso.ie/
94 http://www.ntl.ie/
97 http://www.lawsociety.ie/
99 http://www.corkcoco.com/
100 http://www.dublincity.ie/
101 http://www.dlrcoco.ie/
102 http://www.fingalcoco.ie/
103 http://www.kerrycoco.ie/
104 http://www.lotto.ie/
106 http://www.hotpress.com/
109 http://www.ticketmaster.ie/
111 http://www.todayfm.com/
112 http://www.tv3.ie/
114 http://www.concern.ie/
117 http://www.finegael.ie/
118 http://www.labour.ie/
121 http://www.sinnfein.ie/
122 http://www.postoffice.ie/
123 http://www.fai.ie/
124 http://www.gaa.ie/
125 http://www.gui.ie/
126 http://www.irishrugby.ie/
128 http://www.paddypower.com/
129 http://www.eircom.ie/
130 http://www.vodafone.ie/
132 http://www.digifone.ie/
133 http://www.aerlingus.com/
134 http://www.aer-rianta.ie/
135 http://www.ryanair.com/
136 http://www.buseireann.ie/
137 http://www.dublinbus.ie/
138 http://www.irishrail.ie/
139 http://www.womensaid.ie/
140 http://www.scout.ie/
141 http://www.irelandyha.org/
142 http://www.youth.ie/
143 http://www.nyoi.ie/
144 http://www.ncte.ie/
146 http://www.rince.ie/
147 http://www.sfi.ie/
150 http://www.axa.ie/
151 http://www.cara.ie/
152 http://www.buy4now.ie/
153 http://www.elan.com/
154 http://www.mentec.ie/
155 http://www.crh.ie/
159 http://www.greencore.ie/
160 http://www.glanbia.ie/
161 http://www.horizon.ie/
162 http://www.iaws.ie/
163 http://www.riverdeep.com/
164 http://www.jurysdoyle.com/
165 http://www.irishlifepermanent.ie/
166 http://www.sherryfitz.ie/
167 http://www.cbhok.com/
168 http://www.alphyra.ie/
169 http://www.iona.com/
171 http://www.microsoft.com/ireland/
172 http://www.sun.ie/
173 http://www.compaq.ie/
174 http://www.itupgrade.ie/
175 http://www.e-bss.com/
176 http://www.prestige.ie/
177 http://www.webrooms.ie/
178 http://www.parallelit.com/
179 http://www.softwareireland.com/
181 http://www.aro.ie/
182 http://www.goldenpages.ie/
183 http://www.realtaxis.com/
184 http://www.irelandtaxi.com/
185 http://www.taxicabireland.com/
187 http://www.bsa.org/ireland/
191 http://www.mediaone.ie/
192 http://www.iia.ie/
193 http://www.register.ie/
194 http://www.domainregistry.ie/
195 http://www.amarach.com/
196 http://www.labyrinth.ie/
197 http://www.webbusters.com/
199 http://business.adnet.ie/
200 http://www.aranconsulting.com/
201 http://www.cedartreeireland.net/
202 http://www.calyco.net/
205 http://www.cobweb.ie/
206 http://www.thecia.ie/
207 http://www.desktopireland.com/
208 http://www.tdhinteractive.ie/
209 http://www.futurasoft.com/
211 http://www.simplyit.ie/
213 http://www.mediahouse.ie/
214 http://www.corkwebstudio.com/
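
The final sample size of 159 can be cross-checked against the row counts of the preceding appendices. A short sketch of that arithmetic follows; the starting figure of 214 is an assumption based on the highest site ID in the site list that precedes appendix B, and the constant names are illustrative:

```python
TOTAL_SITES = 214          # highest site ID in the site list (assumed to be the full sample)
FAILED_PAVUK = 10          # rows in appendix D
UNSATISFACTORY_PAVUK = 18  # rows in appendix E
FAILED_BOBBY = 27          # rows in appendix G

# Sites with a usable multi-page capture (appendix F).
captured = TOTAL_SITES - FAILED_PAVUK - UNSATISFACTORY_PAVUK

# Sites for which bobby also produced a report (appendix H).
final_sample = captured - FAILED_BOBBY

print(captured, final_sample)
```

This gives 186 sites after the capture phase and 159 after the bobby analysis, matching the figure quoted above.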

Footnotes

1. Much of the text in this section is derived from a previous short article (McMullin, 2002).
2. Technically, the web is just one specific service hosted on an underlying communications network, which is the Internet. However, given that the web is by far the most familiar Internet service, and often now provides the primary user interface to other services, I will generally not distinguish between them here.
3. However, to be properly effective, this technique does rely on the author/designer of the web site having a clear understanding of the need for, and the appropriate use of, such alternative text (Flavell, 2002).
4. Note that this is, in effect, 200 KBytes of text--as opposed to graphics or other content. It corresponds to roughly 70 pages of print, per site, and is therefore quite substantial.
5. ASP: Application Service Provider. See also: http://whatis.techtarget.com/definition/0,289893,sid9_gci213801,00.html
6. There are an additional 3 bobby diagnostics which do not relate to the WCAG guidelines, but only to the requirements of Section 508 of the US Rehabilitation Act; since this act does not apply in the Irish jurisdiction, these diagnostics were excluded from the current study, and will not be discussed further.
7. This is slightly different from the average data transfer rate because some additional data was "transferred" but not "captured"; this can arise where a transfer is initiated, but subsequently aborted for some reason--for example, if the site transfer quota is exceeded.
8. It should be noted that bobby is not distributed in source form; this makes further investigation in cases such as this problematic.
9. Flavell (2002) provides a more extended discussion and tutorial.
10. This issue is also discussed, in a somewhat different context, in section 2.6.2.

Page Administrative Information

Maintainer: eaccess@rince.ie