Ensuring Right Information on the Right Person(s)
Legal Controls of the Quality of Personal Information
Lee A Bygrave(1)
Section 1: Introductory remarks
This paper is in two parts. Part I attempts to establish a framework for analysing the notion of quality in relation to personal information. Part II evaluates the functionality of certain legal controls of the quality of personal information, particularly those controls found in legislation on privacy and data protection.
In this paper, the notion of quality refers, in short, to various characteristics or attributes of personal information which bear on the worth of the latter for given purposes and given persons. These characteristics are broken up into several categories. First, there is a group of characteristics that relate primarily to the data upon which information is based. Secondly, there is a set of characteristics concerned essentially with the uses to which data are put as information. A third group of characteristics relate to the systems that are constructed and used for information processing. Thus, in this paper, the notion of information quality is viewed as embracing the quality of both data and information systems.
The concepts of data, information and information system are employed here in basically the same way as they are defined by the Organisation for Economic Co-operation and Development (OECD) in Part III of its Guidelines for the Security of Information Systems (Paris: OECD, 1992). Thus, the term "data" denotes signs or characters which represent facts, concepts, processes or instructions in a formalised manner and which are suitable for communication, interpretation or processing by human beings or by automatic means. "Information" refers to the meaning assigned to data by way of conventions applied to the data. "Information system" means computer and communication facilities and networks, and data and information that may be stored, processed, retrieved or transmitted by them, including programs, specifications and procedures for their operation, use and maintenance.
Each of the concepts of data, information and information system is both complex and diffuse. In practice, the distinction between data and information is often difficult to draw, as are the borderlines between one information system and another such system. Moreover, the above definitions of these concepts are not accepted by all scholars and practitioners.(2) Nevertheless, the definitions given above would appear to have a broad, cross-national following, and they serve as useful points of departure for the purposes of this paper.
In relation to any information system, there are at least four kinds of actors. For the purposes of this paper, these are termed "data controllers", "data processors", "data users" and "data subjects". A data controller is the person or organisation which is in charge of an information system and hence determines the purposes and means of the processing of data in that system. A data processor is the person or organisation which actually carries out the processing (including collection, registration and storage) of data. A data user is a person or organisation which takes data in an information system and applies these for various purposes. A data subject is the person, organisation or object to which the data and information in an information system are supposed to relate. Again, in practice, the distinction between these categories of actors is not hard and fast. In relation to many information systems, the data controller functions also as the data processor and chief data user. Further, data controllers, data processors and data users are by and large data subjects in relation to other information systems.
As suggested by its title, this paper is not concerned with all kinds of information, only personal information; ie information which may be linked to identifiable persons (either physical/natural or legal/juristic). Nevertheless, much of the discussion in the paper is of relevance for all kinds of information.
The paper is divided into six sections. Section 1 outlines the paper's points of departure. In section 2, the notion of information quality is described and defined. Section 3 provides a brief overview of empirical studies of the quality of information, while section 4 lists factors that affect such quality. An outline of various legal rules pertaining to the quality of information is given in section 5. Finally, in section 6 some suggestions for legal reform are considered which may bring about improvements to information quality.
It should be noted that references to Norwegian practices, policy and law predominate in the paper. This is due largely to the fact that the paper is based on research carried out at the Norwegian Research Centre for Computers and Law, Oslo, where I have had ready access to Norwegian materials. Nevertheless, similar sorts of practices, policies and laws are also found in other countries, and the discussion in the paper should have relevance outside Norway.
Many areas of law (eg on negligence, defamation and/or judicial review of administrative decision making) have long been concerned with aspects of the quality of information. Traditionally, however, the issues with which they deal have not been explicitly framed in terms of the concept of quality. Indeed, there appears to be little literature in the field of law dealing specifically and expressly with the quality of information. Surprisingly, there also seems to be little such literature in the fields of informatics and computer science. In his doctoral thesis completed in 1972, Kristo Ivanov noted that literature on computer and information science showed little concern for the subject of data/information quality. The literature also revealed little agreement on how to define data/information quality as a concept and on how to measure it.(3) Some would argue that this situation has not changed significantly in the meantime. At an international conference on the topic of information quality organised in 1989 at the Royal School of Librarianship in Copenhagen, one of the invited guest speakers observed:
In the age of information it is a profound irony that there is a lack of solid body of theoretic work on the quality and value of information. This area of knowledge lacks synthesis or even a compendium to bring together the theoretical studies.(4)
Similarly, Christopher Fox et al note that
[d]espite the importance of data quality, it has received little attention. [...] No common framework for studying data quality problems, nor an agreed-on terminology for discussing data quality, has emerged from the modest efforts to date.(5)
Claims have also been made that organisations have given little attention to the quality of the information that they collect, process and store. For example, Donald Marchand has written in relation to the private sector:
For better or worse, managers exhibit a tendency to take information quality for granted as they navigate through the many sources of information available to them. Information overload seems to be a more pressing concern than the relative degree of information quality which their information sources represent. When executives do, on occasion, perceive that a failure in decision-making is due to the poor quality of the information they used, they tend to treat the occurrence as exceptional or anecdotal, rather than as a deficiency in either the information services, sources or systems which are available to them in the company. That is, many executives would not consider a deficiency in information quality as a problem worthy of more systematic management attention!(6)
Although Marchand is commenting on managerial attitudes in the private sector, the findings (outlined in section 3 of this paper) of various studies of the quality of information stored by governmental agencies would seem to suggest that Marchand's comments probably are applicable also to many bureaucrats working in the public sector.
In many countries, efforts aimed at improving computer and communications security seem to have focused traditionally on safeguarding the confidentiality of information; ie they have concentrated on ensuring that information is not disclosed to unauthorised persons. Relatively little attention appears to have been given to safeguarding other aspects of security, such as ensuring that information is correct, complete and relevant in relation to the purposes for which it is used, and/or that data are protected from being altered, damaged or destroyed in an unauthorised manner. For example, a report issued in 1991 by the US National Research Council noted that initiatives by the US government to enhance computer and communications security "have related largely to preserving national security and, in particular, to meeting one major security requirement, confidentiality (preserving data secrecy)".(7) The report went on to state that these initiatives "have paid little attention to the other...major computer security requirements, integrity (guarding against improper data modification and/or destruction) and availability (enabling timely use of systems and the data they hold)".(8)
Similar findings have been made in relation to the Nordic countries. In 1993, the Nordic Council of Ministers instituted a study of the information security practices and needs of selected Nordic governmental institutions in the civil sector. The study report concluded, amongst other things, that measures taken by these institutions to safeguard the confidentiality of information were generally quite strong, whereas measures taken to safeguard the quality of information were relatively weak.(9)
Nevertheless, in recent years, steps have begun to be taken to pay greater attention to information quality. The two reports referred to immediately above are evidence of this new concern. In particular, the study initiated by the Nordic Council of Ministers sought specifically to highlight the importance of securing appropriate quality of information. In addition, systematic empirical studies of the quality of information held by selected organisations have been instituted in recent years,(10) some governmental information policies have started explicitly to address the need for better quality assurance of information,(11) and new legal rules and regulations on the topic have been passed or proposed.(12)
The recent trend to pay more attention to information quality would seem to be fueled by several factors. One major factor is that there is a growing tendency on the part of organisations to look upon information as constituting a valuable resource. The concept of "information resource(s) management" (IRM), for example, has become one of the more popular elements in current managerial policy making. IRM treats information as a resource of considerable administrative, political and economic value which should be managed and controlled like other important resources.(13) Closely linked to this perspective is the notion that information constitutes a resource that can and should be shared between various bodies.(14) Practical application of this notion manifests itself in policies and practices aimed at increasing the transfer, exchange, use and re-use of information across organisational boundaries.(15) Such policies and practices are motivated by several concerns, including a belief that they will lead to greater intra- and inter-organisational efficiency, better service for organisations' respective sets of clients, and/or larger financial profits for the organisations involved. The expectation of increased profitability reflects the fact that there is a rapidly growing market in information services, a market in which information can be bought and sold for significant financial sums. Given the greater importance being attributed to information for organisational efficiency, service and profitability, it is obvious that the quality of information also will assume greater importance.
A second major factor is that there is an accumulating body of empirical evidence to suggest that, in many cases, the increased transfer, exchange, use and re-use of information do not lead to gains in organisations' efficiency, service and/or profitability because the quality of the information concerned is inadequate.(16) This evidence serves to highlight the need for greater attention to be devoted to safeguarding and improving information quality.
A third factor is that there is a growing number of countries that have enacted privacy and data protection laws over the last two decades. One of the central principles set out in these laws is that organisations which collect, register and process personal information (and/or data) should take steps to ensure that such information is correct, complete and relevant in relation to the purposes for which it is collected, registered and processed.(17) Of course, it is difficult to know with certainty what the practical effect of these laws actually is; it may be that many organisations that are supposed to comply with the laws do not do so, for reasons of ignorance, apathy, indifference and/or an attitude that compliance is too burdensome. Nevertheless, it is fairly clear that such laws reflect a realisation on the part of legislators and members of the public that the quality of personal information is of great importance for the well-being, integrity and privacy of the data subject(s).
Section 2: Defining the notion of information quality
The notion of quality has various connotations. It can function as an outright value judgement, characterising something as having attained a high degree of accomplishment, worth or excellence (eg one can say: "This chair is a quality product"). It also can function more loosely to denote any measure of accomplishment, worth or excellence (eg "This chair is of good quality; that chair is of bad quality"). Alternatively, the notion can be used to denote simply a particular attribute of something (eg "One quality of this chair is that it is made of mahogany").(18)
It is apparent, then, that the notion has both objective and subjective dimensions. The objective dimension is connected to the properties, content or construction of an object, including the relation of the latter to other objects; the subjective dimension to persons' opinion(s) of the value of an object, particularly in light of their own expectations, desires and needs. Of course, the distinction between the two dimensions is not sharp, and the properties, content or construction of an object will determine to a large extent how the object is assessed in terms of its worth and utility.
In the burgeoning literature on so-called "quality assurance" and "quality management", the notion of quality is usually defined in a way that combines both the subjective and objective dimensions of the term. Thus, Erik Jersin, for example, in his book on quality assurance, defines the quality of a product in terms of the product's ability to satisfy the product user's needs, desires, requirements and expectations, and he notes that this definition is tied to the user's own assessment of the finished product.(19) The definition of quality given by the International Organization for Standardization (ISO) is along similar lines. For the ISO, quality refers to the "totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs".(20)
Although the ISO definition of the notion of quality has been criticised for being somewhat narrow,(21) it seems, nevertheless, to sum up in a concise and analytically serviceable way a notion that is otherwise vague and difficult to grasp. Moreover, it combines both the objective and subjective dimensions of the notion which are described above. In my view, it also serves as a suitable point of departure for defining the notion of quality in relation to information.
Accordingly, for the purposes of the following discussion in this article, I employ the notion of quality as referring to various characteristics or attributes of information which bear on the worth of the latter for given purposes and given persons. This notion of quality encompasses also attributes or characteristics of the relationship between various sets of data, information and information systems. At the same time, though, I do not employ the notion of quality per se to express a judgement about the actual value, worth or utility of information relative to other information.
The definition of quality advanced in this article is hardly radical or pioneering. Most of the studies I have come across which broach the topic of data/information quality have advanced a similar definition. In Sweden, for instance, a range of official reports define information quality in terms of the usefulness of information for a given problem and a given user.(22) In Norway, similar definitions of data/information quality have been embraced, either explicitly or implicitly.(23)
The next issue that needs to be tackled concerns identifying and defining more closely the attributes or characteristics of information which are encompassed by the term "quality". As mentioned above, the attributes or characteristics which are of interest here are those that bear on the worth and utility of information for given purposes and given persons. Obviously, there is a large number of such attributes or characteristics, and it is difficult (if not impossible) to specify them all. In the following, I set out and describe the most central of these attributes or characteristics. My approach builds and expands upon a range of previous analyses of the notion of data/information quality, particularly the classificatory scheme adopted by the Swedish Committee for Informational-Technical Standardisation in its report, Terminologi för Informationssäkerhet.(24)
To begin with, there is a class of attributes or characteristics which concern the extent to which data correspond with the persons, facts, concepts, instructions or processes which they (the data) are supposed to represent. For the sake of brevity and convenience, these persons, facts, concepts, instructions and processes may be described as Real World Objects (RWO). The correspondence between a set of data and an RWO may be summed up in terms of the validity of the data. Three chief dimensions of such validity are:
(i) the precision of the data (ie the level of detail at which the data describe or define the RWO);
(ii) the completeness of the data (ie the extent to which all data that are necessary to represent an RWO are present in a given information system); and
(iii) the correctness of the data (ie the degree to which the data accurately correspond to the RWO).
An important aspect of the second dimension (completeness) is the identifiability of the data (ie the extent to which the data are able to be connected to the RWO they are supposed to represent). An important aspect of the third dimension (correctness) is the currency, actuality or up-to-dateness of the data (ie the "age" of the data measured in terms of the time difference between when the data are used for a given purpose and when the data first were collected and stored).
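By way of illustration only, the dimensions of completeness, correctness and currency described above might be measured along the following lines. This is a minimal sketch and not a method drawn from the literature cited in this paper: the field names, the list of required fields and the existence of a trusted reference description of the RWO are all assumptions made for the example. Precision is left out, since it depends on what level of detail a given purpose demands.

```python
from datetime import date

# Hypothetical list of fields deemed necessary to represent the RWO (an assumption).
REQUIRED_FIELDS = ("name", "birth_date", "address")


def completeness(record, required=REQUIRED_FIELDS):
    """Share of the required fields that are present and non-empty."""
    present = sum(1 for f in required if record.get(f) not in (None, ""))
    return present / len(required)


def correctness(record, reference):
    """Share of the non-empty recorded fields that match a trusted reference.

    Assumes such a reference exists; in practice it rarely does.
    """
    checked = [f for f in record if f in reference and record[f] not in (None, "")]
    if not checked:
        return None  # nothing could be compared
    matches = sum(1 for f in checked if record[f] == reference[f])
    return matches / len(checked)


def age_in_days(collected_on, used_on):
    """Currency: time elapsed between first collection and use of the data."""
    return (used_on - collected_on).days


# Invented example data.
record = {"name": "Kari Nordmann", "birth_date": "1960-05-17", "address": ""}
reference = {"name": "Kari Nordmann", "birth_date": "1960-05-17", "address": "Storgata 1"}

print(completeness(record))                                # two of three required fields present
print(correctness(record, reference))                      # recorded fields agree with the reference
print(age_in_days(date(1995, 1, 1), date(1995, 3, 1)))     # days between collection and use
```

The example record scores well on correctness yet poorly on completeness, which suggests why the two dimensions are best kept apart.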
All of the above attributes or characteristics relate directly to the content of data. As such, they may be described as aspects of "data quality" (even though I place them under the broader notion of information quality). Other important attributes or characteristics which relate to data quality may be summed up in terms of the integrity, consistency and interpretability of data. By "integrity" is meant the extent to which data remain free from being modified or destroyed whilst being collected, stored and/or disseminated. By "consistency" is meant the degree to which data that are supposed to represent the same RWO do so in a non-contradictory way. The "interpretability" of data refers quite simply to the extent to which data may be understood (ie become information). An essential component of interpretability is, of course, the presentation and form of the data (ie the way in which data appear).
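Integrity and consistency, as defined above, lend themselves to simple mechanical checks. The following is a rough sketch under stated assumptions: the record layout is invented, and the use of a SHA-256 digest is merely one common way of detecting that data have changed. It does not show whether a change was authorised, nor which of two conflicting values is correct.

```python
import hashlib
import json


def fingerprint(record):
    """Stable SHA-256 digest of a record, computed over a canonical JSON form."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def conflicts(record_a, record_b):
    """Fields held in both records whose values contradict each other."""
    shared = record_a.keys() & record_b.keys()
    return sorted(f for f in shared if record_a[f] != record_b[f])


# Invented example: the same RWO as held at storage time and as later retrieved.
stored = {"id": "0001", "address": "Storgata 1"}
digest_at_storage = fingerprint(stored)
retrieved = dict(stored, address="Storgata 2")

print(fingerprint(retrieved) == digest_at_storage)  # False: the data were altered
print(conflicts(stored, retrieved))                 # ['address']
```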
Also related to data quality are a set of attributes or characteristics of the way in which data are registered in a given information system. These attributes or characteristics could be classified under the rubric of "registration quality". They include the following: (i) the extent to which each RWO that is supposed to be registered in a given information system, actually is registered in that system; (ii) conversely, the degree to which entities that are not supposed to be registered in the system are not in fact registered; and (iii) the degree of mistaken double or multiple registration of an RWO in a given information system.(25)
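The three aspects of registration quality listed above can be expressed as simple ratios and counts. The sketch below is illustrative only; it assumes that each RWO carries a unique identifier and that the set of RWOs which ought to be registered is known, which will often not be the case in practice. The function name and the identifiers are hypothetical.

```python
from collections import Counter


def registration_quality(registered_ids, target_ids):
    """Compute coverage, over-registration and duplicate counts for a register.

    registered_ids: identifiers as they appear in the register (may repeat).
    target_ids: identifiers of the RWOs that are supposed to be registered.
    """
    registered = list(registered_ids)
    target = set(target_ids)
    distinct = set(registered)
    counts = Counter(registered)
    return {
        # (i) share of the RWOs that should be registered and actually are
        "coverage": len(distinct & target) / len(target) if target else None,
        # (ii) share of registered entities that should not be in the register
        "over_registration": len(distinct - target) / len(distinct) if distinct else None,
        # (iii) number of surplus entries caused by double or multiple registration
        "duplicate_entries": sum(n - 1 for n in counts.values()),
    }


# Invented example: "b" is registered twice, "x" should not be registered,
# and "c" and "d" are missing from the register.
result = registration_quality(["a", "b", "b", "x"], ["a", "b", "c", "d"])
print(result)
```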
Another set of attributes or characteristics relate primarily to the uses to which data are put as information. They concern essentially the correspondence between information (ie data that are interpreted and understood) and the purpose(s) for which the information is collected, stored and/or used. This correspondence may be summed up in terms of the utility of the information. The two central facets of such utility are the relevance and completeness of the information.
The notion of completeness is easy to define; it simply refers to the extent to which all relevant information is present in relation to a particular application. The notion of relevance, however, is difficult to describe in the abstract and without resorting to circular definitions that refer to concepts (such as pertinence, suitability or conformity) that are equally hard to define. Nevertheless, one can measure the hypothetical degree to which a given set of information is relevant to a given application, in terms of the extent to which the outcome of the application would differ according to whether or not the information is taken into account.
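The measure of relevance suggested above can be sketched as follows. This is only one possible way of giving the idea concrete form: the decision rule, the field names and the treatment of a withheld item (here, a missing income is treated as zero) are all assumptions made for the example.

```python
def relevance(decide, cases, field):
    """Share of cases in which the outcome changes when `field` is withheld."""
    changed = 0
    for case in cases:
        without = {k: v for k, v in case.items() if k != field}
        if decide(case) != decide(without):
            changed += 1
    return changed / len(cases)


def grants_benefit(case):
    """Hypothetical rule: a benefit is granted if income is below a threshold.

    A missing income is treated as zero (an assumption of this sketch).
    """
    return case.get("income", 0) < 100_000


# Invented cases.
cases = [
    {"income": 50_000, "hair": "brown"},
    {"income": 150_000, "hair": "red"},
    {"income": 90_000, "hair": "fair"},
    {"income": 200_000, "hair": "brown"},
]

print(relevance(grants_benefit, cases, "income"))  # outcome changes in half of the cases
print(relevance(grants_benefit, cases, "hair"))    # outcome never changes
```

The result depends heavily on how the application handles a missing item, so the figure is best read as relative to a particular application and not as a property of the information in itself.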
There are several factors that assist in determining the relevance of information. One such factor may be summed up in terms of the cognitive authority of the information (ie the "weight" put on the information because of its perceived credibility and reliability).(26) Another factor may be summed up in terms of the information's legality (ie the extent to which the use of the information for a particular purpose is legally permitted). Neither factor, however, is a necessary or a sufficient condition for relevance. Information may still be deemed relevant to a given purpose even if it has little cognitive authority or its application is illegal.
Next, there is a set of attributes or characteristics which relate first and foremost to the quality of the systems that are constructed and used to collect, store, process and disseminate data (and thereby information). Such attributes or characteristics include the manageability, robustness, accessibility, reliability and comprehensibility of information systems.
The "manageability" of an information system (IS) refers to the degree to which the IS - and interactions between the IS and other systems - can be steered, administered and maintained in a desired manner. It also refers to the extent to which the IS operates on the basis of a clear allocation of responsibilities for defining, registering, storing, rectifying and disseminating the data handled by it.
The "robustness" of an IS refers to the degree to which the system is (in)vulnerable to extraneous interference, while "accessibility" relates to the extent to which an IS allows data to be located and retrieved. The latter attribute or characteristic covers both the practical/physical ease with which data can be located and retrieved, and the time it takes to locate and retrieve the data.
The "reliability" of an IS relates to the extent to which the system functions in accordance with the expectations of those who use it and those who are affected by it. This attribute or characteristic includes the capacity of the system to protect the integrity and retrievability of the data it handles. It also embraces the degree to which the system takes account of the levels of random error and bias (ie "systematic" error) with which it operates.
Finally, the "comprehensibility" of an IS relates to the degree to which the system hinders or promotes understanding of the way in which it functions. By "understanding" is meant not just the understanding of the persons or organisations which are responsible for operating the system, but also the understanding of persons or organisations which are affected by the system (eg as data subjects). Furthermore, the "comprehensibility" attribute embraces the capacity of the IS to promote or hinder understanding of the data it handles.
The very last point builds upon the work of Kristo Ivanov, who usefully argues that the quality of an IS involves the capacity of the system for "taking account of alternative contradictory assessments [of data] or building in possibilities of indicating margins of uncertainty or indicating when classifications and definitions are inapplicable or...indicating when a whole database has to be closed down".(27) In short, Ivanov is pointing to the ability of an IS to test aspects of the quality of its data itself, and communicate the results of such tests to those controlling and using the system. To take this point a step further, and using the notion of quality to express a degree of worth/accomplishment, it may be claimed, for example, that an IS which reveals, and/or allows account to be taken of, genuinely ambiguous data, has a higher level of quality than a system which presents the same data as being apparently unambiguous. This measure of quality, though, must be related also to the purposes for which the data in question and the information system are used. It could be, for instance, that the system which presents ambiguous data as being unambiguous is used only for purposes that do not require account to be taken of the ambiguity.
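The capacity which Ivanov describes can be illustrated by a data item that retains every assessment made of it and reports ambiguity instead of silently selecting one value. The following is a minimal sketch of that idea only; the class name, the sources and the values are hypothetical, and the sketch does not purport to represent Ivanov's own proposals.

```python
from dataclasses import dataclass, field


@dataclass
class AssessedItem:
    """A data item that keeps every assessment it has received."""

    assessments: list = field(default_factory=list)  # (source, value) pairs

    def add(self, source, value):
        self.assessments.append((source, value))

    def distinct_values(self):
        # Values are compared as text so that mixed types can be sorted.
        return sorted({str(value) for _, value in self.assessments})

    def is_ambiguous(self):
        return len(self.distinct_values()) > 1

    def report(self):
        """Present the item to the user, making any contradiction visible."""
        values = self.distinct_values()
        if not values:
            return "no data"
        if len(values) == 1:
            return values[0]
        return "ambiguous: " + " / ".join(values)


# Invented example: two sources give contradictory assessments.
marital_status = AssessedItem()
marital_status.add("population register", "married")
marital_status.add("tax return", "separated")

print(marital_status.report())  # ambiguous: married / separated
```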
It should be emphasised that the five attributes or characteristics of an IS set out above should not be understood as hard and fast categories, nor as being entirely separate from each other. There is considerable overlap between, say, the robustness and reliability of an IS, and between its manageability and comprehensibility.
The terminology employed in the above presentation is not used uniformly by persons or organisations working in the fields of computer and information science or in the fields of data protection and information security. For example, Clark, Holloway and List use the term "integrity", rather than "quality", to denote the "completeness, correctness, accuracy and timeliness of data and information derived from the data".(28) Similarly, the OECD's Guidelines for the Security of Information Systems (Part III) also employ the notion of integrity to refer to "the characteristic of data and information being accurate and complete and the preservation of accuracy and completeness". This paper, however, uses the notion of integrity as a sub-element of the notion of quality, to refer simply to what the OECD's Guidelines describe as the "preservation of accuracy and completeness". Other persons and organisations working in the field of information security also use the term "integrity" in this more restricted sense.(29)
Many scholars and technologists do not distinguish between "data quality" and "information quality", but use the one term to refer to both concepts. I also use one term (information quality) to embrace the quality of both data and information systems; I do so for the sake of brevity and convenience. In the presentation above, however, I show that a distinction can be drawn between data quality, information quality and information systems quality. I observe this distinction largely in order to draw out the multi-faceted nature of the topic. In practice, however, each type of quality is related to, and affected by, the other two types of quality.(30) It is clear that the quality of data affects the quality of information, which affects, in turn, the quality of the information system concerned. Moreover, it is obvious that the quality of the IS also affects the quality of the data and information processed by the system.
Finally, it should be noted that all three types of quality are affected, in turn, by the understanding, motivations and worldview of the data controller/processor/user. All information is created and processed on the basis of certain models or views of the world. Such models help determine how a particular problem is understood, and, accordingly, which information is deemed necessary (relevant) for tackling it. In the words of Dionysios Tsichritzis and Frederick Lochovsky, a model "allows us to see the forest (information content of the data) as opposed to the trees (individual values of data)".(31) Often these models will be organised formally.(32) Sometimes, though, it is difficult to discern all of their contours in detail, as they remain latent rather than expressly formulated. Nevertheless, some of their contours will be found in the organisation of an IS, which tends both to embody and shape such models. The important point here, however, is that the quality of any information (and any data or IS) can only ever be fully assessed in the light of the models upon which the information is based. Moreover, poor conceptualisation of a problem (or what might be termed poor model quality) will tend to result in poor interpretation and application of the information which is processed to address the problem.(33)
Section 3: Empirical studies of information quality
In Norway, there is a paucity of comprehensive empirical studies of the quality of information held in major information systems. There is also a paucity of thorough surveys of the degree to which customers and users of these information systems are satisfied with the systems' performance. Concomitantly, it is not surprising to find that a comprehensive strategy for defining, measuring and securing adequate quality in relation to data, information and information systems has yet to be developed by Norwegian authorities.(34)
This is not to say that there is a complete absence in Norway of empirical studies of information quality, but the focus of these studies so far has been relatively narrow. In the following, I briefly describe the results of some of these studies. My list is not exhaustive, and it includes only studies the results of which have been made generally available to the public.
In 1995, the Norwegian Directorate of Public Management (Statskonsult) initiated a study of data quality in three large data registers: the Central Population Register (Det sentrale folkeregisteret - hereinafter termed the DSF-register), the Employer/Employee Register (Arbeidsgiver-/Arbeidstakerregisteret - hereinafter termed the AA-register) and the Property/Address/Building Register (Grunn-/Adresse-/Bygningsregisteret - hereinafter termed the GAB-register). However, the empirical survey of data quality in these registers was far from comprehensive. Not all data in the registers were examined; neither was there a thorough study done of the accuracy of the data. It appears from the study report that the empirical survey of data quality had demanded more resources than the Directorate had anticipated.(35) If this is the case, one may query why the necessary resources for undertaking a more comprehensive survey were not made available, particularly given the paucity of such studies in Norway, along with the administrative and economic importance of the data in the three registers concerned.
Nevertheless, the study was able to identify several problems with each of the three registers. The basic problem identified with the GAB-register was delays (of up to several months) in registering data. Unsatisfactory data quality in the register was found to lead to loss of income for the State Mapping Authority (Statens Kartverk), which administers the register, due to lack of external interest in using the data.(36) Users of the DSF-register were found to complain that central concepts, such as "address" and "relocation", are defined in the register differently to the way they are defined in other registers.(37) The study report indicates also that approximately 5.5 percent of persons do not reside at the addresses they are registered as residing at in the DSF-register.(38) As for the AA-register, the study found that 70 percent of data registrations are delayed (on average by two months), causing around ten percent of data in the register to be out-of-date.(39) The register was also found to include data which are not supposed to be registered, while data which are supposed to be registered were either missing or given an incorrect identification code.(40) These and other discrepancies were found to reduce severely the utility of the register for producing reliable statistics and managing the labour market (eg through matching of the data in the register with data in other registers).
The report concluded by recommending, inter alia, that all administrators of information systems develop routines for securing and assessing the quality of data and information in their respective systems. It was also recommended that assessments be made of the costs caused by incorrect data, both for administrators and users of information systems. The report suggested additionally that steps be taken to assess the adequacy of the legal requirements for data quality in relation to each information system, especially the DSF-register.
Another empirical study of information quality was undertaken by the Norwegian Institute of Technology's Foundation for Scientific and Industrial Research (Stiftelsen for industriell og teknisk forskning (SINTEF)) and the Norwegian Institute for Hospital Research (Norsk Institutt for sykehusforskning (NIS)). This study examined the quality of anonymised data on patients at public hospitals in Norway during the period 1986-1992.(41) The study found that the quality of these data (measured in terms of completeness, accuracy, precision and currency, but not relevance) improved during this period, and that the extent of alarming discrepancies was minor. Nevertheless, Jørgensen reports that the study was not sufficiently comprehensive to provide a fully reliable picture of the quality of the data.(42) He reports also that there is still room for improvement of hospitals' quality control measures.(43)
Dag Wiese Schartum carried out a path-breaking study in the mid-1980s of three computerised information systems operated by public administrative agencies in Norway.(44) Schartum's study focused upon the legal correctness of computer program code developed for calculating unemployment benefits, student loans and sickness benefits, respectively. Schartum discovered a variety of instances of program code embodying fallacious or doubtful interpretations of the relevant law. Moreover, he found a lack of clear, detailed and non-misleading documentation explaining the development of the information systems concerned and the links between the relevant legal rules and the program code embodying these rules. Thus, in addition to identifying problems with the (legal) correctness of certain program code, Schartum's research highlighted serious inadequacies with respect to the comprehensibility, reliability and, ultimately, manageability of automated case-processing systems used by central Norwegian administrative agencies, both from the perspective of the affected data subjects and the data controllers. Regrettably, his research has not led as yet to similar, more wide-ranging studies being made of other major information systems in Norway.
The Norwegian Directorate for Taxation (Skattedirektoratet) undertook several examinations in 1987-88 of the quality of data on taxpayers supplied by employers, banks, insurance companies and other third parties for the purposes of income tax assessment. The examinations revealed considerable delays in the delivery of many of these data.(45) In addition, large proportions of the data - particularly those supplied by employers - were found to lack the necessary identification codes for linking the data correctly to the persons they concern.
While the taxation authorities have since made a concerted attempt to fix these weaknesses - and appear to have been largely successful in doing so(46) - some problems may remain. This is evidenced, for example, by a study undertaken by Even Harket in 1995-96 of the process by which funds management companies supply data on taxpayers' share holdings to the taxation authorities for the purposes of income tax assessment. The study found, inter alia, that some of these share data for the tax years 1994 and 1995 still lacked the necessary identification codes for linking the data correctly to the persons or organisations they concern.(47) Although the proportions of share data suffering from this problem were fairly low (2.1 percent for the 1994 tax year, and 1.5 percent for the 1995 tax year), the unlinked data may nevertheless represent a significant financial loss for the Norwegian state.
In addition to the studies listed above, sporadic evidence of unsatisfactory data quality in Norwegian administrative agencies' files has emerged in the wake of various attempts by these agencies to match and compare data in their respective registers. A noteworthy example of such evidence is the results of a matching program carried out by the social welfare offices of three neighbouring municipalities in Norway in 1993. The program was initiated in order to identify persons who were illegally claiming and receiving social security benefits from more than one of the offices at the same time. An initial match revealed a substantial number of social security clients apparently engaged in "double-dipping". Subsequent analysis of the matching results revealed, however, that there was no fraud. That persons were registered as clients of more than one municipal social welfare office was due to the failure of the offices to up-date their respective client registers when a client moved residence from one municipality to another.(48) Similarly, an on-going matching program carried out by Norway's Labour Directorate in order to identify persons in illegal receipt of unemployment benefits, has revealed that around 90 percent of "hits" (ie cases of apparent fraud) stem from poor data quality in one of the matched data registers, rather than from fraud.(49)
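The mechanism behind such spurious "hits" can be illustrated with a minimal sketch. All register contents, field names and function names below are hypothetical and invented purely for illustration; they are not drawn from the matching programs described above. The sketch matches two client registers on a personal identifier and then separates apparent hits (the person is registered as an active client in both registers) from confirmed overlaps (benefits were actually paid by both offices for the same month).

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecord:
    """One entry in a (hypothetical) municipal client register."""
    person_id: str
    registered_as_active: bool
    months_paid: set = field(default_factory=set)  # e.g. {"1993-01"}


def match_registers(register_a, register_b):
    """Return (apparent_hits, confirmed_overlaps) as lists of person ids.

    An apparent hit is a person registered as active in both registers.
    A confirmed overlap additionally requires that both offices paid
    benefits for at least one common month.
    """
    by_id_b = {record.person_id: record for record in register_b}
    apparent, confirmed = [], []
    for rec_a in register_a:
        rec_b = by_id_b.get(rec_a.person_id)
        if rec_b is None:
            continue
        if rec_a.registered_as_active and rec_b.registered_as_active:
            apparent.append(rec_a.person_id)
            if rec_a.months_paid & rec_b.months_paid:
                confirmed.append(rec_a.person_id)
    return apparent, confirmed


# Hypothetical data: P1 moved from municipality A to municipality B in
# March 1993, but office A never closed the register entry.
office_a = [
    ClientRecord("P1", True, {"1993-01", "1993-02"}),
    ClientRecord("P2", True, {"1993-01", "1993-02", "1993-03"}),
]
office_b = [
    ClientRecord("P1", True, {"1993-03", "1993-04"}),
    ClientRecord("P3", True, {"1993-01"}),
]

apparent_hits, confirmed_overlaps = match_registers(office_a, office_b)
print("Apparent hits:", apparent_hits)
print("Confirmed overlaps:", confirmed_overlaps)
```

In this invented example the initial match flags one person, yet no payments overlap in time. The apparent hit stems solely from a register entry that was not up-dated, which is the kind of outcome reported for the municipal matching program.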
Urgent calls for the instigation of more empirical studies of the quality of data and information in major Norwegian information systems have come from various quarters, including the legal research community(50) and various politicians.(51) The Norwegian Data Inspectorate (Datatilsynet) has long wanted to carry out thorough quality control of a large national database but has lacked the resources to do so.(52)
In February 1991, the Nordic Council (Nordisk råd) came close to initiating a comprehensive study of the quality of data and information in selected data registers run by administrative agencies in the Nordic countries. The study proposal was unfortunately rejected, though only by a narrow margin.(53) The rejected proposal also embraced several other recommendations pertaining specifically to the quality of automated administrative case-processing systems. One recommendation was to examine the extent to which computer programs used in such systems accurately reflect and apply relevant law. This recommendation was motivated in part by the results of Schartum's study described above.
The paucity of comprehensive, systematic empirical studies of the quality of data, information and information systems in Norway is a cause for concern. This is particularly so given the disturbing results of some of the empirical research described above. Similar research carried out in other countries has also uncovered major problems related to information quality. In Sweden, for instance, the National Audit Office (Riksrevisionsverket (RRV)) undertook a series of studies from the mid- to late 1980s of information quality in ten computerised information systems run by state agencies. The Audit Office found that data controllers were often unaware of the information quality in their respective systems.(54) At the same time, it was found that data users seldom made precise demands about the quality of the information they were supplied. In addition, few analyses had been carried out by data controllers to establish both desired levels of information quality and the consequences of not reaching those levels. There was a concomitant failure to work out in detail the purpose(s) for which information was processed, and how it was to be categorised. Moreover, the uncertain parameters of many of the information systems, along with the fact that these parameters sometimes failed to follow traditional organisational boundaries, made it difficult for data controllers to establish the extent of their respective spheres of responsibility for the quality of information they processed.
In a subsequent study from 1991, the Audit Office attempted to estimate some of the financial costs of poor information quality in computerised address registers used by the Swedish postal service. The study found that errors in these registers cost the postal service SEK 100-200 million per annum.(55) However, it was noted that such errors also result in considerable costs for a large number of other organisations that apply or rely on the data in the address registers.
One of the most alarming and widely cited sets of results of a systematic empirical study of data and information quality comes from the USA. There a survey was carried out in the early 1980s of the completeness, accuracy and ambiguity of records on persons' criminal histories and arrest warrants kept in information systems operated by the Federal Bureau of Investigation (FBI). These information systems were the National Crime Information Center Computerized Criminal History (NCIC-CCH), the Identification Division (Ident) and the National Crime Information Center Wanted Person System (NCIC-WPS). Also examined were the criminal-history record systems of three states. The study was carried out by Kenneth Laudon with funding from the US Congress' Office of Technology Assessment (OTA).(56)
The study found that 54.1 percent of the records in the NCIC-CCH and 74.3 percent of the records in the Ident system, were incomplete, inaccurate or ambiguous. According to Laudon, the most common problem "was lack of court disposition information (where present at all), ambiguity of record, or some combination of the above".(57) As for the three state criminal-history record systems, just 12.2 percent of the data in one of these systems (run by a state in the south-east) were found to be complete, accurate and unambiguous. Corresponding figures for the other two states were 18.9 percent and 49.4 percent respectively.(58) With regard to the NCIC-WPS, 11.2 percent of recorded warrants were found to be no longer valid, 6.6 percent were found to be inaccurate in their classification of offence, and 15.1 percent were probably not capable of prosecution because they were more than 5 years old.(59) In summing up the study results, Laudon writes:
The most conservative interpretation of our research is that it is reasonable to believe that (1) constitutional rights of due process are not well protected in either manual or computerized criminal-history and federal wanted-person systems, and (2) the efficiency and effectiveness of any law enforcement or criminal-justice programs that use or rely on such records must be considerably impaired.(60)
The results of Laudon's study are even more disturbing given the fact that the US Department of Justice had issued regulations in 1975 for the use and maintenance of criminal history information systems at both the federal and state levels.(61) The regulations require, inter alia, state agencies to conduct annual audits of the quality of criminal history records. As early as 1980, however, a study instituted by the OTA found that few state agencies complied with this requirement.(62) In the words of Kenneth Laudon:
In 80 percent of the states no audit [of criminal history records] has ever been conducted, most states have no adequate procedure to monitor incomplete records, many states (33 percent) cannot trace the flow of information down to the individual level with transaction logs, and nearly 80 percent of the states rarely if ever review transaction logs....(63)
Similarly, results from an investigation of record-keeping practices of US Federal government agencies carried out in 1985 by the OTA indicated that few agencies conducted audits of the quality of the data in their record systems. A total of 142 agencies were asked by the OTA for the results of any data quality audits they conducted on their record systems falling within the ambit of the Privacy Act 1974 and computerised record systems maintained for law enforcement, investigative and/or intelligence purposes; of the 127 agencies that responded, only 13 percent stated that they conducted such audits, and just one agency provided any audit results.(64) In general, the OTA found that "agencies were not, on the whole, making use of...technology to ensure record quality, and were conducting few reviews of record quality".(65)
I do not know the extent to which the problems highlighted by these studies have been remedied in the meantime. Neither do I know whether or not these sorts of problems are common to a large number of countries. It would be surprising, though, if they were not.
Section 4: What factors affect information quality?
A multitude of factors affect the quality of information. Some of these factors are basically technological in character while some are essentially organisational. Still others are primarily cognitive. In this section, I do not attempt to canvass all of these factors in detail but to provide a brief overview of the main factors concerned.(66) I consider first technological factors, then organisational factors, and finally factors pertaining directly to human cognition.
To begin with, the quality of information is dependent on the technological apparatus used to process it. Today, electronic computer systems constitute the basic technological apparatus for processing large amounts of information. Faults which detrimentally affect information quality can and do arise in various parts of computer systems: eg in the computer hardware and software.
Hardware faults are allegedly a rare occurrence and easy to detect.(67) Software errors, however, are claimed to be more frequent. Indeed, it has been said that Murphy's third law, applied to computer programs, reads: "There are no programs without bugs; only programs in which the next bug has not yet been found".(68) According to Edward Yourdon, "the typical software product produced by the average software organization in the United States has three to four defects per thousand lines of code".(69) In Yourdon's view, these organizations should offer a level of software quality at which the defect rate is "approximately three to four defects per million lines of code."(70)
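The gap between the two defect rates cited by Yourdon can be made concrete with a short calculation. The sketch below is illustrative only: the program size of 500,000 lines, the use of the midpoint (3.5) of each quoted range, the function name and the assumption that defects are spread evenly over the code are all mine, not Yourdon's.

```python
def expected_defects(lines_of_code: int, defects: float, per_lines: int) -> float:
    """Expected defect count for a program, given a rate of `defects` per
    `per_lines` lines of code (assumes defects are evenly spread)."""
    return lines_of_code * defects / per_lines


PROGRAM_SIZE = 500_000  # hypothetical program size, in lines of code
MIDPOINT = 3.5          # midpoint of the "three to four" range quoted above

# Typical rate: three to four defects per thousand lines of code.
typical = expected_defects(PROGRAM_SIZE, MIDPOINT, 1_000)

# Recommended rate: three to four defects per million lines of code.
target = expected_defects(PROGRAM_SIZE, MIDPOINT, 1_000_000)

print(f"Expected defects at the typical rate: {typical:.2f}")
print(f"Expected defects at the recommended rate: {target:.2f}")
print(f"Ratio between the two rates: {typical / target:.0f}")
```

Whatever program size is assumed, the ratio between the two rates is a factor of one thousand, which indicates how far typical practice is claimed to lie from the level Yourdon regards as desirable.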
Many software faults are clearly attributable to accidental human error. Such error can occur not just in the creation of software or hardware but in relation to all aspects of information processing, from data collection to registration, storage and transmission. For instance, stored data may be mistakenly deleted and/or amended by computer operators. Nevertheless, it is usually systematic rather than random errors which allegedly have the greatest significance for information quality in large databases.(71) It should also be noted that there is an expanding range of technologies for automatically correcting the results of many random human errors.(72)
The extent of random human error in information processing is partly dependent on the amount of care taken by the persons engaged in the processing, along with the level of these persons' processing expertise. Levels of both care and expertise are dependent, in turn, on a range of organisational factors that are largely outside the immediate control of the persons concerned. For instance, Yourdon attributes the high incidence of error in US computer program code to systemic deficiencies in software quality assurance (SQA). According to Yourdon, while approximately 75 percent of US software development organizations have established independent SQA groups, the actual practice of SQA is "abysmal". Yourdon cites the results of recent surveys which highlight apparently major weaknesses in the SQA strategies of most US software development organisations and which show that the majority of personnel charged with executing SQA have no formal professional training in such a task.(73) Similarly, Schartum contends that instances of legally incorrect program code may be due, in part, to an absence of lawyers in the systems development process.(74)
As for random errors made in the collection and registration of data, these errors may also be directly or indirectly an outcome of a variety of factors with organisational roots. One such factor may be the time pressure under which the persons collecting and registering the data work.(75) Other factors may be the extent to which the information is actually used by the persons or organisations engaged in its processing, and/or the degree to which these persons/organisations must bear the costs of poor information quality. It would be reasonable to expect that when deliverers or processors of data have little use themselves for these data, and/or when they do not bear the costs of errors in the data, there is little incentive for them to implement stringent quality control measures. For instance, Sweden's National Audit Office identified as one factor contributing to poor quality of data in Swedish address registers the fact that those organisations responsible for registering and distributing address data in the first place, do not have responsibility for posting documents etc with the help of the address data.(76)
A range of other organisational factors affecting quality of address data were also identified by the National Audit Office.(77) These factors included:
- uncertainty as to which organisations are directly responsible and liable for the quality of the data in the registers concerned, and uncertainty as to the purposes for which these registers are to be used;
- a lack of documented agreements regulating the flow of data between the involved organisations, and specifying how the data are to be defined and up-dated;
- an absence of systematic, regular controls of the quality of the data in the address registers;
- a multiplicity of organisations that are able to receive and register a new postal address, which is then spread around the various address registers (according to the Audit Office, there should be ideally only one such organisation);
- the fact that when a new postal address is first registered the data subject does not receive a copy of the registered data for verification.
All of these sorts of factors are undoubtedly important for the quality of other kinds of data and information as well.
Most of the factors listed above have little immediate relationship with the behaviour and actions of data subjects. It is clear, though, that data subjects also play a crucial part in determining the quality of information. They do so firstly as suppliers of data, and secondly as information quality verifiers. As data suppliers, they largely determine the quality of information before it enters an information system. Thus, invalid data they supply to a data controller will detrimentally affect the quality of information in the system concerned, unless, of course, the data controller subsequently undertakes quality checks of the data. There are many reasons why data subjects might supply invalid data. One reason that is often prominent in public debate is data subjects' desire to gain a benefit to which they are not entitled. Some research suggests, however, that intentional supply of invalid data is unlikely to occur unless there is an economic incentive to supply such data and a low probability of being discovered and punished for the action.(78) Quite often supply of invalid data is unintentional, and is due, say, simply to carelessness. Alternatively, it may be due to misunderstanding on the part of the data subject as to what the data controller or processor wants. Such misunderstanding may be attributable, in turn, to ambiguously formulated questions on the information collection forms.(79)
As for data subjects' role as information quality verifiers, the usefulness of this role is determined by the degree to which data subjects are able to have access to data on themselves held by others, and to understand how these data are processed. Without doubt, such access and understanding can constitute valuable means for controlling information quality. At the same time, with increasing speed, automatisation and complexity in the processing and flow of information, it is unrealistic to expect that information quality assurance can or should be undertaken primarily by data subjects. In many cases, individual persons have little real possibility to comprehend, let alone control, the processing of data on them.(80)
The overwhelming majority of the factors listed above have a legal dimension. More specifically, they are shaped to at least some extent by a variety of legal rules. For instance, the degree to which data subjects are able to have access to data on themselves is largely a function of the ambit of legal rules providing for, and regulating, a right to such access. And data subjects' ability to comprehend the processing of data on themselves is influenced by data controllers' legal duties to inform and guide data subjects about such processing. These and other legal rules are presented in greater detail in Part II of this paper.
A final set of factors affecting information quality relate to human cognition. According to Bailey, a large proportion of informational error is attributable to faulty calculations, judgements and classifications.(81) In other words, poor information quality may often be a reflection of poor thinking.
This is demonstrated by a comprehensive study of the quality of statistical data on production and consumption of energy in the USA. The study was carried out between 1978 and 1981 by the Oak Ridge National Laboratory, on the initiative of the US Energy Information Administration. It was found that the most significant problem with the quality of these data was not their lack of validity (accuracy, precision or completeness) but their misinterpretation and consequent misapplication. In the words of Andrew Loebl, one of the study researchers, "in virtually every case where an energy analysis was said to be vulnerable on the grounds of inaccurate data, it was subsequently found to have used reasonably accurate data in an inappropriate (nonrelevant) manner".(82) Data misapplication tended to occur because the models used for understanding the problems at hand, and hence for determining which data were relevant to these problems, were faulty. For example, data that were supposed to measure energy conservation actually measured energy consumption, while data meant to measure energy consumption in fact measured energy sales.(83)
Another useful illustration of poor "model quality" leading to misapplication of data is the outcome of a matching program initiated by a Swedish municipality in the early 1980s. The aim of the program was to identify persons in illegal receipt of housing aid, and involved the matching of income data held in various data registers. The matching resulted in a large number of spurious "hits", primarily because account was not taken of the fact that the matched data registers operated with different concepts of "income".(84) The results of this matching program illustrate the obvious but important point that many terms (such as "income"), which we use to categorise data, can have different underlying referents. This is a point that those responsible for the Swedish matching program failed to appreciate.
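How differing concepts of "income" can generate spurious hits may be shown by a minimal sketch. The two income concepts used here (gross income in one register, net income after deductions in the other), the figures, the field names and the function names are all hypothetical assumptions made for illustration; the source for the Swedish program does not specify which concepts were involved. A naive match compares the two "income" figures directly, whereas a concept-aware match first converts them to a common definition.

```python
def naive_hits(housing_register, tax_register):
    """Flag every person whose two 'income' figures differ,
    ignoring the fact that the registers define income differently."""
    return sorted(
        pid
        for pid, declared in housing_register.items()
        if pid in tax_register and declared != tax_register[pid]["net_income"]
    )


def concept_aware_hits(housing_register, tax_register):
    """Convert the tax register's net income to gross income
    (net income plus deductions) before comparing."""
    hits = []
    for pid, declared_gross in housing_register.items():
        entry = tax_register.get(pid)
        if entry is None:
            continue
        gross_from_tax = entry["net_income"] + entry["deductions"]
        if declared_gross != gross_from_tax:
            hits.append(pid)
    return sorted(hits)


# Hypothetical housing aid register: gross income as declared by the applicant.
housing_register = {"P1": 200_000, "P2": 150_000, "P3": 180_000}

# Hypothetical tax register: net income after deductions.
tax_register = {
    "P1": {"net_income": 170_000, "deductions": 30_000},  # same gross income
    "P2": {"net_income": 150_000, "deductions": 0},       # same gross income
    "P3": {"net_income": 190_000, "deductions": 40_000},  # genuine discrepancy
}

spurious_and_real = naive_hits(housing_register, tax_register)
real_only = concept_aware_hits(housing_register, tax_register)
print("Naive match flags:", spurious_and_real)
print("Concept-aware match flags:", real_only)
```

In this invented example the naive match flags two persons, one of whom declared exactly the income recorded by the tax authorities once the figures are expressed in the same terms. Only the concept-aware match isolates the genuine discrepancy.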
(1) BA(Hons); LLB(Hons); Barrister; Research Fellow, Norwegian Research Centre for Computers & Law, University of Oslo. This paper forms part of a doctoral research project on legal policy concerning privacy and data protection. The final results of this research will be published as a doctoral thesis for the degree of "dr juris" in mid-1997. The ideas presented in this paper may be revised in the meantime. Any comments on the paper will be gratefully received.
(2) For example, Christopher Fox et al are critical of defining "data" simply in terms of signs or symbols. In their opinion, "data" are better understood as sets of values making up the attributes (eg dates-of-birth) of given entities (eg "employees") that serve in turn as model representatives of so-called "real world objects" (physical or abstract). This conception separates data from their representation, such that data may be seen as being represented in different ways and even to exist without being represented or recorded at all. See Fox, C; Levitin, A & Redman, T: "The Notion of Data and Its Quality Dimensions" (1994) 30 Information Processing & Management, no 1, 9, 11-13. It is pertinent to note, however, that their definition of "data" results in essentially the same analysis of the "quality dimensions" of data as my analysis of such dimensions: cf. section 2, infra.
(3) See Ivanov, K: Quality-control of Information. On the Concept of Accuracy of Information in Data-Banks and in Management Information Systems (Stockholm: Royal Institute of Technology, 1972), especially chapter 1.
(4) Wagner, G: "The value and the quality of information: the need for a theoretical analysis", in Wormell, I (ed): Information Quality - Definitions and Dimensions (London: Taylor Graham, 1990), 69.
(5) Fox et al, supra n 2, 10.
(6) Marchand, D: "Managing Information Quality", in Wormell, I (ed): Information Quality - Definitions and Dimensions (London: Taylor Graham, 1990), 7, 8.
(7) National Research Council, Computer Science and Telecommunications Board, System Security Study Committee: Computers at Risk: Safe Computing in the Information Age (Washington, DC: National Academy Press, 1991), 3.
(9) Nordic Council of Ministers: Information Security in Nordic Countries, Nordiske Seminar- og Arbejdsrapporter 1993:613 (Copenhagen: Nordic Council of Ministers, 1993), 8, 18, 22-23, 40.
(10) For examples of such studies, see section 3.
(11) See eg Norway's state information policy document, Statlig informasjonspolitikk - Hovedprinsipper (Oslo: Ministry of Administration, 1994), 7 & 14 (declaring that realisation of the goals set out in the policy depends upon examining and improving the quality of all elements of informational activity within state organisations).
(12) For examples, see sections 5 and 6 in Part II of this paper.
(13) For an introductory overview of the philosophy and central elements of IRM, see eg Wormell, I: "Information Resources Management", in Wormell, I: Understanding Information (Copenhagen: Danmarks Biblioteksskole, 1992), 97-132.
(14) See eg Den IT-baserte informasjonsinfrastrukturen i Norge - Status og utfordringer, report issued on 7.9.1994 by an inter-departmental Working Group led by Vidar Steine on use and development of information technology in the Norwegian governmental sector (Oslo: Ministry of Administration, 1994), 73.
(15) See eg Commission of the European Communities, Directorate-General for Telecommunications, Information Industries and Innovation: Guidelines for improving the synergy between the public and private sectors in the information market (Luxembourg: Commission of the European Communities, 1989). Guideline 1 notes that the information materials of public organisations "have value beyond their use by governments, and their wider availability would be beneficial both to the public sector and to the private industry." Accordingly, Guideline 1 goes on to state that public organisations "should, as far as practicable and when access is not restricted for the protection of legitimate public or private interests, allow [their] basic information materials to be used by the private sector and exploited by the information industry through electronic information services."
(16) For further details, see section 3.
(17) See further section 5 in Part II.
(18) In terms of etymology, this last meaning of "quality" lies closest to the Latin origins of the term. "Quality" derives from the Latin notion of "qualitas", which roughly denotes the properties, content or construction of an object.
(19) Jensen, E: Kvalitetsstyring, kvalitetssikring, kvalitetskontroll (Trondheim: TAPIR, 1985, 2nd ed), 5 ("Med KVALITETEN av et produkt...menes produktets evne til å tilfredsstille brukerens behov, ønsker, krav og forventninger." ["By QUALITY of a product...is meant a product's ability to satisfy the user's needs, wishes, requirements and expectations"] Or, more concisely: "Med KVALITETEN av et produkt menes produktets egnethet for bruksformålet." ["By quality of a product is meant the product's suitability for its purpose of use"]).
(20) See ISO 8402:1986, Quality - Vocabulary. At the same time, though, the ISO definition notes that this notion of quality, on its own, "is not used to express a degree of excellence in a comparative sense nor is it used in a quantitative sense for technical evaluations." But the definition also notes that the latter two meanings can be expressed by using qualifying adjectives; eg the term, "relative quality", can be used to express a degree of excellence in a comparative sense, and the term, "quality level", can be used to express a technical evaluation in a quantitative sense.
(21) See eg Skard, T: "Sterke og svake sider ved ISO 9000", paper presented at a seminar, Software '93: Et kritisk søkelys på ISO og kvalitetssikring, arranged by The Norwegian Computer Society at Sandvika, Norway on 3.2.1993.
(22) See Committee for Information-Technical Standardisation (Informationstekniska standardiseringen (ITS)): Terminologi för Informationssäkerhet, Report ITS 6 (Stockholm: ITS, 1994), 106. See also Swedish Standardisation Committee (Standardiseringskommissionen i Sverige (SIS)): Informationssäkerhet och dataskydd - en begreppsapparat, Technical Report 322 (Stockholm: SIS, 1989). Sweden's National Audit Office (Riksrevisionsverket (RRV)) has applied the same definition in its studies of the quality of data and information held by organisations in the Swedish public sector: see eg Rätt data? Studie av informationskvalitet i statliga ADB-system, Dnr 1989:393 (Stockholm: RRV, 1990), 66.
(23) See eg Jansen, A: Dataflyt mellom lokal og sentral forvaltning (Oslo: Statskonsult, 1992), 35 (defining data quality as "dataenes godhet i forhold til brukerens behov" ["the goodness of data for the user's needs"]). See also Hansen, J: SAFE P: Sikring av foretak, edb-anlegg og personverninteresser etter personregisterloven, CompLex 12/88 (Oslo: TANO, 1988), 56ff.
(24) Supra n 22.
(25) Cf the terminology of the Norwegian Directorate of Public Management (Statskonsult), which describes the latter three attributes as aspects of "identification quality" (identifikasjonskvalitet). See Utvikling av metode for kartlegging av datakvalitet i grunndataregistre, Report 4204.20 (Oslo: Statskonsult, 19.3.1996), 18. Also included under the term "identification quality" is the extent to which identification codes for each registered RWO are correct. Under my classificatory scheme, the latter characteristic is treated as an aspect of the completeness of data.
(26) The term "cognitive authority" is borrowed from the work of Johan Olaisen, who uses the term to denote influence on one's thoughts which one would consciously recognise as proper. According to Olaisen, the notion of cognitive authority is related to that of credibility, which Olaisen describes as having two main elements: competence and trustworthiness. See Olaisen, J: "Information quality factors and the cognitive authority of electronic information", in Olaisen, J (ed): Information Management. A Scandinavian Approach (Oslo: Scandinavian University Press/Universitetsforlaget, 1993), 47-78, 51-52. The same article is also published in Wormell, I (ed): Information Quality - Definitions and Dimensions (London: Taylor Graham, 1990), 91-121.
(27) Ivanov, K: Systemutveckling och rättssäkerhet: Om statsförvaltningens datorisering och de långsiktiga konsekvenserna för enskilda och företag (Stockholm: Svenska Arbetsgivareföreningen, 1986), 50-51 ("Kvalitet innebär att ta hänsyn til alternativa motsägelsefulla bedömningar eller att bygga in möjligheter att ange osäkerhetsmarginaler eller att ange när klassificeringar och definitioner blir oanvändbara eller...att indikera att hela databasen måste läggas ned."). Similarly, Kenneth Laudon notes that information systems differ in terms of how easily they allow persons to discover erroneous data kept in the systems. "An airline-reservation system operates in such a way that erroneous information can be spotted easily, but criminal-history systems are not as equally visible to the individuals involved": see Laudon, K C: "Data Quality and Due Process in Large Interorganisational Record Systems" (1986) 29 Communications of the ACM, no 1, 4-11, at 4.
(28) Clark, R; Holloway, S & List, W (eds): The Security, Audit and Control of Databases (Aldershot: Avebury Technical, 1991), 2.
(29) See eg Denning, D E R: Cryptography and Data Security (Reading, Massachusetts: Addison-Wesley Publishing Company, 1982), 4 (defining the notion of "[data] integrity" (and "[data] authenticity") as referring to the prevention of unauthorised modification of data).
(30) See further section 4 of this paper.
(31) Tsichritzis, D C & Lochovsky, F H: Data Models (Englewood Cliffs: Prentice-Hall, 1982), 5.
(32) See Tsichritzis & Lochovsky, ibid, for examples of various formalised models for organising data in computerised information systems.
(33) For instructive examples of such cases, see section 4.
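(34) To quote the Norwegian Directorate of Public Management (Statskonsult), "[o]ffentlige dataregistre har i dag ikke et helhetlig kvalitetssystem for registerinnhold med etablerte måltall. Kundene/brukerne gis heller ikke deklarasjoner eller garantier for datakvalitet, og systematiske målinger av kundenes tilfredshet mangler" [public data registers today lack a comprehensive quality system for register content with established target figures. Nor are customers/users given declarations or guarantees of data quality, and systematic measurements of customer satisfaction are lacking]: Statskonsult, supra n 25, 3.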
(35) Ibid, 5.
(36) Ibid, 13.
(37) Ibid, 14.
(39) Ibid, 16, 20.
(41) See Jørgensen, S: Pasientdata fra norske sykehus 1986-1992: innhold og datakvalitet (Trondheim: SINTEF/NIS, 1994).
(42) Ibid, 28.
(43) Ibid, 26ff.
(44) See Schartum, D W: En rettslig undersøkelse av tre edb-systemer i offentlig forvaltning, CompLex 1/89 (Oslo: TANO, 1989). Results of the study are summarised in Schartum, D W: Rettssikkerhet og systemutvikling i offentlig forvaltning (Oslo: Universitetsforlaget, 1993), chapt 10. See also Schartum, D W: "Dirt in the Machinery of Government? Legal Challenges Connected to Computerized Case Processing in Public Administration", in Bing, J & Torvund, O (eds): 25 Years Anniversary Anthology in Computers and Law (Oslo: TANO, 1995), 151, espec 170-172. The latter article is also published in (1995) 2 International Journal of Law and Information Technology, no 3, 327-354.
(45) Ot prp nr 14 (1989-90), Forenklet selvangivelse, 28ff. A study carried out a short period later by the National Audit Office (Riksrevisjonen) found similar problems with the quality of base data on taxpayer income for the year 1990: see Dokument nr 1 (1993-94), Ekstrakt av Norges statsregnskap og regnskap vedkommende administrasjonen av Svalbard for 1992. Saker for desisjon av Stortinget og andre regnskapssaker, 25-26. For other instances of problematic information quality reported by the Audit Office, see ibid, 69-72 (concerning errors found in the "BOST-Flerfogd" register system, now maintained by social welfare offices).
(46) This at least is the conclusion of the National Audit Office: see Dokument nr 3:1 (1995-96), En gjennomgåelse og vurdering av antegnelsene til statsregnskapene for 1988-1993 desidert «Til observasjon», 12-13.
(47) See Harket, E: Oppgaveplikt og ligning i omstillingens tegn - et forvaltningsinformatisk perspektiv på oppgaveplikt etter ligningsloven, unpublished Masters thesis, Section for eGovernment Studies, University of Oslo, 1996.
(48) See Computerworld Norge, 11.3.1994, 18.
(49) See Computerworld Norge, 15.4.1994, 4.
(50) See eg Bing, J & Fog, J: Fem essays om ny informasjonsteknologi, forbrukere og personvern [Five essays on new information technology, consumers and data protection], NEK-report 1989:3 (Copenhagen: Nordic Council of Ministers, 1989), 39.
(51) See eg Aftenposten, 20.7.1993, 17 (morning issue).
(52) See eg Aftenposten, 18.7.1994, 13 (morning issue); Aftenposten, 31.7.1996, 5 (morning issue). Nevertheless, the Inspectorate has undertaken several small quality controls of personal data registers. One such control, carried out in 1988, revealed deficiencies in the routines for registering data in the national register of mortgaged moveable property (Løsøreregisteret) and for informing persons of the data on them contained in that register. See St meld nr 33 (1988-1989), Datatilsynets årsmelding 1988, 18.
(53) See "Nordisk råd: Medlemsforslag om kvalitetssikring og legalitetskontroll" (1992) 29 Lov & Data, 7.
(54) Riksrevisionsverket (RRV): Rätt data? Studie av informationskvalitet i statliga ADB-system, Dnr 1989:393 (Stockholm: RRV, 1990). See also SOU 1993:10, En ny datalag, 331ff (listing miscellaneous instances in which data subjects have sought compensation for harm caused them by poor information quality in personal data registers kept by Swedish authorities).
(55) Riksrevisionsverket (RRV): Fel data kostar! Exemplet Postens kostnader för fel i adressregister, F 1992:2 (Stockholm: RRV, 1992), 37. For some other examples of the potentially high financial cost of relatively simple data errors, see O'Neill, E T & Vizine-Goetz, D: "Quality Control in Online Databases" (1988) 23 Annual Review of Information Science and Technology, 125, 128.
(56) The study results are set out in Laudon, supra n 27, 4-11. See also Laudon, K C: Dossier Society: Value Choices in the Design of National Information Systems (New York: Columbia University Press, 1986), 135-145.
(57) Laudon, supra n 27, 8.
(58) Ibid, 9. For the purposes of the study of criminal-history records, such a record was considered to be incomplete if it noted an arrest but no formal court disposition had been recorded within a year of the date of the arrest, or if it noted conviction of "attempt" without stating the specific crime, or if it set out sentencing information without noting conviction information, or if it failed to present correctional information at the same time as it presented other data. A criminal-history record was deemed to be inaccurate "when the arrest, court disposition, or sentencing information on [it]...does not correspond with the actual manual court records." A criminal-history record was deemed to be ambiguous when it "shows more charges than dispositions or more court dispositions than charges", or it contains dates that do not correspond with each other, or it sets out "a number of arrest charges followed by a single court disposition where it is not clear for which particular crime the individual was convicted." Ibid, 6.
(59) Ibid, 8-9.
(60) Ibid, 9.
(61) Title 28, Code of Federal Regulations, Part 20, Subparts B & C.
(62) A summary of the findings is set out in Laudon, K C: Dossier Society: Value Choices in the Design of National Information Systems (New York: Columbia University Press, 1986), 181-185.
(63) Ibid, 183.
(64) See US Congress, Office of Technology Assessment: Federal Government Information Technology: Electronic Record Systems and Individual Privacy, OTA-CIT-296 (Washington, DC: US Government Printing Office, June 1986), 111.
(65) Ibid, 104.
(66) For more detailed analyses of these factors, see eg Bailey, R W: Human Error in Computer Systems (Englewood Cliffs, New Jersey: Prentice-Hall, 1983), & Cohen, F B: Protection and Security on the Information Superhighway (New York: John Wiley & Sons, 1995), 33-56.
(67) Bailey, supra n 66, 27.
(68) See Johnsen, K: "System Implications of Privacy Legislation", in Bing, J & Selmer, K S (eds): A Decade of Computers and Law (Oslo: Universitetsforlaget, 1980), 92, 99.
(69) Yourdon, E: Decline and Fall of the American Programmer (Englewood Cliffs, New Jersey: Prentice-Hall, 1992), 199.
(71) Loebl, A S: "Accuracy and Relevance and the Quality of Data", in Liepins, G E & Uppuluri, V R R (eds): Data Quality Control: Theory and Pragmatics (New York: Marcel Dekker, 1990), 105, 128.
(72) For an overview of various manual and automated techniques for discovering and correcting errors in databases, see O'Neill, E T & Vizine-Goetz, D: "Quality Control in Online Databases" (1988) 23 Annual Review of Information Science and Technology, 125, 130-146.
(73) Yourdon, supra n 69, 202ff.
(74) See eg Schartum, D W: "Dirt in the Machinery of Government? Legal Challenges Connected to Computerized Case Processing in Public Administration", in Bing, J & Torvund, O (eds): 25 Years Anniversary Anthology in Computers and Law (Oslo: TANO, 1995), 151, 177ff.
(75) See eg Harket, supra n 47, 84-86 (contending that poor quality in share data sent to the Norwegian taxation authorities may be due partly to a large amount of these data having to be registered within a relatively small period).
(76) Riksrevisionsverket, supra n 55, 43. See also Jansen, supra n 23, 3.
(77) Ibid, 39ff.
(78) Loebl, supra n 71, 129-130.
(79) See eg Birgitta Nyberg's reference to a study by the Pedagogical Institute at Linköping University which found that 83 percent of Swedes were unable to fill out correctly application forms for sickness benefits. See Nyberg, B: Samkörning av personregister, IRI-rapport 1984:2 (Stockholm: Institutet för Rättsinformatik, 1984), 20. Unfortunately, Nyberg does not provide a reference for the study report nor the year in which it was undertaken. Note also Harket's claim that the specification by Norwegian taxation authorities of which data are to be supplied by funds management companies for the purposes of assessing income tax could be more precise: Harket, supra n 47, 98-99.
(80) See eg claims by Schartum and Harket that increased automatisation of income tax assessment in Norway has weakened taxpayers' ability to understand this assessment and to check the validity of its results. See Schartum: "Forenklet selvangivelse - en for enkel selvangivelse?" (1995) 5 Utvalget (Dommer, uttalelser mv i skattesaker og skattespørsmål), 1089-1122, & Harket, supra n 47, 107-108.
(81) Bailey, supra n 66, 22.
(82) Loebl, supra n 71, 139.
(83) For further examples of faulty models, see ibid, 133-135.
(84) See eg Nyberg, supra n 79, 16-21; Bing, J: "Data Protection in a Time of Changes", in Altes, W F K; Dommering, E J; Hugenholtz, P B & Kabel, J J C (eds): Information Law Towards the 21st Century (Deventer & Boston: Kluwer Law and Taxation Publishers, 1992), 247, 251-252; & Freese, J: Den maktfullkomliga oförmågan (Stockholm: Wahlström & Widstrand, 1987), 94-96.