Putting documents into their work context in document analysis


Abstract

In document standardization, the goal is to find more effective, consistent, and standardized ways to utilize information technology. The specification and implementation of document standards may take several years and require a profound analysis and understanding of document management practices. Document standardization does not concern documents only: it also concerns workers, their work, business partners, and future systems. In this paper we discuss two ways of describing the work context of documents: process modelling and life cycle modelling. In process modelling, documents are regarded as resources produced and used in inter- or intra-organizational business processes. Different types of documents are typically produced and used in a business process. In life cycle modelling, the work related to processing a document of a specific type is described. The modelling methods have been tested in an SGML standardization project called RASKE during the analysis of four case domains: the enquiry process in the Finnish Parliament and Government, national Finnish legislative work, budgetary work, and the Finnish participation in EU legislative work. This paper discusses the modelling requirements in document analysis and describes the techniques used in the RASKE project.

Keywords: Document analysis; Document standardization; Process modelling; SGML; XML

1. Introduction

The data volume in the electronic document repositories of organizations is growing fast, but the diversity of document formats and systems, as well as continuing changes in information technology, causes problems in accessing and using the information needed in work tasks. The problems concern both companies and public sector organizations. These problems have prompted organizations to start major document standardization projects where the intention is to agree upon rules which define the way information is represented in documents. The rules are needed in order to achieve more effective, consistent, and stable ways to utilize information technology in business processes. Problems with technological changes and with maintaining long-term access to digital documents have motivated the search for application-independent formats for documents. SGML (Standard Generalized Markup Language) is an international standard for defining and representing documents in an application-independent form (Goldfarb, 1990). A subset of SGML called XML (Extensible Markup Language) has been developed especially for specifying document standards to be used in Web information systems (Bray, Paoli & Sperberg-McQueen, 1998).

In SGML/XML standardization projects, a profound document analysis is needed. The analysis is usually seen as an analysis of document structures (Travis; Watson and Maler, Magnusson Sjöberg, 1997, Weitz, 1998). Successful implementation of document standards in enterprises, however, requires an understanding of the role of documents in work processes. Especially in cases where the standardization concerns several document types and the document production is part of inter-organizational business processes, the analysts as well as the actors in the processes should be able to see the process context of documents. In this paper we discuss work process modelling as part of document analysis. We introduce the modelling techniques used in a major standardization project called RASKE, where the standardization has concerned the documents created in the Finnish Parliament and ministries (Salminen; Salminen and Salminen).

The rest of the paper is organized as follows. Section 2 introduces a model for electronic document management environments and defines the notions related to the model. Document standardization in enterprises is discussed in Section 3, where the RASKE project is introduced as an example of a standardization project. Work process modelling approaches in other application areas and the modelling needs in the document analysis of a document standardization project are discussed in Section 4. The techniques used in the RASKE project are described in Section 5. Experiences and implications from the RASKE project are discussed in Section 6.

2. Electronic document management environments

Organizations use documents as a means for information management: a means to cluster, organize, store, transfer, and use information to fulfill their organizational purposes. The term electronic document management (EDM) refers to the use of modern information technology for this purpose (Sprague, 1995). In document standardization it is important to identify not only documents and their structures but also the other entities of the EDM environment where the documents are created, manipulated, and used.

Fig. 1 shows a model for an EDM environment using the central notions of information control nets (ICNs): activities and resources (Ellis, 1979). Information is produced and used in activities. The resources are information repositories where the information produced can be stored, or from where information can be taken. The dashed lines in the figure denote the information flow from and to resources. The set of activities is denoted by a circle and the resources by rectangles. The resources are divided into three types: documents, systems, and actors. Documents consist of recorded data intended for human perception. A document can be identified and handled as a unit in the activities, and it is intended to be understood as information pertaining to a topic. Since the documents in an EDM environment are mostly digital, information technology is needed and utilized to operate on them. Hence systems, i.e. hardware, software, and applications, are essential resources in an EDM environment. On the other hand, since the information in documents should remain available after system changes, it is also important to separate the documents from the systems as resources. Finally, the actors are people and organizations performing activities and using documents as well as systems in the activities. In some fully automated activities a software system may perform an activity (for example, create an email message and send it to a repository). In this paper, however, we consider activities where the actors creating and using documents are people and organizations. In relation to documents and systems, actors are called users. Actors are grouped by roles. A role specifies the tasks, responsibilities, and rights of an actor in an activity, as a user of a system, or as a user of a document repository.
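The entities of the model can also be written down as a small data structure. The following Python sketch is only an illustration under our own assumptions: the class names (`ResourceKind`, `Resource`, `Activity`), the field names, and the example resources are hypothetical and are not part of the ICN notation or of any RASKE tool. It records which resources an activity uses and produces, and which actor plays which role in the activity.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ResourceKind(Enum):
    """The three resource types of the EDM model."""
    DOCUMENT = "document"
    SYSTEM = "system"
    ACTOR = "actor"


@dataclass(frozen=True)
class Resource:
    """A repository that information can be taken from or stored in."""
    name: str
    kind: ResourceKind


@dataclass
class Activity:
    """An activity in which information is used and produced."""
    name: str
    # Resources from which information flows into the activity.
    uses: List[Resource] = field(default_factory=list)
    # Resources into which information produced by the activity flows.
    produces: List[Resource] = field(default_factory=list)
    # Role name -> actor performing that role (hypothetical field).
    roles: Dict[str, Resource] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Only actors can hold a role in an activity.
        for role, holder in self.roles.items():
            if holder.kind is not ResourceKind.ACTOR:
                raise ValueError(f"role {role!r} must be held by an actor")

    def resources(self, kind: ResourceKind) -> List[Resource]:
        """Return the distinct resources of one kind tied to the activity."""
        found: List[Resource] = []
        for res in self.uses + self.produces + list(self.roles.values()):
            if res.kind is kind and res not in found:
                found.append(res)
        return found


# Illustrative names only; they do not describe an actual RASKE process.
ministry = Resource("Ministry", ResourceKind.ACTOR)
editor = Resource("Text editor", ResourceKind.SYSTEM)
bill = Resource("Government Bill", ResourceKind.DOCUMENT)

drafting = Activity(
    name="Draft bill",
    uses=[editor],
    produces=[bill],
    roles={"author": ministry},
)
```

Keeping documents and systems as separate resource kinds mirrors the point made above: a document such as `bill` can outlive the system that was used to produce it.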

Information pieces needed and produced during an activity are stored in many different ways: in the heads and experience of people, in the organizational culture, as hardware and software solutions, and as data in documents and applications. If the notion of information is understood according to the sense-making theory of Dervin (1992) as the sense created in a situation, at a specific moment in time and space by a reader (where Dervin means a human reader), then information is subjective and the information needed by a person in order to perform an activity may be a complicated combination of pieces coming from different sources.

An EDM environment may be confined to a single organization. In the current networked world, however, business processes often concern several organizations, and resources are to some extent shared by those organizations. Thus the EDM environments in which a specific organization or person is involved may be quite complex.

3. Document standardization

One of the approaches for improving business processes is document standardization using application-independent standard formats. In standardization, the idea is to plan digital information structures and formats with future changes in systems in mind, instead of planning them for a specific software system. The rules associated with a document, its authoring, and its storage format are intended to support a consistent understanding of the content by the authors and different readers, even in situations where the software and hardware change. Sprague (1995) suggests the development of an electronic document management strategy in an organization. Standardization can be taken as such a strategy.

3.1. RASKE as a standardization project

One example of a standardization project is RASKE. The term RASKE comes from the Finnish words Rakenteisten AsiakirjaStandardien KEhittäminen, meaning the development of standards for structured documents. The project was commenced in spring 1994 by the Finnish Parliament and a software company in cooperation with researchers at the University of Jyväskylä. The Ministry of Foreign Affairs, Ministry of Finance, Prime Minister’s Office, and a publishing house also participated in the project.

Starting the RASKE project was motivated by document management problems in the Finnish Parliament and government. Teams studying the legislative work carried out in Parliament identified, for example, the following problems concerning document management (Salminen et al., 1997):

1. Incompatibilities of the systems used caused the need for repeated typing of the same piece of text, which in turn was a potential source of inconsistencies in documents.

2. Inconsistencies in document naming and document identifiers caused problems and extra work.

3. There was a lack of coordination in information management between the ministries, and between the government and Parliament.

4. In spite of the fact that almost all of the documents were digital, documents were mostly distributed on paper.

5. The retrieval techniques of different systems were heterogeneous.

6. The retrieval techniques of the electronic archiving system and the tracking system of Parliament were not satisfactory.

7. There was uncertainty concerning the future usability of the information in the archived digital documents.

The document analysis in the RASKE project concerned four domains: the enquiry process, national legislative work, Finnish participation in EU legislative work, and the creation of the state budget. During the case analyses, various methods of analysis were tested and developed. Preliminary DTDs were designed for 21 document types including, for example, Government Bill, Government Decision, Government Communication, Private Bill, Special Committee Report, Budget Proposal, and Communication of Parliament.

 
