Autonomy, Local Autonomy, and Coherence in Naming in Distributed Computer Systems

Sanjay Radia
Software Portability Laboratory
Department of Computer Science, University of Waterloo, Canada.
August 19, 1988

Overview

In a distributed system, different subsystems (for example, a machine or a network) may have the need and capability for quite different degrees of autonomy; for example, a mobile machine with a large disk that occasionally connects to a distributed system would be configured to be more autonomous than a diskless workstation that is permanently connected. We wish to understand the issues and tradeoffs in controlling the degree of autonomy.

An important factor limiting autonomy in a distributed system is dependence on remote objects. However, making these objects locally available may not be sufficient. The autonomy of a subsystem with respect to its local objects may be constrained by the need for coherence (or cooperation) in the system. Coherence can occur in different forms and at different levels[8]; for example, a common protocol may be used to allow subsystems to communicate, operations may have the same semantics for local and remote objects (this is often called network transparency), global names may be used to allow names to be freely exchanged and to make sharing easier, etc. The potential conflict between autonomy and coherence was observed early in the development of distributed systems [2,8].

The aim is to find mechanisms that support coherence within a system, but do not constrain the autonomous functioning of a subsystem with respect to its local objects. This would allow a choice between autonomy and dependence so that different subsystems can be configured with different degrees of dependence on remote objects and hence different degrees of autonomy.

We investigate two areas. First, we take a closer look at what it means for a subsystem to be autonomous, for this has been largely neglected in the literature. We aim for a broad definition which is consistent with the general meaning of the term autonomy. Second, we investigate the conflict between autonomy and coherence in naming, and outline a solution for naming communication end-points that supports coherence without limiting the autonomy of machines and networks.

Autonomy

A subsystem is autonomous if

  • it has local control, and
  • it functions independently of other subsystems.

A subsystem is said to have local control if the operations of the subsystem are locally initiated (this includes operations initiated by human users connected to the subsystem) and if the objects that are operated on are locally administered. A subsystem is said to function independently if it can perform operations on objects independently of other subsystems. Administering an object means controlling the object for local use and for use by other subsystems. When considering the autonomy of a subsystem, we include both local and remote objects accessed by the subsystem.

Webster's[3] defines autonomy as "a) having self-government, b) functioning independently of other parts or forms". Haworth, in his book on personal autonomy[4], also ascribes the attributes of self-control (or self-rule) and independence to autonomy. He distinguishes self-control from independence in that self-control has a more positive connotation, and uses the following example to illustrate: Athens loses autonomy when it comes under the domination of Sparta, and Athens becomes independent after the Spartans leave, but becomes autonomous only when it instigates self-rule. Svobodova et al.[8] mention administration and the ability to operate independently as two aspects of autonomy in distributed computer systems.

A subsystem may not be able to or may not wish to achieve complete autonomy. For example, a diskless workstation has potentially less autonomy than a workstation with a disk. A subsystem may share an object such as a database and as a consequence may not be able to administer or operate on the object independently. Autonomy, therefore, is a matter of degree. The degree of autonomy depends on the extent to which operations are locally initiated, on the operations that can be performed independently, and on the objects that can be operated on and administered independently. (There may be physical constraints, for example, due to limitations of local storage.)

Local Autonomy

In a distributed system, autonomy is affected by dependence on objects in a remote subsystem – the remote subsystem or the communication path to it may fail, or a subsystem, such as a mobile machine, may occasionally disconnect, during which time it cannot depend on remote objects. One way to increase the degree of autonomy is to reduce dependency on remote objects. However, making an object locally available may not be sufficient; autonomy may be constrained if, for example, operations on the local object involve a remote machine (for example, for permission checking or name resolution). Similarly, autonomy may be constrained if a subsystem does not have administrative control over local objects (for example, local objects may have to be shared). The design and implementation of a system to achieve coherence amongst subsystems may affect a subsystem's functioning with respect to its local objects. The question we pose is: Do mechanisms that support coherence in the system constrain the autonomous functioning of a subsystem with respect to its local objects? To deal with this question we define the term local autonomy. A subsystem has local autonomy if

  • it can locally administer its local objects, and
  • it can perform operations on its local objects independently of the rest of the system.

Local autonomy considers only local objects; the actual autonomy of a subsystem may be limited because of dependence on remote objects. For a subsystem to function autonomously with respect to an object, it is necessary that the object be locally available and that the subsystem have local autonomy. A subsystem with local autonomy can increase its degree of autonomy by reducing dependence on remote objects. Local autonomy insulates a subsystem's local functioning from changes in the environment (for example, during reconfiguration of the system), from failures of other subsystems, and from interference by and conflicts with other subsystems, and hence leads to robustness and modularity. Local autonomy is also often needed in a subsystem to reflect the administrative autonomy present in the organizational unit that uses and controls the subsystem[8]. Local autonomy is especially useful in a dynamic environment where a mobile subsystem reconnects to different parts of the system or possibly to different systems while continuing to function with respect to local objects. (There are other issues in such a dynamic environment which are not addressed here.)

Coherence and Local Autonomy in Naming

One area where a conflict occurs between coherence and autonomy is in naming. Global names support coherence because they can be used from any location and can be freely exchanged across subsystem boundaries. Global file names are used in Locus[6] and V-System[5], and global process identifiers are used in V-System to support coherence. Global names, however, limit autonomy.

A subsystem has local autonomy in naming if

  • it can independently create private names for local objects, and
  • it can bind and resolve these names locally.

Being forced to use global names to refer to local objects violates the first requirement. This is true even for hierarchical global names of the form (nameOfSubsystem, localNameOfObject). This is because an object name depends on the name of the subsystem; the object name cannot be created until the subsystem has been named, and the object name is valid only in the domains where the subsystem has that name. Whenever the subsystem is renamed (for example, during reconfiguration), names of local objects have to be changed; it is also necessary to change references to local objects stored within the subsystem because they use the old subsystem name. In many cases this may require the subsystem to be shut down and restarted with its new name (and even this works only if the references are not in storage that persists beyond restarts).

It is desirable to be able to refer to local objects in a local context without globally identifying the context. For example, if we are in a room in some building, we can refer to objects in the room without knowing the room number or the name of the building. Furthermore, the names of the room and the building can be changed without changing the names of local objects. The name of the room is not needed unless one refers to the objects from outside of the room; similarly for the name of the building. The same is true for a subsystem and its name. This suggests that names should not be fully qualified, but qualified only as far as necessary, that is, partially qualified.
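The room analogy can be made concrete with a small sketch. The Python fragment below is only an illustration under our own assumptions, not a mechanism described in this paper: the class `Context` and its methods `add_child`, `rename_child`, and `resolve` are hypothetical names. The point it shows is that a context never stores its own name, so renaming it from the outside leaves purely local references untouched.

```python
class Context:
    """A naming context (room, building, subsystem) that does not know its own name."""

    def __init__(self):
        self.objects = {}    # local name -> object
        self.children = {}   # name assigned by this context -> child context

    def add_child(self, name, child):
        self.children[name] = child

    def rename_child(self, old, new):
        # Only the enclosing context changes; the child is not touched.
        self.children[new] = self.children.pop(old)

    def resolve(self, path):
        """Resolve a tuple of names; a 1-tuple is a purely local name."""
        ctx = self
        for name in path[:-1]:
            ctx = ctx.children[name]
        return ctx.objects[path[-1]]


building = Context()
room = Context()
building.add_child("room101", room)
room.objects["lamp"] = "desk lamp"

local_ref = ("lamp",)                       # partially qualified: valid inside the room
before = room.resolve(local_ref)

building.rename_child("room101", "lab")     # reconfiguration seen from outside

after = room.resolve(local_ref)             # the local reference still resolves
outside = building.resolve(("lab", "lamp")) # outside references need the new name
```

A reference stored as `("room101", "lamp")` would have to be rewritten after the rename, which is the cost of fully qualified names described above; the local reference `("lamp",)` needs no change.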

Now we narrow our attention to machines and networks and to the problem of naming communication end-points. We give a brief overview of the problem and a solution; for more details refer to [7]. For concreteness, we discuss distributed systems designed as systems of communicating processes where a communication end-point is a process identified by its process identifier (pid).

To have coherence in systems of communicating processes, it is desirable that the interprocess communication primitives provide the same semantics for both local and remote communication, and that a process be able to exchange a pid in a message across machine and network boundaries. A pid is often exchanged in a message, for example, in a client-server interaction (especially with a name server), in a multi-process application, etc. [7].

Fully qualified, hierarchical pids consisting of a network, machine, and local identifier are typically used because they can be freely exchanged. However, as mentioned above, this limits autonomy. The limitation becomes apparent when we consider changing the identifier of a subsystem (a machine or a subnetwork) because of a reconfiguration or because the (mobile) subsystem reconnects to a different part of the system or to a different system. In such a situation, it is desirable for the subsystem to continue functioning with respect to local processes; however, references to local processes using the old identifier may exist in the subsystem and could require the subsystem to be shut down and restarted (this could be difficult in the case of a subnetwork).

Even flat, universal machine identifiers such as the 48-bit Ethernet identifiers in [1] do not help because pids, which typically contain a network identifier to help in routing[1], would change when relocating to another network. Besides, we wish to support the local autonomy of subnetworks, and our argument applies equally to network identifiers.

We propose using partially qualified pids, in which a local, a machine, and a network identifier are specified only if necessary, and mapping the pids received in messages. A process with local identifier l on machine m and network n has the following pids depending on the context of reference: (0, 0, 0), (0, 0, l), (0, m, l), and (n, m, l). The kernel ensures that a pid returned by a primitive such as receive or createProcess is minimally qualified. However, a pid is resolved differently in different contexts and therefore a pid received in a message must be mapped before it is used. A pid is mapped using a simple rule by the receiving kernel (if messages are typed) or by the receiving process[7]. The solution can be extended to allow interactions between autonomous domains that do not share a common namespace.
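The mapping rule itself is given in [7] and is not reproduced here. As a rough illustration only, the Python sketch below shows one plausible rule, which is our assumption and not necessarily the rule of [7]: the receiver fills in the unspecified leading fields of a received pid from the sender's pid as the receiver sees it. The names `Pid` and `map_received` are hypothetical, and we read (0, 0, 0) as "the process in whose context the pid is interpreted", which is also our interpretation.

```python
from typing import NamedTuple


class Pid(NamedTuple):
    net: int      # 0: same network as the context of reference
    machine: int  # 0: same machine as the context of reference
    local: int    # 0: the context process itself (our reading of (0, 0, 0))


def map_received(pid: Pid, sender: Pid) -> Pid:
    """Re-express a pid written in the sender's context in the receiver's context.

    `sender` is the sender's pid as seen by the receiver, for example the
    value returned by a receive primitive. Assumes well-formed pids, where
    zero fields form a prefix: (0, 0, 0), (0, 0, l), (0, m, l) or (n, m, l).
    """
    if pid.net != 0:
        return pid                                        # already fully qualified
    if pid.machine != 0:
        return Pid(sender.net, pid.machine, pid.local)    # on the sender's network
    if pid.local != 0:
        return Pid(sender.net, sender.machine, pid.local) # on the sender's machine
    return sender                                         # the sender itself


# Sender on the same machine: a local pid stays local.
same_machine = map_received(Pid(0, 0, 7), Pid(0, 0, 5))
# Sender on machine 3 of the same network: the pid gains a machine field.
same_network = map_received(Pid(0, 0, 7), Pid(0, 3, 5))
# Sender on another network: the pid becomes fully qualified.
other_network = map_received(Pid(0, 4, 7), Pid(2, 3, 5))
```

Under this assumed rule the receiver needs no global knowledge of its own machine or network identifier, only the sender's pid in its own context. The result is valid in the receiver's context but not always minimally qualified, for instance when the mapped pid happens to name a process on the receiver's own machine.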

Partially qualified names have been previously used in computer systems for the convenience of human users and to allow compatibility when a system is extended to access remote objects. We are suggesting that they be used even when an existing hierarchical structure allows fully qualified names and even when the names are not intended for human usage. They support local autonomy in naming, which is especially useful for reconfigurations and for dynamic environments.

Concluding Remarks

It would be useful to investigate other forms of coherence, besides the use of global names, that constrain the local autonomy of a subsystem. We believe coherence in dealing with authorization is likely to constrain local autonomy in a similar way. It would also be useful to investigate other factors besides dependence on remote objects that affect autonomy (for example, sharing).

References


[1] Y. K. Dalal. Use of Multiple Networks in Xerox' Network Systems. In COMPCON, 1982, pages 391-397.

[2] P. H. Enslow Jr. What is a "Distributed" Data Processing System? Computer, 13-21, Jan. 1978.

[3] D. B. Guralnik, editor. Webster's New World Dictionary. Simon and Schuster, New York, 1982.

[4] L. Haworth. Autonomy: An Essay in Philosophical Psychology and Ethics. Yale University Press, 1986.

[5] T. P. Mann. Decentralized Naming in Distributed Computer Systems. Stanford University, CS-87-1179.

[6] G. Popek and B. J. Walker, editors. The LOCUS Distributed System Architecture, The MIT Press, 1985.

[7] S. Radia and J. Pachl. Autonomy and Transparency in Naming Communication End-Points in Distributed Systems. University of Waterloo, UW/ICR 88-04 (June 1988).

[8] L. Svobodova, B. Liskov, and D. D. Clark. Distributed Computer Systems: Structure and Semantics. MIT/LCS/TR-215, 1979.