Collaboratories are inherently real-time applications. Participants collaborate over data in real time and expect to see and propagate their ideas and actions as if they were working in the same physical space. In a wide-area collaboratory, specifically an Internet collaboratory, hard real-time delivery guarantees cannot be met; however, soft guarantees on data delivery can be provided that give the illusion of a simultaneous meeting.
Participants in a wide-area collaboratory vary in their hardware resources, software support, and quality of connectivity. In an environment such as the Internet, these participants are interconnected by network links with highly variable bandwidth, latency, and loss characteristics. In fact, the explosive growth of the Internet and the proliferation of intelligent devices are widening an already large gap between these clients. Collaboratory participants share many types of media: video, audio, text, pictures, data from real-time instruments, etc. The throughput required to support these multimedia streams simultaneously can often exceed the bandwidth available to slower participants. A significant problem, therefore, is bandwidth allocation for these poorly connected participants. A collaboratory's bandwidth allocation scheme must effectively support group collaboration. We argue that, because of the high-level semantic constraints on this bandwidth, an application-level mechanism for bandwidth allocation should be used.
We have implemented a middleware architecture that supports the allocation of this scarce resource by providing application-level, semantic-based Quality of Service policies. This work is motivated by our experiences with the UARC distributed system over the last four years. The Upper Atmospheric Research Collaboratory (UARC) is an experimental distributed testbed developed at the University of Michigan to examine issues in supporting collaborative scientific work over wide-area networks. The UARC testbed connects a geographically dispersed community of space scientists via the Internet. These scientists perform experiments on remote instruments, evaluate their work, and discuss experimental results in real time over the Internet. The main media that the UARC scientists share are real-time data sets from remote scientific instruments, periodic still photographs, and text-based chat messages.
Notice that none of the UARC media types can tolerate arbitrary packet loss in transmission. This illustrates a split between data that must be received in full and data that can tolerate loss. Specifically, we distinguish between these two types of real-time data by placing them in one of two categories: continuous or discrete. Continuous data are those characterized by a continuous stream of packets. Examples of continuous data are video and audio streams. The semantic information in these streams is spread over many network packets, not all of which need to be received to provide useful information. Discrete data are those that carry a separable, self-contained unit of semantic information in a specific number of network packets. Examples of discrete data are pictures, PostScript files, instrument data sets, Java code, etc. In general, all of the network packets of a discrete data item must be received for it to provide useful information.
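The distinction between continuous and discrete data can be made concrete with a small sketch. The code below is an illustration only and is not taken from the UARC implementation: the names `MediaKind`, `MediaUnit`, and `is_useful`, as well as the `min_fraction` tolerance for continuous streams, are hypothetical and chosen purely to express the rule that a discrete item is useful only when every packet has arrived, whereas a continuous stream remains useful under partial loss.

```python
from dataclasses import dataclass
from enum import Enum


class MediaKind(Enum):
    CONTINUOUS = "continuous"  # e.g. video or audio streams
    DISCRETE = "discrete"      # e.g. pictures, instrument data sets


@dataclass
class MediaUnit:
    """A unit of shared media split across a known number of packets."""
    name: str
    kind: MediaKind
    total_packets: int


def is_useful(unit: MediaUnit, received_packets: int,
              min_fraction: float = 0.5) -> bool:
    """Return True if the packets received so far provide useful information.

    min_fraction is an assumed, illustrative loss tolerance for continuous
    data; the source text does not specify any particular threshold.
    """
    if not 0 <= received_packets <= unit.total_packets:
        raise ValueError("received_packets out of range")
    if unit.kind is MediaKind.DISCRETE:
        # Discrete data: every packet must arrive.
        return received_packets == unit.total_packets
    # Continuous data: partial delivery still carries useful information.
    return received_packets >= min_fraction * unit.total_packets


photo = MediaUnit("still photograph", MediaKind.DISCRETE, total_packets=40)
video = MediaUnit("video stream", MediaKind.CONTINUOUS, total_packets=40)
```

Under these assumptions, a photograph missing a single packet is not useful, while a video stream that has lost a quarter of its packets still is. This asymmetry is why bandwidth allocated to a discrete item is wasted unless the whole item can be delivered.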
Since the majority of the UARC media are discrete, we focused our collaboratory architecture on supporting bandwidth allocation for discrete real-time data. However, we made sure that our architecture could easily accommodate continuous real-time data.