CSE Technical Reports Sorted by Technical Report Number

TR Number

Title

Authors

Date

Pages

CSE-TR-485-04

Infeasible Group Inversion and Broadcast Encryption

Irrer Lokam Opyrchal and Prakash

Jan, 04

Broadcast encryption allows a sender to broadcast messages in such a way that only the sender-specified subset of users can decrypt them. Over the years a number of schemes have appeared that solve different variants of the problem. The number of applications that could take advantage from an efficient broadcast encryption scheme has also grown and includes such examples as content-based publish subscribe systems. Such systems are characterized by messages going to arbitrary and rapidly changing subsets of users. Each message can potentially go any of the $2^n-1$ possible subsets. Collusion resistance and small resource requirements are necessary. In this paper we present a simple and efficient broadcast encryption scheme based on one-way group operations. A group operation $gpop$ is one-way if $x gpop y$ is easy to perform for any two group elements $x$ and $y$, but computing the inverse $inv{x}$ (w.r.t. $gpop$) is infeasible. Groups with infeasible inversion have recently received attention as a cryptographic primitive and their existence is still uncertain. Under the assumption that such groups exist, we prove the security of our broadcast encryption scheme and explore a few areas where such constructions might come from. Finally, we discuss some functions used in existing cryptosystems and compare them with one-way group operations.

CSE-TR-486-04

Coordinated Navigation of Multiple Independent Disk-Shaped Robots

Karagoz Bozma and Koditschek

Feb, 04

This paper addresses the coordinated navigation of multiple independently actuated disk-shaped robots - all placed within the same disk-shaped workspace. We encode complete information about the goal, obstacles and workspace boundary using an artificial potential function over the cross product space of the robots’ simultaneous configurations. The closed-loop dynamics governing the motion of each robot take the form of the approriate projection of the gradient of this function. We show, with some reasonable restrictions on the allowable goal positions, that this function is an essential navigation function - a special type of artificial potential function that is ensured of connecting the kinematic planning with the dynamic execution in a correct manner. Hence, each robot is guaranteed of collision-free navigation to its destination from almost all initial free placements.

CSE-TR-487-04

Practical Slicing and Non-slicing Block-Packing without Simulated Annealing

Chan and Markov

Feb, 04

We propose a new block-packer BloBB based on multi-level branch-and-bound. It is competitive with annealers in terms of runtime and solution quality. We empirically quantify the gap between optimal slicing and non-slicing oorplans by comparing optimal packings and best seen results. Most ongoing work deals with non-slicing packings, and implicitly assumes that best slicing packings are highly sub-optimal. Contrary to common belief, we show that the gap in optimal slicing and non-slicing packings is very small. Optimal slicing and non-slicing packings for apte, xerox and hp are reported. We extend BloBB to the block-packer CompaSS, that handles soft blocks. Optimal slicing packings for soft versions of apte, xerox and hp are reported. We discover that the soft versions of all MCNC benchmarks, except for apte, and all GSRC benchmarks can be packed with zero dead-space. Moreover, the aspect ratio bound [0.5,2] turns out to be not very restrictive, when area is concerned. Our heuristic slicing block-packer is able to pack with zero dead-space in most cases when we restrict the aspect ratio bound to [0.57,1.70]. CompaSS improves the best published results for the ami49 X benchmarks suite, outperforming the leading multilevel annealer in runtime, solution quality and scalability. Additionally, realistic floorplans often have blocks with similar dimensions, if design blocks, such as memories, are reused. We show that this greatly reduces the complexity of black-packing.

CSE-TR-488-04

A Compressed Memory Hierarchy Using an Indirect Index Cache

Hallnor and Reinhardt

Mar, 04

The large and growing impact of memory hierarchies on overall system performance compels designers to investigate innovative techniques to improve memory-system efficiency. We propose and analyze a memory hierarchy that increases both the effective capacity of memory structures and the effective bandwidth of interconnects by storing and transmitting data in a compressed form. Caches play a key role in hiding memory latencies. However, cache sizes are constrained by die area and cost. A caches effective size can be increased by storing compressed data, if the storage unused by a compressed block can be allocated to other blocks. We use a modified Indirect Index Cache to allocate variable amounts of storage to different blocks, depending on their compressibility. By coupling our compressed cache design with a similarly compressed main memory, we can easily transfer data between these structures in a compressed state, increasing the effective memory bus bandwidth. This optimization further improves performance when bus bandwidth is critical. Our simulation results, using the SPEC CPU2000 benchmarks, show that our design increases performance by up to 107% on some benchmarks while degrading performance by no more thatn 2% on others. Compressed bus transfers alone account for up to 59% of this improvement, with the remainder coming from increased effective cache capacity. As memory latencies increase, our design becomes even more beneficial.

CSE-TR-489-04

Network Overlay Construction under Limited End-to-End Addressability

Wenjie Wang, Cheng Jim ad Sugih Jamin

May, 2004

Network overlay construction has found many applications in today's Internet, such as P2P networks [1][2], end-host multicast [3][4], and network security [5]. Many researchers have proposed solutions to building an overlay network among a given group of end-hosts. The usual overlay construction assumes two-way network communication— each host can initiate connections to and accept requests from other hosts. This is not always true on the Internet due to the use of Network Address Translation (NAT) and rewalls. Our experiments with one P2P le-sharing system revealed that over 34% of hosts were guarded hosts, i.e., hosts that cannot accept connections from other Internet hosts. The lack of two-way communication capability presents a challenge because only a subset of hosts can act as overlay routers. Using a cluster-base approach, we design a new overlay construction mechanism called e*, which organizes members based on reachability and intelligently select overlay routers to reduce end-to-end latencies on the overlay. Under realistic scenarios involving guarded hosts, e* can reduce average end-to-end latency on the overlay by 28-61% compared to existing protocols. Furthermore, e* signicantly improves the worst case end-to-end latency on the overlay.

CSE-TR-489-04

Network Overlay Construction under Limited End-to-End Addressability

Wenjie Wang, Cheng Jim ad Sugih Jamin

May, 2004

CSE-TR-490-04

Data Propagation in a Distributed File System

Kim, Noble, Chen, Tilbury

July, 04

Distributed file systems benefit greatly from optimistic replication that allows clients to update any replica at any site. However, such systems face a new dilemma: Once data is updated at a replica site, when should it be shipped to others? In conventional file system workloads, most data is read by its writer, so it needs to be shipped only for administrative reasons. Unfortunately, shipping on demand substantially penalizes those who do share, while shipping aggressively places undue load on the system, limiting scalability. This paper explores a mechanism to predict and support sharing in a wide area file system. This mechanism uses the observation that updates that invalidate cached objects elsewhere are likely to be shared, and they should be propagated. The mechanism is evaluated on its precision, recall, and F-measure over traces capturing live file system uses. It reduces data shipped by orders of magnitude compared to aggressive schemes, while reducing the performance penalty of on-demand shipment by nearly a factor of two.

CSE-TR-491-04

The Internet Motion Sensor: A distributed global scoped Internet threat monitoring system

Cooke, Bailey, Watson, Jahanian and Nazario

July, 2004

Networks are increasingly subjected to a broad spectrum of threats that impact the reliability and availability of criticalinfrastructure. In response, researchers and network operators have increasingly relied on monitoring to characterize and track these threats. This paper introduces the Internet Motion Sensor (IMS), a globally scoped Internet threat monitoring system whose goal is to measure, characterize, and track threats. The dark address sensors in the IMS extend simplepassive capture using a novel transport layer service emulation technique to elicit payloads across all services, thereby addressing the issue depth of service coverage. To achieve breadth of coverage, the IMS employs a distributed infrastructure and utilizes sensors that are aware of their address diversity and their position in the actively routedtopology. Finally, the IMS uses an innovative signature encoding and data warehousing system combined with a hierarchical architecture to realize a system that is not only time and space efficient, but is also scalable to a global deployment. We explore the various architectural tradeoffs in the context of a 3 year deployment across multiple darkaddress blocks ranging in size from /24s to a /8. We show how the current architecture emulates services across a diverse set of routed and address topologies in a scalable manner. Results from three recent events are presented to illustrate the utility of such a system: the SCO Denial of Service attacks (December, 2003), the Blaster worm (August,2003), and the Bagle backdoor scanning efforts (March, 2004).

CSE-TR-492-04

Pueblo: A Modern Pseudo-Boolean SAT Solver

Sheini and Sakallah

July, 2004

In this report we introduce a new SAT solver that integrates logic-based reasoning and integer programming methods to systems of CNF and PB constraints. Its novel features include an efficient PB literal watching strategy that takes advantage of the preponderance of unit-coefficient literals in most PB constraints. Additionally, the solver incorporates several PB learning methods that take advantage of the pruning power of PB constraints while minimizing their overhead. Empirical evidence suggests that such judicious injection of IP techniques can be quite effective in practice.

CSE-TR-493-04

CIDS: Causality Based Intrusion Detection System

King, Mao and Chen

August, 2004

This paper presents a new style of intrusion detection system, CIDS, that links network and host based intrusion detection systems together using causal dependencies. This combination provides a number of benefits. First, combining alerts reduces false positives for both network and host based approaches. Second, host based intrusion detection systems have significantly less data to process since they only act on processes and files that are causally linked to a suspicious network packet. Third, alarms are associated with a specific network service, allowing system administrators to act accordingly after detecting a compromise. To show the utility of this new system, we describe how several existing intrusion detection systems benefit from using our platform, and we introduce a new host based intrusion detection system that leverages the full power of causal dependency tracking.

CSE-TR-494-04

What and How Much to Gain by Spectral Agility?

Chou, Kim and Shin

August, 2004

Static spectrum allocation has resulted in low spectrum efficiency in licensed bands and poor performance of radio devices in crowded unlicensed bands. To remedy these problems, we exploit the concept of ``spectral agility" such that radio devices can dynamically utilize idle spectral bands. We establish a mathematical model for the performance gain made by spectral agility, and use the model to evaluate important performance metrics such as spectrum efficiency, throughput of a spectral-agile network, and packet blocking/waiting time of spectral-agile devices. We propose three basic mechanisms to realize spectral-agile networks: spectrum opportunity discovery, spectrum opportunity management, and spectrum use coordination. These mechanisms are implemented in the ns-2, and the control overhead incurred by using spectral agility is evaluated. Our simulation results have shown that the throughput of a spectral-agile network is improved by up to 90%, and the improvement is very close to the performance bound predicted by our analytical (mathematical) model. These results demonstrate and confirm the spectral agility's capability of improving spectral utilization in an efficient, distributed, and automatic manner.

CSE-TR-495-04

Debugging operating systems with time-traveling virtual machines

King, Dunlap, and Chen

August, 2004

Operating systems are among the most difficult of software systems to debug with traditional cyclic debugging. They are non-deterministic; they run for long periods of time; their state and code is large and complex; and their state is easily perturbed by the act of debugging. This paper describes a time-traveling virtual machine that overcomes many of the difficulties associated with debugging operating systems. By time travel, we mean the ability to navigate backward and forward arbitrarily through the execution history of a particular run and to replay arbitrary segments of the past execution. We integrate time travel into a general-purpose debugger to enable a programmer to debug an OS in reverse, implementing commands such as reverse breakpoint, reverse watchpoint, and reverse single step. The space and time overheads needed to support time travel are reasonable for debugging, and movements in time are fast enough to support interactive debugging. We demonstrate the value of our time-traveling virtual machine by using it to understand and fix several OS bugs that are difficult to find with standard debugging tools.

CSE-TR-496-04

Enriching intrusion alerts through multi-host causality

King, Mao, Lucchetti, and Chen

August, 2004

Current intrusion detection systems point out suspicious states or events but do not show how the suspicious state or events relate to other states or events in the system. We show how to enrich an IDS alert with information about how those alerts causally lead to or result from other events in the system. By enriching IDS alerts with this type of causal information, we can leverage existing IDS alerts to learn more about the suspected attack. Backward causal graphs can be used to find which host allowed a multi-hop attack (such as a worm) to enter a local network; forward causal graphs can be used to find the other hosts that were affected by the multi-hop attack. We demonstrate this use of causality on a local network by tracking the Slapper worm, a manual attack that spreads via several attack vectors, and an e-mail virus. Causality can also be used to correlate distinct network and host IDS alerts. We demonstrate this use of causality by correlating Snort and host IDS alerts to reduce false positives on a testbed system connected to the Internet.

CSE-TR-497-04

Design and Applications of a Virtual Context Architecture

Oehmke, Binkert, Reinhardt, Mudge

September, 2004

This paper proposes a new register-file architecture that virtualizes logical register contexts. This architecture makes the number of active register contexts—representing different threads or activation records—independent of the number of physical registers. The physical register file is treated as a cache of a potentially much larger memory-mapped logical register space. The implementation modifies the rename stage of the pipeline to trigger the movement of register values between the physical register file and the data cache. We exploit the fact that the logical register mapping can be easily updated—simply by changing the base memory pointer—to construct an efficient implementation of register windows. This reduces the execution time by 8% while generating 20% fewer data cache accesses. We also use the large logical register space to avoid the cost of a large physical register file normally required for a multithreaded processor, allowing us to create an SMT core with fewer physical registers than logical registers that still performs within 2% of the baseline. Finally, the two are combined to create a simultaneous multithreaded processor that supports register windows. This architecture achieves a 10% increase in performance over the baseline architecture even with fewer physical than logical registers while also reducing data cache bandwidth. Thus we are able to combine the advantages of register windows with multithreading.

CSE-TR-498-04

Plato: A Platform for Virtual Machine Services

King, Dunlap, and Chen

October, 2004

Virtual machines are being used to add new services to system level software. One challenge these virtual machine services face is the semantic gap between VM services and the machine-level interface exposed by the virtual machine monitor. Using the virtual machine monitor interface, VM services have access to hardware-level events like Ethernet packets or disk I/O. However, virtual machine services also benefit from guest software (software running inside the virtual machine) semantic information, like sockets and files. These abstractions are specific to the guest software context and are not exposed directly by the machine-level virtual machine monitor interface. Existing ways to bridge this semantic gap are either ad-hoc or use debuggers. Ad-hoc methods often lead to cutting-and-pasting large sections of the guest operating system to reconstruct its interpretation of the hardware level events. Debuggers add too much overhead for production environments. Both ad-hoc methods and debuggers could cause unwanted perturbations to the virtual system. To address these shortcomings, we developed a new platform for implementing virtual machine services: Plato. The goal of Plato is to make it easy and fast to develop and run new virtual machine service. Plato allows VM services to call guest kernel functions and access guest local variables, eliminating the need to cut-and-paste sections of the virtual machine source code. Plato provides a checkpoint/rollback facility that allows VM services to correct for undesired perturbations to the virtual state. Plato adds less than 5% overhead for a variety of macrobenchmarks.

CSE-TR-499-04

A Hybrid Honeypot Architecture for Scalable Network Monitoring

Bailey, Cooke, Watson, Jahanian and Provos

October, 2004

To provide scalable, early warning and analysis of new Internet threats like worms or automated attacks, we propose a globally distributed, hybrid monitoring architecture that can capture and analyze new vulnerabilities and exploits as they occur. To achieve this, our architectures increases the exposure of high-interaction honeypots to these threats by employing low-interaction honeypots as frontend content filters. Host-based techniques capture relevant details such as packet payload of attacks while network monitoring provides wide coverage for quick detection and assessment. To reduce the load of the backends, we filter prevalent content at the network frontends and use a novel handoff mechanism to enable interactions between network and host components. We use measurements from live networks over five months to demonstrate the effectiveness of content prevalence as a filtering mechanism. Combining these observations with laboratory measurements, we demonstrate that our hybrid architecture is effective in preserving the detail of a specific threat while still achieving performance and scalability. We illustrate the benefits of this framework by showing how it enables earlier, higher-confidence detection, more detailed forensics, and robust signatures for mitigation of threats.

CSE-TR-500-04

Weakly Supervised Graph-Based Methods for Classification

Radev

October, 2004

We compare two weakly supervised graph-based classification algorithms: spectral partitioning and tripartite updating. We provide results from empirical tests on the problem of number classification. Our results indicate (a) that both methods require minimal labeled data, (b) that both methods scale well with the number of unlabeled examples, and (c) that tripartite updating outperforms spectral partitioning.

CSE-TR-501-04

Media-Aware Rate Control

Wang, Banerjee and Jamin

November, 2004

Streaming media transfers over the Internet are expected to behave in a TCP friendly manner and react appropriately to congestion similar to the way TCP does. However, our work shows that existing "TCP-friendly" protocols are not necessarily "media friendly". In this paper, we present our findings on the impact of the TFRC protocol on streaming media quality. We propose the MARC (Media Aware Rate Control) protocol that, unlike TFRC, exhibits significant tolerance towards transient changes in background workload, while still maintaining TCP-friendliness.

CSE-TR-502-04

Network Maps Beyond Connectivity

Jin, Wang and Jamin

November, 2004

Knowing network topology is becoming increasingly important for a number of applications such as server placement and traceback of DDoS attacks. Recent works in modeling the Internet topology and constructing network maps have focused on the connectivity aspect. This paper describes the first study in incorporating connectivity, latency, and routing information all into a network map based on a large set of traceroute data. We highlight the common challenges in constructing such network maps and discuss our solutions. We evaluate our network map based on various Internet routing models proposed in the literature. The evaluation shows that, for those traceroute data that we are able to evaluate, at least 85% of computed hop-counts and latencies are within a factor of two of the actual values. Furthermore, we show that a flat routing model based on hop-count performs as well as more complicated policy routing models.

CSE-TR-503-04

Worm Hotspots: Explaining Non-Uniformity in Worm Targeting Behavior

Cooke, Mao, and Jahanian

November, 2004

Long after the Blaster, Slammer/Sapphire, and CodeRedII worms caused significant worldwide disruptions, a huge number of infected hosts from these worms continue to probe the Internet today. This paper investigates hotspots (non-uniformities) in the targeting behavior of these important Internet worms. Recent data collected over the period of a month and a half using a distributed blackhole data collection infrastructure covering 18 networks including ISPs, enterprises, and academic networks show 75K Blaster infected hosts, 180K slammer infected hosts, and 55K CodeRedII hosts. We discover through detailed analysis how critical flaws and side effects in the targeting behavior lead to a significant bias for certain destination address blocks. In particular, we demonstrate three previously unexplored biases: a severely restricted initial random seed forcing infection attempts to certain blocks; flaws in the parameters of a random number generator making certain hosts cycle through limited target addresses; and the widespread use of private address space dramatically changing the targeting distribution of certain worms. A direct consequence of these biases is that certain blocks are subjected to far more infection attempts than others. We discuss the implication of these hotspots on worm simulation and modeling, placement of blackhole sensors, worm detection and quarantine.

CSE-TR-504-04

Cooperative ReVirt: Adapting Message Logging for Intrusion Analysis

Basrai and Chen

Nov, 2004

Virtual-machine logging and replay enables system administrators to analyze intrusions more completely and with greater integrity than traditional system loggers. One challenge in these types of systems is the need to log a potentially large volume of network traffic. Cooperative ReVirt adds message-logging techniques to ReVirt to reduce the amount of network traffic that needs to be logged. Cooperative ReVirt adapts message-logging techniques to address the challenges of intrusion analysis, such as the need to work in the presence of network attacks and unreliable networks, the need to support asymmetric trust relationships among computers, and the need to support dynamic trust and traffic patterns. CooperativeReVirt is able to reduce the log volume needed to replay a computer by an average of 70% for a variety of distributed computing benchmarks, while adding less than 7% overhead. Measurements of a live network indicate that Cooperative ReVirt would be able to avoid logging 85% of the received network data.

CSE-TR-505-04

Analyzing NIC Overheads in Network-Intensive Workloads

Binkert, Hsu, Saidi, Dreslinski, Schultz and Reinhardt

December, 2004

Modern high-bandwidth networks place a significant strain on host I/O subsystems. However,despite the practical ubiquity of TCP/IP over Ethernet for high-speed networking, the vast majority of end host networking research continues in the current paradigm of the network interface as a generic peripheral device. As a result, proposed optimizations focus on purely software changes, or on moving some of the computation from the primary CPU to the off-chip network interface controller (NIC). We look at an alternative approach: leave the kernel TCP/IP stack unchanged, but eliminate bottlenecks by closer attachment of the NIC to the CPU and memory system. To evaluate this approach, we have developed a simulation environment specifically targeted for networked systems. It simulates server and client systems along with a network in a single process. Full-system simulation captures the execution of both application and OS code. Our model includes a detailed out-of-order CPU, event-driven memory hierarchy, and Ethernet interface device. Using this simulator, we find that tighter integration of the network interface can provide benefits in TCP/IP throughput and latency. We also see that the interaction of the NIC with the on-chip memory hierarchy has a greater impact on performance than the raw improvements in bandwidth and latency that come from integration.

Technical Reports Page