CSE Technical Reports Sorted by Technical Report Number

TR Number Title Authors Date Pages

CSE-TR-514-06 A Simple Integrated Network Interface for High-Bandwidth Servers Nathan L. Binkert, Ali G. Saidi, and Steven K. Reinhardt January, 2006 22
High-bandwidth TCP/IP networking places a significant burden on end hosts. We argue that this issue should be addressed by integrating simple network interface controllers (NICs) more closely with host CPUs, not by pushing additional computation out to the NICs. We present a simple integrated NIC design (SINIC) that is significantly less complex and more flexible than a conventional DMA-descriptor-based NIC but performs as well or better than the conventional NIC when both are integrated onto the processor die. V-SINIC, an extended version of SINIC, provides virtual per-packet registers, enabling packet-level parallel processing while maintaining a FIFO model. V-SINIC also enables deferring the copy of the packet payload on receive, which we exploit to implement a zero-copy receive optimization in the Linux 2.6 kernel. This optimization improves bandwidth by over 50% on a receive-oriented microbenchmark.

CSE-TR-515-06 Language and Analysis Techique for Efficient Software Model Checking of Data Structure Properties Paul Darga and Chandrasekhar Boyapati January, 2006 10
This paper presents novel language and analysis techniques that significantly speed up software model checking of data structure properties. Consider checking a red-black tree implementation. Traditional software model checkers systematically generate all red-black tree states (within some given bounds) and check every red-black tree operation (such as insert, delete, or lookup) on every red-black tree state. Our key idea is as follows. As our checker checks a red-black tree operation o on a red-black tree state s, it uses a combination of dynamic and static analyses to identify other red-black tree states s'_1, s'_2, ..., s'_k on which the operation o behaves similarly. Our analyses guarantee that if o executes correctly on s, then o will execute correctly on every s'_i. Our checker therefore does not need to check o on any s'_i once it checks o on s. It thus safely prunes those state transitions from its search space, while still achieving complete test coverage within the bounded domain. Our preliminary results show orders of magnitude improvement over previous approaches. We believe our techniques can make checking of data structure properties significantly faster, and thus enable checking of much larger programs than currently possible.

CSE-TR-516-06 Accurate Real-time Identification of IP Hijacking Xin Hu and Z. Morley Mao May, 2006 28
In this paper, we present novel and practical techniques to accurately detect IP prefix hijacking attacks in real time to facilitate timely mitigation responses. There are strong evidences that IP hijacking is common on today¡¯s Internet. Attackers may hijack victim¡¯s IP address space to perpetrate malicious activities such as spamming and launching DoS attacks without worrying about disclosing their identity through source IP addresses. More seriously, they can disrupt network services or regular communication by temporarily stealing actively used addresses. Unintentional network misconfigurations can also have similar effects, possibly leading to severe impact on reachability. We propose novel ways to much more accurately detect IP hijacking by combining analysis of passively collected BGP routing updates and data plane fingerprints of suspicious prefixes. The key insight is to use data plane information in the form of edge network fingerprinting to disambiguate potentially numerous suspect IP hijacking incidences based on routing anomaly detection. Previous work on identifying IP hijacking solely relies on control plane information in the form of anomalous routing updates or external data such as stale address registries. Such an approach is inaccurate, suffering from too many false positives to be useful in practice. In our proposed scheme, real-time fingerprinting provides confirming evidence for hijacking, while incurring little overhead. More importantly, we provide mechanisms to perform online mitigation rather than post-mortem analysis. Utilizing real-time BGP data from multiple feeds as well as RouteViews, we demonstrate the ability of our system to distinguish between legitimate routing changes and hijacking attacks.

CSE-TR-517-06 OpenFire: Opening Networks to Reduce Network Attacks on Legitimate Services Kevin Borders, Laura Falk, and Atul Prakash May, 2006 18
Remote network attacks are a serious problem facing network administrators in universities, corporations, and governments today. Hackers, worms, and spammers will exploit and control hosts inside of a network to send junk E-mail, create bot nets, launch DDoS attacks, and steal sensitive information. Current solutions exist that attempt to deal with these threats, but they are often unsuccessful. In addition, security researchers have deployed honeypots to lure attackers and learn more about their techniques, but honeypots have had only limited success at gaining information and protecting other machines. In response to need for better security against network attacks, we present OpenFire. OpenFire is protection system that takes the opposite approach of traditional firewalls: instead of blocking unwanted traffic, it instead accepts {it all} traffic, forwarding unwanted messages to a cluster of decoy machines. To the outside, all ports and all IP addresses appear open in an OpenFire network. To evaluate OpenFire, we deployed it in a live sub-network here at the University of Michigan. During a three-week evaluation period, we found that OpenFire reduced the number of attacks on legitimate computers by 57%, while honeypots only reduced attacks by 8%. OpenFire decoys also had considerably better exposure than honeypots, and were able to elicit over three times as many attacks during the evaluation period.

CSE-TR-518-06 Adaptive MAC-layer Sensing of Spectrum Availability in Cognitive Radio Networks Hyoil Kim and Kang G. Shin May, 2006 21
In the recently-suggested dynamic spectrum allocation policy of cognitive radio networks [1]-[3], sensing/monitoring of spectrum availability is identified as a key requirement. To meet this requirement we address an important MAC-layer sensing issue: which of proactive and reactive sensing is more energy-efficient? An algorithm is proposed to dynamically determine which sensing mode to use. For proactive sensing, sensing-period adaptation to maximize discovery of opportunities and optimal-ordering of channels to minimize the delay in finding an available channel are proposed. Channel-usage patterns are also estimated. Our simulation results demonstrate the efficacy of the proposed sensing scheme, as well as its performance improvements over the previously-proposed schemes. The sensing-period adaptation discovers up to 98% of the analytical maximum of spectral opportunities, regardless of choice of the initial sensing period. For the testing scenario and simulation parameters we used, the proposed scheme is shown to discover up to 20% more opportunities than the previous sensing schemes without sensing-period adaptation. This improvement may become greater as the initial sensing period grows. The delay in finding an idle channel with the proposed channel-ordering is around 0.02 second, which is a half of the delay without channel-ordering, and the proposed scheme yields steady performance over a wide range of the number of channels.

CSE-TR-519-06 ECO-system: Enbracing the Change in Placement Jarrod A. Roy and Igor L. Markov June, 2006 15
In a successful RTL-to-GDS flow, numerous circuit modifications and optimizations must interact with physical aspects of the design. For example, timing violations can be moderated by gate sizing and net buffering, while routing violations can often be resolved by local replacement. Such a community of tools requires a reliable system for performing Engineering Change Orders (ECOs), notably in placement. Existing work is often overly incremental and limited in the amount of design change it can support. Most of it describes stand-alone tools that offer poor interfaces to the design flow and cannot handle a full range of modern VLSI layouts. We develop a new, strong ECO-system that reliably handles fixed objects and movable macros in instances with widely varying amounts of whitespace. It detects layout regions and sections of the netlist that require modification and applies an adequate amount of change in each case: given a reasonable initial placement, it applies minimal changes, but is capable of re-placing large regions to handle pathological cases. ECO-system favorably compares to recent detail placers and fares well in high-level and physical synthesis.

CSE-TR-520-06 Malware Spreading in Mobile Enterprise Environments Abhijit Bose, Scott Boehmer and Kang G. Shin June, 2006 25
The proliferation of networked mobile devices, such as laptops, PDAs, and cellular phones, make today’s enterprises highly vulnerable to an entirely new generation of malicious bots, worms and viruses. These malicious “agents” can spread not only via traditional social engineering methods, such as email, peer-to-peer and instant messaging (IM), but also via new vectors, such as proximity scanning using Bluetooth and SMS/MMS messages. The standard epidemic models based on homogeneity, average-degree distributions, and perfect-mixing assumptions are inadequate to model malicious agents in enterprise networks with overlapping wired, wireless and cellular segments. In this paper, we study three parameters crucial to describing malware propagation in such environments: service interactions among hosts, local network structure, and user mobility. The majority of the parameters in our study are derived from real-life network traces collected from a large enterprise network, and therefore, represent realistic malware propagation and infection scenarios. We propose a general-purpose agent-based malware modeling framework targeted to enterprise environments. We examine two likely scenarios: (i) a malicious virus such as Cabir spreading among the subscribers of a cellular network using Bluetooth, and (ii) a hybrid worm that exploit email and P2P file-sharing to infect users of an enterprise network. In both cases, we identify the parameters crucial to the spread of the epidemic based upon our extensive simulation results.

CSE-TR-521-06 Node Mergers in the Presence of Don't Cares Stephen Plaza, Kai-hui Chang, Igor Markov, and Valeria Bertacco July, 2006 16
Merging equivalent nodes in a circuit is a synthesis technique useful for reducing area and improving equivalence checking. Recently proposed algorithms that determine node equivalence are unable to exploit observability don't cares (ODC) and therefore miss several merging opportunities. Current strategies for extracting don't care information for identifying additional node mergers restrict the types of don't cares computed because of computational expense. We develop an efcient framework that merges nodes by exploiting ODCs through simulation and SAT. Specically, the framework integrates (1) a novel strategy for representing ODC information during logic simulation, (2) a fast ODC-aware simulator, (3) an ODC-aware equivalence checker, and (4) progressive renement of simulation vectors. Our results indicate that ODCs enable many node mergers with up to 30% gate reduction on unoptimized circuits and 23% area reduction after synthesis.

CSE-TR-522-06 Keeping Physical Synthesis Safe and Sound Kai-hui Chang, Igor L. Markov and Valeria Bertacco July, 2006 18
In this work we define and explore the concepts of physical safeness and logical soundness, and empirically evaluate the effects of physical safeness on route length, via count and timing. In addition, we propose a new physically safe and logically sound optimization, SafeResynth, which provides immediately-measurable improvement without altering the functionality of a design. It can enhance circuit timing without detrimental effects on route length and congestion. We achieve these improvements by performing a series of netlist transformations and re-placements that are individually evaluated for logical soundness (using on-line verification) and physical safeness. When used alone, SafeResynth improves circuit delay of IWLS05 benchmarks by 11% on average after routing, while increasing route length by less than 0.2%. Our resynthesis can also be used in an unsafe mode, akin to more traditional physical synthesis algorithms popular in commercial tools. Applied together, our safe and unsafe transformations achieve 24% average delay improvement for seven large benchmarks from the OpenCores suite, which we show to be orthogonal to improvement from timing-driven placement.

CSE-TR-523-06 Improving Resiliency of Overlay Networks for Streaming Applications Wang, Du and Jamin July, 2006 18
Early deployment of peer-to-peer (P2P) streaming network demonstrates its potential to deliver quality media streams to a large audience. However, P2P streaming is vulnerable to node failures and malicious attacks because of its high bandwidth demand and stringent timing requirement. In this paper, we investigated several heuristics to improve an overlay's resilience to failures and attacks. We formalized the overlay connectivity resilience as a graph theoretic problem and proved that disjoint paths with variable lengths are effective in increasing overlays' connectivity resilience. We proposed a distributed heuristic called “MPath” that enables each peer to improve its connectivity without global knowledge or coordination. We designed the MPath heuristic to work in conjunction with existing overlay improvement mechanisms. Such integration not only speeds up overlay convergence, but also significantly cuts down MPath's maintenance overhead. Our experiments showed that MPath reduced the number of disconnected components in an overlay by an order of magnitude when 20% of overlay nodes failed.

CSE-TR-524-06 PAN-on-Demand: Building self-organizing WPANs for better power management Manish Anand and Jason Flinn August, 2006 21
In this paper, we present PAN-on-Demand, a self-organizing multi-radio system that balances performance and energy concerns by scaling the structure of the network to match the demands of applications. Since current mobile devices ship with multiple radios, PAN-on-Demand can improve performance and extend battery lifetime by switching between different wireless interfaces such as Bluetooth and WiFi and opportunistically exploiting the available power-saving strategies. When applications are actively using the network, PAN-on-Demand offers high-bandwidth, low-latency communication; when demand is light, PAN-on-Demand adapts the network structure to minimize energy usage. Our results show that PAN-on-Demand reduces the average response time of common PAN applications like MP3 playing, e-mail viewing and photo sharing by 92% and extends the battery lifetime of mobile devices by up to 47% compared to current PAN management strategies.

CSE-TR-525-06 A Type System for Preventing Data Races and Deadlocks in the Java Virtual Machine Language Pratibha Permandla and Chandrasekhar Boyapati September, 2006 10
In previous work on SafeJava we presented a type system extension to the Java source language that statically prevents data races and deadlocks in multithreaded programs. SafeJava is expressive enough to support common programming patterns, its type checking is fast and scalable, and it requires little programming overhead. SafeJava thus offers a promising approach for making multithreaded programs more reliable. This paper presents a corresponding type system extension for the Java virtual machine language (JVML). We call the resulting language SafeJVML. Well-typed SafeJVML programs are guaranteed to be free of data races and deadlocks. Designing a corresponding type system for JVML is important because most Java code is shipped in the JVML format. Designing a corresponding type system for JVML is nontrivial because of important differences between Java and JVML. In particular, the absence of block structure in JVML programs and the fact that they do not use named local variables the way Java programs do make the type systems for Java and JVML significantly different. For example, verifying absence of races and deadlocks in JVML programs requires performing an alias analysis, something that was not necessary for verifying absence of races and deadlocks in Java programs. This paper presents static and dynamic semantics for SafeJVML. It also includes a proof that the SafeJVML type system is sound and that it prevents data races and deadlocks. To the best of our knowledge, this is the first type system for JVML that statically ensures absence of synchronization errors.

CSE-TR-526-06 Improving Distributed File System Performance in Virtual Machine Environments Xin Zhao, Atul Prakash, Brian Noble, and Kevin Borders September, 2006 15
Virtual machine (VM) systems have traditionally used virtual disks for file storage. Recently, there has been interest in using distributed file systems as a way to provide data storage to guest virtual machines, with the file server running on the same physical machine. Potential advantages include fine-grained data sharing, data protection, versioning, and backup to multiple guests from one central place. Unfortunately, distributed file systems often perform substantially worse than local file systems because of high network overheads, even when the file server is on the same physical machine. This paper proposes two mechanisms to reduce communication overheads of inter-VM file system operations: Inter-VM metadata sharing and Virtual Remote Procedure Call (VRPC). Inter-VM metadata sharing enables clients to directly read file attributes from the server without an RPC exchange. VRPC follows standard RPC format but uses inter-VM shared memory to exchange data between a file server and its clients, which cuts out a lot of communication overhead. We implemented these two mechanisms on the Xen virtual machine platform and adapted NFS version 3 to take advantage of these two mechanisms. For an Apache build workload, NFS with Inter-VM metadata sharing and VPRCs was only 6.3% slower than the Ext3 file system, while standard NFS was 31.2% slower. These results suggest that Inter-VM metadata sharing and VRPCs significantly reduce communication overhead between virtual machines running on the same hardware.

CSE-TR-527-06 Design of Location Service for a Hybrid Network of Mobile Actors and Static Sensors Zhigang Chen, Min-gyu Cho and Kang G. Shin September, 2006 10
Location services are essential to many applications running on a hybrid of wirelessly-networked mobile actors and static sensors, such as surveillance systems and the Pursuer and Evader Game (PEG). To our best knowledge, there has been no previous location service protocol for wireless sensor networks. A number of location service protocols have been proposed for mobile ad hoc networks, but they are not applicable to sensor networks due to the usually large per-hop latency between sensors. In this report, we present a distributed location service protocol (DLSP) for wireless sensor networks. Using a rigorous analysis of DLSP, we derive the condition for achieving a high packet-delivery ratio, and show how to configure the protocol parameters to ensure the scalability of DLSP. We prove that DLSP is scalable if the mobile’s speed is below a certain fraction of the packet-transmission speed, which depends on a movement threshold. For example, if the movement threshold for the lowest-level location servers is the same as the radio range, the mobile’s speed limit is one-tenth of the packet-transmission speed. The mobile’s theoretical speed limit is one-fifth of the packet-transmission speed, beyond which DLSP cannot scale regardless of the movement threshold. Because DLSP suffers from a high location-update overhead, we propose an optimization, called DLSP with the Selected Neighbor (DLSP-SN), which can reduce the update overhead by more than 70%, while achieving a high packet-delivery ratio. Due to the griding effect, the average packet’s path length of DLSP-SN is longer than that of DLSP. This increases data-delivery cost for continuous data streams. In order to make a tradeoff between update and data-delivery costs, we present a greedy adaptation mechanism, called DLSP-ASN, which can make a significant improvement of overall energy-efficiency.

Technical Reports Page