CSE Technical Reports Sorted by Technical Report Number

TR Number Title Authors Date Pages

CSE-TR-578-13 Uncovering Cellular Network Characteristics: Performance, Infrastructure, and Policies Huang, Qian, Xu, Qian, Mao, and Rayes March, 2013 7
Mobile smart devices (smartphones and tablets) have become increasingly popular, especially in IT-based service management companies. According to IDC, more than 70% of executives and sales managers are replacing their PCs with tablets. This is due in part to their agility and flexibility and to the availability of diverse network-based and support applications. Network characteristics thus directly affect user-perceived performance, and a deep understanding of the properties of contemporary cellular networks for commonly used platforms is important for smartphone application and platform optimization. In this work, we carry out the largest study to date of cellular networks in terms of users, time duration, location, and networks to understand their performance, infrastructure, and policy characteristics. With a data set collected from around 100K users across the world over 18 months, MobiPerf, a smartphone network measurement tool we developed and publicly deployed, enables us to analyze network performance along several new dimensions not previously examined. Our results indicate that, with better infrastructure support, large cities appear to have better performance than rural areas. Our case study on the effect of packet size on RTT uncovers a surprising pattern for AT&T's uplink RTT. We also show that Internet-based CDN services provide very limited latency improvement in today's cellular networks. We further examine how local DNS servers are assigned to mobile users. In addition, we scrutinize the carriers' policies towards different types of traffic and successfully identify some middlebox behavior of today's cellular carriers, which may negatively impact user experience.

CSE-TR-579-13 Automatic Spreadsheet Data Extraction Shirley Zhe Chen and Michael J. Cafarella March, 2013 6
Spreadsheets contain a huge amount of useful data, but do not observe the relational data model, and thus cannot exploit relational integration tools. Existing systems for extracting relational data from spreadsheets are too labor-intensive to support ad-hoc integration tasks, in which the correct extraction target is only learned during the course of user interaction. This paper introduces Senbazuru, a system that automatically extracts relational data from spreadsheets, thereby enabling relational spreadsheet integration. When compared to standard techniques for spreadsheet data extraction on a set of 100 Web spreadsheets, Senbazuru reduces the amount of human labor by 72% to 92%. In addition to the Senbazuru design, we present the results of a general survey of more than 400,000 spreadsheets we downloaded from the Web, giving a novel view of how users organize their data in spreadsheets.

CSE-TR-581-13 Towards Scalable, Flexible, and Deployable Inter-Domain Routing via Plural Routing Qiang Xu, Feng Qian, Mehrdad Moradi, Darrell Bethea, Z. Morley Mao, and Michael Reiter June, 2013 14
BGP has critical deficiencies in scalability and flexibility. Scalability is increasingly critical as routing tables grow, limiting BGP to single-path-per-destination routing, which constrains routing flexibility due to a lack of alternate paths in forwarding tables. To address these limitations, previous solutions either require a clean-slate redesign of the Internet or incur nontrivial overhead in modifying BGP. In contrast, we propose plural routing, a novel inter-domain routing scheme based on routing table reconstruction, i.e., using Bloom filters to record multiple forwarding entries per destination while keeping routing tables scalable. Intended as a general solution supporting diverse inter-domain routing scenarios, plural routing’s design is confined to routing tables and requires no additional infrastructure support, which lets it retain incremental deployability. To demonstrate these properties, we run simulations on both topologies of the current Internet and projected topologies based on an Internet growth model. In particular, plural routing achieves a routing table size that is linearly proportional to the number of neighbor domains rather than destination domains, with 99% of routing tables of size ≤100KB, while 99% of routes have zero path inflation. In supporting dynamic routing, plural routing’s performance is almost identical to the optimal in policy enforcement and load balancing.
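As a rough illustration of the idea of recording several forwarding entries per destination in Bloom filters, the following Python sketch keeps one filter per neighbor domain, so the table grows with the number of neighbors rather than destinations. It is not the report's design: the class names `BloomFilter` and `PluralTable`, the filter size, the hash construction, and the AS labels are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over strings (sizes are illustrative)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class PluralTable:
    """Hypothetical routing table: one Bloom filter per neighbor domain."""

    def __init__(self):
        self.filters = {}

    def add_route(self, neighbor, destination):
        self.filters.setdefault(neighbor, BloomFilter()).add(destination)

    def next_hops(self, destination):
        # Bloom filters never miss an inserted entry but may report
        # false positives, so this is a superset of the true next hops.
        return sorted(n for n, f in self.filters.items() if destination in f)


table = PluralTable()
table.add_route("AS1", "AS100")
table.add_route("AS2", "AS100")
table.add_route("AS2", "AS200")
print(table.next_hops("AS100"))
```

Because membership tests can yield false positives, a lookup may return a neighbor that was never installed for that destination; how the actual scheme handles this is not described in the abstract.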

CSE-TR-582-13 FLOWR: A Self-Learning System for Classifying Mobile Application Traffic Xu, Andrews, Liao, Miskovic, Mao, Baldi, Nucci June, 2013 14
The numerous mobile apps available on app markets pose a unique challenge to mobile network operators in network management tasks. Unlike apps used by Internet hosts, mobile apps communicate predominantly via HTTP and are thus indistinguishable without untangling the knot of generic HTTP traffic. Discerning app identities in real time using network traffic enables network operators to perform app profiling at the flow level and traffic engineering at the app level, which further benefits entities such as users, developers, advertisers, and enterprise managers. We propose FLOWR, a system that identifies flows belonging to individual apps probabilistically in real time. Benefiting from the rich metadata in HTTP queries, FLOWR tokenizes the key-value pairs from queries into signatures that can best identify apps, enabling flow identification using signature matching against a knowledge base of app signatures. To minimize the need for supervised learning to construct the knowledge base, FLOWR adapts to the large traffic volume in mobile networks. Using the property that flow signatures from the same app should co-occur repeatedly, FLOWR infers the app identities of flow signatures by capturing the co-occurrences between identified flow signatures and undetermined flow signatures, incrementally updating the knowledge base. Using the apps with DoubleClick service as ground truth to evaluate FLOWR, we observe that our system can identify 86–99% of flows, i.e., uniquely identifying 26–30% of flows and narrowing down another 60–65% of flows to ≤5 candidate apps with <1% false positives.
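The tokenize-and-match step described above might be sketched as follows in Python. This is a minimal illustration only, not FLOWR's implementation: the function names `flow_signatures` and `candidate_apps`, the choice of (host, key, value) tuples as signatures, and the example URL and app names are all hypothetical, and the co-occurrence learning step is omitted.

```python
from urllib.parse import parse_qsl, urlsplit


def flow_signatures(url):
    """Tokenize a request URL into a set of (host, key, value) tuples."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    return {(parts.hostname, key, value) for key, value in pairs}


def candidate_apps(url, knowledge_base):
    """Return apps whose known signatures overlap the flow's tokens."""
    tokens = flow_signatures(url)
    return sorted(app for app, sigs in knowledge_base.items() if sigs & tokens)


# Hypothetical knowledge base mapping app names to known signatures.
knowledge_base = {
    "weather_app": {("ads.example.com", "app_id", "com.example.weather")},
    "news_app": {("ads.example.com", "app_id", "com.example.news")},
}

url = "http://ads.example.com/ad?app_id=com.example.weather&sz=320x50"
print(candidate_apps(url, knowledge_base))
```

A flow that matches several apps' signatures would yield a multi-element candidate list, which corresponds to the "narrowing down" case mentioned in the abstract.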

CSE-TR-584-13 Extending Channel Comparison Based Sybil Detection to MIMO Systems Yue Liu, David Bild, and Robert P. Dick November, 2013 3
