0%

Spreadskech

Abstract

Key words: invertible sketch data, superspreader, networkwide detection

Superspreaders (i.e., hosts with numerous distinct connections) remain severe threats to production networks. How to accurately detect superspreaders in real-time at scale remains a non-trivial yet challenging issue. We present SpreadSketch, an invertible sketch data structure for network-wide superspreader detection with the theoretical guarantees on memory space, performance, and accuracy. SpreadSketch tracks candidate superspreaders and embeds estimated fan-outs in binary hash strings inside small and static memory space, such that multiple SpreadSketch instances can be merged to provide a networkwide measurement view for recovering superspreaders and their estimated fan-outs. We present formal theoretical analysis on SpreadSketch in terms of space and time complexities as well as error bounds. Trace-driven evaluation shows that SpreadSketch achieves higher accuracy and performance over state-of-the-art sketches. Furthermore, we prototype SpreadSketch in P4 and show its feasible deployment in commodity hardware switches

Definition

We define the fan-out of a source as the number of distinct destinations to which the source connects over an epoch. We say that a source is a superspreader if its fan-out exceeds a pre-defined threshold

sampling:

  • [ ] [5] J. Cao, Y. Jin, A. Chen, T. Bu, and Z.-L. Zhang. Identifying High Cardinality Internet Hosts. In Proc. of IEEE INFOCOM, 2009.
  • [x] [19] N. Kamiyama, T. Mori, and R. Kawahara. Simple and Adaptive Identification of Superspreaders by Flow Sampling. In Proc. of IEEE INFOCOM, 2007.
    • sampling: 将key映射为binary string,大于某个阈值则被采样;bloom filter检查是不是新的流;host list中更新计数
    • 亮点在于调参和内存规划
  • [ ] [35] S. Venkataraman, D. Song, P. B. Gibbons, and A. Blum. New Streaming Algorithms for Fast Detection of Superspreaders.

  • [ ] [43] Q. Zhao, A. Kumar, and J. Xu. Joint Data Streaming and Sampling Techniques for Detection of Super Sources. In Proc. of ACM SIGCOMM, 2005.

  • [ ] [22] T. Li, S. Chen, W. Luo, M. Zhang, and Y. Qiao. Spreader Classification Based on Optimal Dynamic Bit Sharing. IEEE/ACM Trans. on Networking, 21(3):817–830, 2013
  • [ ] [41] M. Yoon, T. Li, S. Chen, and J.-K. Peir. Fit a Compact Spread Estimator in Small High-Speed Memory. IEEE/ACM Trans. on Networking, 19(5):1253–1264, 2011.

([22]和[41]是不可逆的)

  • [ ] [43] Q. Zhao, A. Kumar, and J. Xu. Joint Data Streaming and Sampling Techniques for Detection of Super Sources. In Proc. of ACM SIGCOMM, 2005.
  • [ ] [22] T. Li, S. Chen, W. Luo, M. Zhang, and Y. Qiao. Spreader Classification Based on Optimal Dynamic Bit Sharing. IEEE/ACM Trans. on Networking, 21(3):817–830, 2013
  • [ ] [41] M. Yoon, T. Li, S. Chen, and J.-K. Peir. Fit a Compact Spread Estimator in Small High-Speed Memory. IEEE/ACM Trans. on Networking, 19(5):1253–1264, 2011.

Method

  1. map each observed connection to a binary hash string, estimate fans-out.