DOI: 10.1145/3437801.3441613

Extracting clean performance models from tainted programs

Published: 17 February 2021

ABSTRACT

Performance models are well-known instruments for understanding the scaling behavior of parallel applications. They express how performance changes as key execution parameters, such as the number of processes or the size of the input problem, vary. Besides being constructed analytically by reasoning about program behavior, such models can also be derived automatically from performance data, an approach known as empirical performance modeling. While this sounds simple at first glance, it faces several serious and interrelated challenges, including expensive performance measurements, inaccuracies introduced by noisy benchmark data, and an overall complex experiment design that begins with the selection of the right parameters. The more parameters one considers, the more experiments are needed and the stronger the impact of noise. In this paper, we show how taint analysis, a technique borrowed from the domain of computer security, can substantially improve the modeling process, lowering its cost, improving model quality, and helping to validate performance models and experimental setups.
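To make the central idea concrete, the following minimal sketch shows how dynamic taint analysis can reveal which parts of a program depend on a key execution parameter. It is an illustration only, not the authors' tooling: the program, the variable names (n, trips), and the toy loop are invented for this example, and the taint interface follows LLVM's DataFlowSanitizer as documented for Clang 9, compiled with clang -fsanitize=dataflow.

/* Illustrative sketch (hypothetical example, not the paper's tool):
 * tag a program parameter with a taint label and check whether a
 * loop bound depends on it.
 * Build: clang -fsanitize=dataflow taint_sketch.c -o taint_sketch
 */
#include <sanitizer/dfsan_interface.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  /* "n" stands in for a key execution parameter, e.g. the problem size. */
  int n = (argc > 1) ? atoi(argv[1]) : 64;

  /* Attach a taint label to n (Clang 9 DataFlowSanitizer interface). */
  dfsan_label n_label = dfsan_create_label("n", NULL);
  dfsan_set_label(n_label, &n, sizeof(n));

  /* A derived quantity: the trip count of some computational loop. */
  int trips = 2 * n + 8;
  double work = 0.0;
  for (int i = 0; i < trips; ++i)
    work += (double)i;

  /* The label on n propagates through the arithmetic that computes trips,
   * revealing that this loop's cost is a function of the parameter n. */
  dfsan_label trips_label = dfsan_get_label(trips);
  assert(dfsan_has_label(trips_label, n_label));
  printf("loop bound tainted by n: %d (work = %f)\n",
         dfsan_has_label(trips_label, n_label), work);
  return 0;
}

Because the label attached to n propagates to the loop bound, a modeling workflow could, in principle, focus its expensive measurements on code regions whose cost actually depends on the parameters being modeled and leave out those that do not.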
