Cryptography’s Developer Dilemma: An Urgent Call for API Research

Table of Links
Abstract and I. Introduction
II. Related Work
III. Methodology
IV. Results and Discussion
V. Threats to Validity
VI. Conclusions, Acknowledgments, and References
Abstract: Prior research has shown that developers find cryptography hard to use. We aim to understand which cryptography issues developers face in practice. We clustered 91 954 cryptography-related questions on Stack Overflow and manually analyzed a significant sample (383 questions) to understand the crypto challenges developers commonly face. We found that either developers lack knowledge of fundamental concepts and tools, e.g., OpenSSL, public-key cryptography, or password hashing, or the poor usability of crypto libraries hinders them from correctly implementing a crypto scenario. This is alarming and indicates the need for dedicated research to improve the design of crypto APIs.
I. INTRODUCTION
Studies have shown that cryptography concepts are hard for developers to understand, and that the complexity of crypto APIs makes them very difficult to use securely [1], [2]. Static analysis tools exist, but developers are reluctant to employ them due to a lack of familiarity, restrictions in organizational policies, or high rates of false positives [3], [4]. Researchers have recently developed new APIs to ease the adoption of cryptography [5], yet online Q&A forums remain among the main information sources developers use to resolve their issues.
Closer inspection of online forums such as Stack Overflow provides a shortcut to identifying the frequent challenges that developers face in this domain. Therefore, in this study, we address the following research question: What types of crypto challenges do developers face in cryptography? We extract the common problems that developers have recently encountered when dealing with various areas of cryptography. The findings can help developers in general, and software team leaders, tutors, and crypto library designers in particular, by raising awareness of common misunderstandings and by highlighting areas with a steep learning curve.
Unlike other studies, we focus only on the crypto-related challenges of developers. To cover the various types of crypto challenges, we need to identify groups of questions that are similar in terms of context. However, manually grouping such a large number of questions (i.e., 91 954) is a demanding task. We therefore used the Latent Dirichlet Allocation (LDA) generative statistical model, and found three main topics in the 91 954 crypto-related posts on Stack Overflow. We then used stratified sampling to randomly select 383 posts from the three topics and studied them to identify the most common problematic issues for developers. The results showed that developers commonly failed to implement a cryptographic scenario for two reasons, namely the complexity of crypto APIs, and their lack of familiarity with fundamental concepts such as digital certificates, public-key cryptography, and hashing algorithms.
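To make the sampling step concrete, the following is a minimal sketch of how a sample of 383 could arise from a population of 91 954 posts and how it could be split across three topics. It rests on assumptions that this section does not state: a 95% confidence level (z = 1.96), a 5% margin of error, Cochran's formula with a finite population correction, and proportional allocation. The topic sizes and the function names are hypothetical and serve only as illustration.

```python
import math


def required_sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Cochran's sample size with finite population correction.

    The defaults (95% confidence, 5% margin, p = 0.5) are assumptions,
    not values taken from the paper.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def allocate_proportionally(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Uses the largest-remainder method so the parts add up exactly.
    """
    total = sum(strata_sizes.values())
    # Whole-number share of each stratum, plus what is left over after flooring.
    shares = {k: (v * sample_size) // total for k, v in strata_sizes.items()}
    remainders = {k: (v * sample_size) % total for k, v in strata_sizes.items()}
    leftover = sample_size - sum(shares.values())
    # Hand the remaining slots to the strata with the largest remainders.
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical topic sizes that sum to 91 954; the real sizes are not given here.
topic_sizes = {"topic_1": 40000, "topic_2": 30000, "topic_3": 21954}

n = required_sample_size(sum(topic_sizes.values()))
print(n)  # 383
print(allocate_proportionally(topic_sizes, n))
# {'topic_1': 167, 'topic_2': 125, 'topic_3': 91}
```

Under these assumptions the formula yields exactly 383, which matches the sample size reported above. Within each topic, the allotted number of posts would then be drawn at random.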
Our findings show that the hurdles developers face in cryptography are not yet resolved and, given the impact of cryptography on security, this domain urgently needs dedicated research effort. We are conducting a survey with developers who have actively helped the Stack Overflow community in this domain to understand potential remedies to this problem.
Authors:
(1) Mohammadreza Hazhirpasand, Oscar Nierstrasz, University of Bern, Bern, Switzerland;
(2) Mohammadhossein Shabani, Azad University, Rasht, Iran;
(3) Mohammad Ghafari, School of Computer Science, University of Auckland, Auckland, New Zealand.