PROMISE is an annual forum, sponsored by ACM SIGSOFT, for researchers and practitioners to present, discuss and exchange ideas, results, expertise and experiences in the construction and/or application of predictive models and data analytics in software engineering. PROMISE encourages researchers to publicly share their data in order to foster interdisciplinary research between the software engineering and data mining communities, and seeks verifiable and repeatable experiments that are useful in practice.

Please see the FSE 2021 website for venue, registration, and visa information.

Keynote 1 by Dr. David Lo, Singapore Management University, Singapore

Infusing AI to Squash Bugs

Abstract: Bugs are prevalent in software systems, and they are typically recorded in a variety of issue tracking systems. Needless to say, these bugs need to be resolved to improve software quality and increase customer satisfaction. Unfortunately, resolving them is not a trivial matter -- many remain unresolved for weeks (or even years!) and involve a complex multi-step process. Can AI help? Of course! AI can be trained on rich historical data to allow it to mimic developers in squashing bugs (and more!). For AI to work well, it often needs to be trained on a sizable amount of data. Fortunately, many projects maintain large historical data in various repositories that are publicly available, collectively forming big data of historically squashed bugs (that I like to refer to as "Big Fix"). Although full automation is not feasible yet (at least in the general sense), AI-infused solutions can support developers in their quest to squash bugs. This talk will provide an overview and reflection of the large body of work that builds automated tools (including predictive models) that leverage the power of AI, trained on rich data in issue repositories (and more!), for various tasks in the bug resolution process. Some open challenges and preliminary solutions will also be presented, with the goal of encouraging more research in this exciting area in the intersection of AI and Software Engineering.

Biography: David Lo is an ACM Distinguished Member (Scientist) and Professor of Computer Science at Singapore Management University, leading the Software Analytics Research (SOAR) group. His research lies at the intersection of software engineering, cybersecurity, and data science, encompassing socio-technical aspects and analysis of different kinds of software artefacts, with the goal of improving software quality, security, and developer productivity. His work has been published in major and premier conferences and journals in the areas of software engineering, AI, and cybersecurity. He has won more than 15 international research and service awards, including 6 ACM SIGSOFT Distinguished Paper awards and the 2021 IEEE TCSE Distinguished Service award. He has served on more than 30 organizing committees and many program committees of research conferences, including serving as general or program co-chair of MSR 2022, ASE 2020, SANER 2019, ICSME 2018, ICPC 2017, and ASE 2016. He also serves on the Executive Committee of ACM SIGSOFT and the editorial boards of a number of journals, including IEEE Transactions on Software Engineering and Empirical Software Engineering. His former PhD students, trainees, and postdocs have secured faculty positions and employment at high-tech industries around the globe. More information about him and his research group is available online.

Keynote 2 by Dr. Audris Mockus, University of Tennessee, United States

Software Prediction at Scale: Models of Open Source Universe

Abstract: All software development relies on open-source software (OSS) for platforms, components, and other artifacts. These critical OSS projects are typically outside developers' control, with their own priorities and practices. OSS grows incredibly fast, making it impossible to determine all the benefits and risks for the projects that rely on or want to exploit it. While there has been a tremendous progress in software prediction models ranging from ways to optimize distributed development to prediction of risky changes, all of them, however, are focused on one or a few projects. The recent developments of software data collection such as World of Code (WoC), where the entirety of OSS data is collected and curated, provide opportunities to model the extensive movement of artifacts, people, and ideas among projects in this giant ecosystem. While some of the project-focused models could be extended to the entirety of OSS, we discuss the entirely new set of models predicting the formation of social and technical networks, spread of technologies, and transfer of expertise. These models provide the most compelling opportunities to address emerging ecosystem-scale software engineering problems. The talk will also briefly introduce practical ways to get started developing ecosystem-scale prediction models.

Biography: Audris Mockus received the BS and MS degrees in applied mathematics from the Moscow Institute of Physics and Technology in 1988, the MS degree in 1991, and the PhD degree in statistics from Carnegie Mellon University in 1994. He is interested in recovering information and creating models of reality from big operational data. His latest interests concern models of the entire open-source software ecosystem based on version control data, and anthropological phenomena hidden in large image collections. He is the Ericsson-Harlan D. Mills Chair Professor in the Department of Electrical Engineering and Computer Science at the University of Tennessee. Previously, he worked at Avaya Research and at AT&T and Lucent Bell Labs.

Program (in UTC time)

[2021-08-19 11:00-11:10] Opening
[2021-08-19 11:10-12:40] Session I: Effort Estimation
[2021-08-19 11:10-11:30] Ibtissam Abnane, Ali Idri, Mohamed Hosni and Alain Abran Heterogeneous Ensemble Imputation for Software Development Effort Estimation
[2021-08-19 11:30-11:50] Leandro L. Minku Multi-Stream Online Transfer Learning for Software Effort Estimation - Is It Necessary?
[2021-08-19 11:50-12:10] Leonardo Villalobos and Christian Quesada-López Comparative Study of Random Search Hyper-Parameter Tuning for Software Effort Estimation
[2021-08-19 12:10-12:40] Common Q&A
[2021-08-19 13:00-14:00] Keynote
Prof. Audris Mockus, University of Tennessee: Software Prediction at Scale: Models of Open Source Universe
[2021-08-20 11:00-12:00] Keynote
Prof. David Lo, Singapore Management University: Infusing AI to Squash Bugs
[2021-08-20 12:00-13:00] Session II: Quality
[2021-08-20 12:00-12:20] Guru Bhandari, Amara Naseer and Leon Moonen CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software
[2021-08-20 12:20-12:40] Khaled Al-Sabbagh, Miroslaw Staron, Regina Hebig and Francisco Gome A Classification of Code Changes and Test Types Dependencies for Improving Machine Learning Based Test Selection
[2021-08-20 12:40-13:00] Common Q&A
[2021-08-20 13:00-13:10] Closing

Topics of Interest

PROMISE papers can explore any of the following topics, among others.

Application-oriented papers:

  • prediction of cost, effort, quality, defects, business value;
  • quantification and prediction of other intermediate or final properties of interest in software development regarding people, process or product aspects;
  • using predictive models and data analytics in different settings, e.g. lean/agile, waterfall, distributed, community-based software development;
  • dealing with changing environments in software engineering tasks;
  • dealing with multiple-objectives in software engineering tasks;
  • using predictive models and software data analytics in policy and decision-making.

Ethically-aligned papers:

  • Can we apply and adjust our AI-for-SE tools (including predictive models) to handle ethical non-functional requirements such as inclusiveness, transparency, oversight and accountability, privacy, security, reliability, safety, diversity and fairness?

Theory-oriented papers:

  • model construction, evaluation, sharing and reusability;
  • interdisciplinary and novel approaches to predictive modelling and data analytics that contribute to the theoretical body of knowledge in software engineering;
  • verifying/refuting/challenging previous theory and results;
  • combinations of predictive models and search-based software engineering;
  • the effectiveness of human experts vs. automated models in predictions.

Data-oriented papers:

  • data quality, sharing, and privacy;
  • curated data sets made available for the community to use;
  • ethical issues related to data collection and sharing;
  • metrics;
  • tools and frameworks to support researchers and practitioners to collect data and construct models to share/repeat experiments and results.

Validity-oriented papers:

  • replication and repeatability of previous work using predictive modelling and data analytics in software engineering;
  • assessment of measurement metrics for reporting the performance of predictive models;
  • evaluation of predictive models with industrial collaborators.


Important Dates

  • Abstracts due: June 7th, 2021 (extended from May 28th)
  • Submissions due: June 10th, 2021 (extended from June 3rd)
  • Author notification: July 4th, 2021 (extended from June 28th)
  • Camera ready: July 8th, 2021
  • Conference Date: August 19th-20th, 2021


Journal Special Section

    Following the conference, the authors of the best papers will be invited to submit extended versions of their papers for consideration in a special section of a journal. The details will be announced later.


Call for papers

Technical papers: (10 pages) PROMISE accepts a wide range of papers in which AI tools, such as predictive models and other AI methods, are applied to SE. Both positive and negative results are welcome, though negative results should still be based on rigorous research and provide details on lessons learned.

Industrial papers: (2-4 pages) Results, challenges, lessons learned from industrial applications of software analytics.

New idea papers: (2-4 pages) Novel insights or ideas that have yet to be fully tested.

Defect prediction challenge-track papers: (2 pages). For details on this challenge track, see the challenge CFP. In summary, we provide just-in-time defect prediction data, and you can submit any defect prediction approach of your choosing to participate in this challenge. The highest scoring approach wins. Note that such challenge-track papers would be a suitable term project for university students.

Publication and Attendance

Accepted papers will be published in the main ACM publication program and will be available electronically via ACM Digital Library.

Each accepted paper needs to have one registration at the full conference rate and be presented in person at the conference.

Green Open Access

Similar to other leading SE conferences, PROMISE supports and encourages Green Open Access, i.e., self-archiving. Authors can archive their papers on their personal home page, in an institutional repository of their employer, or on an e-print server such as arXiv (preferred). Also, given that PROMISE papers rely heavily on software data, we would like to draw the attention of authors who leverage data scraped from GitHub to GitHub's Terms of Service, which require that "publications resulting from that research are open access".

We also strongly encourage authors to submit their tools and data to Zenodo, which adheres to FAIR (findable, accessible, interoperable and re-usable) principles and provides DOI versioning.



PROMISE 2021 submissions must meet the following criteria:
  • be original work, not published or under review elsewhere while being considered;
  • conform to the ACM SIG proceedings template;
  • not exceed 10 (4) pages for full (short) papers including references;
  • be written in English;
  • be prepared for double blind review, except for data papers, where double blind is optional (see instructions below);
  • be submitted via HotCRP (please choose the paper category appropriately).
Submissions will be peer reviewed by at least three experts from the international program committee. Submissions will be evaluated on the basis of their originality, importance of contribution, soundness, evaluation, quality and consistency of presentation, and appropriate comparison to related work.

Double-Blind Review Process

PROMISE 2021 will employ a double-blind review process, except for data papers, where double blind is optional (see below). This means that submissions must by no means disclose the identity of the authors, and the authors must make every effort to honor the double-blind review process. In particular, the authors' names must be omitted from the submission, and references to their own prior work should be in the third person.

If the paper is about a data set or a data collection tool, double-blind review is not obligatory. However, authors may choose to opt in to double-blind review by anonymizing their data repository or data collection tool and omitting their authorship information. If in doubt about whether double-blind review is required for your specific case, please contact the PC chairs.

Why double blind?

Double-blind review has taken off, driven largely by a considerable number of requests from the software engineering community, and we have now decided to respond to this call. We are aware that there are certain challenges with the double-blind review process, as detailed by Shepperd [1]. However, we hope that the benefits, some of which are discussed by Le Goues [2], will outweigh those challenges.


Programme Committee

  • Ayse Tosun, Istanbul Technical University
  • Burak Turhan, University of Oulu
  • Carmine Gravino, University of Salerno
  • Chunyang Chen, Monash University
  • Eunjong Choi, Kyoto Institute of Technology
  • Fabio Palomba, University of Salerno
  • Gema Rodriguez Perez, University of Waterloo
  • Hirohisa Aman, Ehime University
  • Hironori Washizaki, Waseda University
  • Hoa Khanh Dam, University of Wollongong
  • Hongyu Zhang, The University of Newcastle
  • Koji Toda, Fukuoka Institute of Technology
  • Lech Madeyski, Wroclaw University of Science and Technology
  • Martin Shepperd, Brunel University London
  • Neng Zhang, Sun Yat-sen University
  • Osamu Mizuno, Kyoto Institute of Technology
  • Tracy Hall, Lancaster University
  • Vu Nguyen, University of Science, VNU-HCM. KMS Technology, Inc.
  • Weiyi Shang, Concordia University
  • Xiaoyuan Xie, Wuhan University
  • Yasutaka Kamei, Kyushu University
  • Yiming Tang, Concordia University
  • Yuming Zhou, Nanjing University
  • Zhiyuan Wan, Zhejiang University

Steering Committee

General Chair

PC Co-Chairs

Challenge Track Co-chairs

Publicity Chair