The Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3) was held on 28–29 September 2015 in Boulder, Colorado, USA. Previous events in the WSSSPE series are WSSSPE1 [1, 2], held in conjunction with SC13; WSSSPE1.1, a focused workshop organized jointly with the SciPy conference; WSSSPE2 [3, 4], held in conjunction with SC14; and WSSSPE2.1, a focused workshop again organized jointly with SciPy.
Progress in scientific research depends on the quality and accessibility of software at all levels. It is therefore critical to address challenges related to the development, deployment, maintenance, and overall sustainability of reusable software, as well as education around software practices. These challenges can be technological, policy-based, organizational, and educational, and are of interest to developers (the software community), users (science disciplines), software-engineering researchers, and researchers studying the conduct of science (the science of team science, science of organizations, science of science and innovation policy, and social science communities). The WSSSPE1 workshop engaged a broad scientific community to identify challenges and best practices in areas of interest to creating sustainable scientific software. WSSSPE2 invited the community to propose and discuss specific mechanisms to move towards an imagined future for software development and usage in science and engineering. But WSSSPE2 did not have a good way to enact those mechanisms, or to encourage the attendees to follow through on their intentions.
The WSSSPE3 workshop included multiple mechanisms for participation and encouraged team building around solutions. WSSSPE3 strongly encouraged participation of early-career scientists, postdoctoral researchers, graduate students, early-stage researchers, and those from underrepresented groups, with funds provided to the conference organizers by the Moore Foundation, the National Science Foundation (NSF), and the Software Sustainability Institute (SSI) to support the travel of potential participants who would not otherwise be able to attend the workshop. These funds allowed 16 additional people to attend and participate.
WSSSPE3 also included two professional event organizers/facilitators from Knowinnovation who helped the organizing committee members plan the workshop agenda, and during the workshop, they actively engaged participants with various tools, activities, and reminders.
This report is based on collaborative notes taken during the workshop, which were linked from the GitHub issues that represented the potential and actual working groups. Overall, the report discusses the organizational work done before the workshop (§2), the keynote (§3), and the lightning talks presented at the meeting (§4). The report also gives summaries of action plans proposed by the working groups (§5), then gives longer descriptions of the activities that occurred in each of the working groups that made substantial progress (§6), and provides some conclusions (§7). The appendices contain lists of the organizing committee (Appendix A), the registered attendees (Appendix B), and the travel award recipients (Appendix C).
WSSSPE3 was based on the work done in WSSSPE1 and WSSSPE2, but aimed at starting a process to make progress in sustainable software, as the calls for participation said:
The WSSSPE1 workshop engaged the broad scientific community to identify challenges and best practices in areas relevant to sustainable scientific software. WSSSPE2 invited the community to propose and discuss specific mechanisms to move towards an imagined future practice of software development and usage in science and engineering. WSSSPE3 will organize self-directed teams that will collaborate prior to and during the workshop to create vision documents, proposals, papers, and action plans that will help the scientific software community produce software that is more sustainable, including developing sustainable career paths for community members. These teams are intended to lead into working groups that will be active after the workshop, if appropriate, working collaboratively to achieve their goals, and seeking funding to do so if needed.
The first call for participation requested lightning talks, where each author could make a brief statement about work that either had been done or was needed, with the goal of contributing to the discussion of one or more working groups. There were 24 lightning talks submitted; after a peer-review process, 16 were accepted, as discussed further in Section 4.
The first call also discussed the potential action topics that came out of WSSSPE2, and requested additional suggestions. The combination of existing and new topics led to the following 18 potential topics that were advertised in the subsequent calls for participation:
WSSSPE3 began with a keynote speech delivered by Professor Matthew Turk from the Department of Astronomy, University of Illinois, titled Why Sustain Scientific Software? Turk is a prolific scientific software practitioner and has extensive experience working on large collaborative projects employing modern computing tools. He also co-organizes and champions WSSSPE events.
In his keynote address, Turk recapped the course of development of WSSSPE workshops over the past few years, alongside his career development from a postdoc to an academic. The first WSSSPE workshop was at the Supercomputing conference (SC13) in 2013, but he observed that the notion of sustainable scientific software drew in an audience beyond supercomputing. In the following year, WSSSPE1.1 at SciPy had speakers talking about how software has been sustained inside the scientific Python community. WSSSPE2 at SC14 had breakout group discussions coming up with actionable items, and WSSSPE2.1 at SciPy 2015 was similar. Turk noted the different atmosphere of the surrounding large conferences, despite similar WSSSPE participants.
WSSSPE3 left the traditional Supercomputing Conference environment this year, and in Turk’s words, this change spoke to the fact that scientific software comes from many different types of inquiries, deployment, strategies for maintenance, users, and ways of measuring the value of a piece of software. It appeared to Turk that the supercomputing community generally adopts some top-down approaches, whereas the SciPy community more often than not uses more bottom-up systems. According to Turk, there is a divergence in views about progress in software: the supercomputing community thinks that software is getting harder, with exascale computing and optimization issues in mind; but the SciPy community thinks that software is becoming better, with emerging tools such as Jupyter and productivity packages for research workflows. Admitting such comparisons are somewhat unfair generalizations, Turk reminded the audience that the different approaches bring different types of ideas to the table, and he welcomed WSSSPE3 being conducted outside existing preconceptions.
Returning to the topic of his talk, Turk invited the audience to picture scientific software as a flower on a landscape under the Sun, which may represent a number of measurable factors: the number of citations; the growth of a community and the number of contributors; the amount of funding; prestigious prizes awarded; and the stability of the community in terms of leadership transitions, serving community needs, not breaking test suites, and performance on new architectures. But all these metrics are, strictly speaking, proxies for the value and impact that scientific software bears. What we can measure does not give us direct insight; it gives us only proxies of insight.
Turk then moved on to various definitions of sustainability. His favorite was “keeping up with bug reports”: even if no new features are added, the software remains sustainable. Other definitions Turk mentioned were “adding new features” and “maintaining the software for a long period of time,” as in the cases of TeX and LaTeX with community help. A notion Turk heard often at supercomputing conferences was that sustainable software “continues to work on new architectures.” Yet another metric was “people continuing to be able to learn how to use and apply the software.” A funder Turk had heard spoke of sustainability as “continuing to get funded.” Turk also recalled that Greg Wilson, among others, said at WSSSPE1.1 that his view of sustainable software was software that “continued to give the same results over time.” A last measure of sustainability Turk presented was “the ability to transition between different people developing and using a piece of software.”
At WSSSPE1, several models were presented for ensuring sustainability. Turk considered the most familiar to be funded software, in which an external agency provides funds to a group, not necessarily working exclusively on the software, who keep it going and provide it to the scientific community. A second model is productized software, in which a piece of software has grown to the point that research groups or individuals are willing to support it with some amount of funding, for instance by subscribing to cloud services that deploy the software or by purchasing it outright. A final model, which Turk felt conflicted about, is the volunteer model of traditional old-school development, as distinct from modern-day open source.
Turk discussed whether productizing scientific software was synonymous with being sustainable and self-sufficient. He thought it was not necessarily the case and furthermore, it could lead to a divergence of interests between users and developers.
Turk reminded the audience that the volunteer model means unpaid labor. On this note, he recommended Ashe Dryden’s blog post on the ethics of unpaid labor and the open source software community. Oftentimes, a person funded to work full time on a scientific project can spend only a small amount of time working on a piece of software necessary for that project. Moreover, researchers’ abilities to participate in that volunteer community are not all the same and may not always be aligned with their research projects. From Turk’s experience, we cannot always rely on unpaid labor and volunteer time to sustain a piece of software; this comes down to the notions of the top-down and bottom-up approaches, i.e., the funded versus the grassroots. However, Turk pointed out that bottom-up, volunteer-driven projects can be just as large-scale as top-down software development projects.
Turk said that sustaining scientific software really meant, to him, sustaining the scientific inquiries that are often conducted through specific software, as well as sustaining the people we care about, our careers, and the future of our fields. According to Turk, we all have an invested stake in sustaining scientific software. At the same time, having “sustained” projects can suffocate new projects, so we need to make sure we do not cause novel ideas and packages to suffer at the hands of the status quo.
Turk talked about possible reasons why we want to sustain scientific software: devotion to science and interest in pursuing the next stage of research; the fun and creative thrill of writing code and papers; and usefulness with measurable impacts (for example, LINPACK, and the HDF Group providing data storage for satellites), which goes beyond usefulness to necessity. Lastly, Turk presented his wishlist of questions to be answered in the future:
After the keynote, WSSSPE3 continued with lightning talks. These short talks were intended to give an opportunity for attendees to quickly highlight an important issue or a potential solution.
After the keynote and lightning talks, the workshop facilitators led an exercise to create working groups for the rest of the meeting. The attendees first suggested additional topics beyond those in the call for participation (as listed in §2). One topic introduced at this point was “Building Sustainable User Communities for Scientific Software.” The full set of topics was then placed on flipchart-sized pieces of paper around the walls, and the attendees voted on which topics they were strongly interested in working on at WSSSPE3, and which topics they were generally interested in contributing to, but less strongly. This led to some topics being taken out of the mix for the rest of the workshop, since not enough people wanted to contribute to them to make a useful discussion possible. Additionally, some topics were combined by participants who felt they were closely linked and that a group could address several of them in a single discussion.
After this, the attendees broke up into small working groups to discuss the remaining topics during most of the remaining 1 1/2 days. A high-level summary of each topic and group’s work can be found in the subsections of this section, and for all but one group, more detailed notes on each group’s discussions can be found in §6.
Midway through the first afternoon and between the two days, each group had a chance to talk about the progress they had made. As discussed below (in §5.4), the group that formed to discuss Legacy Software dissolved after the first session as its members left to join other groups. In addition, on the morning of the second day, a small set of reviewers/advisors (external to the groups but chosen by the group members) visited each group to listen to what the group was planning and to provide feedback.
Reviewing past articles and talks at meetings such as WSSSPEx [2, 4, 23, 24, 25] that analyze and promote sustainable scientific software makes it clear that several common, recurring ideas underpin success in developing sustainable software. However, outside of a small community, this knowledge is not widely shared. This is especially true for the large community of scientists who generate most of the software used in science but are not primarily software developers. In this context, a clear and precise exposition of these best practices, collected from many sources through open collaboration across the community into a single source (e.g., a journal paper or tutorial) that can be widely disseminated, is necessary and likely to be very valuable.
The creation of such a “best practices” document will build upon the range of activities and topics discussed at WSSSPE3 and associated prior meetings. This working group will attempt to distill the emerging body of knowledge into this document. The large number of articles from the NSF-funded SI2 projects (SSE and SSI), “lightning talks,” “white papers,” and reports from different workshops have created a large if somewhat diffuse source for this report.
Core questions that will need to be explored concern reproducibility, reliability, usability, extensibility, knowledge management, and continuity (transitions between people). Answers to these questions will help the group learn how a software tool becomes part of the core workflow of well-identified users (stakeholders), which relates directly to a tool’s success and hence its sustainability. Ideas that may need to be explored include:
Sustainability requires community participation in code development and/or wide adoption of software. The larger the community using a piece of software, the better the funding possibilities and thus also the sustainability options. Additionally, developer commitment to an application is essential, and experience shows that software packages with an evangelist providing strong inspiration and discipline are more likely to achieve sustainability. While a single person can push sustainability to a certain level, open source software also needs sustained commitment from the developer community. Such commitment encompasses diverse tasks and roles, which can be fulfilled by developers with different levels of knowledge. Besides developing software and appropriate software management, with measures for extensibility and scalability, active expert support for users via a user forum with quick turnaround is crucial. The barrier to entry, for users as well as developers, has to be as low as possible.
For additional information about the discussion, see Section 6.1.
The creation of a document on best practices needs a large and diverse community to be involved. The group has enlisted over ten contributors from the attendees at WSSSPE3 and those on the mailing list. The primary mechanism for developing this document will be to examine and analyze the success of several well-known community scientific software packages and organizations supporting scientific software. The group will then attempt to abstract general principles and best practices. Among the tools identified for such analysis are the general-purpose PETSc toolkit for the solution of linear systems, NWChem for computational chemistry, and the CIG (Computational Infrastructure for Geodynamics) organization, dedicated to supporting an ensemble of related tools for the geodynamics community. The group also established a timeline and a rough outline (see Section 6.1) for the report.
The landing page, with instructions, the timeline, and the white paper, is here: https://drive.google.com/drive/folders/0B7KZv1TRi06fbnFkZjQ0ZEJKckk. Discussions can also be continued at https://github.com/WSSSPE/meetings/issues/42.
Research Software Engineers (RSEs)—those who contribute to science and scholarship through software development—are an important part of the team needed to deliver 21st century research. However, existing academic structures and systems of funding do not effectively fund and sustain these skills. The resulting high levels of turnover and inappropriate incentives are significant contributing factors to low levels of reliability and readability observed in scientific software. Moreover, the absence of skilled and experienced developers retards progress in key projects, and at times causes important projects to fail completely.
Effective development of software for advanced research requires that researchers work closely with scientific software developers who understand the research domain sufficiently to build meaningful software at a reasonable pace. This requires a collaborative approach—where developers who are fully engaged or invested in the research context are co-developing software with domain academics.
The solution this group envisions entails creating an environment where software developers are a stable part of a research team. Such an environment mitigates the risk of losing a key developer at a critical moment in a project’s lifetime, and provides the benefits of building a store of institutional knowledge about specific projects as well as about software development for today’s research. The group’s vision is to find a way to promote a university/research institute environment where software developers are stable components of research project teams.
One strategy to promote stability is implementing a mechanism for developers to obtain academic credit for software development work (see §5.8). With such a mechanism in place, traditional academic funding models and career tracks could properly sustain individuals for whom software development is their primary contribution to research. A contributing factor to the problem with the current academic reward system is the devastating effect that time spent in industry has on an academic publication record; such postings often develop exactly the skills that research software engineers need, yet returns to university positions following an industry role are penalized by the current structures. Retention of senior developers is hard, because these people are in high demand in the wider economy. However, people who have a PhD in science and enter industry may desire to return for diverse reasons, and should be welcomed back.
While developing new mechanisms in the current academic reward system is a worthy aspirational goal, such a dramatic change in this structure does not seem likely in a time scale relevant to this working group. Accordingly, the working group sought alternative solutions that may be achievable within the context of existing academic structures. The group felt that developing dedicated research software engineering roles within the university and finding stable funding for those individuals is the most promising mechanism for creating a stable software development staff.
Measures of impact and success for research programming groups, as well as for individual research software engineers, will be required in order to make the case to the university for continued funding. Research software engineers will hopefully not be measured by publications, but by other metrics. Middle-author publications are common for RSEs. Most RSEs welcome co-authorship on papers when the PIs think that the contribution deserves it.
It is hard for an individual PI in a university or college to support dedicated research software engineering resources, as the need and funding for these activities are intermittent within a research cycle. To sustain this capacity, therefore, it is necessary to aggregate this work across multiple research groups.
One solution is to fund dedicated software engineering roles for major research software projects at national laboratories or other non-educational institutions. This solution is in place and working well for many well-used scientific codebases. However, this strategy has limited application, as much of the body of software is created and maintained in research universities. Therefore, the group argues that research institutions should develop hybrid academic-technical tracks for this capacity, where employees in this track work with more than one PI, rather than the traditional RA role within a single group. This could be coordinated centrally, as a core facility, perhaps within research computing organizations which have traditionally supported university cyberinfrastructure, library organizations, or research offices. Alternatively, these groups could be organizationally closer to research groups, sitting within academic departments. The most effective model will vary from institution to institution, but the mandate and ways of working should be similar.
Having convinced themselves that this would be a positive innovation, the group members were then faced with the specific question of how to fund the initiation of this activity. A self-sustaining research software group will support itself through collaborations with PIs in the normal grant process, with PIs choosing to fund some amount of research software engineering effort through grants in the usual way. However, to bootstrap such a function to a level where it has sufficient reputation and client base to be self-sustaining will generally require seed investment.
This might come from universities themselves (this was the model that led to the creation of the group in University College London), but more likely, seed funding needs to come from research councils or other funding bodies (as with the Research Software Engineering Fellowship provided by the UK Engineering and Physical Sciences Research Council). The group therefore recommends that funding organizations consider how they might provide such seed funding.
Success, appropriately measured, will help make the case to such funding bodies for further investment. One might expect that metrics such as improved productivity, software adoption rates, and grant success rates would be sufficient arguments in favor of such a model. However, useful measurement of code cleanliness, and the resulting productivity gains, is an unsolved problem in empirical software engineering. To measure “what did not go wrong” because of an intervention is particularly hard.
The working group finally noted that the institutional case for such groups is made easier by having successful examples to point to. In the UK, a collective effort to identify the research software engineering community, with individuals clearly stating “I am a research software engineer,” has been important to the campaign. It will be useful to the global effort to similarly identify emerging research software organizations, and also, importantly, longer-running research software groups, which in some cases have had a long sui generis existence but can now be identified as part of a wider solution. There remains the problem of how to “sell” the value of this investment to investigators within a university; this is an issue best addressed by the individual organizations that embark on the plan.
For more details on the discussion, see Section 6.2.
The first step in moving this strategy forward is to gather a list of groups that self-identify as research software engineering groups, and to reach out to other organizations to see if there may be a widespread community of RSEs who do not yet identify themselves as such. This working group will collect information about the organizational models under which these groups function and how they are funded. For example: How many research universities currently fund people in an RSE track, whether or not they bear the RSE moniker? Are these developers paid by the university or through a program supported by research grants/individual PIs? How did they bootstrap the developer track? How successful is the university in getting investigators to pay for fractional RSEs? The group will author a report describing their findings, should funding be available to conduct the investigation.
Most scientific software is produced as a part of grant-funded research projects typically sponsored by federal governments. If we are interested in the sustainability of scientific software, then we need to understand what exactly happens when that sponsorship ends. More than likely, the project and its resulting software will need to undergo some kind of transition in funding and consequently governance.
At WSSSPE3, this working group was interested in better understanding successful pathways for scientific software to “transition” from grant-funded research projects to industry sponsorship. (The phrase may seem awkward at first: some software projects begin their life sponsored by industry, or result from collaboration between industry and academia. In such cases, there is still a need to understand how IP is handled and how maintenance of the software is sustained over time.)
Most previous research and discussion of industry and academic collaboration, sharing, and funding of research software has focused on the impact of such arrangements. Examples of these types of reports are:
Although sustainability transitions are often studied under the broad umbrella of “technology transfer,” the group believes there are likely to be a number of different ways in which a pathway from initial production to long-term maintenance and secure funding is achieved. In short, industry sponsorship and/or direct participation is an important aspect of sustaining scientific software, but our current understanding of these transitions focuses narrowly on commercial successes or failures of those collaborations.
In looking at existing literature that addresses industry transitions, many reports (such as those listed above) focus on benefits that accrue to the private sector, or to a government that originally sponsored the research project. This literature does not address the impact that these transitions have on the accessibility or usability of the software, or the impact that these transitions have on the career of the researchers involved.
For more detail on the group’s discussion, see Section 6.3.
Plans for carrying forward are currently unclear—this project would require sustained attention and effort from the group members, and at least some amount of funding in order for those members to be involved for extended periods of time.
The broad goals that the group would like to accomplish are:
The main plan for the group going forward is the creation of a white paper on the topic of sustainability transitions.
This group met only briefly, for one period on the first day. Members discussed that it is difficult to define legacy code because of the stigma associated with the term. At some point, the difficulty and resources spent keeping legacy software supported will exceed the cost of simply rebuilding the software or retiring it. Most of the group members were not able to attend on the second day, and those who could joined other groups.
Principles of software engineering form the basis of methods, techniques, methodologies, and tools. However, there is often a mismatch between software engineering theory and practice, particularly in the fields of computational science and engineering, which can lead to the development of unsustainable software [27, 28]. Understanding and applying software engineering principles is essential in order to create and maintain sustainable software.
This group’s discussion focused on identifying existing principles of software engineering design that could be adopted by the computational science and engineering communities.
Software engineering principles form the foundation of methods, techniques, methodologies, and tools. Consisting of members from different backgrounds, including quantum chemistry, epidemiology, computer science, software engineering, and microscopy, this group discussed the principles of software engineering design for sustainable software (starting with principles from the Karlskrona Manifesto on Sustainability Design, Tate, and the Software Engineering Body of Knowledge (SWEBOK)) and their application in various domains, including quantum chemistry and epidemiology. The group examined the principles and performed a retrospective analysis of what developers did in practice against what difference the principles could have made, asking: what do the principles mean for computational science and engineering software, and how do the principles relate to non-functional requirements? It appeared that sustainable software engineering principles should be mapped to two core quality attributes that underpin technically sustainable software: extensibility, the software’s ability to be extended and the effort required to implement an extension; and maintainability, the effort required to locate and fix an error in operational software.
For more information about the discussion, see Section 6.4.
The next steps in this endeavor are to (1) systematically analyze a number of example systems from different scientific domains with regard to the identified principles, (2) identify the commonalities and gaps in applying those principles to different scientific systems, and (3) propose a set of guidelines on the principles and examine how they apply to example scientific software systems. Preliminary work will be carried out through undergraduate or postgraduate student projects.
Metrics for scientific software are important for many purposes, including tenure and promotion, scientific impact, discovery, reducing duplication, serving as a basis for potential industrial interest in adopting software, prioritizing development and support towards strategic objectives, and making a case for new or continued funding. However, there is no commonly used standard for collecting or presenting metrics, nor is it known whether there is a common set of metrics for scientific software. It is imperative that scientific software stakeholders understand the value of collecting metrics.
The group discussion focused on identifying existing frameworks and activities for scientific software metrics. The group identified the following related activities:
The group discussion began by agreeing on a common purpose: creating a set of guidelines giving examples of specific metrics for the success of scientific software in use, why they were chosen, what they are useful for measuring, and any challenges and pitfalls, and then publishing this as a white paper. The group discussed many questions related to useful metrics for scientific software, including: Is there a common set of metrics that can be filtered in some way? Can metrics be fit into a common template? Which metrics would be most useful for each stakeholder? Which metrics are most helpful, and how would we assess this? How are metrics monitored? A more complete list of these questions can be found in Section 6.5. Next, a roadmap for how to proceed was discussed, including creating a set of milestones and tasks. The idea was put forth for the group to interact with the organizing committee of the 2016 NSF Software Infrastructure for Sustained Innovation (SI2) PI workshop in order to send a software metrics survey to all SI2 and related awardees as a targeted and relevant set of stakeholders. The five solicitations for software elements released under the NSF SI2 program all included metrics as a required component, with submitters requested to include “a list of tangible metrics, with end user involvement, to be used to measure the success of the software element developed, … ”. These metrics are then reported as part of annual reports to NSF by the projects. Although neither the proposal text describing the metrics nor the reported metric results are publicly available, there is reason to believe that the community will be willing to provide this information through a survey mechanism. This survey would be created by one of the student group members.
Similarly, it was suggested that a software metrics survey be sent to the UK SFTF (Software For The Future, led by the Engineering and Physical Sciences Research Council) and TRDF (Tools and Resources Development Fund, led by the Biotechnology and Biological Sciences Research Council) software projects to ask what metrics would be useful to report. The remainder of the discussion focused mainly on the creation of a white paper on this topic. This resulted in a paper outline and writing assignments, with the goal of publishing in venues including WSSSPE4, IEEE CiSE (Institute of Electrical and Electronics Engineers Computing in Science and Engineering magazine), or JORS (Journal of Open Research Software). More information about the group discussion is available in Section 6.5.
The main plan for the group going forward is the creation of a white paper on the topic of useful metrics for scientific software. The authoring of this white paper would happen in parallel with the creation of a survey by the group, with the survey results to be incorporated into the white paper. The timeline for completion of the white paper is approximately one year, targeting the venues discussed in the previous section.
In lieu of a landing page, the Useful Metrics for Scientific Software working group requests that those interested email Gabrielle Allen34 to find out more about the group’s efforts and how to participate.
This group explored a rapidly growing array of training that is seen to contribute to sustainable software. The offerings are diverse, with varying degrees of direct relevance to sustainable software. While research institutions support professional development for research staff, the skills taught that might affect sustainable software are limited at best, and often lack a clear and coherent development pathway. Bringing together those involved in leading relevant initiatives on a regular basis could help coordinate this growing array of training opportunities.
Three existing venues for discussion of related events were identified:
Some next steps were identified to quickly test whether there is interest in establishing a community committed to increasing the degree of coordination across training projects. See Section 6.6 for more details about the discussion.
The main plan for the group is to convene a discussion to explore bringing together regular meetings of those involved in leading relevant training projects.
The Training working group requests that those interested email Nick Jones35 to find out more about the group’s efforts and how to participate.
Modern scientific and engineering research often relies considerably on software, but currently no standard mechanism exists for citing software or receiving credit for developing software, akin to receiving credit via citations for writing papers. Ensuring that developers of scientific software receive credit for their efforts will encourage further creation and maintenance. Standardizing software citations offers one route to establishing such a citation and credit mechanism. Software is currently eligible for DOI assignment, but DOI metadata fields are not well tuned for software compared to publications. Some software providers obtain DOIs, but the practice is still not widely adopted. Also, there is no mechanism for software to cite its dependencies in the way papers cite supporting prior work.
Publishing Software Working Group (§5.9): publishing a software paper offers one existing mechanism for receiving credit, and further developing new publishing concepts for software will strengthen our activities.
A number of groups external to WSSSPE (although with some overlapping members) are also focused on aspects of software credit, including the FORCE11 Software Citation Working Group (see plans for coordination below). In addition, a Software Credit workshop36 convened in London on October 19, following the conclusion of WSSSPE3. See Section 6.7 for more detailed discussion of related activities.
The group discussed a number of topics related to software credit, including a contributorship taxonomy, software citation metadata, standards for citing software in publications, and increasing the value of software in academic promotion and tenure reviews. Although initial discussions both prior to and during WSSSPE3 focused on contribution taxonomy and dividing credit, discussing as an example the Entertainment Identifier Registry used in the entertainment industry, the group decided to prioritize software citation. This decision was motivated by the idea that standardizing citations for software would introduce some initial credit for developers, and later the quantification of credit could be refined based on concepts such as transitive credit [37, 38].
The majority of the remaining discussion focused on standardizing (1) the metadata necessary for software to be cited and (2) the mechanism for citing software in publications. Discussions also centered on the indexing of software citations necessary for establishing a software citation network, either integrated with the existing paper citation ecosystem or complementary to it. See Section 6.7 for a more detailed summary of the working group’s discussion of these topics.
The group has already merged with the FORCE11 Software Citation Working Group (SCWG), and their efforts will focus (over the next six to nine months) on developing a document describing principles for software citation. Following the publication of that document, the group will focus on outreach to key groups (e.g., journals, publishers, indexers, professional societies). Longer-term plans include working with indexers to ensure that software citations are indexed and pursuing an open/community indexer; these activities may be organized by future FORCE11 working groups.
This working group explored the value of executable papers (papers whose content includes the code needed to produce their own results), and other forms of publishing which include dynamic electronic content. Transitioning to this type of publication offers possibilities of addressing, or partially addressing, sustainability concerns such as reproducibility, software credit, and best practices.
The group felt that the best way to encourage the use of these new publishing concepts would be to create and curate a list of publishing venues that support them. The Software Sustainability Institute agreed to host this list.
See Section 6.8 for more details about the discussion.
The plan is to create and curate a web page describing executable papers, their value, and a list of what publishers support them. The group expects the page to be available in early January of 2016 on the Software Sustainability Institute’s website.
The aforementioned page will be published on the Software Sustainability Institute website: http://www.software.ac.uk.
User communities are the lifeblood of sustainable scientific software. The user community includes the developers, both internal and external, of the software; direct users of the software; other software projects that depend on the software; and any other groups that create or consume data that is specific to the software. Together these groups provide both the reason for sustaining the software and the requirements that drive its continued evolution and improvement.
There are a number of activities already in progress that are targeted at improving the user community for open-source software, including Mozilla Science’s “Working Open Project Guide” and the “UK Collaborative Computational Projects” (CCP)39, or books such as “The Art of Community” by Jono Bacon.
Discussion revolved around a few questions: what are the benefits of having a “community” for software sustainability; what practices and circumstances may lead to having and maintaining a community; how can funding help or hinder this process; and perhaps most importantly, how can best practices be described and distilled into a document that can help new projects.
All the group members agreed on a few points: software must not only offer value, but there must also be some support for users, and funding can help pay for that support, in addition to the usual funding for software development. Openness is generally a virtue. An evangelist, whether a single person or a domain-specific group of users, is often the key factor.
Additional details on the group’s discussion can be found in Section 6.9.
The most important next step is a “Best Practice” document, which would describe what successful projects with engaged communities look like, how to replicate this type of project, and how to handle the end of life of a community project. Another next step would be better training to increase recognition of the need for science software projects to focus on building and supporting their user communities.
This section captures detailed reports from each working group that made significant progress. Each subsection records the discussion of a group, as written by that group at that time (in the first person and in the present/future tense). Thus, the subsections are records of what the groups did and planned, as of the end of the WSSSPE3 workshop.
Sandra Gesing42 will serve as the point of contact for this working group, and be responsible for ensuring timely progress of the planned actions.
Core questions that will need to be explored concern reliability, reproducibility, usability, extensibility, knowledge management, and continuity (transitions between people). Answers to these will guide our understanding of how a software tool becomes part of the core workflow of well-identified users (stakeholders), relating tool success to sustainability. Ideas that may need to be explored include:
Sustainability requires community participation in code development and/or wide adoption of the software. The larger the community using a piece of software, the better the funding possibilities, and thus also the sustainability options. Additionally, the developers’ commitment to an application is essential, and experience shows that software packages with an evangelist providing strong inspiration and discipline are more likely to achieve sustainability. While a single person can push sustainability to a certain level, open source software also needs sustained commitment from the developer community. Such sustained commitment includes diverse tasks and roles, which can be filled by developers with different knowledge levels. Besides developing software and appropriate software management, with measures for the extensibility and scalability of the software, active expert support for users via a user forum with quick turnaround is crucial. The barrier to entry for the community, as users as well as developers, has to be as low as possible.
There is an opportunity to collaborate on a white paper, which will be revisited regularly for further improvement, to enhance knowledge of the state of best practices, resulting in a peer-reviewed paper. We would like to reach a wide community by doing this. These goals are also the challenges and obstacles: getting everyone to contribute to the paper and reaching the community.
White Paper Outline
The key next steps are to write an introduction, reach out to the co-authors, and to agree on the scope of the white paper.
Sandra Gesing and Abani Patra are the main editors and will organize the overall communication and the paper. Sections will be assigned to diverse co-authors.
At the moment we do not see any further requirements.
We might need funding for a journal publication (open-access options).
James Hetherington43 will serve as the point of contact for this working group, and be responsible for ensuring timely progress of the planned actions.
The group at WSSSPE:
This was further enhanced by additional discussions at the following GCE15 conference:
In addition to the points noted in the main discussion (§5.2), we also discussed the following:
“Are you an RSE or a RA?” is not properly a binary question. Most of us sit at different points on that spectrum, and move along it during our careers (usually from RA to RSE—examples of movement in the other direction from readers would be welcomed). Either way, the label “Research Software Engineer” is now starting to have some power. Many scientists do not want to be writing code; some do, to varying degrees. These groups can usefully support each other.
What is the power of the label? How can we get the word out about RSE support using the label?
Will research science developers be required in the long run? One issue that came up was whether the need for developers is time-bounded: will the new generation of computer- and software-savvy scientists be so comfortable developing their own code that professional developers will not be needed? And this brings up the flip-side question, “Do scientists really want to be writing code?”
We also had a brief discussion about how to make a career path for research developers. It need not be solely an academic enterprise, but in the past tenure has often been problematic for people in this role.
Skills and resources may vary between teams. To help resolve this, maintaining high levels of communication between groups will be valuable. In the United Kingdom (UK), there are plans to permit resource sharing between institutional RSE groups. Perhaps there are circumstances under which an RSE skill exchange could be arranged, either formally or informally.
Collaborative funding can be crucial to RSE groups, to ensure that research leadership remains with the domain scientists. As an example, at NCAR, university partnerships are required for submission of proposals, so collaboration is an essential part of grant submission, and this will tend to bring developers and scientists together. The UCL group also follows this approach, with all bids requiring an academic collaborator.
Domain scientists and developers are funded together in a single proposal. Another example of success is the development of semantics and linked data in support of ocean sciences. An EarthCube-funded project pairs domain scientists with RSEs and has been successful; the semantics attached have increased data use and discovery significantly.
An alternative approach has been the provision of programming expertise as part of national compute services. The US XSEDE project’s Extended Collaborative Support Services (ECSS) is a set of developers who are paid with XSEDE funding and are on “permanent” staff. When PIs request allocations on XSEDE resources, there is a finite pool of developer time that can be awarded, typically for one year only and at partial effort, typically 20 percent or so. The finite time allowed motivates the scientist and the scientist’s group to work closely with the developer and become educated in what the developer is doing, so they can sustain the effort once the ECSS period is over. This funding mechanism can be highly efficient for scientific problems, because the developer pool assembled by the resource providers is, by definition, expert in the characteristics of their specific resources, and can very quickly assess the scientist’s needs and what it will take to implement software that meets them. However, it does not develop capacity within institutions, and since XSEDE is a time-bounded program, it should not be relied upon as a long-term solution to acquiring this type of capacity.
The UK allows this kind of collaboration to support the creation of scientific software for its large supercomputing resource (ARCHER). However, while the support can come from the staff of the Edinburgh Parallel Computing Centre, which hosts the computer, this “embedded CSE” resource also funds programming from local groups. This has been very helpful in providing funding to establish local groups. These groups work best when they develop good collaborations with national cyberinfrastructure pools. When an organization assembles a developer pool, diversity is developed and skills can be transferred.
We would like to see these models applied outside high performance computing. Most scientific software is not destined to run on national cyberinfrastructure, but needs similar support. The argument about making better use of expensive hardware through software improvements has been useful politically (and many RSE groups are sited in organizations that host clusters for this reason), but the time has come to make the case that software itself is a critical cyberinfrastructure and, with a much longer shelf-life than hardware, is itself a capital investment.
The CANARIE group (Canada) accepts proposals for providing services to broad communities, integrating people who are doing complementary things. The goal is to make the available stack more robust and richer for everyone. They offer short cycles of funding for creating some useful functionality, with diversity of input drawn from across disciplines as a key metric. If this metric is met successfully, then more funding may follow. This can apply within or across institutions.
There can be problems communicating across cultural barriers, with domain scientists seeing developers as “other”. Both collaboration and tools to fund, encourage, or motivate collaboration are extremely important.
We think support from non-governmental organizations will be important if RSE groups are to become established. The Sloan Foundation is currently funding data science engineers, who work in the context of other software developers at the University of Washington. These scientists work in the e-Science Studio/Data Science Studio, where they help a group of graduate students solve their problems in data science and data management. During fall and spring, a 10-week incubator program allows students to work two days a week on a data-intensive science project. Some fraction of the developer time is dedicated to the developers’ personal interests as well as instruction.
The goal for Sloan is to obtain success stories and demonstrable value of the presence of data scientists on university staff. These stories are the basis for arguments to the host organization. This is an effort to create awareness of the value of research scientist developers. Embedding with scientists, and adding spare capacity is critical to making the innovation possible. This model is essentially to argue for permanent budget lines to support data scientists as part of university staff hires, just as with core facilities. This could become a fee-for-service model requested by grant funding, just as DNA sequencing is for core facilities, if it becomes apparent that this gives competitive advantage to a university’s research effort.
One model that has been helpful in finding funding for RSE groups is the use of funds left over on research grants when RAs have left prematurely. PIs like this arrangement because it is hard to find good staff for short-term positions, so having a pool of research programming staff on hand resolves the problem. We recommend that funders give explicit guidance to grant holders and institutions that such an arrangement is favorable. Framework agreements permitting this to go ahead without checking back every time with funders and/or grant panels would further smooth the process. (This also provides more stable jobs for those who hold these skills, but arguments about making life nicer for postdocs will not help persuade funders or PIs!)
There is some question about the most effective duration and percentage of full time for a programmer’s work on a project. At least three months is necessary for the programmer to read into the science (RSEs must not become so disengaged from research that they do not have time to read a few papers; this results in code that does not meet scientific needs), but too long an engagement could cost an RSE their flexibility: becoming so engaged in one project that when it ends, they find it hard to transfer. For this reason, we recommend roughly 40% effort per project as ideal: two projects per developer, with some time left for training and infrastructure work. Having two developers per project also seems ideal, in the sense that software development is enhanced by two pairs of eyes.
There is, as yet, no clear answer as to the scale of aggregation needed to make such a program work. A university wide program allows enough scale to be robust to fluctuations of funding within one field. But a specialization focus on developers to support, for example, physical or biological sciences may be preferable, if the customer base is large enough. The desire to aggregate enough work to make it sustainable, and the need to have domain-relevant research programming skills, are in tension.
In the UK, another source of funding for research software is the Collaborative Computational Projects (CCPs): domain specific communities put forward proposals that are a priority of the community as a whole, for example, biosimulation or plasma physics. These bodies act as custodians of community codes, and a central team also provides software engineering support.
However this area develops, the need for funding for software as a cyberinfrastructure component is clear. Funding that permits code to be refactored, tidied, and optimized is rare; this is often done “on the sly” within a scientifically focused grant. The UK EPSRC’s “Software for the Future” call, which does permit explicit investment in software as an infrastructure, is so oversubscribed as to have a 4% success rate; the demand is clear!
One opportunity is the idea of co-design, where infrastructural libraries are developed alongside the scientific codes that will call them. However, collaboration is hard to foster here, as incentive structures are still focused on short-term papers. This can lead infrastructure developers to focus more on publications in their own areas of mathematics and computer science, and domain developers on the shorter-term needs of their own fields. Genuine collaborative co-construction is harder to foster.
It can be more difficult to help leading domain scientists see the value of engineering effort than to help those in their teams who are forced to work with difficult-to-use or unreliable software tools, because the leaders do not see the pain. Perhaps a version of “software carpentry” targeted at PIs who apply for or are awarded software-intensive grants could be valuable here.
RSEs provide a useful contribution to their universities’ teaching missions, as well as research, as they are well placed to deliver the research programming training that many scientists now need. In the longer term, with programming skills taught to all through their careers, we hope specialist scientific developers will be less needed.
We will seek to identify and approach existing research programming organizations, to ask their permission to include them in a list of research software groups. Casual conversation during the meeting made it clear that although the title is not widely used in the US, the position is not rare. We spoke with several individuals at distinct universities who had RSEs (in effect if not in name) funded under differing models.
We will also look for examples of groups which have successfully become self-sustaining following initial seed funding.
In this respect, information gathering via a survey and subsequent analysis could be very useful. We would need to assemble a list of targeted individuals. (What positions and ranks are likely to know and care enough to respond?) Perhaps the Science Gateway Institute has already acquired information that could be helpful to advance this issue, and/or craft a proper survey and suggest target individuals.
The UK RSE community will provide initial facilities to host this list, and continue to work to spread the initiative, but local leadership in the US is needed if this campaign is to succeed. This will require an initial gathering of identified research software organizations in the US to this end.
Financial support for an initial conference that brings together research software groups to form an organization and create a resource-sharing structure would help further this campaign. Funding to conduct and analyze a survey could also be quite useful: knowing where we stand today and what models are in use could fuel ideas for the further development of developers in this category.
In the longer term, funding organizations, especially non-governmental organizations with the capability to effect innovation through seed funding, could provide support to nucleate the creation of research software groups. As noted above, Sloan has already initiated one such program, and collaboration with Sloan or at least study of their methods and success or failure could be extremely useful in approaching universities and other institutions in funding this development track. It seems clear that if the value proposition can be made to university administrators, this track could flourish with buy-in at the administrative level.
Nic Weber44 will serve as the point of contact for this working group.
The group’s initial broad question was, “What makes for successful transitions of scientific software from academia to industry?” There are a number of potential funding transitions that may occur:
We characterized each of the above potential changes in funding as “transition pathways” to sustainable software (see similar work by Geels and Schot).
Our work at WSSSPE3 included the following three activities (described in more detail below): (1) brainstorming goals for this type of research, (2) imagining potential outcomes of completing a set of case studies on this topic, and (3) generating a set of working definitions for some of the broad concepts we are describing.
First, we discussed the goals of this research, attempting to answer the question, “What is the goal of doing research on transition pathways?” A number of research questions arose: Can we identify collaborations that have occurred and try to understand which were successful, which were unsuccessful, and what factors contributed to these successes or failures? Can we determine what each partner wants to get out of such a collaboration? For example, why would industry be interested in collaborating with academia, or academia with industry? How could we design a study focused on the impact on the software of undergoing this type of transition?
Next, we imagined potential outcomes of research on this topic, involving a set of case studies that look at successful and unsuccessful transitions of researchers between academia and industry. This might address each of the transition types (as described below). Successful transitions are described as those that lead to either weak or strong sustainability (also defined below). In addition, the results from this research might help create a generalizable framework that might allow for the study of different transition pathways (other than academia to industry).
Finally, we created some general definitions for these concepts; we characterize transitions in the following ways:
We characterized sustainability in the following ways:
We refer readers to Becker et al. for an extended discussion of weak versus strong sustainability.
The opportunity is to create a catalog of successes and failures that current and future software projects can use to prepare for transitions and achieve sustainability of the software.
The obstacle is more mundane: finding a champion to gather such information. It will be a challenge to keep this information and the surveys updated. With rapidly changing industry landscapes, an obsolete survey could be of little or no use.
Identify projects that are collaborative, perhaps by reviewing funded projects from programs specifically geared towards industry academic collaborations.
Develop a systematic process for conducting case studies (what kind of data are being gathered about each case).
No concrete plans have been made at this point. If the community can rally behind this topic, some momentum could be built. Those interested should post at https://github.com/WSSSPE/meetings/issues/46
Nothing at the moment.
A key portion of this effort will require focused surveys of projects that have succeeded and failed in transition. Both categories will yield lessons on what works and what does not. The group has identified what needs to be studied further, but has not identified responsible parties to conduct these studies.
Community members could help in gathering data by means of interviews, historical documents or documentation, and surveys.
An example of data collection is:
Concrete funding needs were not discussed in this working group, but the general impression was that some seed funding would motivate members of this group or others in the community to launch a survey effort.
This group was composed of members from different backgrounds, including quantum chemistry, epidemiology, microscopy, computer science, and software engineering. Each participant was invited to give their perspective on the topic area and what they thought were the crucial points for discussion. There was a general consensus on the need to relate principles to practice for the computational science and engineering community. Furthermore, various members of the group expressed their interest in tools and best practices for facilitating the maintenance and evolution of scientific software systems. It was agreed to identify principles from software engineering and from sustainability design and, based on those lists, to discuss what each would mean when applied to specific example systems from the expert domains of some of the group members. The group identified a number of software engineering principles drawn from the Software Engineering Body of Knowledge (SWEBOK).
Software design principles include abstraction, coupling and cohesion, decomposition and modularization, encapsulation and information hiding, separation of interface and implementation, sufficiency, completeness, and primitiveness, and separation of concerns. Similarly, user interface design principles include learnability, user familiarity, consistency, minimal surprise, recoverability, user guidance, and user diversity. The sustainability design principles were drawn from the Karlskrona Manifesto on Sustainability Design. The manifesto states that sustainability is systemic, multidimensional, and interdisciplinary; transcends the system’s purpose; applies to both a system and its wider contexts; requires action on multiple levels; requires multiple timescales; that changing a design to take long-term effects into account does not automatically imply sacrifices; and that system visibility is a precondition for and enabler of sustainability design. A number of sustainable software engineering principles proposed by Tate were also considered, including: continual refinement of product and project practices; a working product at all times; continual emphasis on design; and valuing defect prevention over defect detection.
This aggregated list is an initial collection of principles that could be extended with further related work from separate disciplines within the field of software engineering, including requirements engineering, software architecture, and testing. The group identified two example systems against which to discuss the application of the principles. The first was a quantum chemistry system that allows the analysis of the characteristics and capabilities of molecules and solids. The second was a modeling system for malaria that permitted biologists to analyze a range of datasets across geography, biology, and epidemiology, and to add their own datasets. The group then examined the principles and performed a retrospective analysis of what the developers did in practice against how the principles could have made a difference. This raised the question: what do the principles mean for computational science and engineering software? Similarly, how do the principles relate to non-functional requirements? It was suggested that, at the very minimum, sustainable software engineering principles should be mapped to two core quality attributes that underpin technically sustainable software:
These fundamental building blocks could then be extended to include other quality attributes such as portability, reusability, scalability, usability, and energy efficiency. Nevertheless, this raises the question of which metrics and measures are suitable to demonstrate the sustainability of software. In addition, what do the five dimensions of sustainability (environmental, economic, social, technical, and individual) mean for scientific software?
The opportunity was identified to distill existing software engineering and sustainability design knowledge into “bite sized” chunks for the Computational Science and Engineering Community. In addition, two primary challenges were identified:
In order to achieve the following three goals: (1) a systematic analysis of a number of example systems from different scientific domains with regards to the identified principles, (2) the identification of the commonalities and gaps in applying the principles to different scientific systems, and (3) a proposal of a set of guidelines on the principles, the following next steps were discussed.
The following plan for future organization was discussed:
The following key milestones were discussed as a roadmap for the set of guidelines on software engineering principles:
Specific funding was not discussed in this working group. However, this is an open topic that can be discussed in relation to emerging funding calls from national agencies or grant proposal initiatives.
Gabrielle Allen47 will serve as the point of contact for this working group.
The group discussion began by agreeing on a common purpose: creating a set of guidance giving examples of specific metrics for the success of scientific software in use, why they were chosen, what they are useful to measure, and any challenges and pitfalls, and then publishing this as a white paper. The group discussed many questions related to useful metrics for scientific software, as follows:
Next, a roadmap for how to proceed was discussed including creating a set of milestones and tasks as follows:
The idea was put forth for the group to interact with the organizing committee of the 2016 NSF Software Infrastructure for Sustained Innovation (SI2) PI workshop in order to email out a software metrics survey to all SI2 and related awardees as a targeted and relevant set of stakeholders. This survey would be created by one of the student group members. Similarly, it was suggested that a software metrics survey be sent to the UK SFTF and TRDF software projects to ask them what metrics would be useful to report. The remainder of the discussion focused mainly on the creation of a white paper on this topic. This resulted in a paper outline and writing assignments with the goal of publishing in venues including WSSSPE4, IEEE CISE, or JORS.
The following opportunities, challenges, and obstacles were discussed:
The following next steps were discussed:
The following plan for future organization was discussed:
The following list of what else is needed was discussed:
The following items were discussed as a roadmap for the production of a white paper:
Funding needs were not discussed in this working group; it was thought that this could be revisited later.
Nick Jones48 will serve as the point of contact for this working group, and be responsible for ensuring timely progress of the planned actions.
While little training focuses specifically on sustainable software, a variety of training activities could increase researcher awareness of and engagement with software professionals and software engineering practices. Research Software Engineers are being recognized as critical contributors to high-quality research, but the pathway to acquire and master the relevant skills is not yet clear; equally, the skills required by researchers in general are neither commonly understood nor routinely developed.
The group’s discussion explored a rapidly growing array of training that is seen to contribute to sustainable software. The offerings are diverse, including: self-paced online modules focused around specific tools; single and multiple day training workshops that raise awareness of a tool chain to support collaborative and shared software development within a research workflow; block courses specializing on particular methods, technologies, and applications; academic programs at undergraduate and masters levels; doctoral training programs that in part contain requisite skills training activities.
While some of this training focuses on applying software engineering practices within the context of research, meeting the values and goals of research is less often incorporated as an explicit learning outcome. With software (and similarly, data) often being the only tangible artifact of a research method or protocol, the dependency between software applications and the quality of research adds complexity to the learner’s journey. In recognition of the longer-term investment required by researchers to integrate such skills into their research practices, many activities focus on emotionally engaging researchers and cohorts, to build a sense of shared purpose beyond the obvious goal of technical skill acquisition.
In reviewing current training activities, the group identified a variety of perspectives seen as useful in positioning activities in ways to better communicate why and when best to apply each activity. Training can be categorized on a variety of spectra, with content and delivery ranging across them, for example: programming to research; basic to advanced; technical to emotional; informal to formal; and self-paced to participative. A few attempts have been made to situate a cross section of training activities within such dimensions, creating easier means of communicating the value of any specific opportunity and the pathways across opportunities over time.
Evaluation of training delivery and outcomes is seen as a weakness common to most non-academic training activities. Opportunities for measuring success in delivering training start simply with collecting a Net Promoter Score, which lets those delivering training know whether attendees are likely to recommend the training to others. In looking at the longer term outcomes for the learner, frameworks such as Bloom’s taxonomy and Kirkpatrick’s evaluation model offer possible approaches.
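The Net Promoter Score mentioned above reduces to a simple calculation over 0–10 “would you recommend this training?” responses. A minimal sketch (the survey responses below are hypothetical):

```python
def net_promoter_score(ratings):
    """Compute NPS from 0-10 'would you recommend?' ratings.

    Promoters rate 9-10, detractors 0-6; the score is the
    percentage of promoters minus the percentage of detractors.
    """
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical post-workshop survey responses
ratings = [10, 9, 9, 8, 7, 6, 10, 5, 9, 8]
print(net_promoter_score(ratings))  # prints 30.0
```

A positive score indicates more promoters than detractors; tracking it across workshop iterations gives a cheap, if coarse, delivery signal.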
In this latter case of formal evaluation, ownership of evaluation as a component of career development for any researcher appears mostly absent. While academic research institutions have professional development centers to support research staff, the skills taught that might affect sustainable software are limited at best, and lack a clear and coherent development pathway.
Coordination of these training projects will depend on buy-in from a broad range of training program and activity leaders, suggesting a key opportunity lies in identifying and bringing together these people on a regular basis.
Software skills are needed by an increasing array of researchers and fields. The training arc is not well-defined, with a sometimes baffling array of training opportunities responding to various facets of skill deficit and need. Given this current complexity, coordination across training projects would create common frames of reference, communicating and integrating activities to better serve the needs of researchers.
Building this community could lift the maturity of training projects and capabilities, enabling more advanced approaches to address key gaps in evaluation and career development, and lifting the standard of research practices.
In aiming at these opportunities, it will be necessary to find the means to support those involved in leading training activities to allocate time to coordination activities, which will often sit beyond their current scope of responsibility.
These activities are also distributed globally, with no single country or region offering a comprehensive set of capabilities and initiatives. Any coordination activity will therefore need to raise the profile of the opportunity gap with relevant research funders and policy makers.
The goal of the following next steps is to quickly test whether there is interest in establishing a community committed to increasing the degree of coordination across training projects.
Continue to track progress by posting comments to the WSSSPE3 issue.
If the group moves from early-stage formation into working towards shared goals, expertise will likely be required in pedagogy and training evaluation.
Workshop/RCN travel funding to bring together key program, project, and funder representatives from across North America, EU, UK, Australasia. In addition, funding to support work on better defining the landscape of training activities, the useful perspectives in communicating the value of the varied training projects, and the possible pathways through training activities over time.
Kyle Niemeyer49 will serve as the point of contact for this working group, and be responsible for ensuring timely progress of the planned actions.
The following section summarizes the working group’s discussion based on contributions prior to the meeting and the collaborative notes taken during the meeting. Please refer to the original sources for the unedited discussions if necessary.
Initial discussions focused on both various mechanisms for, and the philosophical approach behind, crediting software in scientific papers. These began with proposals for various ways to credit software (or other research products including data) that contributed more significantly than a generic citation, including:
However, as of this writing, only Project CRediT roles [48, 49] and Contributorship Badges have been implemented for published papers, and both of these provide only a single “Software” or “Computation” category associated with software. In addition, neither of these options allows for the citation of software itself; they only provide an author contribution related to software. The discussion quickly focused on transitive credit as a more quantitative measure of allocating credit to both authors and software, although there were some concerns about authors overestimating their own contributions compared to prior work.
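As a sketch of how transitive credit might be computed, the following toy example propagates fractional credit from a paper through a software dependency to its developers. The products, contributor names, and weight fractions are hypothetical illustrations, not a prescribed scheme:

```python
def transitive_credit(products, product, weight=1.0, credit=None):
    """Propagate credit through a product's contribution map.

    products maps each product to {contributor_or_product: fraction};
    the fractions for each product sum to 1. Credit assigned to a
    dependency that is itself a product is split recursively among
    that product's own contributors.
    """
    if credit is None:
        credit = {}
    for part, frac in products[product].items():
        share = weight * frac
        if part in products:  # dependency is itself a product: recurse
            transitive_credit(products, part, share, credit)
        else:  # individual contributor: accumulate their share
            credit[part] = credit.get(part, 0.0) + share
    return credit

# Hypothetical: a paper credits its authors 80% and a library 20%;
# the library's credit is split between its two developers.
products = {
    "paper": {"author_A": 0.5, "author_B": 0.3, "library": 0.2},
    "library": {"dev_C": 0.7, "dev_D": 0.3},
}
credit = transitive_credit(products, "paper")
# author_A: 0.5, author_B: 0.3, dev_C: 0.14, dev_D: 0.06
```

The concern noted above maps directly onto the weights: each product’s authors choose the fractions, so overestimating one’s own contribution shrinks the credit that flows to dependencies.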
The discussion then evolved into philosophical questions about the importance or reliance of a particular work on prior science, materials, or software—in other words, whether there is a difference between depending on prior scientific advances and depending on certain software (or experimental equipment). Multiple contributors converged on the conclusion that unique capabilities require some additional credit. The—albeit limited—consensus was that if a particular study relied on the unique capabilities of software, data, or an experimental apparatus, then the authors or developers that created this capability should be credited somehow.
The group also agreed that additional data are required to support the assertion that software is not being sufficiently cited in the literature. In particular, this issue seems to be field-dependent. For example, as shown by a study of Howison and Bullard, in the field of biology the most-cited papers appear to be those describing scientific software. However, this may not—and likely is not—the case in other fields, nor is it clear whether developers of scientific software, even in the case of the biology field, are receiving sufficient credit for their efforts.
In the breakout sessions on the first day of WSSSPE3, the group discussed and deliberated over the Entertainment Identifier Registry (EIDR) as a potential model for scientific software. That system assigns unique Digital Object Identifiers (DOIs)—the same system used for scientific publications—to all content (e.g., movies, television shows) and contributors, along with relevant metadata. One important use of the EIDR system is to track rights and credits for contributors to entertainment works in order to distribute revenues—similar to the proposed transitive credit concept.
The group also discussed separating quantitative measures (e.g., number of citations) from the value of a work in order to give credit, moving towards qualitative or anecdotal evidence of value. Other topics that were brought up included a form of PageRank for citations, based on number of mentions, and using market penetration or adoption rate in a community as a metric, although it was not clear how this would be measured. Finally, the concept of a software tool’s uniqueness or indispensability to a community was mentioned, with value characterized by a particular piece of software either offering unique capabilities or doing something better, faster, or with lower computational requirements than other offerings.
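To illustrate the PageRank-style idea, the following sketch runs power iteration over a small, hypothetical mention graph in which papers cite software packages; the damping factor, node names, and graph are illustrative assumptions only:

```python
def pagerank(links, damping=0.85, iters=100):
    """Power-iteration PageRank over a directed citation graph.

    links maps each item to the items it cites (mentions).
    Rank flows from citing items to cited items, so software
    mentioned by many highly-ranked works scores highly itself.
    """
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        new = {node: (1 - damping) / n for node in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:  # dangling node (cites nothing): spread rank evenly
                for node in nodes:
                    new[node] += damping * rank[src] / n
        rank = new
    return rank

# Hypothetical mention graph: three papers cite two software packages
links = {
    "paper_A": ["pkg_X"],
    "paper_B": ["pkg_X", "pkg_Y"],
    "paper_C": ["pkg_Y"],
    "pkg_X": [],
    "pkg_Y": [],
}
ranks = pagerank(links)  # pkg_X and pkg_Y outrank the papers citing them
```

The appeal of such a metric is that it weights mentions by the standing of the mentioning work rather than treating all mentions equally, though as the group noted, obtaining a complete mention graph remains the hard part.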
On the second day of WSSSPE3, the group decided to put aside the taxonomy of contributions and focus on software citations to ensure developers receive credit (regardless of contribution). Eventually, once software citations are standardized, the goal would be to return to establishing different roles/contributions for this credit. Following this decision, the group identified two necessary actions to move forward:
For both of these actions, a number of ongoing efforts were identified and discussed.
Software Citation Metadata:
At a minimum, the metadata required for software citation includes:
This information would then be contained in a citation file, e.g., as part of the GitHub repository. The group also discussed similar efforts such as CodeMeta50, an attempt to codify minimal metadata schemes in JSON and XML for scientific software and code, and implementing transitive credit via JSON-LD. Some questions arose about how this information would be stored for closed-source software.
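As an illustration of such a citation file, the following sketch emits a minimal JSON metadata record for a software release. The field names, values, and file name are hypothetical illustrations, not the official CodeMeta schema:

```python
import json

# Hypothetical minimal citation metadata for a software release;
# field names are illustrative, not a standardized schema.
metadata = {
    "name": "exampletool",
    "version": "1.2.0",
    "identifier": "10.5281/zenodo.0000000",  # placeholder DOI
    "authors": [
        {"name": "Jane Researcher", "orcid": "0000-0000-0000-0000"},
    ],
    "license": "BSD-3-Clause",
    "codeRepository": "https://example.org/exampletool",
    "datePublished": "2015-09-29",
}

# Serialize to a citation file shipped alongside the source code
with open("CITATION.json", "w") as f:
    json.dump(metadata, f, indent=2)
```

Keeping such a machine-readable file in the repository lets citation tools and indexers extract author, version, and DOI information without scraping free-text README files.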
As one mechanism for constructing accurate contributor lists from existing project contributors, the group discussed associating GitHub accounts—as well as accounts on Bitbucket, CodePlex, and other repositories for open-source scientific software—with ORCID accounts. However, a (quick) response from GitHub (via Arfon Smith) indicated that this might not be possible in the near future: “GitHub doesn’t have any plans to allow ORCID accounts to be associated with GitHub user accounts.”
Citing Software in Publications:
Although far from a standard practice, examples of citing software in publications can be found in various scientific communities—notably, representative samples can be found in astronomy and biology. The group recommended collecting similar examples from other communities, and then developing a software citation principles document in concert with the FORCE11 Software Citation Working Group (see §6.7.5 for more details), following the model of the FORCE11 Data Citation Principles document.
The group also briefly discussed whether software used directly in a publication—whether to perform simulation or analysis, or as a dependency for newly developed software—should be distinguished from other references due to the dependence of the study on these research artifacts. Suggestions included a separate list of citations (with DOIs) for software and other research objects that serve this sort of “vital” role. Similar recommendations were made by the credit breakout group at WSSSPE2.
Finally, although a discrete task from software citations, significant discussion focused on ensuring software citations are indexed in the same manner as publications, allowing the construction of a corresponding software citation network. Currently, software releases can receive citable DOIs via Zenodo and figshare; however, these citations are not processed by indexers such as Web of Science, Scopus, or Google Scholar. Thus, either in parallel or following the primary task, the group will need to reach out to these organizations. Initial conversations with Elsevier/Scopus via Michael Taylor during WSSSPE3 clarified that Scopus is not yet DataCite DOI aware, and does not yet have an internal identifier for software or data (but needs/plans to add this support). Taylor said they prefer a “software article” with the usual article metadata (e.g., authors, citations), and mentioned Zenodo as an example; this proposal seemed to align with our group’s discussions. Taylor also mentioned another benefit of the software and associated DOI on GitHub: in addition to a citation, one could obtain statistics on usage/downloads/forks, which is what Depsy51 is beginning to do.
There currently is no standard mechanism for citing software or receiving credit for software (akin to citations for publications). Software is eligible for DOI assignment, but DOI metadata fields are not well tuned or standardized for software (vs. publications). Some software providers apply for DOIs, but this is not widely adopted. Also, there is no mechanism to cite software dependencies within software.
Major obstacles include the fact that indexers (e.g., Scopus, Web of Science, Google Scholar) do not currently support software/data document types or DataCite DOIs. Therefore, even with universal association of scientific software with DOIs and standardized practices for citing software in publications, software citations will not be indexed in the same manner as traditional publications.
Although this working group’s discussions at WSSSPE3 did not focus much on the topic of tenure and professional advancement, the group recognized that there is no standard policy—generally even within a single university—for software products to be included in promotion and tenure dossiers. Thus, it may be difficult to encourage valuing software contributions across the United States or United Kingdom and globally; furthermore, stakeholders are typically not tenured and thus may not be influential enough to change the status quo. However, as discussed in Section 5.2, this is changing for Research Software Engineers, at least in the UK.
The WSSSPE breakout group plans to join efforts related to citing software with the FORCE11 Software Citation Working Group (FORCE11-SCWG)52; Kyle Niemeyer formally requested the merging of these groups following the meeting. However, some future plans of the WSSSPE group fall outside the scope of FORCE11-SCWG, which covers software citation practices. These activities include working with indexers such as Web of Science and Scopus to index software citations archived on, e.g., Zenodo or figshare, and pursuing the development of an open indexing service; such plans will be pursued either separately or through the formation of follow-on FORCE11 working groups.
The group will primarily communicate electronically, with Kyle Niemeyer responsible for ensuring regular progress.
The near-term actions of the group, focused mainly on software citation, do not require any additional resources. However, connections with publishers and indexers will be needed to pursue related activities, although the FORCE11-SCWG may satisfy this need; in addition, some members of the group already reached out to relevant contacts. Funding may be needed to organize meetings or for group members to attend relevant meetings, as discussed further below.
Following the meeting, Kyle Niemeyer formally requested the merging of software citation activities with FORCE11-SCWG. Within a month of the meeting, Niemeyer will organize a virtual meeting of the group and manage the division of responsibilities for compiling existing practices of software citation and including software/products in promotion and tenure dossiers. Building off of these efforts, the next major milestone is drafting the Software Citation Principles document in collaboration with the SCWG, targeted for April 2016. While the existing directors of the SCWG, Arfon Smith and Dan Katz, lead the efforts of that group towards the Software Citation Principles document, Kyle will help coordinate contributions from the WSSSPE group members.
Some funding would be useful to support primarily travel to conferences for group meetings (e.g., FORCE2016)53, and to hold meetings to bring together both group members and key stakeholders (e.g., journals, publishers, professional societies, indexers). In addition, funding would be desired to support group members’ time to perform work towards the key steps described previously.
Steven R. Brandt54 will serve as the point of contact for this working group.
A tentative first cut at a list of executable papers identified the following:
The group also discussed future possibilities for a new publication format that might provide advantages:
The opportunity is to collect a list of executable papers and shine a light on the experiments and development efforts currently underway.
The only obstacle to this is the difficulty in finding and identifying such publications. The Software Sustainability Institute was able to do something similar for publications about software by making a public page on the Software Sustainability Institute’s website (http://www.software.ac.uk) containing a catalog of these publications and enlisting the help of the community to grow the list.
Create the first version of the web page to be displayed on the Software Sustainability Institute’s website: http://www.software.ac.uk. We expect the page to be live in early January of 2016.
An ongoing effort to update the page should follow.
None at this time.
Nothing else at this time.
Steven R. Brandt has created a first version of the page, and it is in the process of being posted on the Software Sustainability Institute’s website: http://www.software.ac.uk. Neil Chue Hong will take responsibility for the page once it is up.
Discussion revolved around a few questions: what is the benefit of having a “community” for software sustainability, what practices and circumstances lead to having and maintaining a community, how can funding help or hinder this process, and perhaps most importantly, how can best practices be described and distilled into a document that can help new projects.
The benefits of having a community were considered largely obvious. In addition to providing advocates for the software and a possible source of “free” contributions to the codebase, the community becomes a good source of requirements, feedback, and metrics. The community can also act as “cheerleaders” who convince funders or other potential users to fund or use the software, and thus help sustain it.
Among the practices and circumstances that lead to a community, the first is that the software offers value. Beyond this, a community is much more likely to form if users receive (expert) support when they have questions. Additional contributing factors are good usability (not always needed) and an open development process, such as the IPython developer meetings on YouTube. It was also pointed out that an evangelist for the project, often but not necessarily one of the developers, can make a big difference.
Funding can help the process by encouraging both value to the community and high-quality user support. Funding software development alone may create good software, but makes a real community less likely. It was discussed that federal laboratories are a good incubator for software communities, and that a general facility like EarthCube is too dispersed to really form a community. Domain-specific groups within laboratories or universities might also serve as incubators for software communities.
In describing best practices, the group discussed the different modes for starting a scientific software project: building on an existing product that needs improving, recognizing an unsatisfied need of an existing community, or creating a new solution to a need not yet recognized by the community. The group also thought that the existing books on software communities would need to be evaluated in light of differences between science software projects and general open-source software (OSS) projects in terms of scale, science, acknowledgement and credit, and funding models.
The main opportunity is to increase awareness among scientific software developers and project managers of the importance of developing a community around their project. While this message is fairly well understood in the open source community, the scientific community can be more focused on the science a software project is supporting rather than the software project itself.
As with many of the issues relevant to the sustainability of science software, the main challenge here will be changing the culture and expectations around scientific software.
The most important next step is a “Best Practice” document, which would describe what successful projects with engaged communities look like, how to replicate this type of project, and how to handle the end of life of a community project. Inputs to this document would include a survey of highly functioning software communities such as R Open Science, Python SciPy, OPeNDAP, and Unidata, with analysis of the factors that feed into their success. References such as the “Art of Community” could also be adapted and summarized for the science software community.
More specifically, the group would like to take the following steps:
Another next step would be increasing recognition of the need for science software projects to focus on building and supporting their user communities. Good software engineering practices are not enough, and popular training like Software Carpentry does not currently address this issue head-on.
No definite plans were agreed upon for future organization. The major ideas discussed were coordinating with another group or adapting some existing text.
Collaboration within the framework of an existing organization seems a good initial path. Mozilla Science maintains a “Working Open Project Guide” , the introduction of which states:
Working openly with contributors enables your community to learn how to build and collaborate together. This document is a guideline on how to work openly and involve others in your projects with Mozilla. We want to help you engage your community in a way that encourages contributors and builds other leaders.
Another idea is to form a group that could adapt existing commercial-oriented guidelines for the world of scientific software and top-down funding structures. For example, to distill the “Art of Community” by Jono Bacon  for scientific software.
The group had many points of agreement, but there is not currently a dedicated core group of people who have committed to producing the key milestones. Coordination via phone or online would be necessary to build this “community” of contributors.
The key milestones for the group’s activities align closely with the Key Next Steps above:
With a small amount of seed funding, members of this group or other parties could spend the time necessary to devise a survey of existing projects and deploy it, probably by traveling to meetings and workshops for the various software communities.
In WSSSPE3, we attempted to take what we learned from WSSSPE1 and WSSSPE2 in how we can collaboratively build a workshop agenda and turn that into an ongoing community activity. The success or failure of these efforts will only become apparent over time.
The workshop had two components: presentations and working groups. The presentations, in the first half day of the workshop, included an inspirational keynote and a set of lightning talks. We used lightning talks for two reasons: first, the need of some participants to have a slot on the agenda to justify their attendance; and second, as a way to get new ideas across to all the attendees. We broke with the tradition of requiring the lightning talk submitters to self-publish their papers; instead we used a common peer-review platform57 and published their slides on the workshop website.
The working groups met for a small part of the first half day and all of the second day, with the exception of some short periods for the groups to report back to the collected workshop attendees. Each group determined a set of activities that the members could do to advance sustainable software in a particular area.
The results of these group sessions made it clear that there are many interlinked challenges in sustainable software, and that while these challenges can be addressed, doing so is difficult because they generally are not the full-time job of any of the attendees. As was the case in WSSSPE2 as well, the participants were willing to dedicate their time to the groups while they were at the meeting, but afterwards, they went back to their (paid) jobs.
We need to determine how to tie the WSSSPE breakout activities to people’s jobs, so that they feel that continuing them is a higher priority than it is now, perhaps through funding the participants, or through funding coordinators for each activity, or perhaps by getting the workshop participants to agree to a specific schedule of activities during the workshop as we have tried to do in WSSSPE3. It remains to be seen, however, if the participants will meet the schedules they set.
The overall challenge left to the sustainable software community is perhaps one of organization: how to combine the small partial efforts of a large number of people to impact a much larger number of people: those who develop and use scientific software. While WSSSPE might help focus the actions of the groups, something more is needed to incentivize the wider community, which is a generalization of the sustainable software problem itself.
36London Software Credit workshop: http://www.software.ac.uk/software-credit
37FORCE11-SCWG landing page, https://www.force11.org/group/software-citation-working-group
38FORCE11-SCWG GitHub page, https://github.com/force11/force11-scwg
52FORCE11 Software Citation Working Group, https://www.force11.org/group/software-citation-working-group
The supplementary files for this article can be found as follows:
Work by Katz was supported by the National Science Foundation while working at the Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Choi’s work was supported in part by National Science Foundation research grant DMS-1522687 and a WSSSPE3 travel award. She thanks Fred Hickernell for encouragement and discussion. Hetherington was funded by the Software Sustainability Institute, RCUK grants EP/H043160/1 and EP/N006410/1. Work by Gunter was supported by the Office of Science, Office of Biological and Environmental Research, of the U.S. Department of Energy (DOE) under Award Numbers DE-AC02-05CH11231, DE-AC02-06CH11357, DE-AC05-00OR22725, and DE-AC02-98CH10886, as part of the DOE Systems Biology Knowledgebase, and by the Office of Science, Office of Advanced Scientific Computing Research (ASCR) of the U.S. Department of Energy under Contract Number DE-AC02-05CH11231 as part of the Template Interfaces for Agile Parallel Data-Intensive Science (TIGRES) project. WSSSPE3 was supported by NSF award 1434218 and funding from the Gordon and Betty Moore Foundation.
Kyle Niemeyer is an Associate Editor of JORS, although uninvolved in the processing/review of this article.
Katz, DS, Allen, G, Chue Hong, N, Parashar, M and Proctor, D (2013). First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE): Submission and Peer-Review Process, and Results. arXiv; 1311.3523. http://arxiv.org/abs/1311.3523.
Katz, DS Choi, SCT Lapp, H Maheshwari, K Löffler, F Turk, M et al. (2014). Summary of the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1). Journal of Open Research Software 2(1)DOI: https://doi.org/10.5334/jors.an
Katz, DS Allen, G Chue Hong, N Cranston, K Parashar, M Proctor, D et al. (2014). Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2): Submission, Peer-Review and Sorting Process, and Results. arXiv; 1411.3464. http://arxiv.org/abs/1411.3464.
Katz, DS, Choi, SCT, Wilkins-Diehr, N, Chue Hong, N, Venters, CC, Howison, J et al. (2016). Report on the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2). Journal of Open Research Software, accepted. Available from: http://arxiv.org/abs/1507.01715.
Turk, MJ, Smith, BD, Oishi, JS, Skory, S, Skillman, SW, Abel, T et al. (2011). yt: A Multi-code Analysis Toolkit for Astrophysical Simulation Data. ApJS 192(1): 9, January 2011. DOI: https://doi.org/10.1088/0067-0049/192/1/9
Becker, C, Chitchyan, R, Duboc, L, Easterbrook, S, Mahaux, M, Penzenstadler, B et al. (2014). The Karlskrona manifesto for sustainability design. arXiv:1410.6968. http://arxiv.org/abs/1410.6968.
Patra, AK, Bauer, AC, Nichita, CC, Pitman, EB, Sheridan, MF, Bursik, M et al. (2005). Parallel adaptive numerical simulation of dry avalanches over natural terrain. Journal of Volcanology and Geothermal Research (Modeling and Simulation of Geophysical Mass Flows) 139(1): 1–21. DOI: https://doi.org/10.1016/j.jvolgeores.2004.06.014
Meng, H, Kommineni, R, Pham, Q, Gardner, R, Malik, T and Thain, D (2015). An invariant framework for conducting reproducible computational science. Journal of Computational Science (Computational Science at the Gates of Nature) 9: 137–142. DOI: https://doi.org/10.1016/j.jocs.2015.04.012
Meng, H and Thain, D (2015). Umbrella: A Portable Environment Creator for Reproducible Computing on Clusters, Clouds, and Grids. Proceedings of the 8th International Workshop on Virtualization Technologies in Distributed Computing (VTDC ’15). New York, NY, USA: ACM, 23–30. DOI: https://doi.org/10.1145/2755979.2755982
Meng, H, Wolf, M, Ivie, P, Woodard, A, Hildreth, M and Thain, D (2015). A case study in preserving a high energy physics application with Parrot. Journal of Physics: Conference Series 664(3): 032022. DOI: https://doi.org/10.1088/1742-6596/664/3/032022
Huo, D, Nabrzyski, J and Vardeman, C (2015). An Ontology Design Pattern towards Preservation of Computational Experiments. Proceedings of the 5th Workshop on Linked Science 2015 – Best Practices and the Road Ahead (LISC 2015), co-located with the 14th International Semantic Web Conference (ISWC 2015).
Baxter, R, Chue Hong, N, Gorissen, D, Hetherington, J and Todorov, I (2012). The Research Software Engineer. Digital Research 2012. http://digital-research-2012.oerc.ox.ac.uk/papers/the-research-software-engineer
Parr, C (2013). Save your work – give software engineers a career track. Times Higher Education, 15 August 2013. https://www.timeshighereducation.com/news/save-your-work-give-software-engineers-a-career-track/2006431.article
Gil, Y, David, CH, Demir, I, Essawy, BT, Fulweiler, RW, Goodall, JL et al. (2016). Towards the Geoscience Paper of the Future: Best Practices for Documenting and Sharing Research from Data to Software to Provenance. Earth and Space Science.
Duffy, CJ, David, C, Peckham, S, Venayagamoorthy, K and Gil, Y (). Geoscience Papers of the Future: An Introduction to the Special Issue. Earth and Space Science, in press. Available from: http://agupubs.onlinelibrary.wiley.com/agu/issue/10.1002/(ISSN)2333-5084(CAT)SpecialIssues(VI)GPF1/.
Nanthaamornphong, A and Carver, JC (2015). Test-Driven Development in scientific software: a survey. Software Quality Journal: 1–30. DOI: https://doi.org/10.1007/s11219-015-9292-4
Heaton, D and Carver, JC (2015). Claims about the use of software engineering practices in science: A systematic literature review. Information and Software Technology 67: 207–219. DOI: https://doi.org/10.1016/j.infsof.2015.07.011 Available from: http://www.sciencedirect.com/science/article/pii/S0950584915001342.
Basili, VR, Carver, JC, Cruzes, D, Hochstein, LM, Hollingsworth, JK, Shull, F et al. (2008). Understanding the High-Performance-Computing Community: A Software Engineer’s Perspective. IEEE Software 25(4): 29–36, July 2008. DOI: https://doi.org/10.1109/MS.2008.103
Carver, JC, Kendall, RP, Squires, SE and Post, DE (2007). Software Development Environments for Scientific and Engineering Software: A Series of Case Studies. 29th International Conference on Software Engineering (ICSE’07): 550–559. DOI: https://doi.org/10.1109/ICSE.2007.77
Sempervirens (). Accessed: 2015-11-07. https://github.com/njsmith/sempervirens.
Ahalt, S, Berriman, B, Brown, M, Carver, J, Chue Hong, N, Fish, A et al. (2015). Toward a Framework for Evaluating Software Success: A Proposed First Step. Computational Science and Engineering Software Sustainability and Productivity Challenges (CSESSP) Workshop. Available from: https://www.orau.gov/csessp2015/whitepapers/ahalt_stan.pdf.
Heroux, MA and Willenbring, JM (2009). Barely sufficient software engineering: 10 practices to improve your CSE software. Software Engineering for Computational Science and Engineering, 2009 (SECSE ’09), ICSE Workshop: 15–21.
Blatt, M (2013). DUNE as an Example of Sustainable Open Source Scientific Software Development. arXiv:1309.1783. http://arxiv.org/abs/1309.1783.
Ahern, S, Brugger, E, Whitlock, B, Meredith, JS, Biagas, K, Miller, MC et al. (2013). VisIt: Experiences with Sustainable Software. arXiv:1309.1796. http://arxiv.org/abs/1309.1796.
Merali, Z (2010). Computational science: … Error … why scientific programming does not compute. Nature 467: 775–777, DOI: https://doi.org/10.1038/467775a
Hettrick, S et al. (2014). UK Research Software Survey 2014. DOI: https://doi.org/10.5281/zenodo.14809
Becker, C, Betz, S, Chitchyan, R, Duboc, L, Easterbrook, SM, Penzenstadler, B et al. (2016). Requirements: The Key to Sustainability. IEEE Software 33(1): 56–65, January 2016. DOI: https://doi.org/10.1109/MS.2015.158
Becker, C, Chitchyan, R, Duboc, L, Easterbrook, S, Penzenstadler, B, Seyff, N et al. (2015). Sustainability Design and Software: The Karlskrona Manifesto. Proc. 2015 Int’l Conf. Software Eng. (ICSE’15). DOI: https://doi.org/10.1109/icse.2015.179
Working towards Sustainable Software for Science: Practice Experiences (). Accessed: 2015-12-03. http://wssspe.researchcomputing.org.uk/.
International Workshop on Software Engineering for High Performance Computing in Computational Science Engineering (). Accessed: 2015-12-03. http://se4science.org/workshops/.
Workshop on Software Engineering for Sustainable Systems (). Accessed: 2015-12-03. http://sustainabilitydesign.org/initiatives/se4susy/.
Entertainment Identifier Registry (). Accessed: 2015-10-28. http://eidr.org.
Katz, DS (2014). Transitive Credit as a Means to Address Social and Technological Concerns Stemming from Citation and Attribution of Digital Products. Journal of Open Research Software 2(1): e20, September 2014. DOI: https://doi.org/10.5334/jors.be
Mayes, AC, zee-moz, Collins, A, Niemeyer, K and Jabbari, A (2015). Leadership-Training: “Working Open” Guide – WSSSPE3 version. DOI: https://doi.org/10.5281/zenodo.33748
Howison, J (2015). Sustaining scientific infrastructures: transitioning from grants to peer production (work-in-progress). iConference 2015 Proceedings. http://hdl.handle.net/2142/73439
Geels, FW and Schot, J (2007). Typology of sociotechnical transition pathways. Research Policy 36(3): 399–417, DOI: https://doi.org/10.1016/j.respol.2007.01.003
WSSSPE3 Software Credit Working Group (2015). WSSSPE3 Software Credit Working Group GitHub Issues. Accessed: 2015-10-1. https://github.com/WSSSPE/meetings/issues/51.
WSSSPE3 Software Credit Working Group (2015). WSSSPE3 Software Credit Working Group Collaborative Notes. Accessed: 2015-10-1. https://docs.google.com/document/d/1oN0ZYqIoWtOE1LBMIlWY9N8nn5LHTncj8GjUKPh62pA.
CASRAI (). Project Credit. Accessed: 2015-03-31. http://credit.casrai.org.
Mayes, AC (2015). Contributorship Badges. Accessed: 2015-10-26. https://www.mozillascience.org/projects/contributorship-badges.
Open Researcher Contributor ID (ORCID) (). Accessed: 2015-03-31. http://orcid.org/.
McCall, JG, Al-Hasani, R, Siuda, ER, Hong, DY, Norris, AJ, Ford, CP et al. (2015). CRH Engagement of the Locus Coeruleus Noradrenergic System Mediates Stress-Induced Anxiety. Neuron 87(3): 605–620, August 2015. DOI: https://doi.org/10.1016/j.neuron.2015.07.002
Lin, IC, Okun, M, Carandini, M and Harris, KD (2015). The Nature of Shared Cortical Variability. Neuron 87(3): 644–656, August 2015. DOI: https://doi.org/10.1016/j.neuron.2015.06.035
Howison, J and Bullard, J (2015). Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. Journal of the Association for Information Science and Technology, in press. DOI: https://doi.org/10.1002/asi.23538
SIG SP (2015). Astronomy software citation examples and ideas. Accessed: 2015-10-29. https://docs.google.com/document/d/1q9ULl7alA3veL7Qwg7jGteRWeJwlrkvRHSXjvt-rTs0.
Data Citation Synthesis Group (2014). Joint Declaration of Data Citation Principles In: Martone, M ed. San Diego, CA: FORCE11. Accessed: 2015-10-28. http://www.force11.org/group/joint-declaration-data-citation-principles-final.
Zenodo (). Accessed: 2015-10-28. https://zenodo.org.
Figshare (). Accessed: 2014-02-03. https://figshare.com.
IPython (2015). A gallery of interesting IPython Notebooks. Accessed: 2015-10-28. https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks#reproducible-academic-publications.