How Research Projects Can Survive and Thrive in Open Source
EU-funded research projects strongly emphasize creating software that will outlive the project itself. The goal is for the results to be built upon, reused, and extended, to lead to new business growth, and to have a long-term impact on research and innovation. Consequently, research projects always try to describe convincingly what will happen to the project results after the project ends.
Open Source Can Provide a Sustainable Ecosystem
Bringing research projects into open source gives their outputs the best chance of surviving and thriving once their funding runs out. Open source has become an integral part of the business strategies and products of leading companies, which makes for a highly viable environment where code generated by research projects can be picked up and expanded upon. Research projects, especially larger ones, are complex systems with many players who have their own agendas, development processes, and governance. Open source acts as a catalyst that brings them together and makes it easier for them to collaborate.
Pushing Code to GitHub Is Not the (Only) Solution
Studies on GitHub project survival over time are not encouraging: one study found that the probability of surviving longer than five years is less than 50%. Odds are that the survival rate of research project-generated code on GitHub is similar.
So there are more considerations for bringing results into the open source world than just pushing code to GitHub at the end of the 3-4 years of research. Here are a few to consider:
Identify What Partners Bring Into the Project
As an initial step, identify all the assets your partners are bringing into the project. This is a normal step for research projects, so it takes very little extra time and effort to also look at which open source projects, communities, and licenses the project will be dealing with. It may sound trivial, but doing this early is important, and the result is useful input for the next step.
Identify Which Outputs Will Be Open Source
Identifying and clearly stating which parts of the project results will be open source is crucial in defining a coherent open source landscape for the whole project.
A few questions to guide this process include:
- Will the project develop new tools or products?
- Does it aim to provide a full-fledged integrated platform?
- Does it aim to extend or significantly contribute to existing tools that partners are bringing into the project?
- Does the project plan to open source the entire code or only some parts of it?
- Is the project a combination of some or all of the above?
Target Relevant Open Source Communities
The advantage of having a clear open source landscape for the project is that you can identify relevant communities early on. Naturally, these communities are to some degree defined by open source projects that partners bring into the project. But it’s always worth thinking about other communities beyond the obvious ones.
The old principle “don’t reinvent the wheel” also applies to community building. Building a new community from scratch is a long-term, time-consuming, and costly process. It’s better to tap into existing communities if you can.
Open source community-building activities should be integrated into the dissemination and communication plan, along with the concrete means of carrying them out and the related KPIs.
Figure Out IP and Licensing Arrangements
Discussing open source licenses early on is important. If it is not done at the proposal phase, future conflicts are almost guaranteed. The consortium should agree on the types of licenses for the project results. Licenses to be excluded should ideally be put in the Consortium Agreement. For example, to allow for commercial exploitation of the open source results, strong copyleft licenses should not be brought into the project without prior agreement.
Keeping track of all the inbound IP and licenses once the project is running and software is being developed will enable early identification of issues, which makes them easier to resolve. Replacing a GPL-licensed library, for example, is much easier before a lot of code depends on it. IP and licensing are challenging topics that require training, coordination, and probably tool support, which should be reflected in project tasks and efforts.
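As a rough illustration of what lightweight tool support for license tracking could look like, the following Python sketch checks an inventory of inbound components against a list of excluded licenses. Everything in it is an assumption made for illustration: the component names, the partner labels, the `InboundComponent` and `find_conflicts` names, and the contents of the exclusion list are hypothetical. A real consortium would take its exclusion list from its Consortium Agreement and would likely rely on a dedicated license-scanning tool.

```python
from dataclasses import dataclass

# Hypothetical exclusion list, written as SPDX license identifiers.
# In practice this would come from the Consortium Agreement.
EXCLUDED_LICENSES = {"GPL-3.0-only", "AGPL-3.0-only"}


@dataclass(frozen=True)
class InboundComponent:
    """One piece of software a partner brings into the project."""
    name: str
    partner: str
    license_id: str  # SPDX identifier, e.g. "Apache-2.0"


def find_conflicts(components, excluded=EXCLUDED_LICENSES):
    """Return the components whose license is on the excluded list."""
    return [c for c in components if c.license_id in excluded]


# Hypothetical inventory, for illustration only.
inventory = [
    InboundComponent("example-parser", "Partner A", "Apache-2.0"),
    InboundComponent("example-solver", "Partner B", "GPL-3.0-only"),
    InboundComponent("example-ui", "Partner C", "EPL-2.0"),
]

conflicts = find_conflicts(inventory)
for c in conflicts:
    print(f"{c.name} ({c.partner}): {c.license_id} is on the excluded list")
```

Running a check like this whenever a new dependency is added is one way to surface a problematic license before much code depends on it.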
Start Off in Open Source
To take advantage of the strengths of open source, infrastructure (i.e., a public repo) should be set up at the beginning of the project, and open source best practices implemented from day one.
While there’s sometimes an instinct to keep the code behind closed doors until it seems ready or polished, doing so will ultimately slow down the project, and the task of moving the code to open source will only get more difficult over time. Starting in open source is a prerequisite to unleashing all the benefits of open collaboration, which should be a pillar of every research project.
And in the highly competitive ecosystem of funded research, a credible open source strategy that demonstrates benefits to public, academic, and economic targets can make for a winning proposal.
Our 10 years of experience in research projects have taught us that the intention to do open source is not enough. That intention can run up against a lack of motivation from certain partners, a lack of knowledge about open source licenses and their compatibility, and an underestimation of the effort it takes to produce truly business-friendly open source software.
It is in the face of these challenges that a foundation such as ours can truly make a difference. Throughout the 3-4 years of the project, we take the time to explain, to accompany, and to help the consortium move from intention to motivation to real involvement. We explain that a project that wants to be open source does not stop when the 3-4 years of the research project are over; it is a commitment that the consortium makes to its early adopters for the years that follow.
We usually start and end our projects by saying: “Open source is a journey, not a destination!”
To learn more about how publicly-funded research projects partner with the Eclipse Foundation, visit eclipse.org/research.
About the Author
Marco Jahn is research projects manager at the Eclipse Foundation, where he helps turn innovations into successful open source projects.