Getting Parallel Bilby production-ready for LIGO’s fourth observing run
The Third Observing Run (O3) of the LIGO-Virgo-KAGRA Collaboration (LVK) ended on March 31, 2020, bringing the total number of gravitational-wave (GW) events to more than 90. The LVK’s Fourth Observing Run will start towards the end of 2022 and is estimated to provide over 150 GW events. Inferring the astrophysics of these candidates is a cornerstone of GW astronomy. In inference, observed data is modelled with parameterized GW signals to understand the objects that generated the GW. However, some GW models are very complex and take a long time to compute. Parallel Bilby makes these models practical by spreading the analysis of the observed data across multiple computers, which can reduce the computation time from 15 years to a few days!
Because Parallel Bilby was the only software tool that allowed the LVK to use the very complex GW models, the LVK required its immediate release for analysis in O3, just a few days after its inception! As a result, various development practices that ensure code quality and readability were sacrificed to meet Parallel Bilby’s functional requirements. These sacrifices allowed the small team of Parallel Bilby developers to build and release the first version of the software quickly. However, the poor code quality and the lack of unit and end-to-end tests have made maintenance and further development of Parallel Bilby challenging.
In 2021, ADACS developers Conrad Chan, David Liptai, and Tiger Hu stepped in to restructure Parallel Bilby. Through several hands-on development days, ADACS and Parallel Bilby developers collaboratively planned and implemented the new code structure. During these hands-on sessions, the ADACS team could listen to the needs and concerns of the Parallel Bilby developers and design a solution using their knowledge of software engineering design principles. The Parallel Bilby developers report that working alongside the ADACS developers was very helpful:
“I felt more keyed in the work that was done, as this was an entire overhaul of the codebase” – A. Vajpeyi
Before refactoring the code, the developers established a set of end-to-end (E2E) tests to validate the consistency of results. Deterministic code execution is a prerequisite of scientific reproducibility, which Parallel Bilby was previously unable to achieve. The ADACS team conveyed the importance of this to the science team, and together they identified and resolved the issues preventing determinism. Once a method for rapidly assessing the consistency of results was in place, the ADACS developers refactored the code into modules using modern software development practices. These changes improve code readability and make it easier for future developers to build upon the codebase. The newly implemented test suite allows the code to be tested more thoroughly, minimising the risk of regressions when several contributors change the code at the same time. Furthermore, the Parallel Bilby developers have been shown these best practices, setting them up to independently continue writing high-quality software for this and any future projects.
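To illustrate the kind of determinism check described above, here is a minimal sketch in Python. It is not taken from the Parallel Bilby test suite: `toy_sampler` and `check_reproducible` are hypothetical names, and the toy function merely stands in for a stochastic analysis run. The idea is simply that a run given the same seed should return exactly the same result.

```python
import random


def toy_sampler(seed, n_samples=5):
    """Hypothetical stand-in for a stochastic analysis run (not Parallel Bilby code)."""
    # A local generator avoids hidden global state, a common source of non-determinism.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def check_reproducible(run, seed):
    """Run the analysis twice with the same seed and compare the results exactly."""
    return run(seed) == run(seed)


if __name__ == "__main__":
    assert check_reproducible(toy_sampler, seed=42)
    print("identical results for identical seeds")
```

A check of this shape can run before and after each refactoring step, so that any change in the output is noticed immediately.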
“Their work has instilled good development practices within the (currently small) development team and made the software more accessible to new developers. Their refactoring and the addition of tests have provided a firmer foundation for future development, which may enable us to make further optimizations to Parallel Bilby.” – A. Vajpeyi