4 Modern SCEs

Chairs: James Black and Satish Murphy

5 Question

What are our goals for a modern clinical reporting workflow, on a modern SCE? What are our learnings today achieving that goal, and how can we better prepare ourselves to balance the drive to innovate while having to evolve people and processes?

6 Topics discussed

6.1 Is the next step Homegrown or vendor

Split in vendor approaches vs homegrown for the new generation. We don’t know what the ideal is today, but we know we need to be able to evolve and adapt to new technologies and approaches much more than we did in the past.
Homegrown usually means modular, as still relient on different open soruce and vendor solutions (e.g. AWS, Hashicorp, etc). Is a more accurate description turnkey vs modular?
How to ensure our modular platform scales is an important new aspect, especially using new open source tools.
In a turnkey, the vendor will have baked in provenance. In a modular we must focus on making sure metadata/provenance runs as a background across the system to ensure we can trace back from an insights and transformations to the source data.
A major pain point is how things are funded - we are not used to funding a platform for sustained evolution/innovation, but in the data science space things are constantly evolving - so moving to operations/maintenance is equilivant to decay.

6.1.1 Relationship with informatics partners

There is often tension between informatics and the business, with the growth of business written code that looks more like software (e.g. R packages) vs the the scripts/macros we use to make. We shared experiences finding a balance as we entered this new phase.
It was refreshing to see we have cases where informatics and the business are aligned on one goal, and we need to be more proactive trying to create these relationships (including inviting informatics to our pan-industry discussions. J&J + Roche had informatics representation at this meeting, but otherwise the dialogue was from the business perspective).

6.2 What is an SCE?

Should GxP and exploratory remain seperate platforms?
- Split across companies in the group, with some companies having a single platform for both, and others having seperate platforms.
- With initiatives like the digital protocol coming, we don’t know what the impact will be on routine clinical reporting, and what impacts this will have on the types of people and tasks needed to execute a CSR.
- Pain points merging:
  - Validation (CSV) is a long and high cost process in most companies, which can impact ability to support exploratory work.
  - Needs are different. E.g. clinical reporting is low compute, while design and biomarker work is often heavy in memory and data.
Is data part of the SCE? Traditionally yes, but some but not all companies are de-coupling data from compute.
Whether it’s in the SCE or not, tracebility is extra important in our domain of regulatory reporting.
It appeared across all companies access to an SCE is through a web-browser (not a local application)

6.3 Building trust in business/open source code

The Cathedral and the Bazaar by Eric Rayman was a recomended essay to read, that talks about ‘Cathedral’ products where the code is developed in a closed environment then released, vs ‘Bazaar’ products where the code is developed in the open. An arguement is the Bazaar model, as long as it is a project with enough eyeballs, will lead to shallow bugs; this is also known as Linus’ Law.

6.4 Change-management into a modern SCE

What are we actually building? A general data science platform? A platform optimised for clinical reporting?
- These are not the same platform, and which you pick has an impact. e.g. should statistical programmers learn git, or should we give a simple GUI for pushing code through QC and to Prod?
- There is not a consensus about this for next-gen, with only a handful of companies expecting statistical programmers to work in the same way as general data scientists.
Historically we depended on SAS, it’s data formats, and filesystems. How to build a modern SCE that doesn’t?
- Do we enable legacy workflows to work in the new SCE?; Only new ways; or how do we find a balance to ensure business continuity while enabling innovation?
- The human and process change management piece is massive, and SCE POs must work in tandem with statistical programming leadership.
- Agreement the biggest pain point is the dependency on file-based network share drives for data and insight outputs. One company mentioned they have millions of directories in their legacy SCE.
Most companies have carried over having the outputs server be a network share drive, but would a more ‘publishing’ type model be more robust?

6.5 General notes

We manage on user access. A question is whether we should control access based on user access and the intended use. In terms of both where they are working and what the context of the work is.
We need to rightsize our ambitions, as going to broad will slow us down.
How will moving to this latest generation be a positive impact on our financials? Interesting point made about putting ourselves in the shoes of someone like a CMO - if you don’t care about how the CSR is generated, how is the new SCE making the company money and when will we get a fiscal ROI?
Interactive analysis is growing - need to prepare for when people want to use something like shiny for GxP
The ideal people to work on the SCEs are unicorns - they need to be able to work with the business, understand the trial processes, and be able to work with the technology. We need to be able to train people to be unicorns, and we need to be able to retain them.

7 Actions

Can we have a learning series / discussion / panel on experience using a designing git flows
What are companies experiences across the diverse scopes we’ve placed onto SCEs - need to share more here.
How do we want to store data post shared drive, how do we want to track?
We need to share learnings on the value add from these new SCEs!

8 Resources

White paper on late-stage SCEs

Keynote incoporating the discussion at the round tables: