adds eu interchange in MD #82

Open
swhume wants to merge 2 commits into main from sam-todo

Conversation

Collaborator

@swhume swhume commented Apr 18, 2026

Let's use Markdown to get the slides and main messages down, then translate them into nice slides. We can decide which slides should be graphics and where we can use text, but find more visually interesting ways to present them.

I created an updated DDE architecture diagram that should work in the slide deck and added it. Once we agree on the content, I can create some additional graphics.

We can also allow other team members to contribute this way if they are interested. That said, @dostiep, since you're presenting, you have the final say on the slides.

@swhume swhume requested a review from dostiep April 18, 2026 13:30

## What Are We Trying to Accomplish
1. Create a solution that maximizes automation and minimizes manually created metadata to generate Define-XML
2. Support generating Define-XML from the study design; this uses USDM -> Biomedical Concepts -> SDTM Dataset Specializations
Collaborator

Let's add CRF Specializations. We already shared some details about it in the BC webinar in March.

Collaborator Author

I think we could mention that when we cover the Define-XML generation using USDM + BCs + DSSs. We can say we plan to follow a similar approach to generating the CRFs using CRF Specializations.

Collaborator Author

Since the focus of the presentation is on SDTM Define-XML generation, I added the CRF Specializations to the speaker's notes as something to mention.

2. Add support for ADaM Define-XML
3. DTA to SDTM transformations
4. Plug In Architecture
5. Initial release targeted by EOY
Collaborator

What form will a release take?

Collaborator

Maybe re-use the graphical image and highlight the boxes that represent what we want to deliver by EOY?

Collaborator Author

I hope we have an initial release candidate ready that covers, at the very least, the Define-XML generation. I believe we will have the CRF generation ready as well, but I have less visibility on that one at the moment. We can generate an image that conveys what we expect to be ready for EOY.

Collaborator Author

I added a brief description in the speaker notes.

1. Numerous gaps in the metadata needed to automate Define-XML generation were identified
2. Examples include keySequence, Length, ...
3. To address the missing metadata, we used placeholders in the first Define-XML generated
4. Gaps may drive updates to standards
Collaborator

Or, in this case, necessitate a new standard.

Collaborator

@chowsanthony Do you have something in mind in terms of a new standard?

## Inefficiencies in Today's Process
1. Manual and Inefficient Workflow
- Current Define-XML workflows rely heavily on spreadsheets, local conventions, and manual editing, causing inefficiencies.
2. Error-Prone Processes
Collaborator

@chowsanthony chowsanthony Apr 23, 2026

Add "Every 1.32 studies go through protocol amendments. Manual process is not only error-prone, but unsustainable and expensive."

Ref: Getz K, Smith Z, Botto E, Murphy E, Dauchy A. New Benchmarks on Protocol Amendment Practices, Trends and their Impact on Clinical Trial Performance. Ther Innov Regul Sci. 2024 May;58(3):539-548. doi: 10.1007/s43441-024-00622-9. Epub 2024 Mar 4. PMID: 38438658.

Collaborator Author

Automation driven by the Study Design metadata available in USDM is a key point here.

@@ -0,0 +1,124 @@
# Automating Define-XML Generation in the CDISC 360i Program
Collaborator

I think we can use this opportunity to strike a more strategic tone.

1. Frame the Shift as a Strategic Transformation, Not Just a Process Fix
Instead of solely focusing on "we are replacing spreadsheets to reduce manual copy-paste errors," frame it as moving from a static, document-based past to a dynamic, machine-consumable future. This helps conference attendees understand that we are not just building a faster spreadsheet. We are creating a fundamentally new metadata backbone that unlocks the true value of their standards investments.

2. Create a Clear "Current vs. Future" Contrast
We have inefficiencies and future states in separate sections. Bringing them together into a direct comparison makes the future direction immediately obvious and easy to digest for non-technical attendees.

| The Current Standard (Spreadsheets) | The Future Direction (Metadata-Driven) |
| --- | --- |
| Static & Study-Specific: Templates must be manually adapted for every new trial. | Dynamic & Reusable: A structured model that scales systematically across projects. |
| Laborious: Requires manual interpretation, copy-pasting, and extensive QC. | Machine-Consumable: Enables automated validation and generation directly from the source. |
| Siloed Intermediary: Breaks the chain between upstream concepts and downstream artifacts. | Connected Backbone: Directly links the study design and biomedical concepts to the final Define-XML. |

3. Strengthen the "Why This Matters"
The point "Realizing the benefits of your investment in standards via metadata-driven automation" is currently buried as the fifth bullet point in its section. This should be the headline. The core message should be: Spreadsheets trap your standards in unreadable formats; the Data Definition Engine activates them.

Collaborator Author

@swhume swhume Apr 24, 2026

I agree that we want to highlight a new way of working driven by study design. Using existing metadata sources, like spreadsheets, just creates a better bridge to that future until USDM is business as usual.

## What Are We Trying to Accomplish
1. Create a solution that maximizes automation and minimizes manually created metadata to generate Define-XML
2. Support generating Define-XML from the study design; this uses USDM -> Biomedical Concepts -> SDTM Dataset Specializations
3. Support using multiple sources of metadata to generate the Define-XML, such as existing metadata spreadsheets
Collaborator

@dostiep dostiep Apr 24, 2026

I know this is a conversation we need to have but instead of "metadata spreadsheets", why not mention something more 2026 such as "sponsor's MDR"? We could update the data_architecture image accordingly.

Collaborator Author

I agree. Generally speaking, we want to enable users to pull metadata from where it exists today, in addition to supporting a future where that metadata is sourced from USDM. That's one way the plugin architecture works. You could add MDR as a metadata source. Maybe we can get the MDR vendors to help with this at some point. The spreadsheet/MDR metadata sources also help test the DDS model and Define-XML generation code.

Collaborator Author

I added MDRs to the slide and speaker notes.


## Inefficiencies in Today's Process
1. Manual and Inefficient Workflow
- Current Define-XML workflows rely heavily on spreadsheets, local conventions, and manual editing, causing inefficiencies.
Collaborator

@dostiep dostiep Apr 24, 2026

See my point about spreadsheets. We even mention here that using them is something of a liability.

I think we also need to mention that spreadsheets are not a "single source of truth". Or at least we need to make it clear somewhere in this presentation that the intention is also to avoid duplicating information across different sources, which could lead to inconsistencies.

Collaborator Author

I view spreadsheets mostly as a way to help the many folks who use them today get started with the DDE application. As they begin implementing USDM, they can support those new studies, with the goal of moving the industry off spreadsheets in the long run. I know many sponsors who use metadata spreadsheets (or a mix of an MDR and spreadsheets), and some have expressed interest in open-source tools for generating Define-XML.

---

## The Project: The Data Definition Engine (DDE)
![dde_architecture_slide.png](dde_architecture_slide.png)
Collaborator

In the image, I think the DC/ADaM Loader box and the Additional box are mixed up.

Collaborator Author

I'll generate an updated version.


## The Project: The Data Definition Engine (DDE)
![dde_architecture_slide.png](dde_architecture_slide.png)

Collaborator

This slide is probably the most important one for the audience to grasp what this project is all about. I will probably need to stay on this slide for a couple of minutes. I will need "presenter's notes" to be added so I can talk while the crowd is looking at the slide. I'll think of something, but feel free to add content.

Collaborator Author

The Solution Overview document provides a description. I haven't pushed the latest as I need to review it. I will get that pushed soon. Then we can pull some speaker's notes from there.

## The Project: The Data Definition Engine (DDE)
![dde_architecture_slide.png](dde_architecture_slide.png)

1. Metadata Sources
Collaborator

As mentioned in a previous comment, shouldn't we replace the spreadsheet with the sponsor's MDR?

Collaborator Author

How about I update the slide to have MDR/Metadata Spreadsheet? I like the MDR interface, but I'm not sure how feasible it is for us to implement. If you know of an MDR API or something similar we can test, please add it to the backlog. We may need MDR vendor support to do that. Spreadsheets are widely used and something we can develop. I showed this diagram to a sponsor, and they were pretty happy with what we are doing. They use spreadsheets and are also implementing an MDR (maybe their 2nd or 3rd try with different MDR vendors).

Collaborator

I feel the sponsor's pain. Our first attempt at an MDR failed miserably! I now have my standards in YAML and I'm using a Python GUI for extraction. At least the YAML format is vendor neutral and can later be converted for integration into any MDR system.

## Plug In Architecture
1. Add new loaders to address different sources of metadata, such as MDRs or other propreitary sources
2. Add new generators to create new study artifacts or variations of the supported artifacts
3. New approaches to the refinement pipeline
Collaborator

Need more details here as I don't get the idea.

Collaborator Author

Without getting into the technical details (I will write something up on this topic), the idea is that we provide a way for others to write their own loader code that can be used by this application. MDRs would be a great example. We, the project, may not be able to build an MDR loader (without vendor help), but we could set this up so someone could write one that could be added. Basically, the architecture should allow others to extend it with their own loaders that pull metadata from whatever sources they have to populate the DDS, and then can take advantage of the generators to create the outputs.
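
To make the idea concrete, here is a minimal Python sketch of how a loader/generator registry could work. Every name in it (`register_loader`, `register_generator`, `example_mdr`, the record fields) is hypothetical and chosen only for illustration; it is not the DDE's actual plug-in interface, and a plain list of dicts stands in for the DDS.

```python
from typing import Callable, Dict, List

# Hypothetical types: the real DDE plug-in interface is not defined in this thread.
Loader = Callable[[str], List[dict]]     # source location -> metadata records
Generator = Callable[[List[dict]], str]  # metadata records -> rendered artifact

LOADERS: Dict[str, Loader] = {}
GENERATORS: Dict[str, Generator] = {}


def register_loader(name: str) -> Callable[[Loader], Loader]:
    """Decorator that makes a loader discoverable by name."""
    def wrap(func: Loader) -> Loader:
        LOADERS[name] = func
        return func
    return wrap


def register_generator(name: str) -> Callable[[Generator], Generator]:
    """Decorator that makes a generator discoverable by name."""
    def wrap(func: Generator) -> Generator:
        GENERATORS[name] = func
        return func
    return wrap


@register_loader("example_mdr")
def load_from_example_mdr(source: str) -> List[dict]:
    # A real loader would query the MDR here; this stub returns one fixed record.
    return [{"dataset": "VS", "variable": "VSTESTCD", "origin": source}]


@register_generator("variable_list")
def generate_variable_list(records: List[dict]) -> str:
    # Stand-in for a real generator such as a Define-XML writer.
    return "\n".join(f"{r['dataset']}.{r['variable']}" for r in records)


def run(loader_name: str, source: str, generator_name: str) -> str:
    """Load records with the named loader, then render them with the named generator."""
    records = LOADERS[loader_name](source)  # the list stands in for the DDS
    return GENERATORS[generator_name](records)
```

With something like this, a third party could add an MDR loader by registering one function, and every existing generator would work with it unchanged.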

Collaborator

Interesting, I can even give it a shot with my YAML standard files but they are for ADaM standards, not SDTM.


## Metadata Gaps Identified
1. Numerous gaps in the metadata needed to automate Define-XML generation were identified
2. Examples include keySequence, Length, ...
Collaborator

Should I be more specific during the presentation, for example by mentioning that those are not available in the CDISC Library and thus need user input? Or should I just mention the few examples and move on?

Collaborator Author

We could move our list of gaps identified in Phase 1 into this repo and let the audience know where to find more details. Then, during the presentation, you can highlight some examples that get the point across. You can state that not all the metadata needed to generate a Define-XML is available via the USDM + BC + DSS content.
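
As a rough illustration of the gap-and-placeholder approach described on the slide, the Python sketch below flags missing attributes on a variable's metadata and fills them with a placeholder. The attribute list uses only the two examples named on the slide (`keySequence`, `Length`); the `TODO` placeholder value, the `fill_gaps` helper, and the sample record are assumptions, not the project's actual code.

```python
from typing import Dict, List, Optional

# Only the two examples named on the slide; the real gap list is longer.
REQUIRED_ATTRIBUTES = ("keySequence", "Length")
PLACEHOLDER = "TODO"  # hypothetical placeholder value


def fill_gaps(variable: Dict[str, Optional[object]]) -> List[str]:
    """Replace missing required attributes with a placeholder and return their names."""
    gaps = [attr for attr in REQUIRED_ATTRIBUTES if variable.get(attr) is None]
    for attr in gaps:
        variable[attr] = PLACEHOLDER
    return gaps


# Hypothetical record: Length is present but empty, keySequence is absent.
variable = {"name": "VSORRES", "Length": None}
missing = fill_gaps(variable)
```

The returned list of names could feed a gap report, while the placeholders keep the generated Define-XML structurally complete until a user supplies the real values.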

- Targets automation in a way that Define-XML was not designed to support.
3. Alignment and Automation Benefits
- Structured metadata enables automated validation, controlled terminology checks, and reliable value-level metadata building.
- Provides the metadata to support many different automation tasks, beyond generating define.xml and ODM-based CRFs.
Collaborator

Such as? An example or two would help, I think. This could be added to the presenter's notes, not the slides.

Collaborator Author

I'll add a note or two on this.


---

## Data Definition Specification (DDS)
Collaborator

Overall, this is the slide I'm going to struggle with, as I'm not the author of the model. I know the benefits it adds for define.xml creation, but I'm having a hard time figuring out everything else it might support. A few notes would be appreciated, mostly for me to fully understand the message we're trying to convey.

Collaborator Author

I'll add some notes on this. I don't think we need to get into everything it might do as I think that's not 100% determined yet. That said, we can highlight how it will help with CRF generation, LabV2 to SDTM transformations, etc.

---

## Plug In Architecture
1. Add new loaders to address different sources of metadata, such as MDRs or other propreitary sources
Collaborator

"propreitary" should be "proprietary"

2. Phased Progress Achieved
- Phase 1 delivered automated SDTM Define-XML generation using new metadata specification models and biomedical concepts.
3. Forward Extension Plans
- Phase 2 will automate ADaM Define-XML incorporating analysis concepts to represent analytical intent as structured metadata.
Collaborator

Just cross-checking: is "DTA to SDTM transformations" not to be mentioned here? Maybe I'm just confused with the 360i Phase 2 goals :)

Collaborator Author

DTA-to-SDTM Transformations are listed as a sub-project at the beginning of the presentation, alongside the other feature team deliverables. We then say the presentation will largely focus on Define-XML generation, since that's primarily what we've worked on so far, and we want to provide specifics on our work in this presentation. Since we don't go into any details on it, it's not a major takeaway, and this holds for most of the other feature teams as well.

Collaborator

So we don't forget: I suggest removing "JSON" from "JSON Model" in the larger orange box in both diagrams.

4 participants