docs: revise best practices and contributing guide #293
Conversation
maybe something else should go there?
- See `template/common-parameter-meta.txt` for common description strings.
- If applicable, use the same parameter name and help text as the underlying tool called by the task.
Not at all related to this PR, but I wonder if there is a way to tell Copilot about this file. It'd be great if its autocomplete suggestions would consider that file as a set of options when writing `parameter_meta`.
I'm skimming this now, but there are instructions for doing something that sounds like this feature: https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions?tool=vscode
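For reference, those repository-level instructions live in a plain markdown file. A sketch of what an entry could look like (the file path is from the linked GitHub docs; the wording below is just illustrative, not a proposed final text):

```markdown
<!-- .github/copilot-instructions.md -->
When writing `parameter_meta` sections in WDL tasks:

- Reuse the shared description strings in `template/common-parameter-meta.txt`
  rather than inventing new help text.
- If applicable, use the same parameter name and help text as the underlying
  tool called by the task.
```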
- Contributors can mix and match the available templates, copying and pasting subsections as appropriate.
- It is allowed to have one resource allocated dynamically and another allocated statically in the same task.
- Multi-core tasks should *always* follow the conventions laid out in the `use_all_cores_task` example (see `template/task-examples.wdl`).
- This caters to cloud users, who may be allocated a machine with more cores than are specified by the `ncpu` parameter.
We may want to revisit this paradigm in the future. With Planetary + AKS, the container only gets allocated the requested number of cores within the VM.
A question you may not have the answer to: how does `nproc` work in those environments? I think that `nproc` on the cluster is "smart" and only returns what was allocated.
It does. With Planetary + AKS, the container is allocated n CPUs, and it can only see that many. I'm not actually confident that the idea behind `nproc` is sound, though. As long as the task is running in a container, the Docker engine will have provided some number of cores to the container; presumably, whatever engine is queueing the container passes the `cpu` requirement to Docker as that core count. The `use_all_cores` idea is probably better implemented at the engine level: if the engine allocates one VM per task, it should use all cores; if it is capable of sharing VMs between tasks, it shouldn't.
The point of `use_all_cores` is so that you can tell called tools what level of parallelism should be used. Most bioinformatics tools will only spin up as many threads as you ask for; without specifying something like `--thread [>1 value]`, everything runs in one thread regardless of how many cores the task/engine/backend/VM allocates.
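A minimal sketch of that convention in shell (the `ncpu` and `use_all_cores` variables stand in for the task inputs of the same name; the `tool --threads` invocation is illustrative, not template code):

```shell
# Inputs as a WDL engine would interpolate them into the command block.
ncpu=4
use_all_cores=true

n_cores="$ncpu"
if [ "$use_all_cores" = "true" ]; then
    # On cloud backends the VM may have more cores than requested,
    # so query the runtime environment instead of trusting `ncpu`.
    n_cores="$(nproc)"
fi

# The detected core count is then forwarded to the called tool, e.g.:
echo "would run: tool --threads $n_cores"
```

This only helps when the tool actually accepts a thread-count flag; the discussion below is about whether the engine, rather than the task, should decide the `use_all_cores` value.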
Yeah, I meant more that the engine itself should set `use_all_cores` if it isn't capable of scheduling multiple tasks to a node.

Given our experience with LSF, though, I'm curious how other HPC systems function (e.g. Slurm). LSF does something so that the underlying task only sees the allocated cores. If scheduling systems reliably did that, then we could just always `use_all_cores`.
> Yeah, I meant more that the engine, itself, should set use_all_cores if it isn't capable of scheduling multiple tasks to a node.

This stumped me for a minute. I don't like the idea of the engine messing around in the `input` section, but that wouldn't be needed: WDL 1.3 exposes both hints and requirements to command blocks, which is a paradigm shift I haven't adjusted to yet. I can see something analogous to `use_all_cores` being a hint that is accessed in the command block in order to determine how `nproc` should (or should not) be called.
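A rough sketch of what that could look like. The `task` variable in the command block is from the WDL 1.2+ spec, but the `task.ext.use_all_cores` member and the `tool` invocation here are assumptions invented for illustration, not settled syntax:

```wdl
version development

task use_all_cores_via_hint {
    input {
        Int ncpu = 1
    }

    command <<<
        n_cores=~{ncpu}
        # Hypothetical: read an engine-provided hint from the command
        # block; only trust nproc when the engine says this task owns
        # the whole node.
        if [ "~{task.ext.use_all_cores}" = "true" ]; then
            n_cores="$(nproc)"
        fi
        tool --threads "$n_cores" input.bam
    >>>

    requirements {
        cpu: ncpu
    }
}
```

The engine would be the one setting such a hint, keeping the `input` section untouched.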
best-practices.md
Outdated
- The non-empty qualifier (`+`) of arrays and maps should be avoided.
- TODO: this is really just our opinion after trying to use it and finding the resulting WDL ugly. I'm not sure it can really be called a "best practice".
- This is because the non-empty qualifier can be cumbersome to deal with in WDL source code.
I don't actually recall what led to writing this. I'm not sure we should recommend avoiding a WDL language feature as a best practice, though. If anything, the spec should be improved around the feature to improve the user experience.
We briefly had `Array[*]+` on main before we hit some sprocket + miniwdl compatibility issue that made us go back and revert them all (and write this entry), although I now can't remember the specifics. It should be in our git history, though it will take some digging to find.
removed for now
> If anything, the spec should be improved around the feature to improve the user experience.

This is true. I'm going to leave this thread unresolved, as we should go dig up that git history, share the pain points we ran into, and file issues in OpenWDL for ways to improve the spec.
Co-authored-by: Andrew Thrasher <adthrasher@gmail.com>
closes #277
Before submitting this PR, please make sure:
- If you've modified the `scripts/` or `docker/` directories, please ensure any image versions have been incremented accordingly!