Skip to content

feat: Add feature to enable dynamic ec2 config via workflow labels#5003

Open
edersonbrilhante wants to merge 25 commits intogithub-aws-runners:mainfrom
edersonbrilhante:feat-ec2-dynamic-config
Open

feat: Add feature to enable dynamic ec2 config via workflow labels#5003
edersonbrilhante wants to merge 25 commits intogithub-aws-runners:mainfrom
edersonbrilhante:feat-ec2-dynamic-config

Conversation

@edersonbrilhante
Copy link
Contributor

@edersonbrilhante edersonbrilhante commented Jan 19, 2026

Summary

This PR resumes and completes the work started in #4529.

It also allows to use any other dynamic labels with prefix ghr-. Giving support for unique labels per job or per group of jobs

It ensures that EC2-specific config can be defined via run-ons

How to test:

Use your regular labels, and add ghr-ec2-instance-type and ghr-ec2-image-id

run-ons:
  - <regular-labels>
  - ghr-ec2-instance-type:<different-instance-type>
  - ghr-ec2-image-id:<different-ami-id>

In this case:

  • The runner is resolved from <regular-labels>
  • The EC2 instance type and AMI ID are taken exactly from the provided labels

@edersonbrilhante edersonbrilhante marked this pull request as ready for review January 20, 2026 16:16
@edersonbrilhante edersonbrilhante requested review from a team as code owners January 20, 2026 16:16
@npalm
Copy link
Member

npalm commented Jan 30, 2026

@edersonbrilhante great to see this PR.

@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch 3 times, most recently from e9c175c to 3dc5d6d Compare January 30, 2026 20:37
@andrecastro
Copy link

This is a really interesting feature!

Just one suggestion from my side: would it be possible to support a whitelist of allowed instance types?

Also, it could be really powerful to have some kind of feature-flag / policy control over which parts of the configuration are allowed to be dynamic. For example, in my org we don’t want developers to be able to select arbitrary AMIs (only a pre-approved set), but it would be awesome to still allow them to choose the instance type for workflow jobs, as long as it’s constrained to an allowed list.

Maybe the "feature flag" is not even necessary as long as we could define the "allowed values" for each configuration, with this we could list only the pre-approved AMIs.

@edersonbrilhante
Copy link
Contributor Author

@andrecastro I liked and makes a lot of sense to me. I just need more time to think about the implementation. And tbis PR is already big enough XD. I could create a following for adding this restricted values feature

Copy link
Contributor

@stuartp44 stuartp44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to approve, but I do have a statement about incorrect labels and the effect on the process.

expect(canRunJob(workflowLabels, runnerLabels, false)).toBe(true);
expect(canRunJob(workflowLabels, runnerLabels, false, false)).toBe(true);
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference my previous point, what do I feel is valid here?

}),
});
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these tests work with all good values, but because we are in the user space, what about bad values and their effects? Is it maybe worth extending the tests to not trust the user data? I am not sure how the behaviour will be if someone makes a mistake, does it take the whole batch/process out?

Copy link
Contributor

@stuartp44 stuartp44 Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example would be m5.large, could be m5,large. How will this change behaviour?

@stuartp44
Copy link
Contributor

stuartp44 commented Feb 20, 2026

I also agree with what was previously mentioned; we probably need a safelist, as we don't want lateral movement when a compromised pipeline is used, especially with the VPC setting. Maybe worth some "allowed_instance_type" setting or something to that effect that can be checked against, and if not in the list, ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments