Merged
added 30 commits
February 11, 2026 19:49
…autograd (missed that before).
Changed the dilation in forward and backward from {1, 1} to {0, 0} because otherwise the calculated input/output tensor size is invalid
…dependency propagation (at least for windows)
…eds it) and added todo to remove cblas support (was / is experimental and untested anyway)
Windows specific timer inaccuracies (15ms+) made Benchmark Tests flaky
- added cleanup so the arrayfire AST doesn't explode and cause a crash on dealloc
- removed invalid DNNL_ARG_SHIFT on executeNetwork call
fixes:
- flooring buckets instead of rounding
- integer math with unknown T on bucket size calc (now double)
- not clipMaxValueExclusive => ">=" not ">"
assumption of thread num digits always being >= 5 is wrong on windows, now only truncating on >= maxDigits
logging test: death if supported instead of forced death test
…causing a crash on zeroGrad(). added vector of variables to be cleaned after each iteration to stop arrayfire ast explosion
- file blob dataset: added explicit std::ios_base::binary
…ype to comply with explicit types. all tests now passing on msvc_af_cpu_release :)
- removed overly aggressive dll copying on windows for cpu builds
- added link to openmp for linux builds
- added stub symlink on linux cuda dockerfile
- windows: forgot to replace an inputs.name
- removed install-core-deps
- renamed some actions
- trying to fix symlinks for linux
- moved stuff to "unused"
- removed requirements.txt
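The bucketing fixes in the commit list above (flooring instead of rounding, computing the bucket size in double, and comparing against the exclusive upper clip with ">=") can be illustrated with a minimal sketch. The function name `fixedBucketCounts` and its signature are hypothetical and are not taken from this PR or from flashlight's API. Reading the ">=" fix as "skip values at or above the exclusive upper bound" is also an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not flashlight's actual API.
// Counts values into nBuckets equal-width buckets over
// [clipMinInclusive, clipMaxExclusive).
template <typename T>
std::vector<std::size_t> fixedBucketCounts(
    const std::vector<T>& values,
    std::size_t nBuckets,
    T clipMinInclusive,
    T clipMaxExclusive) {
  std::vector<std::size_t> counts(nBuckets, 0);
  if (nBuckets == 0 || !(clipMinInclusive < clipMaxExclusive)) {
    return counts;
  }
  // Compute the width in double: with an integral T, (max - min) / nBuckets
  // would truncate (e.g. 10 / 4 == 2 instead of 2.5).
  const double lo = static_cast<double>(clipMinInclusive);
  const double width =
      (static_cast<double>(clipMaxExclusive) - lo) /
      static_cast<double>(nBuckets);
  for (const T& v : values) {
    // The upper bound is exclusive, so reject with >= rather than >.
    if (v < clipMinInclusive || v >= clipMaxExclusive) {
      continue;
    }
    // Floor, not round: rounding would move the upper half of every bucket
    // into the next one.
    auto idx = static_cast<std::size_t>(
        std::floor((static_cast<double>(v) - lo) / width));
    if (idx >= nBuckets) {
      idx = nBuckets - 1; // guard against floating-point edge cases
    }
    ++counts[idx];
  }
  return counts;
}
```

With the integers 0 through 10, four buckets and a clip range of [0, 10), the bucket width is 2.5 and the value 10 is skipped because the upper bound is exclusive.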
IMPORTANT: Please do not create a Pull Request without creating an issue first. Changes must be discussed.
Original Issue: [#1]
Summary
Cleans up the old flashlight ci and adds a ci with the following capabilities:
missing: test af cuda
PRs will require manual cuda testing for now
📚 Documentation preview 📚: https://fl--3.org.readthedocs.build/en/3/