Commit 1195c7e

pwizla and Portal Code Bot authored
LLMs-code.txt (#2819)
* Add new docContet.js util
* Add new openLLM action
* Swap generic icon for ChatGPT with the official OpenAI logo from PhosphorIcons library
* Swap sparkle icon for Claude with a chat bubble icon, since sparkle is used for Kapa
* Add prompt localization for the top 10 most used languages
* Fix Claude incorrectly building translated prompt
* Better fix Claude handling of locales by adding dual English + localized prompt
* Ensure only Claude shows dual prompts
* Hide descriptions for Open with… buttons
* Update icon for Claude
* Fix icon position when no description provided
* Try to accommodate for both description and no description scenarii for icons placement
* Revert "Try to accommodate for both description and no description scenarii for icons placement"
  This reverts commit cdb8d76.
* Simplify icons alignment fix
* Try to fix Claude's issue
* Fix query parameter for Claude
* Remove dual prompt in Claude now that we fixed the query param
* Refactor translation system and add many more locales
* Add aiPromptTemplates.js
* Add script for translated prompt validation
* Added many more languages
* Include Tldr components content in llms.txt
* WIP: Add draft of generate-llms-code.js to work on 3 sample files
* Improve language detection
* Improve formatting and language detection
* Improve generation
* Enhance file path and title logic
* Improve fenced code blocks detection and generation
* Handle dependencies "lazy loading"
* Improve metadata parsing
* Print out llms-code to stdout
* Improve file path and language detection further
* Handle more languages in a heuristic way
* Add normalizeOutputPath to better handle careless mistakes and add support for powershell blocks
* Add support for anchors in sources
* Handle custom headings
* Add anchors tag to generation script in package.json
* Improve SQL block language inference
* Append empty strings to ensure we don't miss language blocks
* Better handle Docusaurus tabs detection
* Add additional scripts to handle file existence and project root
* Add validator script
* Add script to both generate and validate
* Add llms-code validator and all-docs discovery
  Adds include/exclude filters and wires validation into dev/build.
* Add optional line numbers to llms-code output
  Emits "Lines: start-end" when --line-numbers is used.
* Add section-level Description and Source lines
* Emit Language and File path per variant; remove extra example headers
* Normalize js/ts vs JavaScript/TypeScript in validator and don't treat missing local files as error unless check-files is enabled
* Relax default validator: skip file existence checks by default; add validate:llms-code:strict for CI
* Run generator with --all and propagate flags from CLI to generator config
* Recognize GraphQL/HTML/DOTENV/TEXT and alias sh→bash; normalize js/ts aliases in validator
* Avoid EISDIR during anchor verification by only accepting files (not directories) when resolving doc paths
* Downgrade missing anchor from error to warning during anchor verification
* Guard against incomplete fences: skip empty variants in generator and continue after unclosed fence in validator
* Relax severity: treat missing/unclosed code fences and empty sections as warnings to avoid false-negative failures on prose-only sections
* Strengthen section Description fallback and skip sections with no valid variants to avoid missing Description errors
* Limit --all discovery to cms/ and cloud/; exclude snippets and other directories
* Require Description only for sections with variants; suppress fence-related warnings for prose-only sections
* Do not require section Description; suppress anchor-not-found warnings for cleaner zero-warning runs
* Fix validator crash when section Description is omitted by guarding desc lookups
* Accept both '(Source: …)' and 'Source: …' formats in validator to handle section headers and example variants
* Make section Source optional: accept none, '(Source: …)' or 'Source: …)' and skip URL checks when absent
* Allow fence-first variants (no 'Language:' line) and soften non-absolute Source to warning when present
* Accept missing or legacy file path line per variant (treat as N/A) and only check existence when provided
* Add --verbose flag; collapse per-file 'no snippets' notices into a single summary line with count
* Add --log-file option to write skipped doc IDs; keep concise summary unless --verbose is used
* Auto-create skip log in verbose mode (static/llms-code-skip.log) and mention path; ignore log file in git
* Add llms:generate-verbose script to run generator with --verbose and point to skip log
* Add llms:generate-and-validate:verbose script (verbose generation + validation)
* Add concise docs for llms-code generator/validator, scripts, and flags
* Group llms scripts under scripts/llms and update npm scripts to use wrappers
* Use yarn script for LLMs generation in deploy workflow; target explicit files when checking and committing
* Generate & validate llms-code.txt in deploy workflow; include in change detection and commit alongside llms.txt and llms-full.txt
* Add 'View LLMs-code.txt' button in AI toolbar (navigate to /llms-code.txt)
* Restore proper .gitignore content
* Move llms-code button down
* llms-code: ensure generation writes to static by fixing docs dir discovery (auto-fallback to docusaurus/docs) and wire into dev/build scripts
* Delete docusaurus/docs/contributing/ai-toolbar-translations.md
* Delete docusaurus/docs/ai-toolbar-translations.md
* Delete docusaurus/scripts/validate-prompts.js
* Rename llms-code.md to README-llms-code.md

---------

Co-authored-by: Portal Code Bot <[email protected]>
1 parent 55d8e4a commit 1195c7e

9 files changed: +1590 -8 lines changed

.github/workflows/deploy-production.yml

Lines changed: 9 additions & 3 deletions
@@ -30,12 +30,18 @@ jobs:
 
       - name: 🤖 Generate LLMs files
         working-directory: ./docusaurus
-        run: node scripts/generate-llms.js
+        run: npm run generate-llms
+
+      - name: 🤖 Generate LLMs code file
+        working-directory: ./docusaurus
+        run: |
+          node scripts/llms/generate-llms-code.js --anchors --all --output static/llms-code.txt
+          node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root ..
 
       - name: 🔍 Check for changes
         id: check-changes
         run: |
-          if git diff --quiet HEAD -- docusaurus/static/llms*.txt; then
+          if git diff --quiet HEAD -- docusaurus/static/llms.txt docusaurus/static/llms-full.txt docusaurus/static/llms-code.txt; then
             echo "changed=false" >> $GITHUB_OUTPUT
             echo "🔄 No changes in LLMs files"
           else
@@ -48,7 +54,7 @@ jobs:
         run: |
           git config --local user.email "[email protected]"
           git config --local user.name "GitHub Actions"
-          git add docusaurus/static/llms*.txt
+          git add docusaurus/static/llms.txt docusaurus/static/llms-full.txt docusaurus/static/llms-code.txt
           git commit -m "🤖 Update LLMs files [skip ci]"
           git push
 

docusaurus/package.json

Lines changed: 10 additions & 5 deletions
@@ -4,8 +4,8 @@
   "private": true,
   "scripts": {
     "docusaurus": "docusaurus",
-    "dev": "docusaurus start --port 8080 --no-open",
-    "build": "docusaurus build",
+    "dev": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --output static/llms-code.txt && node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root .. && docusaurus start --port 8080 --no-open",
+    "build": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --output static/llms-code.txt && node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root .. && docusaurus build",
     "swizzle": "docusaurus swizzle",
     "deploy": "docusaurus deploy",
     "clear": "docusaurus clear",
@@ -14,9 +14,14 @@
     "write-heading-ids": "docusaurus write-heading-ids",
     "release-notes": "bash ./scripts/release-notes-script.sh",
     "redirections-analysis": "node ./scripts/redirection-analysis/redirect-analyzer.js",
-    "generate-llms": "node scripts/generate-llms.js",
-    "dev:with-llms": "yarn generate-llms && docusaurus start --port 8080 --no-open",
-    "build:with-llms": "yarn generate-llms && docusaurus build",
+    "generate-llms": "node scripts/llms/generate-llms.js",
+    "dev:with-llms": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --output static/llms-code.txt && node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root .. && docusaurus start --port 8080 --no-open",
+    "build:with-llms": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --output static/llms-code.txt && node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root .. && docusaurus build",
+    "llms:generate-and-validate": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --output static/llms-code.txt && node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root ..",
+    "llms:generate-verbose": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --verbose --output static/llms-code.txt && echo 'Skip log (if any): static/llms-code-skip.log'",
+    "llms:generate-and-validate:verbose": "yarn generate-llms && node scripts/llms/generate-llms-code.js --anchors --all --verbose --output static/llms-code.txt && echo 'Skip log (if any): static/llms-code-skip.log' && node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root ..",
+    "validate:llms-code": "node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --verify-anchors --project-root ..",
+    "validate:llms-code:strict": "node scripts/llms/validate-llms-code.js --path static/llms-code.txt --strict --check-files --verify-anchors --project-root ..",
     "meilisearch:update-order": "node -r dotenv/config scripts/meilisearch/add-category-order.js"
   },
   "dependencies": {
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+# llms-code: generator and validator
+
+This tooling extracts code examples from docs and emits a single consumable file for LLMs, plus an optional validation pass to catch structural issues early.
+
+## What it generates
+
+- `static/llms-code.txt` with blocks grouped by page and section:
+  - `## Section`
+  - `Description: ...` (optional)
+  - `(Source: https://docs.strapi.io/...#anchor)` when `--anchors` is set
+  - For each variant: `Language: ...`, `File path: ...` (or `N/A`), fenced code, `---` divider between variants
+- In verbose runs a skip log is written to `static/llms-code-skip.log` listing pages with no code snippets.
+
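Reviewer sketch (not part of the commit): the section layout listed above can be read back with a small line-oriented parser. The following is a minimal illustration, assuming the field labels from the README (`## Section`, `Description:`, `(Source: …)`, `Language:`, `File path:`, fenced code, `---`). The function name `parseSection`, the returned object shape, and the sample text are hypothetical and do not come from the committed scripts.

```javascript
// Sketch only: field labels follow the README; the parsing logic is an assumption.
const FENCE = '`'.repeat(3); // three backticks, built indirectly so this block stays fence-safe

function parseSection(text) {
  const section = { title: null, description: null, source: null, variants: [] };
  let variant = null;  // variant currently being assembled
  let inFence = false; // true between an opening and a closing fence
  let codeLines = [];

  // Create the variant lazily so fence-first blocks (no "Language:" line) still work.
  const current = () => {
    if (!variant) variant = { language: null, filePath: 'N/A', code: '' };
    return variant;
  };

  for (const line of text.split('\n')) {
    if (inFence) {
      if (line.startsWith(FENCE)) {
        // Closing fence: the variant is complete.
        variant.code = codeLines.join('\n');
        section.variants.push(variant);
        variant = null;
        codeLines = [];
        inFence = false;
      } else {
        codeLines.push(line);
      }
      continue;
    }
    let m;
    if ((m = line.match(/^## (.+)$/))) section.title = m[1];
    else if ((m = line.match(/^Description: (.*)$/))) section.description = m[1];
    else if ((m = line.match(/^\(?Source: (.+?)\)?$/))) section.source = m[1];
    else if ((m = line.match(/^Language: (.+)$/))) current().language = m[1];
    else if ((m = line.match(/^File(?: path)?: (.+)$/))) current().filePath = m[1];
    else if (line.startsWith(FENCE)) {
      // Opening fence: fall back to the fence info string when no Language line was seen.
      const v = current();
      if (!v.language) v.language = line.slice(FENCE.length).trim() || null;
      inFence = true;
    }
    // "---" dividers and blank lines carry no data and are skipped.
  }
  return section;
}

// Hypothetical sample section; the Source URL is the README's own placeholder.
const sample = [
  '## Create a route',
  'Description: Minimal router file',
  '(Source: https://docs.strapi.io/...#anchor)',
  'Language: JavaScript',
  'File path: src/routes/example.js',
  FENCE + 'js',
  'module.exports = {};',
  FENCE,
  '---',
  FENCE + 'bash',
  'yarn develop',
  FENCE,
].join('\n');

console.log(parseSection(sample).variants.length); // 2
```

The second variant in the sample has no `Language:` or `File path:` line, so the sketch takes its language from the fence and reports the path as `N/A`, mirroring the fallbacks the README describes.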
+## npm/yarn scripts
+
+Run from `docusaurus/`:
+
+- `yarn llms:generate-and-validate`
+  - Generate for all docs and validate (quiet output; no file existence checks)
+
+- `yarn llms:generate-verbose`
+  - Generate with `--verbose`; prints each skipped page and writes `static/llms-code-skip.log`
+
+- `yarn llms:generate-and-validate:verbose`
+  - Verbose generate (writes skip log) then validate (quiet)
+
+- `yarn validate:llms-code`
+  - Validate an existing `static/llms-code.txt` (quiet; no file existence checks)
+
+- `yarn validate:llms-code:strict`
+  - Validate with file existence checks (use only if paths point to a real project), plus anchor verification
+
+## Generator flags (`scripts/generate-llms-code.js`)
+
+- `--all` Scan all docs (restricted to `cms/` and `cloud/` trees)
+- `--include a,b` / `--exclude x,y` Filter discovered doc IDs by substring
+- `--anchors` Include section anchors in Source lines
+- `--line-numbers` Emit `Lines: start-end` for each variant
+- `--verbose` Print per-file skip messages; auto-writes `static/llms-code-skip.log`
+- `--log-file path` Custom path for the skip log
+- `--output path` Destination for generated text (use `-` for stdout)
+
+Notes:
+- Discovery intentionally excludes `snippets/` and other non-doc trees.
+- When file path cannot be inferred it is emitted as `N/A`.
+
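Reviewer sketch (not part of the commit): the `--include` / `--exclude` flags are described as substring filters over discovered doc IDs. One plausible reading is sketched below. It assumes that an ID is kept when it matches any include term (or when no include terms are given) and matches no exclude term. The helper names `parseList` and `filterDocIds` and the sample IDs are hypothetical, and the actual generator may combine the filters differently.

```javascript
// Hypothetical helper: turn a comma-separated flag value such as "a,b" into ['a', 'b'].
function parseList(value) {
  return (value || '')
    .split(',')
    .map((s) => s.trim())
    .filter(Boolean);
}

// Hypothetical helper: keep IDs that match any include substring (or all IDs when
// the include list is empty) and that match none of the exclude substrings.
// This any-match/none-match semantics is an assumption, not confirmed by the source.
function filterDocIds(docIds, { include = [], exclude = [] } = {}) {
  return docIds.filter((id) => {
    const included = include.length === 0 || include.some((s) => id.includes(s));
    const excluded = exclude.some((s) => id.includes(s));
    return included && !excluded;
  });
}

// Illustrative doc IDs only.
const docIds = ['cms/api/rest', 'cms/admin-panel', 'cloud/getting-started'];
console.log(filterDocIds(docIds, { include: parseList('cms'), exclude: parseList('admin') }));
// [ 'cms/api/rest' ]
```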
+## Validator flags (`scripts/validate-llms-code.js`)
+
+- `--path path` Input file (use `/dev/stdin` to validate from a pipe)
+- `--strict` Exit with non-zero on any errors (warnings do not fail)
+- `--verify-anchors` Check that section anchors exist in the source doc
+- `--check-files` Check referenced files exist (use with `--project-root ..` when appropriate)
+- `--project-root dir` Base path for file checks and anchor lookups
+- `--report json|text` Output diagnostics as JSON or text (default)
+
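Reviewer sketch (not part of the commit): the `--strict` description implies a simple exit-code policy in which errors fail the run and warnings never do. A minimal illustration follows. The diagnostics shape (`{ severity, message }`) and the function name `exitCodeFor` are hypothetical, and exiting with 0 when `--strict` is absent is an assumption of this sketch rather than something the README states.

```javascript
// Hypothetical diagnostics shape: { severity: 'error' | 'warning', message: string }.
function exitCodeFor(diagnostics, { strict = false } = {}) {
  const hasErrors = diagnostics.some((d) => d.severity === 'error');
  // Per the README, only errors fail a --strict run; warnings do not.
  // Returning 0 without --strict is an assumption of this sketch.
  return strict && hasErrors ? 1 : 0;
}

// Illustrative diagnostics only.
const diagnostics = [
  { severity: 'warning', message: 'anchor not found' },
  { severity: 'error', message: 'unknown language label' },
];
console.log(exitCodeFor(diagnostics, { strict: true })); // 1
```

In a CLI wrapper the returned value would typically be assigned to `process.exitCode`, which lets pending output flush before the process ends.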
+Heuristics and niceties:
+- Language aliases normalized (e.g., `js` → `JavaScript`, `ts` → `TypeScript`, `sh` treated as `Bash`, `graphql`, `html`, `dotenv`, `text` recognized)
+- Fence-first blocks (without a `Language:` line) are accepted by inferring language from the fence
+- Section Description and Source are optional and do not fail validation
+- File path line may be `File path:` or legacy `File:`; missing is treated as `N/A`
+
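Reviewer sketch (not part of the commit): alias normalization of the kind listed above usually reduces to a lookup table keyed on the lowercased label. The sketch below assumes the canonical spellings mentioned in the README and commit message (`JavaScript`, `TypeScript`, `Bash`, `GraphQL`, `HTML`, `DOTENV`, `TEXT`). The names `LANGUAGE_ALIASES`, `normalizeLanguage`, and `sameLanguage` are hypothetical, and the validator's real table may differ.

```javascript
// Hypothetical alias table; canonical spellings are taken from the README and commit message.
const LANGUAGE_ALIASES = new Map([
  ['js', 'JavaScript'],
  ['javascript', 'JavaScript'],
  ['ts', 'TypeScript'],
  ['typescript', 'TypeScript'],
  ['sh', 'Bash'],
  ['bash', 'Bash'],
  ['graphql', 'GraphQL'],
  ['html', 'HTML'],
  ['dotenv', 'DOTENV'],
  ['text', 'TEXT'],
]);

// Map a label or fence info string to its canonical name; unknown labels pass through unchanged.
function normalizeLanguage(label) {
  const key = String(label || '').trim().toLowerCase();
  return LANGUAGE_ALIASES.get(key) || label;
}

// Compare a declared "Language:" value with the fence info string after normalization.
function sameLanguage(declared, fenceInfo) {
  return normalizeLanguage(declared) === normalizeLanguage(fenceInfo);
}

console.log(normalizeLanguage('js'));          // JavaScript
console.log(sameLanguage('TypeScript', 'ts')); // true
```

A `Map` is used instead of a plain object so that labels such as `constructor` cannot accidentally resolve to inherited properties.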
+## Examples
+
+- Generate + validate (quiet):
+  ```
+  yarn llms:generate-and-validate
+  ```
+
+- Verbose generate + validate, with skip log:
+  ```
+  yarn llms:generate-and-validate:verbose
+  ```
+
+- Validate a streamed output without writing a file:
+  ```
+  node scripts/generate-llms-code.js --anchors --all --output - \
+    | node scripts/validate-llms-code.js --path /dev/stdin --strict --verify-anchors --project-root ..
+  ```
+
