@@ -21,7 +21,7 @@ Extract the transcript from Youtube video to use it for RAG context and other po

  * fully qualified name: `vac:bi:rag:2025q4-rag-context-code`
  * owner: nickninov
- * status: in progress (5 %)
+ * status: in progress (35 %)
  * start-date: 2025/10/01
  * end-date: 2025/12/31

@@ -32,14 +32,17 @@ https://github.com/status-im/data-docs/issues/82
  Schedule note: Dates reflect quarter bounds; update when actual timing is known.
  #### Deliverables

- * Add task to dagster ETL to include code repository to the RAG context
- * Write documentation in Data-docs.
+ - Updated the RAG upload prefix logic so freshly ingested chunks no longer collide, and backfilled the newest data through the pipeline.
+ - Patched the Quadrant HTTPS ingestion bug and added monitoring for data freshness as the Quadrant DB grows.
+ - Expanded the sources dashboard with chunk counts to make it easier to audit what has been loaded.
+ - Add task to dagster ETL to include code repository to the RAG context
+ - Write documentation in Data-docs.

  ### Google Meeting transcript

  * fully qualified name: `vac:bi:rag:2025q4-rag-context-google-meet`
  * owner: nickninov
- * status: in progress (5 %)
+ * status: in progress (20 %)
  * start-date: 2025/10/01
  * end-date: 2025/12/31

@@ -51,5 +54,6 @@ https://github.com/status-im/data-docs/issues/68
  Schedule note: Dates reflect quarter bounds; update when actual timing is known.
  #### Deliverables

- * Add task to dagster ETL to include meeting transcript to the RAG context.
- * Write documentation in Data-docs.
+ - Debugged the meeting transcript ingestion (YouTube metadata & evaluation pipeline) and documented the outstanding edge cases.
+ - Add task to dagster ETL to include meeting transcript to the RAG context.
+ - Write documentation in Data-docs.