- **PostgreSQL:** Uses `$1`, `$2`, etc. placeholders

**Files Updated:**

- [src/db/repository.ts](./src/db/repository.ts) - All query methods now detect database type and use appropriate placeholders
- [src/db/schema.ts](./src/db/schema.ts) - Added `metadata` column to `processing_log` table
- [steps/classify-flow/complete-classify.step.ts](./steps/classify-flow/complete-classify.step.ts) - Fixed old `db.prepare()` calls, converted booleans to integers, JSON-stringified arrays/objects
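
The placeholder handling described above might look roughly like the following. This is a minimal sketch, not the code in `repository.ts`: the `Dialect` type, the `toDialectSql` helper, and the sample query are all hypothetical.

```typescript
// Hypothetical helper; names are illustrative and not taken from repository.ts.
type Dialect = "sqlite" | "postgres";

/** Rewrites `?` placeholders to `$1`, `$2`, ... for PostgreSQL; SQLite keeps `?`. */
function toDialectSql(sql: string, dialect: Dialect): string {
  if (dialect === "sqlite") return sql;
  let index = 0;
  // Naive scan: assumes no literal `?` appears inside quoted strings in the query.
  return sql.replace(/\?/g, () => `$${++index}`);
}

// Sample query for illustration only; the column names are assumptions.
const query = "UPDATE classifications SET stage = ? WHERE indicator_id = ?";
console.log(toDialectSql(query, "sqlite"));
// → UPDATE classifications SET stage = ? WHERE indicator_id = ?
console.log(toDialectSql(query, "postgres"));
// → UPDATE classifications SET stage = $1 WHERE indicator_id = $2
```

Writing queries once with `?` and converting at the boundary keeps a single query string per method, at the cost of the naive-scan assumption noted in the comment.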
**Problem:** SQLite only accepts primitives (numbers, strings, bigints, buffers, null) but we were trying to bind JavaScript booleans and objects.

**Solution:**

- **Booleans → Integers:** All boolean fields (`is_cumulative`, `is_currency_denominated`, `boolean_review_passed`) now convert to 0/1
- **Objects/Arrays → JSON Strings:** Fields like `boolean_review_fields_wrong` and `final_review_corrections` are JSON-stringified before saving

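
The two conversions above can be sketched as a single normalizing function applied to every bound parameter. This is an assumption about shape only: `toSqliteValue` and `SqliteValue` are hypothetical names, and the actual conversion in `complete-classify.step.ts` may be written field by field.

```typescript
// Hypothetical helper; the real conversion code may differ.
type SqliteValue = number | string | bigint | Uint8Array | null;

function toSqliteValue(value: unknown): SqliteValue {
  if (value === null || value === undefined) return null;
  // Booleans are stored as 0/1 integers.
  if (typeof value === "boolean") return value ? 1 : 0;
  if (
    typeof value === "number" ||
    typeof value === "string" ||
    typeof value === "bigint" ||
    value instanceof Uint8Array
  ) {
    return value;
  }
  // Arrays and plain objects are stored as JSON text.
  return JSON.stringify(value);
}

const row = {
  is_cumulative: true,
  boolean_review_passed: false,
  boolean_review_fields_wrong: ["is_cumulative"],
};
// params holds 1, 0, and the string '["is_cumulative"]'
const params = Object.values(row).map(toSqliteValue);
console.log(params);
```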
**Problem:** The `processing_log` table was missing the `metadata` column that had been added to PostgreSQL.

**Solution:**

```bash
sqlite3 ./data/classify-workflow-local-dev.db "ALTER TABLE processing_log ADD COLUMN metadata TEXT;"
```
@@ -40,6 +43,7 @@ sqlite3 ./data/classify-workflow-local-dev.db "ALTER TABLE processing_log ADD CO
**Solution:** Reduced to **1 concurrent batch** (5 indicators at a time):

**Files Updated:**

- [scripts/run-random.ts](./scripts/run-random.ts) - `concurrentBatches = 1`
- [scripts/run-all.ts](./scripts/run-all.ts) - `concurrentBatches = 1`

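
For readers unfamiliar with the setting, here is one way a `concurrentBatches` limit could be applied. It is a sketch under assumptions: `chunk`, `runBatches`, and the `processBatch` callback are hypothetical and do not come from `run-random.ts` or `run-all.ts`.

```typescript
// Illustrative only; not the code in scripts/run-random.ts or scripts/run-all.ts.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

async function runBatches<T>(
  items: T[],
  batchSize: number,
  concurrentBatches: number,
  processBatch: (batch: T[]) => Promise<void>,
): Promise<void> {
  const batches = chunk(items, batchSize);
  // Start `concurrentBatches` batches together and wait for all of them
  // before starting the next group. With concurrentBatches = 1 this is sequential.
  for (let i = 0; i < batches.length; i += concurrentBatches) {
    const group = batches.slice(i, i + concurrentBatches);
    await Promise.all(group.map((batch) => processBatch(batch)));
  }
}
```

With `batchSize = 5` and `concurrentBatches = 1`, at most five indicators are in flight at any moment.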
@@ -48,6 +52,7 @@ sqlite3 ./data/classify-workflow-local-dev.db "ALTER TABLE processing_log ADD CO
**Problem:** When indicators got stuck, the script would wait forever with no feedback.

**Solution:** Added smart progress detection:

- Detects when no progress has been made for 20 seconds
- Shows which indicators are stuck and at which stage
- Shows error messages for failed indicators
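
The checks listed above could be sketched as two small functions. The `IndicatorState` shape, its field names, and both function names are assumptions made for illustration; the script's real data structures are not shown here.

```typescript
// Hypothetical shapes; field names are assumptions, not the script's actual types.
interface IndicatorState {
  id: string;
  stage: string;
  status: "pending" | "completed" | "failed";
  errorMessage?: string;
}

const STALL_THRESHOLD_MS = 20 * 1000; // 20 seconds without progress

/** True when nothing has completed for at least the threshold. */
function isStalled(lastProgressAt: number, now: number): boolean {
  return now - lastProgressAt >= STALL_THRESHOLD_MS;
}

/** One line per unfinished indicator: its stage, plus the error if it failed. */
function describeStall(states: IndicatorState[]): string[] {
  return states
    .filter((s) => s.status !== "completed")
    .map((s) =>
      s.status === "failed"
        ? `${s.id} failed at ${s.stage}: ${s.errorMessage ?? "unknown error"}`
        : `${s.id} stuck at ${s.stage}`
    );
}
```

A polling loop would record `lastProgressAt` whenever the completed count changes and print `describeStall(...)` once `isStalled(...)` returns true.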
@@ -56,6 +61,7 @@ sqlite3 ./data/classify-workflow-local-dev.db "ALTER TABLE processing_log ADD CO
## Final Configuration

### Environment Variables (`.env`)

```bash
# SQLite Database
CLASSIFY_DB=sqlite
@@ -67,20 +73,23 @@ INTERNAL_EMIT_DELAY_MS=500 # 500ms delay between batches
```

### Script Defaults

```typescript
const batchSize = 5; // 5 indicators per batch
const concurrentBatches = 1; // 1 batch at a time
// Total concurrency: 5 indicators × ~6 LLM stages = ~30 API calls max
```

## Performance Characteristics

### Before

- **Configuration:** 4 batches × 5 indicators = 20 concurrent
- **LLM Calls:** ~120 concurrent API calls
- **Result:** Rate limiting, stuck indicators, incomplete batches

### After

- **Configuration:** 1 batch × 5 indicators = 5 concurrent
- **LLM Calls:** ~30 concurrent API calls max
- **Result:** Stable processing, all indicators complete, no rate limits
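
The before/after figures follow from the same multiplication. The snippet below only restates that estimate; `maxConcurrentCalls` is a hypothetical name, and the stage count of 6 is the approximate figure quoted in the script defaults.

```typescript
// Approximate stage count, taken from the "~6 LLM stages" note in the script defaults.
const LLM_STAGES = 6;

/** Upper-bound estimate: every in-flight indicator counted once per LLM stage. */
function maxConcurrentCalls(concurrentBatches: number, batchSize: number): number {
  return concurrentBatches * batchSize * LLM_STAGES;
}

console.log(maxConcurrentCalls(4, 5)); // → 120 (before)
console.log(maxConcurrentCalls(1, 5)); // → 30 (after)
```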
@@ -114,13 +123,15 @@ Next batch of 5 starts
## Database Compatibility

### SQLite (Local Development)

- ✅ Proper `?` placeholders
- ✅ Boolean values as 0/1 integers
- ✅ JSON fields as TEXT
- ✅ WAL mode enabled for concurrency
- ✅ All schema columns present

### PostgreSQL (Production)

- ✅ Proper `$1, $2` placeholders
- ✅ Boolean values as BOOLEAN type
- ✅ JSON fields as JSONB
@@ -130,11 +141,13 @@ Next batch of 5 starts
## Usage

### Run 50 Random Indicators

```bash
deno task run:random -- -50 openai
```

### Expected Output

```
🚀 Processing 50 indicators in 10 batches of 5...
   Provider: openai
@@ -152,19 +165,22 @@ deno task run:random -- -50 openai
```

### Average Timing

- **Per indicator:** ~25-35 seconds (with OpenAI GPT-4.1-mini)
- **Per batch (5 indicators):** ~30-45 seconds
- **50 indicators total:** ~5-8 minutes

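
The total above is consistent with the per-batch figure when batches run one at a time. A rough check, assuming the 500 ms `INTERNAL_EMIT_DELAY_MS` is the only gap between batches (`estimateRunMinutes` is a hypothetical helper, not part of the scripts):

```typescript
/** Rough wall-clock estimate for sequential batches, in minutes. */
function estimateRunMinutes(
  totalIndicators: number,
  batchSize: number,
  secondsPerBatch: number,
  delayMs = 500, // assumed to match INTERNAL_EMIT_DELAY_MS
): number {
  const batches = Math.ceil(totalIndicators / batchSize);
  const seconds = batches * secondsPerBatch + (batches - 1) * (delayMs / 1000);
  return seconds / 60;
}

console.log(estimateRunMinutes(50, 5, 30).toFixed(1)); // → 5.1
console.log(estimateRunMinutes(50, 5, 45).toFixed(1)); // → 7.6
```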
## Monitoring

### Check Completion Status

```bash
sqlite3 ./data/classify-workflow-local-dev.db \
  "SELECT COUNT(*) FROM classifications;"
```

### Check Recent Activity

```bash
sqlite3 ./data/classify-workflow-local-dev.db \
  "SELECT stage, status, COUNT(*) as count
@@ -174,6 +190,7 @@ sqlite3 ./data/classify-workflow-local-dev.db \
```

### Check Failed Indicators

```bash
sqlite3 ./data/classify-workflow-local-dev.db \
  "SELECT indicator_id, stage, error_message
@@ -210,17 +227,20 @@ These are non-critical and logged as warnings. The pipeline will continue proces
## Next Steps

### To Increase Throughput (if no rate limits)

1. Increase `concurrentBatches` to 2
2. Monitor for stuck indicators
3. Adjust based on API performance

### To Switch to PostgreSQL

1. Set `POSTGRES_URL` environment variable
2. Remove or comment out `CLASSIFY_DB=sqlite`
3. Run migrations: `deno task migrate`
4. Restart dev server

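
Steps 1 and 2 suggest that `CLASSIFY_DB=sqlite` takes precedence over `POSTGRES_URL`. A sketch of selection logic consistent with that reading follows; the precedence is an assumption, and `resolveDatabase` and `DbConfig` are hypothetical names rather than the contents of `src/db/client.ts`.

```typescript
// Hypothetical selection logic; the real rules in src/db/client.ts may differ.
type DbConfig = { kind: "sqlite" } | { kind: "postgres"; url: string };

function resolveDatabase(env: Record<string, string | undefined>): DbConfig {
  // Assumed precedence: an explicit CLASSIFY_DB=sqlite wins over POSTGRES_URL,
  // which would explain why step 2 removes it.
  if (env["CLASSIFY_DB"] === "sqlite") return { kind: "sqlite" };
  const url = env["POSTGRES_URL"];
  if (url) return { kind: "postgres", url };
  throw new Error("No database configured: set CLASSIFY_DB=sqlite or POSTGRES_URL");
}

console.log(resolveDatabase({ CLASSIFY_DB: "sqlite", POSTGRES_URL: "postgres://localhost/db" }).kind);
// → sqlite
console.log(resolveDatabase({ POSTGRES_URL: "postgres://localhost/db" }).kind);
// → postgres
```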
### To Use Local LLM (Free)

1. Install LM Studio
2. Load a model (e.g., Mistral 7B)
3. Set environment: `LLM_PROVIDER=local`
@@ -229,22 +249,27 @@ These are non-critical and logged as warnings. The pipeline will continue proces
## Files Changed

### Database Layer

- ✅ [src/db/repository.ts](./src/db/repository.ts)
- ✅ [src/db/schema.ts](./src/db/schema.ts)
- ✅ [src/db/client.ts](./src/db/client.ts)

### Workflow Steps

- ✅ [steps/classify-flow/complete-classify.step.ts](./steps/classify-flow/complete-classify.step.ts)

### Scripts

- ✅ [scripts/run-random.ts](./scripts/run-random.ts)
- ✅ [scripts/run-all.ts](./scripts/run-all.ts)

### Configuration

- ✅ [.env](./.env)
- ✅ [.env.example](./.env.example)

### Documentation

- ✅ [BATCHING.md](./BATCHING.md)
- ✅ [examples/README.md](./examples/README.md)
- ✅ [examples/parallel-batches.ts](./examples/parallel-batches.ts)