Original Reddit post

Hey guys, I’m working on OCR for files that contain tables, and I want to extract the actual table data. The problem is that every file has a different table layout/order, so the output gets messy, but it’s correct and I think it’s okay to work with. I also don’t want to use a vision model because inference speed is really important for me.

Right now I’m feeding the LLM … raw OCR text output, then asking it to extract the items from the tables. But because the column order changes between files, the model keeps mixing up the columns/items. I’ve already tried tweaking the prompt a LOT, but I’m still getting inconsistent results. I’m currently using Qwen 2.5.

Speed matters a lot for this project, so I’m looking for advice on:

- Better/faster models for this use case (Arabic support is important)
- Better approaches for table extraction from raw OCR text
- Any preprocessing tricks or parsing methods before sending data to the LLM
- Whether I should abandon pure-text OCR parsing and use another lightweight method

Would really appreciate any recommendations or experiences with similar problems.

Originally posted by u/East-Educator3019 on r/ArtificialInteligence