I plotted the Expected Calibration Error (ECE) for an LLM (Gemini 2.5 Pro) forecasting 30 different real-world time-series targets over 38 days (using the https://huggingface.co/datasets/louidev/glassballai dataset). Confidence was elicited by prompting the model to return a probability between 0 and 1 alongside each forecast. ECE measures the average gap between predicted confidence and actual accuracy across confidence levels; lower values indicate better calibration, with 0 being perfect. The result: the LLM's self-reported confidence is wildly inconsistent depending on the target. ECE ranges from 0.078 (BKNG) to 0.297 (KHC) across structurally similar tasks using the same model and prompt.
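
For reference, here's a minimal sketch of the standard binned ECE computation. The 10 equal-width bins are an assumption (the post doesn't say how the bins were chosen), and `correct` is assumed to be a 0/1 array marking whether each forecast turned out right:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |bin accuracy - bin confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        # Include the right edge of the last bin so confidence 1.0 is counted.
        if hi < 1.0:
            mask = (confidences >= lo) & (confidences < hi)
        else:
            mask = (confidences >= lo) & (confidences <= hi)
        if mask.any():
            bin_conf = confidences[mask].mean()  # average stated confidence
            bin_acc = correct[mask].mean()       # empirical accuracy
            ece += mask.mean() * abs(bin_acc - bin_conf)
    return ece
```

A perfectly calibrated forecaster would have bin accuracy match bin confidence everywhere, giving ECE = 0; the per-target spread reported above (0.078 to 0.297) comes from running this kind of computation separately on each target's forecasts.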
Originally posted by u/aufgeblobt on r/ArtificialInteligence
