Original Reddit post

Hi everyone, I’m working on a fairly ambitious but well-defined project and I’m looking for someone experienced with LLMs / AI pipelines to help build it.

The idea

I want to convert 400+ hours of YouTube content (trading education from a single expert) into a structured, logically ordered “course/book”. The goal is to:

  • preserve nuance and reasoning
  • reconstruct the author’s decision-making process
  • turn scattered videos into a coherent learning system

What the system needs to do

Input:

  • YouTube playlists (≈ 418 hours total)
  • transcripts (I can provide them manually or via a pipeline)

Processing (core of the project):

A multi-step LLM pipeline, roughly:

Chunking

  • split transcripts into manageable segments

Extraction (no loss)

  • extract ALL ideas without summarizing

Structuring

  • group by themes (market structure, risk, etc.)

Educational rewrite

  • convert into clean, readable learning material
  • preserve nuance (no generic AI fluff)

Nuance + sanity checks

  • detect:
      • overgeneralizations
      • “motivational” nonsense
      • unsupported claims

Deduplication

  • cluster similar content (lots of repetition across videos)

Final output

  • structured lessons (Notion or similar)
  • readable like a course, not notes
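
To make the non-LLM parts of this concrete, here is a rough Python sketch of the chunking and deduplication steps only. Everything in it is an assumption for illustration: the names (`Chunk`, `chunk_transcript`, `cluster_near_duplicates`), the window sizes, and the similarity threshold are hypothetical, and the word-overlap (Jaccard) comparison is a simple stand-in for whatever similarity measure the final pipeline would use, such as embeddings. The extraction, structuring, rewrite, and sanity-check steps would be LLM calls and are not shown.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    video_id: str  # hypothetical identifier for the source video
    index: int     # position of the chunk within that video
    text: str


def chunk_transcript(video_id, transcript, max_words=300, overlap=50):
    """Split a transcript into overlapping word windows.

    The overlap keeps a line of reasoning that straddles a boundary
    visible in both neighbouring chunks.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = transcript.split()
    step = max_words - overlap
    chunks = []
    for i, start in enumerate(range(0, len(words), step)):
        chunks.append(Chunk(video_id, i, " ".join(words[start:start + max_words])))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the transcript
    return chunks


def shingles(text, n=3):
    """Lowercased word n-grams used as a cheap similarity fingerprint."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if len(tokens) < n:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Share of n-grams two fingerprints have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_near_duplicates(texts, threshold=0.5):
    """Greedy clustering: each text joins the first cluster whose
    representative (its first member) is similar enough, else starts a new one."""
    clusters = []  # list of (representative_shingles, member_indices)
    for idx, text in enumerate(texts):
        fingerprint = shingles(text)
        for representative, members in clusters:
            if jaccard(fingerprint, representative) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((fingerprint, [idx]))
    return [members for _, members in clusters]
```

Keeping `video_id` and `index` on every chunk means each extracted idea can be traced back to its source video. Lexical overlap only catches near-verbatim repetition, so the same idea explained in different words across videos would need a semantic comparison instead.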

Originally posted by u/Marginala on r/ArtificialInteligence