TL;DR: I had 2,207 GoPro videos from my cycling journey, and I needed to rewatch them to find the interesting moments. I built a project that indexes them locally on my M1 Max using open-source ML models, lets me search for those moments, and sends the best clips straight to my DaVinci Resolve timeline. I indexed 628 videos (668.68 GB, 15h 13m 18s of footage).

I’m using local ML models because open-source models keep getting better, and you can get good results with them:

- Transcription: OpenAI Whisper
- Face recognition: DeepFace (https://github.com/serengil/deepface) with RetinaFace as the face detector and VGG-Face as the recognition model
- Object detection: Ultralytics YOLO
- Scene description: Qwen2.5-VL
- On-screen text: easyocr

The project is source-available: https://github.com/iliashad/edit-mind

Full article: https://iliashaddad.com/blog/i-indexed-669-gb-of-my-gopro-videos-using-my-m1-max-computer
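
For readers wondering how the outputs of several models can end up in one searchable index, here is a minimal sketch. It is not taken from edit-mind: the `Segment` schema, the `search` function, and the sample data are all hypothetical and made up for illustration. The only real library calls are `whisper.load_model` and `transcribe` from the openai-whisper package, imported lazily so the rest runs without it.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Annotations for one time span of one video (hypothetical schema)."""
    video: str
    start: float  # seconds from the start of the video
    end: float
    transcript: str = ""                             # speech-to-text output
    faces: set[str] = field(default_factory=set)     # recognized people
    objects: set[str] = field(default_factory=set)   # detected object labels
    description: str = ""                            # scene description
    ocr_text: str = ""                               # on-screen text


def whisper_segments(video: str, model_name: str = "base") -> list[Segment]:
    """Turn Whisper's timestamped transcript into Segment records.

    Needs the openai-whisper package and ffmpeg installed.
    """
    import whisper  # lazy import: optional dependency

    result = whisper.load_model(model_name).transcribe(video)
    return [
        Segment(video, s["start"], s["end"], transcript=s["text"].strip())
        for s in result["segments"]
    ]


def search(index, text=None, face=None, obj=None):
    """Return the segments that match every filter that was given."""
    hits = []
    for seg in index:
        haystack = " ".join((seg.transcript, seg.description, seg.ocr_text)).lower()
        if text and text.lower() not in haystack:
            continue
        if face and face not in seg.faces:
            continue
        if obj and obj not in seg.objects:
            continue
        hits.append(seg)
    return hits


# Made-up sample data standing in for real model output.
example_index = [
    Segment("ride_01.mp4", 12.0, 18.5, transcript="look at that climb",
            faces={"rider_a"}, objects={"bicycle", "person"}),
    Segment("ride_01.mp4", 40.0, 44.0,
            description="a dog runs beside the road", objects={"dog"}),
    Segment("ride_02.mp4", 5.0, 9.0, ocr_text="SUMMIT 2 KM", objects={"bicycle"}),
]

for seg in search(example_index, obj="bicycle"):
    print(seg.video, seg.start, seg.end)
```

Keeping the video name and the start and end times on every record is what would let a search hit be turned into a clip for an editing timeline. A real index over hundreds of hours would likely use a database or embeddings instead of a linear scan.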
Originally posted by u/IliasHad on r/ArtificialInteligence
