Original Reddit post

Hey all - I built a small LLM experiment called Mr. Chatterbox, a chatbot trained entirely on books published during the Victorian era (1837–1899). It was trained on a subset of the BL Books dataset, then fine-tuned on a mix of corpus and synthetic data. I used nanochat for the initial training and the supervised fine-tuning rounds.

SFT consisted of two rounds: one round of two epochs on a large dataset (over 40,000 pairs) of corpus material and synthetic data, and a smaller round focused on specific cases like handling modern greetings, goodbyes, attempted prompt injections, etc.

The model is about 340 million parameters, and so far it's quite good at discussing Victorian topics (like Darwin, the railroads, etc.) while staying in an authentic Victorian voice. As a relatively small model, it can get confused and it definitely has some limitations. To address them, I'm thinking of implementing direct preference optimization to keep improving the model.

Anyway, I would love to know if others here have experience with this kind of thing, and to hear your impressions of the model!
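For anyone curious what the DPO step would involve: the core of it is a simple pairwise loss computed from log-probabilities of a preferred ("chosen") and dispreferred ("rejected") response under the policy and a frozen reference model (here, the SFT checkpoint). This is a minimal sketch of that loss in plain Python - the function name and the example numbers are mine, not from any particular library:

```python
import math

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed token log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    beta controls how far the policy is allowed to drift from the
    reference; 0.1 is a commonly used default.
    """
    logits = beta * ((policy_chosen_lp - ref_chosen_lp)
                     - (policy_rejected_lp - ref_rejected_lp))
    # -log(sigmoid(logits)), written stably as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))

# If the policy already favors the chosen answer more strongly than
# the reference does, the loss is small:
low = dpo_loss(-10.0, -20.0, -12.0, -18.0)

# If it favors the rejected answer instead, the loss is larger,
# pushing probability mass toward the chosen response:
high = dpo_loss(-20.0, -10.0, -18.0, -12.0)
```

In practice you'd get these log-probs from two forward passes per pair (policy and reference) and average the loss over a batch; libraries like TRL wrap exactly this recipe.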

Originally posted by u/centerstate on r/ArtificialInteligence