Here’s a quick summary of their LLMs. Below, I’ve linked their most innovative and useful models for researchers and businesses.

EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes (32B)

Abstract: "This technical report introduces EXAONE 4.0, which integrates a Non-reasoning mode and a Reasoning mode to achieve both the excellent usability of EXAONE 3.5 and the advanced reasoning abilities of EXAONE Deep. To pave the way for the agentic AI era, EXAONE 4.0 incorporates essential features such as agentic tool use, and its multilingual capabilities are extended to support Spanish in addition to English and Korean. The EXAONE 4.0 model series consists of two sizes: a mid-size 32B model optimized for high performance, and a small-size 1.2B model designed for on-device applications. The EXAONE 4.0 demonstrates superior performance compared to open-weight models in its class and remains competitive even against frontier-class models. The models are publicly available for research purposes and can be easily downloaded via this https URL."

Links: HuggingFace; blog; GitHub; Company Page. Licensed for research and education only, I think.
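If you want to try the mode switching, here’s a minimal sketch with transformers. I’m assuming the chat template exposes an enable_thinking flag (the convention several recent hybrid-mode releases use) and that the repo id matches the naming on their HuggingFace page — check the model card before running.

```python
# Minimal sketch of EXAONE 4.0's reasoning/non-reasoning mode switch.
# Assumptions (verify against the model card): the chat template accepts
# an `enable_thinking` flag, and the repo id below is correct.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-4.0-32B"  # assumed repo id; see the HuggingFace link above
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

def ask(question: str, think: bool) -> str:
    """Run one turn in reasoning (think=True) or non-reasoning mode."""
    prompt = tok.apply_chat_template(
        [{"role": "user", "content": question}],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=think,  # assumed flag name; toggles the reasoning mode
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=1024)
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print(ask("What is 17 * 24?", think=False))  # direct answer
print(ask("What is 17 * 24?", think=True))   # answer preceded by a thinking trace
```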
Motif 2 12.7B technical report

Abstract: “We introduce Motif-2-12.7B, a new open-weight foundation model that pushes the efficiency frontier of large language models by combining architectural innovation with system-level optimization. Designed for scalable language understanding and robust instruction generalization under constrained compute budgets, Motif-2-12.7B builds upon Motif-2.6B with the integration of Grouped Differential Attention (GDA), which improves representational efficiency by disentangling signal and noise-control attention pathways. The model is pre-trained on 5.5 trillion tokens spanning diverse linguistic, mathematical, scientific, and programming domains using a curriculum-driven data scheduler that gradually changes the data composition ratio. The training system leverages the MuonClip optimizer alongside custom high-performance kernels, including fused PolyNorm activations and the Parallel Muon algorithm, yielding significant throughput and memory efficiency gains in large-scale distributed environments. Post-training employs a three-stage supervised fine-tuning pipeline that successively enhances general instruction adherence, compositional understanding, and linguistic precision. Motif-2-12.7B demonstrates competitive performance across diverse benchmarks, showing that thoughtful architectural scaling and optimized training design can rival the capabilities of much larger models.”

Links: HuggingFace; Company Page. Apache-licensed.
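For context on Grouped Differential Attention: differential attention computes two softmax attention maps and subtracts a scaled “noise-control” map from a “signal” map, and the grouped variant reportedly allocates unequal head counts to the two pathways, sharing each noise map across a group of signal heads. Here’s a toy sketch of that idea — the head counts, lambda, and grouping scheme are illustrative guesses, not Motif’s actual formulation.

```python
# Toy sketch of the idea behind Grouped Differential Attention (GDA).
# A scaled noise-control attention map is subtracted from a signal map;
# fewer noise heads than signal heads, with each noise map shared across
# a group. All hyperparameters here are illustrative, not Motif's.
import torch
import torch.nn.functional as F

def grouped_diff_attention(x, n_signal=6, n_noise=2, d_head=16, lam=0.5):
    B, T, D = x.shape
    proj = lambda h: torch.randn(D, h * d_head) / D**0.5  # stand-in for learned weights

    def heads(w, h):  # (B, T, D) -> (B, h, T, d_head)
        return (x @ w).view(B, T, h, d_head).transpose(1, 2)

    q1, k1 = heads(proj(n_signal), n_signal), heads(proj(n_signal), n_signal)
    q2, k2 = heads(proj(n_noise), n_noise), heads(proj(n_noise), n_noise)
    v = heads(proj(n_signal), n_signal)

    a_sig = F.softmax(q1 @ k1.transpose(-2, -1) / d_head**0.5, dim=-1)    # (B, n_signal, T, T)
    a_noise = F.softmax(q2 @ k2.transpose(-2, -1) / d_head**0.5, dim=-1)  # (B, n_noise, T, T)
    # Share each noise map across a group of signal heads (the "grouped" part).
    a_noise = a_noise.repeat_interleave(n_signal // n_noise, dim=1)
    out = (a_sig - lam * a_noise) @ v  # differential maps attend to values
    return out.transpose(1, 2).reshape(B, T, n_signal * d_head)

torch.manual_seed(0)
print(grouped_diff_attention(torch.randn(2, 8, 64)).shape)  # torch.Size([2, 8, 96])
```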
Solar Open Technical Report (102B/12B-Active/Q4 MoE)

Abstract: “We introduce Solar Open, a 102B-parameter bilingual Mixture-of-Experts language model for underserved languages. Solar Open demonstrates a systematic methodology for building competitive LLMs by addressing three interconnected challenges. First, to train effectively despite data scarcity for underserved languages, we synthesize 4.5T tokens of high-quality, domain-specific, and RL-oriented data. Second, we coordinate this data through a progressive curriculum jointly optimizing composition, quality thresholds, and domain coverage across 20 trillion tokens. Third, to enable reasoning capabilities through scalable RL, we apply our proposed framework SnapPO for efficient optimization. Across benchmarks in English and Korean, Solar Open achieves competitive performance, demonstrating the effectiveness of this methodology for underserved language AI development.”

Links: HuggingFace. Custom license.
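The “102B/12B-Active” naming means only a fraction of the parameters fire per token: a router sends each token to a few expert MLPs out of many, so compute scales with the active subset, not the total. Here’s a generic top-k MoE sketch to illustrate the concept — the expert count and k are made up, since the abstract doesn’t restate Solar Open’s routing config.

```python
# Generic top-k Mixture-of-Experts layer illustrating "total vs. active"
# parameters: each token is routed to k of n_experts expert MLPs, so only
# a fraction of the weights participate in any forward pass. Sizes are
# illustrative, not Solar Open's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)
        topv, topi = gates.topk(self.k, dim=-1)   # route each token to k experts
        topv = topv / topv.sum(-1, keepdim=True)  # renormalize the kept gate weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            tok, slot = (topi == e).nonzero(as_tuple=True)  # tokens that chose expert e
            if tok.numel():
                out[tok] += topv[tok, slot, None] * expert(x[tok])
        return out

moe = TopKMoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
total = sum(p.numel() for p in moe.experts.parameters())
print(f"expert params: {total}; active per token: ~{total * moe.k // len(moe.experts)}")
```

The last two lines show the same ratio the model name advertises: with 2 of 8 experts active, only about a quarter of the expert weights are used per token — the same way 102B total can run with roughly 12B active.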
Trillion Labs’ Tri-70B Intermediate Checkpoints

They are training models from scratch up to 70B. They got rejected for a government deal but did this on their own anyway. HuggingFace collections are here. Their models are Apache 2.0.

Others with Less Detail

HyperCLOVA (0.5B-32B): arXiv; Company Page; HuggingFace. License is custom with usage restrictions.
A.X 3.1 Light (7B): HuggingFace. License: Apache 2.0.
Upstage Solar Pro (32B): Pro 2 Company Page; Pro 3 Page; HuggingFace.

Originally posted by u/nickpsecurity on r/ArtificialInteligence