As countries around the world begin to invest seriously in their own models, there is a growing debate over whether the Philippines should create its own local large language model (LLM). If we do, would it be cost-effective? And if it is, would it be beneficial? As an early-career professional, I have heard these questions before, whether the conversation is about sovereignty, independence, or simply having something to be proud of. People in our industry keep asking, and more often than not, the realities of building one surface in the same breath.

First, it is expensive. Training alone would cost millions of pesos in compute, because of the sheer amount of GPU power required to train a large language model.

Second is the data problem. More often than not, current LLMs that can speak Tagalog do so in a way that does not match how Filipinos actually speak the language. They tend to sound more formal and more reserved. A fluent speaker can see through this in seconds. Yes, it is valid Tagalog, but it is not everyday conversational Tagalog. That matters because how we speak online and offline depends on the circumstances and on minute shifts in tone that LLMs cannot yet imitate. Why? Because these models are trained on public text from Wikipedia, textbooks, storybooks, and translations of religious texts. These sources all share that formal tone, and no amount of reinforcement after training can remove the issue completely. The models are taught to write that way.

Third is whether creating a local large language model is even practical. Should we even do it? Looking ahead, the battle is no longer just about building an LLM. It is about making it useful for daily tasks such as coding, writing, or customer service. An LLM that can speak “normal” Tagalog would not have much application beyond being a chatbot, a job at which current models are already good enough. That brings us back to practicality.
“Good enough” is a phrase we often end up accepting. It is not perfect. These models, as I have said, write in formal Tagalog. We cannot completely erase this side effect, but people tend to tolerate it once prompt reinforcement is added. You can reduce it by roughly 60–80% through reinforcement. It is cheap, practical, and, in the end, good enough, because the alternative is fine-tuning or full-blown training.

But despite those challenges, other people in IT and I still have that dream: that one day we can create our own local large language model, trained on the natural conversations we have on social media and in daily life. Maybe one day it will be cheap enough for us to try, not because it is financially or technically viable, but because we can.

Do not get me wrong. I am confident that we can pull it off. Our society is chronically online. There are plenty of vlogs out there, and plenty of organic conversations kept on the private servers of startups and enterprises alike. The only challenge that remains is whether we have the time to sift through this messy data, clean it, and brace our wallets for the training costs. In the end, we will not know the result until we have it.
Should the Philippines Make Its Own Local Large Language Model?














