Showing posts with label LLM. Show all posts

Sunday, January 26, 2025

Chinese Lab's AI LLM Performance Shocks Silicon Valley

A Chinese lab has sparked panic in Silicon Valley with the release of its first AI model that can outperform America's best despite being built more cheaply and with less powerful chips, according to US media reports. The lab, called DeepSeek, recently unveiled a free, open-source large language model (LLM) that it says took only two months and $5.5 million to build, using reduced-capability Nvidia chips called H800s. By comparison, US-based OpenAI's closed LLM cost $100 million to develop and train using Nvidia's most advanced H100 chips. Free, open-source DeepSeek models can significantly help developing nations like Pakistan by providing affordable access to the latest AI technology, allowing them to develop solutions tailored to their specific needs without high costs.



DeepSeek, a small startup lab in China, has accomplished this feat despite US technology export controls intended to slow down China's AI efforts. Former Google CEO Eric Schmidt now acknowledges that China has narrowed or closed the AI technology gap with the United States.

In 2022, America banned the export of advanced chips to China, according to The Economist magazine. Nvidia, a leading chipmaker, has had to design special downgraded versions of its products for the Chinese market. America has also sought to prevent China from developing the capacity to manufacture top-of-the-line chips at home by banning exports of the necessary equipment and threatening penalties for non-American firms that might help.

The slower H800 chip was created by Nvidia to comply with export regulations that prevent the chipmaker from selling its high-end GPUs to China. Apparently, the limits imposed by Washington on Chinese engineers' access to the most advanced Nvidia chips forced them to develop a much more efficient model to achieve the same performance as their US counterparts. Other Chinese tech companies, including Alibaba, Huawei and Tencent, are also working on multiple AI models of their own, including LLMs.

DeepSeek emerged from High-Flyer, a Chinese hedge fund started in 2015 by 40-year-old Liang Wenfeng to use AI to gain an edge in stock trading. Conducting fundamental research helped High-Flyer become one of the biggest quant funds in the country, according to The Economist magazine.


Thursday, November 7, 2024

Pakistan to Develop Urdu LLM for Generative AI

The National University of Sciences and Technology (NUST), the National Information Technology Board (NITB) and telecom network operator Jazz have signed a Memorandum of Understanding (MOU) to develop Pakistan’s first indigenous Large Language Model (LLM) with a focus on Urdu, including datasets for the Pashto and Punjabi languages. It is aimed at empowering individuals, businesses, and organizations with advanced AI tools in their native languages. The envisioned LLM is expected to drive innovation in Generative AI applications, boosting productivity and accessibility in critical sectors like healthcare, education, and agriculture.

GPT-4 Accuracy Scores. Source: The Economist


Generative AI tools such as ChatGPT are powered by large language models, or LLMs. These models need to be trained on vast amounts of data in specific languages to be useful. Unfortunately, Urdu accounts for less than 0.1% of the content on the Internet. This will present a challenge for the developers of Urdu LLMs.

Online Content of Various Languages. Source: W3Techs 


The lack of Urdu content available for training ChatGPT affects the accuracy of results for Urdu-language users. For example, GPT-4's accuracy score in question-answer tests in Urdu is just over 70%, compared with 85% in English, according to data from OpenAI. Other South Asian languages, including Hindi, Bengali, Punjabi, Marathi and Telugu, suffer from the same problem.

It's not just a South Asian problem. These challenges exist across the developing world. Non-European languages are generally poorly represented online. This is a major obstacle for non-European nations in developing their own generative artificial intelligence (AI) models, which rely on vast amounts of training data. Generative AI can also produce biased results due to a number of factors, including the data it's trained on, the algorithms used, and how it's deployed.

Without such models, the use of AI in developing nations such as Pakistan will remain limited to the small number of people proficient in English. Broadening the adoption of AI applications will require LLMs trained on local-language content. Failing to develop them could cost Pakistan the opportunity to take full advantage of the AI revolution.