TEK2day, YouTube & Google Search

Sep 23, 2024

TEK2day will continue to publish video content on this Substack page as well as on Twitter. I am not sure how much longer TEK2day will post to YouTube, as Google is not friendly to content creators.

3 Comments

Comment deleted

Sep 24, 2024

Comment deleted

Jonathan Maietta

Sep 24, 2024

I did see it; there are a couple more big data centers that ought to be announced soon. However, I'm skeptical that we will see a $100 billion frontier model in 2026 or 2027, as Anthropic's CEO posits. If OpenAI's forthcoming Orion is a step function up over the o1 model, we could see it. Even so, I don't think we see it (the $100 billion LLM) without Treasury writing a check, and I am not in favor of that. I don't believe that a single frontier model will achieve AGI in 3 years, as some say. I think AGI will be achieved by a series of frontier models and smaller models that form one or more clusters. I also believe that the models we will primarily use in the future will be smaller models trained on vertical data sets. In the meantime, the U.S. Treasury could waste trillions of dollars chasing AGI.

Comment deleted

Sep 24, 2024 (Edited)

Comment deleted

Jonathan Maietta

Sep 25, 2024

Related to our discussion about "small" language models, MSFT's GRIN MoE: https://tek2day.substack.com/p/microsofts-ai-reorg-and-grin-moe

Jonathan Maietta

Sep 25, 2024

The video is correct: the more data input, the more accurate the LLM, especially if we are talking about a frontier LLM and assuming the scaling laws continue to apply. When I say "small model" I am referring to one that is trained on a domain-specific data set. Perhaps I should have said "vertical-specific" model or "domain-specific" model. The experts building the frontier LLMs consistently say AGI will arrive in approximately 3 years, or 2-3 model generations from now (feels ambitious to me), but even if we see AGI in 3 years, that means $100 billion frontier models. My hunch is that something fundamental will evolve in the LLM training process between now and then. OpenAI using its o1 LLM to train Orion is an example of a new approach. o1's ability to memorize reasoning steps, as opposed to only the raw input data, is another. I can imagine an enormous cluster of models that roll up to a "mother" LLM, similar to a beehive, where various models have specific tasks. Certain models may be rich with domain-specific information, others may spend all of their time trying to improve the reasoning process, others may be dedicated to improving inference, while yet others may be tasked with real-time routing of energy to where it is most needed. All of this activity will be in service of the mother LLM. We may eventually be talking about hundreds or thousands of models working in concert, all feeding the mother LLM. That is not to say that some of these domain-specific or task-specific models cannot be finished products that are extended to users via APIs and applications.

Yup, there will be enormous operating leverage if the government chooses to deploy advanced automation. We ought to insist! Automate the RMV today ;)