AI. New Coke. Max Headroom. What do they all have in common? Max was the spokesperson of New Coke. A lot of people think Max was computer generated – a precursor to AI. And AI is left out in the cold, because it actually has nothing to do with any of the above.
Max Headroom was a mega star, okay, a star, in the 80s in the UK. Somehow, he ended up as the spokesperson of New Coke, the same Coke that turned out to be a major blunder and was removed from shelves in less than three weeks! Max looked like he was computer generated – alas, it was a guy in lots of make-up, including prosthetics, with a green screen behind him.
Thus, he wasn’t computer generated, nor a precursor to AI. Rather, he was just a guy who landed in the pot of gold playing an annoying character that made the cover of major publications. Yes, a star was born.
I decided to put this post together not to talk about Max Headroom and the actor’s travels thereafter, but rather to cover the latest around AI: from a few LLMs, to the carbon footprint (it’s bad), to some hype that hasn’t delivered, studies that might stun you, some surprises around Gen AI, and a variety of additional goodies.
As a promise – any jargon will be reduced to the best of my ability and explained with examples from everyday life, or at least in layperson’s terms. A second document – call it my Cheat Sheet to Gen AI – will be published in a couple of weeks and be available on a new site I have for, uh, downloading cool content and stuff.
Are you aware?
There is so much out there with AI that it would require someone to read a lot of publications a night just to stay current. They would call him a fool – and yes, I am such a fool.
The good news is that you don’t have to, because I will continue to break it down for you: how it can apply to L&D and Training at the minimum, and then to those pesky enterprises who dove headfirst into AI without realizing all those interesting nuances.
For some reason, there are people who think that I can only talk about the negatives, the cons of Gen AI, and that I lack deep understanding and insight into things such as various LLMs, RAG benefits (including what a RAG is), and so forth. For those fine folks, gear up, we are taking off (not on a Boeing though – Ha, I made a funny).
On a side note, there are vendors in our space that are using RAG (retrieval-augmented generation, where the LLM pulls in relevant content from your own sources before it answers). Depending on what you are planning to do within the system with Gen AI, that would make sense with the LLM or LLMs they have. Ditto for companies who might have a RAG too. You do not need a RAG, BTW.
Studies – It is the land of confusion
I am amazed, okay not really, at the number of “studies” that have come out around companies’ usage of AI: budgets, financial impact, usage and so forth. Some contradict each other.
The studies tend to reach out to large numbers of companies as a whole. The way to look at this is to recognize that relying solely on one study isn’t the best way to ascertain what is going on with AI.
Study A – Lucidworks – Based on responses from over 2,500 businesses located all over the world
- Implementation of AI – 25% have not deployed – they are in a holding pattern – stuck in beta – “like a plane circling an airport for 25 days or more.”
- Concerns cited by companies around AI in their businesses – hallucinations, data security and the high cost to run the platforms (“foundation” is the general term used instead of saying LLM, which, uh, is the foundation)
- 63% are planning to increase spending on AI in 2024; that is down from the 93% who said a year earlier that they planned to increase spending
- 36% plan to keep spending “flat” on AI
- And the kicker – 42% have yet to see any financial returns on AI adoption of products
Study B – Slack Workforce Lab
(Polled over 10,000 global desk workers: full-time (30 hrs or more per week), mostly in executive, senior, middle and junior management roles, who work with “data, analyze information or think creatively,” per Workforce Lab’s methodology.
In other words, “global desk workers” is a bit misleading – because it’s not Barnaby in the cube next to you, unless he is a manager or exec.)
- 23% usage increase with AI in the workplace since January 2024 (the study does not provide an end date); however, the increase of AI in the workplace since September 2023 is 60% (again, no info on what date this runs up to)
- 81% say AI has boosted their productivity and quality of work
- 93% say they have concerns about AI’s trustworthiness around work-related issues
The study presented additional information on employee engagement. Again, the results are misleading because it is not employees per se, it is specifically folks in management. Anyway, here it is:
- 18% increase in work-life balance
- 23% able to manage stress better
- 24% increase in happiness in the workplace (the study uses the term – overall satisfaction)
- 29% who state they feel very passionate about their work
- 73% of “desk workers” (again, that isn’t workers, that’s management) believe the hype of AI is warranted
- 55% of workers in the 18 to 29 age bracket are highly enthusiastic about AI and the automation to handle parts of their work; only 33% in the over-60 age bracket feel the same way
What the study doesn’t provide is how many of these folks are/were aware of hallucinations (and thus checking before just accepting the info) or of AI bias, just to name two big ones. Another odd thing about the survey is the 18 to 29 age bracket: how many of those folks are senior execs, or even at a minimum management level?
Speaking of execs and AI
I tend to find a lot of execs who have no idea about hallucinations – i.e. fake or false information – especially when it is only their content going into the LLM – hence private.
The assumption being that if it is only our content, there are no issues with fake or false information. That is 100% wrong.
Worse, if you decide to remove your content and untrain the LLM, then put new stuff or updates or whatever back in, you will still have remnants of the previous data in the LLM. – Surprise!
Study C – Accenture
Companies that “apply AI (Generative AI) expect to see higher revenue of 25% after five years.”
For me, I would expect higher revenue than 25% after five years, simply because AI is at a very early stage here, and it will only grow. Equally though, not all LLMs are the same, and those darn token fees will hit your bottom line, depending on the number of folks using it. Let’s say it is the text version only (like GPT-4o, aka GPT-4 Omni, which is free for X number of instances; the paid version, which would be for Enterprise, increases that number).
When GPT-5 rumbles out, again for Enterprise in this case, you will still have token fees.
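To make the token-fee point concrete, here is a rough back-of-the-napkin sketch in Python. Everything in it is an assumption for illustration: the prices are placeholders (not any vendor’s real rates), and the function name, the usage numbers and the 22 working days are made up. The only idea taken from above is that the bill scales with how many folks use the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical prices in USD per one million tokens (placeholders, not real rates)."""
    input_per_million: float
    output_per_million: float


def monthly_token_cost(
    pricing: TokenPricing,
    users: int,
    requests_per_user_per_day: int,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    working_days: int = 22,  # assumed number of working days in a month
) -> float:
    """Estimate a monthly bill: total requests times tokens per request times price."""
    requests = users * requests_per_user_per_day * working_days
    input_cost = requests * input_tokens_per_request / 1_000_000 * pricing.input_per_million
    output_cost = requests * output_tokens_per_request / 1_000_000 * pricing.output_per_million
    return input_cost + output_cost


# Made-up scenario: 500 people, 10 prompts a day each,
# roughly 800 tokens in and 400 tokens out per prompt.
pricing = TokenPricing(input_per_million=5.0, output_per_million=15.0)
cost = monthly_token_cost(pricing, users=500, requests_per_user_per_day=10,
                          input_tokens_per_request=800, output_tokens_per_request=400)
print(f"Estimated monthly token fees: ${cost:,.2f}")  # prints $1,100.00 for these inputs
```

Double the number of users and the estimate doubles with it, which is exactly why the headcount using the model matters to your bottom line.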
On the flip side, if you are using an LLM that is multimodal – which GPT-5 is expected to be, and there are others as well, not just OpenAI here – you can do far more than just enter text and get a retort in text.
As an Emeritus Professor of Economics from Oxford told me – “If you are a risk-averse company, I wouldn’t recommend using AI.”
As folks are aware, I often talk about the impact of job losses with AI (as it evolves). There are plenty of people out there who disagree, saying it either won’t be that bad, or it won’t be the case because there will be more jobs.
If you are in the financial services industry
Citigroup says 67% of banking jobs have “high potential” to change or be outsourced by AI. On the flip side though, AI could add 170 billion dollars to the profit “pool” in the banking sector by 2028.
The article adds that they expect no reduction in headcount – once AI related management hiring is identified and resolved.
One item I found of interest from the Citigroup report is their statement that the banking industry will be slow to adopt due to various factors including regulation and the lack of global rules (which is very big).
Study D – Booth School of Business (University of Chicago)
- GPT-4 does a better job than human analysts of analyzing financial data, including statements, and of making predictions based on those statements
- The researchers used “chain of thought” prompting, which directed GPT-4 to identify trends in the financial statements and calculate different financial ratios. With it, GPT-4 achieved a predictive accuracy of 60%, compared to humans, who are in the low 50% range
- Relating to financial acumen, trading based on GPT-4’s predictions was more profitable, with higher Sharpe ratios (and alpha), and beat the stock market.
Did you know?
Sam Altman, CEO of OpenAI, in a discussion around GPT-5, stated that GPT-4 is the “dumbest model any of you will ever have to use again.” Yes, he was comparing it to GPT-5, which is expected to be far superior, nevertheless his reference could have been something like, “GPT-4 was a solid model, but with 5 coming out you can expect superior results, and blah blah.” He didn’t.
The issues with 4 (which Enterprises were using, and maybe still are, without using Omni) included hallucinations and performance – which was awful.
Numerous people found that in the earlier part of the day, the performance and responses were fast; by the end of the day slow or even the “I can’t respond”.
How bad was it? A study found that the model went from 97.6 percent accuracy on one test in March 2023 to just 2.4 percent by June.
Another issue was the use of DALL-E 3, whose output went from something you could save as a .PNG to a WebP file – i.e. an image format built for the web, not a web page. The latter requires you to have software that can convert the file to PNG.
OpenAI was well aware of this early on, after it started to happen. On their own Developer forum, people were complaining about it (which made me feel better, because I thought it was something wrong with my computer). As a fun bonus, OpenAI still hasn’t fixed the issue.
The Foundational Model
There are two types of foundational models – the one most folks are aware of is the LLM – which means large language model. An LLM requires a lot of computing power, to such an extent that its carbon footprint is very close to the output of a coal plant.
On top of that, to cool down these computers you need water (a vital commodity in 2024 and beyond).
An SLM is a Small Language Model. An SLM will have a smaller set of parameters compared to an LLM, but it can run on a mobile device, even a laptop.
The mobile angle intrigues me the most: a vendor with a mobile app where folks can access content within the app.
Apple has an SLM in its iPhones, including the ability to use the SLM offline. The idea that a vendor in our industry, including learning technology, could use an SLM then seems possible. Phi-3 from Microsoft is just one SLM.
The plus of an SLM is that the privacy concerns are seriously reduced, and an SLM can match an LLM in select functions. Even OpenAI recognizes the opportunity for an SLM themselves.
Be aware though that even with an SLM you will still have hallucinations.
Benchmarks
Folks always, okay a lot of people, are wondering “how do they know which LLM is better at this, or performs better at that?” Hello, benchmarks.
Here is a short list of the top ones. Always remember though that they base their output on specific items – which they note – like solving a math problem, for example.
Equally remember just for an FYI – that every LLM, even SLM out there, will do things better or worse compared to another. For example, one LLM does a poor job with web links. Another has an issue with summaries.
Benchmarks to Check Out – Click on “Leaderboard” – Some require you to download files, and are complicated. AA, LiveBench, LMSys are easy to use and read.
- LiveBench
- AGIEval
- Hugging Face – Offers various Benchmarks that sit on Hugging Face
- LMSys, which uses Elo Ratings – Includes GPT-4o (latest LLM from OpenAI)
- Artificial Analysis – I’m a fan of this one, because it also shows the latest from OpenAI and Anthropic (3.5 Sonnet), matched up to others with performance as one item, fees as another. Best Benchmark out there, IMO.
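For the curious, here is roughly how an Elo rating moves when two models go head to head. This is a minimal sketch of the classic Elo formula (the one from chess), not necessarily the exact math any leaderboard runs today. The starting ratings and the K-factor of 32 are assumptions I picked for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32) -> tuple[float, float]:
    """Return new ratings after one match.

    score_a is 1.0 if A won, 0.0 if B won, 0.5 for a tie.
    k (assumed to be 32 here) controls how far a single result moves a rating.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two models start level at 1000; a voter prefers model A's answer.
model_a, model_b = update_elo(1000, 1000, score_a=1.0)
print(model_a, model_b)  # 1016.0 984.0
```

The takeaway: beating a model rated far above you moves your rating a lot, beating one rated far below you barely moves it. Repeat that over thousands of votes and you get a ranking.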
Autonomous Agents
This is the GOLD. I wrote a year ago, that autonomous agents were the way to go, and the game changer. Well, read any articles these days around auto agents and you will notice they all point to it being a game changer and its powerful impact.
To reduce the number of times I say Autonomous agents, I will refer to them as either “auto agent” or “agent”.
An agent is far more than a copilot. Agents can make decisions and are self-directed (i.e. no human input), using chain of thought.
They work from an objective, whatever that objective happens to be, within limited constraints. For example, with pre-defined APIs (if you were using APIs), auto agents could make those calls for you and collect the data.
On the other hand, if you just provided an objective, the agent can take over and do a lot: creating the tasks, remembering them, changing them, looping, building upon said tasks and completing (solving) that objective.
We are talking about self-directed behavior with intelligence. Think of it this way: they think for themselves (some would say a scary thought) – independence without any ongoing input from you.
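If you like to see things in code, the loop described above (take an objective, create tasks, remember results, add or change tasks, keep going until done) can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any vendor’s agent actually works: `run_agent`, `toy_plan` and `toy_execute` are hypothetical names, and the planner and executor are simple stand-ins for where an LLM or API call would go.

```python
from collections import deque
from typing import Callable

Memory = list[tuple[str, str]]  # (task, result) pairs the agent remembers


def run_agent(
    objective: str,
    plan: Callable[[str], list[str]],
    execute: Callable[[str, Memory], tuple[str, list[str]]],
    max_steps: int = 10,  # safety cap so the loop cannot run forever
) -> Memory:
    """Work through tasks for an objective with no human input along the way."""
    tasks = deque(plan(objective))   # create the initial tasks
    memory: Memory = []
    steps = 0
    while tasks and steps < max_steps:
        task = tasks.popleft()
        result, follow_ups = execute(task, memory)  # do the task, using what it remembers
        memory.append((task, result))               # remember the outcome
        tasks.extend(follow_ups)                    # build upon it with new tasks
        steps += 1
    return memory


# Stand-ins for the "thinking" parts; a real agent would call an LLM or APIs here.
def toy_plan(objective: str) -> list[str]:
    return [f"outline {objective}", f"draft quiz for {objective}"]


def toy_execute(task: str, memory: Memory) -> tuple[str, list[str]]:
    follow_ups = [f"review {task}"] if task.startswith("outline") else []
    return f"done: {task}", follow_ups


history = run_agent("onboarding course", toy_plan, toy_execute)
for task, result in history:
    print(task, "->", result)
```

Notice the `max_steps` cap. An agent that creates its own tasks can also loop forever, so some kind of limit is a sensible design choice.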
From a learning and training angle the capabilities of an agent are endless depending on the objective(s) set forth.
That holds on a learning system regardless of whether it is being used for onboarding, for employees in general and whatever tasks or things they have to do, for customer training, partner training or B2B, for associations, heck even for students in higher ed and K-12.
Is there any vendor working on having agents in their system? Yes. One vendor (name withheld) is working on an agent as we speak.
They expect a launch later this summer or early fall. As an added bonus, when it launches, I will be writing a product review on the agent’s capabilities along with all the other items they have today in their LMS, and what is coming soon.
What is cool about this vendor is that today they are using AI for skills mapping to content. I have always believed this should be a no-brainer for any vendor, and yet automatic skill mapping with the use of AI is not an everyday capability among vendors.
Bottom Line
You have plenty to choose from, whether you go with:
- Perplexity (I no longer recommend it)
- Claude from Anthropic – Claude 3.5 Sonnet beats GPT-4 Omni (according to Anthropic)
- Gemini from Google (underwhelmed, work in progress)
- Mistral Codestral (receiving a lot of capital investment – the highest seed round in Europe in 2023)
- Llama 3 from Meta, Falcon 2 11B, or AutoCoder Assistant (which beat GPT-4 Omni); these three are 100% open source, no token fees
- GPT-4 Omni
- Nemotron-4 340B from Nvidia
- SLMs such as Apple Intelligence SLM, Microsoft’s Phi-3 series, Google Gemma or Gemini Nano (underwhelming)
And the list goes on.
Perhaps you want to test a few – the majority have a paid subscription option, usually $20 (USD). I like You.com, which has the latest models, albeit they add their own model with it. Nevertheless, I found the output to be strong performance-wise, and it is a plus to be able to check out the newest models as well as older ones.
Maybe though you want a Model as a Service (MaaS) instead. These are cloud-based platforms that contain a lot of LLMs that you can pick and choose from. They offer a lot of pluses for folks compared to say just going with an LLM.
There are quite a few out there – I’m a fan of Bedrock from AWS; others like Google Model Garden. Be aware that with Model as a Service, each of them, regardless of the provider, starts with machine learning. For example, with Google it’s Vertex AI (machine learning), which you need first.
If you are a vendor, I strongly recommend that you have at least two LLMs, rather than one.
Multiple LLMs can bring their individual pluses to your system, reduce hallucinations (but they are not gone – i.e. no 100% removal here) and offer some serious advantages. There are vendors out there who have multiple LLMs. Heck, one vendor uses three.
The point of this post is to give a wide perspective on AI, from various LLMs to SLMs to MaaS to latest studies, autonomous agents, benchmarks and even the top three learning system vendors using AI.
The list?
- CYPHER Learning
- Cornerstone Learning Management – Part of Galaxy (And all the items/mods that it comes with at no additional charge – to see the other items, click on Platform, then look under Cornerstone Learn)
- Thought Industries (A system for customer training/learning/education – whatever term you prefer)
As for me? Well, FindAnLMS, my platform, will launch its AI version in 2025.
See you then.
Wait, see me on my site – yep, it has AI too – anyone up for synthetic audio with zero latency?
Get ready for an exciting time. AI here we come.
Learning and Training Leading the Pack.
HR?
Uh, let me get back to you.
With an autonomous agent.
E-Learning 24/7
Other AI Insight
Craig’s Gen AI Group on LinkedIn – Join and get the latest from me!