
Why do two people use the same AI model and get completely different results? One produces clarity and insight; the other, noise and frustration. Why do some think technology is coming to take all our jobs while others think it is garbage?
After two years of researching and practicing building various web app tools with large language models (LLMs), I began to see the answer wasn’t just in the technology — or even in the user’s mindset. It was deeper than that. Some tasks lend themselves to fuller automation, but most can only be meaningfully augmented. The difference depends on how people and AI models collaborate with this new technology.
Over time, I started to see a few clear patterns in that collaboration — what I now call the five levels of AI proficiency. These levels represent how individuals and organizations engage with the tools, from casual experimentation to structured integration. Roughly 95% of users remain at what I call Level 1 and Level 2 — typing casual questions into a chat box and getting varied results. A much smaller group, operating at Levels 3 and 4 and 5, achieves consistent, high-quality outcomes by treating AI as a process to be improved, not a magic box to be queried.
As a lean thinker, this variation felt familiar. Two operators can run the same equipment — one produces quality parts, the other defects. The problem isn’t just the machine. It’s the materials, the tools, the methods, and the entire process. The same is true with AI. Tools alone don’t guarantee good results; capability, standards, and feedback loops matter just as much.
When I first began experimenting with LLMs my results were somewhat inconsistent. Some expert practitioners I observed were generating genuinely insightful analyses, while others such as myself — often using the same model, producing what we’d call in lean terms “waste.”
Same technology. Significantly different outcomes. So, I decided to invest more time in learning about the LLMs and how they operate. In lean, we don’t just blame the equipment or the operator. We investigate the “Why.” We study the process, the training, and the method. Sort of like “going to gemba” except this was all digital learning. I dove into learning with books, papers, free courses from Stanford and Harvard, online communities, and endless experiments. And over time I started to see some patterns
To help explain the situation I came up with the following framework. The intent is to outline five levels of AI proficiency and how most users interact with LLMs and what you can expect in terms of output and quality. Here is my main takeaway — the method by which you use the model dictates your outcome and your opinion on the technology.
Level 1: Raw Dialogue Box (≈95% of Users)
This group uses LLMs, such as OpenAI, Gemini, or Claude, by typing casual questions into ChatGPT or similar tools. The responses are sometimes helpful, often generic, occasionally hallucinated. The user has no control over the model’s behavior and is not impressed by the outcome. They might give it a 70-80% score and put it down. The user in this case naturally walks away thinking “AI is overhyped.” It’s still like using Google, Wikipedia, or Reddit. Some nuggets are in there, but you must dig for it.
Level 2: Basic Prompt Engineering
At this level, the user has discovered that the quality of the model’s output depends heavily on the precision and clarity of the input — how questions are asked and instructions are framed truly matters. There are specific terms for this called prompt engineering and context engineering. Providing necessary context, assigning roles, structuring requests, and giving examples to the model dramatically improves results. A “system prompt” might consist of two pages of text for example. Suddenly the perceived results from the model increase to the 90% acceptable range or higher. Users here start steering the model rather than being at its mercy. Many stop here — and for casual use, that’s fine.
Level 3: Advanced Settings and Parameters
At this level the user starts to interact with the model through some basic lines of code and specific interfaces. In other words, you’re leaving the chat box interface behind. You are on the OpenAI AP Platform or Google Vertex AI sites. Here you experiment in coding environments or specialized tools, adjusting model parameters for creativity, consistency, or focus. Suddenly the model responses start getting very good and often change your opinion on the whole topic of AI. In other words, you have learned how to treat the model like a controllable system, not a black box. This level represents PDCA applied to AI use, and, if you experiment with the settings under the hood, you start getting responses that are higher quality, more insightful, and amazing in many cases. The problem, however, is that only about 5% of the public have any idea how to achieve this level. It took me close to a year to figure it out. The learning curve is still too steep for the casual user of Level 1 or Level 2.
Level 4: Custom Applications with RAG and Tools
At this level, you’re building your own AI tools. You design specialized interfaces, integrate specific domain knowledge, and create retrieval-augmented generation (RAG) databases for context. Coding skills help here but often the AI can explain how to hook things up — that is how I figured out my own applications. In this realm there are very specific tools like n8n or MCP servers that can also expand what’s possible. The results in my experience are very impressive but are also very domain-specific and, hence, tough to disseminate. These applications can automatically assist with research, coding, planning, documenting, and editing, and often produce amazing results far beyond what I assumed possible. However, the caveat is that only 1% of the population (or less) can interact with the models at this level. It is still too hard to navigate and only work on what I consider to be very niche tasks.
Level 5: Agentic Systems and Fine-Tuning
At the fifth level of the framework, models can be fine-tuned by end users to get more impressive results. And it is possible to chain “agents” together to complete certain tasks start to finish. Here the focus shifts to orchestration — multi-agent systems, fine-tuning, and autonomous workflows. While enterprise solutions are emerging in this area, they are extremely fragile, limited, and very task-specific. There is a company in Japan called Sakana AI that has succeeded in having agents conduct all the steps of the scientific method, for example, and produce journal-level articles. The most dramatic examples at tech conferences seem to still all involve coding, media generation, and tasks I would lump mainly in with sales and marketing flows. But this area is likely to develop quickly over the next few years.
At Level 1, users are often disappointed as results vary wildly, leading many to dismiss AI as just hype. At Level 2, users see some of the potential of LLM technology. At Levels 3 and 4, users control model behavior, integrating real data and producing consistent high-quality results. At Level 5, people are orchestrating systems of work across some sections of an administrative value stream. Same technology. Different methods. Different outcomes.
This unfortunate reality explains why some insist that AI is “slop” while others quietly build useful, high-impact niche systems. It’s like buying a new CNC machine and expecting it to produce perfect parts without setup or training. It just does not happen. Those very few users at Levels 3–5 have invested in learning, experimentation, and iteration — and discovered what’s possible beyond surface interaction. It is also the difference between frustration and capability, between waste and value.
I am in the camp that thinks the narrow AI tools (large language models) are overhyped in the short term but probably undervalued in the long term. The technology will be as impactful as the internet, websites, and Google Search were decades ago. I personally don’t think there is an impending overnight elimination of 40% of white-collar jobs as hyped in many news articles. Those claims seem to come from those who profit from selling the services of the models. I fully expect the models to get stronger, the interfaces to get easier, and the barriers to learning to get lower. The models will work their way into all language-related tasks over time and “augment” our abilities in profound ways. Humans + AI > Problems.
Things are going to change as technology matures, but I think these five pieces of advice will hold up in the long term.
1. Investment in Learning Pays Dividends
Real value requires real learning. Like any language, a phrasebook gets you tourist results, grammar and practice lead to fluency. I’m not a software engineer — I built car engines at Toyota, not web apps. Yet I built a 30,000-line web app with secure authentication, databases, and APIs in a couple of months. I learned by asking AI to be my coach. The models themselves can teach you. But you must put in the time.
2. Start Small, Not Large
Don’t always begin with an enterprise-AI-strategy-level problem. Pick one narrow, real problem that annoys you and build something small with AI to solve it. When getting started, you will learn more by doing small projects than large ones. I started by creating a simple one-page drag-and-drop analysis tool, then a 5-Why Root Cause Coach to help people strengthen their thinking. Lean teaches us to start with value, not technology. One problem, one tool, one experiment.
3. Apply Lean Thinking to AI Work
The AI world is going to eventually wrestle with the problem that language alone does not equal value-added work. More computing power isn’t always the answer. Smarter, simpler design often is. Start with the problem, not the tool. Ask:
• What value am I creating?
• What problem am I solving?
•Am I reducing waste or creating digital clutter?
Purpose before technology. Value before features. Go to the gemba — check with end users. Ensure what you build adds value, not complexity.
4. You’ll Need Help — But Less Than You Think
To reach Levels 3–4 of my framework, guidance helps. But you may need less outside support than you imagine. I learned roughly 90% of what I needed by using AI itself as my instructor — defining problems, diagnosing errors, and iterating solutions. I think the same is going to happen with a lot of skills at work. AI will augment our tasks but not fully replace them. That’s amplification in action: AI making people more capable, not replacing them.
5. Plan to Rebuild Everything Multiple Times — PDCA
When it comes to learning and projects involving AI, kaizen is going to be your ally. All of your first versions will be rough. Accept that fact and plan for it. Use PDCA — Plan, Do, Check, Act. Every cycle reveals something new. Most of my current tools are on their sixth or seventh or eighth iteration. Each rebuild taught critical lessons. Expect learning, not perfection, and your journey will be smoother.
Two years ago, I was a skeptic at Level 1 and not optimistic about AI’s potential in the real world. In hindsight, I was a bit like the poor golfer blaming my clubs for my poor results. Over time I built my way up the levels to get closer to shooting par and, overall, more optimistic about the potential of LLMs in lean and work in general. I think that AI can be a powerful lever for democratizing knowledge while also creating certain products easier, faster, and better.
The five levels outlined here form a roadmap: where most people are stuck (Level 1), where improvement begins (Level 2), and where consistent value creation emerges (Levels 3, 4 and 5). You don’t have to reach Level 5 to benefit. Simply moving from Level 1 to Level 2 can improve your results. But understanding the landscape helps you see what’s possible. We’ve got some knowledge development projects underway at LEI that we hope to share at the 2026 Lean Summit in Houston.
Just as lean teaches that process matters as much as tools, AI reminds us that skill and method matter as much as technology.
Humans + AI > Problems.
The tools are ready. The question is: are we ready to learn to use them well?
Start climbing. Start small. Start today.