Auto-GPT and the Future of Work With Large Language Models (LLMs)

July 11th, 2023

Although Generative AI has been evolving rapidly for some time now, its potential operational impact for most of us in the business community only began to raise eyebrows earlier this year with OpenAI’s introduction of ChatGPT. Members of the general population have been impressed, intrigued, bewildered, and terrified by what likely constitutes a redefinition of the “Modern Age.” As natural risk takers and visionaries, entrepreneurs - old and new alike - sense opportunity for radical product and business model innovation. With investment in AI innovation ultra high right now and likely to remain so, many business owners are asking questions like: “Do we have the technical prowess to understand and actually leverage this technology?” “Is it safe and secure?” “Do we know where this is going? I mean, who is the market leader going to ultimately be? OpenAI? Google? Microsoft? Meta? Someone else entirely?” “Is this going to be regulated in a sensible manner?” “Is this technology going to replace people in the workforce and maybe me too?” This post centers on the last question, and the short answer is “yes,” it is going to replace people, and you, to some extent - but not necessarily in ways that you should fear or avoid.

Before concentrating on AI-enabled tools that are designed to replace people – and you – I’ll take a moment to share my personal thoughts on some of the other questions posed above. If you’re not interested and want to jump ahead to the main topic, please feel free to skip these musings and go straight to Auto-GPT.

I’ve been an entrepreneur and technologist for nearly 30 years now. In that time, I’ve seen the same questions come up before, and although none of us have seen anything quite so profound as ChatGPT, the years have shaped how I view its arrival. Let me start by saying that ChatGPT, and Large Language Models (LLMs) like it, are not self-aware. They work on sound computing principles, and as such they are “tools.” Humanity has been using tools for eons. The ability to master tools is part of what defines us. Tools can be used to perpetuate good or bad, but they are tools, nonetheless.

When I began my career as a software engineer, we were transitioning to robust frameworks to accelerate work. Why write code that others have already engineered? Not long after, I witnessed - and participated in - the debate between licensed software and open-source packages in software product development. All of the same questions were present then as well. Initially, I feared pursuing open source for product development because it was out of my comfort zone, but being an entrepreneur first and a technologist second, I realized that the market was clearly taking us in the open-source direction, so I shifted my position to embrace it. Later, the same questions applied to evaluating on-premises versus cloud-based storage and computing. Again, I initially had my reservations, but economics made pursuing cloud-based product innovation a clear winner. And now, these questions apply to AI. Whether you like it or not, economics will drive its full adoption, separating by a very wide margin the winners who embrace it from the losers who avoid it. These questions are relevant for business leaders, so here are some brief responses:

“Do we have the technical prowess to understand and actually leverage this technology?”

This is an emerging technology. Very few people, if any, fully understand it. Fortunately, it isn’t difficult to use without much knowledge of the mechanics “under the hood,” so to speak. These models, by definition, are intended for “natural language.” That said, much has been written about the need to coax the best response by “engineering” the best natural language “prompt” possible. However, as these models become more powerful, their ability to interpret a natural language request and give a better answer is going to improve as well. Likewise, the same models are beginning to be used to “generate” their own prompts for the best possible response. In other words, where there has been a gold rush to cash in on the emerging field of “Prompt Engineering,” it is possible that this gold rush will prove short-lived. Leveraging these tools is, and will increasingly become, less about having technical prowess and more about contemplating the universe of use-cases that can reshape your business. Ideation is likely a better use of time and effort than trying to grasp the mechanics of how LLMs work.

“Is it safe and secure?”

The short answer to this is, “it depends.” Shortly after it gained popularity, many organizations that must comply with PII requirements began to block access to ChatGPT. The concern is real. Apart from the question of “will AI ultimately become self-aware and kill us all,” the question of data security is likely more immediate and relevant. Unless an LLM is self-hosted within your own technology stack, whatever you send it goes into someone else’s technology stack. Again, all the same computing principles apply. Not many small to medium-sized enterprises can afford the investment in storage and compute requirements that have “historically” been associated with the training and hosting of an LLM. However, this is changing. Self-hosting an LLM, or hosting with a trusted provider, is becoming more feasible with conventional computing power. Within a private stack, data is managed according to your own standards and with roughly the same technical considerations that you’re already accustomed to.

“Do we know where this is going? I mean, who is the market leader going to ultimately be? OpenAI? Google? Microsoft? Meta? Someone else entirely?”

Who knows, really - except that we know this technology isn’t going away. What I can suggest confidently mirrors a Google memo leaked not long after OpenAI released ChatGPT to the public. The memo was Google’s self-assessment in the wake of ChatGPT’s introduction, and it argued that, while Google was clearly behind OpenAI, both companies face the same threat: neither has a “moat” to enjoy. That assessment followed the leak and subsequent open-source availability of Meta’s LLM, LLaMA. It is very likely that open-source communities will provide the kind of practical innovation, for both general and smaller, more specific language models, that businesses can adopt affordably for use in their own private technology stacks. That’s likely great for innovation and perhaps not so great for large incumbents.

“Is this going to be regulated in a sensible manner?”

It is going to be regulated. Even OpenAI CEO Sam Altman has acknowledged the need for it. Will it be regulated in a “sensible” manner? That remains to be seen. As an entrepreneur and technologist, I would define “sensible” as regulation that mitigates the risk of undesirable decisions being made and acted upon by AI, and that penalizes those who use the technology in unethical ways. I personally hope that regulation does not extend to stifle business innovation - particularly innovation that centers on delivering superior product experiences in the most cost-effective manner possible.

Auto-GPT

So that brings us to the last question: “Is this technology going to replace people in the workforce and maybe me too?” If our goal is to leverage AI for “business innovation that centers on delivering superior product experiences in the most cost-effective manner possible,” then I think that answering this question in the affirmative is actually very intriguing and kind of exciting.

What is Auto-GPT?

Auto-GPT is a community-based open-source project that seeks to use the reasoning power of an LLM like ChatGPT and make its suggestions immediately “actionable” in an “automated” manner. If you’ve worked with ChatGPT at all, then you know that it responds to your questions with answers. It’s quickly become commonplace to ask ChatGPT how to accomplish a task. ChatGPT will respond accordingly with instructions, advice, recommendations, code, etc. ChatGPT builds a context and improves its responses as it gains a better understanding of your needs. Once you’ve received a desired “suggestion,” you might be totally satisfied to carry out the suggested work yourself, but what if you didn’t have to? What if the work could be carried out immediately by a digital agent formulated by the reasoning power of the LLM – an agent that is competent and capable of completing the task for you - autonomously?

Think of ChatGPT as a brain and Auto-GPT as the nervous system that connects the tissues, fingers, and toes capable of accomplishing the work suggested. Our own human brain has evolved over time to relegate repetitive tasks to the subconscious mind to automate them. We call these habits. It is this automation that frees higher brain function to solve still greater problems and to achieve greater success as a species. If you consider AI as an opportunity to expand on this truth and accelerate this evolution, then AI can free us from our daily tasks so that we can solve bigger problems and achieve greater success. Basically, it follows that once you learn “how” to do something, the next desire is to do it as efficiently as possible. “Automation” follows “Suggestion.” This is in part the goal of Artificial General Intelligence (AGI), which is why you see some of these current pursuits labeled under the topic of AGI and Baby AGI. We are in the very, very early stages of rising to AGI’s true definition of fully autonomous, artificial beings, capable of making and acting out the same decisions that a human would. Auto-GPT is “not” this, but it might quickly evolve into a very useful tool that derives some of the same benefits as digital assistants that would otherwise require a human being.

Auto-GPT is relatively easy to set up if you have even a basic understanding of software development, or if you know someone who is at least a beginner. It is currently accessed via a Command Line Interface and begins with a simple prompt: I want Auto-GPT to “fill in the blank.” Auto-GPT will then interface with ChatGPT via OpenAI’s API, for which you must obtain a key. ChatGPT responds with how it “thinks” the work should be accomplished, and a plan is formulated and suggested for doing the work. Once the definition of the work and the plan are confirmed or refined through your feedback, Auto-GPT will try to execute the work. This work might include searching the Internet, creating files, accessing APIs to get or post data, and so on. As part of its planning, Auto-GPT can even write the code necessary to accomplish specific work that ChatGPT has suggested. Auto-GPT has built-in functions for many common tasks. It also has a number of built-in plugins that can accomplish more complex tasks like searching Google, creating images, converting text to speech, etc. You can also write custom plugins, considered during task planning, that interface with the APIs and SaaS platforms of your choice to automate just about anything you can imagine. While ChatGPT’s knowledge is limited to the date of its last training, it can learn from current information that you provide in prompts. Likewise, Auto-GPT automatically crafts prompts to feed ChatGPT the current information it needs to make decisions that are relevant today. And it learns how to accomplish the work better as you provide feedback.
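The plan-then-act cycle described above can be sketched in a few lines of Python. This is an illustrative toy, not Auto-GPT’s actual protocol: the command names, the JSON reply format, and the stubbed-out LLM call are all assumptions standing in for the real ChatGPT round-trips.

```python
import json

def stub_llm(goal, history):
    """Stand-in for a ChatGPT API call: returns a JSON 'thought' plus a 'command' to run.
    A real agent would send the goal, the history, and the available commands as a prompt."""
    if not history:
        return json.dumps({"thought": "Research the goal first",
                           "command": "search", "args": {"query": goal}})
    return json.dumps({"thought": "Enough information gathered",
                       "command": "finish", "args": {}})

# Registry of actions the agent is allowed to take (hypothetical examples)
COMMANDS = {
    "search": lambda args: f"results for {args['query']}",
    "finish": lambda args: None,
}

def run_agent(goal, max_steps=5):
    """Loop: ask the LLM for the next step, execute it, record the result, repeat."""
    history = []
    for _ in range(max_steps):
        reply = json.loads(stub_llm(goal, history))
        result = COMMANDS[reply["command"]](reply["args"])
        if reply["command"] == "finish":
            return history
        history.append((reply["command"], result))  # feeds the next prompt's context
    return history
```

The loop is the whole trick: each iteration feeds the accumulated results back to the model, which is also why context (and cost) grows with every step.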

More hype than reality?

There seems to be a rather wide gap between people’s positive and negative experiences with Auto-GPT. Some have successfully used it to automate a range of tasks that they would normally do themselves or would pay an employee to do. Others evaluate it and are completely unimpressed. Will this gap shrink with time? Probably. I think it depends on the lens you’re viewing it through. As a tech entrepreneur, I’ve been writing software for 30 years and have kept my technical skills current; I see it through that lens. I’ve directly experienced some of the challenges with Auto-GPT that have led others to abandon it. Someone who approaches Auto-GPT expecting to accomplish work with a single prompt, and without writing any code, is likely to be very disappointed unless the task is relatively simple. And I think that is part of the problem: most people find Auto-GPT because they are truly enthusiastic about the idea of automating their work and retiring to a hammock, or of reducing payroll by automating the work of their employees. However, those kinds of human tasks are typically very complex and require some social interaction to complete. Is it impossible to automate such tasks? I don’t believe so, but it requires seeing the technology through the right lens, exercising patience as it evolves, and going to some effort to expand Auto-GPT’s capabilities by writing custom plugins in Python. Here are some common complaints and alternative views:

“It just gets stuck in a loop and never actually accomplishes the work.”

I encountered this myself. Remember that Auto-GPT is brokering a lot of interaction with ChatGPT for suggestions on how to accomplish the work, just as we would in a personal chat with ChatGPT. When I write direct prompts for ChatGPT, I often must engage in a dynamic dialogue that eventually - but not immediately - gets the answer I want: a dialogue that builds a context of understanding between me and the model. If I ask ChatGPT for a code recommendation, the suggested code might not initially work at all. I will refine my prompt within the context of the dialogue to eventually get the working code I want. As a human being and subject matter expert, I can probably do a better job of building this context to get the right, “working” answer sooner than Auto-GPT can by itself. Auto-GPT anticipates this and has a built-in ability to accept direct, human “feedback.” This feedback helps build a better context and exit the common, seemingly endless loops that occur if you just let it run without intervening. Likewise, most people who tried Auto-GPT prior to July 2023 only had access to the GPT-3.5 model, which is much less capable than GPT-4. The Auto-GPT community is also learning from these experiences and has begun deprecating some problematic commands that are better suited to more specific, functional plugins. For these reasons, I’m less disappointed than others might be with this behavior.

“Auto-GPT only talks to ChatGPT and does not interface with other, emerging LLMs.”

This is true today, but unlikely to remain a limitation. The configuration file in Auto-GPT speaks to LLM configuration in the plural sense. It is very likely that models will be integrated as they become available, potentially reducing cost and improving effectiveness for specific use-cases. As we are currently seeing a fall-off in the use of ChatGPT in favor of models that are less restricted and censored, the Auto-GPT community is likely to accelerate efforts to connect with and use other LLMs. This also might address some security concerns if you can imagine the ability to connect with a privately hosted LLM within your own technology stack.
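As an illustration of that plural sense, Auto-GPT’s environment configuration already distinguishes between a “smart” and a “fast” model. The sketch below paraphrases its .env template as of mid-2023; the variable names may change between releases, so treat this as illustrative rather than definitive.

```shell
# Excerpt-style sketch of an Auto-GPT .env (mid-2023; names may change)
OPENAI_API_KEY=your-key-here
SMART_LLM_MODEL=gpt-4            # model used for reasoning-heavy planning steps
FAST_LLM_MODEL=gpt-3.5-turbo     # cheaper model used for simpler steps
```

Pointing these settings at other providers, or at a privately hosted model, is the obvious next step the community is working toward.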

“Auto-GPT racks up costs quickly.”

Because context is maintained with ChatGPT by resending conversation history with each API call (and paying for those tokens every time), it can be rather pricey to use Auto-GPT for task automation. If you connect to other APIs to complete the work, additional costs and rate limitations might apply. However, when you consider the cost of your own time and/or the paid time of employees, this paid use may not be as concerning. It depends. There are efforts to use cheaper, open-source LLMs for brain power, and there are standards being proposed for better separating the reasoning model from the work that could reduce the paid tokens required to automate a task. This is evolving quickly and likely to improve. In the meantime, you can set cumulative price limits to offset the risk of spending more than you care to wager.
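A back-of-the-envelope estimate makes the cost dynamic concrete. The per-1K-token prices below reflect OpenAI’s published mid-2023 rates for the 8K-context GPT-4 and for GPT-3.5 Turbo; prices change often, so treat them as placeholders to be replaced with current figures.

```python
# (prompt, completion) dollars per 1K tokens - mid-2023 placeholder rates
PRICES = {
    "gpt-4": (0.03, 0.06),
    "gpt-3.5-turbo": (0.0015, 0.002),
}

def run_cost(model, prompt_tokens, completion_tokens):
    """Estimate the API cost of an agent run from total token counts."""
    prompt_rate, completion_rate = PRICES[model]
    return (prompt_tokens / 1000 * prompt_rate
            + completion_tokens / 1000 * completion_rate)

# A 50-step run that resends ~2,000 tokens of context and gets ~500 back each step:
# round(run_cost("gpt-4", 50 * 2000, 50 * 500), 2)  -> 4.5 (dollars)
```

Because the whole context is resent each step, prompt tokens dominate - which is exactly why long, looping runs get expensive and why separating reasoning from rote work matters.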

“My work requires use of specific tools that Auto-GPT can’t access.”

This is true. While Auto-GPT might be able to research and emulate an expert in a given job role, it cannot carry out specialized work without access to the tools that a person would actually use. And, while it might be able to obtain complete instructions for “how” to use a commercially available tool as an expert, it likely will be unable to act on that expertise without a custom Auto-GPT plugin written for that tool via its published API - or sending clicks and keystrokes, etc. Auto-GPT has some built-in plugins, e.g., using DuckDuckGo for Internet research. It has some additional, published plugins to connect to other search engines, APIs, etc. Third parties are beginning to write plugins for their own SaaS platforms. And Auto-GPT has established a template for creating a custom plugin that interfaces with whatever you want, but you must write your own code. Again, as a software developer myself, I probably have more enthusiasm because the possibilities to automate work become limitless with some additional code writing, and it isn’t hard to imagine a large marketplace of plugins emerging.
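At its core, a custom tool plugin boils down to registering a named command, with a description the LLM can plan against, backed by a function that calls your tool’s API. The sketch below is a hypothetical illustration of that pattern, not Auto-GPT’s actual plugin interface; the command name, the CRM example, and the registry are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Command:
    name: str
    description: str          # what the planning LLM sees when choosing actions
    run: Callable[..., str]   # the code that actually does the work

def crm_lookup(customer_id: str) -> str:
    # A real plugin would call your SaaS platform's REST API here,
    # with authentication, error handling, and rate limiting.
    return f"record for {customer_id}"

REGISTRY: dict[str, Command] = {}

def register(cmd: Command) -> None:
    REGISTRY[cmd.name] = cmd

register(Command("crm_lookup", "Fetch a customer record by ID", crm_lookup))
```

The description string does the heavy lifting: it is what lets the model discover that your tool exists and decide when to invoke it, so writing it clearly matters as much as the code.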

“I have lots of tasks that require coordination and completion by a team of workers, operating independently, concurrently, and cooperatively.”

While I can absolutely identify individual tasks and contemplate how to do them better and faster with Auto-GPT even today, creating a “team” of AI agents working together to automate an entire business process seems an order of magnitude above that goal. Auto-GPT operates in a synchronous manner to process a list of subtasks that complete a job. But what if you could create an orchestration of Auto-GPT instances that are each formulated to be experts at completing jobs independently, concurrently, and cooperatively with other AI agents to produce the best overall outcome? What if you could simulate an entire “team?” This is actually a question that my own 17-year-old son, Gabriel, has been intrigued by and experimenting with on his own. In his own words, here are some of his experiences “teaming” Auto-GPT instances:

Per Gabriel Turner:

  • I wanted to create a team of Auto-GPT agents, each acting like a person that specializes in finding and sharing information from different perspectives and to synthesize their responses into a consensus.
  • Since Auto-GPT is not designed for this directly, spinning up multiple instances posed some unique challenges:
    • Coordinating the agents, much like a manager would.
    • Handling the timing of when commands are dispatched.
    • Sharing the same results among the agents as they perform work towards a common goal.
  • It became apparent that providing feedback across the team, “as work was being done,” had a significant impact.
  • It also became apparent that agents were getting better at performing tasks with repetition, which is encouraging because it mimics the effects of a learning curve - potentially lowering time and cost.
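The “teaming” experiments above can be sketched with ordinary Python concurrency: several specialist agents work the same goal in parallel, post results to a shared board, and a “manager” step synthesizes a consensus. The agents here are stub functions standing in for real Auto-GPT instances, and the perspective names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def make_specialist(perspective):
    """Build an agent that tackles the goal from one perspective."""
    def agent(goal, board):
        # A real specialist would run its own Auto-GPT instance here.
        finding = f"{perspective} view of {goal}"
        board.append((perspective, finding))  # share results with the team
        return finding
    return agent

def run_team(goal, perspectives):
    board = []  # shared results visible to every agent as work is done
    with ThreadPoolExecutor(max_workers=len(perspectives)) as pool:
        futures = [pool.submit(make_specialist(p), goal, board)
                   for p in perspectives]
        results = [f.result() for f in futures]
    # "Manager" step: synthesize a consensus from all findings
    return " | ".join(sorted(results))

# run_team("pricing strategy", ["finance", "legal", "marketing"])
```

The hard parts Gabriel ran into - coordination, dispatch timing, and sharing results mid-flight - live in the shared board and the manager step, which is where a real orchestration layer would need most of its engineering.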

The Future of Work

Using LLMs to automate human work, as Auto-GPT seeks to do, is ambitious but likely inevitable. It is experimental, yet early examples of full automation can already be found on YouTube: responding to emails, addressing good and bad Google reviews, creating and posting blog content, finding and summarizing research, identifying missed savings and pursuing refunds, and so on. Examples are likely to become more sophisticated as ideas circulate and development is requested of early adopters who have gained the competency to create workflows and write the plugins required to automate specific work for business stakeholders.

Professionals who adopt AI automation as a tool may free themselves to solve bigger problems and thrive. Likewise, AI seems to serve best those who are already experts in their field, because expert prompts are generally better at getting to the right AI response quickly and at avoiding the erroneous responses that non-experts are more likely to accept as truth. So, staying on top of your game as a subject matter expert is likely to become even more important in this new age. It is best to assume that nothing is too complex or sophisticated to escape automation, and to begin deconstructing value chains to identify opportunities for AI automation. Established businesses will find new opportunities to cut costs as they always have. Startups will have more opportunity to get off the ground when cash is tight and payroll is an obstacle. AI can promote prosperity, but if you’re looking for instant gratification, AI automation with agents will currently disappoint you. If you have a longer-term vision and are willing to make an investment while the technology matures, you might find yourself very well positioned to operate and compete in the new Modern Age.

If you have questions or ideas for automating tasks in your own business workflows, I'd love to engage with you.

Credit to Gabriel Ryan Turner for Auto-GPT “teaming” experimentation.