
Last week, I published my theory that technical Semi-Pros have unlocked the ability to build software when augmented with an LLM like ChatGPT. This week I'm highlighting my first attempt to prove this out by using an LLM to help me build an app. The ironic part of this story is that I had to fail first (and fast) in order to succeed. On top of that, the solution was to solve a different, more fundamental problem first: recursively using a framework to help me build an app with that same framework. Before we get ahead of ourselves, let's go back to the beginning of the plot.
Necessity is the Mother of All Invention
I read a lot. While I'm reading I have lots of thoughts about what I'm reading. Sometimes it's a criticism of the ideas being presented, or a connection between a story and something I've read elsewhere. Over the years, I've captured some of these thoughts in a slapdash manner, with highlights and notes spread across notebooks and whatever the shiny new digital note-taking app was at the time.
Last year, I worked to build a system that would do a better job of capturing these fleeting thoughts and storing them in a central repository. The system has done a great job of aggregating all of this knowledge in one place, but it has created a new problem: pulling a specific idea out of what is now thousands of notes piled together is difficult.
The current understanding is that memory is an associative network. I might remember an idea that I liked, such as "humans are deterministic animals living in a probabilistic world". Unfortunately, I can't remember who said it or which book it came from (if it even came from a book), but I do have a vague sense that it was spring when I read it and that it was in the recent past (i.e. the last 2-3 years). When I search my notes, I can find times where some other situation reminded me of this idea, but no reference to where I first came across it.
The problem is that the version I remembered was a paraphrase of a line from a poem in a book where determinism isn't really referenced outside of this quote. I had made a number of substitutions to the original quote that were semantically similar but not exact, which made it difficult to find based on the way I had encoded the idea in my memory.
This was a problem begging for a solution. I wanted a way to search my notes for an idea in a fuzzy way. My brain seems to semantically overload concepts, so when I go to retrieve something I need to be able to find similar terms as well.
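To make "fuzzy" concrete, what I was after is usually called semantic search: embed each note and the query as vectors, then rank notes by similarity instead of exact keyword matches. Here's a minimal sketch of the idea, assuming a folder of markdown notes and the OpenAI embeddings API of the time; the folder path and helper names are just illustrative.
# Minimal sketch of fuzzy (semantic) search over notes: embed everything,
# then rank by cosine similarity instead of exact keywords.
import glob
import numpy as np
import openai

def embed(text):
    # text-embedding-ada-002 was the go-to embedding model at the time
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

# Load every markdown note and embed it once
notes = {path: open(path).read() for path in glob.glob("notes/**/*.md", recursive=True)}
note_vectors = {path: embed(text) for path, text in notes.items()}

def search(query, top_k=5):
    q = embed(query)
    scores = {
        path: float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        for path, v in note_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# A paraphrase of the idea should still surface the right note
print(search("humans are deterministic animals living in a probabilistic world"))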
The Plan: Don't Reinvent the Wheel
When I went looking for an existing solution to my problem, I noticed a number of startups (e.g. Mem.ai) with features that looked like they could solve my issue, and even some large players bolting LLMs onto their existing products to try to do what I wanted (e.g. Notion). I also noticed that everyone was either in beta (limited access) or wanted a hefty monthly subscription for access to those features. I don't consider myself a cheap person, but I'm wary of picking up yet another subscription service.
Then I noticed there were a lot of proofs-of-concept floating around AI Twitter showing how people could talk to their product docs or codebases. Not exactly what I was looking for, but maybe similar enough. If it could ingest product documentation in markdown, then it could probably ingest my notes, which were also stored in markdown.
There were a number of open source projects and tutorials to follow, but none would plug into my workflow out of the box. I considered reworking one, but I quickly realized that most were written in Python, which is not my strong suit (I'm more of a Javascript person). Getting up to speed on a new project and a new language at the same time was likely going to be more effort than it was worth. I did notice one other thing that connected the majority of the projects: they all seemed to use the same framework to interact with the LLM that backed the ability to "chat" with the docs.
The Next Plan: Two Birds, One Stone
It appeared that LangChain was the common denominator between all of the open source projects promising the ability to chat with your docs. This was great because it solved a practical problem I had, and it was a chance to create support for my thesis that Semi-Pros can become productive software engineers when partnered with an LLM. I would build an app to talk with my notes using ChatGPT as my pair programmer.
The only problem was that ChatGPT was only trained on data up to September '21, which means LangChain wouldn't be part of the training set. ChatGPT is quite good at producing code, but it doesn't exist in the present; it lives in the recent past. That's okay. As long as the tutorials on the LangChain website could get me to a solution for talking to my docs, I could use some other project to help prove my thesis.
I tried a few different tutorials in the LangChain docs and could get a POC working, but each time I tried to add more functionality I would get lost. The LangChain docs had obviously been thrown together quickly and iteratively: sometimes they referenced deprecated classes, or the methods and design patterns had been updated so that different docs recommended solving the same problem in different ways. It was a bit of a mess, and each step forward felt like two steps back.
I tried using ChatGPT to help me work around these issues, but it was difficult to know the correct context to paste in from the LangChain docs so that ChatGPT could effectively help me. The irony of the situation was not lost on me. I had thrown down the gauntlet predicting that Semi-Pros could build software with the help of LLMs, and in my very first test I was hitting wall after wall that I couldn't circumvent.
What I needed was a way to chat with the LangChain repository itself, so that when I had a question about something specific, the answer could draw on the correct context across the entire repository, not just the parts I could paste into ChatGPT.
The Next Next Plan: The Meta Solution
Then I found a tutorial that showed me how to build an app that would allow me to talk with the LangChain repository. It is based on a more recent technique called retrieval-augmented generation (RAG), which lets you send a query to an LLM while augmenting the context the LLM uses to answer it with user-provided data. Bring your own context (BYOC), if you will.
In this particular case, I was going to use the LangChain codebase as my context and have an OpenAI model use that context to answer questions about something that is difficult for me, as a Semi-Pro, to parse efficiently. I understand the basics of programming languages, but there is a lot of knowledge about the idiomatic way frameworks are structured that I don't have and can't infer at a productive speed. Being able to "talk" to a codebase helps me overcome that barrier.
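To give a sense of what the tutorial's approach boils down to, here is a rough sketch of the pattern with the LangChain APIs of that era: load the repository's source files, split them into chunks, embed the chunks into a vector store, and wire a retrieval chain over it. The paths, chunk sizes, and sample question are my own illustrative choices, not the tutorial's exact code.
# Rough sketch of RAG over a codebase with (era-appropriate) LangChain
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain

# Load every Python file in a local checkout of the repo
loader = DirectoryLoader("langchain/", glob="**/*.py", loader_cls=TextLoader)
documents = loader.load()

# Split the files into chunks small enough to fit into the model's context
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed the chunks and store them in a local vector index
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())

# The chain retrieves the most relevant chunks and feeds them to the LLM
qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0),
    retriever=vectorstore.as_retriever(),
)

result = qa({"question": "Where is ConversationalRetrievalChain defined?", "chat_history": []})
print(result["answer"])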
I powered through the tutorial quickly and had a working POC in ~30 minutes. The sample questions worked fine, but none of them were particularly interesting to me. Instead, to test how well it worked, I made a list of feature improvements that I wanted to make to the app the tutorial helped me build:
The app would execute my query and return an answer, but in the background it was doing a lot of other things and giving me very little feedback on what was happening while I waited. I wanted a better UX with less staring at a blank terminal screen waiting for something to come back.
The app would answer my questions about the codebase I loaded in (LangChain, in this case), but it wasn't interactive. I had to hardcode the query directly into the source and then execute the app in a single go. Fine for a POC, but not the experience a user would expect.
The app didn't have a way for me to customize the prompt that was sent along to the LLM with my user input (a rough sketch of these last two follows below).
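For those last two items, here is roughly what I had in mind, building on the vector store from the earlier sketch: a simple read-eval-print loop so questions can be typed in at runtime, plus a custom prompt passed to the chain. The prompt wording is mine, and the exact keyword for overriding the prompt has moved around between LangChain versions, so treat this as an approximation rather than the library's canonical API.
# Sketch: interactive questions plus a custom prompt (approximate LangChain API)
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain

QA_TEMPLATE = """You are helping a developer understand a codebase.
Use the following context to answer the question. If you don't know, say so.

{context}

Question: {question}
Answer:"""

qa_prompt = PromptTemplate(template=QA_TEMPLATE, input_variables=["context", "question"])

# vectorstore is built the same way as in the earlier sketch
qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0),
    retriever=vectorstore.as_retriever(),
    combine_docs_chain_kwargs={"prompt": qa_prompt},  # keyword varies by version
)

# A simple loop so questions can be typed in instead of hardcoded in the source
chat_history = []
while True:
    question = input("\nQuestion: ")
    if question.strip().lower() in ("exit", "quit"):
        break
    result = qa({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    print(result["answer"])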
Now I had some concrete questions to ask my app. I started by asking if there was a function that would enable a streaming response from the LLM, instead of leaving me hanging on a blank screen until the LLM had generated the entire response and returned it as a single block of text. If you've played with ChatGPT, streaming is the default response style: words come back one after the other in real time, rather than you staring at nothing (or a loading indicator) while the entire response is predicted.
This is what I asked my app:
Question: Does LangChain have a function for enabling a streaming response from an LLM to standard out for the ConversationalRetrievalChain class?
This is what I got in return:
Answer: Yes, LangChain has a StreamingStdOutCallbackHandler class that can be used to enable streaming responses from an LLM to standard out. This class is specifically designed for use with LLMs that support streaming. You can use this class with the ConversationalRetrievalChain class to enable streaming responses.
Looked promising. I could either go digging through the docs or the source for this class to figure out how to use it properly, or I could just ask my app for an example. I'm pretty lazy, so I decided to ask my app:
Question: Can you provide me an example of a ConversationalRetrievalChain using an OpenAI LLM model that prints the model response to standard out using the StreamingStdOutCallbackHandler?
Answer: Sure! Here's an example of a ConversationalRetrievalChain that uses an OpenAI LLM model and prints the model response to standard out using the StreamingStdOutCallbackHandler:
from langchain import ConversationalRetrievalChain
from langchain import OpenAI
from langchain.callbacks.base import BaseCallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Initialize the OpenAI LLM model
llm = OpenAI(model_name="gpt-3")

# Initialize the StreamingStdOutCallbackHandler
callback_manager = BaseCallbackManager([StreamingStdOutCallbackHandler()])

# Initialize the ConversationalRetrievalChain
chain = ConversationalRetrievalChain(
    llm,
    callback_manager=callback_manager,
    streaming=True
)

# Generate a response to a prompt
response = chain.generate_response("Hello, how are you?")

# Print the response to standard out
for generation in response.generations:
    print(generation.text)
That looks pretty straightforward. I have to import two new classes, initialize an instance of the StreamingStdOutCallbackHandler() object inside of a BaseCallbackManager() object, then pass that into my ConversationalRetrievalChain() object with the streaming parameter set to True. I modified my code, repeated the same query, and suddenly the response was streaming back into my terminal instead of being delivered all at once. Amazing!
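For reference, here is roughly how that streaming wiring might look when folded into the retrieval chain from the earlier sketch. This is an approximation: depending on the LangChain version, the handler is passed either through a callback manager (as in the app's answer above) or a plain callbacks list.
# Approximate streaming setup (callback argument name varies by LangChain version)
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import ConversationalRetrievalChain

streaming_llm = OpenAI(
    temperature=0,
    streaming=True,  # ask the model to emit tokens as they are generated
    callbacks=[StreamingStdOutCallbackHandler()],  # print each token to stdout
)

# vectorstore is built the same way as in the earlier sketch
qa = ConversationalRetrievalChain.from_llm(
    streaming_llm,
    retriever=vectorstore.as_retriever(),
)

# The answer now prints to the terminal token by token instead of all at once
qa({"question": "What does the ConversationalRetrievalChain do?", "chat_history": []})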
I repeated this process for each of the features that I had outlined above and was able to implement each feature based on the recommendations and examples from my app. It was a crazy experience because I was in a dialogue with a codebase in order to build an app based on that codebase. I didn't have to be an expert in the codebase, I didn't have to be an expert in Python, I didn't have to be a professional developer. I had a useful tool that was helping me make the tool more useful in a wondrous virtuous cycle of feedback and iteration.
Closing the Loop
Let's circle back to the beginning of this story. I originally wanted to build an app that allowed me to have a conversation with my personal notes, but on the journey down that path I realized there was another app I needed before I could reach that goal: I needed to be able to talk to a codebase first. Now that I have accomplished this, all I need to do is take the same abstract concept of "talking" to a set of text documents and modify the app so that it can handle both code and arbitrary blocks of text. I'll save that story for the next installment in the ongoing experiment to prove out or invalidate my theory.
Takeaways
During this initial experiment, I came across a number of learnings:
Fail fast and pivot when things aren't working out the way you expected. This is a well-worn adage in Startupland, but still an important one to understand in practice, not only in theory.
LLMs and their associated tooling are still very early. The real problem to solve may not be how to utilize LLMs to solve general problems, but instead figuring out the tooling needed to work effectively with LLMs. By building the tooling needed to make the development experience with LLMs better, we can then tackle higher level problems that LLMs can add value to. Dev tools for LLMs > LLMs in dev tools.
If you're experimenting with LLMs and want to join a community of like-minded builders who are leveling up their software development skillset by leveraging LLMs, come join the crew on our community Discord. If this sort of content is your jam, be sure to like and subscribe below.