So I have been messing around with GPTs, and the thing that strikes me most is how the agent, after a while, gets its context mixed up and starts to output things that appear logical and well reasoned but are false. One of the main things I see again and again is loss of context. For example: you design a database schema, then when you implement it in code you purposely leave out foreign keys and tables you won't be using after all. You submit the entire code to the code interpreter and ask it to break it down and analyze it. After a while you tweak the code some more, say by adding a sequence matching function that compares incoming text from a data stream against text that has already been seen and stored in a database.
Then you ask GPT to analyze the code and explain the different parts it just generated in the same task. It happens again and again that it explains, for example, how the database works by referring to the schema that was originally generated (but not used, or only partially used); instead of adapting to what is actually in the code, the agent stays on the surface and literally hallucinates what the code is doing.
In fact, the longer the chat goes on, the more it gets mixed up between what was actually implemented (and submitted back to it) and what was originally requested or generated but never implemented, or not implemented the way it was first generated.
It's not surprising, then, that if you start a new chat session with the state of the code produced by the previous session and debug from there, the agent behaves in a much saner and more methodical way: it interprets what it is actually given, rather than trying to reconcile different parts of an evolving context toward a goal whose objectives it never really grasped.
There is nothing like learning by exercising concrete ideas and pain points, watching those pains go away, and then seeing the result run in production doing actually useful work that humans shouldn't be doing manually, iterating over and over until a simple idea becomes a robust and stable fix for a problem.
This past week I managed to take my Telegram forwarder system to the next level by implementing a sequence matcher with proper debugging and database storage. The goal? Avoid forwarding Telegram messages that have already been seen and sent.
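The core idea can be sketched in a few lines of Python. This is a minimal illustration of the approach described above, not the actual implementation: the `messages` table, the `difflib`-based similarity check, and the 0.9 threshold are all assumptions for the sake of the example.

```python
import difflib
import sqlite3

# Assumed cutoff: messages at least 90% similar to something already
# stored are treated as duplicates. Tune for your data.
SIMILARITY_THRESHOLD = 0.9

def is_duplicate(conn: sqlite3.Connection, text: str) -> bool:
    """Return True if `text` closely matches any previously stored message."""
    for (seen,) in conn.execute("SELECT body FROM messages"):
        ratio = difflib.SequenceMatcher(None, text, seen).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            return True
    return False

def store_if_new(conn: sqlite3.Connection, text: str) -> bool:
    """Store the message and return True when it was new (i.e. forward it)."""
    if is_duplicate(conn, text):
        return False
    conn.execute("INSERT INTO messages (body) VALUES (?)", (text,))
    conn.commit()
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")

print(store_if_new(conn, "BTC alert: price crossed 50k"))  # new -> True
print(store_if_new(conn, "BTC alert: price crossed 50k"))  # repeat -> False
```

A fuzzy ratio rather than an exact-match lookup is what makes this useful for forwarded messages, which often arrive with trivial differences (extra whitespace, emojis, timestamps) that an equality check would miss. The trade-off is a linear scan over stored messages, which is fine at small scale.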
It might sound like gibberish, but the combination of no-code/low-cost automation such as N8N with APIs and Dockerized Python code turns out to be a powerful mash-up for building solutions that are easy to deploy and that fix real-life needs.
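For a sense of how lightweight the Docker side of such a mash-up can be, here is a generic sketch of a Dockerfile for a small Python service; the file names and base image are illustrative assumptions, not the author's actual setup.

```dockerfile
# Minimal container for a small Python worker (illustrative sketch).
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first so this layer is cached between code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and run it.
COPY . .
CMD ["python", "main.py"]
```

A container like this can then be triggered or fed by an N8N workflow over HTTP, keeping the no-code orchestration and the custom Python logic cleanly separated.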
And that's the kind of learning I love doing!