Dev-Diary 4 — Building an AI Agent at Netguru

Recently, Jan Skóra asked me: “What makes a good tool for an agent? What are the qualities of a good tool, in your opinion?”

We are testing two approaches to AI tools. I already mentioned one in diary entry #1, but as a reminder:

  1. Complex functions that handle an end-to-end result. For example, we are preparing one that creates a feature list made up of the modules to build and their features (so it translates business language into more technical, structured language). It makes multiple calls to read files, uses prompts to refine the context, and at the end returns a CSV file that we can load into our estimation tool.
  2. Small functions that cover basic operations, like list files, read file, find message, etc. Here, instead of having one big tool, the AI needs to create the flow itself. For example, to read a file, it needs to know the file ID, so it first lists the files, then reads the file, and so on. The AI relies fully on its own reasoning to achieve the goal.
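
The two styles can be sketched in Python roughly as follows. This is a minimal sketch, not our actual implementation: the in-memory `FILES` store, the function names (`list_files`, `read_file`, `build_feature_list_csv`) and the "module: feature" line format are all assumptions made up for illustration. The real complex tool also uses prompts to refine the context, which is left out here.

```python
import csv
import io

# Hypothetical in-memory file store standing in for real project files.
FILES = {
    "f1": {"name": "brief.txt", "content": "Auth: login\nAuth: password reset\nBilling: invoices"},
    "f2": {"name": "notes.txt", "content": "Billing: refunds"},
}


# Approach 2: small tools. The agent has to chain them itself.
def list_files() -> list:
    """Return file ids and names, without the content."""
    return [{"id": fid, "name": f["name"]} for fid, f in FILES.items()]


def read_file(file_id: str) -> str:
    """Return the content of one file; the id comes from list_files()."""
    return FILES[file_id]["content"]


# Approach 1: one complex tool that handles the end-to-end result.
def build_feature_list_csv() -> str:
    """Read every file and return a CSV with module and feature columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["module", "feature"])
    for entry in list_files():
        for line in read_file(entry["id"]).splitlines():
            # Assumed input format: "<module>: <feature>" on each line.
            module, _, feature = line.partition(":")
            writer.writerow([module.strip(), feature.strip()])
    return out.getvalue()
```

With the small tools, the model has to decide on its own to call `list_files()` first and then `read_file("f1")`. With the complex tool, a single call to `build_feature_list_csv()` returns the finished CSV, but the tool is useful for that one job only.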

Which one is better? No idea yet. I assume we need to consider functions along a few dimensions:

  • being complete: will the AI achieve the same result for our CSV feature list without a custom tool, if we only give it the ability to create CSVs, etc.? I don’t think so. It’s super custom, and we need it to give a really complete result. It would be annoying for people to have to iterate over the result.
  • being small enough: a function needs to be small enough to provide only one result. It should not do multiple things, so that it does not confuse the AI.
  • mixing: should I create a new function that lists files and reads a file in one go, based on the name? Can I mix my existing list and read functions with a more complex function that does both in one go?
  • model limitations: when you read the documentation, there is a fragment: “Aim for fewer than 20 functions at any one time, though this is just a soft suggestion.” It makes scaling difficult. When you start reaching 20 tools, you might get worse results. So you start merging functions into one, which brings you back to the previous points.
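
The "mixing" and "model limitations" points can be illustrated with a small sketch. Everything in it is hypothetical: the `ToolRegistry` class, the `SOFT_TOOL_LIMIT` constant (taken from the "fewer than 20" suggestion quoted above) and the stub tools with made-up data. It shows a composed `read_file_by_name` tool built on top of two small ones, and a registry that warns once the number of exposed tools reaches the soft limit.

```python
import warnings
from typing import Callable

# Taken from the "fewer than 20 functions" suggestion; soft, not enforced by any API.
SOFT_TOOL_LIMIT = 20


class ToolRegistry:
    """Hypothetical registry of the tools we expose to the model."""

    def __init__(self) -> None:
        self.tools = {}

    def register(self, func: Callable) -> Callable:
        """Store the tool under its function name and warn at the soft limit."""
        self.tools[func.__name__] = func
        if len(self.tools) >= SOFT_TOOL_LIMIT:
            warnings.warn(f"{len(self.tools)} tools registered; results may get worse")
        return func


registry = ToolRegistry()
_FILES = {"f1": ("brief.txt", "Auth: login")}  # made-up data: id -> (name, content)


@registry.register
def list_files() -> dict:
    """Small tool: map file names to file ids."""
    return {name: fid for fid, (name, _) in _FILES.items()}


@registry.register
def read_file(file_id: str) -> str:
    """Small tool: return the content of the file with the given id."""
    return _FILES[file_id][1]


@registry.register
def read_file_by_name(name: str) -> str:
    """Composed tool: list and read in one call, reusing the small tools."""
    return read_file(list_files()[name])
```

The composed tool saves the model one step, but it also adds one more entry to the tool list, so mixing small and composed tools pushes us toward the soft limit faster.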

It’s hard to say which is best right now. We will do more testing and hopefully give you a better answer.