☆ Yσɠƚԋσʂ ☆

  • 1.73K Posts
  • 1.45K Comments
Joined 6 years ago
Cake day: March 30th, 2020



  • I think you’ll still need a human in the loop, because only a human can decide whether the code is doing what’s intended. The nature of the job is going to change dramatically though. My prediction is that the focus will shift to writing declarative specifications that act as a contract for the LLM. There are also types of features that are very difficult to specify and verify formally; anything dealing with side effects or external systems is a good example. We have good tools to formally prove data consistency using type systems and provers, but real world applications have to deal with the outside world to do anything useful. So, it’s most likely that the human will work at a higher level, focusing on what the application is doing in a semantic sense, while the agents handle the underlying implementation details.


  • I’ve been using this pattern in some large production projects, and it’s been a real life saver for me. Like you said, once the code gets large, it’s just too hard to keep track of everything because it overflows what you can effectively keep in your head. At that point you just start guessing when you make decisions, which inevitably leads to weird bugs. The other huge benefit is that it makes it far easier to deal with changing requirements. If you have a graph of the steps you’re doing, it’s trivial to add, remove, or rearrange steps. You can visually inspect it and verify that the new workflow is doing what you want it to.


  • Also, I’d argue that you don’t actually need huge models for coding. The problem is with the way we structure code today, which is not conducive to working with LLMs. Even small models that you can run locally are quite competent at writing small chunks of code, say 50~100 lines or so. And any large application can be broken up into smaller isolated components.

    The way I look at it is that we can view applications as state machines. For any workflow, you can draw a state chart where nodes do some computation and the state then transitions to another node in the graph. The problem with the traditional coding style is that we implicitly bake this graph into function calls. You have a piece of code that does some logic, like authenticating a user, and then it decides what code should run next. That creates coupling, because now you have to trace through code to figure out what the data flow actually is. This is difficult for agents because it causes context to grow in an unbounded way, leading to context rot. When an LLM has too much data in its context, it doesn’t really know what’s important or what to focus on, so it ends up going off the rails.

    But now, let’s imagine that we do inversion of control here. Instead of having each node in the state graph call the next one, why not pull that logic out? We could pass a data structure around: each node gets it as input, does some work, and returns a new state. A separate conductor component manages the workflow, inspects the state, and decides which edge of the graph to take.
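    The conductor idea can be sketched in a few lines. This is a minimal illustration, not any particular framework’s API; the node names, state shape, and edge table are all invented:

```python
# Minimal conductor sketch: nodes are functions from state to state,
# and a declarative edge table decides where control goes next.

def validate(state):
    # Pretend validation: just flag whether the input is well-formed.
    state["valid"] = "payload" in state
    return state

def process(state):
    state["result"] = state["payload"].upper()
    return state

def handle_error(state):
    state["error"] = "invalid input"
    return state

NODES = {"validate": validate, "process": process, "error": handle_error}

# Declarative graph: each entry picks the next node by inspecting the state.
EDGES = {
    "validate": lambda s: "process" if s["valid"] else "error",
    "process": lambda s: None,  # terminal
    "error": lambda s: None,    # terminal
}

def run(state, start="validate"):
    """The conductor: walk the graph; nodes never call each other."""
    node = start
    while node is not None:
        state = NODES[node](state)
        node = EDGES[node](state)
    return state

print(run({"payload": "hello"}))  # {'payload': 'hello', 'valid': True, 'result': 'HELLO'}
print(run({}))                    # {'valid': False, 'error': 'invalid input'}
```

    Note that neither node knows the other exists; the entire flow lives in the edge table, which is what makes it easy to inspect and rearrange.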

    The graph can be visually inspected, and it becomes easy for the human to tell what the business logic is doing. The graphs don’t really have a lot of data in them either because they’re declarative. They’re decoupled from the actual implementation details that live in the logic of each node.

    Going back to the user authentication example: the handler could get a parsed HTTP request, try to look up the user in the db, check whether the session token is present, and so on. It then updates the state to add the user, or sets a flag saying the user wasn’t found or wasn’t authenticated. The conductor looks at the result and decides to either move on to the next step or call the error handler.
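    That authentication node might look something like this sketch; the request shape, token field, and in-memory user store are all made up for illustration:

```python
# Hypothetical authentication node: reads the parsed request out of the
# state, records the outcome, and returns the new state. It never decides
# what runs next.

USERS = {"tok-123": "alice"}  # stand-in for a real db/session lookup

def authenticate(state):
    token = state.get("request", {}).get("session_token")
    user = USERS.get(token)
    if user is None:
        state["authenticated"] = False
    else:
        state["authenticated"] = True
        state["user"] = user
    return state

# The conductor only inspects the flags the node set:
def next_step(state):
    return "load_profile" if state["authenticated"] else "error_handler"

print(next_step(authenticate({"request": {"session_token": "tok-123"}})))  # load_profile
print(next_step(authenticate({"request": {}})))                            # error_handler
```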

    Now we basically have a bunch of tiny programs that know nothing about one another, and the agent working on each one has a fixed context that doesn’t grow in unbounded fashion. On top of that, we can have validation boundaries between each node, so the LLM can check that the component produces correct output, handles whatever side effects it has correctly, and so on. Testing becomes much simpler too, because now you don’t need to load the whole app; you can just test each component to make sure it fulfills its contract.
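    As a sketch of what such a validation boundary buys you: a node’s contract can be checked with plain assertions, no app context required. The node and its contract here are hypothetical:

```python
# Hypothetical node with a simple contract: reads state["raw"], writes
# state["amount"] in cents on success, or state["error"] on bad input.

def parse_amount(state):
    try:
        state["amount"] = int(round(float(state["raw"]) * 100))
    except (ValueError, TypeError):
        state["error"] = "bad amount"
    return state

# Contract checks, runnable without loading the rest of the application:
assert parse_amount({"raw": "12.34"})["amount"] == 1234
assert "error" in parse_amount({"raw": "oops"})
print("contract holds")
```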

    What’s more, each workflow can be treated as a node in a bigger workflow, so the whole thing becomes composable. And the nodes themselves are like reusable Lego blocks, since the context is passed into them.
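    Composition can then be as simple as wrapping a whole workflow in a function with the same state-in/state-out shape as a node. A sketch, with an invented run_workflow helper standing in for the conductor:

```python
# Generic conductor: walks a node/edge graph until a terminal node.
def run_workflow(nodes, edges, state, start):
    node = start
    while node is not None:
        state = nodes[node](state)
        node = edges[node](state)
    return state

# Inner workflow: normalize an email address, then validate it.
inner_nodes = {
    "normalize": lambda s: {**s, "email": s["email"].strip().lower()},
    "check": lambda s: {**s, "ok": "@" in s["email"]},
}
inner_edges = {"normalize": lambda s: "check", "check": lambda s: None}

# The whole inner workflow wrapped up to look like one node to the outer graph.
def email_step(state):
    return run_workflow(inner_nodes, inner_edges, state, "normalize")

outer_nodes = {"email": email_step, "done": lambda s: s}
outer_edges = {"email": lambda s: "done" if s["ok"] else None,
               "done": lambda s: None}

result = run_workflow(outer_nodes, outer_edges,
                      {"email": "  Alice@Example.com "}, "email")
print(result)  # {'email': 'alice@example.com', 'ok': True}
```

    The outer graph can’t tell whether "email" is one function or a whole sub-workflow, which is exactly the composability being described.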

    And this idea isn’t new; workflow engines have been around for a long time. The reason they haven’t caught on for general purpose programming is that it doesn’t feel natural to code that way. There’s a lot of ceremony involved in creating the workflow definitions, writing contracts for them, and jumping between those and the implementation of the nodes. But the equation changes when we’re dealing with LLMs: they have no problem doing tedious tasks like that, and all the ceremony helps keep them on track.

    I would wager that moving towards this style of programming would be a far more effective way to use these tools, and that the current crop of LLMs is more than good enough for it.


  • Given time, life tends to complexify; there’s actually some interesting research suggesting that self-organization and growing complexity might be inevitable.

    There’s a recent hypothesis that any system comprising many diverse parts, operating in an environment that selects for certain functions, will evolve toward states of increasing functional information. In other words, it will become increasingly complex and heterogeneous. The study argues that it’s not just biological systems that are subject to evolutionary processes, but any physical system with these properties. There’s also some fascinating research happening with simulations, like the Game of Life, where complexity and self-organization end up emerging all on their own.

    One study found that when a cellular grid is subjected to selective pressure for a certain function, particles organize into stable, propagating structures, such as gliders, which support the desired logical operations. Coherent information-carrying entities emerge across the spatial grid, and as these travelling objects collide with one another, their interactions perform elementary logical operations. Computation emerges as the evolutionary answer to the problem of coordination and communication across a spatially distributed system.
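    The glider behaviour itself is easy to reproduce. This little Game of Life sketch (not the study’s actual setup, which adds selective pressure) just shows the “stable, propagating structure” part: a 5-cell pattern that re-forms one cell diagonally over every four generations:

```python
from collections import Counter

def step(cells):
    """One Game of Life generation; cells is a set of live (x, y) coords."""
    counts = Counter((x + dx, y + dy)
                     for (x, y) in cells
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {c for c, n in counts.items()
            if n == 3 or (n == 2 and c in cells)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
cells = glider
for _ in range(4):
    cells = step(cells)

# After 4 generations the same shape reappears, shifted by (1, 1):
# a coherent structure carrying its pattern across the grid.
print(cells == {(x + 1, y + 1) for (x, y) in glider})  # True
```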

    Another study provides the link from isolated particles to structured complexity. The research shows how self-replicating hierarchical entities form out of random particle arrangements. Unlike simple gliders, which merely traverse the grid, these complex entities contain and propagate the information needed to reproduce themselves; they embody a sort of digital DNA. Rather than being monolithic blocks, the replicators are organized out of discrete interacting parts, each doing its own specific job.

    Evolution in a digital medium proceeds along a recursive path: simple particles aggregate to form primitive logic gates, and these gates cluster to form replicators. Over time, groups of these entities could aggregate and cooperate, forming complex memory and processing units capable of universal computation.

    A further study shows there is no need to program any specific end goal either. When random programs are introduced into a sufficiently complex environment without a clear fitness landscape, the system nonetheless generates self-replicators. They emerge from a combination of sheer chance and the predisposition of certain code fragments toward interaction and self-modification. Once the first replicator appears, there arises a natural tendency toward greater structural and functional complexity, because replication provides both the material and the selective pressure for further evolution.

    Together, these studies demonstrate how a simple system driven by a steady external energy source can extract complexity from simplicity. Whether it’s the sun or some other energy source, the underlying mechanism is the same. With sufficient iterations, these simple principles force a phase transition in the nature of the system, so that a number of alterations coalesce to form organized structures. It’s a process of quantity transforming into quality.