David's Blog

Experimenting with llama

The other weekend I decided to try out Nelson Elhage's llama project, a tool for offloading computation to AWS Lambda. This post is a collection of my thoughts about the project and its potential.

A while back I had read a couple of the blog posts / newsletter entries that described llama and was intrigued by the tool, but at the time the focus seemed to be distributed compilation of C++ projects. I'm not doing much C++ any more, so I didn't have a strong use for that and set the tool aside, but the idea stuck in the back of my mind. At some point it came back to me and I started wondering: is there something like GNU Parallel for distributed computation, but on AWS Lambda, where the machines don't have to be waiting for you 24/7? What I hadn't remembered was that llama is a general tool that can do just that!

A hobby project of mine lately has involved fetching a number of CSV files daily and doing some light processing on them (a sort of git scraping). The project tracks the data my state publishes about COVID-19 vaccination rates across multiple dimensions. While it doesn't involve a ton of CPU-bound work, fetching CSV files in parallel and doing the small amount of processing I do seemed like a great way to poke at llama once I realized it could distribute arbitrary work.

With something like GNU Parallel, though, you need remote machines sitting around ready to do compute. With Lambda, capacity is on demand and can scale to very high parallelism for most purposes. Such a tool could let everyday compute tasks that are expensive locally (slow on an older laptop, or prone to draining the battery and generating heat, or some combination of those) be done remotely instead. Assuming a fast enough internet connection, that is. Of course, things that take less time to compute locally than to upload and download the inputs/outputs to and from S3 make less sense to do remotely. Overall, I'm excited about this concept - it provides a way to scale compute (or other things, like fast network access) without being tied to a powerful machine. After spending time with it, I think llama is a good early implementation of this idea.

I'm going to share some notes now about getting set up and my initial attempts to use llama. Overall, the README is sufficient to get started.

I did run into a few issues building/installing the command line tool on my Mac (I tried both Go 1.15.x and 1.16.x):

  • There seemed to be some issue compiling go-parquet, but I didn't want to spend time debugging compilation right off the bat.
  • When switching over to a Linux machine, I had no problem compiling the binary, following the bootstrap process, and running some initial commands.
  • Building/updating Lambda function images was also very easy - I was able to build an image containing the tools I needed (just curl and Python) without too much fuss.

Setting up the AWS Credentials and CloudFormation stack was very seamless once I had the tool built and installed!

  • To experiment, I ported a small task that isn't very computationally expensive but still fits the model well.
  • It fetches CSV files from my state's health department that track COVID-19 vaccinations administered across a series of dimensions.
    • I've been doing this locally a few times a week for a few months now and it mostly works fine, but I have run into rate limiting problems (the state appears to have rate limiting/bot detection, and downloading all the files in one scripted pass sometimes triggers it).
    • Distributing the fetching seemed like a nice proof of concept to start learning the tool - if every file is fetched by a different Lambda host, I shouldn't hit those limits.
    • I did get the fetching working in a satisfactory way, but I hit some stumbling blocks:
      • I'd started with a text file of URLs. Previously, a shell script iterated over that list and curled each URL, writing the output to a local file whose name was derived from the URL but was not exactly the URL - think simple modifications like removing the protocol and domain and transforming a few characters into others.
      • The input/output file specifications using Go templating worked OK, but where I was previously running a shell one-liner on the URL to create an output filename, I couldn't do that within my llama invocation.
        • Understandably, I'd guess this is because llama needs to know in advance which files to pull from the container and map locally. But it does impose a restriction: where I was previously able to map over a list of URLs, I now needed some other way to map a URL to a filename (or vice versa).
        • I ended up creating a mapping file with two space-separated entries per line: the URL and the desired output filename. I then used the output filenames as the inputs to the call and listed the mapping file as an input file. In the llama invocation, I'd use the mapping file to look up the URL I needed to curl, while using the input line as the output file (see the sketch after this list).
          • This requires that the mapping file is up to date. I can use a shell one-liner to create the mapping file on demand, but it took some time to arrive at this, and I'd have preferred to keep determining the output filename dynamically.
            • This was probably partly my unfamiliarity with the Go template language, but possibly also its inflexibility at transforming text without custom functions (or at least that's my perception).
            • Perhaps being able to specify an output directory would be nice - all files in the directory after completion would be transferred back locally.
        • That was the main thing I had to work around. One-off, single-file use cases were a bit easier since there you can use the local:remote input/output file mapping directly.
      • This proof of concept didn't need a full core per function invocation since it was mostly IO-bound, so it would have been nice to be able to adjust the desired Lambda memory on demand. That said, this project has mostly focused on offloading compute-heavy tasks, so I understand the desire to provide a full core.
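
To make the mapping-file workaround concrete, here's a rough sketch of the shell involved. The filenames (urls.txt, mapping.txt) and the exact URL-to-filename transformation are illustrative stand-ins for what I actually use, and I've left out the llama invocation itself and its Go-template input/output declarations, since the README covers those.

```sh
# (1) What I was doing locally before llama: derive an output filename
#     from each URL (strip protocol/domain, swap a few characters) and
#     curl straight into it.
while read -r url; do
  out=$(echo "$url" | sed -E -e 's|^https?://[^/]*/||' -e 's|/|_|g')
  curl -sS -o "$out" "$url"
done < urls.txt

# (2) The workaround: build a mapping file up front, one
#     "URL output-filename" pair per line, using the same transformation.
while read -r url; do
  printf '%s %s\n' "$url" "$(echo "$url" | sed -E -e 's|^https?://[^/]*/||' -e 's|/|_|g')"
done < urls.txt > mapping.txt

# (3) Roughly what each remote invocation then does: given an output
#     filename as its input line, look up the matching URL in the mapping
#     file (shipped along as an input file) and curl into that filename.
out="$1"  # the output filename passed to this invocation
url=$(awk -v f="$out" '$2 == f {print $1}' mapping.txt)
curl -sS -o "$out" "$url"
```
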
I do less of this than I used to, but I still frequently end up crafting one-off shell pipelines to process some data and answer a question I have. I'm going to keep llama in mind and try to integrate it into workflows where it makes sense to offload computation. Llama seems like a good tool to have in the toolbox, both for regularly run jobs that can be broken down and for one-off shell pipelines that might need a bit more processing power.
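
For a sense of what I mean, the pattern is usually something like the pipeline below (a made-up example, not one of my real ones): a parallel xargs step doing the slow or heavy part, which is exactly the piece that llama's xargs-style mode could pick up and run on Lambda instead.

```sh
# A typical one-off local pipeline: take a list of report URLs, fetch each
# one in parallel, then count rows per file. The parallel curl step is the
# part that could be offloaded.
grep -v '^#' urls.txt \
  | xargs -P 8 -n 1 curl -sS -O
wc -l *.csv | sort -n
```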