Evan X. Merz

Programmer / Master Gardener / Doctor of Music / Curious Person

Weeds of California: Oxalis pes-caprae

One of the most common weeds in California gardens is Oxalis pes-caprae, aka Bermuda Buttercup. I usually refer to it by the name "Oxalis", but this is a really poor way to describe it because Oxalis is a genus of plants. There are many species of Oxalis, aka wood sorrels, that are native to California, such as Oxalis oregana, aka Redwood Sorrel. Still, for the rest of this article I'm going to use the name Oxalis to refer only to the one I see most commonly in the garden: Oxalis pes-caprae.

Oxalis pes-caprae is a very common weed in California gardens

Oxalis pes-caprae is an easy plant to love, despite its weedy nature. It's edible, and has a pleasant sour flavor. Some foragers like to use it like lettuce in salads. It also puts on a beautiful display of small, vibrant yellow flowers. For this reason, the Oxalis bloom in the late winter can sometimes attract hikers and sightseers.

But looks can be deceiving. The California Invasive Plant Council (Cal-IPC) calls it a moderately invasive plant. If not controlled, it has the potential to invade natural areas and displace native plants. I know that I've seen it taking over some of my favorite nature areas, such as the Bear Creek Redwoods Preserve. I'm grateful to the staff and volunteers who have kept it under control there.

Controlling Oxalis

When I first moved into my house, Oxalis was everywhere. It was in the front yard, it was in the back yard, and there was a carpet of Oxalis covering one of the side yards. Controlling it can be somewhat difficult. Even though it "does not produce seeds, it is difficult to control because of its ability to form many persistent bulbs". So I got to work weeding, and when I was lucky I pulled out an entire frond, with a bulb attached at the end.

If you don't manage to pull out the bulb, then Oxalis will come back year after year until it is exhausted. Still, it's a relatively easy weed to pull. I just pulled it whenever I saw it coming up each year for around three years, and now I rarely see it in my yard at all.

University of California Agriculture and Natural Resources notes that if weeding isn't effective, herbicides can be used to control Oxalis.


Caching: The most important topic in software development

I increasingly think that caching is the most crucial concept required to be a good software developer. Understanding caching is vital at the micro level of computer engineering, but it's also vital at the macro level of system design, and in the middle with software engineering. You really can't be a good software developer if you don't have a strong understanding of caching.

This is the third in a series of three articles about caching. I wrote one article about how caching is limiting the AI industry in a way that nobody is talking about. I wrote another article about how Spotify's failure to understand how and when to use caching is hurting the user experience of their app. This article will give you three reasons why a deep understanding of caching is one of the most important skills for a software developer to master.

SPOILER: It's because the data structures that underlie all caches are critical to everything that software developers do.

An AI generated image of a hash table as a tree.

1. You cannot understand the web without understanding caching

How many layers of caching are involved in any web request?

Take a minute to think about it. It's not an easy question to answer. In fact, the correct answer is that it varies depending on context, but it's generally at least three.

  1. All web browsers have at least one cache layer.
  2. Most websites use a Content Delivery Network (CDN) that caches static assets such as images.
  3. Most servers cache requests or pieces of requests.

And that is a simplified view of a very simple website with not many users. Once you get databases involved, you're almost certainly introducing another layer of caching. The web server will typically have one layer of caching around requests, or pieces of requests, and another layer of caching around database access.

The fact of the matter is that you can't understand how the web works without understanding the way the various standard layers of caching interact. Since almost any software you write is going to interact with the web in some way, a software developer has to have a basic understanding of web related caching in order to write effective software.
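To make the layering concrete, here is a minimal sketch of how a request might fall through several caches before reaching the origin server. Everything in it is illustrative: the function name `fetch_through_layers`, the layer names, and the use of plain dictionaries as caches are assumptions made for the example, not how any real browser or CDN is implemented.

```python
from typing import Callable, Dict, List, Tuple


def fetch_through_layers(
    key: str,
    layers: List[Tuple[str, Dict[str, str]]],
    origin: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (value, source), checking the nearest cache layer first."""
    for index, (name, cache) in enumerate(layers):
        if key in cache:
            value = cache[key]
            # Copy the value into the nearer layers that missed,
            # so the next request is answered closer to the user.
            for _, nearer_cache in layers[:index]:
                nearer_cache[key] = value
            return value, name

    # Every layer missed, so pay the full cost of asking the origin.
    value = origin(key)
    for _, cache in layers:
        cache[key] = value
    return value, "origin"
```

The first request for a page pays for the trip to the origin. After that, the same request is answered by the nearest layer, which is the whole point of stacking caches.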

2. Misuse or underuse of caching is a common source of failure as systems are built or scaled

Using caching incorrectly, or not at all, is one of the most common sources of failure when building or scaling a web service.

I was once at a company where they had a notification list for each user. They were struggling with the scalability of their website. They didn't understand why everything seemed to load so slowly, especially notifications. Well, I asked how they were storing notifications and the answer was a search optimized database called Elasticsearch. Elasticsearch is a great tool ... for search. However, it has to index every new document before it can be served, and that indexing work is expensive compared with a simple cache lookup. This is a failure to understand the way notifications should be stored or cached to make it fast for users to access them.

I was at an ecommerce company where list pages were slow. The problem was that the list pages had lots of parameters, and every time a parameter was changed the page had to hit the API to get a new list of items. Well, the items didn't change that much, and the parameters changed even more rarely. We figured out that it would be easier just to cache the list for every set of parameters for a given list page, and then the list pages loaded instantaneously. The difference was night and day.
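Here's a rough sketch of the idea of caching one list per set of parameters. This is not the code from that company. The class name `ListPageCache` and the `fetch_items` callback are hypothetical, and the sketch leaves out expiry and invalidation, which a real system would need whenever the items change.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

ParamsKey = FrozenSet[Tuple[str, str]]


class ListPageCache:
    """Caches one list of items per distinct set of page parameters."""

    def __init__(self, fetch_items: Callable[[Dict[str, str]], List[str]]) -> None:
        self._fetch_items = fetch_items  # the slow API call (hypothetical)
        self._cache: Dict[ParamsKey, List[str]] = {}

    def get(self, params: Dict[str, str]) -> List[str]:
        # A frozenset makes the key independent of parameter order.
        key = frozenset(params.items())
        if key not in self._cache:
            self._cache[key] = self._fetch_items(params)
        return self._cache[key]
```

With this in place, the slow API is only hit the first time a particular combination of parameters is requested.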

The point is that almost all scaling problems can be solved by changing the way items are stored or cached.

3. Hash tables are crucial to software developer interviews

The final, crucial reason that all good software developers need a deep understanding of caching is that they need an understanding of caching to be hired in the first place. Questions about caching are critical to both coding problems and system design problems.

Most caches are implemented using a data structure called a Hash Table (aka Lookup table, dictionary, map, associative array). In almost every coding interview I've ever been part of (on either side), there has been a question that involved the use of a hash table. Some of the most frequently asked questions are based on hash tables: Two Sum, Ransom Note, and Top K Frequent Elements.
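As an illustration, here is one common way to solve Two Sum with a hash table, using Python's built-in `dict`. The function name and the choice to return a pair of indices (or `None` when there is no answer) are just one possible framing of the problem.

```python
from typing import Dict, List, Optional, Tuple


def two_sum(numbers: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Return the indices of two numbers that add up to target, or None."""
    seen: Dict[int, int] = {}  # value -> index where it was seen
    for index, number in enumerate(numbers):
        complement = target - number
        # The hash table answers "have I seen the complement?"
        # in constant time on average.
        if complement in seen:
            return seen[complement], index
        seen[number] = index
    return None
```

The hash table turns a nested loop over every pair into a single pass over the list, which is the same trade a cache makes: spend a little memory to avoid repeating work.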

Caching is equally important to any system design interview. Whether you're designing Netflix, or a parking garage, or any other service, if you forget the cache layer, then you won't get the job.


Spotify has usability problems

I pay $19.99 per month for a family plan to use Spotify. It is the most expensive individual streaming service that I subscribe to, but I value it because I listen to a lot of music and a lot of podcasts, and I don't like it when they are interrupted by commercials.

But the Spotify app has had the same basic problems for years, and Spotify is either unaware of them, or lacks the software development expertise to address them. And these issues ruin the experience of using the app, especially for software developers like me who understand how easy it would be to mitigate them.

The first thing I usually do with Spotify when I open it is click Your Episodes. This is where the list of podcast episodes that I've downloaded is displayed so I can listen. It's easily the most used page on the Spotify app for me, yet every time I click on it, I have to sit there and wait for a spinner that sometimes never ends.

The spinner that is shown to users when they click on Your Episodes

What's frustrating about this is that it's so easily solved with a basic, widely used approach that could be Googled and implemented relatively quickly. And yet this issue has persisted for years.

All you have to do in this case is cache the list of "Your Episodes" every time you fetch it. This means saving a file to the phone with the list of episodes in it. Then, when a user clicks "Your Episodes", you read that list and show it immediately to the user while showing an indicator that you are fetching the full list.

This is a common approach on list pages for ecommerce and media sites. Usually, the user is seeking something from that list that was there last time they checked it. It's rare that they need the latest thing. So this algorithm instantly satisfies most users.
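The approach described above can be sketched in a few lines. To be clear, this is not Spotify's code, and I have no knowledge of how their app is built. The function `load_episodes`, the `fetch_remote` and `show` callbacks, and the JSON file format are all assumptions made for illustration. A real app would also run the fetch in the background rather than inline.

```python
import json
from pathlib import Path
from typing import Callable, List


def load_episodes(
    cache_file: Path,
    fetch_remote: Callable[[], List[str]],
    show: Callable[[List[str], bool], None],
) -> List[str]:
    """Show the saved list right away, then replace it with the fresh one."""
    # 1. If a list was saved last time, show it immediately.
    #    The True flag tells the UI to display a "refreshing" indicator.
    if cache_file.exists():
        cached = json.loads(cache_file.read_text())
        show(cached, True)

    # 2. Fetch the full, current list from the server.
    fresh = fetch_remote()

    # 3. Save it for next time and update the screen.
    cache_file.write_text(json.dumps(fresh))
    show(fresh, False)
    return fresh
```

Only the very first load has to wait on the network. Every later load shows something useful immediately.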

But that isn't even the most egregious case of this in the Spotify app: I dare you to try to add something to a playlist. The Spotify app shows a spinner and runs a web request when you try to open the three dots menu on a track.

Yes, it needs to hit the Spotify backend to show a menu.

The spinner that is shown to users when they click on Your Episodes

This is embarrassing because not only is it strange to have to run a web request to get a menu, but it's strange that they aren't caching that menu. Even if customers see different menus, those customers can be grouped and the menus can be cached in some way.

The thing is, these usability issues aren't the only problems with Spotify as a whole. Spotify is notorious for not paying small artists, and Spotify pays artists less than Apple or Amazon.

So when you layer the usability issues on top of the structural issues with paying musicians, it's starting to become difficult for me to justify continuing to pay them $19.99 a month.

In closing, if anyone at Spotify sees this, let's talk: evanxmerz (at) yahoo (dot) com. This is a situation where it's so frustrating that I'd gladly solve it myself, or at least give the feedback to do so.

NOTE: I am a super user of music streaming services, so this is actually the third article like this that I've written. In 2016, I wrote a post about how Pandora makes discovery more difficult than I liked. In 2018, I wrote a somewhat similar article detailing the problems of using SoundCloud as a musician rather than a listener. Thanks for reading my opinions on music streamers. I used to be an Amazon Music subscriber, and I'll probably switch away from Spotify here soon, so I probably have a few more of these coming eventually.


The AI caching problem that everyone is ignoring

There is a lot of talk about AI lately, but I haven't heard much chatter about one particularly difficult problem that is facing the industry: caching.

It's not a sexy topic to discuss. It involves some esoteric technical intricacies that only web developers will understand. But caching is a big problem for AI.

Scaling traditional web services

A typical web request goes from your web browser to a web server somewhere in the world that processes your information and sends you the requested information. The requested information might be a blog post, a social media post, a video on a streaming site, or sports scores. However, in a traditional web service, many users are requesting the same thing. In other words the server is giving the exact same sports scores to every user who requests them.

Because users are requesting the same thing, the web service can optimize their system by caching that information. This means storing it in a way that is much easier and more efficient to access. One server running a caching system like Redis might cache millions of different requests and serve each response in a tiny fraction of a second. Those requests never hit the main server, which is much more expensive to run.
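Here is a small sketch of that pattern, often called cache-aside. A plain dictionary with expiry times stands in for a caching system like Redis, so none of this is the Redis client API. The class name `TTLCache`, the `get_or_compute` method, and the injectable `clock` are all hypothetical choices made to keep the example self-contained.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a cache server: values expire after a TTL."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the expensive server is never touched

        # Cache miss or expired entry: do the expensive work once and store it.
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value
```

If a thousand users ask for the same sports scores within the expiry window, the expensive `compute` function runs once and the other 999 requests are served from memory.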

This is how traditional web services scale. If performant caching didn't work, then all web services would be much slower and much more expensive.

Image of a thinking machine generated by Meta AI

AI requests can't be cached

The fundamental problem facing everyone in the AI race is that requests to AI can't be cached in the same way. Even if users are making very similar requests, they might not want exactly the same response. In other words, when user A tries to generate a cartoon image of a cat, they probably want one that looks like their cat. When user B makes the same request, they probably don't want the same image as user A.

AI models involve some very complex math. Whether they are built on Generative Adversarial Networks or on other neural network architectures, these networks, even when not in training mode, can't be cached like traditional web requests. They need to be run on actual servers.

This is what has resulted in the big debate about server resources. AI inherently requires more computing power than traditional web services.

The big problem is profitability

The computing issue on its own is a big problem, but if you focus on it, then you miss the larger problem.

AI can't be profitable if it can't be cached.

If traditional web services couldn't be cached, then they would have a hard time being profitable. In other words, if the companies had to pay for many more servers to run their sites, then users would have to pay a monthly fee to use their services. If everyone had to pay a monthly fee to use social media, would they do so? The number of users would go way down. What about sites like Craigslist or YouTube or Yelp? Would those be able to exist if every user had to pay a monthly fee?

AI services might be free for now, and that's nice for regular people, but the services aren't going to last if they aren't making money.

And they won't make money unless they can solve the caching problem.

So the big problem facing the AI industry today is that we have a bunch of huge companies racing to own a field that, so far, is a money loser. It probably can't be profitable without a significant innovation in hardware or software.

Potential solutions to the AI caching problem

There are several potential solutions for this problem.

A hardware breakthrough could solve the issue. If someone found a way to make hardware that can execute AI models extremely efficiently, then the requests wouldn't need to be cached.

A software breakthrough could also solve the issue. If someone invented a way to break down AI requests into smaller pieces that could be cached, then that would make the services potentially scalable.

But I have a lot of experience in web development and in artificial intelligence. I'm not in academia (at the moment), but I'm familiar enough with the complexities of these issues to know that these are extremely difficult problems.

In fact, there are some problems in computing that we know are unsolvable, or at least not solvable without the computing power of the entire universe. The caching problem in AI looks a lot like one of these problems that can't be solved efficiently.

Of course, the counterargument to this is the existence of the human brain. The human brain somehow runs these calculations very efficiently. Its existence implies that there must be some way to optimize these queries.

I'm looking forward to seeing how the industry takes on this problem.


Plans for an easy to build string trellis

I wanted to build a trellis for my peas this year, so I searched the internet and was immediately overwhelmed with AI slop. There were dozens of pages written by AI featuring trellises with impossible geometry and instructions that say "assemble your materials then follow the instructions to complete the trellis."

So I had to take what I could find and design something from scratch.

I'm not an expert woodworker, so these plans aren't a work of genius. However, I think they're a good starting point inasmuch as they are actual working plans that weren't generated by AI. If you are a more experienced woodworker and you make an improved version of this trellis, then share it with me on social media.

Here's what the final product looks like.

Foldable string trellis

Here's how to build it. Click through each of the steps in the tabs to see my hand drawn illustrations.

Step 1
Step 2
Step 3
Step 4
Step 5

For materials, I used eight foot long redwood 2x2s from Home Depot. These cost around $6 near me. Since I needed five 2x2s, the lumber cost around $30. I used a four foot long quarter inch pine dowel which cost around $1. I used screws and glue that I had lying around.

I used screws to attach the legs and crossbars. However, this results in a frame that is unstable, so I had to glue the string hooks in place in order to prevent it from folding up horizontally.

For tools, I used my table saw, a drill with 1/4 inch and 3/8 inch bits, a tape measure, a screwdriver, and a pair of scissors to cut the twine.

You could probably build this more cheaply by buying three 2x4s and ripping them in half. I only used redwood 2x2s because I had them left over from a previous project.

The only difficult part about cutting this project is making the string hooks. I cut small tenons to tie the strings off, but you could easily just put notches into the 3.5 inch pieces and wrap the strings around them.

If you build this trellis, or a version of it, then share it with me on social media. I'd love to see it!
