Evan X. Merz

musician / technologist / human being

Tips for Managing Joins in Looker

Looker is a fantastic product. It really makes data and visualizations much more manageable. The main goal of Looker is to allow people who aren't data analysts to do some basic data analysis. To some extent, it achieves this, but there are limits to how far this can go. Ultimately, Looker is a big graphical user interface for writing SQL and generating charts. Under the hood, it's programmable by data engineers, but it's limited by the fact that non-technical users are using it.

The major design challenge for Looker is joins. A data engineer writes the joins into what Looker calls "explores". Explores are rules for how data can be explored, but ultimately they are just containers for joins. When someone creates a new chart, they start by selecting an explore, and thus selecting the joins that will be used in the chart.

They pick the join from a dropdown under the word "Explore". This is the main design bottleneck. Such a UI encourages users to have only a limited number of joins that can fit in the vertical resolution of the screen. This means limiting the number of explores, and hence limiting the ways tables are joined. This encourages using pre-existing joins for new charts.

This creates two problems.

  1. A non-technical user will not understand the implication of choosing an explore. They may not see that the explore they chose limits how the data can be analyzed. In fact, a non-savvy user may pick the wrong explore entirely, and create a chart that is entirely wrong.
  2. The joins may evolve over time. A programmer might change a join for a new chart, and this may make old charts incorrect.

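The problem is that SQL joins are fundamentally interpretations of the data. Unless a join occurs on id fields AND is a one-to-one relationship, it interprets the data in some way. For example, joining a table to another table that has several matching rows per key repeats the first table's rows, so sums and counts over it are inflated unless someone decides how to handle the repeats.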

So how can you limit the negative impact of re-using joins?

1. Encourage simple charts

Encourage your teammates to make charts as simple as possible. If possible, a chart should show a single quantity as it changes over a single dimension. This should eliminate or minimize the use of joins in the chart, thus making it far more future-proof.

2. Give explores long, verbose names

Make explore names as descriptive as possible. Try to communicate the choice that a user is making when they choose an explore. For instance, you might name one explore "Products Today" and another one "Product Events Over Time". These names might indicate that the first explore looks at the products table, but the second explore shows events relating to products joined with a time dimension.

One of the mistakes I made when first starting out with Looker was giving the explores single-word names. I now see that short names create maintenance nightmares. Before assessing the problems with a given chart, I need to know which explore the maker chose for it, and because the names were selected so poorly, the choice was often incorrect.

I hope these ideas help you find a path to a maintainable data project. To be honest, I have a lot of digging-out to do!

Pride in Software Craftsmanship

As I spend more and more time in Silicon Valley, my views on software management are changing. I read Radical Candor recently, and while I agree with everything in it, I feel like it over-complicates things.

This meditation has been pushed in part by my passion for food. I like going to new restaurants. It brings me joy to try something new, even if it's not a restaurant that would ever be considered for a Michelin Star. Even crappy looking restaurants can serve great food.

I am often awed by the disconnect between various parts of the restaurant business and the quality of the food. Some restaurants are spotlessly clean, have beautiful decor, and amazing service… but the food is mediocre. The menu is bland and uninspired, and the food itself is prepared with all the zeal that a minimum wage employee can manage.

Then I'll go to a dirty-looking Greek joint down the road, and the service will be awful… but the menu is inspired. It's not the standard "Greek" menu, but it's got little variations on the dishes. And when the food comes out (finally), maybe it isn't beautiful on the plate, but the flavors come together to make something greater than the ingredients and the recipe.

What seems to distinguish a good restaurant from a crappy one is pride. At restaurants that I return to, there is someone there, maybe a manager, maybe a cook, maybe the chef who designed the menu, who takes great pride in his work.

There's a diner by my old house, for instance, where the food is … diner food. There's no reason to go back to the restaurant… except for the manager. The man who runs the floor, seats the patrons, deals with the kitchen, and does all the little things that make a restaurant tick. He manages to make that particular diner worth going to. And for a guy who has two young kids, that's terrific.

I am starting to think that the same basic principle applies to software engineers. I've met brilliant engineers with all sorts of characteristics. Some of them have a lot of education and read all the latest guides. Others have little education, and don't read at all. The main thing that makes them good engineers is that they take pride in their work. They care about the quality of their work, regardless of how many people are going to use it, or how much time they put into it. They write quality code because their work matters.

So when it comes to managing software projects, I'm starting to think that all of these systems boil down to two basic steps.

  1. Put your engineers in a position to take pride in their work.
  2. Get out of the way.

Obviously, the first step is non-trivial. It's why there are so many books on the topic. But at the end of the day, pride is what matters.

Book Review: Life of a Song

I recently had the chance to read Life of a Song: The fascinating stories behind 50 of the world's best-loved songs. It's a concise collection of fifty Life of a Song articles from the Financial Times. As I rarely have a reason to visit the FT website, and I only occasionally catch the Life of a Song podcast, the book was a great opportunity to catch up on what I'd missed. Regular readers may find nothing new in the book, but for pop fans and die-hard listeners, the short collection is definitely worth a read.

The cover of Life of a Song: The fascinating stories behind 50 of the world's best-loved songs

The book consists of fifty articles from the regular Life of a Song column collected into book form. Each article takes on a different, well-loved tune from twentieth-century popular music. Songs covered include ‘My Way’, ‘Midnight Train to Georgia’, ‘1999’, ‘La Vie en Rose’, and ‘This Land is Your Land’. There are only a few songs in the list that I didn't know off the top of my head, including ‘Song to the Siren’ and ‘Rocket 88’. The articles usually include some remarks about the songwriter, often quoting them about their creation. Then they cover the journey from composition to hit recording, and usually mention other interpretations that followed the hit.

Each article appears to be less than 1000 words. As you might expect, that's a lot to cover in so little room. So each article is narrowly focused, relating a single anecdote about its song and only touching on the rest. For instance, in the article about ‘Like a Rolling Stone’, the author relates the recording process that shaped the final sound.

On take four of the remake, serendipity strikes. Session guitarist Al Kooper, 21, a friend of the band, walks in holding his guitar, hoping to join in. He is deemed surplus to requirements, but Dylan decides he wants an organ in addition to piano, and Kooper volunteers to fill in. He improvises his part, as he would later recall, ‘like a little kid fumbling in the dark for a light switch’. And suddenly the song turns into the tumbling, cascading version that will become the finished article.

There are two pieces of information that you need to know about this book in order to enjoy it.

  1. It is a collection of short articles by many contributors.
  2. Those writers are almost entirely arts journalists, rather than trained musicians.

This book was written by a lot of authors. I counted fourteen contributors, each of whom appears to be an English journalist. This can lead to the book feeling somewhat disjointed. Each author is comfortable talking about their own domain of the music industry. Some interpret the lyrics, others relate interviews with creators, others pick up on business maneuvers behind the scenes.

In the introduction, David Cheal and Jan Dalley write that the book "is not about singers, or stars, or chart success – although of course they come into the story. It is about the music itself". If you are a musician, this may leave you expecting musical analysis, lyrical breakdowns, or at least comparisons to similar songs. The book "is about music" inasmuch as it tells stories about musicians, but it is strictly an outsider's perspective. There's no illusion that the writers were part of the culture of the song, or involved themselves with the people in the story. A reader shouldn't expect that in a collection such as this.

My favorite article is the one about ‘Midnight Train to Georgia’. That song has so much soul that it surprised me to learn that the original title, given to the tune by its white songwriter, was ‘Midnight Plane to Houston’.

The soul singer Cissy Houston… decided to record its first cover version… But the title irked. It wasn't the collision of Houstons – singer and subject – that bothered her, but one of authenticity. If she was going to sing this song, she had to feel it. And, she later said, ‘My people are originally from Georgia and they didn't take planes to Houston or anywhere else. They took trains.’

Ultimately, Life of a Song is a great book to read on the way to and from work, or to sit in your book bin next to your favorite chair. It's a book that can be read in lots of small chunks, and each chunk reveals a little bit more about a song than the recording.

Now if you don't mind, I need to catch a plane to Houston.

I paid my dues to see David Gray live

One of the reasons I am fascinated with both computer science and music is that each is a bit like magic. Each has invisible power to make change.

Yesterday, my daughter woke up with the flu. Actually, we found out today that she has croup, which is apparently going around her school. So Erin stayed home with her, while I went to work. But we also had to cancel our plans for the evening. Instead of going to the David Gray concert together, I would go alone.

At work, I was stuck in a meeting that seemed like it would never end. During this meeting, I got a headache that kept getting worse and worse. When I rubbed my head, I could feel my temperature rising. I could tell that I was getting sick too. The meeting dragged on for four hours, but I pushed through it.

By the end of the day, I was exhausted and feverish. I had driven to work, because I was still going to make it to the concert, even if I was going alone. But in Palo Alto, you have to do a dance with the parking authority if you want to park for free. You have to move your car every two hours, from one colored zone to another. I left work a little early because I knew there would be traffic on the drive, but when I found my car, there was a bright orange envelope on the windshield. I owe Palo Alto $53.

At that point I had paid $70 for the tickets, plus $53 for the parking ticket, so I had invested $123 to see David Gray. The parking ticket only steeled my resolve. I was going to see him come hell or high water.

And this is all sort of silly, because I don't even like David Gray that much. Mostly, I have a deep sense of nostalgia for his one hit album that came out right before I went to college. I listened to it a lot in college. At the time, he was the only person I knew of who was doing singer-songwriter-plus-drum-machine really well. When I found out that Erin couldn't come to the concert, I tried to explain this to my younger coworkers who I invited to the concert. They were nonplussed to say the least. A singer-songwriter with a drum machine really doesn't sound very compelling today. It sounds practically commonplace. But nobody had quite figured out the formula back in 1998. So David Gray felt really fresh to me at the time.

My point is, I'm not a David Gray fanboy. I just respect the amount of time I spent listening to him when I was younger. Unfortunately, this is not enough to convince others to drive all the way up to Oakland for a concert.

The drive was hellish. If you have ever commuted between San Jose and Oakland during rush hour, then you know how this goes. The Greek Theater is only 40 miles from my workplace. The best route that Google could calculate took two and a half hours. I was in traffic for every minute of that drive, with a rising fever. It was extremely painful, and even though I left work fifteen minutes early, I still arrived 10 minutes late.

But when I pulled up to the parking garage, things seemed to turn around. By this point I had a very high fever, the sun had gone down, and it was raining. So I couldn't see the "Full" sign on the parking garage until I had already pulled in using the wrong lane. Everyone was continuing on to the next lot. At first I tried to back out of the garage, but then I realized that it wasn't really full. So I pulled into a spot. I'd take my chances.

Then I stepped out into the rain, and started running to the theater. I could hear the music pouring over the hills. I saw a man standing in the rain, asking for extra tickets. I knew he was just going to scalp them, so I almost walked by, but fuck it, who cares. I gave him my extra ticket.

Then I ran up the steps, and breezed through security. I climbed to the top of the hill, and the music hit me.

That's the moment when you feel the true power of music. I was all alone and feverish, in the rain after a long day of work and an awful drive to the theater, yet the music seemed to heal me. I could feel myself recovering as the sound washed over me.

I didn't really talk to anyone. I listened to the music, and watched from the top of the grass. David Gray has a good band, and he has a good audience rapport. Even though his music isn't as fresh today as it was in 1998, it still changed me last night.

I bought a shirt, and felt a lot better on the drive home.

When Code Duplication is not Code Duplication

Duplicating code is a bad thing. Any engineer worth his salt knows that the more you repeat yourself, the more difficult it will be to maintain your code. We've enshrined this in a well-known principle called the DRY principle, where DRY is an acronym standing for Don't Repeat Yourself. So code duplication should be avoided at all costs. Right?

At work I recently came across an interesting case of code duplication that merits more thought, and shows how there is some subtlety needed in application of every coding guideline, even the bedrock ones.

Consider the following CSS, which is a simplified version of a common scenario.

.title {
  color: #111111;
}
.text-color-gray-1 {
  color: #111111;
}

This looks like code duplication, right? If both classes are applying the same color, then they do the same thing. If they do the same thing, then they should BE the same thing, right?

But CSS and markup in general present an interesting case. Are these rules really doing the same thing? Are they both responsible for making the text gray? No.

The function of these two rules is different, even though the effect is the same. The first rule styles titles on the website, while the second rule styles any arbitrary div. The first rule is a generalized style, while the second rule is a special case override. The two rules do fundamentally different things.

Imagine a case where we optimized those two classes by removing the title class and just using the latter class. Then the designer changes the title color to dark blue. To change the title color, the developer now has to replace each occurrence of .text-color-gray-1 where it styles a title. So, by optimizing two things with different purposes, the developer has actually made more work.

It's important to recognize in this case that code duplication is not always code duplication. Just because these two CSS classes are applying the same color doesn't mean that they are doing the same thing. In this case, the CSS classes are more like variables than methods. They hold the same value, but that is just a coincidence.

What looks like code duplication is not actually code duplication.

But… what is the correct thing?

There is no right answer here. It's a complex problem. You could solve it in lots of different ways, and there are probably three or four different approaches that are equally valid, in the sense that they result in the same amount of maintenance.
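One of those approaches, sketched here as an illustration rather than a recommendation, is to keep both classes but give each its own CSS custom property. The property names --gray-1 and --title-color are hypothetical and do not come from the original stylesheet.

```css
/* Hypothetical design tokens; the names are assumptions for illustration. */
:root {
  --gray-1: #111111;
  /* The title color equals gray-1 today, but it is a separate decision. */
  --title-color: var(--gray-1);
}

/* General rule: how titles look across the site. */
.title {
  color: var(--title-color);
}

/* Utility override: force gray-1 on any element. */
.text-color-gray-1 {
  color: var(--gray-1);
}
```

If the designer later changes titles to dark blue, only --title-color needs a new value, and elements using .text-color-gray-1 are untouched. Whether this is less maintenance than the plain duplicated rules depends on the project.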

The important thing is not to insist that there is one right way to solve this problem, but to recognize that blithely applying the DRY principle here may not be the path to less maintenance.
