about:drewcsillag

Feb 28, 2021 - 7 minute read - debugging logs sqlite

Log Analysis Using SQLite

It all started because I suck at the log searching tool I’ll call <LST>¹. Mostly because I used it a little, but not enough to warrant me really digging in and learning it for real and internalizing its search syntax.

Between me struggling with the tool, and that it would be multiple seconds between query and seeing results, I was getting really frustrated trying to diagnose issues with the services I was working on. Somewhere, I had seen a post about using SQLite for log analysis and I thought that seemed like an unusual but interesting idea, so I decided to give it a spin to see if it worked.

Starting Out

I started by doing a simple search in <LST> that returned some superset of what I needed, and exported it to CSV and loaded it into SQLite. Our logs were in JSON format, and SQLite has JSON functions to get what I needed. It worked, but it was a bit clunky.

After a while, I came up with some copy/paste recipes to create tables imported from the CSV that were exported from <LST>, and then wrapped it in a shell script². I also wrote a script to load a regular log file into a table for cases where I didn’t start with <LST>. Then I wrote a SQLite script to explode a JSON column into a new table — something similar to this (starting from a table named log):

-- Create a table that will contain all the JSON object keys
CREATE TABLE KEYS (attrname STRING, expr STRING);
INSERT INTO KEYS
       SELECT DISTINCT(e.key), e.fullkey AS key
         FROM json_each(log.line) e, log;

-- Generate the CREATE TABLE with all the JSON object keys
.once eval.sql
SELECT 'CREATE TABLE exploded (k_' || group_concat(attrname,' TEXT, k_') || ' TEXT);'
  FROM (SELECT attrname FROM keys);

-- Execute the CREATE TABLE
.read eval.sql

-- Generate the INSERT INTO statement to populate the exploded table
.once eval.sql
SELECT 'INSERT INTO exploded SELECT ' || GROUP_CONCAT(e, ', ') || ' FROM log;'
  FROM (SELECT 'json_extract(line, ''' || expr || ''')' AS e FROM keys);

-- execute the INSERT INTO
.read eval.sql

I also wanted to output results in a few formats that the SQLite command line tool didn’t, some of which it now does, but not all; such as Python repr format or being able to pipe it through a jq expression if the SELECT returned only one column of data, as well as a few others.

Now having written a patchwork of less-than-elegant tools, I realized that at least a more integrated less-than-elegant tool would be good, also as a place for other things like these that would crop up as time went on. I wound up writing my own SQLite REPL in Python. In retrospect, had I known about LiteCLI at the time, I probably would have just added it there instead of creating sqlp³ as I did, but oh well — at least I don’t have to worry about getting my changes accepted upstream. But there’s a bit more to say about this later.

A Different Way Of Doing A Database

The more I used this approach, the better I got at it. When you’re using a database this way, you find yourself doing things differently than you normally would when using a database. Normally, when using databases, you want to think deeply about naming and structure as you’re going to have to live with the schema for long time, and data migrations frequently take a lot of effort. For doing log analysis, you don’t think that hard about it at all as the database you make is unlikely to live for very long at all — the focus is on learning quickly and expedience, and you never need to do a data migration. Worst comes to worst, you throw the whole thing away, and start over, as the data in the database isn’t authoritative. Much as when processing logs in the shell, you create temporary intermediate files, in this case the whole database file is temporary.

A few upshots of this is that table names for me are often one letter. SQLite doesn’t require that you specify the types of columns, so unless there’s something specific that requires it, I don’t. I don’t worry about query performance as the datasets are comparatively small, and if I do something slow, I only slow down my machine, not some production database, so I don’t bother creating indexes either. Any time you spend thinking about these things is a waste, as you’re trying to spend more time analyzing and exploring the data in your logs to find whatever answer you’re looking for. But again, if the table structure turns out bad, just create another one, and I don’t usually bother deleting the bad one, unless to prevent myself accidentally using it instead of the one I want to use.

A Few Good Things I Found Myself Doing….

When you have a tool that make certain things easier, you notice yourself doing those things.

One of those things I encountered when using SQLite for log analysis is: I’ll often do a bit more processing of dates, simply because I can more easily. SQLite has good date parsing and stringifying functions. So doing some date math to compute elapsed/average times, or correct for some problematic clock skew (because all machines’ clocks are well synchronized right?) is way easier than if I wanted to attempt it in the shell. Honestly, because of how much harder it is to do in the shell, in practice I don’t do it at all.

Likewise, the equivalent of a GROUP BY expression with aggregation functions is considerably harder to do in the shell, so when using the shell, I tend not to. When doing things in SQLite, I do it pretty frequently. For example, writing a query like SELECT trace_id, MAX(julianday(timestamp, 'utc')) — MIN(julianday(timestamp, 'utc')) AS elapsed FROM f GROUP BY trace_id ORDER BY elapsed DESC LIMIT 3; to find the top three longest elapsed times for traces is not out of the ordinary when I’m using SQLite, but I probably wouldn’t even attempt to do something like this in the shell.

In The End, There /Can/ Be More Than One

Earlier, I noted that if I had known about LiteCLI earlier, I might have added what I put into sqlp into there instead. The good news is that because I’m doing things with SQLite, I’m not tied to a single tool, and can mix and match whatever SQLite tools I want and use each for what they’re the best tool for.

A Few Tips

A few shell scripting tool to SQL construct mappings to help you get started:

  • grep turns into either a LIKE or REGEX expression
  • sed turns into regex_extract or regex_sub (these are sqlp extensions)
  • sort turns into ORDER BY
  • uniq turns into DISTINCT

When I started using SQLite for this, I found out about the rowid field which turns out to be really useful. When creating derivative tables, such as using .jsonexplode in sqlp, I’ll include the original rowid value in the derivative table (usually I call it src_rowid) so if I need to join things back to the table row that the data derives from, I can.

Instead of trying to create the grand unified query to get everything you need in one go, with the schema you have, create intermediate tables to make the queries simpler.

Hopefully, this can be another tool in your toolbox and help you learn more, and more rapidly from your logs.


¹Using a generic name here as my problems with the tool is not the fault of the tool, but with myself. Also, because at the time I’m writing this, I work at a company which has a log searching tool as a product and don’t want to come across as spitting on a competitor’s tool.

²Newer versions of the SQLite REPL have something which does this.

³If you download sqlp, don’t expect it to be polished. I didn’t even package it, it’s just a .py file. It’s good enough for me. It works on the version of Python I have on my laptop which I think is > 3.7. YMMV. Pull requests are welcome though.