dragonflydb wouldn’t index my JSON
A short story about a small debugging adventure, entirely within the redis-cli.
I’m caching the content on this site in Dragonfly, a drop-in replacement for Redis with Redisearch support. I do this not only to store my content as JSON, but primarily so I can query it. Very nice for filtering, sorting, searching. Very demure, very mindful.
This worked great in development, but when I deployed it to production, it didn’t work: the page was empty. No content there, even though my logs state that the cache was successfully populated. I opened the redis-cli on production to inspect what’s going on, and to check what’s in the database.
$ KEYS *
> "posts:foo"
> "posts:bar"
> [...]
$ FT.INFO posts:index
> [...]
> num_docs 0
Interestingly enough, listing all stored keys with KEYS *
told me that it actually stored the records. My logs didn’t lie. But when inspecting the index with FT.INFO posts:index
, the final num_docs 0
told me that it didn’t index anything. Zero documents. So they are stored, but not indexed. Weird!
I spent some time blindly trying a few different things, like changing the keys and prefixes, until I decided to act like a proper engineer and go into the Redis CLI and test some things.
$ JSON.SET posts:foo . '{ "title": "lol" }'
> OK
$ FT.INFO posts:index
> [...]
> num_docs 1
I tried to create a new record in the CLI and — Wait… it worked! The post is stored and indexed. But why? Why does it work now? Even re-creating the index made no difference. So I tried the same thing again, just to check if the behavior was at least consistent:
$ JSON.SET posts:bar . '{ "title": "wot" }'
> OK
$ FT.INFO posts:index
> [...]
> num_docs 2
It still works. Okay, it’s consistent. The issue is not the key itself. But why is it working now? Why won’t it work with my actual content? Well, I didn’t test it with my actual content. Maybe I should try and save my actual records instead of just “lol” and “wot”.
$ JSON.SET posts:earbuds . '{
"title": "Open Ear Earbuds",
"description": null,
"frontmatter": {
"cover": "beans.jpg"
},
[...]
}'
> OK
$ FT.INFO posts:index
> [...]
> num_docs 2
Okokok. It didn’t work. The indexed document count didn’t change. I’m onto something. I can reproduce it. It must be the value. But what exactly? Is it the nested JSON?
$ JSON.SET posts:earbuds-2 . '{
"title": "Open Ear Earbuds",
"description": null,
[...]
}'
> OK
$ FT.INFO posts:index
> [...]
> num_docs 2
Removing the nested JSON still won’t work. So I can exclude nested JSON from my suspected issues (they were a bit sus at first because they are referenced with a funky JSON Path).
What’s left? Oh – oh no. Is it just the null
value? Let’s try and remove it:
$ JSON.SET posts:earbuds-3 . '{
"title": "Open Ear Earbuds",
"frontmatter": {
"cover": "beans.jpg"
},
[...]
}'
> OK
$ FT.INFO posts:index
> [...]
> num_docs 3
ffs, it is just the null
! It causes that the document won’t get indexed! And all my records had a null value in them, so nothing got indexed!
Sidebar: Kinda odd developer experience that it happily lets you store null
values, but then refuses to use them.
So I added a little utility to strip out all the null
’s from the records, deployed it, and everything worked fine on production.
Now the final question is: why did it work locally on my machine? idk, probably some mismatch in docker image versions, I didn’t care enough to spend more time on it lol.