This post is going to be short. Basically, CartoDB runs on the coolest database in the world, PostgreSQL. Because it runs on such an amazing open source technolgy, we often get to tap into amazing advances and amazing documentation of methods that come from the community.
One really nice write-up I came across over the week was, Full text search in milliseconds with PostgreSQL by Max Novakovic. If you haven’t read the post, it just shows you the handful of steps you need to take in order to make really fast, full text search possible in your PostgreSQL tables. I read the post and was like, “CartoDB readers need to know this”.
If you didn’t know, no matter type of CartoDB account you use, you have the ability to create functions and triggers that work on your tables that generate your maps. Given this flexibility, we can follow Max’s write-up for full text search directly in the CartoDB Editor.
Full text search on CartoDB
I have a really nice dataset from a weekend hack I did a couple years back. Basically, it is a bunch of (poorly) georeferenced text from the complete works of Mark Twain. You can find the complete dataset over on my public profile page.
Okay, given a table named twain
in my account, and two columns of text, name
which is the placename identified and passage
which is the fulltext of the line geocoded, let’s replicate the steps to make fast full text search possible.
Adding columns and indexes
Right in the CartoDB Editor, we are going to run the following SQL statements one at a time.
and then
When adding indexes in CartoDB, I often include the name of the table directly in my index name, twain_tsv_idx
for example. I do this so that I don’t hit the same index name twice if I use it on multiple tables. I don’t imagine this will be my last tsv example, so it will be good to do here.
Updates - setting our tsvector values
Just like addeing columns and indexes in CartoDB, we can run UPDATEs directly in the editor. Here we are going to follow Max’s method, but we aren’t using a JSON column as input, I’m just using my two text columns name
and passage
, so my query looks a little bit different.
Creating functions and triggers
Creating functions and triggers is just as easy, write them directly in the SQL editor in your account, hit apply query and if no errors are returned, you are gold. Again, I’m going to add the tablename to my function to ensure I don’t clash with future functions.
Same thing for creating our INDEX,
Running a full text query
Final thing is to see it in action. Here, I’m going to query for Mark Twain passages that include the phrase, river
,
Here we can see it in action,
And if you are building an application, you can run the same query over the SQL API to retrieve the data as a JSON object.