
Using Active Record migrations beyond SQLite

SQLite is a good tool for setting up quick proofs of concept and small applications; however, it is not the most robust solution on the market for working with relational databases. In the open source community, two databases top the list: PostgreSQL and MySQL.

I built a small project for my studies using SQLite, since I didn’t need much from the database. Out of curiosity, I decided to see how the application would behave on other databases and tried PostgreSQL and MySQL. I had two problems to solve, and this post is about the first one: how to deal with the migrations.
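In a minimal sketch, simplified to the columns that matter here (the table and column names match the error below), they looked something like this:

class CreateArtists < ActiveRecord::Migration[5.1]
  def change
    create_table :artists do |t|
      t.string :name
      t.timestamps
    end
  end
end

class CreateSongs < ActiveRecord::Migration[5.1]
  def change
    create_table :songs do |t|
      t.string :title
      t.integer :artist_id # a plain 4-byte INT (the culprit)
      t.timestamps
    end

    add_foreign_key :songs, :artists
  end
end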

Active Record automatically adds the id field to all of its tables, which is why it is omitted from the migrations.

On PostgreSQL it went smoothly: all the migrations ran without a hiccup. On MySQL, however, I got an error:

StandardError: An error has occurred, all later migrations canceled:

Column `artist_id` on table `songs` has a type of `int(11)`.
This does not match column `id` on `artists`, which has type `bigint(20)`.
To resolve this issue, change the type of the `artist_id` column on `songs` to be :integer. (For example `t.integer artist_id`).

Original message: Mysql2::Error: Cannot add foreign key constraint: ALTER TABLE `songs` ADD CONSTRAINT `fk_rails_5ce8fd4cc7`
FOREIGN KEY (`artist_id`)
REFERENCES `artists` (`id`)

The problem, beyond the unreadable generated index name (fk_rails_5ce8fd4cc7), was that artist_id on my table was an INT. The first thing I checked was whether artists.id was UNSIGNED while my foreign key was signed; both were signed, so that could not be the cause of the error. Looking more closely at the error message, I noticed that the type of my foreign key column did not match the type of the primary key on the referenced table. Little did I know that, since Rails 5.1, Active Record generates the id field not as an INT, but as a BIGINT.
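A quick way to see both column types side by side on MySQL (a hypothetical check against the sketched schema above, using the standard information_schema):

SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND ((TABLE_NAME = 'artists' AND COLUMN_NAME = 'id')
    OR (TABLE_NAME = 'songs' AND COLUMN_NAME = 'artist_id'));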

I went back and looked at PostgreSQL and, to my surprise (up to now I am still not sure exactly why), it allowed the column type mismatch where MySQL threw an error; presumably PostgreSQL only requires the two integer types to be comparable, while InnoDB requires a foreign key column to match the referenced column’s exact integer type.

To fix it, I had to change the migration as follows:
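A sketch of the adjusted songs migration:

class CreateSongs < ActiveRecord::Migration[5.1]
  def change
    create_table :songs do |t|
      t.string :title
      # limit: 8 requests an 8-byte integer, i.e. a BIGINT,
      # matching the BIGINT id Active Record generates on artists.
      t.integer :artist_id, limit: 8
      t.timestamps
    end

    add_foreign_key :songs, :artists
  end
end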

Digging online, I found out how to create a BIGINT field with Active Record: t.integer :artist_id, limit: 8. According to the post where I found it, this would only work on MySQL, but I found it also worked with PostgreSQL (I tested MySQL 5.7 and Postgres 9.6).

The limit option sets a maximum length for string types, or the number of bytes for numeric types; limit: 8 therefore asks for an 8-byte integer, which is a BIGINT.

Why type matching is important

Think of an INT as an espresso cup and a BIGINT as a Starbucks Venti: the content of the espresso cup fits comfortably in the Venti, but the full content of a Venti will never fit in the espresso cup.

In the specific domain I was working on, if I had a big list of artists and one of them happened to have an ID higher than 2,147,483,647 (the signed INT maximum on both PostgreSQL and MySQL), I would get an error when inserting its songs, since an artist id is stored in 8 bytes and can go up to 9,223,372,036,854,775,807.

Example:

Queen has 21474836481 as its artist id (a value that only fits in a BIGINT).

Trying to insert “We Will Rock You” with that value in the artist_id column of songs:
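In plain SQL (schema from the sketch above), the statement boils down to:

INSERT INTO songs (title, artist_id)
VALUES ('We Will Rock You', 21474836481);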

We get:

********** Error **********

ERROR: integer out of range
SQL state: 22003

This is the kind of problem we usually don’t notice in the beginning; more often than not it only shows up after the application has been in production for years. But it can and will happen if we don’t pay attention to foreign key types.

After that change, all the migrations ran smoothly, and I could move on to the next problem (and post): filtering by song title or artist name.

From MySQL 8.0.0 to MySQL 8.0.1 – or any other dev milestone

Disclaimer: this post is aimed at you, the curious developer, sysadmin, technologist, whatever title you use. DO NOT run the following lines in production, nor even in a stable environment; only do this if you don’t care about the outcome of the current data.

If you want to keep up with the newest MySQL developer milestones, I have news for you: there is no upgrade path between milestone versions. The way to go is to remove the old version and install the new one. According to the MySQL website:

Upgrades between milestone releases (or from a milestone release to a GA release) are not supported. For example, upgrading from 8.0.0 to 8.0.1 is not supported, as neither are GA status releases.

So if you, like me, had version 8.0.0 and want to test 8.0.1 (although the 8.0.3 milestone is already in development), you need to do something like the following (tutorial based on Debian/Ubuntu servers).

Stop your service:

$ sudo service mysql stop

Download Oracle’s repository configuration package and install it. This was the current version as of writing; you can get the newest package here:

$ wget https://dev.mysql.com/get/mysql-apt-config_0.8.6-1_all.deb
$ sudo dpkg -i mysql-apt-config_0.8.6-1_all.deb

Remove your old install; you will lose all the data. Be careful, backing up is on you!
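If any of the data matters, a quick dump first is cheap insurance (adjust the user and file name to your setup):

$ mysqldump -u root -p --all-databases --routines --events > backup-pre-8.0.1.sql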

$ sudo apt-get remove --purge mysql-server mysql-client mysql-common
$ sudo apt autoremove
$ sudo apt-get autoclean
$ sudo apt-get install mysql-server

This is the way to go to test new features such as Descending Indexes. Remember, the default character set was changed from latin1 to utf8mb4.
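For instance, a quick sanity check of descending indexes (the table here is made up; before 8.0, DESC in an index definition was parsed but silently ignored):

CREATE TABLE rankings (
  player_id INT,
  score INT,
  INDEX idx_score_desc (score DESC)
);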

The complete list of changes is available in the MySQL 8.0.1 release notes.

Converting comma-separated fields to MySQL JSON – a case study

This post is a case study of a job I had to do in a legacy application; it won’t necessarily apply to your case, but it might.

There are many ways to store a tree in a relational database. The one shown here is far from the best option, but it is still common to see it in the wild.

One of them is called a materialized path: each row stores the path of its ancestors as values separated by a delimiter, in my case a comma (,).

You would have a tree stored like this:

id  parent_id  user_id  depth  asc_path   node
 1          0        1      1
 2          1        2      2  ,1,        L
 3          1       13      2  ,1,        R
 4          2        3      3  ,2,1,      L
 5          2       61      3  ,2,1,      R
 6         13       23      3  ,13,1,     L
 7         13       22      3  ,13,1,     R
 8          3        4      4  ,3,2,1,    L
 9          3      156      4  ,3,2,1,    R
10         22     1568      4  ,22,13,1,  L
11         22       26      4  ,22,13,1,  R
12         23     1476      4  ,23,13,1,  L
13         23   690716      4  ,23,13,1,  R
14         61     1051      4  ,61,2,1,   L
15         61       62      4  ,61,2,1,   R

The column asc_path holds the ascending path of the node, that is, the ids of its ancestors up to the root. In this example every node has two children, but a tree stored this way does not have to be binary.

The column has commas at the beginning and at the end because of how the queries check whether an element is present in a path: LIKE "%,id,%". Without the surrounding commas, a search for node 2 would also match 23, 62 and any other number containing the digit 2.

Performance

The only way to make this a bit faster is to create a FULLTEXT index on asc_path. A BTREE index indexes from the beginning of the string, so the leading % wildcard in the search pattern makes it impossible to use one.
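Creating one is a single statement, with a caveat: by default InnoDB only indexes tokens of at least innodb_ft_min_token_size (3) characters, so one- and two-digit ids would not make it into the index:

ALTER TABLE tree ADD FULLTEXT INDEX ft_asc_path (asc_path);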

This is the graphical representation of the example above:

[Tree diagram of the nodes in the table]

Searching

To search for a specific element, the query would be:

SELECT
parent_id,
user_id,
depth,
asc_path,
node
FROM tree
WHERE asc_path LIKE '%,13,%';

Result:

parent_id  user_id  depth  asc_path   node
       13       23      3  ,13,1,     L
       13       22      3  ,13,1,     R
       22     1568      4  ,22,13,1,  L
       22       26      4  ,22,13,1,  R
       23     1476      4  ,23,13,1,  L
       23   690716      4  ,23,13,1,  R

Converting to a JSON array

Some databases, like PostgreSQL (section 9.42 of its manual), have more helper functions to convert strings to JSON. In my case, I wanted to store the ascending tree path in a JSON field, which would let me use JSON_CONTAINS(json_doc, val) to find the records that have a given node in their path.

To do that, I had to transform the string into a JSON array.

1st step: remove the leading and trailing commas

We can strip the commas with TRIM, but before any UPDATE, let’s test what we are doing:

SELECT
parent_id,
user_id,
depth,
asc_path,
TRIM(BOTH ',' FROM asc_path) AS trimmed_commas
FROM tree;

Results:

parent_id  user_id  depth  asc_path   trimmed_commas
        0        1      1
        1        2      2  ,1,        1
        1       13      2  ,1,        1
        2        3      3  ,2,1,      2,1
        2       61      3  ,2,1,      2,1
       13       23      3  ,13,1,     13,1
       13       22      3  ,13,1,     13,1
        3        4      4  ,3,2,1,    3,2,1
        3      156      4  ,3,2,1,    3,2,1
       22     1568      4  ,22,13,1,  22,13,1

2nd step: add brackets to the string

A JSON array is wrapped in brackets ([]), and our string needs them to be a valid JSON document:

SELECT
parent_id,
user_id,
depth,
asc_path,
TRIM(BOTH ',' FROM asc_path) AS trimmed_commas,
CONCAT("[", TRIM(BOTH ',' FROM asc_path), "]") AS added_brackets
FROM tree;

Results:

parent_id  user_id  depth  asc_path   trimmed_commas  added_brackets
        0        1      1
        1        2      2  ,1,        1               [1]
        1       13      2  ,1,        1               [1]
        2        3      3  ,2,1,      2,1             [2,1]
        2       61      3  ,2,1,      2,1             [2,1]
       13       23      3  ,13,1,     13,1            [13,1]
       13       22      3  ,13,1,     13,1            [13,1]
        3        4      4  ,3,2,1,    3,2,1           [3,2,1]
        3      156      4  ,3,2,1,    3,2,1           [3,2,1]
       22     1568      4  ,22,13,1,  22,13,1         [22,13,1]

3rd step: validate that the changes work

Let’s use JSON_VALID() to see whether MySQL accepts our new string as JSON. Keep in mind that when the argument is NULL, the return value is also NULL:

SELECT
parent_id,
user_id,
depth,
asc_path,
TRIM(BOTH ',' FROM asc_path) AS trimmed_commas,
CONCAT("[", TRIM(BOTH ',' FROM asc_path), "]") AS added_brackets,
JSON_VALID(CONCAT("[", TRIM(BOTH ',' FROM asc_path), "]")) AS json_valid
FROM tree;

Results:

parent_id  user_id  depth  asc_path   trimmed_commas  added_brackets  json_valid
        0        1      1
        1        2      2  ,1,        1               [1]             1
        1       13      2  ,1,        1               [1]             1
        2        3      3  ,2,1,      2,1             [2,1]           1
        2       61      3  ,2,1,      2,1             [2,1]           1
       13       23      3  ,13,1,     13,1            [13,1]          1
       13       22      3  ,13,1,     13,1            [13,1]          1
        3        4      4  ,3,2,1,    3,2,1           [3,2,1]         1
        3      156      4  ,3,2,1,    3,2,1           [3,2,1]         1
       22     1568      4  ,22,13,1,  22,13,1         [22,13,1]       1
       22       26      4  ,22,13,1,  22,13,1         [22,13,1]       1
       23     1476      4  ,23,13,1,  23,13,1         [23,13,1]       1
       23   690716      4  ,23,13,1,  23,13,1         [23,13,1]       1
       61     1051      4  ,61,2,1,   61,2,1          [61,2,1]        1
       61       62      4  ,61,2,1,   61,2,1          [61,2,1]        1

Replacing the 1st and 2nd steps with a function

To keep the query from getting messy, you can wrap those two steps in a function; I decided to create to_json_array(input_string, delimiter_char).
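A minimal version just combines the TRIM and CONCAT from the previous steps and casts the result to the native JSON type (the cast is also why the arrays below are printed with spaces, as [22, 13, 1]):

DELIMITER $$

CREATE FUNCTION to_json_array(input_string TEXT, delimiter_char VARCHAR(1))
RETURNS JSON DETERMINISTIC
BEGIN
  -- Strip the leading/trailing delimiters, wrap the list in brackets
  -- and cast the result to JSON. NULL input yields NULL.
  RETURN CAST(
    CONCAT('[', TRIM(BOTH delimiter_char FROM input_string), ']')
    AS JSON
  );
END$$

DELIMITER ;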

Running the query using only to_json_array on MySQL:

SELECT
parent_id,
user_id,
depth,
asc_path,
to_json_array(asc_path, ',') AS to_json_array,
JSON_VALID(to_json_array(asc_path, ',')) AS is_to_json_array_valid,
node
FROM tree;

Result:

parent_id  user_id  depth  asc_path   to_json_array  is_to_json_array_valid  node
        0        1      1
        1        2      2  ,1,        [1]            1                       L
        1       13      2  ,1,        [1]            1                       R
        2        3      3  ,2,1,      [2, 1]         1                       L
        2       61      3  ,2,1,      [2, 1]         1                       R
       13       23      3  ,13,1,     [13, 1]        1                       L
       13       22      3  ,13,1,     [13, 1]        1                       R
        3        4      4  ,3,2,1,    [3, 2, 1]      1                       L
        3      156      4  ,3,2,1,    [3, 2, 1]      1                       R
       22     1568      4  ,22,13,1,  [22, 13, 1]    1                       L

Disclaimer

This function is not native, and its use in production is not guaranteed.

Notice that the database reports the JSON as valid, which makes it possible to convert that TEXT into a new asc_path_json column:

ALTER TABLE tree
ADD COLUMN asc_path_json JSON
AFTER asc_path;

UPDATE tree
SET asc_path_json = to_json_array(asc_path, ',');

This gives us a cleaner way to check whether an item is in the path of a node:

SELECT *
FROM tree
WHERE JSON_CONTAINS(asc_path_json, "13");

Result:

id  parent_id  user_id  depth  asc_path   asc_path_json  node
 6         13       23      3  ,13,1,     [13, 1]        L
 7         13       22      3  ,13,1,     [13, 1]        R
10         22     1568      4  ,22,13,1,  [22, 13, 1]    L
11         22       26      4  ,22,13,1,  [22, 13, 1]    R
12         23     1476      4  ,23,13,1,  [23, 13, 1]    L
13         23   690716      4  ,23,13,1,  [23, 13, 1]    R