Understanding Generated Columns

The Theory

Generated Columns is a feature released on MySQL 5.7. They can be used during CREATE TABLE or ALTER TABLE statements. It is a way of storing data without actually sending it through the INSERT or UPDATE clauses in SQL. The database resolves what the data will be.

There are two types of Generated Columns: Virtual and Stored. They work with:

  • mathematical expressions (product_price * quantity)
  • built-in functions (RIGHT(), CONCAT(), FROM_UNIXTIME(), JSON_EXTRACT())
  • literals (“2”, “new”, 0)

Besides that, they can be indexed but they don’t allow subqueries in it.
A Generated Column works within the table domain. If you need subqueries on a particular column, you may have to look at Views.

The basic example

As an example I am going to use an e-commerce database as based on my past experience of what I have seen and worked. You will probably have at least these tables or something similar:

  • users – stores user info
  • products – stores product info like price and description
  • orders – stores the user_id, date of order
  • orders_items – stores product_id, order_id, quantity and price at the time of purchase

This is the whole DB: Gist.

Notice the order_items definition:

The retrieval would bring:

id order_id product_id product_price quantity
1 369 1304 202.18 7
2 4 1736 250.40 3
3 270 1404 29.89 5
4 256 179 190.40 10
5 107 1911 146.98 1

One example is to get the total of that order_item row, something like total_item_price that would store the value of product_price * quantity to show how much the summed amount of an item would be. Some databases have the MONEY type to store price, as with MySQL it is recommended to work with DECIMAL.

People solve this problem in different ways:

  • store the calculated price on a new column to make it easier to retrieve;
  • create a view;
  • or they calculate in the application itself, which in this case might cause problems due to how the language handles floats. There are libraries to deal with money values in a lot of languages and frameworks, however, the overhead of converting each row into a money object could be costly depending on the amount of data being transferred.

Another way I’ve seen is: people calculate in the query the total amount for the orders_items row as product_price * quantity:

id order_id product_id product_price quantity total_item_price
1 369 1304 202.18 7 1415.26
2 4 1736 250.40 3 751.20
3 270 1404 29.89 5 149.45
4 256 179 190.40 10 1904.00
5 107 1911 146.98 1 146.98

Virtual Columns

  • They take no disk space, except when using a Virtual Column as in a Secondary Index.
  • They are an INPLACE operation: it means the table definition is changed without having to recopy all the data again. More info.
  • The values are calculated on the fly during read operations and BEFORE triggers.

Consider using virtual columns for data where changes happens in a significant number of times. The cost of a Virtual Column comes from reading a table constantly and the server has to compute every time what that column value will be.

Stored Columns

  • They do use disk space.
  • It has the same cost of adding a new column, so it is a COPY operation
  • Values are updated in every INSERT and UPDATE statement.

You should consider using Stored Columns for when the data doesn’t change significantly or at all after creation, like for instance, the example above with the orders_items table. Once a purchase is made, the price of the product is stored, not being changed, neither the quantity. Considering this information we could create total_item_price as a Stored Column.

The code

Creating a table

-- Virtual Column

CREATE TABLE `orders_items` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`order_id` int(10) unsigned NOT NULL,
`product_id` int(10) unsigned NOT NULL,
`product_price` decimal(10,2) unsigned NOT NULL DEFAULT '0.00',
`quantity` int(10) unsigned NOT NULL DEFAULT 1,
`total_item_price` decimal(10,2) AS (`quantity` * `product_price`),

-- Stored Column

CREATE TABLE `orders_items` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`order_id` int(10) unsigned NOT NULL,
`product_id` int(10) unsigned NOT NULL,
`product_price` decimal(10,2) unsigned NOT NULL DEFAULT '0.00',
`quantity` int(10) unsigned NOT NULL DEFAULT 1,
`total_item_price` decimal(10,2) AS (`quantity` * `product_price`) STORED,

Notice how the definition changes on line 9 and 23: you have another keyword, AS, then an expression and specifically on line 23 you see a STORED keyword. In both lines they are generated columns, if nothing is specified will be a VIRTUAL column.

Altering a table

It uses the same syntax as adding a column, just adding the “AS (expression)” after the data type:

-- `full_name` as VIRTUAL COLUMN
ADD COLUMN `full_name` VARCHAR(500)
AS (CONCAT_WS(" ", `first_name`, `last_name`));

-- `total_item_price` as STORED COLUMN
ALTER TABLE orders_items
ADD COLUMN `total_item_price` DECIMAL(10, 2)
AS (`quantity` * `product_price`) STORED;

JSON fields

It is also possible to extract data from JSON fields using generated columns. As the functions for JSON are built-in, JSON_EXTRACT and JSON_UNQUOTE as well “->” and “->>” work as expressions for a generated column:

-- Stored Columns
ALTER TABLE `twitter_users`
ADD COLUMN `location` VARCHAR(255)
AS (response->>"$.location") STORED;

Final considerations

When the type is STORED, it must be specified after the expression otherwise the default behaviour will be to be VIRTUAL.

Generated columns can have indexes created as the following, no matter if stored, virtual or extracted from a JSON field:

ADD INDEX `ix_full_name` (`full_name`);

Which is the same syntax for normal columns.

Roles on MySQL 8.0

This is a blogpost about MySQL 8.0.0, the content here could be subject to changes in the future since this is not a stable release.

This is a feature that will make life the many DevOps and DBAs easier. Believe it or not, a lot of people control access to the database by sharing the same username and password. Which is completely insecure for a number of reasons:

  • If someone is fired, they still have access
  • If you get hacked for using the same password, well, I only can say: good luck

That also means: no more querying to Stack Overflow to get that giant GRANT statement for each user you need to create. (Thank you @mwop for reminding me of this).

Creating a Role

This is a group of privileges that will be assigned to users:

CREATE ROLE 'admin';

You can also create more than one role at once, however if one of them is already created, the whole statement fails and it won’t be created:

CREATE ROLE 'dba', 'developer', 'readonly';


mysql> CREATE ROLE 'dba', 'developer', 'readonly';
Query OK, 0 rows affected (0.01 sec)

If you try to run:

mysql> CREATE ROLE 'dba';
ERROR 1396 (HY000): Operation CREATE USER failed for 'dba'@'%'

It causes an error because the role already exists.

Granting privileges

Notice you didn’t state the privileges at this point, you need use GRANT to do so, and it works the same way as granting privileges to a user:

For each of the responses you will get something similar to this:

Query OK, 0 rows affected (0.00 sec)

Creating Users

In my example I will add Lisa Simpson as ‘dba’, Millhouse V. Houten as ‘developer’ and Bart Simpson and Homer Simpson as ‘readonly’.

So this is the proposed idea:

lisa_simpson dba
millhouse_houten developer
homer_simpson readonly
bart_simpson readonly

The generated SQL will be:

After the user is created is time to tell which roles they are using, a role can have more than one user and a user can have more than one role. To attribute users to a role, do the following:

Granting the roles doesn’t mean they will start automatically using them. Remember, a user can have multiple roles, so you need to tell the server which role the user is using:

This way, the default role for that user on that host will be the one used, as shown at the example below:

mysql> SHOW GRANTS FOR millhouse_houten@'localhost';
2 rows in set (0.00 sec)


| Grants for millhouse_houten@localhost                   |
| GRANT USAGE ON *.* TO `millhouse_houten`@`localhost`    |
| GRANT `developer`@`%` TO `millhouse_houten`@`localhost` |


This way is simpler to change user permissions in batch, not having to go user by user. If the user is using a different role for any reason, it must be informed before any SQL manipulation.

Source: MySQL 8.0.0 manual.


Previously I had use FLUSH PRIVILEGES when granting permissions to users, this however is not needed because I didn’t manipulate the table users using INSERT, UPDATE, DELETE. In this case I manipulated the creation and grating using the specific commands for it.

Thank you, Paul DuBois, for the reminder.

MySQL 8.0 (dev): what to look for

MySQL 8.0 (dev): what to look for

This is an unstable release, please don’t use in production.

It was rumored that the new MySQL version would be 8 and not as 5.8 as a lot of people thought, and it appears the rumors were true.

Below are some of the features that caught my eye at first glance:


Although password expiration was implemented 5.7, the newer version bring a set of collective privileges as a Role. No need to have to copy paste that massive GRANT command you had when creating new users.

UTF-8 as default Charset

This is not yet the default charset coming with the server, but utf8mb4 will be the main charset instead of latin1, and the default collation will change from latin1_swedish_ci to utf8mb4_800_ci_ai. The plan is to do that before General Availability.

Invisible Indexes

Giving an index already exists, you can make it active or inactive. It is a toggling feature which enables the debugging work to see if an index really can be dropped (if it is not being used). This is for the search only, on write operations the index is still maintained.

IPv6 and UUID Manipulation

MySQL do not support those fields natively, however, it is recommended to store those items with the VARBINARY(16) type. MySQL now provides functions to manipulate textual representations of IPv6/UUID and to use bit-wise operations, to test, extract or compare.

Having those function built in, you can use a generated column to index that data.


A more comprehensive list can be found at the MySQL Server Blog. It is worth the read.

™MySQL is a trademark of Oracle.