Tag: ruby

Ramblings on optimizations, anti patterns and N+1

A lot of people ask me to teach them how to do query analysis and performance tuning. The truth is: there isn’t a script to follow. The following paragraphs are a brain dump of what usually goes through my mind when I am debugging and analyzing.

Please comment on what you think I should focus on to cover here.

TL;DR

  • It’s just a messy post with database-y stuff
  • This post doesn’t have a conclusion, it is just me laying down my thoughts on performance and optimizations.

Thoughts

Query performance is a really difficult subject to talk about, mostly because SQL is a declarative language: it is up to the Optimizer to decide the best way to retrieve the information needed, and that decision is based on many variables.

The most common optimization problem I see comes not from the database itself, but from how we handle requests in the application layer. The following, for instance, would cause N+1 problems:

Code example:

Although seemingly innocent at first, this code could easily slow down performance on the database due to the amount of requests that would be made.
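A minimal sketch of the pattern, in plain Ruby rather than a real ORM: FakeDB and its methods are hypothetical stand-ins that only count how many queries would be sent. The first method asks for the author of each post one at a time (1 query for the posts plus N for the authors); the second fetches all authors in a single batched query.

```ruby
# Hypothetical in-memory stand-in for a database that counts "queries".
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
    @users = { 1 => "Ana", 2 => "Bo", 3 => "Cy" }
    @posts = [
      { id: 10, user_id: 1, title: "Indexes" },
      { id: 11, user_id: 2, title: "Joins" },
      { id: 12, user_id: 3, title: "Locks" }
    ]
  end

  # Stands for: SELECT * FROM posts
  def all_posts
    @query_count += 1
    @posts
  end

  # Stands for: SELECT * FROM users WHERE id = ?
  def find_user(id)
    @query_count += 1
    @users.fetch(id)
  end

  # Stands for: SELECT * FROM users WHERE id IN (...)
  def find_users(ids)
    @query_count += 1
    @users.slice(*ids)
  end
end

# N+1: one query for the posts, then one more per post for its author.
def authors_n_plus_one(db)
  db.all_posts.map { |post| db.find_user(post[:user_id]) }
end

# Batched: one query for the posts, one for all the authors.
def authors_batched(db)
  posts = db.all_posts
  users = db.find_users(posts.map { |post| post[:user_id] }.uniq)
  posts.map { |post| users.fetch(post[:user_id]) }
end

slow = FakeDB.new
authors_n_plus_one(slow)
fast = FakeDB.new
authors_batched(fast)
puts "N+1: #{slow.query_count} queries, batched: #{fast.query_count} queries"
```

With three posts the difference is small, but the first count grows with every row while the second stays at two. In Rails, eager loading with includes is the usual way to get the batched behavior.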

You also need to know the intricacies of indexes. Which type is the best? If you have a composite index, which column should go first? What happens if I only use one of the fields of a two-column index in my search? Does it still use the index somehow? Another rule of thumb is that if an index is a BTREE on a single column, it can be used for either ASC or DESC ordering.
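One way to build intuition for the column-order question is to model a composite index as a list sorted by the first column and then by the second. The sketch below is only that, a model: the names and data are made up, and whether a real optimizer can still make some use of the index when only the second column is filtered depends on the database and version.

```ruby
# A composite index on (last_name, first_name), modeled as an array that is
# sorted by last_name first and first_name second. The data is made up.
INDEX = [
  ["Davila", "Gabriela"],
  ["Lopez", "Ana"],
  ["Lopez", "Zoe"],
  ["Silva", "Ana"]
].freeze

# Filtering on the leading column: the sort order allows a binary search
# to jump to the first match, then matching entries are read in sequence.
def by_last_name(last_name)
  start = INDEX.bsearch_index { |(last, _)| last >= last_name }
  return [] if start.nil?

  INDEX.drop(start).take_while { |(last, _)| last == last_name }
end

# Filtering on the second column only: matches are scattered through the
# list, so every entry has to be inspected.
def by_first_name(first_name)
  INDEX.select { |(_, first)| first == first_name }
end

p by_last_name("Lopez")
p by_first_name("Ana")
```

The same sort order explains why a search on both columns, or on the first one alone, is cheap, while a search on the second one alone is not.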

Or better yet: why are my transactions taking so long to complete? Does the table have too many indexes? Is any other query locking table X?

Even a single INNER JOIN can be very costly when it joins two large tables.

Why are you saving that JSON in a TEXT field? Since we are on the subject, do you really need the JSON in the relational database and not in a document store?

You don’t need to port all your data from PostgreSQL/MySQL to MongoDB if you want to have MongoDB on your stack. Everything has its place: relational data in relational databases and non-relational data in non-relational databases. I even find benchmarks between a SQL database and a NoSQL one unfair. They were made to solve different problems, so you can’t possibly have the same use case for both of them.

No, it’s not ok to have category_1, category_2, ..., category_n as columns on your products table.

Avoid nullable fields as much as possible.

Relationships should also live explicitly in the RDBMS, not only in your model. If you have a user_id on your addresses table, tell the database so: naming the column user_id doesn’t automatically create the foreign key.

You need:

Or your migration should look something like this:

Line 24: adds a foreign key referencing users to the addresses table.
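As a rough sketch of the statement involved, the helper below builds the ALTER TABLE that declares addresses.user_id as a foreign key to users.id. The helper itself and the constraint name are hypothetical and exist only for illustration; in a Rails migration the equivalent call is add_foreign_key :addresses, :users.

```ruby
# Builds the SQL that declares a foreign key from one table to another.
# The default column name assumes the Rails convention (users -> user_id);
# the constraint name is a made-up convention for this sketch.
def add_foreign_key_sql(from_table, to_table, column: "#{to_table.chomp('s')}_id")
  <<~SQL.strip
    ALTER TABLE #{from_table}
      ADD CONSTRAINT fk_#{from_table}_#{to_table}
      FOREIGN KEY (#{column}) REFERENCES #{to_table} (id);
  SQL
end

puts add_foreign_key_sql("addresses", "users")
```

Once the constraint exists, the database itself rejects an address that points to a user that is not there, regardless of what the application code does.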

End

And you, what do you think is missing in this blog post? What do you want to go deeper on?

Congress, who is? – A Civic Tech project


A while ago I had this idea for a project: To show how representatives voted, either for or against, on bills.

People elect representatives but often forget to follow what they are up to. I asked around: who is your representative? The most common response: I don’t know. If people don’t even know who their representatives are, how are they going to contact their House or Senate member when they want to be heard?

That’s when Congress, who is? was born, out of a two-week project that I poured myself into, working with the ProPublica Congress API, the Twilio API and a bit of the Twitter API (those pictures must come from somewhere!).

People are able to search by zip code to find their representative, or filter by State/Territory, Party, House or Name. Once on a member’s profile, you can call the member’s office directly from your browser.

Some images from https://www.congresswhois.com

The USA map is rendered showing a simple majority of the representation of the House. On click the listing of representatives is shown on the right.

It’s also possible to compare statistics from one politician to another: see how often each votes with their party and how often they vote the same way as each other.


Features to come

  • Show Congresses beyond the current one. At this moment the Congress number is 115, and the API can show me members from the 102nd to the 115th for the House and from the 80th to the 115th for the Senate.
  • Show bills and votes
  • Add full text search
  • More to be defined

Code

The code will be released under the MIT license. There is some cleaning up to do, and I want to open source it with a few issues already opened and documented. As I said, the app was developed in two weeks, but it grew on me and I want to take it a step further.

Stack

Backend:

  • Ruby on Rails
  • PostgreSQL

Frontend:

  • React
  • Redux
  • Semantic UI

Contributions

Right now the code is running in a “closed” beta. If you can’t wait and want to help, DM me on Twitter (no need to follow back, DMs are open on my end), use this website’s contact form, or simply mail me at gabriela.io.

Thank you

I want to give a special thanks to Twilio. During this year’s PHPWorld they hosted a competition to showcase projects using Twilio. I showcased this project and they awarded it an awesome amount of credits, enough for us to keep it running for a while. So thank you for the support!

Disclaimers

Calls only work on Chrome, Firefox and Safari for Desktop. The client call doesn’t work on mobile, Internet Explorer or Opera. It’s more of a technical limitation in how each browser implements its JavaScript than something at the application level.

The data displayed may be incorrect. That is because it is synced daily with the ProPublica API: whatever they have on record is what I am showing.

Transferring ownership of repositories on GitHub

For the past couple of months, I’ve been studying. As a side effect, my GitHub account got cluttered with experimental code. I didn’t exactly want to trash it: I wanted to keep the code, just not under my profile.

The solution I found was to create an organization and transfer my desired repositories there. I thought this was a great solution, there was just one little detail I was missing: the current GitHub API does not support repository ownership transfer. Which meant I would have to go to each repository, click on “Settings”, click on “Transfer”, fill in the “Repository Name” and then put the “Destination User”. A lot of steps for someone looking to move over 250 repositories.

The first thing that came to my mind was to use Selenium to automate this task. But my lack of exposure to the technology made me think a bit outside the box. One of the things I learned these past months was Capybara. Capybara is used alongside RSpec, extending the test suite DSL. It mimics user interaction with the browser and comes with a Selenium driver out of the box.

In other words, you can create a bot that opens the browser, fills in forms and submits them. This is exactly what I was looking for. As I stated before, Capybara uses RSpec, so my code would actually have to be wrapped inside a test.

Caveats

  • You need to disable two-factor authentication on GitHub, and turn it back on after the script finishes running.
  • It does not transfer private repositories.
  • You need Mozilla’s geckodriver installed. If you use macOS, you can use brew to install it.

The Script

This was developed as a hack. Use at your own risk. You can download it at gabidavila/github-move-repositories. As of now this code gets all public repositories, unless the variable ONLY_FORKS is set to TRUE, and moves them to a destination user. Do not push your .env file. If you want to move only specific repositories, you will need to edit the code yourself. For more information, see the project’s README.md, which is kept up to date.
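To illustrate the selection rule described above, here is a small sketch in plain Ruby. It is not taken from the script: the repository records, the repositories_to_move helper and the exact way ONLY_FORKS is compared are assumptions made for the example.

```ruby
# Hypothetical repository records; the real script gets them from GitHub.
REPOSITORIES = [
  { name: "blog", private: false, fork: false },
  { name: "rails", private: false, fork: true },
  { name: "secrets", private: true, fork: false }
].freeze

# Keeps public repositories only; when only_forks is true, keeps just forks.
def repositories_to_move(repositories, only_forks:)
  repositories
    .reject { |repo| repo[:private] }
    .select { |repo| !only_forks || repo[:fork] }
    .map { |repo| repo[:name] }
end

# Assumption: the variable is compared against the literal string "TRUE".
only_forks = ENV["ONLY_FORKS"] == "TRUE"
puts repositories_to_move(REPOSITORIES, only_forks: only_forks)
```

Moving only specific repositories would mean adding one more filter to this chain, for example on the repository name.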

Contributions are welcome if you feel you can help improve the tool. For example: add options of which repositories to move.

Enjoy!