TIL September 13, 2019 by Igor Alexandrov

How to parse CSV with double quote (") character in Crystal

We use a microservice written in Crystal to parse large CSV files (about 1.5 GB). Some rows in these files may contain unclosed " characters:


With Crystal's default CSV parser settings, this row and everything after it won't be parsed correctly, because the DEFAULT_QUOTE_CHAR constant is equal to ". Of course, you can override the quote_char parameter in the CSV constructor with a character that cannot occur in your document.

In my view, the best choice is the zero byte, which is '\u0000' in Crystal.

csv = CSV.new(file, headers: true, strip: true, quote_char: '\u0000')

while csv.next
  # ...
end
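The same idea carries over to Ruby's standard CSV library. The sketch below is not a port of the Crystal code above; it assumes Ruby's CSV accepts the zero byte ("\x00") as quote_char, and the sample rows are made up for illustration.

```ruby
require 'csv'

# Made-up sample: the second line contains an unclosed double quote.
data = <<~ROWS
  id,name
  1,Monitor 24" wide
  2,Keyboard
ROWS

# With the default quote_char ('"') the stray quote makes parsing fail.
default_failed =
  begin
    CSV.parse(data, headers: true)
    false
  rescue CSV::MalformedCSVError
    true
  end

# A quote_char that never occurs in the data turns '"' into ordinary text.
rows  = CSV.parse(data, headers: true, quote_char: "\x00")
names = rows.map { |row| row['name'] }

puts default_failed
puts names.inspect
```

The trade-off is the same as in Crystal: quoted fields are no longer recognized at all, so this only suits files that never rely on quoting.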



TIL September 02, 2019 by Alexander Spitsyn

Determining class of an object with case equality operator (===)

When called on a class or module, the case equality operator (triple equals, ===) in Ruby returns true if that class is in the ancestors list of the argument's class:

1.class.ancestors # [Integer, Numeric, Comparable, Object, ...]

Numeric === 1 # true

Object === 1 # true

So it can be used for determining object's class:

String === 'abc' # true

'abc'.class #=> String

In cases above the case equality operator works like #kind_of? (or #is_a?):

1.kind_of?(Integer) # true

1.is_a?(Numeric) # true

Each class can define its own implementation of ===, and the operator is simply a method called on its left operand, so the result depends on which object is the receiver:

String.===('abc') # the same as String === 'abc'

This also means that the order of the operands is important:

1 === Integer # false
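Ruby's case/when uses === under the hood, calling it on each when clause with the case subject as the argument. A minimal sketch of class-based dispatch follows; the describe method is a hypothetical helper, not part of the original post.

```ruby
# Hypothetical helper: picks a label by testing `Klass === value` for each branch.
def describe(value)
  case value
  when Integer then 'integer'        # Integer === value
  when Numeric then 'other number'   # Numeric === value
  when String  then 'string'         # String === value
  else 'something else'
  end
end

puts describe(1)
puts describe(1.5)
puts describe('abc')
puts describe(nil)
```

Branch order matters here: Integer has to come before Numeric, because Numeric === 1 is also true.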

TIL August 28, 2019 by Ilia Kriachkov

Ruby double splat (**) operator cheatsheet

The ** operator is useful for collecting keyword arguments into an options hash.

def one_method(**options);end

This form behaves much like the following:

def another_method(options = {});end

In addition, you can strictly define the set of required keys for the method.

def one_strict_method(first_name:, last_name:, **options)
  puts "options: #{options}"
  greeting = "Hello #{first_name} #{last_name}"
  puts options[:upcase] ? greeting.upcase : greeting
end

pry(main)> one_strict_method(upcase: true)

ArgumentError: missing keywords: first_name, last_name

pry(main)> one_strict_method(first_name: 'John', last_name: 'Doe', upcase: true)

options: {:upcase=>true}

HELLO JOHN DOE

=> nil

Another advantage of the double splat is that, inside a hash literal, it works like Hash#merge:

class Contact::ShowRepresenter #:nodoc:
  def call(contact)
    {
      contact: {
        **base_info(contact),
        **legal_info(contact)
        # You can add something more complex here.
        # **GeoLocaionRepresenter.new.(contact)
      }
    }
  end

  def base_info(contact)
    {
      id: contact.id,
      first_name: contact.first_name,
      last_name: contact.last_name,
      email: contact.email,
      phone: contact.phone
    }
  end

  def legal_info(contact)
    {
      legal_name: contact.legal_name,
      legal_type: contact.legal_type,
      mailing_address: contact.mailing_address
    }
  end
end

=> {
  :contact=>{
    ...
    :mailing_address=>"15 Let Oktyabrya Street, #10b, Tver, Russian Federation 170008"
  }
}
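To make the comparison with Hash#merge concrete, here is a small sketch of how key collisions are resolved. The defaults and overrides hashes are made-up sample data; in both forms the value that appears later wins.

```ruby
# Made-up sample hashes that share the :role key.
defaults  = { role: 'guest', active: true }
overrides = { role: 'admin' }

with_splat = { **defaults, **overrides }   # later splat wins on duplicate keys
with_merge = defaults.merge(overrides)     # argument wins on duplicate keys

puts with_splat.inspect
puts with_splat == with_merge
```

Neither form mutates the original hashes, so defaults still holds role: 'guest' afterwards.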



In conclusion, I want to demonstrate some benchmark results.

In the run below, the ** operator is about 2.4 times faster than Hash#merge (0.084 s versus 0.204 s of real time).

require 'benchmark'

n = 50_000

# The methods are defined before the benchmark so that they exist when it runs.
def merge
  hash = { a: 'a' }
  { b: 'b' }.merge(hash)
end

def double_splat_merge
  hash = { a: 'a' }
  { b: 'b', **hash }
end

Benchmark.bm(2) do |x|
  x.report('merge:             ') { n.times { merge } }
  x.report('double_splat_merge:') { n.times { double_splat_merge } }
end


                     user      system      total        real

merge:               0.109247   0.088652   0.197899 (  0.204470)

double_splat_merge:  0.079480   0.003590   0.083070 (  0.083642)

TIL August 26, 2019 by Dmitry Voronov

How to store large JSON in PostgreSQL with Rails Attributes API

If you store large objects (such as JSON) in the database, for example data for big reports, they can take up a lot of space. To reduce the size of the data, you can compress it and store it in binary form.

PostgreSQL has a bytea field type for storing such data. You can add a bytea column in Rails with a migration:

add_column :reports, :data, :binary

To work with the binary field, you can use the Rails Attributes API and add a new BinaryHash data type:

# app/types/binary_hash.rb
class BinaryHash < ActiveRecord::Type::Binary
  def serialize(value)
    super value_to_binary(value.to_json)
  end

  def deserialize(value)
    super case value
          when NilClass
            # ...
          when ActiveModel::Type::Binary::Data
            # ...
          else
            # ...
          end
  end

  def value_to_hash(value)
    JSON.parse(
      # ...
      symbolize_names: true
    ) || {}
  end

  def value_to_binary(value)
    # ...
  end
end

Register the new type in an initializer:

# config/initializers/types.rb

ActiveRecord::Type.register(:binary_hash, BinaryHash)

And declare the attribute with the new type in the model:

# app/models/snapshot.rb

class Reports < ApplicationRecord
  attribute :data, :binary_hash
end


Tests show that the data size is reduced almost threefold:

Run time with 100000 width JSON

                           user     system      total        real

Compress JSON          0.008671   0.001535   0.010206 (  0.010885)

Decompress JSON        0.001357   0.000095   0.001452 (  0.001509)

json size       95450 bytes

binary size   33868 bytes

~ 2.82 times compression
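The compress-then-store round trip can be tried outside Rails with the standard library alone. This is a rough sketch under an assumption: Zlib stands in for whatever compression the type above uses, and hash_to_binary, binary_to_hash and the sample report are hypothetical names and data.

```ruby
require 'json'
require 'zlib'

# Hypothetical helper: serialize a hash to JSON and compress it (Zlib is an assumption).
def hash_to_binary(hash)
  Zlib::Deflate.deflate(hash.to_json)
end

# Hypothetical helper: decompress and parse back into a hash with symbol keys.
def binary_to_hash(binary)
  return {} if binary.nil?

  JSON.parse(Zlib::Inflate.inflate(binary), symbolize_names: true)
end

# Made-up report data with a lot of repetition, which compresses well.
report   = { rows: Array.new(1_000) { |i| { id: i, name: "fund #{i}", value: i * 1.5 } } }
binary   = hash_to_binary(report)
restored = binary_to_hash(binary)

puts "json size:   #{report.to_json.bytesize} bytes"
puts "binary size: #{binary.bytesize} bytes"
puts restored == report
```

The ratio depends heavily on the data: repetitive report rows compress well, while short or already-compressed payloads may gain little.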

TIL August 26, 2019 by Dmitry Voronov

A simple way to distribute jobs in Sidekiq queues

With this approach, jobs of one context are executed sequentially in one queue, and jobs of different contexts are executed in parallel in different queues.

Let's look at the following example.

There are investment funds for which we want to make time-consuming reporting calculations. Jobs for the same fund are carried out sequentially so that there are no errors in the calculations, while jobs of different funds are performed in parallel.

We automate the distribution of jobs across queues so that the queue does not have to be specified manually.

Specify the queues in the sidekiq.yml configuration file:

:queues:
  - fund_processor_0
  - fund_processor_1
  - fund_processor_2
  - fund_processor_3
  - fund_processor_4

It is important that the queues are numbered from 0.

Now, when enqueuing a job, we choose the queue based on the fund: the remainder of dividing the fund ID by the number of queues gives the queue number.

# 5 queues

# fund ID % count of queues = queue number

# 1 % 5 => 1

# 2 % 5 => 2

# 3 % 5 => 3

# 4 % 5 => 4

# 5 % 5 => 0

def queue_name(fund_id)
  queue_number = fund_id % 5
  "fund_processor_#{queue_number}"
end



Enqueue the job, passing it the resulting queue name.

This can be done using the Sidekiq API:


Sidekiq::Client.push(
  'queue' => queue_name(fund_id),
  'class' => Fund::ReportCalculator,
  'args' => [fund_id]
)
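The mapping from fund ID to queue can be checked in plain Ruby without Sidekiq. In this sketch QUEUE_COUNT is a hypothetical constant that must match the number of fund_processor_* queues in sidekiq.yml.

```ruby
# Hypothetical constant: keep it in sync with the queues listed in sidekiq.yml.
QUEUE_COUNT = 5

# Jobs of the same fund always map to the same queue name.
def queue_name(fund_id, queue_count: QUEUE_COUNT)
  "fund_processor_#{fund_id % queue_count}"
end

(1..7).each do |fund_id|
  puts "fund #{fund_id} -> #{queue_name(fund_id)}"
end
```

Because the result ranges from 0 to QUEUE_COUNT - 1, the queues have to be numbered from 0, and changing the queue count remaps funds to different queues.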


TIL August 23, 2019 by Eugene Komissarov

Rubyists life made easier with composition operators.

If you write Ruby code and have wandered into the FP world, you might have started writing those tiny little methods inside your classes/modules. And it was awesome to write code like this:

class Import
  # Some code goes here...

  def find_record(row)
    [
      Msa.find_or_initialize_by(
        region_name: row[:region_name],
        city: row[:city],
        state: row[:state],
        metro: row[:metro]
      ),
      row
    ]
  end

  # record is one of:
  # Object - when record was found
  # false - when record was not found
  def update_record(record, attributes)
    record.attributes = attributes
    record
  end

  # record is one of:
  # false
  # ZipRegion
  def validate_record(record)
    case record
    when false
      [:error, nil]
    else
      validate_record!(record)
    end
  end

  # record is ZipRegion object
  def validate_record!(record)
    if record.valid?
      [:ok, record]
    else
      error(record.id, record.errors.messages)
      [:error, record]
    end
  end

  def persist_record!(validation, record)
    case validation
    when :ok
      # ...
    when :error
      # ...
    end
  end
end





Yeah, I know there is YARD, and argument types are somewhat weird but at the time of coding, I was fascinated with Gregor Kiczales's HTDP courses (that was a ton of fun, sincerely recommend for every adventurous soul).

And next comes dreadful composition:

def process(row, index)
  return if header_row?(row)

  success(row[:region_name], persist_record!(*validate_record(update_record(*find_record(parse_row(row))))))
end


The pipeline is quite short but already hard to read. Luckily, in Ruby 2.6 we now have 2 composition operators: Proc#>> and its reverse sibling Proc#<<.
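As a standalone illustration of the two operators (requires Ruby 2.6 or newer), here is a minimal sketch with made-up lambdas. The names parse, to_record, format and shout are hypothetical and unrelated to the Import class above.

```ruby
# Made-up pipeline stages; each takes exactly one argument.
parse     = ->(line) { line.split(',').map(&:strip) }
to_record = ->((name, city)) { { name: name, city: city } }  # destructures the array
format    = ->(record) { "#{record[:name]} (#{record[:city]})" }

def shout(text)
  text.upcase
end

pipeline = parse >> to_record >> format   # runs left to right
reversed = format << to_record << parse   # same pipeline, written right to left
loud     = pipeline >> method(:shout)     # Method objects compose too

puts pipeline.call('Acme, Tver')
puts reversed.call('Acme, Tver')
puts loud.call('Acme, Tver')
```

Each stage receives the previous stage's return value as a single argument, so a stage that returns an array has to be followed by one that accepts or destructures that array.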

And, with a bit of refactoring, the composition method becomes:

def process(row, index)
  return if header_row?(row)

  composition = method(:parse_row) >>
                method(:find_record) >>
                method(:update_record) >>
                method(:validate_record) >>
                method(:persist_record!) >>
                # ...
end



Much nicer, don't you think? Ruby just came one step closer to the family of FP-friendly languages; let's hope there'll be more!

TIL July 31, 2019 by Igor Alexandrov

Migrate tags in Rails to PostgreSQL array from ActsAsTaggableOn

ActsAsTaggableOn is a Swiss Army knife solution if you need to add tags to your ActiveRecord model.

Just by adding one gem to your Gemfile and acts_as_taggable to the model you get everything you need: adding tags, searching for a model by tag, getting top tags, etc. However, sometimes you don't need all of this.

In our project, we used acts_as_taggable to store tags for the Note model. We displayed a list of notes with their assigned tags on several pages and had an autocomplete input for tags on the Note form. Everything worked well, but since we use PostgreSQL, I decided to store tags as an array in the Note model.

First of all, I added a tags Array<String> column to Note, and then moved the acts_as_taggable tags into the notes table with a migration.

class MigrateNoteTags < ActiveRecord::Migration[5.2]
  def change
    execute <<-SQL
    UPDATE notes
    SET tags = grouped_taggings.tags_array
    FROM
      (
      SELECT
        taggings.taggable_id,
        ARRAY_AGG ( tags.NAME ) tags_array
      FROM
        taggings
        LEFT JOIN tags ON taggings.tag_id = tags.ID
      WHERE
        taggable_type = 'Note'
      GROUP BY
        taggings.taggable_id
      ) AS grouped_taggings
    WHERE
      notes.ID = grouped_taggings.taggable_id
    SQL
  end
end




To have backward compatibility, I added Note#tag_list method:

def tag_list
  tags.join(', ')
end


The last thing is to add the ability to search for tags. Since there are about 500k records in the notes table, I decided to create an SQL view:

SELECT
  UNNEST ( tags ) AS name,
  COUNT ( * ) AS taggings_count
FROM
  notes
GROUP BY
  name





That's it! It takes from 100ms to 150ms to search for tags in this view, which is fine for me.
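For intuition, here is what such a view computes, sketched in plain Ruby. It assumes the view flattens every note's tags array and counts occurrences per tag name (hence the name and taggings_count columns); the sample data is made up.

```ruby
# Made-up sample: the tags arrays of three notes.
notes_tags = [
  %w[ruby rails],
  %w[ruby postgres],
  %w[rails]
]

taggings_count = notes_tags
  .flatten                     # one element per tag, across all notes
  .group_by(&:itself)          # group identical tag names together
  .transform_values(&:count)   # count the members of each group

puts taggings_count.inspect
```

In the database the same aggregation runs over all rows of the notes table each time the view is queried.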

If you have larger data sets, the best option would be to create a tags table and add triggers to the notes table that update the tags on INSERT/UPDATE/DELETE.

TIL February 20, 2019 by Alexander Budchanov

AND & OR Operators Precedence

Are you still sure that && and "and" are the same operator? Look at this:

a = true && false

a

=> false

a = true and false

a

=> true

The same situation can be reproduced for || and "or". Why? The answer lies in Ruby operator precedence: && and || bind more tightly than =, while "and" and "or" bind more loosely.

The first example can be represented as:

a = (true && false)

The second one as:

(a = true) and false
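The difference is easy to verify side by side. The sketch below assigns with each of the four operators; the trailing comments show how Ruby groups each expression.

```ruby
a = true && false   # a = (true && false)
b = true and false  # (b = true) and false
c = false || true   # c = (false || true)
d = false or true   # (d = false) or true

puts [a, b, c, d].inspect
```

With "and" and "or" the assignment happens first, and the value of the whole expression is discarded, which is why they are mostly reserved for control flow.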

Thanks to Igor Alexandrov

TIL February 03, 2019 by Dmitry Voronov

Use hash or case-statement in Ruby?

Often, when we need to get a value based on another one, we use a case statement, like this:

def realizing_trade_type(realizable_trade_type)
  case realizable_trade_type
  when 'buy'
    'sell'
  when 'short'
    'cover'
  when 'buy_contract'
    'sell_contract'
  when 'short_contract'
    'cover_contract'
  end
end




But if the conditions and the results are simple values, why not use a hash for this? We can :)

REALIZING_TRADE_TYPES = {
  'buy'            => 'sell',
  'short'          => 'cover',
  'buy_contract'   => 'sell_contract',
  'short_contract' => 'cover_contract'
}.freeze


Here is the benchmark of both options, executed 10000000 times. It shows that the hash is roughly 1.7 times faster for this kind of usage (1.06 s versus 1.76 s of real time).

>> require 'benchmark'


>> Benchmark.bm(15) do |x|

  x.report('hash') { 10_000_000.times { REALIZING_TRADE_TYPES['buy'] } }

  x.report('case-statement') { 10_000_000.times { realizing_trade_type 'buy' } }

  x.report('empty') { 10_000_000.times {} }
end


                      user     system      total        real

hash              0.990423   0.003412   0.993835 (  1.057612)

case-statement    1.752263   0.004531   1.756794 (  1.762030)

empty             0.380810   0.000728   0.381538 (  0.382153)

So, it's better to use a hash when you are just retrieving some values (like in the example above). If there is additional logic to execute, a case-statement is still a way to go.
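One difference worth keeping in mind: both the case statement and a plain hash lookup return nil for an unknown key. If that should be an error instead, Hash#fetch with a block is one option. The sketch below reuses the names from the post; the error handling itself is an addition for illustration, not part of the original.

```ruby
REALIZING_TRADE_TYPES = {
  'buy'            => 'sell',
  'short'          => 'cover',
  'buy_contract'   => 'sell_contract',
  'short_contract' => 'cover_contract'
}.freeze

# fetch runs the block only when the key is missing (this behaviour is an assumption
# about what the caller wants, not something the original post specifies).
def realizing_trade_type(realizable_trade_type)
  REALIZING_TRADE_TYPES.fetch(realizable_trade_type) do |unknown|
    raise ArgumentError, "unknown trade type: #{unknown.inspect}"
  end
end

puts realizing_trade_type('buy')
```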

TIL January 26, 2019 by Andrey Morozov

Command for creating a zip archive without gems 📁

require 'tmpdir'

class CreateZipCommand
  def call(files)
    # Create a temporary directory for the files
    tmp_dir = Dir.mktmpdir
    tmp_zip_path = File.join(tmp_dir, "files.zip")

    # Move files to the temporary folder created above
    files.each do |file|
      download_file(file, tmp_dir)
    end

    # Go to the folder and archive its entire contents
    `cd #{tmp_dir} && zip #{tmp_zip_path} ./*`

    # Return zip path
    tmp_zip_path
  end
end




> CreateZipCommand.new.call(files)

=> "/var/folders/bk/0c864z710654sx555jpdpx9c0000gn/T/d20190126-7447-d27fpl/files.zip"

Most gems for working with archives eat a lot of memory when working with large files. This solution does not have this problem.

Make sure that the zip utility is installed on your computer; the command doesn't work without it.

TIL January 23, 2019 by Andrey Morozov

Webhook integration in development with Ngrok 🚀

It all began when I was working with a webhook from Pipedrive (a sales management tool designed to help small sales teams manage intricate or lengthy sales processes) and needed to receive it in local development.

The solution was easy: I just created an ngrok tunnel

> ngrok http 3000 

Session Status                online

Account                       Andrey (Plan: Free)

Version                       2.2.8

Region                        United States (us)

Web Interface       

Forwarding                    http://b1256cb6.ngrok.io -> localhost:3000

Forwarding                    https://b1256cb6.ngrok.io -> localhost:3000

and then sent the webhook to the generated address, and it works. In local development I can use a webhook from another API! It works 🚀

To use it you need to do 4 steps:

  1. Register on ngrok

  2. Download ngrok from the site

  3. Connect your account

  4. Run it 🚀


In addition, there is an ngrok-tunnel gem that allows you to fully integrate the service with your application. But that's another story 💎

TIL January 07, 2019 by Dmitry Voronov

How to create zip files on the fly w/o Tempfile

There are many articles about how to archive files from the server and send a zip-file to a client without persisting it on the server. But usually they don't literally do it, because they use temporary files.

There is a simple way to do it without creating any file though. You just have to put files directly to Zip::OutputStream and then read from it. Btw pay attention: you must rewind the stream before reading it.

# some files objects
def download(files)
  zip_stream = Zip::OutputStream.write_buffer do |zip|
    files.each.with_index(1) do |file, index|
      # file name must be unique in the archive
      # ...
    end
  end

  # important - rewind the stream
  zip_stream.rewind

  send_data zip_stream.read,
            type: 'application/zip',
            disposition: 'attachment',
            filename: 'files-archive.zip'
end