Rethinking number formatting


Problem statement

A friend of mine, let’s call him Jack, asked me a seemingly innocent question (I’m paraphrasing):

How can I [in PHP¹] format a number so that its decimals get padded with spaces (not zeros) to a fixed length?

Which is another way of saying that Jack wants 12.345 and 123.4 formatted as strings "12.345_" and "123.4___" (respectively)².

We had a lively discussion about it… and in the end concluded there’s no simple³ way to do it.

I mean, there’s:

<?php
echo preg_replace_callback("/0*$/",
  function ($x) { return str_repeat("_", strlen($x[0])); },
  sprintf("%.8f", 12.345)) . "\n";
# => "12.345_____\n"

and its equivalent in Ruby:

("%.8f" % 12.345).sub(/0*$/) { |x| "_" * x.length }
# => "12.345_____"
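The one-liner can be wrapped into a reusable method. What follows is only a sketch: the name pad_decimals and its parameters are made up for illustration. It strips trailing zeros from the decimal part alone, because the bare /0*$/ trick also eats zeros of the whole part when there are no decimals at all ("%.0f" % 120 turns into "12_").

```ruby
# Hypothetical helper: keep at most `places` decimals and replace the
# trailing zeros of the decimal part with `pad`.
def pad_decimals(number, places, pad = "_")
  fixed = sprintf("%.#{places}f", number)   # e.g. "12.3450"
  whole, decimal = fixed.split(".")
  return whole if decimal.nil?              # places == 0: nothing to pad
  stripped = decimal.sub(/0+\z/, "")        # drop trailing zeros only
  "#{whole}.#{stripped.ljust(places, pad)}" # refill up to the fixed length
end

puts pad_decimals(12.345, 4) # => "12.345_"
puts pad_decimals(123.4, 4)  # => "123.4___"
puts pad_decimals(120, 0)    # => "120"
```

This gives Jack's two strings, but it still says nothing about the whole part, which is where things get interesting.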

But the sad fact is, sprintf sucks.

Can we do better?

Discussion

I think Jack’s question exposes a problem with both the documentation and sprintf-style format strings themselves.

While you can right-pad the entire number with spaces (sprintf("%-8.4f", 12.345)), the documentation isn’t very clear on the fact that you can’t pad individual parts of the number to the left or right⁴. That becomes rather important when you’re going for this effect:

Item     Price
A      12.345_
B     123.4___
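To see the limitation concretely, here is a small Ruby sketch (the variable names are just for illustration). The width and the - flag apply to the whole field, while the precision always fills the decimals with zeros:

```ruby
# Width pads the field as a whole; precision pads the decimals with zeros.
left  = sprintf("%-8.4f", 12.345) # left-justified in 8 columns
right = sprintf("%8.4f", 12.345)  # right-justified in 8 columns

puts left.inspect  # => "12.3450 "
puts right.inspect # => " 12.3450"
```

Neither form can swap the zero after 12.345 for a space while keeping the column width fixed.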

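So we need a formatting function that lets us specify, separately for the whole part and the decimal part, the length, the padding character and the padding direction, plus the decimal and thousands separators.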

Solution

I think the above can be written as a small standalone function (in Ruby, and without leaning on sprintf at all).

DEFAULT_FORMAT = {
  whole_min_length: 0,
  whole_padding: ' ',
  whole_direction: 'l',
  decimal_max_length: nil, # ≤ 15, tbh
  decimal_padding: '0',
  decimal_direction: 'r',
  decimal_separator: '.',
  thousands_separator: '\'',
}

# NB: defined at top level, this shadows Kernel#format (an alias of sprintf).
def format(number, **opts)
  fmt = DEFAULT_FORMAT.merge(opts)
  ml = fmt[:decimal_max_length]
  # Round numerically before splitting, so that a carry reaches the whole
  # part (0.9996 rounded to 3 places is 1.0, not 0.1000).
  number = number.round(ml) if ml
  # Assumes to_s yields plain decimal notation; very large or very small
  # floats print as "1.0e+20" and are not handled here.
  whole, decimal = number.to_s.split(".")
  decimal ||= "" # integers have no decimal part
  sign = whole.slice!(/\A-/).to_s # keep the sign out of the digit grouping
  if fmt[:thousands_separator]
    whole = whole.reverse.scan(/.{1,3}/).join(fmt[:thousands_separator]).reverse
  end
  whole = sign + whole
  if whole.size < fmt[:whole_min_length]
    case fmt[:whole_direction]
    when 'l'
      whole = whole.rjust(fmt[:whole_min_length], fmt[:whole_padding])
    when 'r'
      whole = whole.ljust(fmt[:whole_min_length], fmt[:whole_padding])
    else
      # we could throw an error but we'll just refuse to pad instead
    end
  end
  if ml && decimal.size < ml
    # pad
    case fmt[:decimal_direction]
    when 'l'
      decimal = decimal.rjust(ml, fmt[:decimal_padding])
    when 'r'
      decimal = decimal.ljust(ml, fmt[:decimal_padding])
    else
      # we could throw an error but we'll just refuse to pad instead
    end
  end
  # No decimals at all (e.g. an integer without padding): skip the separator.
  decimal.empty? ? whole : [whole, decimal].join(fmt[:decimal_separator])
end

if __FILE__ == $0
  puts format(12.345, thousands_separator: ',') # => "12.345"
  puts format(12312.345, whole_min_length: 7) # => " 12'312.345"
  puts format(0.12345678901234567890) # => "0.12345678901234568"
  puts format(0.12345678901234567890, decimal_max_length: 7) # => "0.1234568"
  puts format(12.345, decimal_max_length: 8, decimal_padding: '_', decimal_direction: 'r')
  # => "12.345_____"
  puts format(12.345, decimal_max_length: 8, decimal_padding: '_', decimal_direction: 'l')
  # => "12._____345"
  puts format(12312.345, whole_min_length: 7, whole_direction: 'r', decimal_direction: 'l',
              decimal_max_length: 8, decimal_padding: ' ') # => "12'312 .     345"
end
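A side note on the rounding step, as a hedged sketch with illustrative variable names: bumping the last kept digit on the string alone cannot carry into the whole part, whereas Float#round works on the number itself and handles the carry.

```ruby
# Rounding 0.9996 to three decimals, two ways.
digits  = "999"            # the first three decimals of 0.9996
carried = digits.succ      # string increment: grows to four characters
rounded = 0.9996.round(3)  # numeric rounding: the carry reaches the whole part

puts carried # => "1000"
puts rounded # => 1.0
```

With the string approach the result would read 0.1000, which is off by a factor of ten.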

But don’t ask me to write it in PHP. That’d be the end of me.

Closing words

Seeing how many moving parts there are, I’m not sure it could be compressed into a sprintf-style format string. But I’m pretty sure someone will eventually try.

Just like someone already implemented a Minecraft-powered CPU @ 1Hz.

  1. Let’s set aside that for me PHP is on par with Brainfuck.

  2. Obviously I’m replacing spaces with _ for legibility.

  3. Meaning: built-in, single function call.

  4. Originally I thought that PHP’s s formatting specifier would be useful but I don’t think that’s the case.