Convert units of measurement
This looks ugly:
with uu(unit, coefficient, u_ord) as (
select
unit,
coefficient,
case
when log(u.coefficient) < 0
then floor (log(u.coefficient))
else ceil(log(u.coefficient))
end u_ord
from
unit_conversion u
),
norm (label, norm_qty) as (
select
s.label,
sum( uc.coefficient * s.quantity ) AS norm_qty
from
unit_conversion uc,
substance s
where
uc.unit = s.unit
group by
s.label
),
norm_ord (label, norm_qty, log, ord) as (
select
label,
norm_qty,
log(t.norm_qty) as log,
case
when log(t.norm_qty) < 0
then floor(log(t.norm_qty))
else ceil(log(t.norm_qty))
end ord
from norm t
)
select
norm_ord.label,
norm_ord.norm_qty,
norm_ord.norm_qty / uu.coefficient val,
uu.unit
from
norm_ord,
uu where uu.u_ord =
(select max(uu.u_ord)
from uu
where mod(norm_ord.norm_qty , uu.coefficient) = 0);
but seems to do the trick:
| LABEL | NORM_QTY | VAL | UNIT |
-----------------------------------------
| mercury | 1.5e-8 | 15 | microlitre |
| water | 0.00112 | 112 | centilitre |
You don't really need the parent-child relationship in the unit_conversion
table, because the units in the same family are naturally related to each other by the order of coefficient
, as long as you have the family identified.
I think, this can be largely simplified.
1. Modify unit_conversion
table
Or, if you cannot modify the table, just add the column exp10
for "exponent base 10", which coincides with number of digits to shift in the decimal system:
CREATE TABLE unit_conversion(
unit text PRIMARY KEY
,exp10 int
);
INSERT INTO unit_conversion VALUES
('microlitre', 0)
,('millilitre', 3)
,('centilitre', 4)
,('litre', 6)
,('hectolitre', 8)
,('kilolitre', 9)
,('megalitre', 12)
,('decilitre', 5);
2. Write function
to calculate the number of positions to shift left or right:
CREATE OR REPLACE FUNCTION f_shift_comma(n numeric)
RETURNS int LANGUAGE SQL IMMUTABLE AS
$$
SELECT CASE WHEN ($1 % 1) = 0 THEN -- no fractional digits
CASE WHEN ($1 % 10) = 0 THEN 0 -- no trailing 0, don't shift
ELSE length(rtrim(trunc($1, 0)::text, '0')) -- trunc() because numeric can be 1.0
- length(trunc($1, 0)::text) -- trailing 0, shift right .. negative
END
ELSE -- fractional digits
length(rtrim(($1 % 1)::text, '0')) - 2 -- shift left .. positive
END
$$;
3. Query
SELECT DISTINCT ON (substance_id)
s.substance_id, s.label, s.quantity, s.unit
,COALESCE(s.quantity * 10^(u1.exp10 - u2.exp10)::numeric
, s.quantity)::float8 AS norm_quantity
,COALESCE(u2.unit, s.unit) AS norm_unit
FROM substance s
JOIN unit_conversion u1 USING (unit)
LEFT JOIN unit_conversion u2 ON f_shift_comma(s.quantity) <> 0
AND @(u2.exp10 - (u1.exp10 - f_shift_comma(s.quantity))) < 2
-- since maximum gap between exp10 in unit table = 3
-- adapt to ceil(to max_gap / 2) if you have bigger gaps
ORDER BY s.substance_id
, @(u2.exp10 - (u1.exp10 - f_shift_comma(s.quantity))) -- closest unit first
, u2.exp10 -- smaller unit first to avoid point for ties.
Explain:
- JOIN substance and unit tables.
- Calculate ideal number of positions to shift with function
f_shift_comma()
from above. - LEFT JOIN to the unit table a second time to find units close to the optimum.
- Pick the closest unit with
DISTINCT ON ()
andORDER BY
. - If no better unit is found, fall back to what we had with
COALESCE()
. - This should cover all corner cases and be pretty fast.
-> SQLfiddle demo.