@tvanderpol
Overview

We use a custom ULID function in our databases. That decision was made before lexically sortable UUIDs were natively available in Postgres.

The problem is that a ULID has a different shape from a UUID: the bits that would normally identify the version and variant contain random data.

CREATE OR REPLACE FUNCTION public.generate_ulid()
  RETURNS uuid
  LANGUAGE sql
AS $function$
  -- 12 hex digits of millisecond timestamp, then 10 random bytes (gen_random_bytes comes from pgcrypto)
  SELECT (lpad(to_hex(floor(extract(epoch FROM clock_timestamp()) * 1000)::bigint), 12, '0') || encode(gen_random_bytes(10), 'hex'))::uuid;
$function$
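To make the layout concrete, here is a rough Python sketch of what the SQL function above appears to do: 12 hex digits of millisecond timestamp followed by 10 random bytes, cast to a UUID with nothing reserved for version or variant. The Python function and its `now_ms` and `random_bytes` parameters are hypothetical and exist only so the output can be reproduced; this is not code from the PR.

```python
import os
import time
import uuid


def generate_ulid(now_ms=None, random_bytes=None):
    """Illustrative mirror of the SQL function: 48-bit ms timestamp + 80 random bits."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # milliseconds since the Unix epoch
    if random_bytes is None:
        random_bytes = os.urandom(10)  # stands in for gen_random_bytes(10)
    # 12 zero-padded hex digits of timestamp, then 20 hex digits of randomness.
    hex_string = f"{now_ms:012x}" + random_bytes.hex()
    # No version or variant bits are set, so those positions hold random data.
    return uuid.UUID(hex=hex_string)


# Rebuilding the first sample ID from its timestamp and random parts.
sample = generate_ulid(
    now_ms=0x019ABD349346,
    random_bytes=bytes.fromhex("01dbf2a871d3b6e53090"),
)
print(sample)
```

The first two random bytes land exactly where a UUID stores its version nibble, and the third lands where it stores its variant bits, which is why the reported version and variant differ from one ID to the next.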

As a result, some of our IDs happen to get random values that identify them as a v4 UUID, while others are identified as various other flavours. Here's a sample:

iex(4)> UUID.info("019abd34-9346-01db-f2a8-71d3b6e53090")
{:ok,
 [
   uuid: "019abd34-9346-01db-f2a8-71d3b6e53090",
   binary: <<1, 154, 189, 52, 147, 70, 1, 219, 242, 168, 113, 211, 182, 229, 48,
     144>>,
   type: :default,
   version: 0,
   variant: :reserved_future
 ]}
iex(5)> UUID.info("019abd38-9994-eff4-24f5-8b3df581fd5d")
{:ok,
 [
   uuid: "019abd38-9994-eff4-24f5-8b3df581fd5d",
   binary: <<1, 154, 189, 56, 153, 148, 239, 244, 36, 245, 139, 61, 245, 129,
     253, 93>>,
   type: :default,
   version: 14,
   variant: :reserved_ncs
 ]}
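For reference, the version and variant reported above can be read straight off the bytes: the version is the high nibble of byte 6, and the variant comes from the top bits of byte 8. The Python sketch below is an illustration with a hypothetical helper name, and it assumes the usual RFC 4122/9562 bit positions; it is not how `UUID.info()` is implemented.

```python
def version_and_variant(uuid_string: str) -> tuple[int, str]:
    """Return the raw version nibble and variant name of a UUID-shaped string."""
    raw = bytes.fromhex(uuid_string.replace("-", ""))
    if len(raw) != 16:
        raise ValueError("expected 16 bytes")
    version = raw[6] >> 4  # high nibble of the 7th byte
    top = raw[8] >> 5      # top three bits of the 9th byte
    if top & 0b100 == 0:
        variant = "reserved_ncs"        # 0xx
    elif top & 0b010 == 0:
        variant = "rfc4122"             # 10x
    elif top & 0b001 == 0:
        variant = "reserved_microsoft"  # 110
    else:
        variant = "reserved_future"     # 111
    return version, variant


print(version_and_variant("019abd34-9346-01db-f2a8-71d3b6e53090"))
print(version_and_variant("019abd38-9994-eff4-24f5-8b3df581fd5d"))
```

Both positions fall inside the random portion of the ULID, so every possible combination of version and variant will eventually show up in a large enough table.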

Two things all possible ULIDs have in common are:

  • They save to a uuid column successfully
  • They are parsed successfully by UUID.info()

The problem is that in any large enough set of records there'll be some keys that are valid UUIDs according to Sequin and some that are not. I can't make an enrichment function that handles both cases. If I do this:

SELECT pv.id::uuid
FROM product_variants pv
WHERE pv.id::uuid = ANY($1::uuid[])

ULIDs that are parsed as version 4 or 7 function correctly, but all others throw an error like this:

Postgrex expected a binary of 16 bytes, got "019a9f4a-4407-4257-f238-16caabeb09aa". Please make sure the value you are passing matches the definition in your table or in your query or convert the value accordingly.

And if I use an enrichment function like this:

SELECT pv.id::text
FROM product_variants pv
WHERE pv.id::text = ANY($1::text[])

all the ULIDs that are "invalid" work correctly, but the ones that pass as valid UUIDs fail with an error like this:

ERROR 22021 (character_not_in_repertoire) invalid byte sequence for encoding "UTF8": 0x98

(It complains about several different byte sequences depending on the specific ID in question, but that makes sense given it's just random bytes.)

Solution

In this PR I propose relaxing the UUID check to 'can UUID.info() parse this at all?'. I am not an Elixir expert, so I might not have taken the best approach to get there, but I can confirm that this change consistently returns true for all the ULIDs I've thrown at it, while still returning false for nonsense strings.
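The idea of the relaxed check, sketched in Python rather than Elixir: accept anything that parses into 16 bytes, regardless of which version or variant the bits happen to spell out. The function names are hypothetical, and Python's `uuid.UUID` stands in for `UUID.info()`; the two parsers may differ in exactly which string forms they accept, so treat this as an illustration of the rule and not as the change in this PR.

```python
import uuid


def parses_as_uuid(value) -> bool:
    """True if value is a string that parses as a 128-bit UUID of any version."""
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)  # no version or variant requirement
    except ValueError:
        return False
    return True


def to_wire_bytes(value: str) -> bytes:
    """The 16 raw bytes behind the text form, as a binary uuid parameter would need."""
    return uuid.UUID(value).bytes


for candidate in (
    "019abd34-9346-01db-f2a8-71d3b6e53090",
    "019abd38-9994-eff4-24f5-8b3df581fd5d",
    "not-a-uuid",
):
    print(candidate, parses_as_uuid(candidate))
```

With a check like this, every ID produced by `generate_ulid()` would take the same path, so a single enrichment query using `::uuid` could cover the whole table.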

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. enhancement New feature or request labels Dec 3, 2025