diff --git a/README.md b/README.md index 2b76135..c3f09d7 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ - [Snowflake Overview](docs/index.md) - [Building the Snowflake Extension](README.md#installation) - [Basic Configuration and Usage](README.md#configuring-snowflake) -- [Creating a Snowflake Sequence](docs/managing/creating.md) +- [Creating a Snowflake Sequence](docs/creating.md) - [Converting an Existing Postgres Sequence](docs/converting.md) - [Using Snowflake Functions](docs/snowflake_functions.md) - [Reviewing Snowflake Meta-Data](docs/review_seq.md) diff --git a/docs/converting.md b/docs/converting.md index 2dc0bde..e8bf6a5 100644 --- a/docs/converting.md +++ b/docs/converting.md @@ -1,101 +1,108 @@ -# Converting a PostgreSQL Sequence to a Snowflake Sequence - -You can use the `snowflake.convert_sequence_to_snowflake()` function to convert individual sequences to snowflake sequences. Note that this converts the sequence definition; existing values in a sequence column will not change. The command syntax in SQL is: - - `SELECT snowflake.convert_sequence_name('sequence_name');` - -**Where** - -`sequence_name` is the name of a sequence. - -There are a number of ways to return a list of sequences. One way is to use the Postgres [`psql client`](https://www.postgresql.org/docs/current/app-psql.html) to execute the following command: - -``` - \ds -``` - -Another way is to query metadata catalog information directly in `psql` or another SQL editor. - -``` - SELECT * FROM information_schema.sequences; -``` - -The second command will provide data type information, which is useful for learning which sequences will be converted to the BIGINT data type (64 bits). - - -## Example: Converting an Existing Sequence - -The example that follows demonstrates using the psql command line to convert a sequence; in this example, we are using a table named `orders` that has three columns; the last column is a sequence named `id`: - -``` -acctg=# CREATE TABLE orders (customer VARCHAR, invoice VARCHAR, id bigserial PRIMARY KEY); -CREATE TABLE -``` - -After creating the table, we insert data into the `orders` table. We only need to provide content for the first two columns, since the sequence definition keeps track of the value of the third column and adds it as needed: - -``` -acctg=# INSERT INTO orders VALUES ('Chesterfield Schools', 'art_9338'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('Chesterfield Schools', 'math_9663'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('Albemarle Schools', 'sci_2009'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('King William Schools', 'sci_7399'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('King William Schools', 'art_9484'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('Hanover Schools', 'music_1849'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('Washington Schools', 'hist_2983'); -INSERT 0 1 -``` -When we select the rows from our table, we can see the sequence numbers in the `id` column: - -``` -acctg=# SELECT * FROM orders; - customer | invoice | id -----------------------+------------+---- - Chesterfield Schools | art_9338 | 1 - Chesterfield Schools | math_9663 | 2 - Albemarle Schools | sci_2009 | 3 - King William Schools | sci_7399 | 4 - King William Schools | art_9484 | 5 - Hanover Schools | music_1849 | 6 - Washington Schools | hist_2983 | 7 -(7 rows) -``` -To convert the sequence definition for the `orders` table to use Snowflake sequences, invoke the command in `psql` or another SQL editor: - -``` -SELECT snowflake.convert_sequence_to_snowflake('orders_id_seq'::regclass); -``` - -The conversion process modifies the sequence definition to use Snowflake sequences, but does not update existing rows. If we reconnect with psql and add new rows to the table, the new row's `id` will be a Snowflake sequence: - -``` -acctg=# INSERT INTO orders VALUES ('Prince William Schools', 'math_8330'); -INSERT 0 1 -acctg=# INSERT INTO orders VALUES ('Fluvanna Schools', 'art_9447'); -INSERT 0 1 -``` - -In the query results that follows, you can see the unformatted sequence value in the `id` column, and the same information in the `format` column, formatted with the `snowflake.format(id)` function. The rows added before the conversion to Snowflake sequences show a fixed timestamp of `2022-12-31 19:00:00-05`, while the Snowflake sequences have a unique `id` and timestamp. - -Original entries in the table display a Postgres sequence, while entries made after the conversion display Snowflake sequences: - -``` -acctg=# SELECT id, snowflake.format(id), customer, invoice FROM orders; - id | format | customer | invoice ---------------------+-----------------------------------------------------------+------------------------+------------ - 1 | {"id": 1, "ts": "2022-12-31 19:00:00-05", "count": 0} | Chesterfield Schools | art_9338 - 2 | {"id": 2, "ts": "2022-12-31 19:00:00-05", "count": 0} | Chesterfield Schools | math_9663 - 3 | {"id": 3, "ts": "2022-12-31 19:00:00-05", "count": 0} | Albemarle Schools | sci_2009 - 4 | {"id": 4, "ts": "2022-12-31 19:00:00-05", "count": 0} | King William Schools | sci_7399 - 5 | {"id": 5, "ts": "2022-12-31 19:00:00-05", "count": 0} | King William Schools | art_9484 - 6 | {"id": 6, "ts": "2022-12-31 19:00:00-05", "count": 0} | Hanover Schools | music_1849 - 7 | {"id": 7, "ts": "2022-12-31 19:00:00-05", "count": 0} | Washington Schools | hist_2983 - 135824181823537153 | {"id": 1, "ts": "2024-01-10 14:16:48.438-05", "count": 0} | Prince William Schools | math_8330 - 135824609030176769 | {"id": 1, "ts": "2024-01-10 14:18:30.292-05", "count": 0} | Fluvanna Schools | art_9447 -(9 rows) -``` +# Converting a PostgreSQL Sequence to a Snowflake Sequence + +You cannot directly create a snowflake sequence; you must first create a sequence in Postgres and then convert it. Sequences are created explicitly with the `CREATE SEQUENCE` command but are also implicitly created by `GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY` table columns or `SERIAL` and `BIGSERIAL` columns. `GENERATED AS IDENTITY` is the recommended ANSI SQL method. + +You can use the `snowflake.convert_sequence_to_snowflake()` function to convert individual sequences to snowflake sequences. Note that this converts the sequence definition; existing values in a sequence column will not change. The command syntax in SQL is: + + `SELECT snowflake.convert_sequence_name('sequence_name');` + +**Where** + +`sequence_name` is the name of a sequence. + +There are a number of ways to return a list of sequences. One way is to use the Postgres [`psql client`](https://www.postgresql.org/docs/current/app-psql.html) to execute the following command: + +``` + \ds +``` + +Another way is to query metadata catalog information directly in `psql` or another SQL editor. + +``` + SELECT * FROM information_schema.sequences; +``` + +The second command will include data type information, which is useful for learning which sequences will be converted to the BIGINT data type (64 bits). + + +## Example: Converting an Existing Sequence + +The example that follows demonstrates using the psql command line to convert a sequence; in this example, we are using a table named `orders` that has three columns; the last column is a sequence named `id`: + +``` +acctg=# CREATE TABLE orders (customer VARCHAR, invoice VARCHAR, id bigserial PRIMARY KEY); +CREATE TABLE +``` + +After creating the table, we insert data into the `orders` table. We only need to provide content for the first two columns, since the sequence definition keeps track of the value of the third column and adds it as needed: + +``` +acctg=# INSERT INTO orders VALUES ('Chesterfield Schools', 'art_9338'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('Chesterfield Schools', 'math_9663'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('Albemarle Schools', 'sci_2009'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('King William Schools', 'sci_7399'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('King William Schools', 'art_9484'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('Hanover Schools', 'music_1849'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('Washington Schools', 'hist_2983'); +INSERT 0 1 +``` +When we select the rows from our table, we can see the sequence numbers in the `id` column: + +``` +acctg=# SELECT * FROM orders; + customer | invoice | id +----------------------+------------+---- + Chesterfield Schools | art_9338 | 1 + Chesterfield Schools | math_9663 | 2 + Albemarle Schools | sci_2009 | 3 + King William Schools | sci_7399 | 4 + King William Schools | art_9484 | 5 + Hanover Schools | music_1849 | 6 + Washington Schools | hist_2983 | 7 +(7 rows) +``` + +To convert the sequence definition for the `orders` table to use Snowflake sequences, invoke the following command in `psql` or another SQL editor: + +``` +SELECT snowflake.convert_sequence_to_snowflake('orders_id_seq'::regclass); +``` + +The conversion process modifies the sequence definition to use Snowflake sequences, but does not update existing rows. If we reconnect with psql and add new rows to the table, the new row's `id` will be a Snowflake sequence: + +``` +acctg=# INSERT INTO orders VALUES ('Prince William Schools', 'math_8330'); +INSERT 0 1 +acctg=# INSERT INTO orders VALUES ('Fluvanna Schools', 'art_9447'); +INSERT 0 1 +``` + +In the query results that follows, you can see the unformatted sequence values in the `id` column, and the same information in the `format` column, formatted with the `snowflake.format(id)` function. The rows added before the conversion to Snowflake sequences show a fixed timestamp of `2022-12-31 19:00:00-05`, while the Snowflake sequences have a unique `id` and timestamp. + +Original entries in the table display a Postgres sequence, while entries made after the conversion display Snowflake sequences: + +``` +acctg=# SELECT id, snowflake.format(id), customer, invoice FROM orders; + id | format | customer | invoice +--------------------+-----------------------------------------------------------+------------------------+------------ + 1 | {"id": 1, "ts": "2022-12-31 19:00:00-05", "count": 0} | Chesterfield Schools | art_9338 + 2 | {"id": 2, "ts": "2022-12-31 19:00:00-05", "count": 0} | Chesterfield Schools | math_9663 + 3 | {"id": 3, "ts": "2022-12-31 19:00:00-05", "count": 0} | Albemarle Schools | sci_2009 + 4 | {"id": 4, "ts": "2022-12-31 19:00:00-05", "count": 0} | King William Schools | sci_7399 + 5 | {"id": 5, "ts": "2022-12-31 19:00:00-05", "count": 0} | King William Schools | art_9484 + 6 | {"id": 6, "ts": "2022-12-31 19:00:00-05", "count": 0} | Hanover Schools | music_1849 + 7 | {"id": 7, "ts": "2022-12-31 19:00:00-05", "count": 0} | Washington Schools | hist_2983 + 135824181823537153 | {"id": 1, "ts": "2024-01-10 14:16:48.438-05", "count": 0} | Prince William Schools | math_8330 + 135824609030176769 | {"id": 1, "ts": "2024-01-10 14:18:30.292-05", "count": 0} | Fluvanna Schools | art_9447 +(9 rows) +``` + +**INCREMENT and Caching** + +Within the same millisecond, snowflake will allow a maximum of 4096 values. Some frameworks like the ORM framework Hibernate may be configured to try to fetch a sequence value, cache and assign a range by using an `INCREMENT` value for the sequence. Snowflake sequences honor the increment value, but you should not use a value greater than 4096. If you would like to change the increment value, use `ALTER SEQUENCE INCREMENT NO MAXVALUE`. diff --git a/docs/creating.md b/docs/creating.md index 1a69d7a..c4c1696 100644 --- a/docs/creating.md +++ b/docs/creating.md @@ -49,3 +49,5 @@ or use the `pg_ctl reload` command to reload the server: [pgedge]$ /home/pgedge/pg16/bin/pg_ctl reload -D "/home/pgedge/data/pg16" server signaled ``` + +Once snowflake is configured, you can begin to use snowflake sequences by first [converting](converting.md) them from Postgres sequences. diff --git a/expected/conversion.out b/expected/conversion.out index a2887d8..7945913 100644 --- a/expected/conversion.out +++ b/expected/conversion.out @@ -16,7 +16,7 @@ INSERT INTO t2 VALUES (DEFAULT); INSERT INTO t3 VALUES (DEFAULT); SELECT snowflake.convert_sequence_to_snowflake('t1'); -- ERROR, not a sequence ERROR: Input value "public.t1" is not a valid convertable sequence -SELECT snowflake.convert_sequence_to_snowflake('seq_1'); -- ERROR, not assocoated with any relation +SELECT snowflake.convert_sequence_to_snowflake('seq_1'); -- No associated relation found NOTICE: ALTER SEQUENCE public.seq_1 NO CYCLE MAXVALUE 43 convert_sequence_to_snowflake ------------------------------- @@ -216,7 +216,113 @@ SELECT :sf_val < x AS is_growing FROM t5 WHERE x > 42; t (1 row) +-- Test INCREMENT +CREATE TABLE t6(x bigserial, y int); +INSERT INTO t6 VALUES (DEFAULT, 10); +INSERT INTO t6 VALUES (DEFAULT, 20); +SELECT * FROM t6; + x | y +---+---- + 1 | 10 + 2 | 20 +(2 rows) + +INSERT INTO t6 VALUES (DEFAULT, 30); +SELECT * FROM t6; + x | y +---+---- + 1 | 10 + 2 | 20 + 3 | 30 +(3 rows) + +SELECT snowflake.convert_sequence_to_snowflake('t6_x_seq'); +NOTICE: EXECUTE ALTER TABLE public.t6 ALTER COLUMN x SET DEFAULT snowflake.nextval('public.t6_x_seq'::regclass) +NOTICE: ALTER SEQUENCE public.t6_x_seq NO CYCLE MAXVALUE 4 + convert_sequence_to_snowflake +------------------------------- + 0 +(1 row) + +-- Get the count portion of the id. +-- It happens within the same milisecond, should increment +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + get_count +----------- + 0 + 1 + 2 +(3 rows) + +-- Should return 1 (the counter difference) +SELECT ABS(snowflake.nextval('t6_x_seq') - snowflake.nextval('t6_x_seq')); + abs +----- + 1 +(1 row) + +-- If < 4096, will increment that amount +ALTER SEQUENCE t6_x_seq INCREMENT 100 NO MAXVALUE; +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + get_count +----------- + 104 + 204 + 304 +(3 rows) + +-- This will force the time ms portion to increment, +-- the count portion should be 0 +ALTER SEQUENCE t6_x_seq INCREMENT 4096; +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + get_count +----------- + 0 + 0 + 0 +(3 rows) + +-- should return 1 (the ms difference) +SELECT ABS(snowflake.nextval('t6_x_seq') - snowflake.nextval('t6_x_seq')) >> 22; + ?column? +---------- + 1 +(1 row) + +-- Test unreasonable value- still should get 0s +ALTER SEQUENCE t6_x_seq INCREMENT 9999; +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + get_count +----------- + 0 + 0 + 0 +(3 rows) + +-- should return 1 (the ms difference) +SELECT ABS(snowflake.nextval('t6_x_seq') - snowflake.nextval('t6_x_seq')) >> 22; + ?column? +---------- + 1 +(1 row) + -- Cleanup -DROP TABLE t1,t2,t3,t4,t5 CASCADE; +DROP TABLE t1,t2,t3,t4,t5,t6 CASCADE; DROP SEQUENCE favorite_seq; DROP EXTENSION snowflake; diff --git a/hibernate.md b/hibernate.md index af2103d..9c0cafc 100644 --- a/hibernate.md +++ b/hibernate.md @@ -1,5 +1,5 @@ Hibernate uses identifiers as the primary key of an entity. There are ways to -define identifiers in Hibernate, which are unique keys for an entity, ensuring +define identifiers in Hibernate, which are unique keys for an entity, ensuring they are not null. pgEdge Snowflake Sequences for PostgreSQL can be used with Hibernate, in this @@ -84,5 +84,10 @@ FROM 170505378935607296 "London" ``` -It is a basic example to demostrate use of pgEdge Snow Flake sequence with the +It is a basic example to demostrate use of pgEdge Snowflake sequence with the Java Hibernate code. + + +**INCREMENT and Caching** + +Within the same millisecond, snowflake will allow a maximum of 4096 values. Hibernate may be configured to try to fetch a sequence value, cache and assign a range by using an `INCREMENT` value for the sequence. Snowflake sequences honor the increment value, but you should not use a value greater than 4096. If you would like to change the increment value, use `ALTER SEQUENCE INCREMENT NO MAXVALUE`. diff --git a/snowflake.c b/snowflake.c index ddadd39..eb0c027 100644 --- a/snowflake.c +++ b/snowflake.c @@ -172,7 +172,9 @@ snowflake_nextval(PG_FUNCTION_ARGS) Buffer buf; Page page; HeapTupleData seqdatatuple; + HeapTuple pgstuple; Form_pg_sequence_data seq; + Form_pg_sequence pgsform; int64 result; bool logit = false; Snowflake flake; @@ -243,9 +245,25 @@ snowflake_nextval(PG_FUNCTION_ARGS) * The clock either has not ticked or is behind. We need to make * sure that the flake doesn't move backwards and that we bump * it into the future should the count roll over. + * + * There is special handing for when ids are assigned in ranges + * like for Hibernate. INCREMENT as part of the sequence definition + * can be used. If the count portion of snowflake exceeds the + * count mask + 1 (4096), it will cause the count to be set to 0 + * and the time ms portion is incremented. */ - flake.sf_count++; - if ((flake.sf_count & SNOWFLAKE_COUNT_MASK) == 0) + + pgstuple = SearchSysCache1(SEQRELID, ObjectIdGetDatum(relid)); + if (HeapTupleIsValid(pgstuple)) + { + pgsform = (Form_pg_sequence) GETSTRUCT(pgstuple); + elm->increment = pgsform->seqincrement; + ReleaseSysCache(pgstuple); + } + flake.sf_count += elm->increment; + + /* Mask uses least signicant bits, so this is safe */ + if (flake.sf_count > SNOWFLAKE_COUNT_MASK) { flake.sf_count = 0; flake.sf_msec++; @@ -533,6 +551,7 @@ init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel) elm->lxid = InvalidLocalTransactionId; elm->last_valid = false; elm->last = elm->cached = 0; + elm->increment = 1; } /* diff --git a/sql/conversion.sql b/sql/conversion.sql index 13f6ead..45e6749 100644 --- a/sql/conversion.sql +++ b/sql/conversion.sql @@ -99,7 +99,57 @@ INSERT INTO t5 values (DEFAULT); SELECT x AS sf_val FROM t4 WHERE x > 42 \gset SELECT :sf_val < x AS is_growing FROM t5 WHERE x > 42; +-- Test INCREMENT +CREATE TABLE t6(x bigserial, y int); +INSERT INTO t6 VALUES (DEFAULT, 10); +INSERT INTO t6 VALUES (DEFAULT, 20); +SELECT * FROM t6; +INSERT INTO t6 VALUES (DEFAULT, 30); +SELECT * FROM t6; +SELECT snowflake.convert_sequence_to_snowflake('t6_x_seq'); +-- Get the count portion of the id. +-- It happens within the same milisecond, should increment +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + +-- Should return 1 (the counter difference) +SELECT ABS(snowflake.nextval('t6_x_seq') - snowflake.nextval('t6_x_seq')); + +-- If < 4096, will increment that amount +ALTER SEQUENCE t6_x_seq INCREMENT 100 NO MAXVALUE; +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + +-- This will force the time ms portion to increment, +-- the count portion should be 0 +ALTER SEQUENCE t6_x_seq INCREMENT 4096; +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + +-- should return 1 (the ms difference) +SELECT ABS(snowflake.nextval('t6_x_seq') - snowflake.nextval('t6_x_seq')) >> 22; + +-- Test unreasonable value- still should get 0s +ALTER SEQUENCE t6_x_seq INCREMENT 9999; +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')) +UNION ALL +SELECT snowflake.get_count(snowflake.nextval('t6_x_seq')); + +-- should return 1 (the ms difference) +SELECT ABS(snowflake.nextval('t6_x_seq') - snowflake.nextval('t6_x_seq')) >> 22; + -- Cleanup -DROP TABLE t1,t2,t3,t4,t5 CASCADE; +DROP TABLE t1,t2,t3,t4,t5,t6 CASCADE; DROP SEQUENCE favorite_seq; DROP EXTENSION snowflake;