-
Notifications
You must be signed in to change notification settings - Fork 359
From SQL to SPL:Get the records after and before the searched one
The ProductionLine_Number in a certain table of the Mariadb database is a grouping field, and there are duplicate values in the Cardboard_Number field within the group.
| id | Cardboard_Number | date_Time | ProductionLine_Number |
|---|---|---|---|
| 2 | WDL-005943998-1 | 2024-02-29 17:13:50 | 1 |
| 4 | spL1ml82N4o | 2024-02-29 17:13:54 | 1 |
| 5 | WDL-005943998-1 | 2024-03-01 09:44:42 | 1 |
| 6 | WDL-005943998-1 | 2024-03-01 10:34:57 | 1 |
| 7 | 950024027237 | 2024-03-01 10:44:57 | 1 |
| 8 | 950024027237 | 2024-03-01 10:52:57 | 1 |
| 9 | WDL-005943998-1 | 2024-03-01 13:58:43 | 2 |
| 10 | WDL-005943998-1 | 2024-03-01 13:58:46 | 2 |
| 11 | spL1ml82N4o | 2024-03-01 14:09:43 | 2 |
| 12 | WDL-005943998-1 | 2024-03-12 15:48:36 | 2 |
Group by ProductionLine_Number, sort by date_Time within the group, and search for all records in each group with Cardboard_Number equal to the specified string. Retrieve the records before and after the specified offset and remove duplicate records. For example, Cardboard_Number="WDL-005943998-1", with an offset of 1, the result is as follows:
| id | Cardboard_Number | date_Time | ProductionLine_Number |
|---|---|---|---|
| 2 | WDL-005943998-1 | 2024-02-29 17:13:50 | 1 |
| 4 | spL1ml82N4o | 2024-02-29 17:13:54 | 1 |
| 5 | WDL-005943998-1 | 2024-03-01 09:44:42 | 1 |
| 6 | WDL-005943998-1 | 2024-03-01 10:34:57 | 1 |
| 7 | 950024027237 | 2024-03-01 10:44:57 | 1 |
| 9 | WDL-005943998-1 | 2024-03-01 13:58:43 | 2 |
| 10 | WDL-005943998-1 | 2024-03-01 13:58:46 | 2 |
| 11 | spL1ml82N4o | 2024-03-01 14:09:43 | 2 |
| 12 | WDL-005943998-1 | 2024-03-12 15:48:36 | 2 |
If Cardboard_Number= "spL1ml82N4o" with an offset of 2, then the IDs of the result are 2,4,5,6,9,10,11,12.
SQL:
with ranked_table AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ProductionLine_Number ORDER BY date_Time) AS rn
FROM table1
),
filtered_table AS (
SELECT id, Cardboard_Number, date_Time, ProductionLine_Number,rn
FROM ranked_table
WHERE Cardboard_Number = 'WDL-005943998-1'
)
SELECT DISTINCT t1.id, t1.Cardboard_Number, t1.date_Time, t1.ProductionLine_Number
FROM ranked_table t1
JOIN filtered_table t2
ON t1.ProductionLine_Number = t2.ProductionLine_Number
AND (t1.rn = t2.rn OR t1.rn = t2.rn - 1 OR t1.rn = t2.rn + 1)
ORDER BY ProductionLine_Number, date_Time;
SQL requires using window functions to spell out sequence numbers, and then implementing interval association using JOIN, which can be quite lengthy in code. SPL has grouped subsets and a positional reference mechanism, with simple code: https://try.esproc.com/splx?3Io
| A | |
|---|---|
| 1 | $select * from table1.txt order by ProductionLine_Number, date_Time |
| 2 | =A1.group(ProductionLine_Number) |
| 3 | =A2.conj(~.select(Cardboard_Number[-1:1].contain("WDL-005943998-1"))) |
A2: Group but do not aggregate.
A3: Search for the records of each grouped subset, find the records within the interval of 1 record before and 1 record after (a total of 3 records) with the Cardboard_Number field equal to the specified string, and merge the calculation results of each group. [-1:1] represents the relative interval range.
SPL Resource: SPL Official Website | SPL Blog | Download esProc SPL | SPL Source Code