Description
I am using dlt-meta to onboard a CSV file whose column names contain special characters.
During ingestion into the Bronze layer, DLT fails with an error about Delta column-naming requirements, because the raw CSV column names are used as-is.
To handle this, my understanding is:
I should define an explicit DDL (valid column names plus data types) in the onboarding configuration file.
This DDL does appear correctly in the bronze_dataflow_spec.schema column.
However, DLT-Meta does not apply the defined schema when creating or processing the Bronze table, so the pipeline still fails on the original special-character column names.
This looks like a bug or a missing feature: even when an explicit DDL is provided, DLT-Meta continues to infer the raw schema and does not rename columns or apply the provided column definitions.
I am using version v0.0.09 of the dlt-meta library.
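For reference, here is the shape of my onboarding entry (a simplified fragment; the key names and values are abbreviated from my own config, and the path is illustrative, not real). The DDL string on the last line is exactly what shows up in bronze_dataflow_spec.schema:

```yaml
# Simplified onboarding entry (illustrative values)
- data_flow_id: "101"
  source_format: "csv"
  source_details:
    source_path_dev: "s3://my-bucket/raw/orders/"   # illustrative path
  bronze_reader_options:
    header: "true"
  # Explicit DDL with valid column names and data types --
  # this lands in bronze_dataflow_spec.schema but is not applied at ingestion
  schema: "order_id INT, customer_name STRING, net_amount_usd DOUBLE"
```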
Expected Behavior
When an explicit DDL is defined in the onboarding YAML, DLT-Meta should apply the provided schema to the Bronze table, including the valid column names.
Bronze ingestion should succeed without requiring a custom bronze_custom_transform_func just to fix column names.
The Bronze layer should not enforce the raw CSV column names when the user has already supplied a valid DDL.
Actual Behavior
The DDL appears in the metadata table (bronze_dataflow_spec.schema), but it is not used during Bronze table creation.
The pipeline fails with a Delta column-naming error caused by the special characters in the raw CSV column names.
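For now, the only workaround I can see is to pass a custom transform that sanitizes the column names before the Bronze write. A minimal sketch of such a function is below; the renaming logic is my own assumption, not part of dlt-meta, and rename_invalid_columns is a hypothetical name for whatever gets registered as the bronze_custom_transform_func:

```python
import re

def sanitize_name(name: str) -> str:
    """Replace characters Delta disallows in column names
    (space , ; { } ( ) newline tab =) with underscores."""
    return re.sub(r"[ ,;{}()\n\t=]+", "_", name).strip("_")

def rename_invalid_columns(df):
    """Hypothetical bronze_custom_transform_func: rename every column whose
    raw CSV name would violate Delta column-naming rules.
    `df` is a Spark DataFrame; only the stdlib is used for the name logic,
    so sanitize_name can be exercised without a Spark session."""
    for old in df.columns:
        new = sanitize_name(old)
        if new != old:
            df = df.withColumnRenamed(old, new)
    return df
```

This unblocks ingestion, but it duplicates information that is already declared in the onboarding DDL, which is why I would expect the provided schema to be applied instead.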
Question / Request for Help
Can dlt-meta enforce a user-defined DDL schema without requiring a bronze_custom_transform_func?
Is this a missing capability, or an issue in the current implementation where the defined schema is not applied during Bronze ingestion?
Any guidance or fix would be appreciated.