From f7ee95aa7b71ee543dfb616651cd08f6952a4532 Mon Sep 17 00:00:00 2001
From: ronanstokes-db
Date: Fri, 3 Feb 2023 14:10:36 -0800
Subject: [PATCH 1/4] updated badges

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index ee12dc84..c6fac30f 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,10 @@
 [![build](https://github.com/databrickslabs/dbldatagen/workflows/build/badge.svg?branch=master)](https://github.com/databrickslabs/dbldatagen/actions?query=workflow%3Abuild+branch%3Amaster)
 [![codecov](https://codecov.io/gh/databrickslabs/dbldatagen/branch/master/graph/badge.svg)](https://codecov.io/gh/databrickslabs/dbldatagen)
 [![downloads](https://img.shields.io/github/downloads/databrickslabs/dbldatagen/total.svg)](https://hanadigital.github.io/grev/?user=databrickslabs&repo=dbldatagen)
+
+[![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypi.org/project/dbldatagen/)
 ## Project Description
 The `dbldatgen` Databricks Labs project is a Python library for generating synthetic data within the Databricks

From 71d438b469d68ad02ece41314a058f22b6d96032 Mon Sep 17 00:00:00 2001
From: ronanstokes-db
Date: Fri, 3 Feb 2023 14:11:54 -0800
Subject: [PATCH 2/4] updated badges

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c6fac30f..18b8e6e8 100644
--- a/README.md
+++ b/README.md
@@ -11,10 +11,10 @@
 [![build](https://github.com/databrickslabs/dbldatagen/workflows/build/badge.svg?branch=master)](https://github.com/databrickslabs/dbldatagen/actions?query=workflow%3Abuild+branch%3Amaster)
 [![codecov](https://codecov.io/gh/databrickslabs/dbldatagen/branch/master/graph/badge.svg)](https://codecov.io/gh/databrickslabs/dbldatagen)
 [![downloads](https://img.shields.io/github/downloads/databrickslabs/dbldatagen/total.svg)](https://hanadigital.github.io/grev/?user=databrickslabs&repo=dbldatagen)
+[![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypi.org/project/dbldatagen/)
-[![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypi.org/project/dbldatagen/)
 ## Project Description
 The `dbldatgen` Databricks Labs project is a Python library for generating synthetic data within the Databricks

From 95aec5174b2fb5184e71dafd36d290609c4424a9 Mon Sep 17 00:00:00 2001
From: ronanstokes-db
Date: Fri, 3 Feb 2023 14:15:10 -0800
Subject: [PATCH 3/4] updated badges

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 18b8e6e8..b163a961 100644
--- a/README.md
+++ b/README.md
@@ -10,10 +10,10 @@
 [![build](https://github.com/databrickslabs/dbldatagen/workflows/build/badge.svg?branch=master)](https://github.com/databrickslabs/dbldatagen/actions?query=workflow%3Abuild+branch%3Amaster)
 [![codecov](https://codecov.io/gh/databrickslabs/dbldatagen/branch/master/graph/badge.svg)](https://codecov.io/gh/databrickslabs/dbldatagen)
-[![downloads](https://img.shields.io/github/downloads/databrickslabs/dbldatagen/total.svg)](https://hanadigital.github.io/grev/?user=databrickslabs&repo=dbldatagen)
 [![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypi.org/project/dbldatagen/)
 ## Project Description

From 769a31c9b470f092f9bf67bc0a0c4a2ee3abeb3a Mon Sep 17 00:00:00 2001
From: ronanstokes-db
Date: Fri, 3 Feb 2023 16:22:33 -0800
Subject: [PATCH 4/4] updated change log and contributing doc

---
 CHANGELOG.md    | 38 +++++++++++++++++++++++++++++++++-----
 CONTRIBUTING.md |  3 +++
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 68216273..5dfad91f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,8 +1,26 @@
-## Databricks Labs Data Generator Release Notes
+# Databricks Labs Data Generator Release Notes
-### Requirements
+## Change History
+All notable changes to the Databricks Labs Data Generator will be documented in this file.
-See the contents of the file `python/require.txt` to see the Python package dependencies
+### Unreleased
+
+#### Changed
+* Refactoring of template text generation for better performance
+
+#### Added
+* Ability to change name of seed column to custom name (defaults to `id`)
+
+### Version 0.3.0
+
+#### Changes
+* Validation for use in Delta Live Tables
+* Documentation updates
+* Minor bug fixes
+* Changes to build and release process to improve performance
+* Modified dependencies to base release on package versions used by Databricks Runtime 9.1 LTS
+* Updated to Spark 3.2.1 or later
+* Unit test updates - migration from `unittest` to `pytest` for many tests
 ### Version 0.2.1
@@ -27,7 +45,10 @@ See the contents of the file `python/require.txt` to see the Python package depe
 * Use of data generator to generate static and streaming data sources in Databricks Delta Live Tables
 * added support for install from PyPi
-### version 0.3.0
+
+### General Requirements
+
+See the contents of the file `python/require.txt` to see the Python package dependencies
 The code for the Databricks Data Generator has the following dependencies
@@ -39,6 +60,13 @@ While the data generator framework does not require all libraries used by the ru
 the Databricks runtime is used, it will use the version found in the Databricks runtime for 9.1 LTS or later.
 You can use older versions of the Databricks Labs Data Generator by referring to that explicit version.
+The recommended method to install the package is to use `pip install` in your notebook to install the package from
+PyPi
+
+For example:
+
+`%pip install dbldatagen`
+
 To use an older DB runtime version in your notebook, you can use the following code in your notebook:
 ```commandline
@@ -46,7 +74,7 @@ To use an older DB runtime version in your notebook, you can use the following c
 ```
 See the [Databricks runtime release notes](https://docs.databricks.com/release-notes/runtime/releases.html)
- for the full list of dependencies.
+ for the full list of dependencies used by the Databricks runtime.
 This can be found at : https://docs.databricks.com/release-notes/runtime/releases.html
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4d1320cd..3cf2565b 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -13,6 +13,9 @@ warrant that you have the legal authority to do so.
 # Building the code
+## Package Dependencies
+See the contents of the file `python/require.txt` to see the Python package dependencies
+
 ## Python compatibility
 The code has been tested with Python 3.8.10 and later.