diff --git a/CHANGELOG.md b/CHANGELOG.md
index 68216273..5dfad91f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,8 +1,26 @@
-## Databricks Labs Data Generator Release Notes
+# Databricks Labs Data Generator Release Notes

-### Requirements
+## Change History
+All notable changes to the Databricks Labs Data Generator will be documented in this file.

-See the contents of the file `python/require.txt` to see the Python package dependencies
+### Unreleased
+
+#### Changed
+* Refactoring of template text generation for better performance
+
+#### Added
+* Ability to change name of seed column to custom name (defaults to `id`)
+
+### Version 0.3.0
+
+#### Changes
+* Validation for use in Delta Live Tables
+* Documentation updates
+* Minor bug fixes
+* Changes to build and release process to improve performance
+* Modified dependencies to base release on package versions used by Databricks Runtime 9.1 LTS
+* Updated to Spark 3.2.1 or later
+* Unit test updates - migration from `unittest` to `pytest` for many tests

 ### Version 0.2.1

@@ -27,7 +45,10 @@ See the contents of the file `python/require.txt` to see the Python package depe
 * Use of data generator to generate static and streaming data sources in Databricks Delta Live Tables
 * added support for install from PyPi

-### version 0.3.0
+
+### General Requirements
+
+See the contents of the file `python/require.txt` to see the Python package dependencies

 The code for the Databricks Data Generator has the following dependencies

@@ -39,6 +60,13 @@ While the data generator framework does not require all libraries used by the ru
 the Databricks runtime is used, it will use the version found in the Databricks runtime for 9.1 LTS or later.
 You can use older versions of the Databricks Labs Data Generator by referring to that explicit version.

+The recommended method to install the package is to use `pip install` in your notebook to install the package from
+PyPi
+
+For example:
+
+`%pip install dbldatagen`
+
 To use an older DB runtime version in your notebook, you can use the following code in your notebook:

 ```commandline
@@ -46,7 +74,7 @@ To use an older DB runtime version in your notebook, you can use the following c
 ```

 See the [Databricks runtime release notes](https://docs.databricks.com/release-notes/runtime/releases.html)
- for the full list of dependencies.
+ for the full list of dependencies used by the Databricks runtime.

 This can be found at : https://docs.databricks.com/release-notes/runtime/releases.html

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4d1320cd..3cf2565b 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -13,6 +13,9 @@ warrant that you have the legal authority to do so.

 # Building the code

+## Package Dependencies
+See the contents of the file `python/require.txt` to see the Python package dependencies
+
 ## Python compatibility

 The code has been tested with Python 3.8.10 and later.
diff --git a/README.md b/README.md
index ee12dc84..b163a961 100644
--- a/README.md
+++ b/README.md
@@ -10,8 +10,11 @@
 [![build](https://github.com/databrickslabs/dbldatagen/workflows/build/badge.svg?branch=master)](https://github.com/databrickslabs/dbldatagen/actions?query=workflow%3Abuild+branch%3Amaster)
 [![codecov](https://codecov.io/gh/databrickslabs/dbldatagen/branch/master/graph/badge.svg)](https://codecov.io/gh/databrickslabs/dbldatagen)
-[![downloads](https://img.shields.io/github/downloads/databrickslabs/dbldatagen/total.svg)](https://hanadigital.github.io/grev/?user=databrickslabs&repo=dbldatagen)
+[![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypi.org/project/dbldatagen/)
+
 ## Project Description
 The `dbldatgen` Databricks Labs project is a Python library for generating synthetic data within the Databricks