Merged
38 changes: 33 additions & 5 deletions CHANGELOG.md
@@ -1,8 +1,26 @@
## Databricks Labs Data Generator Release Notes
# Databricks Labs Data Generator Release Notes

### Requirements
## Change History
All notable changes to the Databricks Labs Data Generator will be documented in this file.

See the contents of the file `python/require.txt` to see the Python package dependencies
### Unreleased

#### Changed
* Refactoring of template text generation for better performance

#### Added
* Ability to change name of seed column to custom name (defaults to `id`)

### Version 0.3.0

#### Changes
* Validation for use in Delta Live Tables
* Documentation updates
* Minor bug fixes
* Changes to build and release process to improve performance
* Modified dependencies to base release on package versions used by Databricks Runtime 9.1 LTS
* Updated to Spark 3.2.1 or later
* Unit test updates - migration from `unittest` to `pytest` for many tests

### Version 0.2.1

@@ -27,7 +45,10 @@ See the contents of the file `python/require.txt` to see the Python package depe
* Use of data generator to generate static and streaming data sources in Databricks Delta Live Tables
* added support for install from PyPi

### version 0.3.0

### General Requirements

See the contents of the file `python/require.txt` to see the Python package dependencies

The code for the Databricks Data Generator has the following dependencies

@@ -39,14 +60,21 @@ While the data generator framework does not require all libraries used by the ru
the Databricks runtime is used, it will use the version found in the Databricks runtime for 9.1 LTS or later.
You can use older versions of the Databricks Labs Data Generator by referring to that explicit version.

The recommended method to install the package is to use `pip install` in your notebook to install the package from
PyPi

For example:

`%pip install dbldatagen`

To use an older DB runtime version in your notebook, you can use the following code in your notebook:

```commandline
%pip install git+https://github.com/databrickslabs/dbldatagen@dbr_7_3_LTS_compat
```

See the [Databricks runtime release notes](https://docs.databricks.com/release-notes/runtime/releases.html)
for the full list of dependencies.
for the full list of dependencies used by the Databricks runtime.

This can be found at : https://docs.databricks.com/release-notes/runtime/releases.html

3 changes: 3 additions & 0 deletions CONTRIBUTING.md
@@ -13,6 +13,9 @@ warrant that you have the legal authority to do so.

# Building the code

## Package Dependencies
See the contents of the file `python/require.txt` to see the Python package dependencies

## Python compatibility

The code has been tested with Python 3.8.10 and later.
5 changes: 4 additions & 1 deletion README.md
@@ -10,8 +10,11 @@

[![build](https://github.com/databrickslabs/dbldatagen/workflows/build/badge.svg?branch=master)](https://github.com/databrickslabs/dbldatagen/actions?query=workflow%3Abuild+branch%3Amaster)
[![codecov](https://codecov.io/gh/databrickslabs/dbldatagen/branch/master/graph/badge.svg)](https://codecov.io/gh/databrickslabs/dbldatagen)
[![downloads](https://img.shields.io/github/downloads/databrickslabs/dbldatagen/total.svg)](https://hanadigital.github.io/grev/?user=databrickslabs&repo=dbldatagen)
[![PyPi downloads](https://img.shields.io/pypi/dm/dbldatagen?label=PyPi%20Downloads)](https://pypi.org/project/dbldatagen/)
<!--
[![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/databrickslabs/dbldatagen.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/databrickslabs/dbldatagen/context:python)
[![downloads](https://img.shields.io/github/downloads/databrickslabs/dbldatagen/total.svg)](https://hanadigital.github.io/grev/?user=databrickslabs&repo=dbldatagen)
-->

## Project Description
The `dbldatagen` Databricks Labs project is a Python library for generating synthetic data within the Databricks