Commit 4dd1768

Author: zaber-dev
Add comprehensive documentation for Database Swap, including guides on getting started, configuration, CLI commands, architecture, and troubleshooting.
1 parent 4039140 commit 4dd1768

10 files changed, with 560 additions and 0 deletions.


Learn.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
# Learn Database Swap
2+
3+
A guided path to learning and using Database Swap effectively. This page links to focused docs so you can ramp up quickly and go deeper as needed.
4+
5+
- What it is: A Python tool to migrate data between SQLite, MySQL, and MongoDB with validation, rate limiting, and progress tracking.
6+
- Who it’s for: Developers and data engineers moving datasets across database engines reliably.
7+
8+
## Start here
9+
10+
1) Getting started (install + first migration)
11+
- Read: docs/getting-started.md
12+
13+
2) Understand the configuration
14+
- Read: docs/configuration.md
15+
16+
3) Use the CLI efficiently
17+
- Read: docs/cli.md
18+
19+
4) Know your database adapters
20+
- Read: docs/adapters.md
21+
22+
5) How it works under the hood
23+
- Read: docs/architecture.md
24+
25+
6) Extend to new databases
26+
- Read: docs/extending.md
27+
28+
7) Troubleshoot common issues
29+
- Read: docs/troubleshooting.md
30+
31+
8) Recipes and examples
32+
- Read: docs/recipes.md
33+
34+
9) FAQ
35+
- Read: docs/faq.md
36+
37+
## Quick cheat sheet
38+
39+
Installation (from source)
40+
41+
```powershell
pip install -r requirements.txt
pip install -e .
```
45+
46+
Available CLI commands
47+
48+
```powershell
# Initialize a config file in the current folder
database-swap init-config -o config.yaml

# Test a connection
database-swap test-connection --db-type sqlite --database .\source.db

# Analyze a database (structure + counts)
database-swap analyze --db-type sqlite --database .\source.db

# Migrate from SQLite to MySQL (dry run first)
database-swap migrate --dry-run `
  --source-type sqlite --source-database .\source.db `
  --target-type mysql --target-host localhost --target-database target_db --target-username root

# Run the actual migration
database-swap migrate `
  --source-type sqlite --source-database .\source.db `
  --target-type mysql --target-host localhost --target-database target_db --target-username root
```
68+
69+
Configuration keys you’ll use most
70+
71+
- source.* and target.*: database types and connection info
72+
- migration.batch_size: records per write
73+
- migration.rate_limit_delay: seconds to wait between batches
74+
- migration.tables: subset of tables to migrate
75+
- validation.strict_mode: fail fast on invalid data
76+
77+
See docs/configuration.md for a full reference.
78+
79+
## Examples in repo
80+
81+
- examples/sqlite-to-sqlite.yaml
82+
- examples/mysql-to-mongodb.yaml
83+
84+
These are great starting points for your own config files.
85+
86+
## Contributing
87+
88+
- Issues and pull requests are welcome.
89+
- See docs/extending.md to add new database adapters.

docs/adapters.md

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
1+
# Adapters
2+
3+
Database adapters implement a common interface (`DatabaseAdapter`) to interact with each database engine.
4+
5+
Built-in adapters
6+
7+
- SQLite (`database_swap.adapters.sqlite.SQLiteAdapter`)
8+
- MySQL (`database_swap.adapters.mysql.MySQLAdapter`) – requires mysql-connector-python
9+
- MongoDB (`database_swap.adapters.mongodb.MongoDBAdapter`) – requires pymongo
10+
11+
The adapter factory (`database_swap.adapters.get_adapter`) resolves a type string to the adapter class if its dependency is available.
12+
13+
## Required connection fields
14+
15+
- SQLite: database (file path)
16+
- MySQL: host, database, username (password typically required)
17+
- MongoDB: host, database (username/password optional)
18+
19+
## Common capabilities
20+
21+
- connect(), disconnect(), test_connection()
22+
- get_tables(), get_table_schema(name), get_table_count(name)
23+
- read_data(name, batch_size): yields batches of dict rows
24+
- write_data(name, data, create_table)
25+
- create_table(name, schema), drop_table(name)
26+
27+
Schemas are lightweight dicts with a `columns` map, optional `primary_keys`, and `indexes`.
28+
29+
```python
schema = {
    'columns': {
        'id': {'type': 'INTEGER', 'nullable': False},
        'name': {'type': 'TEXT', 'nullable': True}
    },
    'primary_keys': ['id'],
    'indexes': []
}
```
39+
40+
MongoDB schemas are inferred from sample documents and are only advisory.
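
To make the schema shape concrete, here is a minimal sketch of how such a dict could be turned into a SQLite `CREATE TABLE` statement. The helper `build_create_table_sql` is hypothetical and not part of Database Swap; the built-in adapters may generate DDL differently.

```python
import sqlite3


def build_create_table_sql(table_name: str, schema: dict) -> str:
    """Build a CREATE TABLE statement from a schema dict (hypothetical helper)."""
    column_defs = []
    for name, spec in schema["columns"].items():
        # Columns are treated as nullable unless the schema says otherwise.
        null_clause = "" if spec.get("nullable", True) else " NOT NULL"
        column_defs.append(f'"{name}" {spec["type"]}{null_clause}')
    primary_keys = schema.get("primary_keys") or []
    if primary_keys:
        quoted = ", ".join(f'"{key}"' for key in primary_keys)
        column_defs.append(f"PRIMARY KEY ({quoted})")
    return f'CREATE TABLE "{table_name}" ({", ".join(column_defs)})'


example_schema = {
    "columns": {
        "id": {"type": "INTEGER", "nullable": False},
        "name": {"type": "TEXT", "nullable": True},
    },
    "primary_keys": ["id"],
    "indexes": [],
}

sql = build_create_table_sql("users", example_schema)

# Check the statement against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(sql)
created_columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
conn.close()
```

The sketch ignores `indexes` and does no type mapping between engines, which a real adapter would need to handle.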

docs/architecture.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
# Architecture overview
2+
3+
This project is organized into modular layers for clarity and extensibility.
4+
5+
- CLI: `database_swap/cli/interface.py` – user entry point and argument parsing
6+
- Config: `database_swap/config/settings.py` – YAML loading, defaults, overrides
7+
- Core: `database_swap/core/` – `migrator.py`, `rate_limiter.py`, `validator.py`
8+
- Adapters: `database_swap/adapters/` – DB-specific implementations
9+
- Utils: `database_swap/utils/` – logging, helpers, progress, timers
10+
11+
## Data flow
12+
13+
1) CLI parses args and loads config
14+
2) Migrator is created with config
15+
3) Migrator resolves source/target adapters and connects
16+
4) For each table/collection:
17+
   - Read in batches from source
18+
   - Validate and convert data types (`utils.helpers.convert_data_types`)
19+
   - Rate limit writes (`core.rate_limiter.AdaptiveRateLimiter`)
20+
   - Write to target and track stats
21+
5) Final stats and logs are emitted
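
The data flow above can be sketched as a small loop. This is an illustration only: `InMemoryAdapter` and `migrate` are hypothetical stand-ins, not the project's `DatabaseMigrator` or adapter classes, and the conversion and rate-limiting steps are reduced to injectable callables.

```python
class InMemoryAdapter:
    """Toy stand-in for a database adapter; tables are lists of dict rows."""

    def __init__(self, tables=None):
        self.tables = {name: list(rows) for name, rows in (tables or {}).items()}

    def get_tables(self):
        return list(self.tables)

    def read_data(self, name, batch_size):
        # Yield the table's rows in consecutive batches.
        rows = self.tables[name]
        for start in range(0, len(rows), batch_size):
            yield rows[start:start + batch_size]

    def write_data(self, name, data):
        self.tables.setdefault(name, []).extend(data)
        return True


def migrate(source, target, batch_size=2, convert=lambda row: row, wait=lambda: None):
    """Copy every table from source to target and return rows written per table."""
    stats = {}
    for table in source.get_tables():
        written = 0
        for batch in source.read_data(table, batch_size):
            converted = [convert(row) for row in batch]  # validate/convert step
            wait()  # rate-limiting hook, a no-op by default
            if target.write_data(table, converted):
                written += len(converted)
        stats[table] = written
    return stats


source = InMemoryAdapter({"users": [{"id": i} for i in range(5)]})
target = InMemoryAdapter()
stats = migrate(source, target)
```

Passing `convert` and `wait` as parameters keeps the loop testable without a real database or real delays.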
22+
23+
## Key components
24+
25+
- DatabaseMigrator: Orchestrates migration, batching, and stats
26+
- RateLimiter / AdaptiveRateLimiter: Applies delays based on error rate
27+
- DataValidator / SchemaValidator: Optional validation of rows and schemas
28+
- MigrationStats: Aggregates counters and durations
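
As a rough illustration of delays driven by error rate, the sketch below scales a base delay by the share of recent failures. The class name, parameters, and formula are assumptions made for this example; the actual `AdaptiveRateLimiter` in `core/rate_limiter.py` may work differently.

```python
from collections import deque


class SimpleAdaptiveLimiter:
    """Hypothetical limiter: delay grows with the recent error rate."""

    def __init__(self, base_delay=0.1, window=10, max_multiplier=5.0):
        self.base_delay = base_delay
        self.max_multiplier = max_multiplier
        # Only the most recent `window` outcomes are kept (True = success).
        self.outcomes = deque(maxlen=window)

    def record(self, success: bool) -> None:
        self.outcomes.append(success)

    def error_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def current_delay(self) -> float:
        # Linear interpolation between 1x and max_multiplier x the base delay.
        multiplier = 1.0 + (self.max_multiplier - 1.0) * self.error_rate()
        return self.base_delay * multiplier


limiter = SimpleAdaptiveLimiter(base_delay=1.0, window=4)
limiter.record(True)
limiter.record(False)
delay_after_one_failure = limiter.current_delay()
```

A sliding window lets the delay fall back toward the base value once writes start succeeding again.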
29+
30+
## Error handling
31+
32+
- Retries for failed batch writes (configurable `max_retries`)
33+
- Exponential backoff per batch attempt
34+
- Non-fatal errors are collected and surfaced at the end
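
The retry behavior described above can be sketched as follows. The function name, the doubling schedule, and the default `base_delay` are assumptions for illustration, not the migrator's actual implementation.

```python
import time


def write_with_retries(write_batch, batch, max_retries=3, base_delay=0.1, sleep=time.sleep):
    """Try a batch write, retrying with exponential backoff (hypothetical helper).

    Returns (success, errors), where errors holds the message of each failed attempt.
    """
    errors = []
    for attempt in range(max_retries + 1):
        try:
            write_batch(batch)
            return True, errors
        except Exception as exc:  # collected as non-fatal and reported to the caller
            errors.append(str(exc))
            if attempt < max_retries:
                # Wait base_delay, then 2x, 4x, ... before the next attempt.
                sleep(base_delay * (2 ** attempt))
    return False, errors


# Usage: a writer that fails twice, then succeeds. Delays are recorded, not slept.
attempts = {"count": 0}


def flaky_write(batch):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")


recorded_delays = []
succeeded, collected_errors = write_with_retries(
    flaky_write, [{"id": 1}], sleep=recorded_delays.append
)
```

Injecting `sleep` makes the backoff schedule observable in tests without slowing them down.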
35+
36+
## Logging
37+
38+
- File and colored console logging via `utils.logger.setup_logging`
39+
- Progress logs per table via `ProgressLogger`

docs/cli.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
1+
# CLI reference
2+
3+
The `database-swap` CLI exposes four main commands defined in `database_swap/cli/interface.py`.
4+
5+
- init-config: Create a starter configuration file
6+
- test-connection: Check that you can connect to a database
7+
- analyze: Inspect tables/collections and record counts
8+
- migrate: Migrate data between source and target
9+
10+
All commands accept a global `--config` (-c) to load settings from YAML and `--verbose` (-v) for DEBUG logging.
11+
12+
## init-config
13+
14+
Create a `config.yaml` in the current directory (or a custom output path).
15+
16+
```powershell
database-swap init-config -o config.yaml
```
19+
20+
## test-connection
21+
22+
Test connectivity to one database.
23+
24+
```powershell
# SQLite
database-swap test-connection --db-type sqlite --database .\source.db

# MySQL
database-swap test-connection --db-type mysql --host localhost --database mydb --username root --password

# MongoDB
database-swap test-connection --db-type mongodb --host localhost --database mydb
```
34+
35+
Options
36+
37+
- --db-type: sqlite | mysql | mongodb (required)
38+
- --host, --port
39+
- --database (required)
40+
- --username, --password
41+
42+
## analyze
43+
44+
Summarize tables/collections, schemas, and counts.
45+
46+
```powershell
database-swap analyze --db-type sqlite --database .\source.db
```
49+
50+
Options
51+
52+
- --db-type: sqlite | mysql | mongodb (required)
53+
- --host, --port
54+
- --database (required)
55+
- --username, --password
56+
- --table: analyze one table/collection only
57+
58+
## migrate
59+
60+
Perform the migration. You can pass connection details via flags or a YAML config.
61+
62+
```powershell
# Dry run (no writes)
database-swap migrate --dry-run `
  --source-type sqlite --source-database .\source.db `
  --target-type mysql --target-host localhost --target-database target_db --target-username root

# Actual run
database-swap migrate `
  --source-type sqlite --source-database .\source.db `
  --target-type mysql --target-host localhost --target-database target_db --target-username root
```
73+
74+
Options
75+
76+
- Source: --source-type, --source-host, --source-port, --source-database, --source-username, --source-password
77+
- Target: --target-type, --target-host, --target-port, --target-database, --target-username, --target-password
78+
- Scope: --tables "a,b,c" (optional)
79+
- Performance: --batch-size, --rate-limit-delay, --max-retries
80+
- Safety: --dry-run
81+
82+
## Using a config file
83+
84+
All flags can be provided through YAML and overridden on the CLI.
85+
86+
```powershell
database-swap migrate --config .\config.yaml
```
89+
90+
Global flags can precede any subcommand:
91+
92+
```powershell
database-swap -c .\config.yaml -v migrate
```
95+
96+
See docs/configuration.md for the YAML schema.

docs/configuration.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
1+
# Configuration reference
2+
3+
Database Swap loads YAML from `--config` or searches: `config.yaml`, `config.yml`, `%USERPROFILE%\.database_swap.yaml`, `/etc/database_swap.yaml`.
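
A lookup like this could be sketched as below. The function `find_config_file` is hypothetical, and the assumption that candidates are tried in the listed order is mine; check `database_swap/config/settings.py` for the real behavior.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def find_config_file(explicit: Optional[str], candidates: Iterable[Path]) -> Optional[Path]:
    """Return the explicit path if given, else the first candidate that exists."""
    if explicit:
        return Path(explicit)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Candidate locations mirroring the list above (search order is an assumption).
default_candidates = [
    Path("config.yaml"),
    Path("config.yml"),
    Path.home() / ".database_swap.yaml",
    Path("/etc/database_swap.yaml"),
]

# Demo in a temporary directory so no real config files are touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.yml").write_text("source: {}\n")
    found = find_config_file(None, [Path(tmp) / "config.yaml", Path(tmp) / "config.yml"])
    found_name = found.name if found else None
```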
4+
5+
The default shape (see `database_swap/config/settings.py`):
6+
7+
```yaml
source:
  type: sqlite  # one of: sqlite | mysql | mongodb
  connection:
    host: localhost
    port: null
    database: source.db
    username: null
    password: null

target:
  type: sqlite  # one of: sqlite | mysql | mongodb
  connection:
    host: localhost
    port: null
    database: target.db
    username: null
    password: null

migration:
  batch_size: 1000
  rate_limit_delay: 0.1
  max_retries: 3
  timeout: 30
  tables: null # null = all tables/collections

validation:
  strict_mode: true
  data_type_validation: true
  foreign_key_validation: false

logging:
  level: INFO
  file: database_swap.log
  console: true
```
43+
44+
Notes
45+
46+
- For SQLite, only `connection.database` is required.
47+
- For MySQL, required: host, database, username; password typically too.
48+
- For MongoDB, required: host, database. Username/password optional.
49+
- `tables`: set a list like ["users", "orders"] to migrate a subset.
50+
- `rate_limit_delay`: seconds to wait between batch writes; the delay is adjusted adaptively during the run.
51+
52+
## Override with CLI
53+
54+
The following keys can be overridden with CLI flags, which are applied by `Config.update_from_args`:
55+
56+
- Source: --source-type, --source-host, --source-port, --source-database, --source-username, --source-password
57+
- Target: --target-type, --target-host, --target-port, --target-database, --target-username, --target-password
58+
- Migration: --batch-size, --rate-limit-delay, --max-retries, --timeout, --tables
59+
- Logging: implied via `-v` (DEBUG) or by editing YAML
60+
61+
Example
62+
63+
```powershell
database-swap migrate -c .\config.yaml --batch-size 500 --tables "users,orders"
```
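
Conceptually, the override step merges CLI values on top of the loaded YAML. The sketch below shows one way this could work for the migration keys; `apply_cli_overrides` is a hypothetical function and is not the code of `Config.update_from_args`.

```python
import copy

# Migration keys that map one-to-one from CLI flag names (assumed mapping).
MIGRATION_KEYS = ("batch_size", "rate_limit_delay", "max_retries", "timeout")


def apply_cli_overrides(config: dict, args: dict) -> dict:
    """Return a copy of config with non-None CLI values applied (hypothetical)."""
    merged = copy.deepcopy(config)
    migration = merged.setdefault("migration", {})
    for key in MIGRATION_KEYS:
        if args.get(key) is not None:
            migration[key] = args[key]
    if args.get("tables"):
        # "users,orders" becomes ["users", "orders"]
        migration["tables"] = [
            name.strip() for name in args["tables"].split(",") if name.strip()
        ]
    return merged


yaml_config = {"migration": {"batch_size": 1000, "max_retries": 3, "tables": None}}
cli_args = {"batch_size": 500, "max_retries": None, "tables": "users,orders"}
effective = apply_cli_overrides(yaml_config, cli_args)
```

Flags left unset (`None`) keep the YAML value, so the file stays the baseline and the command line only adjusts what it names.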

docs/extending.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
# Extending Database Swap
2+
3+
You can add support for a new database by implementing `DatabaseAdapter` and registering it.
4+
5+
## 1) Create an adapter class
6+
7+
Create `database_swap/adapters/postgresql.py` (as an example):
8+
9+
```python
from .base import DatabaseAdapter


class PostgreSQLAdapter(DatabaseAdapter):
    def connect(self) -> bool: ...
    def disconnect(self) -> None: ...
    def test_connection(self) -> bool: ...
    def get_tables(self) -> list[str]: ...
    def get_table_schema(self, table_name: str) -> dict: ...
    def get_table_count(self, table_name: str) -> int: ...
    def read_data(self, table_name: str, batch_size: int = 1000, offset: int = 0): ...
    def write_data(self, table_name: str, data: list[dict], create_table: bool = True) -> bool: ...
    def create_table(self, table_name: str, schema: dict) -> bool: ...
    def drop_table(self, table_name: str) -> bool: ...
```
24+
25+
Reuse patterns from `sqlite.py`, `mysql.py`, and `mongodb.py`.
26+
27+
## 2) Register in the factory
28+
29+
Edit `database_swap/adapters/__init__.py` to import conditionally and expose your adapter when its dependency is installed. Follow the MySQL/MongoDB examples.
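
The general pattern of exposing an adapter only when its dependency is installed can be sketched as below. The registry and the names `register_adapter` and `resolve_adapter` are hypothetical; the real factory in `database_swap/adapters/__init__.py` may be structured differently, so treat this as a pattern rather than the project's code.

```python
import importlib.util

# Maps a lowercase type string to an adapter class (hypothetical registry).
_ADAPTERS = {}


def register_adapter(db_type, adapter_cls, required_module=None):
    """Register adapter_cls unless its top-level dependency is missing."""
    if required_module and importlib.util.find_spec(required_module) is None:
        return False
    _ADAPTERS[db_type.lower()] = adapter_cls
    return True


def resolve_adapter(db_type):
    """Look up the adapter class for a type string, case-insensitively."""
    try:
        return _ADAPTERS[db_type.lower()]
    except KeyError:
        raise ValueError(f"Unsupported or unavailable database type: {db_type}") from None


class DummySQLiteAdapter:
    """Placeholder class used only for this example."""


# sqlite3 ships with Python, so this registration succeeds.
sqlite_registered = register_adapter("sqlite", DummySQLiteAdapter, required_module="sqlite3")
# A dependency that is not installed leaves the type unregistered.
missing_registered = register_adapter(
    "imaginarydb", DummySQLiteAdapter, required_module="module_that_does_not_exist_xyz"
)
```

Checking with `importlib.util.find_spec` avoids importing the driver just to see whether it is available.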
30+
31+
## 3) Map types and conversions
32+
33+
If your engine needs special conversions, update `utils.helpers.convert_value` as needed to ensure cross-engine safety.
34+
35+
## 4) Tests and docs
36+
37+
- Add sample connection details and a minimal test DB
38+
- Document required fields in `docs/adapters.md`
39+
40+
## 5) Packaging
41+
42+
- Add the dependency to `requirements.txt` (or make it optional and document installation)
