Commit 4be5387

Merge pull request #5 from bsv-blockchain/refactor
Refactor to move to postgres
2 parents 7f1f762 + ddad31e commit 4be5387

33 files changed: +3543 −195 lines

Dockerfile

Lines changed: 4 additions & 2 deletions

```diff
@@ -34,10 +34,12 @@ RUN CGO_ENABLED=1 GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH} go build -a -o te
 # Use UBI9 so we have glibc
 FROM registry.access.redhat.com/ubi9-minimal:9.3
 WORKDIR /
-RUN microdnf install -y sqlite
+# Install PostgreSQL client libraries for database connectivity
+RUN microdnf install -y postgresql
 COPY --from=builder /workspace/teranode-p2p-poc .
 COPY --from=frontend-builder /app/frontend-react/build ./frontend-react/build
-COPY config.yaml .
+# Don't copy config.yaml directly - we'll use ConfigMap in Kubernetes
+# COPY config.yaml .

 # Expose the HTTP port
 EXPOSE 8080
```

POSTGRES_SETUP.md

Lines changed: 273 additions & 0 deletions
# PostgreSQL Optimized Setup for Teranode P2P

This guide explains how to set up and use the PostgreSQL-optimized version of the Teranode P2P application, designed to handle millions of records with high performance.

## Key Improvements

### Database Optimizations

- **PostgreSQL** replaces SQLite for better concurrency and scalability
- **Table partitioning** by month for efficient data management
- **Composite indexes** for common query patterns
- **BRIN indexes** for time-based queries
- **Materialized views** for fast statistics
- **Batch inserts** (1000 records at a time)
- **Connection pooling** (50 connections)
- **Prepared statements** for query optimization

### Performance Features

- Handles millions of records efficiently
- Sub-second query responses with proper indexing
- Automatic monthly partition creation
- Background stats calculation
- Efficient data cleanup for old partitions
- Concurrent read/write operations
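As a rough sketch of what the partitioning and indexing above look like in DDL (table, column, and index names here are illustrative; the real definitions live in `migrations/001_optimized_schema.sql`):

```sql
-- Illustrative schema sketch, not the actual migration
CREATE TABLE blocks (
    id         BIGSERIAL,
    network    TEXT        NOT NULL,
    hash       TEXT        NOT NULL,
    height     BIGINT      NOT NULL,
    peer_id    TEXT        NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- One partition per month
CREATE TABLE blocks_2024_09 PARTITION OF blocks
    FOR VALUES FROM ('2024-09-01') TO ('2024-10-01');

-- Composite index for a common query pattern (filter by network, newest first)
CREATE INDEX idx_blocks_network_height ON blocks (network, height DESC);

-- BRIN index: very small, effective for append-mostly, time-ordered data
CREATE INDEX idx_blocks_created_at_brin ON blocks USING brin (created_at);
```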
## Quick Start

### 1. Start PostgreSQL with Docker

```bash
# Start PostgreSQL and Redis
docker-compose -f docker-compose.postgres.yml up -d

# Wait for PostgreSQL to be ready
docker-compose -f docker-compose.postgres.yml ps

# Check logs if needed
docker-compose -f docker-compose.postgres.yml logs postgres
```

### 2. Initialize the Database Schema

```bash
# Connect to PostgreSQL and run the migration
docker exec -i teranode-postgres psql -U teranode -d teranode_p2p < migrations/001_optimized_schema.sql
```

### 3. Build and Run the Application

```bash
# Build with PostgreSQL support
go build -tags postgres -o teranode-p2p-postgres cmd/main_postgres.go

# Copy the PostgreSQL config
cp config.postgres.yaml config.yaml

# Run the application
./teranode-p2p-postgres
```
## Configuration

### Database Settings (config.yaml)

```yaml
database:
  host: "localhost"   # or "postgres" when running in Docker
  port: 5432
  user: "teranode"
  password: "teranode_secure_password"
  name: "teranode_p2p"
  sslmode: "disable"
```

### Performance Tuning

```yaml
performance:
  batch_size: 1000                 # Messages per batch insert
  batch_interval: 5                # Seconds between batch flushes
  stats_interval: 60               # Stats calculation interval (seconds)
  materialized_view_refresh: 300   # View refresh interval (seconds)
```
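The `batch_size` setting governs how many buffered messages are flushed per multi-row INSERT. Conceptually (column names assumed from the load-testing example later in this guide):

```sql
-- One round trip inserts many rows instead of one
INSERT INTO blocks (network, hash, height, peer_id) VALUES
    ('mainnet', 'hash1', 1, 'peer123'),
    ('mainnet', 'hash2', 2, 'peer123'),
    ('mainnet', 'hash3', 3, 'peer123');
-- ...up to batch_size (1000) rows per statement
```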
## Database Management

### View Statistics

```bash
# Connect to the database
docker exec -it teranode-postgres psql -U teranode -d teranode_p2p
```

Then, inside psql:

```sql
-- View table sizes
\dt+

-- View partition information
SELECT
    parent.relname AS parent_table,
    child.relname AS partition_name,
    pg_size_pretty(pg_relation_size(child.oid)) AS size
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
ORDER BY parent.relname, child.relname;

-- View materialized views
\dm+

-- Check slow queries (requires the pg_stat_statements extension)
SELECT query, calls, mean_exec_time, max_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 1000
ORDER BY mean_exec_time DESC
LIMIT 10;
```
### Refresh Materialized Views

```sql
-- Manual refresh if needed (refresh_all_materialized_views() is defined by the migration)
SELECT refresh_all_materialized_views();

-- List materialized views and whether they are populated
-- (PostgreSQL does not track the last refresh time natively)
SELECT schemaname, matviewname, ispopulated
FROM pg_matviews
WHERE schemaname = 'public';
```
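The views themselves are created by the migration; an illustrative (hypothetical) stats view, set up so it can be refreshed without blocking readers, might look like:

```sql
CREATE MATERIALIZED VIEW block_stats AS
SELECT network, count(*) AS block_count, max(height) AS best_height
FROM blocks
GROUP BY network;

-- A unique index is required for REFRESH ... CONCURRENTLY,
-- which lets reads continue during the refresh
CREATE UNIQUE INDEX ON block_stats (network);
REFRESH MATERIALIZED VIEW CONCURRENTLY block_stats;
```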
### Data Cleanup

```sql
-- Drop old partitions (e.g., older than 3 months)
DROP TABLE IF EXISTS blocks_2024_09;
DROP TABLE IF EXISTS handshakes_2024_09;
-- etc. for the other partitioned tables

-- Vacuum and analyze for performance
VACUUM ANALYZE;
```
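On PostgreSQL 14 and later, detaching a partition before dropping it keeps the lock on the parent table brief (a sketch, using the partition names above):

```sql
-- DETACH ... CONCURRENTLY cannot run inside a transaction block
ALTER TABLE blocks DETACH PARTITION blocks_2024_09 CONCURRENTLY;
DROP TABLE blocks_2024_09;
```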
## Monitoring

### Using pgAdmin

1. Access pgAdmin at http://localhost:5050
2. Log in with:
   - Email: admin@teranode.local
   - Password: admin_password
3. Add a server connection:
   - Host: postgres
   - Port: 5432
   - Username: teranode
   - Password: teranode_secure_password
### Performance Metrics

```sql
-- Current connections
SELECT count(*) FROM pg_stat_activity;

-- Database size (human-readable)
SELECT pg_size_pretty(pg_database_size('teranode_p2p'));

-- Table sizes
SELECT
    relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- Index usage
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
```
## API Endpoints

All existing endpoints work with PostgreSQL, with improved performance:

- `/api/blocks` - Query blocks with pagination
- `/api/handshakes` - Query handshakes
- `/api/stats` - Get cached statistics (sub-second response)
- `/api/peers` - List peers with activity
- `/api/block-headers` - Query block headers
## Troubleshooting

### Connection Issues

```bash
# Check that PostgreSQL is running
docker-compose -f docker-compose.postgres.yml ps

# View logs
docker-compose -f docker-compose.postgres.yml logs -f postgres

# Test the connection
docker exec -it teranode-postgres psql -U teranode -d teranode_p2p -c "SELECT 1;"
```
### Performance Issues

```sql
-- Look for columns that may benefit from an index
-- (high cardinality, low physical correlation)
SELECT
    schemaname,
    tablename,
    attname,
    n_distinct,
    correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND n_distinct > 100
  AND correlation < 0.1
ORDER BY n_distinct DESC;

-- Analyze query performance
EXPLAIN ANALYZE SELECT * FROM blocks WHERE network = 'mainnet' LIMIT 100;
```
### Reset Database

```bash
# Stop the application first, then drop and recreate the database
docker exec -it teranode-postgres psql -U teranode -c "DROP DATABASE teranode_p2p;"
docker exec -it teranode-postgres psql -U teranode -c "CREATE DATABASE teranode_p2p;"

# Re-run the migration
docker exec -i teranode-postgres psql -U teranode -d teranode_p2p < migrations/001_optimized_schema.sql
```
## Production Recommendations

1. **Use connection pooling** - Consider PgBouncer for very high loads
2. **Enable SSL** - Set `sslmode: require` in production
3. **Take regular backups** - Use pg_dump or continuous archiving
4. **Monitor disk space** - Partitions can grow large
5. **Tune PostgreSQL** - Adjust `shared_buffers` and `work_mem` based on server RAM
6. **Use read replicas** - For read-heavy workloads
7. **Enable compression** - For archived partitions
8. **Set up alerting** - Monitor slow queries, connection count, and disk usage
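As a rough starting point for item 5 on a dedicated 16 GB server (illustrative values only; tune against your actual workload):

```conf
# postgresql.conf -- illustrative starting points, not definitive settings
shared_buffers = 4GB          # ~25% of RAM is a common rule of thumb
work_mem = 64MB               # per sort/hash operation, per connection
maintenance_work_mem = 512MB  # speeds up VACUUM and index builds
effective_cache_size = 12GB   # planner hint: OS cache + shared_buffers
```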
## Load Testing

```bash
# Generate test load (example: one million synthetic block rows;
# for realistic speed, batch these in transactions or use COPY)
for i in {1..1000000}; do
    echo "INSERT INTO blocks (network, hash, height, peer_id) VALUES ('mainnet', 'hash$i', $i, 'peer123');"
done | docker exec -i teranode-postgres psql -U teranode -d teranode_p2p

# Check the row count
docker exec -it teranode-postgres psql -U teranode -d teranode_p2p -c "SELECT COUNT(*) FROM blocks;"
```
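Row-at-a-time INSERTs as above are the slowest loading path; for bulk loads, COPY is typically much faster (a sketch, assuming a CSV whose columns match; the file path is illustrative and must be readable by the server, or use psql's `\copy` to stream from the client):

```sql
-- Bulk load from CSV in one statement
COPY blocks (network, hash, height, peer_id)
FROM '/tmp/blocks.csv' WITH (FORMAT csv);
```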
## Scaling Further

For even larger scales (10M+ records):

1. **Implement sharding** - Distribute data across multiple PostgreSQL instances
2. **Use TimescaleDB** - A PostgreSQL extension optimized for time-series data
3. **Add Elasticsearch** - For complex search queries
4. **Implement data archival** - Move old data to cheaper storage
5. **Use a column store** - Consider CitusDB for analytical queries

This optimized setup can handle millions of records while maintaining fast query performance.
