-
Notifications
You must be signed in to change notification settings - Fork 21
Stream legacy blocks straight to legacy without deserialization #250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
oskarszoon
wants to merge
8
commits into
bsv-blockchain:main
Choose a base branch
from
oskarszoon:bugfix/legacy-sync-timeout
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 3 commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
e4133dc
Stream legacy blocks straight to legacy without deserialization
oskarszoon 3a6903d
Rename test helper
oskarszoon 12d95b4
Merge branch 'main' into bugfix/legacy-sync-timeout
oskarszoon 95e03ed
Stream legacy blocks straight from asset
oskarszoon 325af92
Increase http streaming timeout
oskarszoon 17f17c2
Merge branch 'main' into bugfix/legacy-sync-timeout
oskarszoon a2720a2
Prevent subtree data race
oskarszoon 9f79ab0
Merge branch 'main' into bugfix/legacy-sync-timeout
oskarszoon File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| package legacy | ||
|
|
||
| import ( | ||
| "io" | ||
|
|
||
| "github.com/bsv-blockchain/go-wire" | ||
| "github.com/bsv-blockchain/teranode/errors" | ||
| ) | ||
|
|
||
| // RawBlockMessage implements wire.Message for streaming raw block bytes | ||
| // without the overhead of deserializing into wire.MsgBlock structure. | ||
| // | ||
| // This is significantly more efficient for large blocks (e.g., 4GB+) because: | ||
| // - No CPU time spent deserializing millions of transactions into Go structs | ||
| // - No CPU time spent re-serializing those structs back to bytes | ||
| // - Reduced memory overhead (no Go struct allocation per transaction) | ||
| // | ||
| // The raw bytes are read from the source and written directly to the wire, | ||
| // bypassing the deserialize/serialize roundtrip that would otherwise require | ||
| // allocating memory for each transaction struct. | ||
| type RawBlockMessage struct { | ||
| data []byte | ||
| } | ||
|
|
||
| // NewRawBlockMessage creates a RawBlockMessage by reading all bytes from the reader. | ||
| // The data must be in wire-format block encoding (header + txcount + transactions). | ||
| func NewRawBlockMessage(reader io.Reader) (*RawBlockMessage, error) { | ||
| data, err := io.ReadAll(reader) | ||
| if err != nil { | ||
| return nil, errors.NewProcessingError("failed to read block data", err) | ||
| } | ||
|
|
||
| return &RawBlockMessage{data: data}, nil | ||
| } | ||
|
|
||
| // Command returns the protocol command string for the message. | ||
| // Implements wire.Message interface. | ||
| func (m *RawBlockMessage) Command() string { | ||
| return wire.CmdBlock | ||
| } | ||
|
|
||
| // BsvEncode writes the raw block bytes to the writer. | ||
| // Implements wire.Message interface. | ||
| func (m *RawBlockMessage) BsvEncode(w io.Writer, _ uint32, _ wire.MessageEncoding) error { | ||
| _, err := w.Write(m.data) | ||
| return err | ||
| } | ||
|
|
||
| // Bsvdecode is not supported for RawBlockMessage. | ||
| // Implements wire.Message interface. | ||
| func (m *RawBlockMessage) Bsvdecode(_ io.Reader, _ uint32, _ wire.MessageEncoding) error { | ||
| return errors.NewProcessingError("RawBlockMessage does not support decoding") | ||
| } | ||
|
|
||
| // MaxPayloadLength returns the length of the raw block data. | ||
| // Implements wire.Message interface. | ||
| func (m *RawBlockMessage) MaxPayloadLength(_ uint32) uint64 { | ||
| return uint64(len(m.data)) | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,82 @@ | ||
| package legacy | ||
|
|
||
| import ( | ||
| "bytes" | ||
| "io" | ||
| "testing" | ||
|
|
||
| "github.com/bsv-blockchain/go-wire" | ||
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| func TestRawBlockMessage_Command(t *testing.T) { | ||
| msg := &RawBlockMessage{data: []byte{}} | ||
| require.Equal(t, wire.CmdBlock, msg.Command()) | ||
| } | ||
|
|
||
| func TestRawBlockMessage_MaxPayloadLength(t *testing.T) { | ||
| data := make([]byte, 1000) | ||
| msg := &RawBlockMessage{data: data} | ||
| require.Equal(t, uint64(1000), msg.MaxPayloadLength(0)) | ||
| } | ||
|
|
||
| func TestRawBlockMessage_BsvEncode(t *testing.T) { | ||
| testData := []byte{0x01, 0x02, 0x03, 0x04, 0x05} | ||
| msg := &RawBlockMessage{data: testData} | ||
|
|
||
| var buf bytes.Buffer | ||
| err := msg.BsvEncode(&buf, 0, wire.BaseEncoding) | ||
| require.NoError(t, err) | ||
| require.Equal(t, testData, buf.Bytes()) | ||
| } | ||
|
|
||
| func TestRawBlockMessage_Bsvdecode(t *testing.T) { | ||
| msg := &RawBlockMessage{} | ||
| err := msg.Bsvdecode(nil, 0, wire.BaseEncoding) | ||
| require.Error(t, err) | ||
| require.Contains(t, err.Error(), "does not support decoding") | ||
| } | ||
|
|
||
| func TestNewRawBlockMessage(t *testing.T) { | ||
| testData := []byte{0x01, 0x02, 0x03, 0x04, 0x05} | ||
| reader := bytes.NewReader(testData) | ||
|
|
||
| msg, err := NewRawBlockMessage(reader) | ||
| require.NoError(t, err) | ||
| require.Equal(t, testData, msg.data) | ||
| } | ||
|
|
||
| func TestNewRawBlockMessage_ReadError(t *testing.T) { | ||
| // Create a reader that will return an error | ||
| reader := &failingReader{} | ||
|
|
||
| msg, err := NewRawBlockMessage(reader) | ||
| require.Error(t, err) | ||
| require.Nil(t, msg) | ||
| } | ||
|
|
||
| // failingReader is a test helper that always fails on Read | ||
| type failingReader struct{} | ||
|
|
||
| func (r *failingReader) Read(_ []byte) (int, error) { | ||
| return 0, io.ErrUnexpectedEOF | ||
| } | ||
|
|
||
| func TestRawBlockMessage_LargeData(t *testing.T) { | ||
| // Test with a larger block-like structure (1MB) | ||
| data := make([]byte, 1024*1024) | ||
| for i := range data { | ||
| data[i] = byte(i % 256) | ||
| } | ||
|
|
||
| reader := bytes.NewReader(data) | ||
| msg, err := NewRawBlockMessage(reader) | ||
| require.NoError(t, err) | ||
| require.Equal(t, uint64(1024*1024), msg.MaxPayloadLength(0)) | ||
|
|
||
| // Verify encode produces the same data | ||
| var buf bytes.Buffer | ||
| err = msg.BsvEncode(&buf, 0, wire.BaseEncoding) | ||
| require.NoError(t, err) | ||
| require.Equal(t, data, buf.Bytes()) | ||
| } |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory Issue: Loading entire block into RAM
The
io.ReadAll()call loads the entire block into memory. For 4GB blocks mentioned in the comments, this causes a 4GB+ memory spike per block request.Problem:
Suggestion:
Consider whether the wire protocol supports true streaming without loading the entire payload into memory first, or if this memory tradeoff is acceptable for the CPU performance gain.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Bitcoin wire protocol message format requires:
[Magic: 4 bytes][Command: 12 bytes][Length: 4 bytes][Checksum: 4 bytes][Payload]
The checksum must be in the header BEFORE the payload. This means you need the entire payload to:
We'll revisit this possible improvement later