# Advanced Usage
This guide dives deeper into Brain4J, covering efficient dataset handling, advanced training techniques, and GPU acceleration.
## Working with Datasets

The DataSet class is designed to manage your training data efficiently. You can split data into batches using `partition` and `partitionWithSize`.

### partition

Divides the dataset into a fixed number of batches.

```java
DataSet<DataRow> fullData = new DataSet<>();
// Populate dataset with some sample data
for (int i = 0; i < 100; i++) {
    Vector input = Vector.random(5); // 5 input features
    Vector output = Vector.of(Math.random());
    fullData.add(new DataRow(input, output));
}
// Split data into 10 batches
fullData.partition(10);
// Access batches
List<List<DataRow>> batches = fullData.getPartitions();
```

### partitionWithSize

Splits the dataset into batches of a given size.

```java
// Create batches with 16 samples each
fullData.partitionWithSize(16);
// Access batches
List<List<DataRow>> batches = fullData.getPartitions();
```

### shuffle

Before training, it's good practice to shuffle the dataset to improve learning.

```java
fullData.shuffle();
```
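For manual mini-batch training, a common pattern is to reshuffle and re-partition the data at the start of each epoch. A minimal sketch using only the calls above; `trainBatch` is a hypothetical stand-in for whatever per-batch update you perform:

```java
for (int epoch = 0; epoch < 100; epoch++) {
    fullData.shuffle();             // new sample order every epoch
    fullData.partitionWithSize(16); // rebuild 16-sample batches

    for (List<DataRow> batch : fullData.getPartitions()) {
        trainBatch(model, batch);   // hypothetical per-batch update
    }
}
```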
## SmartTrainer

SmartTrainer automates training by handling batch updates, stopping conditions, and evaluation.

```java
SmartTrainer trainer = new SmartTrainer(0.95, 5); // Learning rate decay 0.95, evaluate every 5 epochs
// Train until the loss drops below 0.01. If the difference between the current
// loss and the previous loss exceeds the tolerance (0.001 here), the learning
// rate is decreased
trainer.start(model, fullData, 0.01, 0.001);

// Alternatively, train for a fixed number of epochs
trainer.startFor(model, fullData, 1000); // Train for 1000 epochs
```
### Training Listeners

You can add listeners to track the training process in real time.

```java
private static class ExampleListener extends TrainListener<DataRow> {

    @Override
    public void onEvaluated(DataSet<DataRow> dataSet, int epoch, double loss, long took) {
        System.out.print("\rEpoch " + epoch + " loss: " + loss + " took " + (took / 1e6) + " ms");
    }
}

trainer.addListener(new ExampleListener());
```
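Putting the pieces together: register the listener before training starts so every evaluation gets reported. A brief sketch, assuming `model` and `fullData` are set up as above:

```java
SmartTrainer trainer = new SmartTrainer(0.95, 5);
trainer.addListener(new ExampleListener());
trainer.startFor(model, fullData, 1000); // listener prints progress every 5 epochs
```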
## Tensors and GPU Acceleration

Brain4J now supports hardware-accelerated neural network operations through its tensor system. Tensors replace traditional matrices and vectors, providing multidimensional data structures optimized for neural network operations.

Tensors are N-dimensional arrays that can represent scalars (0D), vectors (1D), matrices (2D), and higher-dimensional data. They form the foundation of modern neural networks.

### Creating Tensors

```java
// Create a 2D tensor (matrix)
Tensor matrix = TensorFactory.matrix(3, 4); // 3x4 matrix
// Create a 3D tensor
Tensor tensor3D = TensorFactory.create(2, 3, 4); // shape [2,3,4]
// Create tensors with initial values
Tensor ones = TensorFactory.ones(2, 2); // 2x2 matrix filled with 1.0
Tensor zeros = TensorFactory.zeros(3, 3); // 3x3 matrix filled with 0.0
Tensor random = TensorFactory.random(2, 3); // 2x3 matrix with random values
// Create from existing data
float[] data = {1.0f, 2.0f, 3.0f, 4.0f};
Tensor fromData = TensorFactory.of(new int[]{2, 2}, data); // Creates a 2x2 tensor
```

### GPU Acceleration

Brain4J can automatically use GPU acceleration for tensor operations when available, providing significant speedups for large models.

```java
// Check if GPU is available
boolean gpuAvailable = TensorGPU.isGpuAvailable();
// Check if GPU is currently being used
boolean usingGPU = TensorFactory.isUsingGPU();
// Enable GPU if available
TensorFactory.useGPUIfAvailable();
// Force CPU usage (even if GPU is available)
TensorFactory.forceCPU();
// Remember to release GPU resources when done
TensorGPU.releaseGPUResources();
```
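Because GPU resources are released explicitly, it can help to wrap GPU work in a try/finally block so cleanup happens even if training throws. A sketch using only the calls above (guarding on `isUsingGPU()` is an assumption, not a documented requirement):

```java
TensorFactory.useGPUIfAvailable();
try {
    // ... build and train your model here ...
} finally {
    if (TensorFactory.isUsingGPU()) {
        TensorGPU.releaseGPUResources();
    }
}
```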
### Tensor Operations

The tensor system provides optimized operations for neural networks:

```java
// Matrix multiplication
Tensor result = tensorA.matmul(tensorB);
// Element-wise operations
Tensor sum = tensorA.add(tensorB);
Tensor difference = tensorA.sub(tensorB);
Tensor product = tensorA.mul(tensorB);
Tensor quotient = tensorA.div(tensorB);
// Apply function to all elements
Tensor activated = tensor.map(x -> Math.max(0, x)); // ReLU activation
// Reshape tensor
Tensor reshaped = tensor.reshape(3, 4);
// Transpose dimensions
Tensor transposed = matrix.transpose();
```
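These operations compose into the building blocks of a network. As a rough sketch of a dense layer forward pass (the shape conventions for `matmul` here are an assumption, not verified against the library):

```java
// Hypothetical dense layer: output = ReLU(W * x + b)
Tensor W = TensorFactory.random(4, 3);  // 4 output units, 3 input features
Tensor x = TensorFactory.random(3, 1);  // input as a 3x1 column
Tensor b = TensorFactory.zeros(4, 1);   // bias

Tensor output = W.matmul(x)             // (4x3) x (3x1) -> (4x1)
        .add(b)                         // add bias
        .map(v -> Math.max(0, v));      // ReLU activation
```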
### Example: Training XOR with GPU Acceleration

Here's a simple example of training an XOR neural network with GPU acceleration:

```java
// Enable GPU acceleration if available
TensorFactory.useGPUIfAvailable();
// Define network architecture
int inputSize = 2;
int hiddenSize = 3;
int outputSize = 1;
// Prepare data
Tensor[] inputs = {
    TensorFactory.vector(0, 0),
    TensorFactory.vector(0, 1),
    TensorFactory.vector(1, 0),
    TensorFactory.vector(1, 1)
};
Tensor[] labels = {
    TensorFactory.vector(0),
    TensorFactory.vector(1),
    TensorFactory.vector(1),
    TensorFactory.vector(0)
};
// Initialize weights
Tensor W1 = TensorFactory.randn(0.0, 0.5, hiddenSize, inputSize);
Tensor b1 = TensorFactory.zeros(hiddenSize);
Tensor W2 = TensorFactory.randn(0.0, 0.5, outputSize, hiddenSize);
Tensor b2 = TensorFactory.zeros(outputSize);
// Training loop
double learningRate = 0.1;
for (int epoch = 0; epoch < 10000; epoch++) {
    // You may check the full source code in the tests package
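    //
    // A hedged outline of what each epoch does (the exact code may differ):
    //   1. Forward pass:  hidden = activate(W1 * x + b1); out = activate(W2 * hidden + b2)
    //   2. Loss:          e.g. mean squared error between out and the label
    //   3. Backward pass: propagate the error through W2, b2, then W1, b1
    //   4. Update:        subtract learningRate * gradient from each parameter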
}
// Release GPU resources when done
TensorGPU.releaseGPUResources();
```

## Next Steps

- Use cases: Check out Examples & Use Cases
This wiki is still under construction. If you feel that you can contribute, please do so! Thanks.