Large Negative Loss + Large RAM Usage #3

@Calcu-dev

Description

Hi @anchitdharmw,

I seem to be having problems with large RAM usage and large negative losses. I'll start by explaining the RAM usage.

RAM Usage
I have attempted to train on my GPU (NVIDIA GeForce 1060, 6 GB VRAM) and quickly run out of memory. I understand that Mask R-CNN is a large network and will most likely require more memory than this, so I can't complain there. However, when running on my CPU I see memory usage of 24 GB+. Even with only 2 workers and a mini-batch size of 2, I consistently see memory usage spike to ~20 GB. My images are only 512x512, so I can't see why this network would need so much memory. This also prevents me from fully utilizing my CPU, or utilizing my GPU at all, since memory is the limiting factor.
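For scale, here is a rough back-of-the-envelope calculation (my own sketch, not tied to this repo's code) of how much memory the raw image tensors in one mini-batch should occupy:

```python
def batch_bytes(batch_size, height=512, width=512, channels=3, dtype_bytes=4):
    """Bytes needed to hold one mini-batch of images as float32 tensors."""
    return batch_size * height * width * channels * dtype_bytes

# A mini-batch of two 512x512 RGB float32 images is only ~6 MB,
# so usage in the tens of GB has to come from somewhere else --
# network activations, anchor/proposal buffers, or per-worker
# copies of the dataset in the data-loading processes.
batch_mb = batch_bytes(2) / 1024**2  # 6.0 MB
```

So the image data itself is negligible; whatever is consuming ~20 GB is being allocated elsewhere in the pipeline.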

Negative Losses
I know there was an issue (which I encountered as well) with the negative variance, and I have applied the workaround you suggested. Even so, I am still getting losses that hover around -2,000. I haven't been able to finish a single epoch of training (245 images) due to the large memory usage and time required. If letting it run for all 10 epochs might resolve the negative loss, I'm happy to do so.
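For what it's worth, a negative loss is not always a numerical fault: if the loss includes a Gaussian negative log-likelihood term (an assumption on my part about how the variance enters this loss; `gaussian_nll` below is just an illustrative helper, not code from this repo), a very small predicted variance drives the loss strongly negative:

```python
import math

def gaussian_nll(x, mu, var):
    """Negative log-likelihood of x under a normal distribution N(mu, var)."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)

# As the predicted variance shrinks, the density at the mean exceeds 1,
# so the negative log-likelihood becomes a large negative number even
# though nothing has numerically diverged.
loss = gaussian_nll(0.0, 0.0, 1e-6)  # about -6.0
```

Summed over many predictions, terms like this could plausibly reach the -2,000 range, so the sign alone may not indicate divergence.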

For some context, here is the relevant information about my machine:
CPU: Intel Core i9-10850K (10 cores)
GPU: NVIDIA GeForce 1060 (6 GB VRAM)
RAM: 32 GB DDR4

Let me know if there is any more information you need from me.

Best,
Adam
