The groupby operation uses up whole system disk space #378

@jonbakerfish

I ran a groupby on a large SFrame (foo.shape = (1183747, 3110)). After a while, the system disk (/) filled up and the following error appeared:

Traceback (most recent call last):
  File "test.py", line 134, in <module>
    'bs':gl.aggregate.CONCAT('b'),
  File "/home/.../anaconda2/lib/python2.7/site-packages/graphlab/data_structures/sframe.py", line 4651, in groupby
    group_ops))
  File "/home/.../anaconda2/lib/python2.7/site-packages/graphlab/cython/context.py", line 49, in __exit__
    raise exc_type(exc_value)
IOError: Fail to write. Disk may be full.: unspecified iostream_category error: unspecified iostream_category error

After the program died, disk usage went back to normal.

Can I change the SFrame's cache location to somewhere else? (I have larger disks besides the system disk.)
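
For anyone weighing the same move, here is a minimal sketch of comparing free space across candidate directories before pointing the cache at one of them. The candidate paths and the helper names `free_bytes` and `pick_roomiest` are hypothetical, and the commented-out `set_runtime_config` call with `GRAPHLAB_CACHE_FILE_LOCATIONS` is an assumption about the GraphLab Create configuration API that should be checked against the documentation for the installed version.

```python
import os


def free_bytes(path):
    """Bytes available to a non-root user on the filesystem holding `path`."""
    st = os.statvfs(path)  # Unix only
    return st.f_bavail * st.f_frsize


def pick_roomiest(candidates):
    """Return the existing directory with the most free space, or None."""
    existing = [p for p in candidates if os.path.isdir(p)]
    if not existing:
        return None
    return max(existing, key=free_bytes)


# Hypothetical candidates; replace with mount points on the larger disks.
candidates = ["/", "/tmp", "/var/tmp"]
best = pick_roomiest(candidates)
if best is not None:
    gib = free_bytes(best) / float(1024 ** 3)
    print("%s has %.1f GiB free" % (best, gib))
    # Assumption: GraphLab Create exposes the cache directory through this
    # runtime setting; verify the name before relying on it.
    # import graphlab as gl
    # gl.set_runtime_config('GRAPHLAB_CACHE_FILE_LOCATIONS', best)
```

The sketch only reports which directory has the most room; it does not change where the groupby writes its intermediate files unless the commented lines are confirmed and enabled.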
