bug: powerIterationClustering fails if `src` or `dst` columns are not Int #757

`graphframes` allows the `id` of a vertex and the `src` and `dst` of an edge to be of type `string`.
However, [the Power Iteration Clustering (PIC) wrapper](https://graphframes.io/04-user-guide/06-graph-clustering.html#power-iteration-clustering-pic) internally assumes that all of these are `int` or `bigint`, so the call fails with an `IllegalArgumentException` on a graph with string IDs.

**To Reproduce**

Steps to reproduce the behavior:

```python
# Assumes an active SparkSession named `spark`
from graphframes import GraphFrame

# Create a vertex DataFrame with a unique string "id" column
v = spark.createDataFrame([
    ("a", "Alice", 34),
    ("b", "Bob", 36),
    ("c", "Charlie", 30),
], ["id", "name", "age"])

# Create an edge DataFrame with string "src" and "dst" columns
e = spark.createDataFrame([
    ("a", "b", "friend"),
    ("b", "c", "follow"),
    ("c", "b", "follow"),
], ["src", "dst", "relationship"])

# Create a GraphFrame and run PIC; this call raises the exception
g = GraphFrame(v, e)
g.powerIterationClustering(k=2, maxIter=1)
```

> IllegalArgumentException: requirement failed: Column src must be of type equal to one of the following types: [int, bigint] but was actually of type string.
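
A possible workaround until this is fixed is to replace the string IDs with integers before building the graph, then translate the results back afterwards. The snippet below is only a plain-Python sketch of that remapping on the data from the reproduction. The helper names `build_id_index` and `remap_edges` are hypothetical and not part of `graphframes`. On real DataFrames the same mapping would presumably be applied with a join, and I have not verified the result against `powerIterationClustering` itself.

```python
def build_id_index(vertex_ids):
    """Map each distinct vertex id to a dense integer (sorted for determinism)."""
    return {vid: i for i, vid in enumerate(sorted(set(vertex_ids)))}


def remap_edges(edges, index):
    """Rewrite (src, dst, *rest) tuples so that src and dst are integers."""
    return [(index[src], index[dst], *rest) for src, dst, *rest in edges]


# Same data as in the reproduction above, as plain tuples
vertices = [("a", "Alice", 34), ("b", "Bob", 36), ("c", "Charlie", 30)]
edges = [("a", "b", "friend"), ("b", "c", "follow"), ("c", "b", "follow")]

index = build_id_index(v[0] for v in vertices)
int_edges = remap_edges(edges, index)

# Reverse lookup to translate integer ids in the results back to the originals
reverse = {i: vid for vid, i in index.items()}

print(index)      # {'a': 0, 'b': 1, 'c': 2}
print(int_edges)  # [(0, 1, 'friend'), (1, 2, 'follow'), (2, 1, 'follow')]
```

Keeping the `reverse` dictionary around matters because any cluster assignment computed on the remapped graph is keyed by the integer IDs, not the original strings.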

