This repository was archived by the owner on Aug 17, 2024. It is now read-only.

Commit dccde1d

Adding push, find, debug, tests and doc. IT S READY
1 parent e4620a5 commit dccde1d

7 files changed: +404 -57 lines changed

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -2,6 +2,17 @@
 
 ---
 
+## v0.2.0
+
+**Author**: Guillaume Mousnier.
+
+**Type**: Feature
+
+**Changes**:
+- First functional version
+
+---
+
 ## v0.1.0
 
 **Author**: Guillaume Mousnier.

README.md

Lines changed: 76 additions & 7 deletions
@@ -1,5 +1,5 @@
 # dataframe-js
-**v0.1.0**
+**v0.2.0**
 
 ## Presentation
 
@@ -10,6 +10,8 @@ A DataFrame is simply built on two concepts:
 - **Rows** providing ways to modify or filter your data.
 
 ````javascript
+const df = new DataFrame(rawData, columns)
+df.show()
 // DataFrame example
 | column1 | column2 | column3 | <--- Columns
 ------------------------------------
@@ -19,11 +21,11 @@ A DataFrame is simply built on two concepts:
 | undefined | 6 | undefined |
 ````
 
-**DataFrame is immutable** (lazy, for performance purposes). Each modification therefore returns a new DataFrame, reducing bug risks and making your data more secure.
+**DataFrame is immutable** (lazy, for performance purposes). Each modification therefore returns a new DataFrame, reducing side effects and making your data more secure.
 
 **DataFrame is easy to use** with a simple API (close to Spark or SQL) designed to manipulate data faster and easier than ever.
 
-**DataFrame is flexible** because you can switch from or to arrays and dictionaries (hash, object) when you want.
+**DataFrame is flexible** because you can create DataFrames from multiple data formats (array, object) and you can export your DataFrames into these (array, object, csv, json...).
 
 **DataFrame is modular** because you can use additional modules (Stat and Matrix by default) or create your own.
 
@@ -44,11 +46,72 @@ dataframe-js contains a **principal core (DataFrame and Row)** and **two default
 To use dataframe-js, simply import the library. Then you can use DataFrame, Row or other Core components.
 
 ```javascript
-import { DataFrame } from 'dataframe-js';
+import { DataFrame, Row } from 'dataframe-js';
+```
+
+To create a DataFrame, you have to pass your data and your column names. You can use different data structures, as shown below:
 
+```javascript
 const df = new DataFrame(myData, myColumns);
+
+const dfFromObjectOfArrays = new DataFrame({
+    column1: [3, 6, 8], // <------ A column
+    column2: [3, 4, 5, 6],
+}, ['column1', 'column2']);
+
+const dfFromArrayOfArrays = new DataFrame([
+    [1, 6, 9, 10, 12], // <------- A row
+    [1, 2],
+    [6, 6, 9, 8, 9, 12],
+], ['c1', 'c2', 'c3', 'c4', 'c5', 'c6']);
+
+const dfFromArrayOfObjects = new DataFrame([
+    {c1: 1, c2: 6}, // <--- A row
+    {c4: 1, c3: 2}
+], ['c1', 'c2', 'c3', 'c4']);
+```
+
+If you don't pass column names, they will be inferred from your data but **it's slower**:
+
+```javascript
+// here you don't pass column names
+const dfFromObjectOfArrays = new DataFrame({
+    column1: [3, 6, 8], // <------ A column
+    column2: [3, 4, 5, 6],
+});
+
+console.log(dfFromObjectOfArrays.listColumns())
+// ['column1', 'column2']
+
+const dfFromArrayOfArrays = new DataFrame([
+    [1, 6, 9, 10, 12], // <------- A row
+    [1, 2],
+    [6, 6, 9, 8, 9, 12],
+]);
+
+console.log(dfFromArrayOfArrays.listColumns())
+// ['0', '1', '2', '3', '4', '5']
+
+
+const dfFromArrayOfObjects = new DataFrame([
+    {c1: 1, c2: 6}, // <--- A row
+    {c4: 1, c3: 2}
+]);
+
+console.log(dfFromArrayOfObjects.listColumns())
+// ['c1', 'c2', 'c3', 'c4']
 ```
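The column-name inference described in this hunk can be pictured with a small standalone sketch. It is not the dataframe-js implementation: `inferColumns` is a hypothetical helper that only reproduces the three outcomes listed in the README text (keys for an object of arrays, positional indexes for an array of arrays, the union of keys for an array of objects). The sorting step is an assumption made so the result matches the order shown in the README.

```javascript
// Hypothetical helper: NOT part of dataframe-js, only an illustration.
function inferColumns(data) {
    if (!Array.isArray(data)) {
        // Object of arrays: each key is a column name.
        return Object.keys(data);
    }
    if (data.every((row) => Array.isArray(row))) {
        // Array of arrays: columns are positions, sized by the longest row.
        const width = Math.max(0, ...data.map((row) => row.length));
        return Array.from({ length: width }, (_, i) => String(i));
    }
    // Array of objects: collect the union of keys across all rows.
    const seen = new Set();
    for (const row of data) {
        for (const key of Object.keys(row)) seen.add(key);
    }
    // Sorted only so the output matches the README order (an assumption).
    return [...seen].sort();
}

console.log(inferColumns({ column1: [3, 6, 8], column2: [3, 4, 5, 6] }));
console.log(inferColumns([[1, 6, 9, 10, 12], [1, 2], [6, 6, 9, 8, 9, 12]]));
console.log(inferColumns([{ c1: 1, c2: 6 }, { c4: 1, c3: 2 }]));
```

In this sketch every row has to be visited before the column list is known, which is one plausible reason why omitting the column names costs more than passing them.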
 
+Of course, you can do the reverse by exporting your DataFrame to another format using:
+* [.toDict()](./doc/CORE_API.md#DataFrame+toDict) ⇒ <code>Object</code>
+* [.toArray()](./doc/CORE_API.md#DataFrame+toArray) ⇒ <code>Array</code>
+* [.toText([sep], [header], [path])](./doc/CORE_API.md#DataFrame+toText) ⇒ <code>String</code>
+* [.toCSV([header], [path])](./doc/CORE_API.md#DataFrame+toCSV) ⇒ <code>String</code>
+* [.toJSON([path])](./doc/CORE_API.md#DataFrame+toJSON) ⇒ <code>String</code>
+
+or you can debug by using:
+* [.show([rows], [quiet])](./doc/CORE_API.md#DataFrame+show) ⇒ <code>String</code>
+
 When you perform operations on a DataFrame (or on a Row), it is never mutated. Indeed, when you modify a DataFrame (even if nothing changes) you create a new instance of DataFrame. It's a bit slower but you avoid side effects.
 
 Examples:
@@ -77,7 +140,9 @@ console.log(
 
 ```
 
-#### List of available methods:
+For more information, see the API below.
+
+#### List of available methods and their examples:
 
 * [DataFrame](./doc/CORE_API.md#DataFrame)
     * [new DataFrame(data, columns, [...modules])](#new_DataFrame_new)
@@ -86,11 +151,12 @@ console.log(
     * [.toText([sep], [header], [path])](./doc/CORE_API.md#DataFrame+toText) ⇒ <code>String</code>
     * [.toCSV([header], [path])](./doc/CORE_API.md#DataFrame+toCSV) ⇒ <code>String</code>
     * [.toJSON([path])](./doc/CORE_API.md#DataFrame+toJSON) ⇒ <code>String</code>
-    * [.show([rows], [quiet])](./doc/CORE_API.md#DataFrame+show) ⇒ <code>String</code>
+    * [.push(...rows)](#DataFrame+push) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.dim()](./doc/CORE_API.md#DataFrame+dim) ⇒ <code>Array</code>
     * [.transpose()](./doc/CORE_API.md#DataFrame+transpose) ⇒ <code>DataFrame</code>
     * [.count()](./doc/CORE_API.md#DataFrame+count) ⇒ <code>Int</code>
     * [.countValue(valueToCount, [columnName])](./doc/CORE_API.md#DataFrame+countValue) ⇒ <code>Int</code>
+    * [.show([rows], [quiet])](./doc/CORE_API.md#DataFrame+show) ⇒ <code>String</code>
     * [.replace(value, replacment, [...columnNames])](./doc/CORE_API.md#DataFrame+replace) ⇒ <code>[DataFrame](./doc/CORE_API.md#DataFrame)</code>
     * [.distinct(columnName)](./doc/CORE_API.md#DataFrame+distinct) ⇒ <code>Array</code>
     * [.unique(columnName)](./doc/CORE_API.md#DataFrame+unique) ⇒ <code>Array</code>
@@ -103,6 +169,7 @@ console.log(
     * [.chain(...funcs)](./doc/CORE_API.md#DataFrame+chain) ⇒ <code>[DataFrame](./doc/CORE_API.md#DataFrame)</code>
     * [.filter(func)](./doc/CORE_API.md#DataFrame+filter) ⇒ <code>[DataFrame](./doc/CORE_API.md#DataFrame)</code>
     * [.where(func)](./doc/CORE_API.md#DataFrame+where) ⇒ <code>[DataFrame](./doc/CORE_API.md#DataFrame)</code>
+    * [.find(condition)](./doc/CORE_API.md#DataFrame+find) ⇒ <code>[Row](./doc/CORE_API.md#Row)</code>
     * [.map(func)](./doc/CORE_API.md#DataFrame+map) ⇒ <code>[DataFrame](./doc/CORE_API.md#DataFrame)</code>
     * [.reduce(func, [init])](./doc/CORE_API.md#DataFrame+reduce)
     * [.reduceRight(func, [init])](./doc/CORE_API.md#DataFrame+reduceRight)
@@ -159,7 +226,7 @@ const df2 = df.withColumn('column4', (row) => row.get('column2') * 2)
 df.fakemodule.test(8)
 ```
 
-If you want to create your own module, look at the Statistical module (integrated by default) `./src/modules/stat.js` as an example.
+If you want to create your own module, take a look at the Statistical module (integrated by default) `./src/modules/stat.js` as an example.
 
 A simple example of a module structure:
 
@@ -196,3 +263,5 @@ class fakeModule {
     * [.stats(columnName)](./doc/MODULES_API.md#Stat+stats) ⇒ <code>Object</code>
 
 ## Contribution
+
+[How to contribute?](./CONTRIBUTING.md)

doc/CORE_API.md

Lines changed: 83 additions & 8 deletions
@@ -28,6 +28,7 @@ DataFrame data structure providing an immutable, flexible and powerfull way to m
     * [.transpose()](#DataFrame+transpose) ⇒ <code>DataFrame</code>
     * [.count()](#DataFrame+count) ⇒ <code>Int</code>
     * [.countValue(valueToCount, [columnName])](#DataFrame+countValue) ⇒ <code>Int</code>
+    * [.push(...rows)](#DataFrame+push) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.replace(value, replacment, [...columnNames])](#DataFrame+replace) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.distinct(columnName)](#DataFrame+distinct) ⇒ <code>Array</code>
     * [.unique(columnName)](#DataFrame+unique) ⇒ <code>Array</code>
@@ -38,8 +39,9 @@ DataFrame data structure providing an immutable, flexible and powerfull way to m
     * [.rename(newColumnNames)](#DataFrame+rename) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.drop(columnName)](#DataFrame+drop) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.chain(...funcs)](#DataFrame+chain) ⇒ <code>[DataFrame](#DataFrame)</code>
-    * [.filter(func)](#DataFrame+filter) ⇒ <code>[DataFrame](#DataFrame)</code>
-    * [.where(func)](#DataFrame+where) ⇒ <code>[DataFrame](#DataFrame)</code>
+    * [.filter(condition)](#DataFrame+filter) ⇒ <code>[DataFrame](#DataFrame)</code>
+    * [.find(condition)](#DataFrame+find) ⇒ <code>[Row](#Row)</code>
+    * [.where(condition)](#DataFrame+where) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.map(func)](#DataFrame+map) ⇒ <code>[DataFrame](#DataFrame)</code>
     * [.reduce(func, [init])](#DataFrame+reduce)
     * [.reduceRight(func, [init])](#DataFrame+reduceRight)
@@ -247,6 +249,22 @@ df.select('column1').countValue(5)
 
 0
 ```
+<a name="DataFrame+push"></a>
+
+### dataFrame.push(...rows) ⇒ <code>[DataFrame](#DataFrame)</code>
+Push new rows into the DataFrame.
+
+**Kind**: instance method of <code>[DataFrame](#DataFrame)</code>
+**Returns**: <code>[DataFrame](#DataFrame)</code> - A new DataFrame with the new rows.
+
+| Param | Type | Description |
+| --- | --- | --- |
+| ...rows | <code>Array</code> &#124; <code>[Row](#Row)</code> | The rows to add. |
+
+**Example**
+```js
+df.push([1,2,3], [1,4,9])
+```
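Since `push` is documented as returning a new DataFrame rather than changing the receiver, the contract can be sketched with a tiny stand-in class. `TinyFrame` is hypothetical and is not the dataframe-js `DataFrame`; it only accepts plain arrays as rows (the real method is documented as also accepting `Row` objects), and its `count()` is a simplified counterpart of the documented one.

```javascript
// Hypothetical stand-in for a DataFrame: NOT the dataframe-js class.
class TinyFrame {
    constructor(rows, columns) {
        this.columns = Object.freeze([...columns]);
        // Copy and freeze each row so callers cannot mutate stored data.
        this.rows = Object.freeze(rows.map((row) => Object.freeze([...row])));
    }

    // Returns a new TinyFrame; the receiver keeps its original rows.
    push(...rows) {
        return new TinyFrame([...this.rows, ...rows], this.columns);
    }

    count() {
        return this.rows.length;
    }
}

const df = new TinyFrame([[0, 0, 0]], ['column1', 'column2', 'column3']);
const df2 = df.push([1, 2, 3], [1, 4, 9]);

console.log(df.count());  // 1
console.log(df2.count()); // 3
```

Copying the rows on every call is the cost of this approach, which is consistent with the README's remark that immutability is a bit slower but avoids side effects.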
 <a name="DataFrame+replace"></a>
 
 ### dataFrame.replace(value, replacment, [...columnNames]) ⇒ <code>[DataFrame](#DataFrame)</code>
@@ -471,29 +489,86 @@ df.chain(
 ```
 <a name="DataFrame+filter"></a>
 
-### dataFrame.filter(func) ⇒ <code>[DataFrame](#DataFrame)</code>
-Filter DataFrame rows. /!\ Prefer to use .chain().
+### dataFrame.filter(condition) ⇒ <code>[DataFrame](#DataFrame)</code>
+Filter DataFrame rows.
 
 **Kind**: instance method of <code>[DataFrame](#DataFrame)</code>
 **Returns**: <code>[DataFrame](#DataFrame)</code> - A new filtered DataFrame.
 
 | Param | Type | Description |
 | --- | --- | --- |
-| func | <code>function</code> | A function taking the row as parameter and returning a boolean. |
+| condition | <code>function</code> | A function taking the row as parameter and returning a boolean, or a column/value object. |
+
+**Example**
+```js
+df.filter(
+    line => line.get('column1') >= 3
+).show();
+
+| column1 | column2 | column3 |
+------------------------------------
+| 3 | 5 | undefined |
+
+df.filter(
+    {'column2': 5, 'column1': 3}
+).show();
+
+| column1 | column2 | column3 |
+------------------------------------
+| 3 | 5 | undefined |
+```
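The two accepted forms of `condition` (a predicate function or a column/value object) can be illustrated with a standalone sketch. This is not the dataframe-js implementation: `toPredicate` and `filterRows` are hypothetical helpers, rows are plain objects instead of `Row` instances, the sample rows are invented for the illustration, and strict equality on every listed column is an assumption about how the object form is matched.

```javascript
// Hypothetical helpers: NOT the dataframe-js implementation.
// Rows are plain objects here, so row[column] stands in for row.get(column).
function toPredicate(condition) {
    if (typeof condition === 'function') {
        return condition;
    }
    // Column/value object: every listed column must equal the given value.
    const entries = Object.entries(condition);
    return (row) => entries.every(([column, value]) => row[column] === value);
}

function filterRows(rows, condition) {
    const predicate = toPredicate(condition);
    // Array.prototype.filter returns a new array, so `rows` is left untouched.
    return rows.filter((row) => predicate(row));
}

const rows = [
    { column1: 1, column2: 6, column3: undefined },
    { column1: 3, column2: 5, column3: undefined },
];

// Both calls keep only the row where column1 is 3 and column2 is 5.
console.log(filterRows(rows, (row) => row.column1 >= 3));
console.log(filterRows(rows, { column2: 5, column1: 3 }));
```

Normalising both forms into one predicate keeps the filtering loop identical for either kind of condition.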
+<a name="DataFrame+find"></a>
+
+### dataFrame.find(condition) ⇒ <code>[Row](#Row)</code>
+Find a row (the first match) based on a condition.
+
+**Kind**: instance method of <code>[DataFrame](#DataFrame)</code>
+**Returns**: <code>[Row](#Row)</code> - The targeted Row.
+
+| Param | Type | Description |
+| --- | --- | --- |
+| condition | <code>function</code> | A function taking the row as parameter and returning a boolean, or a column/value object. |
 
+**Example**
+```js
+df.find(
+    line => line.get('column1') == 3
+);
+df.find(
+    {'id': 958998}
+);
+```
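The difference between `find` and `filter` is that `find` yields a single row, the first one that satisfies the condition. A minimal sketch under stated assumptions: `findRow` is a hypothetical helper, rows are plain objects, the sample data is invented, and returning `undefined` when nothing matches is a property of this sketch, not a documented behaviour of dataframe-js.

```javascript
// Hypothetical helper: NOT the dataframe-js implementation.
function findRow(rows, condition) {
    const predicate = typeof condition === 'function'
        ? condition
        : (row) => Object.entries(condition)
            .every(([column, value]) => row[column] === value);
    // Array.prototype.find stops at the first match, or returns undefined.
    return rows.find((row) => predicate(row));
}

const users = [
    { id: 1, name: 'a' },
    { id: 958998, name: 'b' },
    { id: 958998, name: 'c' },
];

console.log(findRow(users, { id: 958998 }).name);        // b
console.log(findRow(users, (row) => row.id === 1).name); // a
console.log(findRow(users, { id: 42 }));                 // undefined
```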
 <a name="DataFrame+where"></a>
 
-### dataFrame.where(func) ⇒ <code>[DataFrame](#DataFrame)</code>
-Filter DataFrame rows. /!\ Prefer to use .chain().
+### dataFrame.where(condition) ⇒ <code>[DataFrame](#DataFrame)</code>
+Filter DataFrame rows.
 Alias of .filter()
 
 **Kind**: instance method of <code>[DataFrame](#DataFrame)</code>
 **Returns**: <code>[DataFrame](#DataFrame)</code> - A new filtered DataFrame.
 
 | Param | Type | Description |
 | --- | --- | --- |
-| func | <code>function</code> | A function taking the row as parameter and returning a boolean. |
+| condition | <code>function</code> | A function taking the row as parameter and returning a boolean, or a column/value object. |
+
+**Example**
+```js
+df.filter(
+    line => line.get('column1') >= 3
+).show();
+
+| column1 | column2 | column3 |
+------------------------------------
+| 3 | 5 | undefined |
 
+df.filter(
+    {'column2': 5, 'column1': 3}
+).show();
+
+| column1 | column2 | column3 |
+------------------------------------
+| 3 | 5 | undefined |
+```
 <a name="DataFrame+map"></a>
 
 ### dataFrame.map(func) ⇒ <code>[DataFrame](#DataFrame)</code>
