Groupby behaves differently depending on the order of the columns #396

@igonro

Describe the bug
Depending on the order of the columns used to create a DataFrame, the groupby() function either works properly or throws an error.

To Reproduce
This column order works perfectly:

let data = {
    worker: ["david", "david", "john", "alice", "john", "david"],
    hours: [5, 6, 2, 8, 4, 3],
    day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};
let df = new dfd.DataFrame(data);

df.groupby(["day"]).col(["hours"]).sum().print()

// ╔════════════╤═══════════════════╤═══════════════════╗
// ║            │ day               │ hours_sum         ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 0          │ monday            │ 5                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 1          │ tuesday           │ 6                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 2          │ wednesday         │ 2                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 3          │ thursday          │ 8                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 4          │ friday            │ 7                 ║
// ╚════════════╧═══════════════════╧═══════════════════╝

df.groupby(["worker"]).count().print()
// ╔════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
// ║            │ worker            │ hours_count       │ day_count         ║
// ╟────────────┼───────────────────┼───────────────────┼───────────────────╢
// ║ 0          │ david             │ 3                 │ 3                 ║
// ╟────────────┼───────────────────┼───────────────────┼───────────────────╢
// ║ 1          │ john              │ 2                 │ 2                 ║
// ╟────────────┼───────────────────┼───────────────────┼───────────────────╢
// ║ 2          │ alice             │ 1                 │ 1                 ║
// ╚════════════╧═══════════════════╧═══════════════════╧═══════════════════╝

But when I change the column order to the following, both calls throw an error:

let data = {
    hours: [5, 6, 2, 8, 4, 3],
    worker: ["david", "david", "john", "alice", "john", "david"],
    day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};
let df = new dfd.DataFrame(data);

df.groupby(["day"]).col(["hours"]).sum().print()
// Uncaught Error: Can't perform math operation on column hours
//    arithemetic groupby.ts:266
//    operations groupby.ts:417
//    count groupby.ts:431

df.groupby(["worker"]).count().print()
// Uncaught Error: Can't perform math operation on column hours
//    arithemetic groupby.ts:266
//    operations groupby.ts:417
//    count groupby.ts:431

Expected behavior
I would expect the order of the columns to have no effect on the result.
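
To make the expectation concrete, here is a minimal plain-JavaScript sketch of grouping that looks columns up by name only. It is not Danfo.js code and says nothing about how the library implements groupby(); `groupSum` and `groupCount` are hypothetical helpers written just for this illustration. Both column orders from the report give the same result.

```javascript
// Plain-JavaScript sketch of order-independent grouping.
// NOT Danfo.js internals: groupSum and groupCount are hypothetical helpers.

// Sum `valueCol` per distinct value of `key` in a column-oriented object.
function groupSum(data, key, valueCol) {
  const totals = new Map(); // Map keeps group keys in first-seen order
  data[key].forEach((group, i) => {
    totals.set(group, (totals.get(group) || 0) + data[valueCol][i]);
  });
  return Object.fromEntries(totals);
}

// Count rows per distinct value of `key`.
function groupCount(data, key) {
  const counts = new Map();
  for (const group of data[key]) {
    counts.set(group, (counts.get(group) || 0) + 1);
  }
  return Object.fromEntries(counts);
}

// Same data as in the report, in both column orders.
const workerFirst = {
  worker: ["david", "david", "john", "alice", "john", "david"],
  hours: [5, 6, 2, 8, 4, 3],
  day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};
const hoursFirst = {
  hours: [5, 6, 2, 8, 4, 3],
  worker: ["david", "david", "john", "alice", "john", "david"],
  day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};

console.log(groupSum(workerFirst, "day", "hours"));
console.log(groupSum(hoursFirst, "day", "hours"));
// both: { monday: 5, tuesday: 6, wednesday: 2, thursday: 8, friday: 7 }

console.log(groupCount(hoursFirst, "worker"));
// { david: 3, john: 2, alice: 1 }
```

Because the helpers address columns by name rather than by position, the key order of the input object cannot change the output, which is the behavior I would expect from groupby() as well.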

Desktop (please complete the following information):

  • OS: Windows 11
  • Browser: Firefox v97.0.1, Chrome v98.0.4758.102, Edge v98.0.1108.56
  • Version: -

Additional context
I'm using the browser version, not the node.js one.
