
Questions tagged [pipelines]

A pipeline is a sequence of functions (or their equivalent) composed so that the output of one is the input of the next, creating a compound transformation. A well-known example is the shell pipeline, which looks like "command | command2 | command3" (but use the tag "pipe" for this). The term is also used in computer architecture for a sequence of stages that operate concurrently on successive elements fed into the pipe, increasing overall throughput.
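As a rough illustration of the definition above, here is a minimal Python sketch of a pipeline as left-to-right function composition. The helper `compose_pipeline` and the three text-cleaning stages are hypothetical names invented for this example; they do not come from any particular library.

```python
from functools import reduce
from typing import Any, Callable


def compose_pipeline(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose stages left to right: each stage's output feeds the next stage."""
    def run(value: Any) -> Any:
        # Thread the value through every stage in order.
        return reduce(lambda acc, stage: stage(acc), stages, value)
    return run


# Hypothetical stages, chosen only for illustration.
def strip_whitespace(text: str) -> str:
    return text.strip()


def to_lowercase(text: str) -> str:
    return text.lower()


def split_words(text: str) -> list[str]:
    return text.split()


clean_and_tokenize = compose_pipeline(strip_whitespace, to_lowercase, split_words)
tokens = clean_and_tokenize("  Hello Pipeline World  ")
```

Calling `clean_and_tokenize` plays the same role as `command | command2 | command3` in a shell: the caller sees one compound transformation, while each stage stays small and testable on its own.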

5 votes
1 answer
52 views

This question is more about data engineering than data science, but since there is no data engineering Stack Exchange, I thought I would ask it here. Basically, as the title says. So, as part of a ...
Della's user avatar
  • 485
1 vote
0 answers
37 views

As part of a research project, I'm testing various statistical learning algorithms on various acoustics datasets. Instead of tediously typing scripts in Python and Jupyter, I want to create a pipeline/...
DangerousTim's user avatar
2 votes
0 answers
34 views

We are currently looking for a pipeline orchestration tool to refactor a complex biodata pipeline. However, since we are dealing with biodata, the orchestration tool would have to manage an ...
LiKao's user avatar
  • 121
3 votes
0 answers
44 views

There is a wide variety of "pipelines" that exists in today's Data Science world: data ("lift & shift," curation, reconciliation?) inference modeling machine learning (as ...
d8aninja's user avatar
  • 151
7 votes
1 answer
828 views

I'm hoping someone can help me think through this. I've come across a lot of different resources on nested-cv, but I think I'm confused as to how to go about model selection and the appropriate ...
molecularrunner's user avatar
3 votes
1 answer
532 views

In software engineering, a design pattern is a general, reusable solution to a common problem in software design. It is not a finished piece of code but rather a template or best practice that can be ...
Robert Long's user avatar
  • 5,855
0 votes
1 answer
37 views

Hello guys, I am practicing Naive Bayes, but I got an error: ValueError: Found input variables with inconsistent numbers of samples: [1, 4179]. Also, I saw some ...
Marco Feregrino's user avatar
1 vote
1 answer
819 views

I have the following: train_set, test_set = train_test_split(arbres_df, test_size=0.2, random_state=42) Which is the old ...
Dimitri's user avatar
  • 43
1 vote
0 answers
116 views

I'm seeking advice on enhancing the deployment pipeline of a machine learning model that's accessed via FastAPI in production. My goal is to replace the existing setup with a more robust and ...
Daniel Ben Zaken's user avatar
0 votes
1 answer
79 views

I have an ML pipeline built with DVC that I use for experiment tracking. This allows running and tracking several experiments. Also, using the Hydra integration, I can grid search hyperparameters. However,...
giulatona's user avatar
0 votes
1 answer
158 views

I'm making a data transformation pipeline on a dataset, and I am getting an error: all the input array dimensions except for concatenation axis must match exactly, but along dimension 0, the array at ...
Amy's user avatar
  • 1
1 vote
1 answer
919 views

Let's say I have a dataset that contains a timestamp (a non-standard timestamp column without datetime format) as a single feature and a count as the label/target to predict ...
Mario's user avatar
  • 610
0 votes
0 answers
61 views

I am trying to run the Python code below: ...
2 votes
1 answer
935 views

I have a number of raw features that go into a scikit-learn model. I've already got a number of preprocessing steps (such as PolynomialFeatures) that create additional features as part of my pipeline....
gammapoint's user avatar
0 votes
2 answers
72 views

I want to perform a global optimization of the entire model development pipeline. I have several stages of development, each of which can be performed automatically: preprocessing, removal of outliers/...
Andrew's user avatar
  • 406
