September 27, 2021 08:03 pm GMT

Creating new Data Pipelines from the command line

kedro new is simply a wrapper around the cookiecutter templating library. The kedro team maintains a ready-made template that has everything you need for a kedro project. They also maintain a few kedro starters, which are very similar to the base template.

Unsure what kedro is? Check out yesterday's post on What is Kedro.

pipx

I recommend using pipx when running kedro new. pipx is designed for system-level CLI tools, so you do not need to maintain a virtual environment or worry about version conflicts; pipx manages the environment for you.

The kedro team does not recommend pipx in their docs, as they feel there is already a bit of tool overload for folks who may be less familiar with Python tooling.

pipx run kedro new

I like using pipx because it gives you better control: you can run a specific version or always the latest version. When you run the kedro that is installed on your system, the version you get depends on when you last installed or upgraded.
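To see the difference for yourself, here is a minimal sketch that prints the version of the kedro found on your PATH next to the version pipx would run. The function name show_kedro_versions is my own and purely illustrative; the only real commands it relies on are kedro --version and pipx run.

```shell
# Print the kedro version on PATH (if any) next to the one pipx would run.
# show_kedro_versions is a hypothetical helper name, not part of kedro or pipx.
show_kedro_versions() {
  if command -v kedro >/dev/null 2>&1; then
    echo "installed: $(kedro --version)"
  else
    echo "installed: none"
  fi

  if command -v pipx >/dev/null 2>&1; then
    # pipx builds (or reuses) a temporary environment for kedro.
    echo "pipx: $(pipx run kedro --version)"
  else
    echo "pipx: not available"
  fi
}
```

Call show_kedro_versions in your shell to compare the two. As far as I know, pipx run reuses its temporary environment for a limited time, so if you really want a fresh install you can try pipx run --no-cache kedro new.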

Kedro New

The kedro team also maintains a set of starters. By passing --starter, you can start from a different template. Here is an example with the kedro spaceflights starter.

pipx run kedro new --starter spaceflights

Project Name:
=============
Please enter a human readable name for your new project.
Spaces and punctuation are allowed.
 [New Kedro Project]: Spaceflights Complete

Repository Name:
================
Please enter a directory name for your new project repository.
Alphanumeric characters, hyphens and underscores are allowed.
Lowercase is recommended.
 [spaceflights-complete]:

Python Package Name:
====================
Please enter a valid Python package name for your project package.
Alphanumeric characters and underscores are allowed.
Lowercase is recommended. Package name must start with a letter
or underscore.
 [spaceflights_complete]:

Change directory to the project generated in /home/u_walkews/git/spaceflights-complete

A best-practice setup includes initialising git and creating a virtual environment before running ``kedro install`` to install project-specific dependencies. Refer to the Kedro documentation: https://kedro.readthedocs.io/
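If you would rather skip the prompts, kedro new also accepts a --config file with the answers. Below is a rough sketch that reuses the answers from the session above. The key names (output_dir, project_name, repo_name, python_package) are what I remember from the kedro docs of this era and may differ in other versions, so treat them as an assumption. The file name kedro-new-config.yml and the function name create_spaceflights_project are made up for this example.

```shell
# Answers for the three prompts, plus where to create the project.
# Key names are an assumption based on kedro docs of this era; check your version.
cat > kedro-new-config.yml <<'EOF'
output_dir: .
project_name: Spaceflights Complete
repo_name: spaceflights-complete
python_package: spaceflights_complete
EOF

# Hypothetical helper: create the project without any interactive prompts.
create_spaceflights_project() {
  pipx run kedro new --starter spaceflights --config kedro-new-config.yml
}
```

Run create_spaceflights_project from the directory where you want the project to live. This is handy in scripts or CI, where nobody is around to answer the prompts.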

Other versions of kedro with pipx

pipx not only ensures that you run the latest version, it can also run a very specific version.

pipx run --spec kedro==0.16.6 kedro new
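If you switch between versions often, a small wrapper saves some typing. This is only a sketch: kedro_at and the DRY_RUN variable are names I made up, and the wrapper does nothing more than build the same pipx run commands shown in this post.

```shell
# Hypothetical helper: run kedro at a pinned version, or the latest if the
# version argument is empty. Set DRY_RUN=1 to print the command instead.
kedro_at() {
  version="${1:-}"
  if [ "$#" -gt 0 ]; then
    shift
  fi

  if [ -n "$version" ]; then
    set -- pipx run --spec "kedro==${version}" kedro "$@"
  else
    set -- pipx run kedro "$@"
  fi

  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, kedro_at 0.16.6 new runs the pinned command above, while kedro_at "" new --starter spaceflights falls back to whatever version pipx resolves.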

https://waylonwalker.com/kedro-environment/

The next post in this series will help you create your virtual environment for your new kedro project.




Original Link: https://dev.to/waylonwalker/creating-new-data-pipelines-from-the-command-line-3aim
