December 21, 2022 04:28 pm GMT

Billions of unnecessary files in GitHub

As I was looking for easy assignments for the Open Source Development Course, I found something very troubling, which is also an opportunity for a lot of teaching and a lot of practice.

Some files don't need to be in git

Common sense dictates that we rarely need to include generated files in our git repository. There is no point in keeping them in version control as they can be generated again. (The exception might be if the generation takes a lot of time or can be done only during certain phases of the moon.)

Neither is there a need to store 3rd party libraries in our git repository. Instead, we store a list of our dependencies with the required versions and then download and install them. (Well, the rightfully paranoid might download and save a copy of every 3rd party library they use to ensure it can never disappear, but as you'll see, that is not what we are talking about.)

.gitignore

The way to make sure that neither we nor anyone else adds these files to the git repository by mistake is to create a file called .gitignore, include patterns that match the files we would like to exclude from git, and add the .gitignore file to our repository. git will then ignore those files. They won't even show up when you run git status.

The format of the .gitignore file is described in the documentation of .gitignore.

In a nutshell:

/output.txt

Ignore the output.txt file in the root of the project.

output.txt

Ignore output.txt anywhere in the project (in the root or in any subdirectory).

*.txt

Ignore all files with the .txt extension.

venv

Ignore the venv folder anywhere in the project.

There are more. Check the documentation of .gitignore!
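To make the four pattern forms above concrete, here is a minimal Python sketch of how they behave. The function name is_ignored is made up for this illustration, and the logic is a deliberate simplification: it is not git's real matching algorithm and it ignores negation (!), trailing slashes, ** and patterns with a slash in the middle.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path: str, pattern: str) -> bool:
    """Simplified check covering only the four pattern forms shown above.

    Hypothetical helper for illustration; git itself supports many more
    pattern features than this sketch does.
    """
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    if pattern.startswith("/"):
        # Leading slash: only an entry directly in the project root matches.
        return fnmatchcase(parts[0], pattern[1:])
    # No slash: the pattern may match at any level, and a matching folder
    # name also covers everything beneath that folder.
    return any(fnmatchcase(part, pattern) for part in parts)


if __name__ == "__main__":
    examples = [
        ("/output.txt", "output.txt"),
        ("/output.txt", "docs/output.txt"),
        ("output.txt", "docs/output.txt"),
        ("*.txt", "docs/notes.txt"),
        ("venv", "src/venv/lib/site.py"),
    ]
    for pattern, path in examples:
        print(f"{pattern!r:15} {path!r:25} {is_ignored(path, pattern)}")
```

To see what git itself decides for a path in a real repository, git check-ignore -v followed by the path reports the matching pattern and the file it came from.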

Not knowing about .gitignore

Apparently a lot of people using git and GitHub don't know about .gitignore.

The evidence:

Python developers use something called virtualenv to make it easy to use different dependencies in different projects. When they create a virtualenv, they usually configure it to install all the 3rd party libraries in a folder called venv. We should not include this folder in git. And yet:

There are 452M hits for this search: venv

In a similar way NodeJS developers install their dependencies in a folder called node_modules. There are 2B responses for this search: node_modules

Finally, if you use the Finder application on macOS and open a folder, it will create a file called .DS_Store that holds nothing but Finder's display settings for that folder. This file is really not needed in any repository. And yet I saw many copies of it on GitHub. Unfortunately, so far I could not figure out how to search for them. The closest I found is this search.
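If you want to check your own projects for the entries mentioned so far, a small script can do it. The following is only a sketch: the name find_unwanted and the UNWANTED_NAMES set are made up for this example, and it inspects a local checkout on disk, so it does not tell you whether git actually tracks those entries.

```python
import os
from collections import Counter

# Names mentioned in this article so far; an assumption you can extend.
UNWANTED_NAMES = {"venv", "node_modules", ".DS_Store"}


def find_unwanted(root: str) -> Counter:
    """Count files and folders under root whose name is in UNWANTED_NAMES."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own metadata folder.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in list(dirnames):
            if name in UNWANTED_NAMES:
                counts[name] += 1
                # Do not descend: everything below is unwanted as well.
                dirnames.remove(name)
        for name in filenames:
            if name in UNWANTED_NAMES:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, count in sorted(find_unwanted(".").items()):
        print(f"{name}: {count}")
```

Inside a git repository, git ls-files lists what is really tracked, so filtering its output for the same names is the more accurate check.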

Misunderstanding .gitignore

There are also many people who misunderstand the way .gitignore works. I can understand that, as the wording of the usual explanation is a bit ambiguous. What we usually say is this:

If you'd like to make sure that git will ignore the __pycache__ folder then you need to put it in .gitignore.

A better way would be to say this:

If you'd like to make sure that git will ignore the __pycache__ folder then you need to put its name in the .gitignore file.

Without that, people might end up creating a folder called .gitignore and moving all the __pycache__ folders into this .gitignore folder. You can see it in this search.
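The intended workflow, writing the name into a text file, can be sketched in a few lines of Python. The helper add_ignore_pattern is hypothetical and only meant to illustrate the idea; it also refuses to continue when .gitignore turns out to be a folder, which is the mistake described above.

```python
from pathlib import Path


def add_ignore_pattern(repo_root: str, pattern: str) -> bool:
    """Append pattern to the .gitignore file in repo_root if it is missing.

    Returns True if the file was changed and False if the pattern was
    already listed. Hypothetical helper, for illustration only.
    """
    gitignore = Path(repo_root) / ".gitignore"
    if gitignore.is_dir():
        # .gitignore must be a text file holding names, not a folder.
        raise IsADirectoryError(f"{gitignore} is a folder, expected a text file")
    text = gitignore.read_text() if gitignore.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False
    # Make sure the new pattern starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    gitignore.write_text(text + pattern + "\n")
    return True


if __name__ == "__main__":
    changed = add_ignore_pattern(".", "__pycache__")
    print("updated .gitignore" if changed else "pattern already present")
```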

Help

Can you suggest other common cases of unnecessary files in git that should be ignored?

Can you help me create the search for .DS_Store on GitHub?


Original Link: https://dev.to/szabgab/billions-of-unnecessary-files-in-github-i85
