github, github-linguist

GitHub shows 4% of my repository is in a language that is not used in the repository


I have a personal repository on GitHub that is completely written in C#, with a few XML configuration files, and some PowerShell files from included NuGet packages. On the main repository page, GitHub shows a colored bar to display the breakdown of different languages used in the repository.
[Screenshot: the colored language bar]

If you click this bar, it shows the language names and actual percentages.
[Screenshot: language names and percentages]

This particular language breakdown seems a bit odd to me, since I am the only contributor, and I have never used Smalltalk.

If you click a language name, it shows a list of the files detected as that language.
[Screenshot: file list for the selected language]

In this last image, you can see on the left side that the repository really only contains C#, XML, PowerShell, text, and Markdown files.

So why does GitHub think I'm using Smalltalk? And why doesn't the color bar mention that I'm using XML?


Solution

  • GitHub uses linguist to detect languages, so you can open a PR against linguist to report files that are incorrectly tagged as "Smalltalk".

    For instance, linguist issue 2012 is still relevant (even though it is closed).
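Besides reporting the misdetection upstream, linguist also reads per-repository overrides from a `.gitattributes` file. The fragment below is a minimal sketch, not a confirmed fix for this repository: the `packages/` path is a hypothetical location for the NuGet files, and `csharp` is assumed to be accepted as an alias for C#. The `*.xml` line is based on the assumption that XML is left out of the bar because linguist treats it as a data language by default.

```gitattributes
# Sketch only: adjust the patterns to match the actual repository layout.

# Force every .cs file to be counted as C# (the .cs extension is
# ambiguous in linguist and can also match Smalltalk).
*.cs linguist-language=csharp

# Hypothetical path: exclude restored NuGet packages (and their
# PowerShell scripts) from the language statistics.
packages/** linguist-vendored

# Opt XML files into the statistics; data languages are not counted
# unless marked as detectable.
*.xml linguist-detectable
```

After committing the file to the default branch, the language bar should be recalculated on the next push, though the update may take a little while to appear.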