Repository List Mining searches for and lists GitHub repositories based on criteria such as creation date, most recent commit date, language type, and the number of commits and forks.
1. Options
1.1. Create Date: search for projects created within the 'Create Date' range.
1.2. Recent Commit Date: search for projects whose most recent push falls within the 'Recent Commit Date' range.
1.3. Language Type: search for projects whose dominant language is 'Language Type', e.g., Java, C, C++, and so on.
1.4. Author Token (optional): authentication token for a GitHub user. It allows more requests to be sent per minute.
1.5. Commit Count (optional): search for projects whose number of commits is within the 'Commit Count' range.
1.6. Fork Count: search for projects whose number of forks is within the 'Fork Count' range.
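The options above map naturally onto GitHub's repository search qualifiers (created, pushed, language, forks). The following is a minimal Python sketch of that mapping, not the tool's actual implementation: the function names build_search_query and range_qualifier are hypothetical, and it is an assumption that the tool builds its query this way. Commit Count is left out because the search syntax has no qualifier for it.

```python
def range_qualifier(name, low, high):
    """Return a GitHub search range qualifier such as 'forks:100..2000'."""
    # '*' leaves one end of the range open, as in 'forks:100..*'.
    low = "*" if low is None else low
    high = "*" if high is None else high
    return f"{name}:{low}..{high}"


def build_search_query(created=None, pushed=None, language=None, forks=None):
    """Combine the selected options into one search string.

    Each range argument is a (min, max) tuple; None skips that option.
    """
    parts = []
    if created is not None:
        parts.append(range_qualifier("created", *created))
    if pushed is not None:
        parts.append(range_qualifier("pushed", *pushed))
    if language is not None:
        parts.append(f"language:{language}")
    if forks is not None:
        parts.append(range_qualifier("forks", *forks))
    return " ".join(parts)


query = build_search_query(
    pushed=("2019-01-01", "2020-01-01"),
    language="java",
    forks=(100, 2000),
)
print(query)  # pushed:2019-01-01..2020-01-01 language:java forks:100..2000
```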
2. About Commit Options
Users can specify criteria for the total number of commits each repository received in the last year.
However, the repository search query offers no commit-related filter,
so the system makes a second pass over the discovered repositories to check their commit counts.
As a result, building a list of thousands of repositories that meet the criteria may take a long time.
The commit option is useful, but use it only when necessary: the search returns results much faster without it.
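The cost of the commit option can be seen in a small Python sketch of a second pass. This is an illustration under stated assumptions, not the tool's code: filter_by_commit_count is a hypothetical name, and fetch_commit_count stands for whatever per-repository lookup is used, which would need at least one extra request for every repository found by the search.

```python
def filter_by_commit_count(repos, fetch_commit_count, minimum, maximum):
    """Keep repositories whose commit count lies in [minimum, maximum].

    fetch_commit_count is a hypothetical callable (repo name -> int).
    It is called once per repository, which is why this pass is slow
    for lists with thousands of entries.
    """
    kept = []
    for repo in repos:
        count = fetch_commit_count(repo)
        if minimum <= count <= maximum:
            kept.append(repo)
    return kept


# Stand-in data used in place of real API calls.
fake_counts = {"org/a": 12, "org/b": 480, "org/c": 3100}
result = filter_by_commit_count(fake_counts, fake_counts.get, 100, 1000)
print(result)  # ['org/b']
```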
3. Example
Options: Recent Commit Date-[Since: 2019-01-01 Until: 2020-01-01], LanguageType-[java], ForkCount-[Min: 100 Max: 2000], AuthorToken-['personal_token']
This returns a list of repositories that are written in Java, have more than 100 and fewer than 2000 forks,
and whose last commit date falls between 2019-01-01 and 2020-01-01, using a personal authentication token.
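For illustration, the same example could be expressed as a request to GitHub's repository search endpoint. This Python sketch only builds the request and does not send it. It is an assumption that the tool issues a comparable request; 'personal_token' is the placeholder from the example, not a real token. Note that GitHub's '..' ranges include both endpoints, which may differ slightly from how the tool treats Min and Max.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Search string corresponding to the example options.
query = "pushed:2019-01-01..2020-01-01 language:java forks:100..2000"

# urlencode escapes the spaces and colons in the query string.
url = "https://api.github.com/search/repositories?" + urlencode({"q": query})

request = Request(
    url,
    headers={
        "Accept": "application/vnd.github+json",
        # Placeholder token; authenticated requests get a higher rate limit.
        "Authorization": "Bearer personal_token",
    },
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would perform the search; the JSON response lists the matching repositories under its "items" key.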
Source code and further details are available at
https://github.com/ISEL-HGU/AllGitClone