Add/reclone cmds (#461)

gabrie30
2024-09-28 05:53:35 +05:30
committed by GitHub
parent 683c5ece6e
commit d1bb7c3044
9 changed files with 331 additions and 32 deletions

@@ -9,6 +9,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
- GHORG_PRESERVE_SCM_HOSTNAME, note that this feature changes the directory structure that gitlab all-users and all-groups clone into; thanks @rrrix
- GHORG_PRUNE_UNTOUCHED, to prune repos that users make no changes in; thanks @MaxG87
- GHORG_GITHUB_TOKEN_FROM_GITHUB_APP to handle github app tokens; thanks @PaarthShah
- Command reclone-server, to run adhoc reclone commands via HTTP requests
- Command reclone-cron, to run periodic reclone commands on a timer
### Changed
### Deprecated
### Removed

119
README.md

@@ -268,7 +268,100 @@ curl https://raw.githubusercontent.com/gabrie30/ghorg/master/sample-reclone.yaml
Update file with the commands you wish to run.
### Docker
## Reclone Server and Cron Commands
### Reclone Server
The `reclone-server` command starts a server that allows you to trigger adhoc reclone commands via HTTP requests.
#### Usage
```sh
ghorg reclone-server [flags]
```
#### Flags
- `--port`: Port the server listens on. Defaults to `8080`.
#### Endpoints
- **`/trigger/reclone`**: Triggers the reclone command. To prevent resource exhaustion, only one request can be processed at a time.
  - **Query Parameters**:
    - `cmd`: Optional. Runs a specific reclone; if omitted, all reclones are run.
  - **Responses**:
    - `200 OK`: Command started successfully.
    - `429 Too Many Requests`: The server is currently running a reclone command. Wait until it has completed before starting another one.
- **`/stats`**: Returns the statistics of the reclone operations in JSON format. `GHORG_STATS_ENABLED=true` or `--stats-enabled` must be set for this endpoint to work.
  - **Responses**:
    - `200 OK`: Statistics returned successfully.
    - `428 Precondition Required`: ghorg stats are not enabled.
    - `500 Internal Server Error`: Unable to read the statistics file.
- **`/health`**: Health check endpoint.
  - **Responses**:
    - `200 OK`: Server is healthy.
#### Examples
Start the server. The default port is `8080`; use the `--port` flag to listen on a different port:
```sh
ghorg reclone-server
```
Trigger the reclone command. This runs all commands defined in your `reclone.yaml`:
```sh
curl "http://localhost:8080/trigger/reclone"
```
Trigger a specific reclone command:
```sh
curl "http://localhost:8080/trigger/reclone?cmd=your-reclone-command"
```
Get the statistics:
```sh
curl "http://localhost:8080/stats"
```
Check the server health:
```sh
curl "http://localhost:8080/health"
```
### Reclone Cron
The `reclone-cron` command sets up a simple cron job that triggers the reclone command at specified minute intervals indefinitely.
#### Usage
```sh
ghorg reclone-cron [flags]
```
#### Flags
- `--minutes`: Specify the interval in minutes at which the reclone command will be triggered. Default is every 60 minutes.
#### Example
Set up a cron job to trigger the reclone command every day:
```sh
ghorg reclone-cron --minutes 1440
```
#### Environment Variables
- `GHORG_CRON_TIMER_MINUTES`: The interval in minutes for the cron job. This can also be set via the `--minutes` flag. Default is 60 minutes.
## Using Docker
The provided images are built for both `amd64` and `arm64` architectures and are available solely on Github Container Registry [ghcr.io](https://github.com/gabrie30/ghorg/pkgs/container/ghorg).
@@ -297,7 +390,7 @@ GHORG_ABSOLUTE_PATH_TO_CLONE_TO=/data
These can be overridden, if necessary, by passing the `-e` flag to the `docker run` command, e.g. `-e GHORG_GITHUB_TOKEN=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2`.
#### Persisting Data on the Host
### Persisting Data on the Host
In order to store data on the host, it is required to bind mount a volume:
- `$HOME/.config/ghorg:/config`: Mounts your config directory inside the container, to access `config.yaml` and `reclone.yaml`.
@@ -323,17 +416,6 @@ alias ghorg="docker run --rm -v $HOME/.config/ghorg:/config -v $HOME/repositorie
ghorg clone kubernetes --match-regex=^sig
```
## Windows support
Windows is supported when built with golang or as a [prebuilt binary](https://github.com/gabrie30/ghorg/releases/latest); however, the README and other documentation are not geared towards Windows users.
Alternatively, Windows users can also install ghorg using [scoop](https://scoop.sh/#/):
```
scoop bucket add main
scoop install ghorg
```
## Tracking Clone Data Over Time
To track data on your clones over time, you can use the ghorg stats feature. It is recommended to enable ghorg stats in your configuration file by setting `GHORG_STATS_ENABLED=true`. This ensures that each clone operation is logged automatically without needing to set the command line flag `--stats-enabled` every time. **The ghorg stats feature is disabled by default and needs to be enabled.**
@@ -367,6 +449,17 @@ go install github.com/gabrie30/csvToJson@latest && \
csvToJson _ghorg_stats.csv
```
## Windows support
Windows is supported when built with golang or as a [prebuilt binary](https://github.com/gabrie30/ghorg/releases/latest); however, the README and other documentation are not geared towards Windows users.
Alternatively, Windows users can also install ghorg using [scoop](https://scoop.sh/#/):
```
scoop bucket add main
scoop install ghorg
```
## Troubleshooting
- If you are having trouble cloning repos, try cloning one of them manually, e.g. `git clone https://github.com/your_private_org/your_private_repo.git`. If this does not work, ghorg will not work either. Your git client must first be set up to clone the target repos. If you normally clone using an ssh key, use the `--protocol=ssh` flag with ghorg. This will fetch the ssh clone urls instead of the https clone urls.

@@ -304,7 +304,6 @@ func cloneFunc(cmd *cobra.Command, argz []string) {
setOutputDirName(argz)
setOuputDirAbsolutePath()
args = argz
targetCloneSource = argz[0]
setupRepoClone()
}
@@ -507,18 +506,6 @@ func filterByExcludeMatchPrefix(repos []scm.Repo) []scm.Repo {
return filteredRepos
}
// exclude wikis from repo count
func getRepoCountOnly(targets []scm.Repo) int {
count := 0
for _, t := range targets {
if !t.IsWiki {
count++
}
}
return count
}
func hasRepoNameCollisions(repos []scm.Repo) (map[string]bool, bool) {
repoNameWithCollisions := make(map[string]bool)
@@ -903,7 +890,7 @@ func CloneAllRepos(git git.Gitter, cloneTargets []scm.Repo) {
return
}
count, _ = git.RepoCommitCount(repo)
count, err = git.RepoCommitCount(repo)
if err != nil {
e := fmt.Sprintf("Problem trying to get post pull commit count for repo: %s", repo.URL)
cloneInfos = append(cloneInfos, e)
@@ -1067,7 +1054,7 @@ func CloneAllRepos(git git.Gitter, cloneTargets []scm.Repo) {
}
func writeGhorgStats(date string, allReposToCloneCount, cloneCount, pulledCount, cloneInfosCount, cloneErrorsCount, updateRemoteCount, newCommits, pruneCount int, hasCollisions bool) error {
func getGhorgStatsFilePath() string {
var statsFilePath string
absolutePath := os.Getenv("GHORG_ABSOLUTE_PATH_TO_CLONE_TO")
if os.Getenv("GHORG_PRESERVE_SCM_HOSTNAME") == "true" {
@@ -1077,6 +1064,12 @@ func writeGhorgStats(date string, allReposToCloneCount, cloneCount, pulledCount,
statsFilePath = filepath.Join(absolutePath, "_ghorg_stats.csv")
}
return statsFilePath
}
func writeGhorgStats(date string, allReposToCloneCount, cloneCount, pulledCount, cloneInfosCount, cloneErrorsCount, updateRemoteCount, newCommits, pruneCount int, hasCollisions bool) error {
statsFilePath := getGhorgStatsFilePath()
fileExists := true
if _, err := os.Stat(statsFilePath); os.IsNotExist(err) {

@@ -2,7 +2,6 @@ package cmd
import (
"embed"
_ "embed"
"fmt"
gtm "github.com/MichaelMure/go-term-markdown"

52
cmd/reclone-cron.go Normal file

@@ -0,0 +1,52 @@
package cmd
import (
_ "embed"
"log"
"os"
"os/exec"
"strconv"
"time"
"github.com/gabrie30/ghorg/colorlog"
"github.com/spf13/cobra"
)
var recloneCronCmd = &cobra.Command{
Use: "reclone-cron",
Short: "Simple cron that will trigger your reclone command at specified minute intervals indefinitely",
Long: `Read the documentation and examples in the Readme under Reclone Cron heading`,
Run: func(cmd *cobra.Command, args []string) {
if cmd.Flags().Changed("minutes") {
os.Setenv("GHORG_CRON_TIMER_MINUTES", cmd.Flag("minutes").Value.String())
}
startReCloneCron()
},
}
func startReCloneCron() {
if os.Getenv("GHORG_CRON_TIMER_MINUTES") == "" {
return
}
colorlog.PrintInfo("Cron activated and will first run after " + os.Getenv("GHORG_CRON_TIMER_MINUTES") + " minutes ")
minutes, err := strconv.Atoi(os.Getenv("GHORG_CRON_TIMER_MINUTES"))
if err != nil {
log.Fatalf("Invalid GHORG_CRON_TIMER_MINUTES: %v", err)
}
ticker := time.NewTicker(time.Duration(minutes) * time.Minute)
defer ticker.Stop()
for range ticker.C {
colorlog.PrintInfo("starting reclone cron, time: " + time.Now().Format(time.RFC1123))
cmd := exec.Command("ghorg", "reclone")
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
log.Printf("Failed to run ghorg reclone: %v", err)
}
}
}

133
cmd/reclone-server.go Normal file

@@ -0,0 +1,133 @@
package cmd
import (
_ "embed"
"encoding/csv"
"encoding/json"
"fmt"
"net/http"
"os"
"os/exec"
"sync"
"github.com/gabrie30/ghorg/colorlog"
"github.com/spf13/cobra"
)
var recloneServerCmd = &cobra.Command{
Use: "reclone-server",
Short: "Server allowing you to trigger adhoc reclone commands via HTTP requests",
Long: `Read the documentation and examples in the Readme under Reclone Server heading`,
Run: func(cmd *cobra.Command, args []string) {
if cmd.Flags().Changed("port") {
os.Setenv("GHORG_RECLONE_SERVER_PORT", cmd.Flag("port").Value.String())
}
startReCloneServer()
},
}
func startReCloneServer() {
var mu sync.Mutex
serverPort := os.Getenv("GHORG_RECLONE_SERVER_PORT")
if serverPort != "" && serverPort[0] != ':' {
serverPort = ":" + serverPort
}
http.HandleFunc("/trigger/reclone", func(w http.ResponseWriter, r *http.Request) {
userCmd := r.URL.Query().Get("cmd")
if !mu.TryLock() {
http.Error(w, "Server is busy, please try again later", http.StatusTooManyRequests)
return
}
// Signal channel to notify when the command has started
started := make(chan struct{})
go func() {
defer mu.Unlock()
var cmd *exec.Cmd
if userCmd == "" {
cmd = exec.Command("ghorg", "reclone")
} else {
cmd = exec.Command("ghorg", "reclone", userCmd)
}
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
// Notify that the command has started
close(started)
if err := cmd.Run(); err != nil {
fmt.Printf("Error running command: %s\n", err)
}
}()
// Wait for the command to start before responding
<-started
w.WriteHeader(http.StatusOK)
})
http.HandleFunc("/stats", func(w http.ResponseWriter, r *http.Request) {
if os.Getenv("GHORG_STATS_ENABLED") != "true" {
http.Error(w, "Stats collection is not enabled. Please set GHORG_STATS_ENABLED=true or use --stats-enabled flag", http.StatusPreconditionRequired)
return
}
statsFilePath := getGhorgStatsFilePath()
fileExists := true
if _, err := os.Stat(statsFilePath); os.IsNotExist(err) {
fileExists = false
}
if fileExists {
file, err := os.Open(statsFilePath)
if err != nil {
http.Error(w, "Unable to open file", http.StatusInternalServerError)
return
}
defer file.Close()
reader := csv.NewReader(file)
records, err := reader.ReadAll()
if err != nil {
http.Error(w, "Unable to read CSV file", http.StatusInternalServerError)
return
}
var jsonData []map[string]string
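// Guard against an empty stats file; indexing records[0] would panic
if len(records) == 0 {
http.Error(w, "Stats file is empty", http.StatusInternalServerError)
return
}
headers := records[0]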
for _, row := range records[1:] {
rowData := make(map[string]string)
for i, value := range row {
rowData[headers[i]] = value
}
jsonData = append(jsonData, rowData)
}
jsonBytes, err := json.Marshal(jsonData)
if err != nil {
http.Error(w, "Unable to encode JSON", http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
w.Write(jsonBytes)
return
}
w.WriteHeader(http.StatusOK)
})
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
})
colorlog.PrintInfo("Starting reclone server on " + serverPort)
if err := http.ListenAndServe(serverPort, nil); err != nil {
fmt.Printf("Error starting server: %s\n", err)
}
}

@@ -93,7 +93,7 @@ func printFinalOutput(argz []string, reCloneMap map[string]ReClone) {
fmt.Println("")
colorlog.PrintSuccess("Completed! The following reclones were run successfully...")
if len(argz) == 0 {
for key, _ := range reCloneMap {
for key := range reCloneMap {
colorlog.PrintSuccess(fmt.Sprintf(" * %v", key))
}
} else {

@@ -48,6 +48,8 @@ var (
githubAppInstallationID string
githubUserOption string
githubFilterLanguage string
cronTimerMinutes string
recloneServerPort string
includeSubmodules bool
skipArchived bool
skipForks bool
@@ -73,7 +75,6 @@ var (
ghorgPreserveScmHostname bool
ghorgPruneUntouched bool
ghorgPruneUntouchedNoConfirm bool
args []string
cloneErrors []string
cloneInfos []string
)
@@ -168,6 +169,10 @@ func getOrSetDefaults(envVar string) {
os.Setenv(envVar, "false")
case "GHORG_NO_CLEAN":
os.Setenv(envVar, "false")
case "GHORG_CRON_TIMER_MINUTES":
os.Setenv(envVar, "60")
case "GHORG_RECLONE_SERVER_PORT":
os.Setenv(envVar, ":8080")
case "GHORG_FETCH_ALL":
os.Setenv(envVar, "false")
case "GHORG_DRY_RUN":
@@ -300,6 +305,8 @@ func InitConfig() {
getOrSetDefaults("GHORG_EXIT_CODE_ON_CLONE_INFOS")
getOrSetDefaults("GHORG_EXIT_CODE_ON_CLONE_ISSUES")
getOrSetDefaults("GHORG_STATS_ENABLED")
getOrSetDefaults("GHORG_CRON_TIMER_MINUTES")
getOrSetDefaults("GHORG_RECLONE_SERVER_PORT")
// Optionally set
getOrSetDefaults("GHORG_TARGET_REPOS_PATH")
getOrSetDefaults("GHORG_CLONE_DEPTH")
@@ -407,7 +414,11 @@ func init() {
lsCmd.Flags().BoolP("long", "l", false, "Display detailed information about each clone directory, including size and number of repositories. Note: This may take longer depending on the number and size of the cloned organizations.")
lsCmd.Flags().BoolP("total", "t", false, "Display total amounts of all repos cloned. Note: This may take longer depending on the number and size of the cloned organizations.")
rootCmd.AddCommand(lsCmd, versionCmd, cloneCmd, reCloneCmd, examplesCmd)
recloneCronCmd.Flags().StringVarP(&cronTimerMinutes, "minutes", "m", "", "GHORG_CRON_TIMER_MINUTES - Number of minutes to run the reclone command on a cron")
recloneServerCmd.Flags().StringVarP(&recloneServerPort, "port", "p", "", "GHORG_RECLONE_SERVER_PORT - Specify the port the reclone server will run on.")
rootCmd.AddCommand(lsCmd, versionCmd, cloneCmd, reCloneCmd, examplesCmd, recloneServerCmd, recloneCronCmd)
}
func Execute() {

@@ -298,3 +298,19 @@ GHORG_RECLONE_PATH:
# Quiet logging output with reclone command
# flag (--quiet)
GHORG_RECLONE_QUIET: false
# +-+-+-+-+-+ +-+-+-+-+-+-+-+ +-+-+-+-+-+-+
# |G|H|O|R|G| |R|E|C|L|O|N|E| |S|E|R|V|E|R|
# +-+-+-+-+-+ +-+-+-+-+-+-+-+ +-+-+-+-+-+-+
# Port to run the reclone server on
# flag (--port) e.g. --port=3000
GHORG_RECLONE_SERVER_PORT: ":8080"
# +-+-+-+-+-+ +-+-+-+-+-+-+-+ +-+-+-+-+
# |G|H|O|R|G| |R|E|C|L|O|N|E| |C|R|O|N|
# +-+-+-+-+-+ +-+-+-+-+-+-+-+ +-+-+-+-+
# Interval in minutes between reclone cron runs
# flag (--minutes) e.g. --minutes=1440
GHORG_CRON_TIMER_MINUTES: "60"