
TFH Dagelijkse Systeemcontrole — Automatisch Opstartscript

    Posted by nikitaskliarov (administrators), last edited by nikitaskliarov

    Overview

    Every morning, three platforms are checked by hand to make sure all The Freight Hero systems are functioning correctly. This script automates those checks and prints the results straight to the terminal when the laptop starts.

    The script performs the following checks:

    1. Cronitor — status of all cron jobs on the server (healthy / failing / paused)
    2. AWS S3 — overview of the database backups in the tfh-backup-2 bucket, including file ages and the monthly cost. Backups older than 3 days are flagged as "pending deletion" in line with the retention policy.
    3. Google Cloud — request counts and error counts (4xx/5xx) for the Google Maps APIs over the past 24 hours
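    The 3-day rule in check 2 comes down to plain epoch arithmetic; a minimal sketch, using the same variable names as the script further down:

```shell
# The retention threshold, expressed in seconds.
S3_RETENTION_DAYS=3
MAX_AGE_SECONDS=$((S3_RETENTION_DAYS * 86400))   # 86400 seconds per day
echo "$MAX_AGE_SECONDS"                          # prints: 259200
```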

    Requirements

    Tool               Installation
    jq                 sudo apt install jq
    bc                 sudo apt install bc
    AWS CLI v2         installation guide
    Google Cloud CLI   installation guide
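    Before the first run it can be useful to confirm that everything in the table above is present; a small sketch (not part of the original script):

```shell
# Report which required tools are installed, using only `command -v`.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK      $1"
    else
        echo "MISSING $1"
    fi
}

for tool in jq bc aws gcloud; do
    check_tool "$tool"
done
```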

    Configuration

    AWS credentials

    A dedicated IAM user (nikita-ubuntu-laptop) was created with minimal permissions:

    • s3:ListBucket and s3:GetBucketLocation on arn:aws:s3:::tfh-backup-2
    • ce:GetCostAndUsage for retrieving billing data
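    The permissions above could be captured in a policy document along these lines (a hypothetical minimal policy; the file path and policy name are illustrative, and ce:GetCostAndUsage applies account-wide, hence the wildcard Resource):

```shell
# Write a hypothetical minimal policy for the nikita-ubuntu-laptop user.
cat > /tmp/tfh-backup-readonly-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::tfh-backup-2"
    },
    {
      "Effect": "Allow",
      "Action": "ce:GetCostAndUsage",
      "Resource": "*"
    }
  ]
}
EOF
```

    It could then be attached with aws iam put-user-policy --user-name nikita-ubuntu-laptop --policy-name tfh-backup-readonly --policy-document file:///tmp/tfh-backup-readonly-policy.json.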

    Configure the AWS CLI once:

    aws configure
    # Enter the Access Key ID, Secret Access Key and region (e.g. eu-west-1)
    

    Google Cloud credentials

    Configure once:

    gcloud auth login
    gcloud config set project tfh-maps-477109
    

    Environment variables

    File: ~/tfh-check/morning-check.env

    # ============================================================================
    #  TFH Morning Check — Environment Configuration
    #  Copy this file to morning-check.env and fill in your values
    # ============================================================================
    
    # --- Cronitor ---
    # Get your API key from: https://cronitor.io/settings/api
    CRONITOR_API_KEY=KEY_FROM_TELEMETRY
    
    # --- AWS S3 ---
    # Auth is handled by 'aws configure' (your nikita-ubuntu-laptop IAM user credentials)
    # No keys needed here — just set bucket and retention policy
    S3_BUCKET=tfh-backup-2
    S3_RETENTION_DAYS=3
    
    # --- Google Cloud ---
    # Auth is handled by 'gcloud auth login'
    # Find your project ID at: https://console.cloud.google.com/home/dashboard
    GCP_PROJECT_ID=tfh-maps-477109 
    
    # Comma-separated list of Maps API services to monitor
    # Find your enabled APIs at: https://console.cloud.google.com/apis/dashboard
    GCP_MAPS_SERVICES=maps-backend.googleapis.com,geocoding-backend.googleapis.com,places-backend.googleapis.com,directions-backend.googleapis.com
    
    

    Note: use the SDK API key from Cronitor.

    Installation

    # Create the directory and place the files
    mkdir -p ~/tfh-check
    # Copy morning-check.sh and morning-check.env into ~/tfh-check/
    chmod +x ~/tfh-check/morning-check.sh
    

    The Script

    #!/bin/bash
    # ============================================================================
    #  TFH Morning Health Check Script
    #  Runs daily checks on: Cronitor, AWS S3 Backups, Google Cloud Maps API
    # ============================================================================
    
    # --- Colors & Formatting ---
    RED='\033[0;31m'
    GREEN='\033[0;32m'
    YELLOW='\033[1;33m'
    BLUE='\033[0;34m'
    CYAN='\033[0;36m'
    BOLD='\033[1m'
    DIM='\033[2m'
    NC='\033[0m' # No Color
    
    PASS="${GREEN}✔${NC}"
    FAIL="${RED}✘${NC}"
    WARN="${YELLOW}⚠${NC}"
    
    # --- Configuration (set these or use environment variables) ---
    CRONITOR_API_KEY="${CRONITOR_API_KEY:-}"
    AWS_PROFILE="${AWS_PROFILE:-default}"
    S3_BUCKET="${S3_BUCKET:-tfh-backup-2}"
    S3_RETENTION_DAYS="${S3_RETENTION_DAYS:-3}"
    GCP_PROJECT_ID="${GCP_PROJECT_ID:-}"
    # Comma-separated list of Google Maps API service names to check
    # Common ones: maps-backend.googleapis.com, places-backend.googleapis.com,
    #              geocoding-backend.googleapis.com, directions-backend.googleapis.com
    GCP_MAPS_SERVICES="${GCP_MAPS_SERVICES:-maps-backend.googleapis.com,geocoding-backend.googleapis.com,places-backend.googleapis.com,directions-backend.googleapis.com}"
    
    # Max backup age in seconds
    MAX_AGE_SECONDS=$((S3_RETENTION_DAYS * 86400))
    
    # ============================================================================
    divider() {
        echo -e "${DIM}────────────────────────────────────────────────────────${NC}"
    }
    
    header() {
        echo ""
        echo -e "${BOLD}${CYAN}┌──────────────────────────────────────────────────────┐${NC}"
        echo -e "${BOLD}${CYAN}│  TFH Morning Health Check  —  $(date '+%Y-%m-%d %H:%M:%S')       │${NC}"
        echo -e "${BOLD}${CYAN}└──────────────────────────────────────────────────────┘${NC}"
        echo ""
    }
    
    section() {
        echo -e "${BOLD}${BLUE}▸ $1${NC}"
        divider
    }
    
    # ============================================================================
    #  1. CRONITOR — Check monitor health
    # ============================================================================
    check_cronitor() {
        section "CRONITOR — Cron Job Health"
    
        if [ -z "$CRONITOR_API_KEY" ]; then
            echo -e "  ${FAIL} CRONITOR_API_KEY not set. Skipping."
            echo ""
            return
        fi
    
        # Fetch all monitors (paginated, first page — usually enough)
        response=$(curl -s -w "\n%{http_code}" \
            -u "${CRONITOR_API_KEY}:" \
            "https://cronitor.io/api/monitors?page=1")
    
        http_code=$(echo "$response" | tail -1)
        body=$(echo "$response" | sed '$d')
    
        if [ "$http_code" != "200" ]; then
            echo -e "  ${FAIL} Cronitor API returned HTTP ${http_code}"
            echo ""
            return
        fi
    
        # Parse monitors using jq
        # Cronitor status values: up, down, grace, new, paused
        # The status may also be in .passing or latest_event fields depending on API version
        total=$(echo "$body" | jq '.monitors | length')
    
        echo "$body" | jq -r '.monitors[] | "\(.status // .passing // "unknown")|\(.name // .key)|\(.key)"' | \
        while IFS='|' read -r status name key; do
            case "$status" in
                up|true)
                    echo -e "  ${PASS} ${name}"
                    ;;
                down|false)
                    echo -e "  ${FAIL} ${name}  ${RED}(DOWN)${NC}"
                    ;;
                grace)
                    echo -e "  ${WARN} ${name}  ${YELLOW}(grace period)${NC}"
                    ;;
                paused)
                    echo -e "  ${WARN} ${name}  ${YELLOW}(paused)${NC}"
                    ;;
                new)
                    echo -e "  ${WARN} ${name}  ${YELLOW}(new — no data yet)${NC}"
                    ;;
                *)
                    echo -e "  ${WARN} ${name}  ${DIM}(status: ${status})${NC}"
                    ;;
            esac
        done
    
        # Summary counts
        ok_count=$(echo "$body" | jq '[.monitors[] | select(.status == "up" or .passing == true)] | length')
        fail_count=$(echo "$body" | jq '[.monitors[] | select(.status == "down" or .passing == false)] | length')
        other_count=$((total - ok_count - fail_count))
    
        echo ""
        echo -e "  ${DIM}Total: ${total}  |  OK: ${ok_count}  |  Failing: ${fail_count}  |  Other: ${other_count}${NC}"
        echo ""
    }
    
    # ============================================================================
    #  2. AWS S3 — Backup bucket check
    # ============================================================================
    check_s3_backups() {
        section "AWS S3 — Backup Bucket (${S3_BUCKET})"
    
        # Check if AWS CLI is available
        if ! command -v aws &> /dev/null; then
            echo -e "  ${FAIL} AWS CLI not installed. Skipping."
            echo ""
            return
        fi
    
        # --- 2a. Check remaining credits (AWS billing / cost explorer) ---
        echo -e "  ${BOLD}Account Balance / Credits:${NC}"
    
        # Try to get credit balance from Cost Explorer
        # Note: This requires ce:GetCostAndUsage permission
        month_start=$(date '+%Y-%m-01')
        today=$(date '+%Y-%m-%d')
        credit_info=$(aws ce get-cost-and-usage \
            --time-period "Start=${month_start},End=${today}" \
            --granularity MONTHLY \
            --metrics "UnblendedCost" \
            --output json 2>&1)
    
        if echo "$credit_info" | jq -e '.ResultsByTime' &>/dev/null; then
            raw_cost=$(echo "$credit_info" | jq -r '.ResultsByTime[0].Total.UnblendedCost.Amount // "0"')
            currency=$(echo "$credit_info" | jq -r '.ResultsByTime[0].Total.UnblendedCost.Unit // "USD"')
            # Format nicely — show $0.00 instead of -0.0000000006
            formatted_cost=$(printf "%.2f" "$raw_cost")
            echo -e "  ${DIM}Current month spend: ${currency} ${formatted_cost}${NC}"
        else
            echo -e "  ${WARN} Could not fetch billing data (need ce:GetCostAndUsage permission)"
            echo -e "  ${DIM}Tip: Check manually at https://console.aws.amazon.com/billing/${NC}"
        fi
    
        echo ""
    
        # --- 2b. List backups and check ages ---
        echo -e "  ${BOLD}Backup Objects (max ${S3_RETENTION_DAYS} days old):${NC}"
    
        # List all objects, sorted by date (newest first)
        objects=$(aws s3api list-objects-v2 \
            --bucket "$S3_BUCKET" \
            --query 'Contents[].{Key: Key, LastModified: LastModified, Size: Size}' \
            --output json 2>&1)
    
        if echo "$objects" | jq -e '.' &>/dev/null && [ "$objects" != "null" ]; then
            now_epoch=$(date +%s)
            total_objects=$(echo "$objects" | jq 'length')
            fresh_count=0
            stale_count=0
    
            echo "$objects" | jq -r '.[] | "\(.LastModified)|\(.Key)|\(.Size)"' | sort -r | \
            while IFS='|' read -r modified key size; do
                # Strip timezone suffix (+00:00 or Z) for date parsing
                clean_date=$(echo "$modified" | sed 's/+00:00$//' | sed 's/Z$//')
                mod_epoch=$(date -d "$clean_date" +%s 2>/dev/null || echo 0)
                age_seconds=$((now_epoch - mod_epoch))
                age_days=$((age_seconds / 86400))
                age_hours=$(( (age_seconds % 86400) / 3600 ))
    
                # Human-readable size
                if [ "$size" -gt 1073741824 ] 2>/dev/null; then
                    h_size="$(printf "%.2f GB" "$(echo "scale=2; $size/1073741824" | bc)")"
                elif [ "$size" -gt 1048576 ] 2>/dev/null; then
                    h_size="$(printf "%.1f MB" "$(echo "scale=1; $size/1048576" | bc)")"
                elif [ "$size" -gt 1024 ] 2>/dev/null; then
                    h_size="$(printf "%.1f KB" "$(echo "scale=1; $size/1024" | bc)")"
                else
                    h_size="${size} B"
                fi
    
                # Truncate filename: show just the filename part, max 50 chars
                short_name=$(basename "$key")
                if [ ${#short_name} -gt 50 ]; then
                    short_name="${short_name:0:47}..."
                fi
    
                # Check freshness
                if [ "$age_seconds" -gt "$MAX_AGE_SECONDS" ]; then
                    echo -e "  ${DIM}░ ${short_name}  ${age_days}d ${age_hours}h  (pending deletion)${NC}"
                else
                    echo -e "  ${PASS} ${short_name}  ${DIM}${age_days}d ${age_hours}h  |  ${h_size}${NC}"
                fi
            done
    
            # Summary — count stale objects via a command substitution.
            # (A `while` loop at the end of a pipeline runs in a subshell, so
            # counters incremented inside it would be lost; counting the
            # matching lines with wc -l sidesteps that.)
            stale=$(echo "$objects" | jq -r '.[].LastModified' | while read -r mod; do
                clean=$(echo "$mod" | sed 's/+00:00$//' | sed 's/Z$//')
                ep=$(date -d "$clean" +%s 2>/dev/null || echo 0)
                [ $((now_epoch - ep)) -gt "$MAX_AGE_SECONDS" ] && echo 1
            done | wc -l)
            fresh=$((total_objects - stale))
    
            echo ""
            echo -e "  ${DIM}Total: ${total_objects}  |  Fresh: ${fresh}  |  Pending deletion: ${stale}${NC}"
    
            if [ "$fresh" -gt 0 ]; then
                echo -e "  ${PASS} Latest backup: $(echo "$objects" | jq -r 'sort_by(.LastModified) | last | .LastModified' | sed 's/+00:00$//')"
            fi
        else
            echo -e "  ${FAIL} Could not list objects in s3://${S3_BUCKET}"
            echo -e "  ${DIM}Error: ${objects}${NC}"
        fi
    
        echo ""
    }
    
    # ============================================================================
    #  3. GOOGLE CLOUD — Maps API usage & errors
    # ============================================================================
    check_google_maps() {
        section "GOOGLE CLOUD — Maps API Usage & Errors"
    
        # Check if gcloud is available
        if ! command -v gcloud &> /dev/null; then
            echo -e "  ${FAIL} gcloud CLI not installed. Skipping."
            echo ""
            return
        fi
    
        # Auto-detect project if not set
        if [ -z "$GCP_PROJECT_ID" ]; then
            GCP_PROJECT_ID=$(gcloud config get-value project 2>/dev/null)
        fi
    
        if [ -z "$GCP_PROJECT_ID" ]; then
            echo -e "  ${FAIL} GCP_PROJECT_ID not set and no default project configured."
            echo ""
            return
        fi
    
        echo -e "  ${DIM}Project: ${GCP_PROJECT_ID}${NC}"
        echo ""
    
        # --- 3a. Billing cost for current month ---
        echo -e "  ${BOLD}Current Month Billing:${NC}"
    
        # Use gcloud billing to get cost info (if available)
        # Alternative: use Cloud Billing API
        billing_account=$(gcloud billing projects describe "$GCP_PROJECT_ID" \
            --format="value(billingAccountName)" 2>/dev/null)
    
        if [ -n "$billing_account" ]; then
            echo -e "  ${DIM}Billing account: ${billing_account}${NC}"
            echo -e "  ${DIM}Tip: Detailed costs at https://console.cloud.google.com/billing/${NC}"
        else
            echo -e "  ${WARN} Could not fetch billing info"
        fi
        echo ""
    
        # --- 3b. API Usage (request count) last 24 hours ---
        echo -e "  ${BOLD}API Request Counts (last 24h):${NC}"
    
        # Calculate time window
        end_time=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
        start_time=$(date -u -d '24 hours ago' '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || \
                     date -u -v-24H '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null)
    
        IFS=',' read -ra SERVICES <<< "$GCP_MAPS_SERVICES"
        for service in "${SERVICES[@]}"; do
            service=$(echo "$service" | xargs)  # trim whitespace
    
            # Query the monitoring API for request count
            # Metric: serviceruntime.googleapis.com/api/request_count
            result=$(gcloud monitoring time-series list \
                --project="$GCP_PROJECT_ID" \
                --filter="metric.type=\"serviceruntime.googleapis.com/api/request_count\" AND resource.labels.service=\"${service}\"" \
                --interval-start-time="$start_time" \
                --interval-end-time="$end_time" \
                --format=json 2>/dev/null)
    
            if [ $? -eq 0 ] && echo "$result" | jq -e '.' &>/dev/null; then
                # Sum all data points
                total_requests=$(echo "$result" | jq '[.[].points[].value.int64Value // 0 | tonumber] | add // 0')
    
                # Count errors (4xx + 5xx)
                error_requests=$(echo "$result" | jq '
                    [.[] |
                        select(.metric.labels.response_code_class == "4xx" or
                               .metric.labels.response_code_class == "5xx") |
                        .points[].value.int64Value // 0 | tonumber
                    ] | add // 0')
    
                if [ "$error_requests" -gt 0 ]; then
                    echo -e "  ${WARN} ${service}"
                    echo -e "     Requests: ${total_requests}  |  ${RED}Errors: ${error_requests}${NC}"
                elif [ "$total_requests" -gt 0 ]; then
                    echo -e "  ${PASS} ${service}"
                    echo -e "     ${DIM}Requests: ${total_requests}  |  Errors: 0${NC}"
                else
                    echo -e "  ${DIM}  ○ ${service} — no requests in last 24h${NC}"
                fi
            else
                # Fallback: try using curl directly with the monitoring API
                token=$(gcloud auth print-access-token 2>/dev/null)
                if [ -n "$token" ]; then
                    api_result=$(curl -s \
                        -H "Authorization: Bearer $token" \
                        "https://monitoring.googleapis.com/v3/projects/${GCP_PROJECT_ID}/timeSeries?filter=metric.type%3D%22serviceruntime.googleapis.com%2Fapi%2Frequest_count%22%20AND%20resource.labels.service%3D%22${service}%22&interval.endTime=${end_time}&interval.startTime=${start_time}&aggregation.alignmentPeriod=86400s&aggregation.perSeriesAligner=ALIGN_SUM")
    
                    if echo "$api_result" | jq -e '.timeSeries' &>/dev/null; then
                        total=$(echo "$api_result" | jq '[.timeSeries[].points[].value.int64Value // "0" | tonumber] | add // 0')
                        echo -e "  ${PASS} ${service}  —  ${total} requests"
                    else
                        echo -e "  ${DIM}  ○ ${service} — no data or insufficient permissions${NC}"
                    fi
                else
                    echo -e "  ${FAIL} ${service} — could not authenticate"
                fi
            fi
        done
    
        # --- 3c. Check for API errors specifically ---
        echo ""
        echo -e "  ${BOLD}Error Summary (4xx/5xx, last 24h):${NC}"
    
        token=$(gcloud auth print-access-token 2>/dev/null)
        if [ -n "$token" ]; then
            # Query for error responses across all Maps APIs
            error_result=$(curl -s \
                -H "Authorization: Bearer $token" \
                "https://monitoring.googleapis.com/v3/projects/${GCP_PROJECT_ID}/timeSeries?filter=metric.type%3D%22serviceruntime.googleapis.com%2Fapi%2Frequest_count%22%20AND%20metric.labels.response_code_class%3Done_of(%224xx%22%2C%225xx%22)&interval.endTime=${end_time}&interval.startTime=${start_time}&aggregation.alignmentPeriod=86400s&aggregation.perSeriesAligner=ALIGN_SUM")
    
            if echo "$error_result" | jq -e '.timeSeries[0]' &>/dev/null; then
                echo "$error_result" | jq -r '.timeSeries[] | 
                    "\(.resource.labels.service)|\(.metric.labels.response_code_class)|\(.points[0].value.int64Value // 0)"' | \
                while IFS='|' read -r svc code count; do
                    echo -e "  ${FAIL} ${svc}: ${count} ${code} errors"
                done
            else
                echo -e "  ${PASS} No API errors in the last 24 hours."
            fi
        else
            echo -e "  ${WARN} Could not authenticate to check errors"
        fi
    
        echo ""
    }
    
    # ============================================================================
    #  MAIN
    # ============================================================================
    
    header
    check_cronitor
    check_s3_backups
    check_google_maps
    
    echo -e "${BOLD}${CYAN}──────────────────────────────────────────────────────${NC}"
    echo -e "${DIM}Check complete. $(date '+%H:%M:%S')${NC}"
    echo ""
    
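    The backup-age check in the script hinges on GNU date parsing the S3 LastModified timestamp once the timezone suffix is stripped; a standalone sketch of that step (the timestamps are illustrative):

```shell
# Compute a backup's age the same way the script does (GNU date assumed).
modified="2026-02-09T11:00:01+00:00"        # as returned by s3api list-objects-v2
clean_date=$(echo "$modified" | sed 's/+00:00$//; s/Z$//')
mod_epoch=$(TZ=UTC date -d "$clean_date" +%s)
now_epoch=$(TZ=UTC date -d "2026-02-12T11:14:06" +%s)

age_seconds=$((now_epoch - mod_epoch))
age_days=$((age_seconds / 86400))
age_hours=$(( (age_seconds % 86400) / 3600 ))
echo "${age_days}d ${age_hours}h"           # prints: 3d 0h
```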

    Usage

    Run manually

    morning
    

    This alias has been added to ~/.bashrc:

    alias morning="set -a && source ~/tfh-check/morning-check.env && set +a && ~/tfh-check/morning-check.sh"
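    The set -a / set +a pair is what turns the plain KEY=value lines in the .env file into exported environment variables that the child script can read; a small demonstration (the file path and variable are illustrative):

```shell
# Create a throwaway .env file.
demo_env=$(mktemp)
echo 'DEMO_BUCKET=tfh-backup-2' > "$demo_env"

set -a                  # auto-export every variable assigned from here on
source "$demo_env"
set +a                  # restore normal assignment semantics

# DEMO_BUCKET was exported, so a child process can see it:
bash -c 'echo "child sees: $DEMO_BUCKET"'   # prints: child sees: tfh-backup-2
rm -f "$demo_env"
```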
    

    Automatically at startup

    The script also runs automatically when the laptop boots, via Startup Applications, using the following command:

    gnome-terminal -- bash -c "set -a && source ~/tfh-check/morning-check.env && set +a && ~/tfh-check/morning-check.sh; echo; read -p 'Press Enter to close...'"
    

    In addition, the script runs once per day when the first terminal is opened, using a marker file in /tmp.
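    That once-per-day snippet itself is not included in this post; a sketch of how such a marker-file guard could look in ~/.bashrc (the marker name and the echo are stand-ins):

```shell
# Run the morning check at most once per calendar day, guarded by a marker
# file in /tmp (which is cleared on reboot on most distributions).
run_morning_check_once() {
    local marker="/tmp/tfh-morning-check-$(date '+%Y-%m-%d')"
    if [ ! -e "$marker" ]; then
        touch "$marker"
        echo "running morning check"   # stand-in for ~/tfh-check/morning-check.sh
    fi
}
```

    Calling run_morning_check_once twice on the same day runs the check only the first time.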

    Example output

    ┌──────────────────────────────────────────────────────┐
    │  TFH Morning Health Check  —  2026-02-12 11:14:06       │
    └──────────────────────────────────────────────────────┘
    
    ▸ CRONITOR — Cron Job Health
    ────────────────────────────────────────────────────────
      ✔ root /usr/sbin/csf --lfd restart > 2
      ✔ systemctl restart redis.service
      ✔ /usr/local/bin/server-health-check.sh
      ✔ sh /home/freighther/tfh-backup-database.sh
      ✔ sh /home/freighther/...-database.sh > ...
    
      Total: 5  |  OK: 5  |  Failing: 0  |  Other: 0
    
    ▸ AWS S3 — Backup Bucket (tfh-backup-2)
    ────────────────────────────────────────────────────────
      Account Balance / Credits:
      Current month spend: USD 0.00
    
      Backup Objects (max 3 days old):
      ✔ ...DATABASE-20260212_110001.tar.gz  0d 0h  |  1.39 GB
      ✔ ...DATABASE-20260212_090001.tar.gz  0d 2h  |  1.39 GB
      ✔ ...DATABASE-20260212_060002.tar.gz  0d 5h  |  1.39 GB
      ...
      ░ ...DATABASE-20260209_110001.tar.gz  3d 0h  (pending deletion)
    
      Total: 31  |  Fresh: 27  |  Pending deletion: 4
    
    ▸ GOOGLE CLOUD — Maps API Usage & Errors
    ────────────────────────────────────────────────────────
      Project: tfh-maps-477109
    
      API Request Counts (last 24h):
      ✔ maps-backend.googleapis.com  —  1155 requests
      ✔ places-backend.googleapis.com  —  426 requests
      ✔ directions-backend.googleapis.com  —  105 requests
    
      Error Summary (4xx/5xx, last 24h):
      ✔ No API errors in the last 24 hours.
    