Go Programming

Concurrent Web Scraper in Go


package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

// fetch downloads the page at url and reports its size.
// The deferred wg.Done() marks this goroutine finished even when an error occurs.
func fetch(url string, wg *sync.WaitGroup) {
	defer wg.Done()

	resp, err := http.Get(url)
	if err != nil {
		fmt.Printf("Error fetching %s: %v\n", url, err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Printf("Error reading response body from %s: %v\n", url, err)
		return
	}

	fmt.Printf("Fetched %s: %d bytes\n", url, len(body))
}

func main() {
	urls := []string{
		"https://www.example.com",
		"https://www.google.com",
		"https://www.github.com",
	}

	var wg sync.WaitGroup

	// Launch one goroutine per URL; Add is called before the goroutine starts
	// so the counter is already incremented when Wait runs.
	for _, url := range urls {
		wg.Add(1)
		go fetch(url, &wg)
	}

	// Block until every fetch has called Done.
	wg.Wait()
	fmt.Println("All fetches completed.")
}

In this Go program, we build a concurrent web scraper that fetches data from multiple websites in parallel.

We start by importing the necessary packages: `fmt` for printing, `net/http` for making HTTP requests, `io` for reading response bodies, and `sync` for synchronization primitives.

We define a `fetch` function that takes a URL and a pointer to a `sync.WaitGroup`. The deferred `wg.Done()` call marks the goroutine as finished even if the request fails. Inside the function, we make an HTTP GET request to the URL, read the response body, and print the size of the fetched data.
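One thing to note: `http.Get` uses Go's default HTTP client, which has no timeout, so a server that never responds could keep a goroutine alive indefinitely. A minimal variation, assuming we add `time` to the import block (the 10-second deadline is just an illustrative choice, not part of the program above), is to route requests through a client with a timeout:

	// Sketch: a shared client with an explicit deadline; requires "time" in the imports.
	var client = &http.Client{Timeout: 10 * time.Second}

	func fetch(url string, wg *sync.WaitGroup) {
		defer wg.Done()

		resp, err := client.Get(url) // behaves like http.Get, but enforces the timeout
		if err != nil {
			fmt.Printf("Error fetching %s: %v\n", url, err)
			return
		}
		defer resp.Body.Close()

		body, err := io.ReadAll(resp.Body)
		if err != nil {
			fmt.Printf("Error reading response body from %s: %v\n", url, err)
			return
		}

		fmt.Printf("Fetched %s: %d bytes\n", url, len(body))
	}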

In the `main` function, we define the list of URLs to scrape and create a `sync.WaitGroup` to coordinate the concurrent fetches. For each URL, we call `wg.Add(1)` before launching the goroutine, so the counter is already incremented by the time `Wait` is called, and then start `fetch` in its own goroutine.
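With only three URLs, one goroutine per URL is fine. If the list grew to thousands of pages, we might want to cap how many requests run at once. A common way to do that is a buffered channel used as a semaphore; the sketch below (the `maxInFlight` limit of 5 is an arbitrary choice, not part of the program above) would replace the loop in `main`:

	// Sketch: allow at most maxInFlight requests to be in flight at any time.
	const maxInFlight = 5

	var wg sync.WaitGroup
	sem := make(chan struct{}, maxInFlight)

	for _, url := range urls {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxInFlight fetches are already running
		go func(u string) {
			defer func() { <-sem }() // free the slot when this fetch returns
			fetch(u, &wg)
		}(url)
	}

	wg.Wait()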

After launching all the goroutines, we call `wg.Wait()`, which blocks until every `fetch` has called `Done`. Once it returns, we print a message indicating that all fetches have completed.
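Printing from inside each goroutine is enough for this example, but if `main` needed to collect the results, a common variation is to send them over a channel and close it once the `WaitGroup` drains. The sketch below (the `result` struct and channel names are our own, not part of the program above) shows one way to do that inside `main`:

	// Sketch: have each fetch report its outcome to main instead of printing it.
	type result struct {
		url   string
		bytes int
		err   error
	}

	results := make(chan result, len(urls))

	var wg sync.WaitGroup
	for _, url := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			resp, err := http.Get(u)
			if err != nil {
				results <- result{url: u, err: err}
				return
			}
			defer resp.Body.Close()
			body, err := io.ReadAll(resp.Body)
			results <- result{url: u, bytes: len(body), err: err}
		}(url)
	}

	// Close the channel once every sender is done, so the range below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()

	for r := range results {
		if r.err != nil {
			fmt.Printf("Error fetching %s: %v\n", r.url, r.err)
			continue
		}
		fmt.Printf("Fetched %s: %d bytes\n", r.url, r.bytes)
	}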

This program demonstrates how to use goroutines and a `sync.WaitGroup` in Go to fetch data from multiple websites concurrently.