Bandwidth calculation

Hello All,
I would like to understand how consumed bandwidth is calculated when downloading through a proxy.
Let's say I am going to download a .bin file that is 100 GB in size.
When I run wget through the proxy, how many GB will I have consumed by the end of the process?
I hope you do not say "100GB" :slight_smile:

Thank you
Boris

Hello,

I think it is important to distinguish between two things in this scenario: the amount of bandwidth used on your local network connection (the link from the client to the proxy), and the amount of bandwidth used on your external network connection (the link from the proxy to the external server holding the 100 GB file). There is a second distinction as well: whether this is the first attempt ever made to download the file, or a subsequent attempt.

So, assuming you have a caching proxy server, the first time you try to download the file, you will use 100 GB of external bandwidth (as the proxy retrieves the file from the Internet), and 100 GB of local bandwidth (as your client retrieves the file from the proxy and saves it to its own local storage).

The next time you attempt to retrieve the file, the proxy will not consume any external bandwidth, since it already has the file. However, unless you use flags to tell wget not to re-download a file it already has, you will still use 100 GB of local bandwidth as the proxy transfers the file to the client once more.
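
As a minimal sketch of those flags (the proxy address and URL here are hypothetical):

```
# Point wget at the proxy:
export http_proxy="http://proxy.example.com:3128"

# First run: the full 100 GB crosses both links.
wget http://example.com/big-file.bin

# Repeat runs: -N (timestamping) skips the download when the local copy
# is already up to date; -c resumes a partial file instead of restarting it.
wget -N http://example.com/big-file.bin
wget -c http://example.com/big-file.bin
```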

No external bandwidth would be used to retrieve this file again until the cached copy on the proxy expired, or until another client bypassed the proxy and retrieved the file directly over the Internet connection (assuming your local network setup is configured to permit such connectivity).

Hope this helps! If I've misunderstood something about your question or the exact scenario in this situation, please do let me know and we can take things from there.


On one hand I agree with what @drysdalk has said.

On the other hand, my experience as a proxy administrator suggests that the proxy server may not cache a file that is 100 GB in size by default. Some caching proxies will only cache objects up to a specific maximum size, measured in MB on the systems I've worked on.

Check with your proxy administrator, or, if you are the proxy administrator, review the proxy configuration for the maximum cacheable object size.

Then there is the pesky issue of how much disk space is available on the caching proxy. If it only has ~15 GB free, there's no way for it to cache the 100 GB file.
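
If the proxy happens to be Squid, for instance, both limits are easy to check (the path and values below are just common defaults, not necessarily your system's):

```
# Show the maximum cacheable object size and the on-disk cache size:
grep -E '^(maximum_object_size|cache_dir)' /etc/squid/squid.conf
# Typical output might look like:
#   maximum_object_size 4 MB
#   cache_dir ufs /var/spool/squid 10000 16 256   <- 10000 MB, ~10 GB of cache
```

With a 4 MB object limit, a 100 GB file would never be cached at all, and every download would cross the internet link again.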


Well, looks to me like we are kinda mixing terms.

Bandwidth is the maximum rate your link can offer (on the local network or from the internet), e.g. 1 GbE, or roughly 120 megabytes per second.
100 GB is the total size of the file you wish to transfer over that link.

100 GbE is roughly 12 gigabytes per second, which is a monster value. Even if your links support this, your disks might have an issue writing that fast, effectively making the network wait for the disks and utilizing much less than the pipe offers.
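
As a quick back-of-the-envelope check (pure arithmetic, no particular hardware assumed):

```
# Transfer time = size in bits / link rate in bits per second.
SIZE_BYTES=$((100 * 1000 ** 3))     # 100 GB file
LINK_1GBE=$((1 * 1000 ** 3))        # 1 GbE   = 1 Gbit/s  (~120 MB/s usable)
LINK_100GBE=$((100 * 1000 ** 3))    # 100 GbE = 100 Gbit/s (~12 GB/s)

echo "1 GbE:   $((SIZE_BYTES * 8 / LINK_1GBE)) s"    # ~800 s, roughly 13 minutes
echo "100 GbE: $((SIZE_BYTES * 8 / LINK_100GBE)) s"  # ~8 s, if the disks keep up
```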

If your local link has 100 GbE of available bandwidth, there is a high probability that it is some kind of network load balancing over multiple ports, say 4 x 25 GbE in LACP or the like.

In such a (lab, synthetic) scenario, one invocation of performance-testing software would be able to saturate at most one port with one process; a single wget is unlikely to manage even that.
Multi-process saturation would depend on the algorithm employed on the client-side bond interface, from simple round-robin to the more advanced methods your OS and/or switches support, such as TLB, ALB, LACP and the like.
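
A crude way to generate multiple flows (hypothetical URLs; whether they actually spread across the bond members depends entirely on the balancing algorithm in use):

```
# Four parallel transfers = four TCP flows. A single flow normally sticks
# to one bond member, so parallelism is what gives the balancer something
# to distribute.
for i in 1 2 3 4; do
  wget -O "part$i.bin" "http://example.com/part$i.bin" &
done
wait
```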

The point of caching/proxy software such as Squid or Varnish is to minimize internet traffic and bandwidth for URLs/files that are constantly accessed by clients on the local network, which would otherwise each go out to the internet to fetch them.

A simple example would be a Squid/Varnish instance configured to cache requests going to an S3 bucket on a cloud provider hosting a 100 GB file.
A client would issue a request to GET that file using the HTTP_PROXY variable, and the cache server would initially download that file into its cache (which can be disk, memory or both).

The next client (or any subsequent one) would issue a GET against the same file, but would now be served by the cache server, effectively not using the internet link at all, only the local network, which is much faster.
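
In shell terms, the whole flow might look like this (the bucket name and proxy address are made up, and note that caching HTTPS traffic would additionally require TLS interception on the proxy):

```
# Every client points at the local cache:
export http_proxy="http://cache.internal:3128"

# First client: the proxy fetches 100 GB from S3, stores it, and serves it.
wget http://my-bucket.s3.amazonaws.com/dataset.bin

# Any later client: identical command, but the response now comes from the
# local cache - no internet bandwidth used, no S3 egress charge.
wget http://my-bucket.s3.amazonaws.com/dataset.bin
```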

In turn, you have just saved your company money, as the cloud provider charges for S3 downloads, and the internet link stays free for other traffic.

Of course, there are many applications for caching software, even in LAN environments, as it can be used to reduce the load on various HTTP-based apps by caching content in memory when placed in front of them.

Varnish can increase the performance of such apps many times over without developer intervention.

An additional perk is the security you can enforce on such systems, in the form of a WAF or MITM TLS inspection.

Hope that helps
Regards
Peasant.
