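This article describes how an Http server can use http range so that the download manager of a web browser can pause and resume the download of a big file. The rest of the article is written in English. My English is not very good, so if you find mistakes, I hope you will help me correct them; my contact information is at the end.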
Section A. Summary
I'm writing a http server in C. To support big file downloads, the server has to support Http Range, because web browsers usually download big files with their download manager, where the user can pause and resume the download. For the pause and resume functions of the download manager to work, the http server must support Http Range requests.
Besides file download, http range is also needed for the <video> element to play videos on a web page: when the user clicks a random position on the video timeline, the browser uses http range requests to get the corresponding part of the video file.
Http range was introduced into the http protocol in http 1.1. A http range request looks like this:
----------http range request-------
GET /zen.iso HTTP/1.1
Range: bytes=100-1023
If-Match: "534231200_1732277116"
......
-----------------------------------
It asks for the content from byte 100 to byte 1023 inclusive, 924 bytes in total, and it tells the http server that the etag value of the already downloaded part is "534231200_1732277116". The etag value is used to determine whether the requested file has been modified since the first part was downloaded. It is necessary for some browsers, such as chrome and edge, to resume the download of a file: if the http server does not provide an etag for the content, the resume function of the browser's download manager will not work.
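The arithmetic above can be sketched in C. This is only a minimal illustration, not the code of the server described in this article: the function name parse_byte_range and its return convention are assumptions, and it handles a single range only (no comma-separated lists).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a single-range header value such as
   "bytes=100-1023", "bytes=243220480-" or "bytes=-500" against a file of
   file_size bytes. On success it stores the inclusive interval in *start
   and *end and returns 0; it returns -1 if the value is malformed or
   cannot be satisfied. */
int parse_byte_range(const char *value, long long file_size,
                     long long *start, long long *end)
{
    assert(value != NULL && start != NULL && end != NULL);
    if (file_size <= 0 || strncmp(value, "bytes=", 6) != 0) return -1;
    const char *p = value + 6;
    char *rest;

    if (*p == '-') {                       /* suffix form: the last N bytes */
        if (p[1] < '0' || p[1] > '9') return -1;
        errno = 0;
        long long n = strtoll(p + 1, &rest, 10);
        if (errno != 0 || *rest != '\0' || n == 0) return -1;
        if (n > file_size) n = file_size;
        *start = file_size - n;
        *end = file_size - 1;
        return 0;
    }

    if (*p < '0' || *p > '9') return -1;
    errno = 0;
    long long first = strtoll(p, &rest, 10);
    if (errno != 0 || *rest != '-' || first >= file_size) return -1;
    long long last = file_size - 1;        /* open-ended form: to the end */
    if (rest[1] != '\0') {
        if (rest[1] < '0' || rest[1] > '9') return -1;
        char *tail;
        errno = 0;
        last = strtoll(rest + 1, &tail, 10);
        if (errno != 0 || *tail != '\0' || last < first) return -1;
        if (last >= file_size) last = file_size - 1;
    }
    *start = first;
    *end = last;
    return 0;
}
```

For "bytes=100-1023" this gives start 100 and end 1023, so the body length is end - start + 1 = 924 bytes. A real server would normally answer a range that cannot be satisfied with status 416.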
Section B. Detailed Steps
The following are the steps for serving a file download that supports the pause and resume functions of the web browser's download manager.

Step 1. Make a link to the file to download, for example:
----------------html-----------------
<a href="/zen.iso">zen.iso</a>
-------------------------------------
When the user clicks the link, the web browser's download manager takes over the file download immediately. We cannot use javascript fetch(url) to start the download, because then the javascript code handles the download instead of the download manager.
When the link is clicked, the http request sent to the server looks like this:
----------get request------------
GET /zen.iso HTTP/1.1
......
---------------------------------

Step 2. When the http server receives the above request, it sends the whole file content back as the response body as usual, and indicates its support of http range in the response headers, like this:
----------------server response--------------
HTTP/1.1 200 OK
Date: Sat, 23 Nov 2024 06:29:19 GMT
accept-ranges: bytes
content-range: bytes 0-534231199/534231200
ETag: "534231200_1732277116"
Content-Disposition: attachment; filename="zen.iso"
Content-Length: 534231200
Content-Type: application/octet-stream
(http response body is the whole file content)
-------------------------------------------
The http server usually uses a buffer that is smaller than the file and sends the whole file content in multiple sends, like this:
---------------http server send big file content by multiple times--------------
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/stat.h>
#define BUF_SIZE 20000000 // buf size is about 20M, too big for the stack, so it is allocated with malloc
// the server's own helper that sends the response headers, defined elsewhere
void send_http_response_headers(int client_socket, const char *http_response_headers);
void send_whole_file(int client_socket, FILE *file, const char *http_response_headers)
{
    char *send_buf = malloc(BUF_SIZE);
    if (send_buf == NULL) return;
    // first send http response headers
    send_http_response_headers(client_socket, http_response_headers);
    // then send the whole file content as response body, in multiple sends if the file is big
    size_t n;
    while ((n = fread(send_buf, 1, BUF_SIZE, file)) > 0) {
        for (size_t sent = 0; sent < n; ) { // send() may accept fewer bytes than requested
            // MSG_NOSIGNAL (Linux): a closed connection makes send() fail instead of raising SIGPIPE
            ssize_t send_result = send(client_socket, send_buf + sent, n - sent, MSG_NOSIGNAL);
            if (send_result <= 0) { free(send_buf); return; } // e.g. the client disconnected
            sent += (size_t)send_result;
        }
    }
    free(send_buf);
}
// a way to make etag for a file using its file size and last modified time; the caller frees the result
char *file_etag(const char *path)
{
    struct stat st;
    if (stat(path, &st) < 0) return NULL;
    char *data = malloc(64);
    if (data != NULL) snprintf(data, 64, "%lld_%lld", (long long)st.st_size, (long long)st.st_mtime);
    return data;
}
-------------------------------------------------
"accept-ranges: bytes" tells the client web browser that the http server supports range requests.
"content-range: bytes 0-534231199/534231200" tells the client web browser the content range (0-534231199) of the response body and the total file size (534231200). Note that RFC 7233 only defines the meaning of Content-Range for 206 and 416 responses, so a client may ignore it in a 200 response; Content-Length already gives the total size here.
'ETag: "534231200_1732277116"' tells the client web browser the file version.
'Content-Disposition: attachment; filename="zen.iso"' tells the client web browser to download the file as an attachment, and gives its file name.
"Content-Length: 534231200" tells the client web browser the total size of the response body, which is the size of the whole file in this example.
"Content-Type: application/octet-stream" tells the client web browser to treat the response body as a binary stream.

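The response headers explained above could be built with snprintf, as in the following minimal sketch. The function name format_full_file_headers and its parameters are assumptions for illustration, and the Date header is left out to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the headers of a whole-file 200 response into
   out (capacity cap). etag is the unquoted value, for example
   "534231200_1732277116". Returns the header length, or -1 if out is too
   small or the file name cannot be quoted safely. */
int format_full_file_headers(char *out, size_t cap, long long file_size,
                             const char *etag, const char *filename)
{
    assert(out != NULL && etag != NULL && filename != NULL && file_size > 0);
    /* a double quote inside the name would break filename="..." */
    if (strchr(filename, '"') != NULL) return -1;
    int n = snprintf(out, cap,
        "HTTP/1.1 200 OK\r\n"
        "accept-ranges: bytes\r\n"
        "content-range: bytes 0-%lld/%lld\r\n"
        "ETag: \"%s\"\r\n"
        "Content-Disposition: attachment; filename=\"%s\"\r\n"
        "Content-Length: %lld\r\n"
        "Content-Type: application/octet-stream\r\n"
        "\r\n",
        file_size - 1, file_size, etag, filename, file_size);
    if (n < 0 || (size_t)n >= cap) return -1;   /* output was truncated */
    return n;
}
```

The last byte position in content-range is file_size - 1 because byte positions start at 0. The empty line at the end separates the headers from the response body.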
Step 3. When the web browser receives the above response, it knows the server supports http range, and knows the total file size from "Content-Length: 534231200" and "content-range: bytes 0-534231199/534231200". It also saves the etag for later comparison.
If the file is big, sending it may take several minutes. During this time, the user can click pause in the web browser's download manager (in firefox, you have to hold ctrl and right click the file to show the pause command).
When pause is pressed, the web browser disconnects from the http server, the send() call in the http server fails, and the http server stops sending.

Step 4. Later the user can click resume in the download manager to continue the download, and the download manager sends a http range request to tell the server the resume start point, like this:
---------resume range request---------
GET /zen.iso HTTP/1.1
Range: bytes=243220480-
If-Match: "534231200_1732277116"
......
-------------------------------------
"Range: bytes=243220480-" tells the http server to send the content from byte 243220480 to the end.
'If-Match: "534231200_1732277116"' tells the http server to compare this etag with the current etag of the file.
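The etag check can be sketched as follows. This is a minimal illustration under assumptions: the function name if_match_allows is hypothetical, and current_etag is expected without the surrounding double quotes (the format produced from file size and last modified time earlier in this article).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return 1 if the If-Match header value lists the
   current etag of the file, else 0. current_etag is the unquoted value. */
int if_match_allows(const char *if_match_value, const char *current_etag)
{
    assert(if_match_value != NULL && current_etag != NULL);
    size_t etag_len = strlen(current_etag);
    const char *p = if_match_value;

    while (*p == ' ' || *p == '\t') p++;
    if (*p == '*') return 1;               /* "*" matches any existing file */

    while ((p = strchr(p, '"')) != NULL) { /* opening quote of the next etag */
        const char *open = p;
        const char *close = strchr(open + 1, '"');
        if (close == NULL) return 0;       /* malformed: unterminated etag */
        /* If-Match uses strong comparison, so weak W/"..." etags never match */
        int weak = (open - if_match_value >= 2 &&
                    open[-2] == 'W' && open[-1] == '/');
        if (!weak && (size_t)(close - open - 1) == etag_len &&
            memcmp(open + 1, current_etag, etag_len) == 0)
            return 1;
        p = close + 1;
    }
    return 0;
}
```

The header value may contain several etags separated by commas, so the loop compares each quoted value in turn.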
If the etag remains the same, the file has not changed, so the server can send from the resume point 243220480, and its response status should be 206 Partial Content, like this:
-------response content from resume point-------
HTTP/1.1 206 Partial Content
Date: Sat, 23 Nov 2024 06:56:17 GMT
accept-ranges: bytes
content-range: bytes 243220480-534231199/534231200
Content-Length: 291010720
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="zen.iso"
ETag: "534231200_1732277116"
(http response body is the file content from resume point to end)
------------------------------------------------
When the client download manager receives this response, it keeps the already downloaded content and appends the new response body to it.
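Sending the body of such a 206 response means seeking to the resume point and sending only the requested number of bytes. The following is a minimal sketch under assumptions: the name send_file_range is hypothetical, and it uses write() so that it works on any file descriptor, while a real server would call send() on the client socket.

```c
#define _POSIX_C_SOURCE 200809L /* for fseeko() and off_t */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: copy length bytes of file, starting at byte offset
   start, to the descriptor out_fd. Returns the number of bytes written,
   which is less than length if the file ends early or a write fails. */
long long send_file_range(FILE *file, int out_fd, long long start, long long length)
{
    assert(file != NULL && start >= 0 && length >= 0);
    char buf[65536];
    long long done = 0;

    if (fseeko(file, (off_t)start, SEEK_SET) != 0) return 0;
    while (done < length) {
        size_t want = sizeof buf;
        if ((long long)want > length - done) want = (size_t)(length - done);
        size_t got = fread(buf, 1, want, file);
        if (got == 0) break;                    /* end of file or read error */
        size_t off = 0;
        while (off < got) {                     /* write() may be partial */
            ssize_t w = write(out_fd, buf + off, got - off);
            if (w <= 0) return done + (long long)off; /* e.g. client went away */
            off += (size_t)w;
        }
        done += (long long)got;
    }
    return done;
}
```

For the example above, the call would use start 243220480 and length 291010720, which is the Content-Length of the 206 response.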
If the etag differs, the file has been changed, so the server should send the file content from beginning to end, with a 200 OK response status, like this:
-------response content from beginning-------
HTTP/1.1 200 OK
Date: Sat, 23 Nov 2024 06:29:19 GMT
accept-ranges: bytes
content-range: bytes 0-634231201/634231202
ETag: "634231202_2732277118"
Content-Disposition: attachment; filename="zen.iso"
Content-Length: 634231202
Content-Type: application/octet-stream
(http response body is the whole file content)
------------------------------------------------
When the client download manager receives this response, it discards the already downloaded content and starts the download from the beginning.
Note that this full-file 200 reply is the behavior RFC 7233 defines for a failed If-Range check; for a failed If-Match check, RFC 7232 specifies 412 Precondition Failed, so a strict server may answer 412 instead.

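The two cases of step 4 can be summarized in a small decision function. This is only a sketch under assumptions: the struct resume_plan and the function plan_resume are hypothetical names, and the etag comparison and the Range parsing are expected to have been done already.

```c
#include <assert.h>

/* Hypothetical result type: which status to send and which bytes to send. */
struct resume_plan {
    int status;          /* 206 or 200 */
    long long start;     /* first byte of the response body */
    long long length;    /* Content-Length of the response body */
};

/* Hypothetical helper: decide how to answer a request.
   etag_matches: 1 if the etag sent by the browser equals the current etag.
   resume_from:  first byte asked for by "Range: bytes=<resume_from>-",
                 or -1 if the request has no Range header.
   The matching content-range is start to start + length - 1 of file_size. */
struct resume_plan plan_resume(int etag_matches, long long resume_from,
                               long long file_size)
{
    assert(file_size > 0);
    struct resume_plan plan = { 200, 0, file_size };   /* default: whole file */
    if (etag_matches && resume_from >= 0 && resume_from < file_size) {
        plan.status = 206;
        plan.start = resume_from;
        plan.length = file_size - resume_from;
    }
    return plan;
}
```

With the numbers of this article, a matching etag and resume point 243220480 give status 206 and length 291010720, while a changed file of 634231202 bytes gives status 200 and the whole file.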
Step 5. When a <video src="/video_file.mp4"> element is used to play videos on a web page, the browser also uses http range requests, as Zeng Xu's article describes:
https://www.zeng.dev/post/2023-http-range-and-play-mp4-in-browser/?

Section C. Thanks
Thanks to Zeng Xu for the help. Zeng Xu's article gives a very good description of http range.
Thanks to stackoverflow for the hint about using etag:
https://stackoverflow.com/questions/66998172/does-google-chrome-and-similar-browsers-support-range-headers-for-standard-downl
Thanks to gemini and other AI tools for their help with writing the code:
https://gemini.google.com/
Thanks to the generously shared reference materials on the internet:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Range
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag
https://www.rfc-editor.org/rfc/rfc7233
https://mirrors.tuna.tsinghua.edu.cn/ubuntu-releases/24.10/

Section D. Contact Me
If you find any errors in this article or have any suggestions for it, please let me know. My wechat: si_jinmin, my email: [email protected]
If you are interested in C/C++ programming, Linux, website development, Vue, Git, or vscode, you are welcome to join the "Linux/C/C++ Website Development" wechat group. Please add my wechat (si_jinmin) so that I can add you to the group.