Unix & Linux Asked by Zombo on January 5, 2022
If you put this link in a browser:
https://unix.stackexchange.com/q/453740#453743
it returns this:
https://unix.stackexchange.com/questions/453740/installing-busybox-for-ubuntu#453743
However, cURL drops the hash:
$ curl -I https://unix.stackexchange.com/q/453740#453743
HTTP/2 302
cache-control: no-cache, no-store, must-revalidate
content-type: text/html; charset=utf-8
location: /questions/453740/installing-busybox-for-ubuntu
Does cURL have an option to keep the hash in the resulting URL? Essentially, I
am trying to write a script that resolves URLs the way a browser does. This is what
I have so far, but it breaks if the URL contains a hash:
$ set https://unix.stackexchange.com/q/453740#453743
$ curl -L -s -o /dev/null -w %{url_effective} "$1"
https://unix.stackexchange.com/questions/453740/installing-busybox-for-ubuntu
curl downloads whole pages. A # points to a fragment. The two are not compatible.
The # symbol is used at the end of a web page link to mark a position inside the whole page.
...convention called "fragment URLs" to refer to anchors within an HTML document.
What is it when a link has a pound "#" sign in it
It's a "fragment" or "named anchor". You can use it to link to part of a document.
Wikipedia: Uniform Resource Locator (URL)
An optional fragment component preceded by an hash (#). The fragment contains a fragment identifier providing direction to a secondary resource, such as a section heading in an article identified by the remainder of the URI. When the primary resource is an HTML document, the fragment is often an id attribute of a specific element, and web browsers will scroll this element into view.
Its main use is to move the "presentation layer" (what is viewed) to the start of an item.
There is no "presentation layer" in curl; its goal is to download whole pages, not parts or fragments of pages. A "fragment" marker therefore has no use in curl, and curl simply ignores it.
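Since the fragment never reaches the server, a script has to separate it from the rest of the URL itself. The following is a minimal sketch using POSIX parameter expansion; the function names `url_without_fragment` and `url_fragment` are made up for illustration and are not part of curl.

```shell
#!/bin/sh
# Hypothetical helpers (not part of curl): split a URL at its first '#'.

url_without_fragment() {
    # ${1%%#*} removes the longest suffix that starts with '#'.
    printf '%s\n' "${1%%#*}"
}

url_fragment() {
    case $1 in
        *'#'*) printf '%s\n' "${1#*#}" ;;  # text after the first '#'
        *)     printf '\n' ;;              # no fragment: print an empty line
    esac
}

url_without_fragment 'https://unix.stackexchange.com/q/453740#453743'
url_fragment 'https://unix.stackexchange.com/q/453740#453743'
```

Run as written, this should print `https://unix.stackexchange.com/q/453740` followed by `453743`.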
Re-append the fragment to the (redirected) link:
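originallink='https://unix.stackexchange.com/q/453740#453743'
# Follow redirects, discard the body, and print only the final URL.
wholepage=$(curl -Lso /dev/null -w '%{url_effective}' "$originallink")
# Stripping everything up to the last '#' changes the string only if a '#' exists.
if [ "$originallink" != "${originallink##*#}" ]; then
    newlink="$wholepage#${originallink##*#}"
else
    echo "link contains no fragment" >&2
    newlink="$wholepage"
fi
echo "$newlink"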
Will print:
https://unix.stackexchange.com/questions/453740/installing-busybox-for-ubuntu#453743
A considerably faster solution is to not download the final page at all, since its body is sent to /dev/null anyway.
Remove the -L option and ask curl, via %{redirect_url}, what the link would be if the (first) redirect were followed. Resolving only the first redirect is enough in this case and in most others.
wholepage=$(curl -so /dev/null -w %{redirect_url} "$originallink")
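One caveat with this shortcut: `%{redirect_url}` is empty when the response is not a redirect at all. The sketch below guards against that case; `first_hop` is a hypothetical helper name, and falling back to the original URL (minus its fragment) is an assumption about what the script should do, not something curl does for you.

```shell
#!/bin/sh
# first_hop is a hypothetical helper, not a curl feature.
first_hop() {
    # Ask curl where the first redirect would lead, without following it.
    target=$(curl -s -o /dev/null -w '%{redirect_url}' "$1")
    if [ -n "$target" ]; then
        printf '%s\n' "$target"
    else
        # Assumption: an empty result means the server did not redirect,
        # so the original URL without its fragment is already the answer.
        printf '%s\n' "${1%%#*}"
    fi
}

# Usage (performs a network request):
# wholepage=$(first_hop "$originallink")
```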
Answered by ImHere on January 5, 2022
According to a thread on the curl website titled "Re: How to send fragment part of URL?", the hashmark is meant for the browser and not the server, which is why curl truncates it.
The fragment part of a URI is not meant to be sent in the HTTP request - it is used to identify a specific section in the resource that will be fetched by using the particular URI. If you want to force #-letter into the request I think encoding it sounds like a perfect idea.
Looking around, I did not see any way for curl to preserve it other than encoding it as %23, which I don't think is what you want.
Since it's the client that maintains the string after the hashmark, I'd "lean into it": simply parse it out and re-append it to the URL returned by curl, as a true browser client would:
$ set 'https://unix.stackexchange.com/q/453740#453743'
$ echo "$(curl -I -L -s -o /dev/null -w %{url_effective} "$1")#$(echo "$1" | cut -d"#" -f2)"
https://unix.stackexchange.com/questions/453740/installing-busybox-for-ubuntu#453743
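Two edge cases are worth guarding in a script. First, `cut -f2` prints the whole line when the input has no `#`, so a URL without a fragment would get itself appended. Second, the HTTP specification's rule for `Location` says a redirect target that carries its own fragment keeps it, and only otherwise inherits the original one. Here is a minimal sketch of both checks; `inherit_fragment` is a hypothetical name, and whether curl's reported URL ever includes a fragment is not assumed either way.

```shell
#!/bin/sh
# inherit_fragment is a hypothetical helper:
#   $1 = original URL (may contain '#fragment')
#   $2 = resolved URL as reported by curl
inherit_fragment() {
    case $2 in
        *'#'*) printf '%s\n' "$2"; return ;;      # target has its own fragment
    esac
    case $1 in
        *'#'*) printf '%s#%s\n' "$2" "${1#*#}" ;; # inherit the original fragment
        *)     printf '%s\n' "$2" ;;              # nothing to inherit
    esac
}

inherit_fragment 'https://unix.stackexchange.com/q/453740#453743' \
    'https://unix.stackexchange.com/questions/453740/installing-busybox-for-ubuntu'
```

With the example arguments this should print the same `...installing-busybox-for-ubuntu#453743` URL as above, while a fragment-free input passes through unchanged.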
Answered by slm on January 5, 2022