Python: Join multiple components to build a URL
Using join
Have you tried simply "/".join(url_join_items)? Doesn't HTTP always use the forward slash? You might have to set up the prefix "https://" and the suffix manually, though.
Something like:
url = "https://{}.json".format("/".join(url_join_items))
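If the items may already carry leading or trailing slashes, the plain join above will produce doubled slashes. A minimal sketch of one way around that, assuming every item is a string and that the "https" scheme and ".json" suffix from the example are what you want (the function name join_url and its parameters are made up for illustration):

```python
def join_url(items, scheme="https", suffix=".json"):
    """Join URL pieces with single slashes, then add the scheme and suffix."""
    # Strip slashes from both ends of each item and drop items that end up empty.
    cleaned = [item.strip("/") for item in items]
    cleaned = [item for item in cleaned if item]
    return "{}://{}{}".format(scheme, "/".join(cleaned), suffix)


print(join_url(["server.com", "somedir/", "/somefile"]))
# https://server.com/somedir/somefile.json
```

This only handles the slashes; it does no percent-encoding of the items.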
Using reduce and urljoin
Here is a related question on SO that explains to some degree the thinking behind the implementation of urljoin. Your use case does not appear to be the best fit.
When using reduce and urljoin, I'm not sure it will do what the question intends, which is semantically like os.path.join, but for URLs. Consider the following:
from urllib.parse import urljoin
from functools import reduce
parts_1 = ["a","b","c","d"]
parts_2 = ["https://","server.com","somedir","somefile.json"]
parts_3 = ["https://","server.com/","somedir/","somefile.json"]
out1 = reduce(urljoin, parts_1)
print(out1)
# prints: d
out2 = reduce(urljoin, parts_2)
print(out2)
# prints: https:///somefile.json
out3 = reduce(urljoin, parts_3)
print(out3)
# prints: https:///server.com/somedir/somefile.json
Note that, apart from the extra "/" after the https prefix, the third output is probably closest to what the asker intends, but we've had to do all the work of formatting the parts with the separator ourselves.
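One way to get closer to os.path.join semantics while still using reduce and urljoin is to normalize the slashes at each step, so that urljoin always appends rather than replaces the last segment. This is only a sketch: join_url_parts is a hypothetical helper, and it assumes the first argument is a complete base URL such as "https://server.com" and the remaining parts are plain relative path pieces.

```python
from functools import reduce
from urllib.parse import urljoin


def join_url_parts(base, *parts):
    """Append each part to base as a new path segment."""
    def step(acc, part):
        # A trailing slash on the accumulator makes urljoin append to it;
        # stripping the leading slash keeps the part relative.
        return urljoin(acc.rstrip("/") + "/", part.lstrip("/"))
    return reduce(step, parts, base)


print(join_url_parts("https://server.com", "somedir", "somefile.json"))
# https://server.com/somedir/somefile.json
print(join_url_parts("https://server.com/", "/somedir/", "somefile.json"))
# https://server.com/somedir/somefile.json
```

The caller no longer has to format the parts with separators, at the cost of giving up urljoin's handling of absolute paths in later parts.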
I also needed something similar and came up with this solution:
from urllib.parse import urljoin, quote_plus
def multi_urljoin(*parts):
    return urljoin(parts[0], "/".join(quote_plus(part.strip("/"), safe="/") for part in parts[1:]))
print(multi_urljoin("https://server.com", "path/to/some/dir/", "2019", "4", "17", "some_random_string", "image.jpg"))
This prints 'https://server.com/path/to/some/dir/2019/4/17/some_random_string/image.jpg'
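One thing to be aware of: quote_plus encodes a space as "+", which is the form-encoding convention normally used in query strings. If the parts are path segments that may contain spaces, quote (which emits "%20") may be the better fit. A sketch of that variant, with the hypothetical name multi_urljoin_pct:

```python
from urllib.parse import quote, urljoin


def multi_urljoin_pct(*parts):
    """Like multi_urljoin, but percent-encodes spaces as %20 instead of +."""
    path = "/".join(quote(part.strip("/"), safe="/") for part in parts[1:])
    return urljoin(parts[0], path)


print(multi_urljoin_pct("https://server.com", "my dir", "a&b.jpg"))
# https://server.com/my%20dir/a%26b.jpg
```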
Here's a slightly silly but workable solution, given that parts is a list of URL parts in order:
my_url = '/'.join(parts).replace('//', '/').replace(':/', '://')
I wish replace had a "from" option, but it does not, hence the second replace, which restores the double slash in https://.
The nice thing is that you don't have to worry about whether parts already have slashes, as long as no more than two end up adjacent: replace('//', '/') turns '///' into '//', not '/'.
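For inputs where slashes pile up (for example a part ending in "/" next to one starting with "/"), a regular expression can collapse runs of any length while leaving the "//" after the scheme alone. A minimal sketch, assuming parts is a list of strings and the only colon followed by slashes is the one in the scheme; join_and_normalize is a made-up name:

```python
import re


def join_and_normalize(parts):
    """Join parts with '/' and collapse repeated slashes except after 'scheme:'."""
    joined = "/".join(parts)
    # Collapse any run of two or more slashes that does not start right after
    # a colon, so the '//' in 'https://' is preserved.
    return re.sub(r"(?<!:)/{2,}", "/", joined)


print(join_and_normalize(["https://server.com/", "/somedir/", "file.json"]))
# https://server.com/somedir/file.json
print(join_and_normalize(["https://", "server.com", "somedir"]))
# https://server.com/somedir
```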