How can I see what filename my browser is reading when it's on a site?
The browser isn't looking for a file. It's just asking for a resource. The server then decides what that resource returns.
At its most basic level, that "file" is literally just a file. For the default index page of a directory, the server's configuration determines which file is returned. Some servers are configured by default to return index.html if that file exists, then fall back to index.htm, and so on. Others default to default.html, and so on. They keep trying until the list of default files is exhausted and then return a 404 error.
When server rewriting is turned on or dynamic pages are being constructed, the content being returned typically isn't a file at all. It resembles a file because the output is (typically) HTML, just like a .html file would contain. But behind the scenes, tens or hundreds of files may contribute to that content.
Your browser doesn't load any file; it requests a resource, which the server then provides at its discretion (lengthy elaboration below).
If you type google.com into your browser's address bar, the browser will first prepend a scheme, either http:// or https://.
Then the browser will look up an IP address belonging to google.com, for example 172.217.19.206. Your browser will then open a TCP connection to that server on the proper port (80 for http, 443 for https).
Your browser will then send the following request to the server:
GET / HTTP/1.1
Host: google.com
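As a rough illustration of the steps so far (name lookup, connection, request), here is a minimal Python sketch. The function names build_request and fetch_status_line are made up for this example, the Connection: close header is an added assumption so the server ends the exchange, and a real browser sends many more headers and would normally use TLS on port 443.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # assumption: ask the server to close when done
        "",                   # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(host: str, port: int = 80, path: str = "/") -> str:
    """Resolve the host, connect, send the request, return the status line."""
    # create_connection does the DNS lookup and opens the TCP connection.
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        response = b""
        # Read only until the first line of the response is complete.
        while b"\r\n" not in response:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("iso-8859-1")
```

Calling fetch_status_line("google.com") needs network access and returns whatever status line the server chooses to send. Note that the request names only a host and a path, never a file.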
The web server will then decide what to do. This can involve a lot of steps.
A web server usually has something called a document root for each domain it serves. Files that the web server is allowed to serve to the user usually reside inside this document root. For example, the document root for google.com might be /var/www/domains/google.com/htdocs/.
Now, when you request a resource, the web server first inspects the request and then takes the proper action. For example, if the resource path ends with .php, the web server might decide not to serve anything itself, but instead invoke the PHP interpreter, let it execute the proper PHP file for the requested resource, and then serve the output to the user.
Take for example this request:
GET /article.php?id=123456 HTTP/1.1
Host: news.example.org
In this case, the web server on news.example.org is tasked with serving the resource /article.php?id=123456. What likely happens is that this web server will start the PHP interpreter, fetch the article.php file from the document root, feed it into the PHP interpreter and wait for the output. It will then send the output back to the browser that requested it. In this case, this would likely be a page from a news site or blog, with content loaded from a database (the contents of the article stored with the id 123456).
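To make the idea of a generated page concrete, here is a small Python sketch that stands in for what a script like article.php might do. It is not PHP and not how any particular site works: the ARTICLES dictionary is a hypothetical replacement for the database, and render_article is a name invented for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical stand-in for the database the script would query.
ARTICLES = {"123456": "Example headline"}


def render_article(target: str) -> tuple[int, str]:
    """Return (status, html) for a request target like /article.php?id=123456."""
    parts = urlsplit(target)
    # parse_qs maps each query parameter to a list of its values.
    ids = parse_qs(parts.query).get("id", [])
    title = ARTICLES.get(ids[0]) if ids else None
    if title is None:
        return 404, "<h1>Not found</h1>"
    # The HTML is assembled at request time; no article file exists on disk.
    return 200, f"<h1>{title}</h1>"
```

The browser receives only the resulting HTML. Whether it came from one script, many included files, or a database is invisible from the outside.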
But other things can happen, too.
Let's get back to the original example:
GET / HTTP/1.1
Host: google.com
What happens with any standard web server (Apache, Lighttpd, etc.) is more or less the following:
- Look for a file named index.html (in the document root) and serve it
- If that doesn't exist, look for a file index.htm and serve it
- If that doesn't exist, look for a file index.php, start the PHP interpreter and serve the output
- If that doesn't exist, serve a 404 NOT FOUND error
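The lookup order above can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not how any real server is implemented. The names resolve_index and DEFAULT_INDEXES are invented here, and the candidate list simply mirrors the order in the list.

```python
from pathlib import Path
from typing import Optional, Sequence

# Assumed order, taken from the list above; real servers make this configurable.
DEFAULT_INDEXES = ("index.html", "index.htm", "index.php")


def resolve_index(
    doc_root: Path, candidates: Sequence[str] = DEFAULT_INDEXES
) -> Optional[Path]:
    """Return the first existing index file in doc_root, or None."""
    for name in candidates:
        path = doc_root / name
        if path.is_file():
            return path
    # Nothing matched: the caller would answer with 404 NOT FOUND.
    return None
```

Passing a different candidates list, for example one that starts with a custom filename, changes which file wins without the browser ever noticing.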
The precedence of the extensions is usually configurable on the web server. The server might not serve any index.xxx file at all. For example, if you have a node.js server running, the web server would task the node.js server with providing the resource /, which might be anything the JS app running on node decides it to be.
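The same idea can be shown with a tiny Python WSGI application, used here as a stand-in for the node.js app mentioned above (it is not node.js code). The response body strings are made up for this sketch. The point is only that the program, not a file on disk, decides what / returns.

```python
def app(environ, start_response):
    """A tiny WSGI application: the code, not a file, decides what "/" is."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", b"<h1>Generated on the fly</h1>"
    else:
        status, body = "404 Not Found", b"<h1>Not found</h1>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever() would serve it on port 8000. No index file exists anywhere in this setup.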
tl;dr: The browser doesn't look up a file. The browser requests a resource; the web server then handles the request and serves the content appropriate for the requested resource, which might be a file, but might also be the output of a third-party program.
As far as speed is concerned, this depends on the web server. But if you want your web server to always serve the asjkdjhfz9874jykdfndsk.html file when / is requested, you would usually configure the web server to look for that file first, making it as fast as any other configuration.
Disclaimer: This is not a full description of how any web server works, nor is it tailored to a specific one. Most web servers work similarly, but sites like google.com in particular will likely run custom software tailored specifically to their needs.
Your browser will usually offer tools to inspect network activity. In Chrome, you can open the "Dev tools" and inspect the headers. This is what my browser sent to SE to let me edit this answer:
GET /posts/93567/edit HTTP/1.1
Host: webmasters.stackexchange.com
There are a few more headers that tell the server about caching, what language I am expecting, what browser I am using and where I'm coming from, but those aren't interesting here. The point is, my browser requests the resource /posts/93567/edit. My browser will never have any idea what file the web server serves. SE runs on ASP.NET MVC 5, which means that the web server (in SE's case, IIS) will likely route the request to some controller code (which can be located anywhere) and let the runtime evaluate it with the parameter postId=93567. The actual file or inner workings are never exposed to the browser, because the browser doesn't need to know (and because it is safer for whoever runs the server to hide that information).
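A hedged sketch of that kind of routing, written in Python rather than ASP.NET: a URL pattern is matched and a parameter is handed to a function. The names ROUTES, edit_post and dispatch, as well as the returned HTML, are invented for this illustration and say nothing about how SE's code actually looks.

```python
import re


def edit_post(post_id: int) -> str:
    """Hypothetical handler; real code would load the post and render a form."""
    return f"<form>editing post {post_id}</form>"


# Each route pairs a URL pattern with the function that produces the response.
ROUTES = [(re.compile(r"^/posts/(\d+)/edit$"), edit_post)]


def dispatch(path: str) -> tuple[int, str]:
    """Find the first matching route and call its handler."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return 200, handler(int(match.group(1)))
    return 404, "Not found"
```

Nothing in /posts/93567/edit corresponds to a path on disk. The URL is just input to the routing table.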
The view will also show you any other resources (CSS files, JS files, images, etc.) that your browser requests to correctly render the site. But from them, you will only learn about the resource, not whether those are actually files in the file system.
I'm wondering how I can know what file is being rendered. I can eventually guess it on a long enough timeline by explicitly requesting that filename in the URL. Does www.xyz.com/index.html fail to load anything? Then I try www.xyz.com/index.htm, and so on, until I get the site to render. I'm just looking for a shortcut to know what file my browser has loaded.
I agree with John here that what you are requesting by specifying a URL is a resource (or, for want of a better word, an object) from the server.
You'll never know 100% for sure what actual disk file is being read when a URL is requested. This is especially true if the server relies on a third-party program to produce the output.
A typical third-party program is the PHP interpreter, which WordPress uses to deliver content. The interpreter can process code that may involve loading any number of files from the server's disk in order to construct the HTML that is then delivered to the user's browser.
On top of that, special configuration can be applied to the server to assign special URLs to resources. In an Apache environment this is known as URL rewriting, and it is very useful since it's the starting point for creating friendly URLs.
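As an illustration of the rewriting idea (in Python, not Apache configuration syntax), a rule is essentially a pattern plus a replacement applied to the requested path before the server decides how to handle it. The rule below, which maps a hypothetical friendly URL /article/123456 to /article.php?id=123456, is an assumption made up for this sketch.

```python
import re

# Hypothetical rule: friendly URL on the left, internal target on the right.
REWRITE_RULES = [
    (re.compile(r"^/article/(\d+)/?$"), r"/article.php?id=\1"),
]


def rewrite(path: str) -> str:
    """Apply the first matching rewrite rule, or return the path unchanged."""
    for pattern, target in REWRITE_RULES:
        new_path, count = pattern.subn(target, path)
        if count:
            return new_path
    return path
```

With such a rule in place, the visitor only ever sees /article/123456, which gives no hint that a file called article.php is involved.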
The users won't know the exact filenames of the files loaded nor will they care (unless they are hackers) because all they care about is the actual content on the page.
It's also possible that some server admins decide not to use actual filenames in URLs for security reasons.