Citation of a website: How to determine the year?

The Publication Date may be in the webpage's meta data or source

There may be date information contained in the meta tags in the page source. If you are not using a tool (see below) to extract this information, you can view the page source and attempt to interpret it. There are many different ways publication date information may be stored in meta tags. The most common ones will begin with <meta. The ways in which an appropriate date may be stored in these tags are too numerous to cover in a single post. If you are not familiar with what these might look like, a tool (see below) to extract the data will be quite helpful.

Extracting the date from meta tags, and picking the correct date (there may be several), can be complex. How, exactly, to do so varies from website to website. In addition, each site may change its format from time to time. If you are going to be referencing more than one or two webpages, a well-maintained tool to extract the reference information will be very helpful.

As an example, this page (randomly selected for testing some time ago), which does not display a date if you have JavaScript turned off (dates are displayed if JavaScript is turned on), contains appropriate publication dates in the following tags (there are more tags that contain dates that are not appropriate):

<meta name="parsely-pub-date" content="2014-10-14T23:45:00.011Z" data-ephemeral="true">
<meta name="date" content="2014-10-14T23:45:00.011Z" data-ephemeral="true">
<meta name="iso-8601-publish-date" content="2014-10-14T23:45:00.011Z" data-ephemeral="true">
<time class="published-at time-based" datetime="2014-10-14T23:45:00.011Z" itemprop="datePublished">
<time class="updated-at__time" datetime="2014-10-15T05:07:37.564Z">
<meta name="pubdate" content="Tue Oct 14 2014 19:45:00 GMT-0400" data-ephemeral="true">

That page also contains the publication and update dates located in multiple <script> tags, links, and various other tags.
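To give a feel for what extracting dates from such tags involves, here is a minimal sketch. It is not what any particular referencing tool does. The function names (extractTagDates, readAttribute) are my own, it only looks at <meta> and <time> tags with double-quoted attributes, and it relies on the JavaScript engine's Date.parse, which is lenient and differs somewhat between engines.

```javascript
// Hypothetical helper: read one double-quoted attribute out of a tag string.
function readAttribute(tag, name) {
  const m = tag.match(new RegExp('\\s' + name + '\\s*=\\s*"([^"]*)"', 'i'));
  return m ? m[1] : null;
}

// Hypothetical helper: collect date-like values from <meta> and <time> tags
// found in an HTML string. Tags whose value does not parse as a date are skipped.
function extractTagDates(html) {
  const results = [];
  for (const match of html.matchAll(/<(meta|time)\b[^>]*>/gi)) {
    const tag = match[0];
    const kind = match[1].toLowerCase();
    // <meta> keeps its value in content="", <time> keeps it in datetime="".
    const value = readAttribute(tag, kind === 'meta' ? 'content' : 'datetime');
    if (value === null) continue;
    // Require a four-digit year so values such as content="2" are ignored.
    if (!/(?<!\d)(1\d{3}|20\d{2})(?!\d)/.test(value)) continue;
    const parsed = Date.parse(value);
    if (Number.isNaN(parsed)) continue;
    const label = readAttribute(tag, 'name') ||
      readAttribute(tag, 'itemprop') ||
      readAttribute(tag, 'class') ||
      kind;
    // The year is taken in UTC; near New Year it can differ from the local year.
    results.push({ label, value, year: new Date(parsed).getUTCFullYear() });
  }
  return results;
}
```

Run against the six tags listed above, this should report each of them with the year 2014. You still have to decide which label (publication vs. update) is the one you want to cite.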

While it is possible to extract this type of information by hand, it is usually much more effective to allow a tool to do so for you. The tools mentioned in the last section of this answer should do a reasonable job of extracting the publication date from most webpages.

However, if you don't have one of those available, here is a bookmarklet that will toggle a display at the top of the page of all tags, except <A> and <IMG>, which contain a date in YYYY-[M]M-[D]D format; or in English language Month, [D]D, YYYY; or [D]D Month YYYY (It does not show text which is part of the displayed text):1,2

javascript:void((function(){var toRm=document.getElementById('showTagsWithDate');if(toRm){document.body.removeChild(toRm);return;}document.body.insertAdjacentHTML('afterbegin','<div id="showTagsWithDate" style="background-color:white;color:black;">Tags with a date in YYYY-[M]M-[D]D format; or in English (US, or non-US format):<ul/></div>');var myul=document.body.firstChild.lastChild;var tags=[];function addMoreDates(reg){var addTags=document.documentElement.innerHTML.match(reg);if(addTags){addTags.forEach(function(newTag){if((newTag.indexOf('<a ')===0)||(newTag.indexOf('<img '))===0){return;}if(tags.indexOf(newTag) === -1){tags.push(newTag);}});}}addMoreDates(/<[A-Z][^>]*\D(20\d\d|1\d\d\d)[\s\/\-.,]\s*([1-9]|0[1-9]|[1][012])[\s\/\-,.]\s*([1-9]|0[1-9]|[12]\d|3[01])\s*(st|nd|rd|th){0,1}\D[^>]*>/img);addMoreDates(/<[A-Z][^>]*\b([1-9]|0[1-9]|[12]\d|3[01])(st|nd|rd|th){0,1}[\/\-\s]\s*(january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*(20\d\d|1\d\d\d)\b[^>]*>/img);addMoreDates(/<[A-Z][^>]*\b(january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*([1-9]|0[1-9]|[12]\d|3[01])(st|nd|rd|th){0,1}[\s,.\-]+(20\d\d|1\d\d\d)\b[^>]*>/img);if(tags.length===0){tags=['No tags with dates.'];}tags.forEach(function(tag){myul.appendChild(document.createElement('LI')).appendChild(document.createTextNode(tag));});document.body.firstChild.appendChild(document.createElement('BR'));})())

Bookmarklet searching all text on the page for dates

In case you need to look through all the text, including displayed text, for dates, the following bookmarklet will display just the dates contained in the HTML. It will not show any context for these dates. You should be careful when using the results of this bookmarklet: you will need to determine why a particular date is in the text, and you will need to verify that each result is, in fact, a date, because the regular expressions used can match some strings which are not dates. Still, it may provide you with some hints as to what you should be looking for. The formats displayed include those in the previous bookmarklet plus YYYY month/season; month/season YYYY; [M]M-[D]D-YYYY; [D]D-[M]M-YYYY; and YYYYMMDD. Duplicates are not displayed. In addition, the list is sorted from earliest to latest (except some formats). Given the much broader set of things being searched for, on some pages there will be some items found which are not dates. You will need to use your own judgement. This bookmarklet is quite long.1,2 The bookmarklet will toggle showing/not showing all the dates it finds in the text.

javascript:void((function(){var toRm=document.getElementById('showTagsWithDate');if(toRm){document.body.removeChild(toRm);return;}document.body.insertAdjacentHTML('afterbegin','<div id="showTagsWithDate" style="background-color:white;color:black;">Dates in the HTML in multiple numeric and English language formats:<ul/></div>');var myul=document.body.firstChild.lastChild;var tags=[];function addMoreDates(reg){var addTags=document.documentElement.innerHTML.match(reg);if(addTags){addTags.forEach(function(newTag){if(tags.indexOf(newTag)===-1){tags.push(newTag);}});}}addMoreDates(/(20\d\d|1\d\d\d)[\s\/\-.,]\s*([1-9]|0[1-9]|[1][012])[\s\/\-,.]\s*([1-9]|0[1-9]|[12]\d|3[01])\s*(st|nd|rd|th){0,1}(?=\D)/img);addMoreDates(/([1-9]|0[1-9]|[12]\d|3[01])(st|nd|rd|th){0,1}[\/\-\s]\s*(january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*(20\d\d|1\d\d\d)/img);addMoreDates(/(january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*([1-9]|0[1-9]|[12]\d|3[01])(st|nd|rd|th){0,1}[\s,.\-]+(20\d\d|1\d\d\d)/img);addMoreDates(/\b([1-9]|0[1-9]|[1][012])[\s\/\-.,]\s*([1-9]|0[1-9]|[12]\d|3[01])[\s\/\-,.]\s*(20\d\d|1\d\d\d)\s*\b/img);addMoreDates(/\b([1-9]|0[1-9]|[12]\d|3[01])[\s\/\-.,]\s*([1-9]|0[1-9]|[1][012])[\s\/\-,.]\s*(20\d\d|1\d\d\d)\s*\b/img);addMoreDates(/\b(winter|spring|summer|fall|autumn|january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*(20\d\d|1\d\d\d)\b/img);addMoreDates(/(20\d\d|1\d\d\d)[\s,.\/\-]\s*(winter|spring|summer|fall|autumn|january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)/img);addMoreDates(/\b(20\d\d|1\d\d\d)(0[1-9]|[1][012])(0[1-9]|[12]\d|3[01])\b/img);tags.sort(function(a,b){var aVal=Date.parse(a);var bVal=Date.parse(b);if(aVal===bVal){return 0;}if(aVal>bVal){return 1;}return -1;});if(tags.length===0){tags=['No dates detected in page.'];}tags.forEach(function(tag){myul.appendChild(document.createElement('LI')).appendChild(document.createTextNode(tag));});document.body.firstChild.appendChild(document.createElement('BR'));})())

The above bookmarklets are not the best tool for the task of finding the correct date of a page. A purpose-built referencing tool can devote significantly more logic to obtaining, formatting, and displaying such dates with the context needed for you to determine the correct one to use. In addition, for dates that include the month written out as a word, the above bookmarklets only look for English months.
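Because the regular expressions can match strings which are not dates, a small amount of extra checking helps. The following is only a sketch of that idea for the numeric YYYY-MM-DD style. The function names (isRealDate, findIsoLikeDates) are made up for this example, and the pattern is a simplified cousin of the ones in the bookmarklets, not a replacement for them.

```javascript
// Hypothetical helper: true only when year/month/day name a real calendar day.
// Date.UTC rolls impossible days forward (Feb 30 becomes Mar 2), so a round
// trip through Date exposes them.
function isRealDate(year, month, day) {
  const d = new Date(Date.UTC(year, month - 1, day));
  return d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day;
}

// Hypothetical helper: find YYYY-MM-DD style strings (also with / or .) and
// keep only the ones which are real calendar dates. Duplicates are dropped.
function findIsoLikeDates(text) {
  const pattern = /(?<!\d)(20\d\d|1\d\d\d)[\/\-.](0?[1-9]|1[012])[\/\-.](0?[1-9]|[12]\d|3[01])(?!\d)/g;
  const found = [];
  for (const m of text.matchAll(pattern)) {
    const ok = isRealDate(Number(m[1]), Number(m[2]), Number(m[3]));
    if (ok && !found.includes(m[0])) {
      found.push(m[0]);
    }
  }
  return found;
}
```

This only removes impossible calendar days. A version number or an ID which happens to look like a valid date will still get through, so you still need to look at where each match came from.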

Last resort: use the Last Modification Date

If the page does not contain an explicit publishing/creation date in either the displayed text or the meta data in the page source (viewed either by hand or using a referencing tool), you should include the Last Modification Date. The Last Modification Date is the date the publisher (the company hosting the page) is claiming as the date the page was last modified (i.e. the date it was published). It should be used as a last resort, if no other date is available. It should not be your first choice for a date.

The Last Modification Date is usually the date/time at which the primary file for the page was changed, as determined by the modification timestamp stored with that file. This date may, or may not, be accurate. It may only represent the date and time to which the clock was set on the system where that file is stored at the last point the file was modified. It may, or may not, consider any dates on additional resources which are loaded onto the page from other locations (e.g. images, or content loaded dynamically by JavaScript). While it is not guaranteed to be accurate, it is the "publication date" provided by the publisher (the company hosting the webpage).

For dynamically generated webpages, the Last Modification Date may be the current day and time. While that might not be the information you desire, it is the date/time the page you are viewing was assembled from its base content and presented to you. The system serving the webpage to you may be composing the page from various different sources (e.g. an article fetched from a database which is combined with today's banner, a header, a footer and appropriate ads). While it would be preferable to have the date of the primary content you are referencing, the current date may be the only one the server can provide. Alternately, it might provide the date of the last modification of the primary content. What date is provided depends on how the server has been programmed. This is one of the reasons that using the Last Modification Date should be your last choice when no more authoritative date is available.

In addition, it should be noted that the Last Modification Date may be completely invalid. You should use your own judgement when looking at the date as to its validity.
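As a rough illustration of the kind of judgement involved, here is a sketch which rejects a last modification value that does not parse, lies in the future, or is essentially identical to the time you loaded the page (which suggests a dynamically generated page). The function name assessLastModified and the one-minute threshold are my own assumptions, not any standard.

```javascript
// Hypothetical helper: sanity-check a last-modification string, e.g. an HTTP
// Last-Modified header value or the text reported by document.lastModified.
// "now" is the time the page was accessed.
function assessLastModified(value, now = new Date()) {
  const time = Date.parse(value);
  if (Number.isNaN(time)) {
    return { usable: false, reason: 'not parseable' };
  }
  if (time > now.getTime()) {
    return { usable: false, reason: 'in the future' };
  }
  // Assumption: a timestamp within one minute of the access time was
  // probably produced when the page was assembled for this request.
  if (now.getTime() - time < 60 * 1000) {
    return { usable: false, reason: 'same as access time' };
  }
  // The day is reported in UTC; it can differ from the local calendar day.
  return { usable: true, date: new Date(time).toISOString().slice(0, 10) };
}
```

A result of usable: true only means that the value is not obviously wrong. It says nothing about whether the server's clock or file timestamp was correct.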

You can obtain the last modification date using a bookmarklet. One that will show the last modification date is:1,2

javascript:void(window.alert('The page was last modified on '+document.lastModified))

When I have used a last modification date in a reference I will usually include a note like (last modified: xxxx-xx-xx) which indicates how the date was obtained. This is needed because such dates are not normally displayed to the viewer of a webpage.

Bookmarklet with both last modification date and dates in the page

The following bookmarklet combines the above two bookmarklets to show both the last modified date and the dates in the page. It will toggle showing/not showing all the dates it finds.

javascript:void((function(){var toRm=document.getElementById('showTagsWithDate');if(toRm){document.body.removeChild(toRm);return;}var tags=[];function addMoreDates(reg){var addTags=document.documentElement.innerHTML.match(reg);if(addTags){addTags.forEach(function(newTag){if(tags.indexOf(newTag)===-1){tags.push(newTag);}});}}addMoreDates(/(20\d\d|1\d\d\d)[\s\/\-.,]\s*([1-9]|0[1-9]|[1][012])[\s\/\-,.]\s*([1-9]|0[1-9]|[12]\d|3[01])\s*(st|nd|rd|th){0,1}(?=\D)/img);addMoreDates(/([1-9]|0[1-9]|[12]\d|3[01])(st|nd|rd|th){0,1}[\/\-\s]\s*(january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*(20\d\d|1\d\d\d)/img);addMoreDates(/(january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*([1-9]|0[1-9]|[12]\d|3[01])(st|nd|rd|th){0,1}[\s,.\-]+(20\d\d|1\d\d\d)/img);addMoreDates(/\b([1-9]|0[1-9]|[1][012])[\s\/\-.,]\s*([1-9]|0[1-9]|[12]\d|3[01])[\s\/\-,.]\s*(20\d\d|1\d\d\d)\s*\b/img);addMoreDates(/\b([1-9]|0[1-9]|[12]\d|3[01])[\s\/\-.,]\s*([1-9]|0[1-9]|[1][012])[\s\/\-,.]\s*(20\d\d|1\d\d\d)\s*\b/img);addMoreDates(/\b(winter|spring|summer|fall|autumn|january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)[\s,.\/\-][\s,.\/\-]?\s*(20\d\d|1\d\d\d)\b/img);addMoreDates(/(20\d\d|1\d\d\d)[\s,.\/\-]\s*(winter|spring|summer|fall|autumn|january|february|march|april|may|june|july|august|september|october|november|december|jan|feb|mar|apr|may|jun|jul|aug|sep|sept|oct|nov|dec)/img);addMoreDates(/\b(20\d\d|1\d\d\d)(0[1-9]|[1][012])(0[1-9]|[12]\d|3[01])\b/img);tags.sort(function(a,b){var aVal=Date.parse(a);var bVal=Date.parse(b);if(aVal===bVal){return 0;}if(aVal>bVal){return 1;}return -1;});if(tags.length===0){tags=['No dates were detected in the page.'];}document.body.insertAdjacentHTML('afterbegin','<div id="showTagsWithDate" style="background-color:white;color:black;">The page was last modified on '+document.lastModified+'<br>Dates in the HTML in multiple numeric and English language formats:<ul/></div>');var myul=document.body.firstChild.lastChild;tags.forEach(function(tag){myul.appendChild(document.createElement('LI')).appendChild(document.createTextNode(tag));});document.body.firstChild.appendChild(document.createElement('BR'));})())

Always include an Access Date for a webpage or web resource

Even if you include any other date, you should always include the date you accessed the page, an Access Date. The Access Date is the only date which you know is correct and is critical information needed for someone to verify they are seeing the same resource you did.

Referencing is about providing enough information that a reader can verify the information being referenced against the source material, or read more detail if interested. Unlike paper publications, webpages are not static. At any time, they can be changed or removed by the person/company in control of the website, or disappear if the company/institution goes out of business. Thus, even if there is a "publication date" contained in the page, the date you accessed the page should always be provided. If you do not provide the access date, then there is no way for a reader to know exactly which version of a page you were viewing. While a particular webpage may not have changed from the time it was created to when you viewed it, there is no way for you to know that it will never change in the future.

The date you accessed the page can be included in the reference as a note (e.g. (accessed xxxx-xx-xx)).
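If you are assembling references with a script, the notes can be produced mechanically. This is just a sketch using the note wording from this answer. The function names (isoDay, dateNotes) are invented for the example, and the xxxx-xx-xx layout is filled in as YYYY-MM-DD using your local calendar day.

```javascript
// Hypothetical helper: format a Date as YYYY-MM-DD using the local calendar day.
function isoDay(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return date.getFullYear() + '-' + pad(date.getMonth() + 1) + '-' + pad(date.getDate());
}

// Hypothetical helper: build the date notes for a web reference. The access
// date is always included; the last-modified note only when one is supplied.
function dateNotes(accessed, lastModified) {
  const notes = [];
  if (lastModified) {
    notes.push('(last modified: ' + isoDay(lastModified) + ')');
  }
  notes.push('(accessed ' + isoDay(accessed) + ')');
  return notes.join(' ');
}
```

Calling dateNotes(new Date()) gives the access note for today. Your citation style may require a different wording or date layout, so adjust the strings to match it.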

Create an archive at the time you access the page

When using a webpage as a reference, I almost always cause at least one archive to be created at the time I view/reference the webpage. This helps ensure that readers wishing to view the source material are able to see exactly the same page I did, even if the webpage is changed or removed.

In addition, depending on the format you are using for the reference (e.g. online vs. paper), it may be helpful to include in the reference a direct link to the archive which was created of the page you are referencing.

To make an archive at the time I am viewing the page, I commonly use a bookmarklet to archive.org, or other archiving site, which causes them to create an archive at that time.

Bookmarklet to create an archive of the current page on archive.org

A bookmarklet to create an archive of the page you are currently viewing on archive.org is:1

javascript:void(window.open('https://web.archive.org/save/'+location.href))

In my opinion, archive.org is, by a significant margin, the most well established and stable archiving site. However, there are others. Below are bookmarklets for a couple of additional archiving sites:

Bookmarklet to create an archive of the current page on archive.is

The following bookmarklet creates an archive of the page you are currently viewing on archive.is. To actually create the archive, you will need to click on "Save the page" on the page which is opened when you use this bookmarklet:1

javascript:void(open('https://archive.is/?run=1&url='+encodeURIComponent(document.location)))
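If you want the same links outside of a bookmarklet (e.g. to archive a list of URLs you are referencing), the two bookmarklets above boil down to the following. This is a sketch which only builds the URLs exactly as the bookmarklets do; the function name archiveSaveUrls is my own, and whether each site accepts a given URL is up to the site.

```javascript
// Hypothetical helper: build the "save this page" URLs used by the two
// archiving bookmarklets, for an arbitrary page URL.
function archiveSaveUrls(pageUrl) {
  return {
    // archive.org takes the page URL appended as-is.
    archiveOrg: 'https://web.archive.org/save/' + pageUrl,
    // archive.is takes the page URL as an encoded query parameter.
    archiveIs: 'https://archive.is/?run=1&url=' + encodeURIComponent(pageUrl)
  };
}
```

For archive.is you still need to click on "Save the page" on the page which opens, just as with the bookmarklet.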

Tools for getting reference data, including dates, from webpages

There are many tools which can help you obtain reference information from webpages, including dates. If you are not already using such a tool, one place where you can find free tools to view reference information contained within a webpage is Wikipedia. While the formats used for references on Wikipedia are probably not appropriate for most cases, Wikipedia has been dealing with the issue of obtaining reference information from webpages for quite some time. Their page "Help:Citation tools" lists multiple tools which will extract dates, and other information, from webpages and display it in a format which you can copy into whatever reference format you are using.

Any tool you use to extract reference information is just a tool. The quality of the information extracted will depend significantly on how up to date the tool is relative to any changes on a specific site. You will need to review for accuracy any information provided by such tools.

Notes:
1. For security reasons, StackExchange does not permit JavaScript links to be created in pages. Thus, if you want to use the above bookmarklet you will need to manually create it. The above text should go in the "location" area of the bookmark.

2. If you are using Firefox, when you save the bookmarklet, all of the spaces will be automatically translated to %20. If you entered it with %20s, Chrome would do the reverse (translate all %20s to spaces). IE leaves either the space, or the %20, as you entered it. As a result, if you are using Firefox, don't worry when you see the bookmarklet location looks something like: javascript:void(window.alert('The%20page%20was%20last%20modified%20on%20'+document.lastModified))


As is common practice, mostly in Computer Science, when referring to a website you mention the date on which you accessed the link. For example, if you want to refer to http://www.example.com on Absolute Random Topic, you could do it as follows:

[1]: Absolute Random Topic, http://www.example.com (Accessed on: 02/08/2016)


I usually use this site when citing websites:

https://archive.org/web/

It gives you information on when the page was last updated - that's about the best you can do I think (beats just putting it down as the current year anyway).
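If you end up with an archived copy, the capture time is visible in its address. As far as I know, Wayback Machine addresses look like https://web.archive.org/web/YYYYMMDDhhmmss/original-url, with the timestamp in UTC; treat that layout as my assumption, not as documented behaviour. Under that assumption, a small sketch (the function name waybackCaptureTime is made up) to read the timestamp:

```javascript
// Hypothetical helper: pull the capture timestamp out of a Wayback Machine
// URL. Assumes the /web/YYYYMMDDhhmmss/ layout described above and returns
// null for anything else.
function waybackCaptureTime(archiveUrl) {
  const m = archiveUrl.match(/\/web\/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})[a-z_]*\//);
  if (!m) {
    return null;
  }
  const [, year, month, day, hour, minute, second] = m;
  return year + '-' + month + '-' + day + 'T' + hour + ':' + minute + ':' + second + 'Z';
}
```

Keep in mind that a capture time only tells you the page existed in that form on that date. It is not a publication date.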
