Twitter website doesn't have open graph tags?
Twitter uses client-side-rendering (CSR) to generate HTML in the browser
Viewing the page source directly will not show the relevant <meta> tags or the actual page content, because all of it is generated dynamically in the client's browser by React (JavaScript). In fact, the raw HTML source contains only a stub with the message "We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?". You can verify this by opening the developer tools and watching the "Elements" tab while the page loads and renders, or by downloading the page without JavaScript emulation.
However, to improve Search Engine Optimization (SEO), Twitter returns server-side-rendered (SSR) HTML content, which does contain the <meta> tags, to various prominent web crawlers. This way crawlers do not have to emulate JavaScript to view the page and can crawl the raw HTML directly. Twitter recognizes crawlers by the User-Agent HTTP header they supply. Server-side rendering is generally more expensive than offloading HTML rendering to the client, which may be one reason Twitter uses client-side rendering by default.
Bypassing the User-Agent whitelist to receive server-side-rendered (SSR) HTML
Various prominent web crawlers are whitelisted by Twitter to receive server-side-rendered HTML. By spoofing the User-Agent HTTP header in your own request, you can bypass the whitelist and receive server-side-rendered HTML containing the relevant <meta> tags (whether this is advisable is a separate question). For programmatic HTTP requests, check whether your HTTP library supports changing the User-Agent HTTP header; most non-trivial libraries do.
whatismybrowser.com has a list of well-known web-crawler User-Agent headers; some of these crawlers are whitelisted, but not necessarily all. At the time of writing, the following user agents work:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)
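As a rough sketch of the programmatic approach, the following Python snippet sets a crawler User-Agent using only the standard library's urllib. The names build_crawler_request and fetch_html are hypothetical helpers introduced for illustration, and whether Twitter still returns server-side-rendered HTML for this user agent is an assumption that may not hold today.

```python
import urllib.request

# One of the crawler User-Agent strings listed above; whether it is still
# whitelisted is an assumption that may no longer hold.
FACEBOOK_CRAWLER_UA = (
    "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
)


def build_crawler_request(url, user_agent=FACEBOOK_CRAWLER_UA):
    """Return a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=FACEBOOK_CRAWLER_UA, timeout=10):
    """Download the page and return its body decoded as text."""
    request = build_crawler_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html("https://twitter.com/sharifshameem/status/1284095222939451393") would then perform the actual download; passing a different user_agent argument lets you try the other strings from the list.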
It looks like Twitter allows the Facebook crawler to view its Open Graph tags. If you set your user agent as described in the Troubleshooting section on the Facebook crawler site, the full set of tags appears.
$ curl -s --compressed -H "Range: bytes=0-524288" -H "Connection: close" -A "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "https://twitter.com/sharifshameem/status/1284095222939451393" | grep -i 'og:'
<meta property="og:type" content="video">
<meta property="og:url" content="https://twitter.com/sharifshameem/status/1284095222939451393">
<meta property="og:title" content="Sharif Shameem on Twitter">
<meta property="og:image" content="https://pbs.twimg.com/ext_tw_video_thumb/1284094287383166977/pu/img/LsArMNT3djA7xg53.jpg">
<meta property="og:description" content="“I just built a *functioning* React app by describing what I wanted to GPT-3. I'm still in awe. https://someurl”">
<meta property="og:site_name" content="Twitter">
<meta property="og:video:url" content="https://twitter.com/i/videos/1284095222939451393?embed_source=facebook">
<meta property="og:video:secure_url" content="https://twitter.com/i/videos/1284095222939451393?embed_source=facebook">
<meta property="og:video:type" content="text/html">
<meta property="og:video:width" content="1200">
<meta property="og:video:height" content="696">
Without specifying the user agent:
$ curl -s "https://twitter.com/sharifshameem/status/1284095222939451393" | grep -i 'og:'
<meta property="og:site_name" content="Twitter" />
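The grep filter above matches any line containing "og:". For programmatic use, a small parser gives structured results instead. The following is a minimal sketch using Python's standard html.parser module; OpenGraphParser and extract_open_graph are hypothetical names chosen for illustration, and the sample HTML in the usage note is abbreviated from the output shown above.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags from an HTML document."""

    def __init__(self):
        super().__init__()
        # (property, content) pairs in document order. A list is used rather
        # than a dict because a property such as og:image may repeat.
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.lower().startswith("og:"):
            self.tags.append((prop, attributes.get("content") or ""))


def extract_open_graph(html):
    """Return all Open Graph (property, content) pairs found in html."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.tags
```

For example, extract_open_graph('<meta property="og:site_name" content="Twitter" />') returns a single pair for og:site_name, which matches what the request without a crawler user agent yields.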