I've known for a while that something is wrong with this blog's RSS feed, but I hadn't had a chance to look into it. As my collection of web feeds keeps growing and all the others work perfectly, this one broken feed became increasingly annoying, so I decided to spend some time fixing it.
Problem
I noticed two problems:
- The feed doesn't render properly in RSS readers
- The feed doesn't update (it goes stale) when new posts are published
Diagnose Cause
According to Wikipedia:
RSS is a web feed that allows users and applications to access updates to websites in a standardized, computer-readable format.
To figure out why the feed wasn't rendering properly, I first validated it against the RSS specification using an RSS validator to see if there were any syntax issues. The validation passed, so I assumed some optional elements (see the list below) might be missing. I checked the generated feed and noticed that both content and enclosure were missing.
RSS Specification
- Channel elements
  - title (required)
  - description (required)
  - link (required)
  - ttl (optional)
  - pubDate (optional)
  - etc.
- Item elements (all optional, but at least one of title or description must be present)
  - title
  - description
  - enclosure
  - content (custom element)
  - pubDate
  - etc.
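To make the two missing elements concrete, here is a minimal sketch of an RSS 2.0 item that carries both. The helper names (escapeXml, buildItemXml) and the shape of the post object are made up for illustration and are not this site's code. The sketch assumes the full post body is emitted as content:encoded, which comes from the RSS content module and needs xmlns:content="http://purl.org/rss/1.0/modules/content/" declared on the rss element.

```javascript
// Hypothetical helpers for illustration only, not this site's actual code.

// Escape the characters that are unsafe in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one <item>. `post` is an assumed shape:
// { title, link, description, html, date, image: { url, length, type } }
function buildItemXml(post) {
  const lines = [
    '<item>',
    `  <title>${escapeXml(post.title)}</title>`,
    `  <link>${escapeXml(post.link)}</link>`,
    `  <description>${escapeXml(post.description)}</description>`,
    `  <pubDate>${post.date.toUTCString()}</pubDate>`,
  ];
  if (post.html) {
    // A CDATA section cannot contain "]]>", so split any occurrence in two.
    const safeHtml = post.html.replace(/\]\]>/g, ']]]]><![CDATA[>');
    lines.push(`  <content:encoded><![CDATA[${safeHtml}]]></content:encoded>`);
  }
  if (post.image) {
    // RSS 2.0 requires url, length (in bytes) and type on <enclosure>.
    lines.push(
      `  <enclosure url="${escapeXml(post.image.url)}" ` +
        `length="${post.image.length}" type="${escapeXml(post.image.type)}" />`
    );
  }
  lines.push('</item>');
  return lines.join('\n');
}
```

Both optional elements are only emitted when the post actually has the data, so an item without a cover image still produces valid output.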
How to fix
To add the missing elements to the feed, I first needed to figure out how the RSS feed gets generated.
This blog is a static website powered by Next.js with Notion as its data source. The image below gives a glimpse of its tech stack.
The data source is the blog posts, which are made available either through a headless CMS or through the local file system. Since I'm using Notion as a headless CMS, the latest feed for this site can only be derived by calling the Notion APIs.
Headless CMS
Data (posts) is made available at run time (e.g. pulled from APIs). Publishing new posts does not require a new deployment of this site, and the latest feed can only be fetched from the remote API.
Local File System
Data (posts) is made available at build time (e.g. by parsing static markup files). Publishing new posts triggers a new deployment of this site, and the latest feed is updated as part of the deployment process.
However, since the official Notion client doesn't offer any method for rendering a post, I needed a custom HTML renderer to do this. I found one that works.
const { html } = await NotionPageToHtml.convert(`https://notion.so/${pageId}`, { bodyContentOnly: true });
But rendering every post this way on each request exceeds the serverless function execution timeout on Vercel's free tier. I ended up caching the rendered feed items in a file to get around it.
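import fs from 'fs'
import path from 'path'

// Load previously rendered feed items, keyed by page ID
const fileCache = path.join(process.cwd(), 'rss.data.json')
const cache = JSON.parse(fs.readFileSync(fileCache, 'utf-8'))

for (const pagePath of Object.keys(siteMap.canonicalPageMap)) {
  // ...
  // Cache hit: reuse the stored item and skip the slow HTML rendering
  if (cache[pageId]) {
    feedItems.push(cache[pageId])
    continue
  }
  // ...
  // Cache miss: store the freshly rendered item
  cache[pageId] = feedItem
  feedItems.push(cache[pageId])
}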
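The snippet above elides how the cache file is created and written back. A slightly fuller sketch of the same read-through idea might look like the following. The names loadCache, saveCache and buildFeedItems are hypothetical, and renderItem stands in for the slow Notion-to-HTML conversion. It is written as CommonJS so it can run standalone under Node, and it assumes the cache file is writable, which may not hold inside a serverless function at request time.

```javascript
const fs = require('fs');

// Load the cache, falling back to an empty object if the file
// does not exist yet or does not contain valid JSON.
function loadCache(file) {
  try {
    return JSON.parse(fs.readFileSync(file, 'utf-8'));
  } catch {
    return {};
  }
}

// Persist the cache so the next run can skip already rendered pages.
function saveCache(file, cache) {
  fs.writeFileSync(file, JSON.stringify(cache, null, 2));
}

// renderItem(pageId) is a hypothetical async function that returns
// a feed item; it is only called for pages missing from the cache.
async function buildFeedItems(pageIds, renderItem, file) {
  const cache = loadCache(file);
  const feedItems = [];
  let dirty = false;

  for (const pageId of pageIds) {
    if (!cache[pageId]) {
      cache[pageId] = await renderItem(pageId);
      dirty = true;
    }
    feedItems.push(cache[pageId]);
  }

  // Only touch the disk when something new was rendered.
  if (dirty) saveCache(file, cache);
  return feedItems;
}
```

With this shape, only posts published since the last run pay the rendering cost, which is what keeps the function under the timeout. One caveat is that an edited post keeps its old cached entry until that entry is removed.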
Once the missing elements are added, the change won't go live immediately because of the second problem, and it's not hard to guess that it is related to caching. There are two levels of cache worth noting: one is the <ttl> sub-element of the channel element above; the other is the Cache-Control header on the feed endpoint (i.e. /feed of this site), which was set to public, max-age=86400, stale-while-revalidate=86400. I reset it to public, max-age=0, must-revalidate to disable caching so that RSS readers see all the latest posts.
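As a rough sketch of where these two cache levels live, here is a handler written in the style of a Next.js API route. The post doesn't show the real endpoint, so handleFeed, buildFeedXml, the example.com values and the ttl of 60 minutes are all assumptions for illustration; only the standard Node response methods setHeader and end are used.

```javascript
// Hypothetical feed builder. <ttl> is the feed-level hint, in minutes,
// for how long a reader may cache the channel before refreshing.
function buildFeedXml(itemsXml, ttlMinutes) {
  return [
    '<?xml version="1.0" encoding="utf-8"?>',
    '<rss version="2.0">',
    '<channel>',
    '<title>Example blog</title>',
    '<link>https://example.com</link>',
    '<description>Example feed</description>',
    `<ttl>${ttlMinutes}</ttl>`,
    itemsXml,
    '</channel>',
    '</rss>',
  ].join('\n');
}

// Hypothetical handler for the /feed endpoint.
function handleFeed(req, res) {
  // HTTP-level cache: a stored response must be revalidated
  // with the server before it is reused.
  res.setHeader('Cache-Control', 'public, max-age=0, must-revalidate');
  res.setHeader('Content-Type', 'application/rss+xml; charset=utf-8');
  res.end(buildFeedXml('', 60));
}
```

The trade-off is that every poll from a reader now reaches the endpoint, so this setting works best when generating the feed is cheap, for example with the file cache described earlier.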