Capturing web content
Content on the internet is ephemeral: much of the material I have linked to in the past has since disappeared. The same goes for pages I had set aside to read later.
Over the years, I have tried different services to keep track of links I still want to read, and ways to keep local copies of content.
I currently use two main tools:
- Obsidian's web clipper works well for quickly saving the content of a page into an inbox folder in my notes. With an extra plugin, it can also store local copies of images on my own machine.
- I have used Zotero for quite some time as a bibliographic reference manager, for more permanent literature and documents that I want to cite.
A major benefit of both tools is that they capture content from within the browser, rather than fetching the URL independently. That makes it easy to capture pages behind paywalls and to store them exactly as I see them.
Other options
ArchiveBox is a powerful, self-hosted internet archiving solution for collecting, saving, and viewing websites offline. It can ingest URLs, browser history, bookmarks, and exports from services like Pocket and Pinboard, and it saves HTML, JavaScript, PDFs, media, and more.
I have worked with (and paid for) Omnivore, Pocket, Pinboard, and tt-rss. I still use Feedly as an RSS feed reader, and it has also served me for capturing content to revisit later.