Another year has passed and a lot has happened in Alt All In's second calendar year in existence. I spent quite a bit of time in 2017 squashing bugs and tweaking the newsletter and backend to get everything running smoothly. There are quite a few moving parts to the software, from data ingestion, to link scoring, to the bespoke CMS that I use to build and send the newsletter via Mailchimp. And it breaks. When it breaks, that usually means I'm not consuming data, making for an out-of-date morning digest. The feeds I'm grabbing also tend to either get updated, fail spectacularly, or put up DDoS protection, which means running a headless browser to fetch links. Yes, this is the world of Alt All In.
There are 49 feeds in the database, but not all are being scraped. As the newsletter has evolved, I've added feeds but never deleted them; instead, I stop scraping an endpoint when its feed isn't getting enough traction in the newsletter. Like my motivation, this is totally subjective, but the editor must do what the editor must do. In 2017, I "retired" 13 feeds. In 2018, I retired two: The Drum and Medium articles tagged Social Media.
There are 994,039 links in the database. In 2017, I collected 236,685 links. In 2018, the number was down to 190,518. And before you do the math, I have links going all the way back to 2015.
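For anyone who does do the math anyway: the two annual figures don't come close to the total because of the backlog. A quick sanity check (the pre-2017 figure below is derived from the numbers above, not pulled from the database):

```python
total_links = 994_039  # all links in the database
links_2017 = 236_685
links_2018 = 190_518

# Everything else was collected between 2015 and the end of 2016
pre_2017 = total_links - links_2017 - links_2018
print(pre_2017)  # 566836
```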
As most of the feeds point to their own domain (e.g. the NY Times Technology feed serves links at nytimes.com), listing the feeds with the most links is relatively easy. The only outliers are Hacker News and Techmeme, where 90% of the links point to posts outside their own sites, as they are aggregators.
This is where it gets a little trickier: in order to query the database by domain, I first needed to slice and dice the Hacker News and Techmeme data to get those URLs into the mix. The top five domains I collected (as there's no guarantee I pulled everything through) from Hacker News and Techmeme in 2018:
Running these domains against the database, in 2018, yields the top links by domain:
Is Quartz really pumping out 39.4 articles per day?
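The slicing and dicing mentioned above boils down to taking each aggregator entry's outbound URL and reducing it to a bare domain so it can be grouped with everything else. A minimal sketch of that normalization, assuming the links are stored as full URLs (the helper name and sample data are my illustration, not the actual codebase):

```python
from urllib.parse import urlparse

def extract_domain(url: str) -> str:
    """Reduce a full URL to its bare domain, stripping any www. prefix."""
    netloc = urlparse(url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc

# Aggregator entries link out to third-party sites, so group by the
# domain of the outbound link rather than by the feed's own domain.
links = [
    "https://www.nytimes.com/2018/12/01/technology/example.html",
    "https://qz.com/1234567/example-story/",
    "https://news.ycombinator.com/item?id=18000000",
]
print([extract_domain(u) for u in links])
# ['nytimes.com', 'qz.com', 'news.ycombinator.com']
```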
I scrape both Facebook and Twitter to derive a "share count". Facebook's comes from a simple hit to their API. Twitter is a little more involved, as they removed their share-count functionality a couple of years ago. In order to aggregate this number from Twitter, I search for the URL via their API and page through the results, yielding the total number of shares.
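The Twitter tally is essentially a paged search with a running count. A rough sketch of that loop, with the actual API request abstracted behind a fetch_page callable so the paging logic stands on its own (the function and field names here are illustrative, not Twitter's exact API surface):

```python
from typing import Callable, Optional

def count_shares(url: str,
                 fetch_page: Callable[[str, Optional[int]], list]) -> int:
    """Page through search results for a URL, summing the tweets found.

    fetch_page(url, max_id) should return a list of tweet dicts, each
    with an integer "id"; an empty list signals the last page.
    """
    shares = 0
    max_id = None  # page backwards through results, oldest-id first
    while True:
        tweets = fetch_page(url, max_id)
        if not tweets:
            break
        shares += len(tweets)
        # Next page: strictly older than the oldest tweet seen so far
        max_id = min(t["id"] for t in tweets) - 1
    return shares
```

With Twitter's standard search API of the time, fetch_page would wrap a GET to search/tweets with the URL as the query and max_id for paging, but the same loop works with any paged search endpoint.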
And there you have it. The only other real number that counts: in 2018 I sent out 207 issues for a total of 468 newsletters sent since inception. I finally managed to get a date-based archive set up this year. This is the first time I’ve run a review of the data I’ve been collecting and it’s been interesting for me to see the numbers in full. It’s also good to get an understanding of the different volumes that the feeds are kicking out in order to make informed decisions on where to hunt for links.
Hopefully, this has given you some insight into the inner workings of Alt All In and what goes into producing it. If you're not already a subscriber, then please sign up. And if you are, please don't hesitate to hit that forward button and send it to everyone you know.
Sunday January 6th, 2019