Instagram to Cloudinary

We're going to walk through setting up a Node.js project that fetches Instagram posts, filters them by a set of hashtags, uploads the matching images to Cloudinary, and then triggers a Netlify build. The script runs on a schedule through GitHub Actions.

tl;dr

Here's the instagram-cloudinary repo if you'd like to skip to the final code.

The Problem

First, a little history...

I've been using the gatsby-source-instagram plugin for a while on my site to highlight some of my illustrations. I even added some functionality to the plugin so consumers could query their posts by hashtag.

The plugin author has done a great job tracking down issues and documenting the convoluted steps to get Facebook/Instagram API credentials. The plugin worked great for a while, but due to many changes in the Facebook/Instagram API, it became less reliable and harder to debug.

It got to a point where I couldn't update my site because of errors from the Instagram API. I believe part of the problem was how often the API was being queried: it seemed to work only every few minutes, throwing a "Please try again later" error in between. When I worked on the site locally, this caused a cascade of errors in Gatsby's graph. Because of this and the ever-changing permissions in the Facebook API, the plugin has been much harder to maintain.

To query the Instagram API less frequently and have more control, I decided to set up a Node.js project that fetches posts from the Instagram API, filters them by hashtag, and then uploads the matches to Cloudinary. When the posts upload successfully, the script fires a Netlify build via a webhook.

The Goal

To make this work for my case, I jotted down what I needed this script to do.

  1. fetch the latest posts from Instagram
    • make sure we get the first few comments (I used to put the post's hashtags in the first comment)
  2. upload the posts that match my hashtag list to Cloudinary
  3. trigger a build of my site (my site is running gatsby-source-cloudinary)

Setting up a new project

The first thing to do is set up a new Node.js project. There are quite a few starters out there, but, for this, I just started from scratch.

Create the project folder and navigate to it.

Initialize the package.json. I use yarn, but feel free to use npm.

This is what my package.json file looks like after initialization:

I like to use TypeScript, so that's the first dependency I'll add here, along with ts-node, which is going to help run the script. The -D flag installs both as devDependencies.

Next, it's time to set up the tsconfig.json file. I usually run the init and update the items as necessary.

This is how mine looks after cleanup.

Since we'll be using API keys for Instagram and Cloudinary, we can add dotenv to help us with the environment variables.

Next is adding the script file in the src folder.

In this file we can add some pseudo code to help us step through. Our src/index.ts can look like this:
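As a rough sketch of that outline: the function name instagramToCloudinary is the one used later in this article, while the returned list of step names is only an assumption added here so the outline runs and can be checked. Each commented step is filled in over the following sections.

```typescript
// src/index.ts: an outline only, not the final implementation.
async function instagramToCloudinary(): Promise<string[]> {
  // Hypothetical bookkeeping so the outline has something to return.
  const completed: string[] = [];

  // 1. Fetch the latest posts (and their first few comments) from Instagram.
  completed.push("fetch");

  // 2. Keep only the posts that match the hashtag list and reshape them for upload.
  completed.push("filter");

  // 3. Upload the matching posts to Cloudinary.
  completed.push("upload");

  // 4. If every upload succeeded, trigger a Netlify build.
  completed.push("build");

  return completed;
}
```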

Instagram Query

Let's get those posts!

We'll start with adding axios as a dependency. Axios will help us with the requests.

Next, let's set up some environment variables for Instagram. For this, you'll need an access token and an Instagram ID from Facebook so that you can query Instagram. Honestly, I don't remember how I got these working correctly, but the directions on gatsby-source-instagram were helpful.

Create a .env file at the root of your project and add your values like:

Now that we have these ready, we can start putting the script together. I'll put the Instagram fetching in its own function, so it's easier to reason about later.

First, we add require("dotenv").config(); so that we have access to the environment variables that we set up above.

For the request, we're using axios and building the URL with our parameters. The parameters are specialized to what we need for grabbing all the user's posts and the first three comments on those posts.

We're using the environment variables declared above to fill in the id and access token parameters. MAX_POSTS is a setting that we can increase whenever we need to fetch more posts, or decrease if we only want to keep up with the last few posts published to Instagram.

We then use an await to make sure the posts resolve before moving on to the next step. We can also add a try-catch in case there is a failure.
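Here is a minimal sketch of that fetch step, under a few assumptions: the Graph API version, the field list, the MAX_POSTS value, and the InstagramPost shape are illustrative rather than the exact values from my script, and the get parameter stands in for axios.get so the sketch runs without a network call.

```typescript
// Assumed shape of one item in the response; the real API can return more fields.
interface InstagramPost {
  id: string;
  caption?: string;
  media_url?: string;
  permalink?: string;
  comments?: { data: { text: string }[] };
}

// How many posts to request; raise or lower as needed.
const MAX_POSTS = 50;

// Assumptions: the version and field list are illustrative placeholders.
const GRAPH_API_VERSION = "v19.0";
const FIELDS = "id,caption,media_url,permalink,comments.limit(3){text}";

// Build the request URL for a user's media, including the first three comments.
function buildInstagramUrl(instagramId: string, accessToken: string): string {
  const query = [
    "fields=" + encodeURIComponent(FIELDS),
    "limit=" + MAX_POSTS,
    "access_token=" + encodeURIComponent(accessToken),
  ].join("&");
  return (
    "https://graph.facebook.com/" + GRAPH_API_VERSION + "/" + instagramId + "/media?" + query
  );
}

// Stand-in for axios.get: resolves to an object whose `data` is the response body.
type HttpGet = (url: string) => Promise<{ data: { data: InstagramPost[] } }>;

async function getInstagramPosts(
  get: HttpGet,
  instagramId: string,
  accessToken: string
): Promise<InstagramPost[]> {
  try {
    const response = await get(buildInstagramUrl(instagramId, accessToken));
    return response.data.data;
  } catch (error) {
    // On failure, return an empty list so later steps have nothing to upload.
    return [];
  }
}
```

In the real script, get would be something like (url) => axios.get(url), and the id and token would come from the values in .env.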

Prepare the Posts by Hashtag

Let's do this section in its own function as well. We can add the invocation of it to instagramToCloudinary(). We'll add a type of UploadPost that we can define in src/types.ts and import into our script. The UploadPost array will store our data in a way that we can pass to Cloudinary more easily.

Now, let's create the convertInstagramPostToCloudinaryEntity() function. This one is fairly deep, so I'll try to walk through it in pieces, then put it all together. We can stub out the function and add a type of Post which is, roughly, the shape we get back from the Instagram API.
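A sketch of what those two types and the stub could look like. The field names are assumptions based on what the later steps need, not a complete description of the Instagram response. In the project, the types would live in src/types.ts and be exported from there.

```typescript
// Roughly the shape returned by the Instagram API for one post (assumed fields).
interface Post {
  id: string;
  caption?: string;
  media_url: string;
  permalink: string;
  comments?: { data: { text: string }[] };
}

// What gets handed to the Cloudinary upload step (assumed fields).
interface UploadPost {
  file: string; // image URL to upload
  publicId: string; // identifier to store the image under
  tags: string[]; // ids of the hashtags that matched
  caption: string;
}

// Stub: the body is filled in during the next steps.
function convertInstagramPostToCloudinaryEntity(posts: Post[]): UploadPost[] {
  const entities: UploadPost[] = [];
  return entities;
}
```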

Now that we have a function ready to start with, we need to figure out how to filter the posts based on a hashtag. I opted to create a list of the hashtags I would like to showcase on my site. The list has an id and a regex because I messed up the hashtags a few times on Instagram and didn't want to go back through to fix them.
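To illustrate the id-plus-regex idea, here is a small sketch. The hashtag ids and patterns are hypothetical examples, and findHashtagIds is a helper name made up for this illustration.

```typescript
// Each entry pairs a canonical id with a regex that also accepts misspelled variants.
interface HashtagConfig {
  id: string;
  regex: RegExp;
}

// Hypothetical list; the patterns tolerate a dropped "k" or a single "l".
const HASHTAGS: HashtagConfig[] = [
  { id: "inktober", regex: /#ink?tober\b/i },
  { id: "illustration", regex: /#il+ustrations?\b/i },
];

// Return the ids of every configured hashtag found in the given text.
function findHashtagIds(text: string): string[] {
  return HASHTAGS.filter((hashtag) => hashtag.regex.test(text)).map((hashtag) => hashtag.id);
}
```

The regexes deliberately omit the g flag, so calling test() repeatedly on the same pattern doesn't carry state between posts.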

First, we'll loop over the hashtag config and filter down to the posts that have the hashtags we want. Then, if a post has more than one matching hashtag, we'll combine them and create a single entity that is easier to work with in the upload step. We can store these entities in a variable outside of the loop and return it at the end of the function.
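Putting that loop into code, a minimal sketch might look like the following. It assumes trimmed-down Post and UploadPost shapes and a hypothetical hashtag list, and it searches both the caption and the fetched comments for each hashtag.

```typescript
interface Post {
  id: string;
  caption?: string;
  media_url: string;
  comments?: { data: { text: string }[] };
}

interface UploadPost {
  file: string;
  publicId: string;
  tags: string[];
}

// Hypothetical hashtag list (id plus regex).
const HASHTAGS = [
  { id: "inktober", regex: /#inktober\b/i },
  { id: "sketch", regex: /#sketch(es)?\b/i },
];

function convertInstagramPostToCloudinaryEntity(posts: Post[]): UploadPost[] {
  // Declared outside the loop and returned at the end.
  const entities: UploadPost[] = [];
  // Lookup by post id so a post with several matching hashtags yields one entity.
  const byId: { [id: string]: UploadPost | undefined } = {};

  for (const hashtag of HASHTAGS) {
    // Filter down to the posts whose caption or comments contain this hashtag.
    const matching = posts.filter((post) => {
      const comments = post.comments ? post.comments.data.map((c) => c.text) : [];
      const text = [post.caption || ""].concat(comments).join(" ");
      return hashtag.regex.test(text);
    });

    for (const post of matching) {
      let entity = byId[post.id];
      if (!entity) {
        entity = { file: post.media_url, publicId: post.id, tags: [] };
        byId[post.id] = entity;
        entities.push(entity);
      }
      // Combine hashtags when a post matches more than one.
      entity.tags.push(hashtag.id);
    }
  }

  return entities;
}
```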

Upload to Cloudinary

Now that we have an array of Cloudinary-like entities, we're ready to move on to uploading to Cloudinary. To achieve this, we'll need to make sure we have upload credentials.

Cloudinary API Credentials

If you're new to Cloudinary, they have a very generous free tier where you can sign up at cloudinary.com or use this invite link. Once you're signed in, you can follow the steps below.

If you have a Cloudinary account, you can log in and go to Settings -> Security -> Access Keys. Here, you can add a new pair, which will be used in this script.

Once finished, you should see all the details you'll need at the top of your Cloudinary dashboard.

screenshot of Cloudinary's dashboard, showing api credentials

Cloudinary Client

Let's get the Cloudinary client connected. We'll need to add some items to our .env file.

We can now use these in our app to help us connect to Cloudinary. Here is the connection config. I placed this at the top of the index file, under the imports.
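A sketch of how the connection config could be assembled. The cloud_name, api_key, and api_secret option names are the ones Cloudinary's Node SDK config accepts; the environment variable names and the readCloudinaryConfig helper are placeholders made up for this example, so use whatever your .env defines.

```typescript
// Options accepted by cloudinary's v2 `config()`.
interface CloudinaryConfig {
  cloud_name: string;
  api_key: string;
  api_secret: string;
}

type Env = { [key: string]: string | undefined };

// Assumption: these variable names are placeholders, not necessarily the ones in my .env.
function readCloudinaryConfig(env: Env): CloudinaryConfig {
  const cloud_name = env["CLOUDINARY_CLOUD_NAME"];
  const api_key = env["CLOUDINARY_API_KEY"];
  const api_secret = env["CLOUDINARY_API_SECRET"];
  if (!cloud_name || !api_key || !api_secret) {
    // Fail early rather than discovering missing credentials mid-upload.
    throw new Error("Missing Cloudinary credentials in environment");
  }
  return { cloud_name, api_key, api_secret };
}

// In src/index.ts, under the imports, this would be used roughly as:
//   import { v2 as cloudinary } from "cloudinary";
//   cloudinary.config(readCloudinaryConfig(process.env));
```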

Start the Uploads

We're now ready to start the upload process. We'll start by creating a new async function because we want to know if all of the uploads made it successfully.

Cloudinary doesn't have a bulk upload script, but their individual upload is very quick and can handle many uploads simultaneously, so we can just loop over our entities and send them up. Here's the upload API we'll be using. There are a lot of options that can be adjusted based on the project's needs. Here, we're just using the basic options.

I'm using an array of promises to help me determine whether any upload had an error. This may not be necessary for all cases, but I like to know, especially before kicking off a new build of the site.

The last argument in the upload function is a callback that can help us determine if the upload was successful or not. We can check against this in our successfullyResolved check and return an appropriate status.
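The promise-per-upload idea can be sketched like this. The upload parameter mirrors the (file, options, callback) call shape of Cloudinary's uploader so that a fake can be swapped in; the UploadPost shape and the choice to resolve with a boolean instead of rejecting are assumptions made for this sketch.

```typescript
interface UploadPost {
  file: string;
  publicId: string;
  tags: string[];
}

// Same call shape as cloudinary's v2 `uploader.upload(file, options, callback)`.
type UploadFn = (
  file: string,
  options: { public_id: string; tags: string[] },
  callback: (error?: unknown, result?: unknown) => void
) => void;

async function uploadToCloudinary(upload: UploadFn, entities: UploadPost[]): Promise<boolean> {
  // One promise per entity; all uploads start right away.
  const uploads = entities.map(
    (entity) =>
      new Promise<boolean>((resolve) => {
        upload(entity.file, { public_id: entity.publicId, tags: entity.tags }, (error) => {
          // Resolve with the outcome instead of rejecting, so one failure doesn't hide the others.
          resolve(!error);
        });
      })
  );

  const results = await Promise.all(uploads);
  // True only if every callback reported no error.
  const successfullyResolved = results.every((ok) => ok);
  return successfullyResolved;
}
```

In the real script, upload would be a thin wrapper such as (file, options, cb) => cloudinary.uploader.upload(file, options, cb).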

Trigger Netlify Deploy

If all the images were uploaded successfully, we'll trigger the build on Netlify. First, we'll need our Netlify webhook. You can set one up in the deploy settings (https://app.netlify.com/sites/<account>/settings/deploys) under the Build Hooks section. Once you have the hook, you can add it to your .env file.

We'll now set up a conditional build script using axios.
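A minimal sketch of the conditional trigger. Netlify build hooks fire on a POST request with an empty body; here the post parameter stands in for axios.post so the sketch runs without a network call, and the triggerNetlifyBuild name is made up for this example.

```typescript
// Stand-in for axios.post: resolves to an object carrying the HTTP status.
type HttpPost = (url: string, body: object) => Promise<{ status: number }>;

async function triggerNetlifyBuild(
  post: HttpPost,
  hookUrl: string | undefined,
  allUploaded: boolean
): Promise<boolean> {
  // Skip the build when any upload failed or no hook is configured.
  if (!allUploaded || !hookUrl) {
    return false;
  }
  try {
    const response = await post(hookUrl, {});
    return response.status >= 200 && response.status < 300;
  } catch (error) {
    // The request failed; report that no build was triggered.
    return false;
  }
}
```

In the real script, post would be something like (url, body) => axios.post(url, body), and hookUrl would come from the value added to .env.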

Our script should look something like this now:

GitHub Action to Run on Schedule

In your GitHub project, you can create an action and use the following config for a daily run.

The - cron: "0 0 * * *" line is what configures the action to run once a day, at midnight UTC.

Conclusion

Offloading my Instagram posts to Cloudinary has made my site build more reliably, especially when I'm doing local development. The Cloudinary API has been rock solid for me so far and it's easier to work with.

Hopefully, this process is broken up enough that any service could be swapped out for another with little effort. I can see other social channels being aggregated into this pipeline as well.
