Every year, students in my Introduction to Digital Humanities class want to build their final projects around analyzing Instagram, and it’s always been a struggle to help them collect the data. This year, I found a Python module – Instaloader – that handles the downloading and even has several advanced options that are intriguing for future exploration.
It’s worth bearing in mind that downloading content from other people’s social media posts requires ethical engagement with issues of consent, and it’s key to be thoughtful and careful about how the images and content are analyzed. In the context of my class, students are often looking to create Imageplots and Image Montages, which only display images at a small size and in a low resolution.
These issues are thorny, but it’s immensely helpful to have a way to analyze this material, so Instaloader is key to our projects (at least for now). Below are the simple steps I used to get started, along with fixes for some issues I ran into. This isn’t a comprehensive tutorial, but hopefully it’s helpful, especially when paired with Instaloader’s own documentation.
- Log in to Instagram using Firefox
  - It’s important to use Firefox as your browser for this, because the login script in the next step reads the session from Firefox’s cookies.
  - If you encounter issues later, it may be worth clearing your cookies in Firefox and logging in to Instagram again.
- Download the login script that Instaloader shares on this page. (Here is the direct download to that script file.)
  - Leave the script file named “615_import_firefox_session.py” (don’t rename it).
  - Make sure to save it to the location on your computer where you’ll run the Python code. For me, that meant within the “Users” section of my computer, within my username.
- Open Terminal (on a Mac), and then enter these commands one at a time:
  - pip3 install instaloader
  - python 615_import_firefox_session.py
  - instaloader --login yourusername usernameyouaredownloading
- It should start downloading every image from the profile that you indicated, creating a folder with the profile name. Inside, images are saved as .jpg files, videos as .mp4 files, and each post’s caption as a .txt file. Filenames are the date and time of the post, and every image/video within a post is collected (not just the first one from a multi-image post).
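
Once the download finishes, it can help to pair each post’s images and videos with its caption before moving on to analysis. Here is a minimal sketch of one way to do that with only Python’s standard library. It assumes filenames in the folder look like 2023-05-01_14-30-00_UTC.jpg, with _1, _2, and so on added for multi-image posts; that pattern is my assumption about Instaloader’s default naming, so check it against your own folder. The function name group_posts is made up for illustration and is not part of Instaloader.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed filename shape (check against your own download folder):
#   2023-05-01_14-30-00_UTC.jpg      single image
#   2023-05-01_14-30-00_UTC_1.jpg    first image of a multi-image post
#   2023-05-01_14-30-00_UTC.txt      caption for that post
POST_FILE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}_UTC)"
    r"(?:_\d+)?\.(?P<ext>jpg|mp4|txt)$"
)


def group_posts(folder):
    """Return {timestamp: {"media": [Path, ...], "caption": str or None}}."""
    posts = defaultdict(lambda: {"media": [], "caption": None})
    for path in sorted(Path(folder).iterdir()):
        match = POST_FILE.match(path.name)
        if match is None:
            continue  # skip files that do not fit the assumed pattern
        entry = posts[match["stamp"]]
        if match["ext"] == "txt":
            # The .txt file holds the caption for the post
            entry["caption"] = path.read_text(encoding="utf-8").strip()
        else:
            # .jpg and .mp4 files are the post's media
            entry["media"].append(path)
    return dict(posts)
```

Calling group_posts("usernameyouaredownloading") from the location where you ran Instaloader should return one entry per post, which makes it easier to count posts, spot posts without captions, or hand the image paths to a montage tool.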