Jason Frasca

Selective Web Scraping

Over the last few weeks, my research has led me to web scraping tools.

I prefer web scrapers that are low-code, no-code, or Chrome Extensions. I’m not interested in learning Python beyond its rudiments.

What I noticed is that web scraping tools, and the people behind them, like to download hordes of data. This creates a new, bigger problem: dirty data.

Dirty data requires extensive data scrubbing. Scrubbing can outweigh the benefits of scraping.

Save time acquiring data; lose more time cleaning and sorting it.

I prefer clean, organized, uniform data in all that I do.

These two points led me to scraping tools that privilege selection and control over masses of unorganized data.

A slower scrape affords me control of the extraction and keeps the data organized, which minimizes cleaning and sorting afterward.

Two browser extensions I’ve found useful for this approach:

  1. Link Klipper - a Chrome Extension to extract a href (link) data
  2. Simple Mass Downloader - a Firefox Extension to extract files via selected links
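
For anyone who would rather stay within Python's rudiments, the same select-first idea can be sketched with the standard library alone. This is only an illustration, not how either extension works internally: the names `LinkCollector`, `select_links`, and `SAMPLE_HTML`, the example.com URLs, and the choice to filter by file extension are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the text and absolute URL of every <a href=...> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links.append({"text": text, "url": self._href})
            self._href = None


def select_links(html, base_url, extensions=(".pdf",)):
    """Keep only links whose URL ends with a wanted extension, without duplicates."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()

    wanted = tuple(ext.lower() for ext in extensions)
    seen = set()
    selected = []
    for link in parser.links:
        url = link["url"]
        if url.lower().endswith(wanted) and url not in seen:
            seen.add(url)
            selected.append(link)
    return selected


# Hypothetical page fragment, used only to demonstrate the filter.
SAMPLE_HTML = """
<a href="/reports/q1.pdf">Q1 Report</a>
<a href="/about">About</a>
<a href="https://example.com/reports/q1.pdf">Q1 again</a>
<a href="/reports/Q2.PDF"> Q2 Report </a>
"""

if __name__ == "__main__":
    for link in select_links(SAMPLE_HTML, "https://example.com"):
        print(link["text"], link["url"])
```

Because the selection happens before anything is saved, the output is already a short, uniform list of text and URL pairs, so there is little left to scrub.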




© 2020 Jason M. Frasca