Blog tag

#distributed systems

11 posts tagged with distributed systems.

4 min read

From Hardcoded Logic to Agent-Driven Routing: Refactoring ClawHub’s Orchestration Layer

How I replaced ClawHub’s monolithic routing node with agent self-determination to improve scalability and reduce coupling.

agent architecture, system design, refactoring, scalability, ClawHub, distributed systems
4 min read

Building a Resilient Task Requeue Mechanism in GhostGraph: Recovering Orphaned Pipeline Jobs

How I built a requeue endpoint in GhostGraph to revive stalled Redis Stream jobs and maintain pipeline integrity.

Redis Streams, task queues, pipeline resilience, GhostGraph, distributed systems
4 min read

How I Built a Real-Time Fleet Dashboard for Distributed Scraping Workers in GhostGraph

I built a lightweight, real-time dashboard to monitor GhostGraph's distributed scraping workers using FastAPI, Redis Streams, and server-sent events.

FastAPI, Redis Streams, real-time monitoring, distributed systems, web scraping, Python
4 min read

Building Autonomous Browser Agents: How I Scaled Vultr Crawler with Session Management and DOM Distillation

How I built stateful, token-efficient browser agents in Vultr Crawler using session APIs, DOM distillation, and autonomous action loops.

web scraping, browser automation, LLM optimization, REST API, distributed systems
4 min read

Building a Smarter Web Crawler: How I Implemented Two-Phase Intelligent Exploration in Vultr Crawler

I rebuilt my web crawler to move beyond brute-force scraping; it now learns patterns and adapts in real time.

web crawling, Playwright, Redis, pattern recognition, distributed systems
4 min read

How I Fixed Hung Connections in My Distributed Crawler with Hard Timeout Enforcement

I stopped silent network hangs in my Python crawler by layering signal-based hard timeouts over curl_cffi and adding IP rotation to preserve throughput.

Python, Web Scraping, Distributed Systems, curl_cffi, Timeouts, Debugging
4 min read

How I Scaled a Distributed Crawler with Atomic Redis State Management

How atomic Redis operations fixed state corruption during worker shutdowns in my distributed Vultr Crawler.

redis, distributed-systems, web-crawler, python, data-consistency
4 min read

Migrating Job State Management from Redis to Postgres: Why I Centralized Crawler Jobs in a Single Source of Truth

I moved job claiming in the Vultr Crawler from Redis to Postgres for better consistency, auditability, and operational simplicity.

distributed systems, Postgres, Redis, job queues, crawler architecture, data consistency
4 min read

Replacing ARQ with a Unified Redis Streams Worker: Why I Simplified My Distributed Task System

I replaced ARQ with a lightweight Redis Streams polling worker, cutting 6k+ lines and improving reliability across my scraping fleet.

python, redis, distributed systems, task queues, architecture
4 min read

Migrating from ARQ to Motia: Building a Lightweight, Event-Driven Worker Framework for Scalable Scraping

I replaced ARQ with my custom event-driven framework Motia to gain control, clarity, and reliability in my scraping workflows.

Python, Background Jobs, Distributed Systems, Web Scraping, System Design
4 min read

Building a Smart Crawler with LLM-Powered Extraction and ARQ Task Orchestration

How I used LLMs and ARQ to build a self-adapting, scalable web scraper that survives real-world site changes.

web scraping, ARQ, LLM, distributed systems, Python, data pipelines