Building a high-performance image processing pipeline to create vernacular catalogs

Preview: Catalog for Hindi and English, created at run time.

Why an async pipeline for processing images?

Reading, manipulating, and saving images are compute-heavy tasks that take time. Running them on the main app server makes them compete with other critical APIs for CPU and memory. For a smoother overall experience, it’s better to run this work on a separate pool of workers.

A good rule of thumb is to avoid API requests that run longer than 300 ms. ~ Experience
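To make the idea concrete, here is a minimal sketch of a request handler that only enqueues the image job and returns right away, leaving the heavy work to a worker. It assumes RQ as the job queue; the queue name `images`, the task path `tasks.render_title`, and the Redis URL are hypothetical placeholders, not names taken from the sample project.

```python
def make_queue(redis_url="redis://localhost:6379/0"):
    """Build an RQ queue (the URL and queue name are assumptions)."""
    # Imported lazily so this module loads even where RQ is not installed.
    from redis import Redis
    from rq import Queue

    return Queue("images", connection=Redis.from_url(redis_url))


def handle_catalog_request(queue, config):
    """Enqueue the image job and answer immediately instead of rendering inline."""
    # "tasks.render_title" is a hypothetical dotted path to the worker function.
    job = queue.enqueue("tasks.render_title", config)
    return {"status": "queued", "job_id": job.id}
```

The handler does no image work itself, so its response time stays well below the threshold above regardless of image size.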

The major components in processing pipelines are:

Setting up a high-throughput message broker

Broker selection depends on the nature of the data. For payment-related transaction data, it’s advisable to go with a persistent distributed queue like Kafka. For short-lived jobs, where scale is preferred over consistency, an in-memory broker like Redis is a strong candidate. Since persistence is not the main goal of this data store, disabling Redis snapshots can improve performance considerably.

Redis push and pop operations on Google’s n1-standard-2.
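As a rough illustration of the push and pop operations, the sketch below wraps the Redis `RPUSH` and `BLPOP` commands and turns off snapshots at run time. It assumes a redis-py style client (for example `redis.Redis()`); the list name `image_jobs` is a hypothetical placeholder.

```python
import json

QUEUE_KEY = "image_jobs"  # hypothetical list name


def disable_snapshots(client):
    """Turn off RDB snapshots at run time (same as `CONFIG SET save ""`)."""
    client.config_set("save", "")


def push_job(client, job):
    """Append a JSON-encoded job to the tail of the list; returns the new length."""
    return client.rpush(QUEUE_KEY, json.dumps(job))


def pop_job(client, timeout=5):
    """Wait up to `timeout` seconds for the oldest job; None if the list stays empty."""
    item = client.blpop(QUEUE_KEY, timeout=timeout)
    if item is None:
        return None
    _key, payload = item  # BLPOP returns a (key, value) pair
    return json.loads(payload)
```

Pushing at the tail and popping from the head gives first-in, first-out ordering, and the blocking pop avoids busy polling when the queue is empty.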

Worker pool

The key concerns to address when selecting a worker framework are:

  1. A lock to avoid executing the same job on multiple workers.
  2. The ability to store failed jobs so they can be triaged later.
  3. The ability to prioritize jobs based on the message.
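The three concerns above can be sketched with plain Redis commands. This is an illustration of the underlying ideas, not how any particular framework implements them; the key names (`jobs:high`, `jobs:low`, `jobs:failed`, `lock:<id>`) and the worker id are hypothetical, and `client` is assumed to be a redis-py style client.

```python
import json


def next_job(client, queues=("jobs:high", "jobs:low"), timeout=5):
    """Prioritize: BLPOP checks keys in the order given, so earlier queues win."""
    item = client.blpop(list(queues), timeout=timeout)
    return None if item is None else json.loads(item[1])


def claim(client, job_id, worker_id, ttl=60):
    """Lock: SET with NX succeeds only for the first worker; EX expires stale locks."""
    return bool(client.set(f"lock:{job_id}", worker_id, nx=True, ex=ttl))


def run_job(client, job, handler, worker_id="worker-1"):
    """Run one job and return 'skipped', 'done', or 'failed'."""
    if not claim(client, job["id"], worker_id):
        return "skipped"  # another worker already holds this job
    try:
        handler(job)
        return "done"
    except Exception as exc:
        # Failed jobs: keep the payload and the error so they can be triaged.
        record = {"job": job, "error": repr(exc)}
        client.rpush("jobs:failed", json.dumps(record))
        return "failed"
```

A worker loop would call `next_job` and pass the result to `run_job`. The lock mainly guards against the same job being enqueued twice, since a single pop already hands each message to only one worker.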

Monitoring and Dashboard

RQ-dashboard provides a basic but sufficient view of queues with pending and failed tasks. It’s a lightweight, Flask-based web front-end to monitor your RQ queues, jobs, and workers in real time.

Dashboard showing the state of workers and jobs.

Implementation with Python and Redis

Architecture diagram in GCP.
{
  "id": 1,
  "style": {
    "fill_color": "white",
    "stroke_color": "black",
    "x": 512,
    "y": 900
  },
  "title": {
    "hindi": "जोकर",
    "english": "Joker"
  },
  "fonts": {
    "hindi": "NotoSans-Bold.ttf",
    "english": "NotoSans-Bold.ttf"
  }
}
Image title rendered in Hindi and English script, based on the config above.
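A worker could turn such a config into a rendered title roughly as follows. This is a sketch using Wand, not the sample project’s actual code: the function names, the `font_size` default, and the centered alignment are assumptions, since the config carries no size or alignment field.

```python
def title_spec(config, language):
    """Pick the text, font, and style for one language out of the job config."""
    style = config["style"]
    return {
        "text": config["title"][language],
        "font": config["fonts"][language],
        "fill_color": style["fill_color"],
        "stroke_color": style["stroke_color"],
        "x": style["x"],
        "y": style["y"],
    }


def render_title(src_path, dst_path, config, language, font_size=64):
    """Draw the localized title onto the image (font_size is an assumed default)."""
    # Imported lazily so title_spec works without ImageMagick installed.
    from wand.color import Color
    from wand.drawing import Drawing
    from wand.image import Image

    spec = title_spec(config, language)
    with Image(filename=src_path) as img, Drawing() as draw:
        draw.font = spec["font"]
        draw.font_size = font_size
        draw.fill_color = Color(spec["fill_color"])
        draw.stroke_color = Color(spec["stroke_color"])
        # Assumption: x marks the horizontal center of the title.
        draw.text_alignment = "center"
        draw.text(spec["x"], spec["y"], spec["text"])
        draw(img)  # apply the queued drawing operations to the image
        img.save(filename=dst_path)
```

Calling `render_title("poster.jpg", "poster_hi.jpg", config, "hindi")` with hypothetical file names would produce the Hindi variant; switching the last argument to `"english"` produces the English one from the same config.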

Sample Project on Docker

The sample project is available on https://github.com/arinkverma/vernacular-image. It’s free to download and explore.

Other use cases for an image processing pipeline

  1. Adaptive resolution: Adjusting image size and resolution for a better experience across screen sizes.
  2. Annotations: Adding badges and icons over the image to grab attention.
  3. Creative filter: Treating the image with blends and filters to make it visually appealing.

Example of image processing in Native Ads and OTT thumbnail
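For the adaptive resolution case, the size calculation can be sketched as below. The target widths are hypothetical breakpoints; the resulting sizes could then be passed to a resize call such as Wand’s `Image.resize(width, height)`.

```python
def scaled_size(width, height, target_width):
    """Return (w, h) for target_width, keeping the aspect ratio; never upscale."""
    if target_width >= width:
        return width, height
    return target_width, max(1, round(height * target_width / width))


def variants(width, height, target_widths=(320, 640, 1280)):
    """Map each target width (hypothetical breakpoints) to its output size."""
    return {w: scaled_size(width, height, w) for w in target_widths}
```

Refusing to upscale keeps small source images from being stretched for large screens.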

Further Reading

  1. Redis’s Push/Pop operation: https://redis.io/commands/rpush
  2. RQ: https://github.com/rq/rq
  3. RQ-Dashboard: https://github.com/Parallels/rq-dashboard
  4. Wand.py, ctypes-based simple ImageMagick binding for Python.

Arink Verma

Code, arts, process and aspirations. co-Founded GreedyGame | IIT Ropar. Found at www.arinkverma.in