Frozendict in Python (PEP 814): The Safer Default-Argument Story for Mappings

Python’s mutable-default-argument gotcha is infamous. In my earlier post, I went one step further and tried to exploit it: I used a mutable default as a state bucket, then showed why it breaks the moment you want multiple independent instances (and why it gets even uglier around concurrency). Now there’s a language-level proposal that addresses the same theme in a much more principled way. PEP 814 proposes a new built-in type: frozendict, an immutable mapping designed to be “safe by design.” It’s currently a Draft targeting Python 3.15. ...

December 25, 2025 · 4 min

How I Slashed a 1-Million-Email Processing Pipeline from 11 Days to 38 Hours with Lightweight Parallelism

In the era of generative AI, the quality and scale of data processing have become more critical than ever. While sophisticated language models and ML algorithms steal the spotlight, the behind-the-scenes work of data preparation remains the unsung hero of successful AI implementations. From cleaning inconsistent formats to transforming raw inputs into structured information, these preparatory steps directly affect model performance and output quality. However, as data volumes grow, traditional sequential processing quickly becomes a bottleneck, turning what should be one-time tasks into resource-intensive operations that delay model training and deployment. For organizations working with moderate to large datasets (too small to justify a full Hadoop or Spark implementation, yet too unwieldy for single-threaded processing), finding the middle ground of efficient parallelism has become essential for maintaining agile AI development cycles. ...

April 18, 2025 · 16 min