Relational Database Update in Batches
Heavy Update Transactions

UPDATE inventory
SET quantity = <select from another table>

That query has been running for more than 5 minutes, and customers are complaining that orders cannot be completed.
The only fix is to cancel the update and then wait hours for the database to complete the rollback.
The above is a very common scenario with relational databases, and it can be avoided by updating in batches.
Relational Database Lock with Heavy Update Transactions

The reason for the issue is that the database holds locks for the entire duration of the single large transaction.
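A minimal sketch of the batched approach, in Python with sqlite3 purely for illustration (the shop.db file and the inventory and stock_counts tables are hypothetical): each batch updates a limited key range inside its own short transaction, so locks are released at every commit instead of being held for hours.

import sqlite3

conn = sqlite3.connect("shop.db")   # hypothetical database
BATCH = 1000
last_id = 0

while True:
    # page through the table by primary key
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM inventory WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, BATCH),
    )]
    if not ids:
        break
    placeholders = ",".join("?" * len(ids))
    with conn:  # one short transaction per batch; locks are released at each commit
        conn.execute(
            f"""
            UPDATE inventory
            SET quantity = (
                SELECT s.quantity
                FROM stock_counts AS s
                WHERE s.product_id = inventory.product_id
            )
            WHERE id IN ({placeholders})
            """,
            ids,
        )
    last_id = ids[-1]

The same pattern applies to PostgreSQL or MySQL, where it keeps row locks short-lived and lets concurrent order transactions proceed between batches.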
The Zen of Python, for SQL
Naming convention

“Beautiful is better than ugly.”
“Readability counts.”
-- generic names
with cte (product_id, price) as (
    select 'p1', 5
    union all
    select 'p2', 10
)
select count(*) from cte

-- descriptive names
with products (product_id, price) as (
    select 'p1', 5
    union all
    select 'p2', 10
)
select count(*) as product_count from products

Explicit SELECT

“Explicit is better than implicit.”
-- implicit column order
with product_1 (product_id, price) as (
    select 'p1', 5
)
, product_2 (price, product_id) as (
    select 10, 'p2'
)
select * from product_1
union all
select * from product_2
-- ERROR: UNION types text and integer
-- cannot be matched

-- explicit column list
with product_1 (product_id, price) as (
    select 'p1', 5
)
, product_2 (price, product_id) as (
    select 10, 'p2'
)
select product_id, price from product_1
union all
select product_id, price from product_2

Simplicity

“Simple is better than complex.”
Python List & Set syntax recommendation
List vs Set commonly-used syntax inconsistencies

List:
>>> ls = [1, 2]
>>> ls.append(3)
>>> ls
[1, 2, 3]

Set:
>>> s = {1, 2}
>>> s.union({3})
{1, 2, 3}
>>> s
{1, 2}

List:
>>> ls = [1, 2]
>>> ls.extend([3, 4])
>>> ls
[1, 2, 3, 4]

Set:
>>> s = {1, 2}
>>> s.union({3, 4})
{1, 2, 3, 4}
>>> s
{1, 2}

List:
>>> ls = [1, 2]
>>> [x for x in ls if x in [2, 3]]
[2]
>>> ls
[1, 2]

Set:
>>> s = {1, 2}
>>> s.intersection({2, 3})
{2}
>>> s
{1, 2}

Note the inconsistency: the list methods mutate the list in place, while the set operations shown return a new set and leave the original unchanged.
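For completeness, a small sketch (not from the original post) of the in-place set methods that mirror the mutating list calls above:

>>> s = {1, 2}
>>> s.add(3)                       # in place, like list.append
>>> s
{1, 2, 3}
>>> s.update({4, 5})               # in place, like list.extend
>>> s
{1, 2, 3, 4, 5}
>>> s.intersection_update({2, 3})  # in-place filtering
>>> s
{2, 3}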
Hive-style partition done wrong
Wrong Hive-style partition

More than once I have seen a nicely done partition structure like the one below, only to be ruined by different file types sitting in the same place.
s3://bucket/
    year=2023/
    year=2024/
        ..
        month=07/
        month=08/
            raw.csv
            processed.csv

Spark/Glue or Presto/Trino/Athena would have worked directly on top of this structure if there were a single file type, either raw or processed.
The other file types should be put in separate buckets with similar structures.
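A minimal sketch of how the cleaned-up layout is consumed, assuming PySpark and hypothetical bucket names (bucket-raw holding only raw.csv files, bucket-processed holding only processed.csv files); with one file type per prefix, Hive-style partition discovery works as intended:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# e.g. s3://bucket-raw/year=2024/month=08/raw.csv (use s3a:// outside EMR/Glue)
df = spark.read.csv("s3://bucket-raw/", header=True)
df.printSchema()                    # year and month show up as partition columns
df.filter("year = 2024").count()    # partition pruning: other years are not read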
AWS S3 API continuation token
AWS S3 ListObjects 1000 per request

There is a common misunderstanding that AWS S3 ListObjects returns only 1000 results.
import boto3

s3 = boto3.client('s3')
resp = s3.list_objects_v2(Bucket='gdc-mmrf-commpass-phs000748-2-open')
assert len(resp['Contents']) == 1000

Neither the API nor its documentation emphasizes enough that the 1000 results are per page/request.
We are expected to call the API multiple times, check each response for NextContinuationToken, and pass it to the next call.
continuation_token = None
while True:
    api_kwargs = (
        {'ContinuationToken': continuation_token}
        if continuation_token
        else {}
    )
    resp = s3.list_objects_v2(
        Bucket='gdc-mmrf-commpass-phs000748-2-open', **api_kwargs
    )
    # process resp['Contents'] here
    if not resp.get('IsTruncated'):
        break
    continuation_token = resp['NextContinuationToken']
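Alternatively, boto3 ships a built-in paginator that handles the continuation token for you; a short sketch:

import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')
total = 0
for page in paginator.paginate(Bucket='gdc-mmrf-commpass-phs000748-2-open'):
    total += len(page.get('Contents', []))
print(total)  # counts every object, not just the first 1000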
MongoDB upsert new fields
operationType update

When the MongoDB Spark Connector upserts data, fields that are not in the data frame are removed by default.
If we only want to upsert new fields while keeping the existing ones, remember to set operationType to update in the write options.
This is much more efficient than:
- read all documents into memory
- declare or infer the schema
- add the new fields to the data frame
- write the data frame back into MongoDB

Sample Code

from pyspark.sql import SparkSession
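# The original sample is truncated above; what follows is a sketch only, assuming
# MongoDB Spark Connector v10.x (format "mongodb") and hypothetical connection,
# database, and collection names.
spark = SparkSession.builder.getOrCreate()

# data frame holding just the key and the new fields to add
df = spark.createDataFrame([(1, 'new value')], 'id:int,new_field:string')

(
    df.write.format('mongodb')
    .mode('append')
    .option('connection.uri', 'mongodb://localhost:27017')
    .option('database', 'mydb')
    .option('collection', 'mycoll')
    .option('idFieldList', 'id')         # match existing documents on this field
    .option('operationType', 'update')   # $set the new fields, keep existing ones
    .save()
)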
Spark Join Clean Code
TL;DR

- If the inner join columns have the same name, use the on parameter.
- If the join columns are different, drop the redundant column after the join.

Schema of a Spark join

Joining 2 Spark data frames combines the columns from both, including the join columns. This requires an additional cleanup step after the join.
df_a = spark.createDataFrame([(1, 'A')], 'id:int,a_name:string')
df_b = spark.createDataFrame([(1, 'B')], 'id:int,b_name:string')
df_a.join(df_b, df_a.id == df_b.id).printSchema()
"""root
 |-- id: integer (nullable = true)
 |-- a_name: string (nullable = true)
 |-- id: integer (nullable = true)    <----- TODO clean up
 |-- b_name: string (nullable = true)"""

df_a = spark.createDataFrame([(1, 'A')], 'id:int,a_name:string')
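# The original sample is truncated above; the rest is a sketch of the TL;DR fix,
# not the post's exact code.
df_b = spark.createDataFrame([(1, 'B')], 'id:int,b_name:string')

# same column name: pass the name to `on`, so the join key appears only once
df_a.join(df_b, on='id').printSchema()
"""root
 |-- id: integer (nullable = true)
 |-- a_name: string (nullable = true)
 |-- b_name: string (nullable = true)"""

# different column names: drop the redundant column right after the join
df_a.join(df_b, df_a.id == df_b.id).drop(df_b.id).printSchema()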
Python Memory & Garbage Collection
Out-of-memory issue

Working with big data frames sometimes hangs the system with an out-of-memory error. Even with Spark, this can still happen when you collect data back to the driver node.
The out-of-memory issue can be prevented in the following two ways:
1: signal Garbage Collector asap

(venv) ➜ ~ python -m memory_profiler t.py
Line #    Mem usage    Increment  Occurrences   Line Contents
======    =========    =========  ===========   =============
    35  150.836 MiB  150.836 MiB            1   @profile
    36                                          def func():
    37  192.
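The profiler output above is truncated, but the idea it illustrates can be sketched as follows (func and its contents are hypothetical; the point is to drop the reference and call gc.collect() as soon as the large object is no longer needed):

import gc

from memory_profiler import profile


@profile
def func():
    big = [0] * (10 ** 7)   # allocate a large object (tens of MiB)
    total = sum(big)
    del big                 # drop the reference as soon as it is no longer needed
    gc.collect()            # signal the garbage collector right away
    return total


if __name__ == '__main__':
    func()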