Full-Stack Developer & Content Creator

Hello, I'm DavidHE

Over 3 years of full-stack development experience, specializing in building large-scale, high-performance, and scalable web applications. Combining digital content creation and media experience to deliver comprehensive solutions.

  • Full-Stack Development
  • AI Tools & Ecosystem
  • Content Creation
  • Digital Media
3+ Years Experience
10000 Followers
Full-Stack Developer

Crafting Digital Excellence

Over 3 years of full-stack experience, specializing in building large-scale, high-performance, and scalable web applications. Comfortable across the entire product lifecycle—from front-end interface design to back-end business logic and databases—across multiple technology stacks.

Dedicated to User-Centered Innovation

I build large-scale web applications with strong performance and scalability. I combine solid engineering with hands-on digital content experience to deliver end-to-end, effective solutions.

Personal Information:
My name is Nguyen Dai Hoang, and I am currently a 3rd-year student majoring in Information Technology at the International Training Institute of Thai Nguyen University of Information and Communication Technology (ICTU).

Full-Stack · AI Tools · Content Creator

Core Principles

  • Simplicity & elegance in design
  • User empathy & accessibility
  • Scalable, maintainable code
  • Continuous learning & adaptation

"I believe in creating software that solves real problems while elevating user satisfaction."

Hobbies

Quiet street at sunset
Minimal desk setup by the window
Morning coffee and sketchbook
Running in misty park
Reading corner at night
Cozy coffee shop
Mountains and lake view
Creative desk with drawings

Technical Proficiency

Full-stack Web Development

Over 3 years of full-stack experience, specializing in large-scale, high-performance, and scalable web apps.

  • HTML5/CSS3
  • JavaScript/ES6+
  • Next.js (Basic)
  • APIs (integration, consumption & design)
  • Python
  • .NET Core (C#)
  • RESTful APIs
  • SQL (DB design & queries)
  • Docker
  • AWS (basic cloud services)
  • CI/CD (Basic)
  • Performance Tuning

AI Tools & Ecosystem

Using modern AI tools to accelerate development and improve product quality.

  • Visual Studio Code
  • Visual Studio 2022
  • Cursor (AI Assistant)
  • GitHub Copilot / Tabnine
  • Claude (LLM)
  • Grok (LLM)
  • GPT (LLM)
  • Gemini (LLM)
  • Google AI Studio

Digital Content Creation & Media

Hands-on experience building a personal brand and engaging audiences on digital platforms.

  • TikTok (9K+ Followers)
  • CapCut (quick editing)
  • Content strategy
  • Streamer / YouTuber
  • Adobe Premiere Pro
  • Photoshop (Thumbnail/Image Design)
  • MC at TNTV
  • Scriptwriting
  • Media content creation

Work

Học Chứng Khoán Edu
FINANCIAL EDUCATION · Open Project, Vietnamese-first

hocchungkhoan.edu.vn: A Free Stock-Market Learning Project for Vietnamese Learners

GoldenStock
FINTECH · Next.js, FastAPI, VNStock, Docker

GoldenStock — Stock Analytics Platform

Cờ Tướng GenZ
GAME · React, Next.js

Cờ Tướng GenZ — Modern Chess

IELTS Pro
EDUCATION · React, Next.js, Tailwind

IELTS Pro — Exam Prep Platform

Ice Tea
WEB DESIGN · React, Vite, CSS

Ice Tea — Tea Culture Web

Game Vui Dòng Tiền
GAME · Phaser, JavaScript

Game Vui Dòng Tiền VN

HE Coffee
SYSTEM · Python, JS

HE Coffee — Cafe Management App

LearnLangs
AI EDUCATION · C#, .NET, Blazor

LearnLangs — Language App

typingtest
TOOL · Python, JS

typingtest — Typing Practice App

Latest Articles

The 5 Whys - Root Cause Analysis
Data Analytics · 5 min read

The 5 Whys: Master the Art of Root Cause Analysis in Data Analytics

Big Data Fundamentals by Thomas Erl
Data Analytics · 8 min read

From Pyramids to Big Data: Exploring the Professional Data Analysis Process

CPMAI Certificate by DavidHE
Artificial Intelligence · 10 min read

CPMAI: The Key to Avoiding Failure in AI Project Management

Clean Code Journey by DavidHE
Engineering · 20 min read

Clean Code Is Not Just for Seniors: 3 Years of Real-World Lessons

Power and Prediction: AI Economics by DavidHE
Strategy · 25 min read

Why 89% of AI Investments Fail: Lessons from "Power and Prediction"

Designing Data-Intensive Applications by Martin Kleppmann
Engineering · 30 min read

From MySQL Chaos to System Thinking: My Journey with DDIA

Warren Buffett Investment Mindset
Finance · 8 min read

Market Truths Unveiled: 5 Life-Changing Mindsets from Warren Buffett

Zero to One - Peter Thiel
Engineering · 6 min read

Zero to One: Don't Become an "Obsolete" Version of Someone Else

Customer disruption
Inspiration · 5 min read

Don't Blame Technology: Is the Customer Disrupting Your Business?

Scientific Conference
News · 4 min read

Honored to Attend and Participate in National Scientific Conference

Achievement Unlocked: Second Prize Winner
Engineering · 3 min read

[Achievement Unlocked] Proud Winner of the Second Prize! 🥈

Specialized seminar
Research · 5 min read

Seminar: Advanced Technology Applications in Biomedical Science

Immersing in AI with Dr. Lin
Inspiration · 6 min read

From Student to Discussion Partner: Immersing in AI with Dr. Lin (FCU)

MC Experience
Media · 7 min read

MC Experience at Thai Nguyen Newspaper and Television Broadcasting

English Club President
Experience · 5 min read

My Journey as English Club President

Community Education & Fintech

Sharing the free stock-market learning project built by LuahoaTeam and me

LuahoaTeam and I have just finished a small project called hocchungkhoan.edu.vn, an open stock-market learning platform written entirely in Vietnamese and built specifically for Vietnamese learners.

The hocchungkhoan.edu.vn project

We understand that the financial market is vast and sometimes hard to approach, especially for beginners. That is why the project was created with a very clear goal: to bring practical knowledge closer to everyone, with no language barrier and no cost barrier.

What makes the project special

  • Made for Vietnamese learners: The content is written in familiar, easy-to-understand language and stays close to the investment context in Vietnam.
  • Open and free: We share everything publicly so that everyone has the same opportunity to access knowledge.
  • Practical content: From chart analysis and reading business models to in-depth research documents, everything is clearly organized and easy to find.

If you are looking for a place to study systematically and build hands-on investing knowledge, we would love for you to stop by and try it out. We hope this small contribution from LuahoaTeam and me gives the Vietnamese investor community one more reliable place to learn.

Game Development

Cờ Tướng GenZ — Modernizing a Classic

How can we make a centuries-old game appeal to the TikTok generation? That was the core challenge behind Cờ Tướng GenZ.

Cờ Tướng GenZ

This project isn't just about the rules of chess; it's about the experience. By combining a vibrant GenZ aesthetic with smooth animations and a responsive interface, we've turned traditional Chinese chess into something fresh and exciting.

Key Features

  • Modern UI/UX with neon accents.
  • Real-time gameplay logic.
  • Mobile-first responsive design.

Education Tech

IELTS Pro — Your Digital Learning Companion

Mastering IELTS requires more than just books; it requires an interactive environment that tracks your progress and sharpens your skills.

IELTS Pro

IELTS Pro was designed to simplify the preparation process. From simulated listening tests to reading modules, every feature is optimized for student success. The platform uses Next.js for lightning-fast performance and a clean UI to keep learners focused on what matters.

Web Design & Culture

Ice Tea — A Digital Journey into Tea Culture

Tea is not just a drink; it's a culture. This web experience was built to celebrate that heritage in a modern digital space.

Ice Tea Culture

The goal of Ice Tea Web was to create an atmosphere of tranquility. Through minimalist design, soft typography, and elegant imagery, the site invites visitors to explore premium tea varieties and the stories behind them.

Edu-Gaming

Game Vui Dòng Tiền — Financial Wisdom for Vietnam

Why is financial literacy so hard to teach? Because it needs to be fun. This game makes managing money an adventure.

Game Cash Flow

Inspired by the classic Cashflow game but tailored specifically for the Vietnamese market, this project helps players understand investment, saving, and wealth management through a fun, interactive simulation.

Fintech & Data Platform

GoldenStock — From a Beautiful Dashboard to a Production-Ready Stock Analytics Platform

GoldenStock was built to solve a practical need: open one screen and instantly understand company health, valuation, capital flow, and macro context—without jumping across fragmented data sources.

GoldenStock dashboard

Product goals

  • Fast and intuitive reading experience for end users.
  • Reliable behavior under unstable upstream APIs or rate limits.
  • Easy extensibility for new modules and production deployment.

Core technical challenges

  1. Inconsistent schemas across endpoints and data providers.
  2. Token/API key security with strict server-side handling.
  3. Upstream failures while keeping the UI responsive and stable.
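
One way to approach the first challenge is to map each provider's field names onto a single internal record before the data reaches the UI. The sketch below is illustrative only: the provider names, the field names, and the `Quote` shape are hypothetical and are not taken from GoldenStock or VNStock.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Quote:
    """Internal record the dashboard would consume (hypothetical shape)."""
    symbol: str
    close: float
    volume: int


# Hypothetical per-provider maps: internal field name -> provider field name.
FIELD_MAPS: dict[str, dict[str, str]] = {
    "provider_a": {"symbol": "ticker", "close": "closePrice", "volume": "vol"},
    "provider_b": {"symbol": "code", "close": "c", "volume": "v"},
}


def normalize(provider: str, raw: Mapping[str, Any]) -> Quote:
    """Convert one raw payload into a Quote, failing loudly on missing fields."""
    mapping = FIELD_MAPS[provider]
    missing = [src for src in mapping.values() if src not in raw]
    if missing:
        raise ValueError(f"{provider} payload is missing fields: {missing}")
    return Quote(
        symbol=str(raw[mapping["symbol"]]).upper(),
        close=float(raw[mapping["close"]]),
        volume=int(raw[mapping["volume"]]),
    )
```

Keeping the mapping in data rather than in branching code means a new provider only needs a new entry in `FIELD_MAPS`, at least as long as its payload is flat.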

Architecture

Browser → Next.js App → /api/vnstock proxy → FastAPI bridge → VNStock

I split the system into three layers: frontend for UX, proxy for secure token handling, and a FastAPI bridge for normalization, caching, and fallback strategies.
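
As a rough illustration of the caching and fallback idea in the bridge layer, the sketch below wraps an upstream fetch in a small time-to-live cache and reports whether the result is live or a fallback. It is a minimal sketch under stated assumptions, not the actual GoldenStock code: the `CachedSource` class, its parameters, and the choice to label stale cached data as `FALLBACK` are all hypothetical.

```python
import time
from typing import Any, Callable


class CachedSource:
    """Wrap an upstream fetch with a TTL cache and a fallback value."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: dict[str, tuple[float, Any]] = {}

    def get(self, key: str, fallback: Any = None) -> tuple[Any, str]:
        """Return (data, status) where status is "LIVE" or "FALLBACK"."""
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1], "LIVE"  # still-fresh copy of live data
        try:
            data = self._fetch(key)
        except Exception:
            # Upstream failed: serve the last known value if there is one,
            # otherwise the caller-supplied fallback.
            stale = hit[1] if hit is not None else fallback
            return stale, "FALLBACK"
        self._cache[key] = (now, data)
        return data, "LIVE"
```

Returning the status alongside the data lets the UI show where a value came from without the frontend having to know why the upstream call failed.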

What has been built

  • Unified dashboard: Price/Volume, Revenue, Net Income, Margin, ROE/ROA, valuation, and cash flow.
  • Integrated News + Macro widgets (USD/VND, gold, oil, interest rates).
  • Stock Screener by sector/P-E/RS with instant Excel export.
  • Multi-sheet chart export for offline analysis.
  • LIVE/FALLBACK data badges for transparent source status.

Results

  • End-to-end flow completed from UI to data bridge.
  • Strong resilience with caching and synthetic fallback data.
  • Dockerized and ready for production deployment.

Data Analytics & Big Data

From Pyramids to Big Data: Exploring the Professional Data Analysis Process

Did you know that the origins of data analysis date back to... Ancient Egypt? When building the majestic Pyramids, scribes used papyrus to record calculations and material statistics – that was the "primitive version" of the Excel spreadsheets we use today.

Big Data Fundamentals and Data Analysis Process
The data analysis process is the key to turning raw data into real value.

If you are starting your journey to become a Data Analyst, the book "Big Data Fundamentals: Concepts, Drivers & Techniques" by Thomas Erl is an indispensable "compass." Let's explore the valuable lessons on the data analysis process suggested by this book and top experts!

1. There is no "single formula" for every problem

Data is an endless flow, changing by industry and business scale. Therefore, the way we process it cannot be rigid. Depending on the goal, you can approach it according to different process models. Here are 4 popular models today:

  • Google: Extremely neat and memorable with 6 steps (Ask, Prepare, Process, Analyze, Share, Act). This is the standard process taught in the famous Google Data Analytics certificate.
  • EMC (Dell): Emphasizes the cyclic nature (Data Analytics Lifecycle). This process is not a straight line from A to Z, but a continuous loop for optimization.
  • SAS: An endless spiral model, highly focused on re-evaluating results after each cycle to improve the predictive model.
  • Big Data Fundamentals (Thomas Erl): A detailed and in-depth process of up to 9 steps, specifically designed for Big Data systems.

2. Highlights from the book "Big Data Fundamentals"

While other models try to streamline the process for accessibility, authors Thomas Erl, Wajid Khattak, and Paul Buhler choose to "dissect" the analysis process into 9 extremely detailed stages. The biggest difference lies in:

Meticulousness in data processing: Instead of just calling a step "Data Preparation", Thomas Erl divides this stage into 4 independent steps: Extraction, Filtering, Validation, and Cleaning. This is extremely important in the Big Data era, where raw data is often very noisy and unstructured.

High practical applicability: Emphasizes Aggregation and Data Representation before actually starting in-depth Analysis.

"This is the bedside book for those who want to understand deeply how Big Data operates – where a small error during data cleaning can lead to business decisions worth millions of dollars."

3. The 6-step "Golden" process for beginners

If Thomas Erl's 9-step process feels too complex when you are starting out, the 6-step process from the Google Data Analytics certificate is the perfect starting point:

  1. Ask: Clearly define the business problem and the question to be solved. You cannot find the right answer if you ask the wrong question.
  2. Prepare: Collect data from different sources and store it safely.
  3. Process: Clean the data, remove null values, duplicates, or deviations to ensure accuracy.
  4. Analyze: Use tools and statistical techniques to look for trends, rules, and hidden relationships in the data.
  5. Share: Visualize data (Data Visualization) through charts and present results clearly to stakeholders.
  6. Act: Turn the insights found into practical decisions and actions in the business.
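
To make the "Process" step more concrete, here is a minimal sketch in plain Python that removes null values, exact duplicates, and values that deviate strongly from the rest. The function name, the row layout, and the median-distance rule are assumptions chosen for illustration; real projects usually rely on a dataframe library and domain-specific rules.

```python
from statistics import median
from typing import Any


def clean(rows: list[dict[str, Any]], field: str, max_dev: float) -> list[dict[str, Any]]:
    """Drop rows with a null `field`, exact duplicates, and extreme deviations."""
    # 1. Remove rows where the field is missing or None.
    present = [r for r in rows if r.get(field) is not None]

    # 2. Remove exact duplicate rows, keeping the first occurrence.
    seen: set[tuple] = set()
    unique = []
    for r in present:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 3. Remove values further than max_dev from the median
    #    (a simple, assumed rule for spotting deviations).
    if not unique:
        return []
    mid = median(r[field] for r in unique)
    return [r for r in unique if abs(r[field] - mid) <= max_dev]
```

The median is used instead of the mean here so that a single extreme value does not shift the reference point it is being compared against.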

Conclusion

Data analysis is not just about dry numbers or complex lines of code. At its core, it is the art of curiosity and Data Storytelling. Whether you choose Thomas Erl's 9-step process or Google's lean 6 steps, the core of success still lies in the ability to ask the right questions and persistence in finding the answers.

#DataAnalytics #BigData #LearningData #ThomasErl #GoogleDataAnalytics

Finance & Wisdom

Market Truths Unveiled: 5 Life-Changing Mindsets from Warren Buffett to Make Your Money Work for You

Did you know that a single Class A share of Berkshire Hathaway, led by Warren Buffett, now costs over $600,000—equivalent to nearly 15 billion VND? Over the past 60 years, their return has reached a staggering 5.5 million percent.
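
To put that figure in perspective, a cumulative gain can be converted into a compound annual rate. The short calculation below takes the 5.5 million percent and the 60 years quoted above at face value; the helper name is only for illustration.

```python
def annualized_return(total_gain_percent: float, years: float) -> float:
    """Compound annual growth rate (as a fraction) implied by a cumulative gain."""
    growth_multiple = 1.0 + total_gain_percent / 100.0  # e.g. +100% means 2x
    return growth_multiple ** (1.0 / years) - 1.0


rate = annualized_return(5_500_000, 60)
print(f"{rate:.1%} per year")
```

Under those inputs the result is roughly 20% per year, which shows that the headline number comes from a steady rate sustained over decades rather than from any single spectacular year.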

The stock market is not a coin-toss game for gamblers. From the perspective of value investing, here are 5 "peak" mindsets from the Oracle of Omaha that can shatter illusions of quick wealth and help you build a real fortune.

Warren Buffett Value Investing Philosophy
"Price is what you pay, value is what you get."

1. Think Like a "Bargain-Hunting Housewife"

Warren Buffett has an iron rule: "Price is what you pay, value is what you get." Your mission is to buy stocks when their price is significantly lower than their intrinsic value.

It sounds easy, but 90% of investors do the opposite! When you go shopping for clothes or food and see a "Sale" sign, you rush in. Yet in the stock market, when good stocks are being sold off at bargain prices, the crowd panics and sells at a loss; when prices skyrocket to their peak, the same crowd excitedly jumps in. Stay sharp: the ultimate rule is to buy low and sell high!

2. Stop the Overnight Wealth Illusion: "Patience is a Virtue"

Many enter the market with a "grab-and-go" mentality, wanting to force a business to yield double-digit profits in just a few days. That is not investing; that is gambling! Rushing to get rich only leads to mistakes or even bankruptcy.

Buffett struck a decisive blow to this mindset with a witty but harsh analogy: "No matter how great the talent or efforts, some things just take time. You can't produce a baby in one month by getting nine women pregnant."

🔥 Practical Tip: If you're new, don't throw a massive amount of money in at once. Patiently test the waters with $20-$40 (500k-1M VND) every month for at least 3 to 4 years before increasing your capital.

3. Don't Try to Jump a 7-Foot Bar; Step Over 10 8-Inch Steps!

Do you think having a modest salary or not having a 160 IQ like Newton means you can't get rich? Wrong! In fact, even Newton once lost his fortune due to the madness of the stock market.

Buffett points out: "You don't need to do extraordinary things to get extraordinary results." Instead of trying to jump a 7-foot bar and risking a broken leg, walk through 10 steps, each only 8 inches high, in a steady manner—slow but sure. Opportunity is for everyone; you just need to consistently do the simple things well.

4. Master Your Emotions: When to be Greedy, When to be Fearful?

The classic mantra everyone knows: "Be greedy when others are fearful and vice versa." But the crowd usually gets it backwards!

You shouldn't blindly "buy the dip" just because a stock dropped for one session, nor should you panic-sell when the market just starts to recover. The secret lies in Valuation:

  • Be greedy and accumulate stocks when others are in absolute panic, provided that the asset's valuation is extremely cheap.
  • Be fearful and withdraw when others are overly euphoric, pushing valuations to ridiculously expensive levels.

5. Risk Management: "Only When the Tide Goes Out..."

In an uptrend, everyone claims to be an expert because almost everything they buy turns a profit. But only when the market crashes (the tide goes out) do you find out who has actually been keeping their money. Unskilled "surfers" who ignore risk management will be swept away completely.

🔥 Practical Tip: During "harvest season" when the market is at its peak of euphoria, know when to take profits to protect your gains. Move some of that profit into gold, safe bonds, or bank deposits. Why? So that when the market crashes and stocks go on "clearance sale," you have the cash ready to go "shopping."

"Investing is the only business where customers run out of the store when items go on sale."

CONCLUSION: DON'T WAIT TO BE RICH TO INVEST

If you rationalize that "My salary is only $200 a month, barely enough to live, how can I invest?", you will stay poor forever. Whether your income is high or low, discipline yourself to set aside a portion—even if it's just $5 or $10 a month—to start. Keep going, keep accumulating, and keep learning every day. Once you expand your knowledge, the most "insane" doors of opportunity will naturally open before your eyes!

Business & Philosophy

Zero to One: Don't Become an "Obsolete" Version of Someone Else

In the world of business and personal development, we are often spoon-fed a very safe mindset: Look at what others are doing successfully and do it again, but slightly better. Open a more beautifully decorated café, create a smoother app, or try to compete for 1% of a massive market.

Zero to One - Peter Thiel philosophy
Peter Thiel: "Competition is for losers."

But after finishing "Zero to One" by Peter Thiel, I realized that mindset is actually a trap. Peter Thiel threw a splash of cold water on traditional business teachings by declaring: "Competition is for losers."

The Paradox of Competition

We have been obsessed with competition since our school days. Grades, rankings, prestigious universities... it all trains us to become warriors excellent at defeating others to win identical rewards. The consequence is that when we enter the real world, startups also dive into tearing each other apart in bloody "red oceans," only for profits to be eroded to zero.

Reflecting on this, Thiel's philosophy of "Going from 0 to 1" is profound. Why open the 100th social network when you can create something that never existed before? True value does not lie in beating your competitors, but in reaching a monopoly by solving a unique problem that no one else can. Monopoly, in a positive sense, is the reward for those who dare to innovate.

Life is Not a Lottery

One of the chapters that made me think the most is "You Are Not a Lottery Ticket". Nowadays, people often attribute the success of Bill Gates or Mark Zuckerberg to luck, circumstances, or "being in the right place at the right time." Our generation is suffering from "indefinite optimism": believing the future will be better, but having no specific plan, only trying to accumulate many general skills to "play it safe."

Thiel reminds us: You cannot diversify your own life. You have only one life, one future. Applying "intelligent design" thinking instead of leaving it to the random "evolution" of circumstances is the only way to create quantum leaps.

A Lesson for Each of Us

Even if you don't dream of building a billion-dollar startup, the philosophy of "Zero to One" still applies perfectly to building a personal brand and career. Don't try to be a "well-rounded" but mediocre employee. Find your own "secret"—a niche field, a specialized skill that few care about, then become the best and most unique in that area.

"The path from 0 to 1 is a lonely one, going against the crowd. But as author Peter Thiel said, it's the only way we can not only make the future different but also make it better."

The future doesn't just happen; it's waiting for us to create it. Everything great starts with a first step, going from a round zero to reach the unique number one.

Business & Strategy

Don't Blame Technology, Customers Are the Ones "Disrupting" Your Business Model!

We live in an era where terms like "digital transformation," "AI," or "Blockchain" are ubiquitous. When an industry "giant" falls, financial articles immediately conclude: "They were too slow in adopting technology!" But is that really the truth? Nokia and Borders invested mountains of money in R&D and e-commerce and won countless innovation awards, yet still perished. Why?

Customer comparing prices inside an electronics retail store
Are your customers skipping the checkout line to buy online while in your store?

The Real Enemy Is Not Your Competitor's Tech

The book Unlocking the Customer Value Chain by Thales S. Teixeira delivers a heavy blow to the conventional thinking of business leaders: The real enemy is not the competitor's technology, but the shift in customer behavior.

The concept of "Decoupling" (Separating the value chain) in the book truly made me stop and reflect. Think about our own shopping habits. How many times have you walked into a high-end electronics store, admired a curved-screen TV, listened to the enthusiastic sales consultant, and then... pulled out your phone to order that exact model on Shopee or Amazon because it was cheaper?

This behavior (known as showrooming) completely severs the "product discovery" phase from the "payment" phase that traditional retailers spent immense effort building.

The Biggest Mistake of Legacy Companies

The greatest mistake long-standing companies make is designing their business based on the resources they possess, forcing the customer to play by their rules. They obsess over competitors (because competitors are few and easy to see), while forgetting to monitor millions of customers silently changing their preferences every day.

When customers realize they are spending too much "time" and "effort" (not just money), they will immediately leave when a small startup appears, offering a more convenient, decoupled option.

The Core Lesson: Clear the Path

So, what is the core lesson here? Innovation is not about adding cumbersome technological features. Innovation means clearing the path, removing the friction in the customer's experience. Instead of getting angry and trying to "trap" consumers with complex contracts, businesses need the courage to "proactively decouple" themselves.

Best Buy achieved this successfully when they stopped trying to be a pure retailer and pivoted to charging electronics brands a fee to turn their stores into premium exhibition spaces.

Finding Your Missing Link

You don't need to create an entirely new product to be successful. Sometimes, all you need to do is look at your industry's current value chain, find a loose link that is frustrating users, and provide a smoother solution.

So, in your industry, which "link" is costing your customers the most time and effort? That might just be the next goldmine waiting for you to uncover!

Research & Technology

Specialized Seminar: Advanced Technology Applications in Biomedical Science and Artificial Intelligence

On September 22, Thai Nguyen University of Information and Communication Technology (ICTU) successfully organized an important specialized seminar, marking academic and research collaboration with experts from the Department of Electronic Engineering and Computer Science at Hong Kong Metropolitan University (HKMU).

The event attracted the enthusiastic participation of leadership, faculty, researchers, and students from ICTU, creating a vibrant and highly practical academic exchange atmosphere.

The specialized seminar at ICTU with enthusiastic participation from faculty and students

✨ Core Content: The Intersection of Technology and Medicine

The seminar focused on two research areas that are among the world's top priorities:

1. Advancing 3D Ultrasound Tools for Carotid Atherosclerosis: Transforming Clinical Trials with Novel Biomarkers and Automation

Speaker: Dr. Shirley CHEN, Senior Lecturer, HKMU.

Dr. Shirley Chen presenting on advanced 3D Ultrasound technology

Key Highlights: Dr. Shirley Chen's presentation focused on integrating advanced 3D Ultrasound technology with Automation algorithms. This approach not only improves the accuracy and efficiency of medical imaging diagnostics but also provides novel biomarkers for evaluating and monitoring carotid atherosclerosis, thereby optimizing clinical trials.

2. Leveraging Quantum Machine Learning to Classify Biosignals for Disease Detection

Speaker: Assoc. Prof. Dr. Kevin HUNG King Fai, Head of Department, HKMU.

Assoc. Prof. Dr. Kevin Hung introducing Quantum Machine Learning

Key Highlights: Prof. Kevin Hung introduced the breakthrough potential of Quantum Machine Learning (QML). With its promise of processing complex, large-scale data faster than classical computers, QML is being applied to analyze biosignals. This research opens new pathways for early and accurate detection of various diseases and highlights the growing role of quantum physics and AI in modern medicine.

Discussion and exchange scene at the seminar
Enjoying and learning directly from leading experts.

🤝 The Importance of International Collaboration

The seminar served not only as a platform for sharing specialized knowledge but also as an opportunity to strengthen the collaborative relationship between ICTU and HKMU—a public university in Hong Kong renowned for its applied training and research programs.

The event successfully facilitated international academic connections, creating an environment for Vietnamese researchers, faculty, and students to directly access world-leading technological trends in Information Technology and Biomedical Science. This provides a solid foundation for developing high-quality human resources capable of integration and contributing to the nation's scientific and technological development.

Group photo of faculty, Dr. Lin, and students
The group photo captures the wonderful learning spirit and the close academic cooperation between ICTU and HKMU.

Student Life

My Journey as English Club President

This article brings together memories from my time as President of the English Club at ICTU. It was a journey filled with challenges, achievements, and unforgettable moments that shaped not just my leadership skills but also my personal growth.

1. Introduction - The Beginning

Stepping into the role of English Club President was both an honor and a significant responsibility. I remember the mix of excitement and nervousness that filled me during my first official meeting. Taking on this leadership position meant I would be responsible for organizing events, managing a diverse team, and representing our club at various competitions and activities.

The first official meeting was a moment I'll never forget. Standing in front of the members, I felt the weight of expectations but also the thrill of new possibilities. It was the beginning of a journey that would teach me invaluable lessons about leadership, teamwork, and personal growth.

First official meeting as English Club President
The Beginning Moment – Full of Passion and Expectation.

2. Building the Foundation

The early days were all about building connections and establishing a strong foundation for the club. Our early club meetings focused on team bonding activities that brought members closer together. We organized fun games and interactive sessions that not only improved English skills but also created lasting friendships.

Learning to lead and manage a diverse team was one of my first challenges. Each member brought unique perspectives and talents, and it was my responsibility to create an environment where everyone felt valued and motivated. Through these early activities, I began to understand the true meaning of collaborative leadership.

Team bonding activities and club meetings
Early club meetings with fun games and team bonding activities.

3. Major Achievements & Competitions

A. English Festival 2024-2025

One of the most memorable achievements was leading our team to compete at Thai Nguyen University. The English Festival 2024-2025 was a significant event that brought together English clubs from various universities. The competition was intense, but our team's dedication and preparation paid off.

We won the consolation prize, receiving 1,500,000 VND. But more than the prize money, the pride of representing ICTU on the red carpet was an unforgettable experience. Standing there with my team, I felt immense pride in what we had accomplished together.

B. Awards & Recognition Ceremony (2023-2024)

The Awards & Recognition Ceremony for the 2023-2024 academic year was a moment of celebration for our entire team. Receiving awards for club excellence was a testament to the hard work and dedication of every member. Standing with the entire team as we were honored was a moment that reinforced the importance of collective effort.

C. Olympic Tiếng Anh ICTU 2024

Organizing the Olympic Tiếng Anh ICTU 2024 was one of the largest responsibilities I undertook as President. This major English competition required months of planning, coordination, and teamwork. Seeing so many students participate and excel in the competition was incredibly rewarding. The event not only showcased the English skills of ICTU students but also strengthened the English learning community at our university.

D. STAR Awards 2024

Another significant achievement was our participation in the STAR Awards 2024. We competed in the "Young Generation - Healthy Lifestyle for a Healthy Heart" category, which required us to create meaningful content about health and wellness. Once again, our team's efforts were recognized with a consolation prize of 1,500,000 VND.

Presenting alongside talented teammates was an experience that taught me the value of collaboration and mutual support. Each team member contributed their unique strengths, creating a presentation that was both informative and engaging.

E. T.E.C Spell-Off Competition

The T.E.C Spell-Off Competition was a fun and challenging experience that tested our spelling skills and quick thinking. While it was a lighter competition compared to others, it provided valuable opportunities for members to practice and improve their English vocabulary in a competitive yet friendly environment.

Major competitions and achievements
Major events and competitions – where we connected and shone.

4. Special Events & Cultural Activities

A. Colourful ICTU 2025

Colourful ICTU 2025 was one of the most vibrant events we organized. This cultural celebration brought together international guests and students from various backgrounds. We organized interactive games and meaningful exchanges that celebrated cultural diversity and built a welcoming, inclusive community.

The event was a beautiful representation of how language and culture can bring people together. Seeing students from different countries and backgrounds connect through English and shared activities was truly inspiring. It reinforced my belief in the power of cultural exchange and inclusive leadership.

B. Vietnamese Teachers' Day Celebration

The Vietnamese Teachers' Day Celebration was a heartfelt event where we honored our mentors and advisors. Organizing this celebration was particularly meaningful as it allowed us to express our gratitude to the teachers who had supported and guided us throughout our journey.

The gift-giving and appreciation activities created a warm atmosphere that strengthened the bond between students and teachers. It was a reminder that leadership is not just about organizing events, but also about fostering meaningful relationships and showing appreciation to those who support us.

5. Growth & Lessons Learned

My time as English Club President was a period of tremendous personal and professional growth. One of the most significant improvements was in my public speaking skills. Leading meetings, presenting at competitions, and speaking at various events helped me become more confident and articulate in expressing my ideas.

Event management and organization became second nature to me. From planning small weekly activities to organizing major competitions, I learned to balance multiple responsibilities, manage timelines, and coordinate with various stakeholders. These skills have proven invaluable in both my academic and professional life.

Perhaps the most important lesson was about teamwork and collaboration. Leading a diverse team taught me that the best results come from leveraging each member's unique strengths. I learned to listen, delegate effectively, and create an environment where everyone felt empowered to contribute.

Handling pressure and responsibility was another crucial skill I developed. There were moments of stress and uncertainty, but these challenges taught me resilience and problem-solving. I learned that leadership is not about having all the answers, but about finding solutions together with your team.

6. Conclusion - Memories That Last

Looking back on my journey as English Club President, I am filled with gratitude for all the experiences, challenges, and achievements. The memories we created together – from the excitement of our first meeting to the pride of winning competitions, from organizing cultural events to celebrating with teachers – these are moments that will stay with me forever.

This journey was not just about leading a club; it was about personal transformation, building lasting friendships, and contributing to a community that values learning and growth. The skills I developed, the relationships I built, and the memories I created have shaped who I am today.

To all the members, advisors, and supporters who were part of this journey – thank you. These memories are not just mine; they belong to all of us who shared in this incredible experience. The English Club will always hold a special place in my heart, and I am proud to have been part of its story.

AI & Research

From Student to Discussion Partner: Immersing in AI with Dr. Lin (FCU) at ICTU

That afternoon at ICTU was truly a memorable milestone for me and my fellow technology enthusiasts. The specialized seminar, "Artificial Intelligence Research Experience Sharing: Embedded Devices and Image Processing," featuring Dr. Feng Cheng Lin from Feng Chia University (FCU), Taiwan, was not merely a lecture but a high-level dialogue about the future of AI.

Dr. Lin conveyed in great detail the technical challenges of "embedding" complex AI models into devices with limited memory and power. This is precisely the point of convergence between theory and practical application that I and many engineering students are constantly seeking.

Dr. Feng Cheng Lin passionately shares strategies for optimizing AI models, opening new directions for smart devices.

💬 The Dialogue Moment: Moving Beyond Textbook Theory

The part I anticipated the most—and the part I actively participated in—was the Q&A session. After learning about the techniques of Model Pruning and Quantization, I took the initiative to pose a question about the feasibility of deploying specific image processing algorithms in a real-time environment.

For me, asking a question was not just about satisfying a personal curiosity; it was an opportunity to validate that what we are learning in school can be directly applied to the most cutting-edge research.

Student actively engaging in Q&A session
Here I am, actively engaging in the Q&A session and representing the deep interest of ICTU students in the practical application of AI. That was the moment we transitioned from learners to discussion partners.

🤝 Academic Connection and Personal Vision

Directly participating in this discussion has given me a profound insight into future research directions. This seminar is clear proof of the importance of international cooperation in education and research.

Photo with Dr. Lin
Enjoying and learning directly from leading experts.

This is a valuable memory, a great motivation for me and my peers to continue striving in the field of AI.

The seminar is not only a place for knowledge sharing but also a bridge for a generation of Vietnamese engineers ready to tackle global technological challenges.

Group photo of faculty, Dr. Lin, and students
The group photo captures the wonderful learning spirit and the tight academic cooperation between ICTU and FCU.
Achievement

[Achievement Unlocked] Proud Winner of the Second Prize! 🥈

I am thrilled to share that my team and I have officially taken home the Second Prize at the finals of the "Developing AI-Integrated Applications in Education 2025" competition, hosted by the Faculty of Information Technology at ICTU.

Close-up of the Giải Nhì placard
The Second Prize (Giải Nhì) award - A proud moment of recognition for our hard work.

This journey has been incredible. Integrating Artificial Intelligence into education is more than just a technological trend; it's about creating practical solutions that truly enhance the teaching and learning experience. This award is a wonderful validation of our team's hard work and research over the past few months.

Team photo holding certificate on stage
Celebrating our achievement on stage with the certificate - a moment of pride and accomplishment.

I want to extend a huge thank you to the organizers for creating this intellectual playground, our mentors for their guidance, and especially my amazing teammates for their dedication. This achievement is a huge motivation for me to continue innovating in the EdTech space.

Large group photo with all participants
Group photo with all participants, organizers, and mentors - celebrating innovation in AI and Education.

Here's to the next milestone! 🚀

New Research

Honored to Attend and Participate in National Scientific Conference

I was honored to be invited to attend and participate in the National Scientific Conference: "Multi-document Summarization Solutions using Pre-trained Language Models and Deep Learning and Applications in Education".

The conference brought together researchers, academics, and professionals working on cutting-edge solutions in natural language processing, AI, and their applications in educational settings. It was an inspiring platform to learn about the latest developments in multi-document summarization techniques and their potential impact on education.

During the event, I had the opportunity to engage with research related to digital technology in architectural reconstruction and simulation at the Cat Tien National Archaeological Site. This research explores how digital technologies can be applied to preserve and reconstruct cultural heritage, particularly ancient architectural structures.

Photo with conference banner
Photo with the conference banner at the National Scientific Conference.

Here are some photos from the conference:

Group photo with conference delegates
Group photo with conference delegates and participants.
Photo with supervising Associate Professor
Photo with my supervising Associate Professor at the conference.

I'm grateful to the organizers for creating this valuable platform for academic exchange and to all the participants for the insightful discussions. This experience has been inspiring and motivates me to continue exploring innovative applications of technology in various fields.

Artificial Intelligence Strategy

CPMAI: The Key to Avoiding Failure in AI Project Management

Over 80% of AI projects fail. Not because of weak AI technology, but because our planning, management, and deployment processes are heading in the wrong direction.

PMI Certified Professional in Managing AI
The journey to mastering professional AI project management (CPMAI).

That is the reality the "Introduction: PMI Certified Professional in Managing AI (PMI-CPMAI)™" course opens with, and it is why I was hooked from the very first minutes. This article summarizes the core knowledge I took from it, for anyone working with AI, preparing to start an AI project, or simply wanting to understand why AI projects often "die young."

Why Aren't Traditional Methods Enough for AI?

Traditional project management methods, whether Agile, Waterfall, or CRISP-DM, were not designed to handle the unique characteristics of AI:

  • Emphasis on Data: AI learns from data, not from code. If the data quality is poor, no amount of code can compensate for it.
  • Non-linear Iterations: AI needs continuous loops, not just for feature improvements but also for data cleaning and re-labeling.
  • Operational Challenges: After "going live," AI needs to be monitored (MLOps) to avoid "model drift" (the phenomenon of models becoming outdated and less accurate over time).
  • Non-technical Barriers: Data ownership, bias, ethics, and ROI are often overlooked, leading to legal and business risks.
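
To make the "model drift" point concrete, here is a minimal sketch in Python of one possible monitoring check. The function names, the sample predictions, and the 5-point tolerance are illustrative assumptions, not part of CPMAI or of any particular MLOps tool.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Share of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def drift_detected(baseline_accuracy: float,
                   recent_predictions: Sequence[int],
                   recent_labels: Sequence[int],
                   tolerance: float = 0.05) -> bool:
    """Flag drift when recent accuracy falls more than `tolerance` below baseline."""
    recent = accuracy(recent_predictions, recent_labels)
    return (baseline_accuracy - recent) > tolerance


# Hypothetical numbers: the model scored 0.90 at launch,
# but only 7 of the last 10 labelled predictions were right.
launch_accuracy = 0.90
preds = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
truth = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
needs_retraining = drift_detected(launch_accuracy, preds, truth)
```

In a real project the check would run on a schedule against freshly labelled data, and a positive result would trigger the data cleaning and re-labeling loop described above.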

"CPMAI doesn't replace Agile or DevOps; it adds a layer of specific practices for AI that those methods are missing."

Thought Framework: The 7 Patterns of AI

Identifying the right pattern from the beginning helps clarify data requirements and risks. Here are the 7 main patterns as defined by CPMAI:

Conversational & Human Interaction

Chatbots, virtual assistants, voice AI, and advanced natural language interaction systems.

Recognition

Image recognition, facial recognition, object detection, and precise handwriting recognition.

Patterns & Anomalies

Identifying financial fraud, system failures, or any unusual behaviors in large data sets.

Predictive Analytics

Forecasting sales, market trends, predictive maintenance, and supply chain operational optimization.

Hyperpersonalization

Highly targeted personalized content recommendations and architecting flexible user experiences.

Autonomous Systems

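Self-driving cars, smart robots, drones, and fully automated operational systems.

Goal-Driven Systems

Game playing, resource optimization, and scenario simulation, typically learned through trial and error.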

The Core Method: 6 Iterative Phases

Instead of moving in a straight line, CPMAI organizes projects into an iterative cycle with Data at the center:

1

Business Understanding

Identify ROI, success criteria, and AI Go/No-Go evaluation.

2

Data Understanding

Inventory of the 4 Vs (Volume, Variety, Velocity, Veracity) and Bias assessment.

3

Data Preparation

Cleaning and Data Labeling — which often consumes 80% of project effort.

4

Model Development

Model training, hyperparameter tuning, and experiment tracking.

5

Model Evaluation

Technical evaluation (Accuracy/F1) and Business assessment (ROI/Adoption).

6

Model Operationalization

Operational deployment, setting up MLOps, and starting the continuous improvement loop.

Sustainable Foundation: Trustworthy AI

Technical success alone is not enough. For AI to truly integrate into life, it needs four pillars:

Ethical AI

Privacy & Fairness

Responsible AI

Clear Accountability

Transparent AI

Explainability (XAI)

Governed AI

Risk Control Processes

Conclusion: The Journey Has Just Begun

The PMI-CPMAI certification is not just a credential; it's a shift in management thinking. If you're working in AI or planning to enter the field, this framework is an invaluable asset for minimizing the risk of failure.

Experience Media

MC Experience at Thai Nguyen Newspaper and Television Broadcasting

I had the incredible opportunity to work as an MC for Thai Nguyen Newspaper and Television Broadcasting, where I embarked on a fascinating cultural journey exploring traditional Vietnamese heritage through various immersive experiences.

Experiencing Traditional Then Music

One of the most memorable experiences was learning about and experiencing the traditional Then music. Then is a unique form of folk music from the northern mountainous regions of Vietnam, characterized by its distinctive stringed instruments and melodic storytelling. I had the privilege of meeting with traditional performers who shared their knowledge and passion for this beautiful art form.

Experiencing traditional Then music
Learning about traditional Then music with local performers and cultural experts.
Then music performance at Thai Nguyen Radio and Television
Capturing the Then music performance during the MC program.
Traditional Then music cultural experience
Immersing in the traditional Then music cultural experience.

The performers, dressed in traditional attire, played their instruments with such skill and emotion, transporting us to a different time. It was a profound experience to witness this living cultural heritage and understand its significance in preserving Vietnamese traditions.

Exploring Tea Culture Space

Another highlight was experiencing the traditional tea culture space. Vietnam has a rich tea culture, and I had the chance to participate in a traditional tea ceremony in a beautifully designed wooden tea house. The setting was warm and inviting, with wooden paneling, traditional decorations, and shelves displaying various tea products.

Traditional tea culture space experience
Exploring the traditional tea culture space with its beautiful wooden design and cultural atmosphere.

During the tea ceremony, I learned about different types of Vietnamese tea, the proper way to brew and serve tea, and the cultural significance behind each gesture. It was a meditative and educational experience that deepened my appreciation for Vietnamese tea culture and its role in social bonding and hospitality.

Enjoying traditional tea ceremony
Participating in a traditional tea ceremony and enjoying the authentic tea experience.

Making Traditional Mooncakes

One of the most hands-on and enjoyable experiences was learning to make traditional mooncakes. Mooncakes are an essential part of the Mid-Autumn Festival in Vietnam, and making them from scratch was both challenging and rewarding.

Enjoying tea and making mooncakes
Combining the tea ceremony experience with traditional mooncake making.
Learning to make traditional mooncakes
Hands-on experience learning to make traditional mooncakes.

Under the guidance of experienced artisans, I learned the intricate process of preparing the dough, creating the fillings, and shaping the mooncakes with traditional molds. The process required patience and precision, but the result was delicious mooncakes that carried both flavor and cultural meaning. It was wonderful to connect with this traditional craft and understand its importance in Vietnamese celebrations.

Reflections on the Experience

Working as an MC for Thai Nguyen Newspaper and Television Broadcasting provided me with unique opportunities to explore and share Vietnamese culture. Each experience - from the melodic Then music to the serene tea ceremonies and the hands-on mooncake making - enriched my understanding of Vietnam's cultural heritage.

These experiences reminded me of the importance of preserving and promoting traditional culture in the modern world. As an MC, I had the privilege of bringing these cultural treasures to a wider audience, helping to bridge the gap between tradition and contemporary media.

I'm grateful to Thai Nguyen Newspaper and Television Broadcasting for this incredible opportunity and to all the cultural experts, performers, and artisans who shared their knowledge and passion with me. This journey has been both professionally enriching and personally meaningful.

Engineering

Clean Code Is Not Just for Seniors: 3 Years of Real-World Lessons

I once opened my own code a few months after writing it and thought, "Who wrote this mess?" After checking `git blame`, the answer was: me. This article is a battle-tested summary of Clean Code after 3 years in the field, for both fresh developers and seniors looking to reduce bugs, speed up reviews, and stop the production heartburn.

Clean Code cover by DavidHE
"Leave the campground cleaner than you found it." – Boy Scout Rule

1) What is Clean Code and why invest in it?

Clean Code is not just decoration. It's writing code so that others (and your future self) can read, modify, and extend it without breaking the system. Teams that only chase "working code" move fast for the first 3–6 months, then hit a wall as technical debt accumulates.

2) 5 principles to escape the "rotten code" trap

2.1 Meaningful names

Clear variable and function names are the cheapest yet most effective improvement. `elapsedTimeInDays` always beats `d`; `getFlaggedCells()` is better than `getThem()`.

  • Class names = nouns (`Customer`, `InvoiceProcessor`).
  • Function names = verbs (`calculateTotal()`, `saveOrder()`).
  • Avoid vague abbreviations and copy-paste naming patterns.
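
A minimal before-and-after sketch of the naming advice, written in Python (so the names use snake_case rather than the camelCase shown above). The `Cell` class and the `FLAGGED` status value are hypothetical, chosen only to make the example runnable.

```python
from dataclasses import dataclass

FLAGGED = 4  # hypothetical status code marking a flagged cell


@dataclass
class Cell:
    status: int


# Hard to read: what is "the_list", what does 4 mean, who is "them"?
def get_them(the_list):
    list1 = []
    for x in the_list:
        if x.status == 4:
            list1.append(x)
    return list1


# Same behavior, but the names carry the intent.
def get_flagged_cells(game_board: list[Cell]) -> list[Cell]:
    return [cell for cell in game_board if cell.status == FLAGGED]


board = [Cell(status=0), Cell(status=FLAGGED), Cell(status=FLAGGED)]
flagged_cells = get_flagged_cells(board)
```

Both functions return the same cells; only the second can be understood without opening its body.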

2.2 Small functions, single responsibility

A 100–200 line function doing multiple jobs is a bug factory. Break it into small, focused steps for readability, testability, and quick debugging.

  • Orchestration functions should be short; each step calls a meaningful sub-function.
  • Separate validation, calculation, persistence, and notification into distinct units.
  • Minimize unexpected side-effects within the same method.
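
The points above can be sketched as follows, assuming a hypothetical `OrderService` in Python. The in-memory lists stand in for a real database and mailer; none of these names come from a specific framework.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    customer_email: str
    prices: list[float]
    total: float = 0.0


@dataclass
class OrderService:
    saved_orders: list[Order] = field(default_factory=list)  # stands in for a database
    sent_emails: list[str] = field(default_factory=list)     # stands in for a mailer

    def place_order(self, order: Order) -> Order:
        """Orchestration only: each step is one small, named function."""
        self._validate(order)
        order.total = self._calculate_total(order)
        self._save(order)
        self._notify(order)
        return order

    def _validate(self, order: Order) -> None:
        if not order.prices:
            raise ValueError("order must contain at least one item")

    def _calculate_total(self, order: Order) -> float:
        return sum(order.prices)

    def _save(self, order: Order) -> None:
        self.saved_orders.append(order)

    def _notify(self, order: Order) -> None:
        self.sent_emails.append(order.customer_email)


service = OrderService()
placed = service.place_order(Order("an@example.com", [10.0, 5.5]))
```

`place_order` reads like a table of contents, and each helper can be tested or replaced on its own.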

2.3 Comments with purpose

Comments should not describe what code already shows. Good comments explain why, document business constraints, or warn of risks when changing code.

2.4 Clear error handling

Prefer exceptions over cryptic error codes; avoid returning `null` for collections. Clear APIs make it harder for callers to miss error handling.

  • The happy path should be easy to read.
  • Fail fast at the top of a function when input is invalid.
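
A small Python sketch of these rules, using a hypothetical `find_orders` lookup over in-memory data. The exception name and the sample IDs are assumptions made for illustration.

```python
class CustomerNotFoundError(Exception):
    """Raised when a lookup is made for an unknown customer."""


# Hypothetical in-memory data standing in for a real order store.
ORDERS_BY_CUSTOMER = {"c-1": ["order-17", "order-42"], "c-2": []}


def find_orders(customer_id: str) -> list[str]:
    # Fail fast: reject invalid input before doing any work.
    if not customer_id:
        raise ValueError("customer_id must not be empty")
    if customer_id not in ORDERS_BY_CUSTOMER:
        raise CustomerNotFoundError(customer_id)
    # Happy path: a known customer always gets a list, never None.
    return list(ORDERS_BY_CUSTOMER[customer_id])


order_count = len(find_orders("c-2"))  # safe: an empty list, not None
```

Because a customer with no orders yields an empty list, callers can loop over the result without a `None` check, while a genuinely unknown customer surfaces as an explicit exception.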

2.5 Tests are the safety net for refactoring

Without tests, refactoring easily breaks existing behavior. Writing unit tests in Red → Green → Refactor cycles lets you change faster over time.
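
As a minimal illustration in Python, the tests below pin down the behavior of a hypothetical `calculate_total` function. They are plain functions with `assert` statements, which a runner such as pytest can collect. Once they pass, the body of `calculate_total` can be rewritten freely as long as they stay green.

```python
def calculate_total(prices, discount_rate=0.0):
    """Sum the prices, then apply a fractional discount (0.1 means 10% off)."""
    subtotal = sum(prices)
    return round(subtotal * (1 - discount_rate), 2)


def test_sums_prices_without_discount():
    assert calculate_total([10.0, 5.5]) == 15.5


def test_applies_discount_rate():
    assert calculate_total([100.0], discount_rate=0.1) == 90.0


def test_empty_order_costs_nothing():
    assert calculate_total([]) == 0
```

In a Red → Green → Refactor cycle, each test would be written first and seen to fail before the corresponding line of `calculate_total` is added.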

🔥 Real-world insight from 3 years:

Don't wait for a "big refactor." Every time you touch code, improve one small thing: rename a variable, split a long function, add a test case. After a few sprints, your codebase transforms.

3) Roadmap for small engineering teams

  • Months 1–2: Workshop + shared review guidelines.
  • Months 2–4: Linter/formatter + quality gates in CI.
  • Months 4–8: Pilot module with TDD + refactor budget each sprint.
  • Ongoing: Make code quality part of Definition of Done.

Conclusion

Clean Code is not a destination but a journey. Each commit that leaves the code a bit cleaner helps your team cut bugs, speed up reviews, ease onboarding, and build sustainably.

Strategy

Why 89% of AI Investments Fail: A Developer's Guide to System Thinking

Hello, I'm a software developer. After many years of grinding with code, moving from bug to bug, I've realized one harsh truth: being good at technology doesn't guarantee creating value. You could write AI code that's 99.9% accurate, but when you throw it into a company, it creates nothing – or worse, breaks the entire system.

Sound familiar? It's exactly like writing a perfect string-processing function that, when integrated into a 20-year-old legacy codebase, unleashes dozens of bugs you've never seen.

I just finished the book "Power and Prediction: The Disruptive Economics of Artificial Intelligence" by professors from the University of Toronto. This book is like a "debugging manual" for corporate AI strategy. It explains why AI – hailed as "the new electricity" – is failing to generate the massive profits everyone predicted.

This blog post is my notes and translation from academic economics into the language of a dev who loves sharing. I'll keep the spirit of the book, but explain it through concepts we developers know: technical debt, refactoring, APIs, microservices, legacy systems, and the like.

Why should you read this?
Because whether you're a dev, BA, PM, or CTO, misunderstanding AI's true nature will burn your budget on "point solutions" that go nowhere. I'll show you where the real gold mine actually is.

Part 1: The AI Paradox – Like Code That Works Locally But Dies in Production

The "miraculous" promise vs. harsh reality

You've probably heard statements like: "AI is greater than electricity and fire" (Sundar Pichai) or "AI will generate $13 trillion in economic value" (McKinsey). Mind-blowing, right?

Then you saw AlphaGo defeat the world's #1 Go player. GPT-4 passed the US bar exam (top 10%). AI reads lung X-rays better than experienced doctors. The technology is amazing.

But here's the plot twist... According to McKinsey's 2021 survey, only 11% of companies said they saw clear financial benefits from AI investment. Everyone else? Stuck with proof-of-concepts, unscalable projects, or terrible ROI.

Sound familiar? Like when you write a brilliant AI script with 99% accuracy in Jupyter, then integrate it into a production backend – and it crawls like a turtle, eats memory like a monster, or worse, crashes the database.

So: Why hasn't AI created more economic value?

Agrawal, Gans, and Goldfarb (call them AGG for short) identified a fundamental strategic mistake: Most companies are using AI as a "point solution" (patch), when true value comes from redesigning the entire system.

In dev terms: Don't just bolt an AI API onto legacy code and call it "digital transformation." You need to refactor the entire architecture – or even rewrite from scratch – for AI to unleash its real power.

The mystery of the unicorn from the backwoods

Here's an interesting story: The authors once predicted Canada's first AI unicorn would emerge from tech hubs like Montreal or Toronto, places with thriving AI ecosystems.

But it came from St. John's, Newfoundland – where AI probably exists only in science fiction. Company name: Verafin, specializing in financial fraud detection. In 2021, Nasdaq bought it for $2.75 billion.

Why? Because Verafin wasn't a "pure AI" company doing fancy magic tricks. They solved one specific problem that the existing system was ready for: detect fraudulent transactions. Banks already had rich data, standardized processes, and clear "fraud or not" decisions independent from other decisions (like credit approval or card review).

Verafin was a perfect point solution. And that's the "low-hanging fruit" to harvest.

Lesson 1: Don't dream of building an "omniscient AI system" right away. Find "independent" problems in your existing system where you can apply AI like a simple API without needing to rebuild everything.

Part 2: "The Between Times" – The Awkward Era Between Proof-of-Concept and Production

A historical lesson: Three entrepreneurs in the age of electrification

To understand this "awkward era," AGG tell a powerful parable about electrification in the late 1800s.

Entrepreneur #1: Replaced steam engines with electric motors, but kept the old factory design. Machines still centered around one main drive shaft. Result: slight energy savings, minimal productivity gain. This is a "point solution."

Entrepreneur #2: Attached small electric motors directly to each machine. Now each can run independently at its own speed. Productivity jumps significantly. This is an "application solution."

Entrepreneur #3: Realized electricity decoupled energy source from location. Why keep machines around one central shaft? They completely redesigned the factory based on process logic and material flow. Productivity soared hundreds of percent. This is a "system solution."

But here's the critical detail: between the arrival of commercial electricity (late 1880s) and the actual productivity boom (1920s) lay a lag of roughly 30–40 years. That's "The Between Times" – when the technology is ready, but organizations and processes aren't.

We're in "The Between Times" of AI right now. AI is technically mature enough. But infrastructure, processes, skills, and especially mindsets still lag.

Three levels of AI solutions: From scripts to full microservice architecture

Let me map these three levels to dev terminology so you get the picture.

1. Point Solution – "Write a small script handling one specific task"

  • Trait: Improves one existing decision without (or with minimal) impact on others.
  • Examples: Verafin detecting fraud, automated credit scoring, AI spell-checker.
  • In code: Like writing an independent function `detect_fraud(transaction)` that returns `True`/`False`, unconnected to the bank's main transaction flow.
  • Pros: Easy to build, low risk, can plug into legacy systems immediately.
  • Cons: Value created isn't huge (10–20% improvement max).
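
A point solution of this kind might look like the Python sketch below. The `Transaction` fields, the scoring logic, and the 0.5 threshold are all made up for illustration: `fraud_score` is a stand-in for a trained model, not how Verafin or any real system works.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float
    country: str
    home_country: str


def fraud_score(transaction: Transaction) -> float:
    """Placeholder for a trained model's predicted fraud probability."""
    score = 0.05
    if transaction.amount > 5_000:
        score += 0.5
    if transaction.country != transaction.home_country:
        score += 0.3
    return min(score, 1.0)


def detect_fraud(transaction: Transaction, threshold: float = 0.5) -> bool:
    """Point solution: one yes/no answer, no other system has to change."""
    return fraud_score(transaction) >= threshold


suspicious = detect_fraud(Transaction(amount=9_000, country="BR", home_country="CA"))
routine = detect_fraud(Transaction(amount=40, country="CA", home_country="CA"))
```

The caller only ever sees a boolean, which is why this kind of function can be plugged into a legacy flow without redesigning it.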

2. Application Solution – "Create entirely new functionality"

  • Trait: Creates new decisions previously impossible, but doesn't require surrounding system changes.
  • Example: Amazon's "ship-before-you-click" – predicting what you'll buy and shipping to nearby warehouses before you order.
  • In code: Like adding a new API endpoint `predict_next_purchase(user_id)` that returns product recommendations, calling into existing logistics systems.
  • Pros: New experience, significant value boost.
  • Cons: Still constrained by legacy architecture.

3. System Solution – "Refactor entire architecture"

  • Trait: Simultaneously changes many interdependent decisions. You can't do it piecemeal.
  • Examples: Redesigning a factory in the electricity era. Building an entirely digital-first bank without any paper legacy.
  • In code: Like abandoning the "monolith with a central shaft" design and transitioning to full microservices: message queues, event sourcing, CQRS, the works.
  • Pros: Value is massive (10x multiplier possible).
  • Cons: Extremely hard, requires cross-functional sync, easy to fail without proper systems thinking.

Lesson 2: Most companies are stuck at levels 1 and 2, while level 3 is where sustainable competitive advantage lives.

Part 3: Rules vs. Decisions – Hardcode or Calculate On-the-Fly?

In any organization, there are two ways to operate:

  • Decision: Every time, gather info → analyze → judge. High cost per instance, optimal per case.
  • Rule: Set a fixed rule upfront: "If A, then B." High upfront cost (thinking, debate, policy writing), but very cheap to apply.

In code: You can hardcode `if age < 25: price = 1000 else: price = 500` (rule). Or call an AI model using hundreds of variables (decision).
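
Spelled out as a runnable Python sketch, the contrast looks like this. The pricing formula, the average claim cost, and the margin are hypothetical numbers chosen for illustration. In practice the claim probability would come from a trained model rather than being passed in by hand.

```python
def price_by_rule(age: int) -> int:
    """Rule: decided once, applied cheaply to everyone."""
    return 1000 if age < 25 else 500


def price_by_decision(predicted_claim_probability: float,
                      average_claim_cost: float = 20_000,
                      margin: float = 1.25) -> int:
    """Decision: priced per customer from a model's prediction."""
    expected_cost = predicted_claim_probability * average_claim_cost
    return round(expected_cost * margin)


# Two 22-year-olds: identical under the rule, different under the decision.
rule_price = price_by_rule(22)
careful_driver_price = price_by_decision(0.01)
risky_driver_price = price_by_decision(0.06)
```

The rule needs one input and no infrastructure. The decision needs a prediction for every customer, which is exactly the cost that cheap prediction removes.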

Why do organizations love rules? Because the cost of decisions (data gathering, processing, analysis, accountability) is brutal.

AI changes this equation. When prediction cost drops near zero, organizations will shift from "apply rules" to "make data-driven decisions in real-time."

Classic example: Car insurance.

  • Old rule: 25-year-olds pay premium X, 30-year-olds pay Y, based on group statistics.
  • New decision: Track real driving behavior every minute via telematics. AI predicts individual risk. Premiums adjust in real-time.

But careful! Rules aren't just cost-savers. They're the "glue" holding the system together.

Rules as glue – Don't rip them out carelessly

The Pandora (music streaming) story is instructive. They had a rule: free users hear ads every 3 songs, 30 seconds each. Doesn't sound optimal. Why not use AI to insert ads when users are most bored and most likely to click?

But that "rigid" rule was the glue binding:

  • Advertiser contracts (you buy X impressions, get X impressions – guaranteed).
  • User expectations (they know exactly when they'll be interrupted).
  • Freemium model (users know what they're paying to avoid).

Changing it isn't just a code change. It rewrites business models, contracts, and user behavior.

Lesson 3: Before using AI to replace rules, ask: "What else is this rule doing?" If the answer is anything other than "nothing else," you might break something critical without knowing.

Hidden uncertainty – The stuff not on your Jira board

Organizations build tons of "scaffolding" (buffers, queues, backup procedures) to cope with uncertainty:

  • Incheon airport has massive buffering systems for check-in delays, security, baggage handling.
  • Hedgerows in England mark boundaries but also block wind, prevent erosion, shelter livestock.

When you deploy AI, you might think you're "removing uncertainty" and thus can "cut the scaffolding." But if you don't understand the hidden functions, disaster strikes.

Example: A farmer removes the hedgerows because AI now forecasts the weather more accurately, so wind protection seems unnecessary. A freak storm hits. Crops destroyed. AI can't predict "freak."

Lesson 4: Use AI to reduce uncertainty, but don't greedily strip all scaffolding. Ask: "Why do we do this?" If "because we can't predict X," then X is AI's opportunity. If "because regulations require it" or "because multiple parties depend on it," tread carefully.

Part 4: The Great Decoupling – AI Doesn't Replace You; It Separates Prediction from Judgment

This is probably the book's most important and most misunderstood part.

Prediction vs. Judgment – Like data vs. loss function

We often think AI will "make decisions for humans." Wrong. AI has no power. AI has no responsibility. AI has no values of its own.

What AI does: Prediction. "What's the probability of outcome Y, given data Z?"

Everything else – Judgment – belongs to humans. "Given that probability, what should I do? What am I willing to trade off?"

Classic example: Michael Jordan and team owner Jerry Reinsdorf.

  • Medical prediction: 10% chance of career-ending re-injury if Michael returns early.
  • Michael's judgment: "The benefit of competing (I'm Jordan, I want to score, I want championships) outweighs 10% risk. I play."
  • Reinsdorf's judgment: "The risk of losing our biggest asset is too high. I veto."

Same prediction. Different judgments. AI can't bridge that gap – it's a values and risk appetite question.

In ML terms:

  • Prediction = model output (probability).
  • Judgment = loss function (how much you weigh Type 1 vs. Type 2 errors) + utility function (what outcome you prefer).
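
The split can be sketched in a few lines of Python. The 10% probability is the shared prediction from the example above. The cost and benefit figures are hypothetical valuations in arbitrary units, invented only to show how the same prediction produces opposite decisions.

```python
def expected_loss_of_playing(p_injury: float, cost_if_injured: float) -> float:
    """Prediction (p_injury) weighted by judgment (cost_if_injured)."""
    return p_injury * cost_if_injured


def decide(p_injury: float, cost_if_injured: float, benefit_of_playing: float) -> str:
    """Play only when the valued benefit exceeds the expected loss."""
    loss = expected_loss_of_playing(p_injury, cost_if_injured)
    return "play" if benefit_of_playing > loss else "sit out"


P_REINJURY = 0.10  # the shared prediction

# Hypothetical valuations: the two judges differ on values, not on the prediction.
player_choice = decide(P_REINJURY, cost_if_injured=50, benefit_of_playing=10)
owner_choice = decide(P_REINJURY, cost_if_injured=500, benefit_of_playing=10)
```

A model can supply `P_REINJURY`, but `cost_if_injured` and `benefit_of_playing` have to come from whoever owns the decision.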

Before AI: Prediction and judgment were "baked into" one person (credit officer, doctor, lawyer). You couldn't separate them.

After AI: You can decouple. AI does prediction. Humans (or an algorithm) do judgment. Better: you can assign judgment to whoever judges best – not necessarily the predictor.

Example in banking: Traditionally, a loan officer both estimated default probability (prediction) and decided "approve or deny" (judgment). Now? An AI model gives the probability. A different algorithm (if simple) or a credit committee (if complex) makes the decision. This creates transparency and better risk control.

AI has no power – humanity's baseless fear

Many fear AI will seize power and dominate humans. AGG say: Nonsense. AI has no power. It's just a tool.

Power resides with: system designers, code writers, people setting optimization objectives.

Example: "Amazon's AI fired workers automatically." People rage: "So heartless!" But really? The AI followed the judgment that engineers and managers hardcoded: which KPIs matter (packing speed, break time), their weights, the firing thresholds. The AI just executed.

The problem isn't AI having power. It's power becoming diffused and murky. When an employee is fired by algorithm, who do they sue? The engineer? The manager? The CEO? This is a huge governance gap the book doesn't fully solve.

"New Judges" – Who's losing decision power?

Flint, Michigan provides a textbook case. Its water was contaminated with lead from old pipes. A University of Michigan team built an AI model that predicted where the lead pipes were with about 80% accuracy. Excellent.

Initially, the city followed AI. Teams dug where AI pointed. 80% found lead pipes. Fantastic efficiency.

Then local politicians intervened: "Dig evenly across all neighborhoods – don't favor this area over that, or voters complain." Result: lead pipe discovery rate plummeted to 15%. Budget wasted.

AI didn't "decide." But it exposed the political judgment as it happened and quantified its cost: under "equal distribution," about 85% of digs found no lead pipe, compared with about 20% when crews followed the model.

Eventually, courts intervened and forced the city to follow AI recommendations. Budget allocation power shifted from politicians to algorithms. AI became the new judge.

The same is happening across industries:

  • Google Maps/Waze: AI traffic routing strips city planners of flow control.
  • Credit scoring: AI strips loan officers of approval authority.
  • Hiring: AI CV screening strips recruiters of initial gatekeeping.

Strategic lesson: The strongest resistance to AI doesn't come from tech fears. It comes from people losing decision power. A radiologist won't resist AI because it diagnoses wrong – they resist because 20 years of hard-won expertise and authority are threatened. To deploy AI successfully, you must redesign roles, create new "high-judgment" positions for affected people.

Part 5: System Design – Tight Coordination or Modular Decoupling?

The bullwhip effect in AI

Optimize one part of a system with AI, and you create new uncertainty elsewhere. Like the bullwhip effect in supply chains: a tiny fluctuation at the origin can explode into massive shocks downstream.

Example: You use AI to speed up order processing. Logistics can't keep pace → warehouses overloaded → shipments delayed. AI optimized locally, broke globally.

Two design strategies exist:

1. Tight Coordination – The "rowing crew" model

  • All parts must sync perfectly. Eight rowers + one coxswain in perfect rhythm.
  • Pros: Maximum efficiency, full AI potential.
  • Cons: A "brittle" system that breaks under shocks. Every small upgrade changes everything. High coordination costs.

2. Modularity – The "restaurant" model

  • Add buffers (queues, inventory) between modules. Kitchen and dining room operate at different rhythms. Order tickets are the buffer.
  • Pros: Flexible, expandable, shock-resistant.
  • Cons: Extra cost for buffers (inventory, wait time), non-optimal overall.

Amazon is a masterclass in modularity. They use AI to forecast demand and pre-position inventory near users. This creates a "buffer layer" of stock, decoupling buying decisions (long-term optimization) from delivery decisions (real-time reaction).

Lesson 5: No single strategy is "right." It depends on your cost structure and volatility. Good architects mix both at different layers.

Part 6: The Blind Spots & Broader Debates

AI bias – Mirror, not monster

Media screams: "AI is racist! Sexist!" Standard solution: "Clean the data" or "add fairness constraints." AGG say: That's superficial.

They cite MIT economist Sendhil Mullainathan's resume study (with Marianne Bertrand): identical resumes were sent out with "white-sounding" vs. "Black-sounding" names, and the white-sounding names got 50% more callbacks. AI? No, that's human bias in hiring. AI can quantify it and force organizations to confront the ugly truth.

AGG's take: Bias isn't AI's fault. It's society's. AI is the mirror forcing us to see. Instead of "hiding" the mirror, fix the root problem.

(I'd note: This view is optimistic. In practice, powerful groups use lobbying to resist change, and some biases are subtle enough that AI itself can't detect them. But the principle is sound.)

The big gap: Public sector governance

The book mostly covers private enterprise. But the most consequential decisions live in public sectors: healthcare, education, transport, justice. There, market mechanisms (profit) don't coordinate everyone. You need government as system orchestrator.

Example: Electronic health records. Single hospitals gain little value from their own data. Real value emerges only with national integration. No hospital will pay for nation-wide interoperability alone – it's a collective action problem. Government must set standards, mandate sharing, protect privacy legally.

This governance gap is massive, and the book underaddresses it.

The leapfrog opportunity for developing nations

Here's a point for Vietnamese readers: Developing countries can "leapfrog" straight to system solutions.

Example: Fintech in Vietnam (Momo, VNPay, ZaloPay) was born digital. No legacy banking baggage. They integrated AI fraud detection, credit scoring, personalization into their foundation-level architecture. Compare to Western banks retro-fitting AI onto 30-year-old COBOL – they're stuck in "point solution" phase.

Opportunity: If you're building new systems in Vietnam, don't mimic the West's "old way." Think system-level AI from day one.

But there's a dark side: Much of the "scaffolding" in developing countries doesn't exist for economic reasons – it's political. Complex approval layers protect interests. When AI makes processes transparent, it threatens power structures. Deploying system-level AI demands not just technical vision but political will. That's the hardest part.

Part 7: Conclusion & Message for Developers

Three iron-clad lessons

  • Lesson 1: Don't confuse "point solution" with "system solution." Adding an AI API to legacy code isn't transformation – it's a band-aid. Real value demands refactoring or rewrites. The discomfort is temporary. The advantage is massive.
  • Lesson 2: AI decouples prediction from judgment. Use this to distribute decision-making to the best judges and create transparency. But prepare for resistance from people losing power.
  • Lesson 3: Understand system "glue" before cutting it with AI. Ask why rules exist. If it's coordination or hidden uncertainty, it might serve a purpose AI doesn't see.

What skills matter in the AI era? (For developers)

If AI does more "prediction," what's left for humans (especially developers)?

Three skill clusters emerge:

  • Judgment: Valuing outcomes, trading off in murky ethical/social situations. AI can't do this. Humans – especially thoughtful ones – will.
  • System design: Seeing organizations as networks of interdependent decisions. This is "software architecture" but at company scale. You need tech + process + people understanding.
  • Change management: When AI reshuffles decision power, people resist. Ability to redesign roles, rebuild meaning, navigate power conflicts becomes crucial.

The real edge over the next 10 years? It's not how well you code AI models (Google and Meta will beat you). It's not how much data you have (big tech already won). It's your ability to see systems, redesign them, and have the backbone to dismantle outdated "glue."

That's not just a CEO skill. That's a systems architect skill. That's a developer's skill.

Remember: Code is written by humans. Systems are designed by humans. Power is wielded by humans. AI is a tool. Its potency depends on whose hands hold the hammer.

Closing thought: Think systems. Don't think points.


References (if you want to dive deeper):

  • Agrawal, Gans, Goldfarb (2018). Prediction Machines – First book, focuses on "AI cuts prediction costs."
  • Agrawal, Gans, Goldfarb (2022). Power and Prediction – This one.
  • David, P. A. (1990). "The Dynamo and the Computer" – About electrification's "Between Times."
  • Mullainathan, S. (2018). "Bias in Algorithms" – Using AI to reveal human bias.


Engineering

From MySQL Chaos to System Thinking: My Journey with Designing Data-Intensive Applications

Hello, I'm a back-end developer who's been in the trenches for a few years. When I first graduated, I thought "choosing a database" was life's hardest question. PostgreSQL or MySQL? Or the "trendy" ones like MongoDB, Cassandra? Then Kafka, Redis, Elasticsearch came along... The more I learned, the more I realized I... understood nothing. Each tool had its own "cool factor," but mixing them into one system was like mixing different spices – without knowing the proportions, you end up with a dish nobody wants to eat.

Then I found "Designing Data-Intensive Applications" (DDIA) by Martin Kleppmann. And... that book changed how I think about code, data, and a developer's responsibility.

But honestly, reading this book is no joke. It's thick, academic, and I had to re-read many sections over a dozen times. So I wrote this blog post to "translate" DDIA's essence in a developer's voice, for those just starting out or lost like I was.

My goal: explain clearly, avoid being too academic, but still deep enough that you appreciate each architecture choice. This post is long (~20,000 words), but I broke it into chunks with real examples and a bit of humor (because dev life is stressful). Think of it as sitting down over coffee and listening to my story.

Part I: The Foundation – 3 Things Every System Needs (But Few Understand Correctly)

Before talking about distributed systems, sharding, streaming... Kleppmann starts with the most basic concepts. He redefines three things I thought I understood:

1. Reliability – "Surviving failures doesn't mean they won't happen"

Many think "reliable" means a system never fails. Wrong. In reality, failures are inevitable. Disk errors, network hiccups, code bugs, or devs like us accidentally running rm -rf (ever been scared of that?). So what does "reliable" actually mean?

  • Fault (an issue): A disk read error. A crashed process. A slow query.
  • Failure (the real disaster): The entire system stops serving.

The goal isn't to eliminate faults (impossible). It's to build fault-tolerant systems – when a fault occurs, the system keeps running (maybe slower, but doesn't die outright).

Classic example: how Twitter handles posts from celebrities with millions of followers. For most users it uses "fan-out on write": each new tweet is pushed immediately into every follower's home timeline. For mega-accounts with tens of millions of followers, that fan-out would overwhelm the system. So they compromise: for regular users, fan-out on write; for celebrities, merge on read. Strictly speaking, extreme load is not a fault, but the mindset is the same: assume the stress will happen and design so it never becomes a failure.

Lesson for devs: Don't try to build the "perfect" system. Design so that when (not "if") something goes wrong, the system knows how to recover. Netflix even intentionally kills servers (chaos engineering) to check. We don't need to be that extreme, but at least have monitoring, alerting, and rollback capability.

2. Scalability – "Scaling" isn't a yes/no question

Asking "Is this system scalable?" lacks context. Ask instead: "If load grows 10x, what do we do?"

Kleppmann introduces load parameters – numbers that characterize your system's load:

  • Web server: requests/second
  • Database: read/write ratio
  • Chat: concurrent users

Twitter's classic example: They have two load parameters:

  • Posting tweets: ~4.6k requests/sec on average (over 12k at peak)
  • Reading home timeline: ~300k requests/sec

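Approach 1, do the work at read time: each home timeline read looks up everyone the user follows and merges their recent tweets. With 1,000 follows, that means merging tweets from 1,000 accounts on every read. At ~300k timeline reads/sec, that's way too slow.

Approach 2, do the work at write time: each tweet is "pushed" into every follower's timeline. For a celebrity with 30 million followers, one tweet becomes 30 million writes – not feasible in time.

Twitter's solution: hybrid. Most users' tweets are fanned out on write. Celebrities' tweets are fetched separately and merged in at read time. Trade-offs, trade-offs, and more trade-offs.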

Critical point about measuring performance: Never use the average alone. Always use percentiles (p95, p99, p99.9). If p99 latency is 1 second, 1% of requests are slower than that. At 1000 requests/sec, 10 requests/sec are slow – not negligible.

Tail latency amplification is a nightmare: if one request calls 10 backend services and each call has a 1% chance of being slower than its p99, then (assuming the calls are independent) the chance that at least one of them is slow is 1 - 0.99^10, about 10%. With 100 calls it rises to about 63%. This is one reason Amazon monitors p99.9.
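The arithmetic behind tail latency amplification is short enough to check yourself. This is a back-of-the-envelope sketch that assumes backend calls are independent and that the request waits for all of them; the function name is made up for illustration.

```python
def p_at_least_one_slow(n_calls: int, percentile: float = 0.99) -> float:
    """Chance that at least one of n independent calls exceeds the given percentile."""
    return 1 - percentile ** n_calls


for n in (1, 10, 100):
    print(f"{n:>3} calls -> {p_at_least_one_slow(n):.1%}")
# Prints:
#   1 calls -> 1.0%
#  10 calls -> 9.6%
# 100 calls -> 63.4%
```

So a latency that only 1 in 100 backend calls hits becomes something roughly 1 in 10 user requests hit once you fan out to 10 services.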

3. Maintainability – "Someone else's code" doesn't have to be a curse

Everyone's had to maintain a system "someone left behind." It feels like reading code with... your eyes closed. Kleppmann breaks maintainability into three aspects:

  • Operability: Good monitoring? Can you debug failures? Does new deployment cause downtime?
  • Simplicity: Does the project have excessive "accidental complexity" – complication not from the problem itself but from bad design choices? Abstraction is the strongest weapon against this.
  • Evolvability: Easy to add new features? Can you change schema without breaking everything?

Note: These three often conflict. Optimizing performance complicates code (reduces simplicity). Highly distributed systems (scalable) are harder to operate. Your job is knowing which matters most in your current context.

Part II: Data Models – SQL, NoSQL, and the "War with No Winner"

Kleppmann doesn't pick sides. He analyzes so you see each model's strengths.

SQL (Relational) – The Old Soldier Still Standing

  • Strengths: Flexible joins, data integrity constraints, clear schema (schema-on-write).
  • Weaknesses: When data is hierarchical (JSON, nested objects), joins become painful. Often need to denormalize (duplicate data) to avoid joins – but then consistency is hard.

Document (NoSQL) – Young and Flexible, But Not a Full Replacement

  • Strengths: Stores nested structures naturally. Flexible schema (schema-on-read). Good data locality.
  • Weaknesses: Many-to-many relationships become awkward. Either denormalize (duplicate data) or self-join at application level (complex, error-prone).

Historical irony: Before SQL, people used hierarchical models (like today's NoSQL). They couldn't handle many-to-many well, so Codd invented relational. Now NoSQL "circles back" – not to replace, but to solve specific problems (scale, flexibility).

Graph – When Everything Relates to Everything

Graph databases (Neo4j, Amazon Neptune) are king for:

  • Social networks (friend-of-friend-of-friend...)
  • Finding shortest paths
  • Fraud detection (related accounts)

Query languages like Cypher let you write intuitive path traversals. In SQL, finding friends-of-friends to depth 3 requires ugly recursive CTEs. In graph: (person)-[:FRIEND]->(friend)-[:FRIEND]->(friend_of_friend) – crystal clear.

My takeaway: Don't ask "SQL or NoSQL". Ask: "Is my data tabular, document-like, or a graph?" and "What are my key queries?"

Part III: Storage Internals – B-Tree vs LSM-Tree, The Silent War Under the Database

This chapter consumed the most hours for me but was also the most fascinating. Turns out, how databases store data on disk directly impacts performance.

B-Tree – The 50-Year-Old Still Going Strong

How it works: Splits data into pages (4KB, 8KB). Each read/write touches a few pages from root to leaf. Updates happen in-place.

  • Pros: Fast reads. Good transaction support (lock pages).
  • Cons: Slower writes, especially random writes. Write amplification (one write operation can cause many actual disk writes).

LSM-Tree – New Kid, But Writes Like a Beast

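How it works: Append-only writes. All changes (insert, update, delete) first go to an in-memory sorted table (the memtable, backed by a log for crash recovery), which is flushed to disk as immutable sorted files (SSTables). Background compaction periodically merges those files and discards overwritten or deleted data.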

  • Pros: Super fast writes (sequential). Lower write amplification in many cases.
  • Cons: Slower reads (must check multiple files). Needs Bloom filters to speed up reads.
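Here is a toy sketch of the LSM idea in Python, just to show the moving parts. It is not how any real engine is implemented: the class name, the tiny memtable limit, and the linear scan inside segments are all simplifications I chose for illustration (real engines use binary search, Bloom filters, and a write-ahead log).

```python
class TinyLSM:
    """Toy LSM-tree: in-memory memtable, immutable sorted segments, compaction."""

    TOMBSTONE = object()  # marker written in place of a deleted value

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}
        self.segments = []  # oldest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def delete(self, key):
        # Deletes are writes too: append a tombstone instead of removing data.
        self.put(key, self.TOMBSTONE)

    def get(self, key):
        # Newest data wins: check the memtable, then segments newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is self.TOMBSTONE else value
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return None if v is self.TOMBSTONE else v
        return None

    def _flush(self):
        # Write the memtable out as one immutable, sorted segment.
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}

    def compact(self):
        # Merge segments oldest to newest so later writes overwrite earlier ones,
        # then drop tombstones and superseded values.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        live = {k: v for k, v in merged.items() if v is not self.TOMBSTONE}
        self.segments = [sorted(live.items())]


db = TinyLSM()
db.put("a", 1)
db.put("b", 2)      # memtable full -> flushed as segment 1
db.put("a", 10)
db.delete("b")      # memtable full -> flushed as segment 2
print(db.get("a"), db.get("b"), len(db.segments))  # 10 None 2
db.compact()
print(db.segments)  # [[('a', 10)]]
```

Notice that `get` may have to look in several places before compaction, which is exactly the read cost listed in the cons above.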

When to choose which:

  • Read-heavy workloads: B-Tree (PostgreSQL, MySQL)
  • Write-heavy workloads: LSM-Tree (Cassandra, HBase, RocksDB)

Real example: An app logging millions of lines/hour should use LSM-Tree. An airline ticket booking system (read-heavy) should use B-Tree.

Part IV: OLTP vs OLAP – Two Different Worlds

OLTP (Online Transaction Processing)

  • Daily transaction systems: ordering, money transfer, login.
  • Each query touches a few rows, uses indexes, needs low latency.
  • Examples: MySQL, PostgreSQL, MongoDB.

OLAP (Online Analytical Processing)

  • Data analysis: "Revenue by product category in January".
  • Each query scans millions of rows, aggregates, groups.
  • Examples: ClickHouse, Snowflake, BigQuery.

The danger: Running OLAP queries on OLTP databases will kill (or at minimum cripple) your production system. Solution: Data Warehouse – separate analytics repository, regularly ETL (Extract-Transform-Load) from OLTP to it.

Column-oriented Storage – The OLAP Game Changer

Instead of storing rows (all columns for one record together), store columns (all values for one column together). If a query needs only 3 of 30 columns, you read just those 3 columns from disk. Massive bandwidth savings.

Plus, same-column data often has few unique values (e.g., "country" has only 100 values), compressing extremely well (bitmap, run-length). CPUs can even process compressed data directly (vectorized execution). Ultimate efficiency.
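A tiny sketch of both ideas in Python, using made-up order data. It only illustrates the layout and run-length encoding; real column stores add sorting, bitmap indexes, and on-disk formats that this ignores.

```python
from itertools import groupby

# Row-oriented: each record keeps all of its columns together.
rows = [
    {"order_id": 1, "country": "VN", "amount": 120},
    {"order_id": 2, "country": "VN", "amount": 80},
    {"order_id": 3, "country": "VN", "amount": 50},
    {"order_id": 4, "country": "US", "amount": 200},
    {"order_id": 5, "country": "US", "amount": 40},
]

# Column-oriented: one list per column, values kept in the same row order.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# An aggregate over one column never touches the other columns.
total_amount = sum(columns["amount"])
encoded_country = run_length_encode(columns["country"])

print(total_amount)     # 490
print(encoded_country)  # [('VN', 3), ('US', 2)]
```

Five country values shrink to two pairs here. On millions of rows sorted by country, the saving is far larger.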

Part V: Replication – So Your Data Doesn't Die When a Server Goes Down

You have valuable data. You don't want to lose it if a disk crashes. So you replicate across multiple machines.

Three Replication Architectures:

1. Leader-Based (Master-Slave)

  • One leader accepts writes. Followers are read-only.
  • Reads can scale to many followers.
  • Downside: Replication lag – followers can be stale by seconds.

2. Multi-Leader

  • Multiple leaders, each at a datacenter.
  • Reduces latency for global users.
  • Downside: Handling write conflicts is complex.

3. Leaderless (Dynamo-style: Cassandra, DynamoDB)

  • Clients write to multiple nodes simultaneously.
  • Uses quorums: with n replicas, a write needs w acknowledgments and a read needs r responses, where w + r > n (so the nodes you read from always overlap with the nodes you wrote to, and at least one of them has the latest value).
  • Very flexible, highly fault-tolerant, but complex conflict handling.
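If the w + r > n rule feels like magic, you can brute-force it. The sketch below (function names are mine) checks every possible write set against every possible read set and confirms they always share a node exactly when w + r > n.

```python
from itertools import combinations


def quorum_condition(n: int, w: int, r: int) -> bool:
    """The rule of thumb: write and read quorums must add up to more than n."""
    return w + r > n


def always_intersect(n: int, w: int, r: int) -> bool:
    """Brute force: does every w-node write set share a node with every r-node read set?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


print(always_intersect(3, 2, 2))  # True: a common setting for n = 3
print(always_intersect(3, 1, 1))  # False: a read can miss the only node written
print(always_intersect(5, 2, 3))  # False: w + r == n is not enough
```

Overlap alone does not resolve conflicts between concurrent writes. It only guarantees that a read contacts at least one node that saw the latest acknowledged write.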

Replication Lag and Daily Headaches

Say you update your avatar. Write hits leader – done. You reload the page, but the request hits a follower that hasn't synced yet. You see your old avatar. You think the update failed. That's the "read-your-own-writes" problem.

Fixes:

  • Read-after-write consistency: your own reads must hit the leader or a synced follower.
  • Sticky sessions: always route a user to the same replica.
  • Monotonic reads (never see older data after newer) and consistent prefix reads (see writes in order) – other guarantees to consider depending on your app.

Write Conflicts: Last-Write-Wins Can Silently Kill Your Data

When two leaders write the same key, who wins? Simple approach: Last-write-wins (LWW) – use timestamp. But clocks across machines aren't perfectly synchronized (clock skew). Result: a later write can have a smaller timestamp and get overwritten silently. Dangerous.
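Here is the failure in a few lines of Python. The timestamps and the 5-second skew are hypothetical numbers chosen to make the problem visible.

```python
def lww_merge(a, b):
    """Last-write-wins: keep the (timestamp, value) pair with the larger timestamp."""
    return a if a[0] >= b[0] else b


# Real-world order: node A accepts a write at t=100, node B at t=102 (later).
true_time_a, true_time_b = 100.0, 102.0

# Hypothetical skew: node B's clock runs 5 seconds behind.
write_a = (true_time_a, "old avatar")
write_b = (true_time_b - 5.0, "new avatar")  # stamped 97.0

winner = lww_merge(write_a, write_b)
print(winner)  # (100.0, 'old avatar')
```

The newer write loses, and nothing reports an error. That silence is what makes LWW dangerous.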

Advanced solution: CRDTs (Conflict-free Replicated Data Types) – special data structures whose replicas merge automatically without conflicts. Examples: grow-only counters, grow-only sets. Kleppmann was researching this area at Cambridge.
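One of the simplest CRDTs is a grow-only counter (often called a G-Counter). A minimal sketch, with node IDs and class name chosen for illustration: each node increments only its own slot, and merging takes the per-node maximum, so merges can happen in any order and any number of times.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by taking the maximum."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount: int = 1):
        # A node only ever bumps its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter"):
        # Per-node max is commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment()

a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing

print(a.value(), b.value())  # 3 3
```

No timestamps are involved, so clock skew cannot cause a lost update here.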

Part VI: Partitioning – Divide and Conquer, But Avoid "Hot Spots"

Replication handles fault-tolerance. Partitioning (sharding) scales beyond one machine – each partition holds part of the data.

Two Main Strategies:

1. Key-Range Partitioning

  • Divide keys into ranges (A-F, G-M, etc.).
  • Pro: Range queries work well (all users A-F live in one partition).
  • Con: Hot spots – if key is timestamp, recent writes flood one partition.

2. Hash Partitioning

  • Hash key for even distribution.
  • Pro: Avoids hot spots.
  • Con: Lose range queries. Sequential keys scatter across partitions.
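A small sketch of the hot-spot problem, under assumptions I picked for illustration: 60 timestamp keys from the same minute, four partitions, quarterly range boundaries, and MD5 as the hash. Python's built-in `hash()` is avoided because it is randomized per process for strings.

```python
import hashlib
from collections import Counter


def range_partition(key: str, boundaries: list) -> int:
    """Return the index of the first range whose upper bound is above the key."""
    for index, upper_bound in enumerate(boundaries):
        if key < upper_bound:
            return index
    return len(boundaries)


def hash_partition(key: str, n_partitions: int) -> int:
    """Use a stable hash so the same key always lands on the same partition."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_partitions


# Sixty writes arriving within the same minute, keyed by timestamp.
keys = [f"2024-06-01T12:00:{second:02d}" for second in range(60)]
boundaries = ["2024-01-01", "2024-04-01", "2024-07-01"]  # four partitions by quarter

by_range = Counter(range_partition(key, boundaries) for key in keys)
by_hash = Counter(hash_partition(key, 4) for key in keys)

print(by_range)  # every key lands in partition 2: a hot spot
print(by_hash)   # keys spread across partitions
```

The flip side shows up when you want "all events between 12:00 and 12:01": with range partitioning that is one partition, with hashing it is all of them.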

Cassandra uses compound primary keys: the first part is hashed (determines the partition), and the remaining columns sort rows within the partition. You get even distribution plus efficient range scans inside a partition.

Secondary Indexes – The Partitioning Nightmare

With primary indexes, you know which partition contains a key. With secondary indexes (e.g., find users by city), matching data could be anywhere.

  • Local secondary index: Each partition indexes its own data. Reads must query all partitions (scatter/gather) – slow as the slowest partition.
  • Global secondary index: The secondary index is partitioned separately. Reads are faster, but writes are complex (can affect multiple partitions).

No perfect solution. Always trading off.

Rebalancing – Adding/Removing Nodes Without Losing Data

When adding nodes, you must move partitions. Consistent hashing minimizes key migration. But should rebalancing be automatic or manual? Automatic is convenient but risks cascading failures (one dead node triggers rebalance → overloads others → they die → rebalance again). Couchbase, Riak, and Voldemort chose semi-automatic: compute the plan, wait for admin approval.

Part VII: Transactions – ACID Isn't a "Magic Bullet", Understand Each Letter

Kleppmann deconstructs ACID carefully. I discovered... I'd misunderstood many things.

A – Atomicity (All-or-Nothing)

Not about concurrency. It means: if a transaction fails mid-way, the entire operation rolls back. No half-finished state.

C – Consistency (Correctness)

Here's the confusing part. Kleppmann clarifies: This "C" isn't the database's job – it's your app's. The database only enforces the constraints you declare (unique, foreign key). Whether "credits and debits still balance after a transfer" is your code's responsibility. He quotes Joe Hellerstein's remark that the C was "tossed in to make the acronym work".

I – Isolation (Independence)

Most complex. Weak isolation levels can cause "anomalies" you need to know about.

Read Committed (lowest common level)

  • Prevents: dirty reads (seeing uncommitted data), dirty writes.
  • Still allows: read skew (reading inconsistent states across tables).

Snapshot Isolation (Repeatable Read)

  • Each transaction sees a fixed snapshot from when it started.
  • Uses MVCC (Multi-Version Concurrency Control) – each row has multiple versions.
  • Solves read skew.
  • Doesn't prevent write skew (subtle: two doctors both check that at least two doctors are on call, both see yes, both go off call – nobody covers).
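Write skew is easier to believe once you simulate it. The sketch below fakes snapshot isolation with plain dictionaries (no real database involved, and the doctor names are made up): both transactions read the same snapshot, each decision is valid on its own, and because they write different rows neither blocks the other.

```python
on_call = {"alice": True, "bob": True}


def request_leave(doctor: str, snapshot: dict) -> dict:
    """Decide using only the snapshot; return the write this transaction wants to commit."""
    doctors_on_call = sum(snapshot.values())
    if doctors_on_call >= 2:
        return {doctor: False}  # safe according to what this transaction saw
    return {}


# Both transactions take their snapshot before either one commits.
snapshot_alice = dict(on_call)
snapshot_bob = dict(on_call)

write_alice = request_leave("alice", snapshot_alice)
write_bob = request_leave("bob", snapshot_bob)

# The writes touch different rows, so there is no write-write conflict to detect.
on_call.update(write_alice)
on_call.update(write_bob)

print(on_call)                # {'alice': False, 'bob': False}
print(sum(on_call.values()))  # 0 doctors on call: the invariant is broken
```

Run the two requests one after the other instead, and the second one sees only one doctor on call and is refused. That is what serializable isolation buys you.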

Serializable (Strongest)

  • Serial execution: Run one transaction at a time on one CPU. Works if transactions are short and data fits in RAM (VoltDB, Redis).
  • Two-Phase Locking (2PL): Shared locks (reads), exclusive locks (writes). Prevents anomalies but causes deadlocks, hurts performance.
  • Serializable Snapshot Isolation (SSI) – modern: Optimistic approach (assume no conflict), detect conflicts at commit, abort if needed. Better than 2PL for read-heavy workloads (PostgreSQL 9.1+).

Lesson: Don't blindly trust "ACID compliant". Find out what isolation level your database actually provides and whether it's enough for your app.

Part VIII: Distributed Systems – Where Assumptions About Single Machines Break

This section was humbling. Moving from one machine to many machines makes things you thought were obvious... obviously wrong.

Networks Are Unreliable

You send a message, don't get a response. Why?

  • Message never arrived.
  • Server received it, crashed while processing.
  • Server processed it but response was lost.
  • Server is slow (GC, swapping) and will respond in 10 seconds.

You can't tell the difference. This ambiguity is the root of complex distributed algorithms.

Clocks Are Out of Sync

Each machine's clock drifts. NTP can sync to ~30ms in good conditions, but it gives no hard guarantee. Never use client or cross-machine timestamps to determine global ordering. Bugs will haunt you.

Process Pauses

Your process can freeze unexpectedly:

  • Garbage Collection (esp. stop-the-world in Java).
  • VM live migration.
  • Context switching or disk swapping.

While paused, the process doesn't know. It wakes up thinking 5ms passed, but it's been 5 minutes. This breaks timeout and lease mechanisms.

"Truth" in Distributed Systems – No Absolute Arbiter

No node knows the absolute truth about the whole system. Each sees only its part. Distributed algorithms must work correctly even with inconsistent node views.

Part IX: Linearizability, Causality, Consensus – Three "Advanced" Concepts Every Dev Should Know

Linearizability (Strong Consistency)

System behaves as if there's only one copy. Every read sees the result of the most recent write. Critical for:

  • Distributed locks (two nodes can't both think they hold the lock).
  • Unique constraints (username registration).
  • Coordination across channels.

But linearizability costs: CAP theorem (Consistency - Availability - Partition tolerance). During network partitions, you must choose: stop serving (keep consistency) or serve possibly-stale data (choose availability). P isn't optional – partitions happen. The real question: C or A?

Causality (Happens-Before Relationships)

Weaker than linearizability but sufficient for many apps. If event A causes B (e.g., I send a message, you reply), all nodes must see A before B. Lamport timestamps are a simple, elegant way to order events consistently with causality without perfect clocks.
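A Lamport clock fits in a dozen lines. This is a minimal sketch with method names of my choosing: each node keeps a counter, bumps it on every local event or send, and on receive jumps past the sender's timestamp.

```python
class LamportClock:
    """Logical clock: a counter that respects happens-before, no wall clock needed."""

    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Local event or message send: advance and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, sender_time: int) -> int:
        """Message receive: move past both our own time and the sender's."""
        self.time = max(self.time, sender_time) + 1
        return self.time


alice, bob = LamportClock(), LamportClock()

for _ in range(3):
    alice.tick()            # Alice does some local work (time is now 3)

sent = alice.tick()         # Alice sends a message stamped 4
received = bob.receive(sent)  # Bob was at 0, jumps to 5
reply = bob.tick()          # Bob's reply is stamped 6

print(sent, received, reply)  # 4 5 6
```

The cause always gets a smaller timestamp than its effect. The reverse does not hold: a smaller timestamp does not prove one event caused another, which is why systems that need to detect concurrency use version vectors instead.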

Consensus – Agreement in Chaos

Multiple nodes must agree on one value. Hardest problem in distributed systems. Famous algorithms:

  • Paxos (Lamport): Correct but notoriously hard to understand.
  • Raft (Ongaro & Ousterhout): Designed for clarity. Splits into leader election, log replication, safety.
  • ZAB (ZooKeeper): ZooKeeper's atomic broadcast protocol, similar in spirit to Multi-Paxos.

Interesting connection: Total order broadcast (all nodes receive messages in same order) ≈ linearizable storage. Kafka and ZooKeeper are essentially total order broadcast systems.

Part X: Batch vs. Stream – Processing "History" and "Real-Time"

Batch Processing – Unix Philosophy at Scale

Kleppmann brilliantly connects Unix pipes to MapReduce:

Unix: cat file | grep "error" | sort | uniq -c

MapReduce: map (filter, transform), shuffle (group by key), reduce (aggregate).
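The three MapReduce stages can be mimicked in a single Python process. This sketch uses made-up log lines and counts error messages, roughly what the Unix pipeline above does; a real framework would run the map and reduce functions on many machines and do the shuffle over the network.

```python
from collections import defaultdict

log_lines = [
    "2024-01-01 error disk full",
    "2024-01-01 info started",
    "2024-01-02 error disk full",
    "2024-01-02 error timeout",
]


def map_phase(line: str):
    """Filter and transform: emit (message, 1) for every error line."""
    _date, level, message = line.split(" ", 2)
    if level == "error":
        yield message, 1


def shuffle(pairs):
    """Group all values by key (the job of sort in the Unix pipeline)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list):
    """Aggregate the values for one key (the job of uniq -c)."""
    return key, sum(values)


mapped = [pair for line in log_lines for pair in map_phase(line)]
result = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(result)  # {'disk full': 2, 'timeout': 1}
```

Because `map_phase` and `reduce_phase` have no shared state, a framework can rerun either one on another machine after a failure.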

MapReduce tolerates faults by materializing (writing to disk) between stages. Spark and Flink evolved with DAGs and pipelining, but fault tolerance is more complex.

Stream Processing – React as Data Arrives

Kafka isn't just a queue. Its log-based messaging (append-only, immutable) makes it a "database-in-time". You can replay history anytime.

Big challenge: time (event time vs. processing time). Events can arrive late (offline device, network delay). Watermarks estimate when to close a time window, but aren't perfect. Late events still arrive sometimes.

Exactly-once semantics – processing each event exactly once – is the "Holy Grail". Approaches:

  • Idempotent operations (re-running doesn't change result).
  • Distributed transactions (expensive, slow).
  • Checkpointing + distributed snapshots (Flink's approach).

Kleppmann favors effectively-once: the final result looks like exactly-once processing, even if there's duplication at lower layers.
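A common way to get there is an idempotent consumer. This sketch assumes every event carries a unique ID (the IDs and class name here are invented) and that the set of seen IDs is stored as durably as the result itself; in a real system both would be written in the same transaction.

```python
class IdempotentCounter:
    """Applies each event at most once, even if the broker delivers it twice."""

    def __init__(self):
        self.total = 0
        self.seen_ids = set()

    def handle(self, event_id: str, amount: int) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event_id in self.seen_ids:
            return False
        self.seen_ids.add(event_id)
        self.total += amount
        return True


# At-least-once delivery: event e1 arrives twice after a retry.
deliveries = [("e1", 10), ("e2", 5), ("e1", 10)]

counter = IdempotentCounter()
applied = [counter.handle(event_id, amount) for event_id, amount in deliveries]

print(applied)        # [True, True, False]
print(counter.total)  # 15
```

The duplicate still travels over the network. It just has no effect on the result, which is what "effectively-once" means.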

Part XI: The Future – Unbundling Databases and Ethical Responsibility

Unbundling Databases – Database à La Carte

Instead of one database doing everything, treat each tool as a specialist:

  • PostgreSQL – main transactions.
  • Kafka – immutable event log (source of truth).
  • Elasticsearch – search.
  • Redis – cache.
  • ClickHouse – analytics.

The glue: Change Data Capture (CDC). Capture every change from the source database, push to Kafka, let other systems read and sync their views.

Event Sourcing – Store Events, Not State

Instead of storing current state, store the log of all events (changes). Want current state? Replay the log. This pattern provides perfect audit trails, debugging capability, and causality.
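A minimal sketch of the replay idea, using a made-up bank account with two event types. The event names and the `upto` parameter are illustrative choices, not taken from any framework.

```python
events = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrew", "amount": 30},
    {"type": "deposited", "amount": 50},
]


def apply_event(balance: int, event: dict) -> int:
    """Pure function: old state plus one event gives the new state."""
    if event["type"] == "deposited":
        return balance + event["amount"]
    if event["type"] == "withdrew":
        return balance - event["amount"]
    raise ValueError(f"unknown event type: {event['type']}")


def replay(event_log: list, upto=None) -> int:
    """Rebuild the balance from the first `upto` events (all of them if None)."""
    balance = 0
    for event in event_log[:upto]:
        balance = apply_event(balance, event)
    return balance


print(replay(events))          # 120: current state
print(replay(events, upto=2))  # 70: state as it was after the second event
```

The second call is the audit and debugging benefit in miniature: any past state can be recomputed because nothing was overwritten.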

Ethics – We're Not Just Engineers

In the book's closing, Kleppmann pivots powerfully: data systems aren't neutral. They shape lives:

  • Who gets approved for loans?
  • Which accounts do the police target?
  • What content fills your feed?

Predictive analytics can amplify bias. If historical data shows discrimination, models learn and perpetuate it. It's a self-reinforcing loop.

Privacy and surveillance: We now have unprecedented ability to track human behavior. Shoshana Zuboff calls it "surveillance capitalism".

Accountability: When algorithms fail, who's responsible? The engineer who coded it? The manager who set targets? The CEO who approved? Usually: nobody. "Diffusion of responsibility".

Kleppmann's message: Engineers have power and thus responsibility. You can design systems that protect privacy, reduce bias, and increase transparency. Use that power intentionally.

Conclusion

If you remember one thing from this long post: There's no perfect solution for every problem. Every architectural choice is a trade-off.

Kleppmann's book doesn't give you answers. It teaches you how to ask questions:

  • Is this system fault-tolerant or just hoping for luck?
  • What are my load parameters? How will they grow?
  • What isolation level do I really need?
  • Do I truly need linearizability, or is causal consistency enough?
  • B-Tree or LSM-Tree for my workload?
  • Who am I building this system for, and what values am I encoding?

Two years after reading DDIA, I'm no longer afraid of "SQL vs NoSQL" debates, "Kafka vs RabbitMQ" arguments, or "Cassandra vs MongoDB" comparisons. I know how to decide based on load spec and trade-offs.

Most importantly, I realize building data systems isn't just engineering – it's responsibility to the people affected by these systems.

Hope this post helps on your journey to becoming a true systems engineer. If you have questions or want to debate, comment below. I'd love to learn with you.

Happy engineering!


References & Further Reading:

  • Kleppmann, M. (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems – The main event. Buy it, read it, keep it on your shelf.
  • Lamport, L. (1998). "The Part-Time Parliament" – Original Paxos paper. Dense, but foundational.
  • Ongaro, D., & Ousterhout, J. (2014). "In Search of an Understandable Consensus Algorithm (Raft)" – Much more readable than Paxos. Highly recommended.
  • Vogels, W. (2009). "Eventually Consistent" – Amazon CTO's take on eventual consistency. Great practical perspective.
  • Zuboff, S. (2019). The Age of Surveillance Capitalism – On data ethics and privacy. Complements Kleppmann's final chapters perfectly.
  • Perry, C. (2015). "Distributed Consistency" – Excellent visual explanations of consistency models.
  • Mullainathan, S., & Spiess, J. (2017). "Machine Learning: An Applied Econometric Approach" – On ML bias in data-intensive systems.


Data Analytics

The 5 Whys: Master the Art of Root Cause Analysis in Data Analytics

Recently, you’ve been learning why business solutions almost always require some data detective work. This is where critical thinking helps data professionals determine the right questions to ask in order to arrive at the best solutions.

The 5 Whys technique
"If you don't ask the right questions, you don't get the right answers." – Edward Hodnett

A very common question in the industry is: “What is the root cause of the problem?”. A root cause is the core reason why a problem occurs. By identifying and eliminating this root cause, data professionals can help stop that problem from occurring again.

What is the "5 Whys" Technique?

The 5 Whys is a simple but effective technique for identifying a root cause. It involves asking "Why?" repeatedly, each time about the previous answer, until the root cause is revealed. This often happens at the fifth “why,” but depending on the complexity of the issue you may need more or fewer rounds.

💡 Pro Tip:

Don't settle for superficial answers. Dig deep into processes and human factors to find the real bottleneck.

Case Study 1: Boosting Customer Service

An online grocery store was receiving numerous customer service complaints about poor deliveries. A data analyst at the company applied the 5 Whys:

  • Why #1: "Customers are complaining about poor grocery deliveries. Why?" -> Reviewing feedback showed products were arriving damaged.
  • Why #2: "Products are arriving damaged. Why?" -> Many customers said products were not packaged properly.
  • Why #3: "Products are not packaged properly. Why?" -> Grocery packers were not adequately trained on packing procedures.
  • Why #4: "Grocery packers are not adequately trained. Why?" -> Nearly 35% of all packers were new to the company and hadn't completed required training yet.
  • Why #5: "Packers have not completed required training. Why?" -> Root cause: The human resources department was reworking the training program and used an insufficient guide instead of full training.
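The chain above can also be written down as data, which makes it easy to review with colleagues. The following is a minimal Python sketch, not a standard tool: the `CAUSES` table and the `five_whys` function are hypothetical names introduced for illustration, and the entries simply paraphrase the answers from this case study. It assumes every "Why?" has exactly one known answer.

```python
# Hypothetical lookup table: each observed problem maps to the cause that
# was found when the analyst asked "Why?". Entries paraphrase Case Study 1.
CAUSES = {
    "Customers complain about poor deliveries": "Products arrive damaged",
    "Products arrive damaged": "Products are not packaged properly",
    "Products are not packaged properly": "Packers are not adequately trained",
    "Packers are not adequately trained": "Many new packers have not completed required training",
    "Many new packers have not completed required training": "HR is reworking the training program and issued an insufficient guide",
}


def five_whys(problem: str, causes: dict[str, str], max_depth: int = 10) -> list[str]:
    """Follow the chain of causes from `problem` until no deeper cause is known.

    Returns the chain, starting with the problem and ending with the deepest
    cause found. `max_depth` guards against circular reasoning in the table.
    """
    chain = [problem]
    for _ in range(max_depth):
        cause = causes.get(chain[-1])
        if cause is None:  # nobody can answer the next "Why?", so stop here
            break
        chain.append(cause)
    return chain


chain = five_whys("Customers complain about poor deliveries", CAUSES)
for depth, (effect, cause) in enumerate(zip(chain, chain[1:]), start=1):
    print(f"Why #{depth}: {effect}. Why? -> {cause}")
print("Root cause:", chain[-1])
```

The loop stops when no further cause is recorded rather than after exactly five rounds, which matches the idea that some problems need more "whys" and some need fewer.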

Case Study 2: Advancing Quality Control

An irrigation company was experiencing an increase in the number of defects in their water pumps:

  • Why #1: "There has been an increase in pump defects. Why?" -> Machines used to produce the pumps were not properly calibrated.
  • Why #2: "The machines are not properly calibrated. Why?" -> They were miscalibrated during the last maintenance cycle.
  • Why #3: "The machines were miscalibrated during maintenance. Why?" -> The current calibration method was inappropriate for these specific machines.
  • Why #4: "The calibration method is inappropriate. Why?" -> The company had recently installed new software that affected calibration requirements.
  • Why #5: "The new software changed the calibration requirements, but engineers don't have the information to calibrate the upgraded machines. Why?" -> Root cause: The installation team failed to share the corresponding calibration procedures with the engineering team.

Key Takeaways

The 5 Whys is a powerful tool for root cause analysis. It’s simple, effective, and a great way to collaborate with colleagues. As a data professional, you can turn to the 5 Whys whenever you feel stumped by a problem and need to approach it from a different perspective.

What People Say

Testimonials from clients, colleagues, and collaborators.

Career Timeline

Key milestones in my professional journey.

2023 - Present

Data Engineering & Project Management

Specializing in designing scalable data pipelines, schema normalization, and technical project management. Leading end-to-end development workflows and ensuring high-performance system architecture.

2021 - 2023

Full-Stack Developer

Building large-scale web applications with Python, .NET Core, and modern JavaScript frameworks. Focused on delivering user-centered, scalable, and high-performance digital solutions.

2020 - 2021

Content Creator & Digital Media

Started building a personal brand and community on digital platforms (10K+ followers). Gained deep insights into audience engagement and digital media strategy.

Certifications & Achievements

Professional recognitions and notable milestones.

Full-Stack Development

Professional Web Engineering

2021 - Present

Content Creator

TikTok & Social Presence

10K+ Followers

MC at TNTV

Thai Nguyen Television

Media Production

Stay Updated

Subscribe to my newsletter for the latest updates, articles, and insights.

Get in Touch

Discuss your project and how I can contribute to its success.

Alternatively, email me directly at davidhoangem@gmail.com