# ♯ Gradual Onboard
## Metadata
**Kind**:: #paralet
**PARA**:: [[4 Archive]]
**Status**:: #x
**Zettel**:: #zettel/fleeting
**Created**:: [[2026-06-23]]
## Learning
- [x] Go through Architecting an Apache Iceberg Lakehouse [manning.com](https://www.manning.com/books/architecting-an-apache-iceberg-lakehouse)
    - [x] Read Chapter 7 Implementing the catalog layer
    - [x] Read Chapter 8 Designing the federation layer
    - [x] Read 11.1 Orchestrating the lakehouse
- [x] Read Crack Any Codebase with AI
    - How to use AI to understand legacy code.
- [x] Study [Project Nessie](https://projectnessie.org/), a Git-inspired data lake catalog.
- [x] Go through [[♯ Kleppmann - Designing data-intensive applications|DDIA, 2nd edition]]
    - [x] Read Chapter 5 Encoding and Evolution
### Products to Try
- [x] Dremio (Lakehouse)
- [x] Nessie (Catalog, [https://projectnessie.org/](https://projectnessie.org/))
## Reading Legacy Code
```
Write me a learning note for this codebase. Simple markdown I can keep editing. Be aggressively short. Assume I know Python but not machine learning. Two sections: "Concepts I need to know" (max 8 items, plain English, where each shows up in the code) and "The important files" (max 8 items, what each does, plus a dependency chain).
```
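To reuse the prompt above with a different background, it can be turned into a small template. This is only a sketch: the function name `learning_note_prompt` and its parameters (`known`, `unknown`, `max_items`) are my own assumptions, not part of any tool.

```python
def learning_note_prompt(
    known: str = "Python",
    unknown: str = "machine learning",
    max_items: int = 8,
) -> str:
    """Build the learning-note prompt for a given background (hypothetical helper)."""
    return (
        "Write me a learning note for this codebase. "
        "Simple markdown I can keep editing. Be aggressively short. "
        f"Assume I know {known} but not {unknown}. "
        f'Two sections: "Concepts I need to know" (max {max_items} items, plain English, '
        "where each shows up in the code) and "
        f'"The important files" (max {max_items} items, what each does, '
        "plus a dependency chain)."
    )


# Example: same prompt, but for a lakehouse codebase instead of an ML one.
print(learning_note_prompt(unknown="data lakehouses"))
```

With the defaults, the function reproduces the original prompt; only the "known" and "unknown" topics and the item cap vary.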
## Team
- USA: Data engineering, Integration
- China: Business
## People
- Josh, community
- Lam, Allan's wife, designer
## Areas
**Stage**: Features ready, Expansion
### Data Analysis
- User 360
- Insights
- Data enrichment
- Data valuable for clients
- BI for clients, business needs
- Data uniformity
- Product feedback, requirements
- ETL: efficient, bespoke pipelines
### Troubles
- Meeting reliability
- Security
- Redundancy and legacy code
## Tech Stack
- ClickHouse
    - Native pipe support has been added for PostgreSQL and MongoDB
- MongoDB via cloud service
- Business-heavy frontend